A vendor management system sends us a row that says something like TRAVEL---RN---ICU-MED-3W, a company code, a numeric rate, a start date, a position count, and a paragraph of free text glued to a PDF. That is not a job. Nothing in that payload could be shown to a clinician and called a posting. No canonical unit, no validated rate, no rule about who can apply. The vendor sent us source material. Someone has to turn it into a record the marketplace can use.
For most of Trusted’s history, that someone was a human curator in our Manage app, working a queue. Recently, an agent has started doing the same work against the same form and the same validations. Almost every conversation about that agent skips the part that makes it possible to build: a clear answer to the question “what is curation?” If you can’t answer that without pointing at the UI, you can’t automate it. Answer it by pointing at the UI and you end up automating a screen instead of a domain.
Read this as the prerequisite to Agents that fill out forms. The rule shape, the lifecycle, the four tasks behind the form---all of it came out of years of doing this by hand, long before any agent was in the room. Build the model right and the autocuration architecture follows. Skip it, and you build an agent filling out a form that doesn’t mean anything. Automate the actor, not the model.
Four tasks, not one
The first mistake we made, and watched other teams make, was treating curation as a single task that happens to be slow. It isn’t. It’s four distinct tasks that share a UI and a validation contract.
A decision: does this job proceed to the marketplace, get rejected outright, or sit on a waitlist until something resolves? A duplicate of a job we already posted yesterday. A rate that’s implausible for the unit. A facility we don’t contract with. Each of these terminates the curation flow with a different exit code.
A mapping: which canonical clinical unit does the vendor string refer to? We maintain a taxonomy of clinical units: Medical-Surgical ICU, Step-Down, Cardiac Cath Lab, Labor & Delivery, hundreds of them. The vendor sends abbreviations, internal codes, free-text fragments. A clinician searching for “Med-Surg ICU” finds this job only if the mapping is right.
Job details: the structured record. Shift type. Hours per week. Number of positions. Bill rate. Start and end date. Location. The vendor payload contains all of these somewhere---sometimes in a structured field, sometimes glued into a description, sometimes implicit in an attached PDF.
Job rules: the constraints. Required certifications, years of experience, EMR proficiency, charge-nurse status, floating policies, holiday coverage, license types. Every gate that determines who is eligible to accept the shift. These come partly from the vendor payload and partly from standing program rules at the client.
Each task has its own failure mode. A bad mapping means the right clinician can’t find the job. A wrong rate produces a wrong pay package. A missing rule lets unqualified clinicians apply and get bounced later in the funnel, which costs both sides. A wrong decision puts a duplicate or a bad job in front of the market and erodes trust in the catalog. Lumping these together as “curation quality” obscures which kind of error is happening and where to invest.
The four tasks also need different inputs and different validation. Decision compares against existing postings and client policy. Mapping needs the unit taxonomy. Details needs schema-level field validation. Rules needs the rule data model plus a way to merge job-specific rules with program and global defaults. They share a final commit; the work to get there has four different shapes.
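To make the four shapes concrete, here is a minimal plain-Ruby sketch. The names are illustrative, not the production classes; each struct stands in for a model with its own validations.

# Illustrative value objects for the four task outputs. In production each
# is a model with real validations; they only meet at the final commit.
DecisionResult = Struct.new(:decision, :reason, keyword_init: true)  # proceed, reject, or waitlist, and why
MappingResult  = Struct.new(:canonical_unit_id, keyword_init: true)  # vendor string resolved against the unit taxonomy
JobDetails     = Struct.new(:shift_type, :hours_per_week, :positions,
                            :bill_rate, :start_date, :end_date, :location,
                            keyword_init: true)
RuleSet        = Struct.new(:rules, keyword_init: true)              # typed rule models, covered below

# The shared commit takes all four at once.
CurationResult = Struct.new(:decision, :mapping, :details, :rule_set, keyword_init: true)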
The curation data model
Two model shapes carry the load.
The first is Curation, a single record attached to each job. It holds the decision and its reasoning. The decision is one of a small set, and each decision class has its own enumerated reasons. A few from the production model:
approve: the standard path
reject, with reasons like duplicate, rate_implausible, facility_not_under_contract, closed_to_travelers, posting_appears_invalid
waitlist, with reasons like awaiting_program_confirmation, awaiting_rule_clarification, awaiting_facility_setup
The reason is not optional. Every terminal state has to identify why it ended up there, in machine-readable form, because downstream systems behave differently depending on the cause. A rejected duplicate is recycled silently. A facility_not_under_contract raises a flag to the contracts team. A waitlisted job blocks marketplace visibility but stays available for re-curation when its blocker clears.
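In plain Ruby, the decision-and-reason contract looks roughly like this. The class is a stand-in for the production model, and the reason lists are trimmed to the examples above.

class Curation
  # Each decision class enumerates its own valid reasons (trimmed here).
  REASONS = {
    "approve"  => [nil], # the standard path carries no reason
    "reject"   => %w[duplicate rate_implausible facility_not_under_contract
                     closed_to_travelers posting_appears_invalid],
    "waitlist" => %w[awaiting_program_confirmation awaiting_rule_clarification
                     awaiting_facility_setup],
  }.freeze

  attr_reader :decision, :reason

  def initialize(decision:, reason: nil)
    valid = REASONS.fetch(decision) { raise ArgumentError, "unknown decision: #{decision}" }
    unless valid.include?(reason)
      raise ArgumentError, "#{reason.inspect} is not a valid reason for #{decision}"
    end
    @decision = decision
    @reason = reason
  end
end

Curation.new(decision: "reject", reason: "duplicate") # fine
# Curation.new(decision: "reject")                    # raises: nil is not a valid reason for reject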
The second shape is JobRules, and this is the part that gets misunderstood. It’s not a table. It’s a polymorphic set of rule models, each owning one rule type with its own attributes and validations. References, SkillsChecklist, Certifications, AcuteCareExperience, LocationRestriction, FloatingPolicy, and more. A certifications rule has a list of credential codes. A floating policy has a yes/no plus an allowlist of acceptable float units. A reference rule has a count, a relationship type, and a recency window. Each constraint has its own shape, so each lives in its own model with its own validation logic.
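Two of those rule models, sketched in plain Ruby. The attribute names and the specific validations are illustrative, not the production code; the point is that each rule type owns its own shape.

class CertificationsRule
  attr_reader :credential_codes

  def initialize(credential_codes:)
    raise ArgumentError, "needs at least one credential code" if credential_codes.empty?
    @credential_codes = credential_codes
  end
end

class FloatingPolicyRule
  attr_reader :floating_allowed, :acceptable_units

  def initialize(floating_allowed:, acceptable_units: [])
    # Illustrative validation: allowing floating without an allowlist is invalid.
    if floating_allowed && acceptable_units.empty?
      raise ArgumentError, "floating requires an allowlist of acceptable float units"
    end
    @floating_allowed = floating_allowed
    @acceptable_units = acceptable_units
  end
end

CertificationsRule.new(credential_codes: %w[BLS ACLS])
FloatingPolicyRule.new(floating_allowed: true, acceptable_units: ["Step-Down"])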
Rules also live at three levels: the job itself, the program (the client’s standing configuration), and a global default. The effective rule set for a specific job is resolved by walking up the hierarchy, with job-level rules overriding program-level rules overriding the global default. It’s the same shape we use elsewhere on the platform, and it’s the reason we can onboard a new program by writing rule data instead of writing code.
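The resolution itself can stay small. A sketch over rule sets keyed by rule type, where each later merge wins:

# Job-level rules override program-level rules, which override the global
# default. Keying by rule type keeps the merge a plain hash operation.
def effective_rules(global:, program:, job:)
  global.merge(program).merge(job)
end

effective_rules(
  global:  { references: { count: 2 } },
  program: { certifications: %w[BLS], floating: { allowed: false } },
  job:     { certifications: %w[BLS ACLS NIH] }
)
# => references from the global default, floating from the program,
#    certifications from the job-level override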
The implication for both the UI and the agent is the same: the curation form is a view onto a structured domain, not a screen with a few text fields and a save button. Filling out the form means writing to typed rule models, going through their validations, and persisting them together as a transaction.
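A sketch of that commit, assuming ActiveRecord persistence and hypothetical variable names; the point is that nothing lands unless every typed model validates.

# Hypothetical commit path. Each save! runs that model's own validations;
# any failure rolls the whole curation back.
def commit_curation!(job, curation, rules)
  ActiveRecord::Base.transaction do
    job.save!            # schema-level details validation
    rules.each(&:save!)  # each typed rule model validates itself
    curation.save!       # decision plus machine-readable reason
  end
end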
One job, five states
Curation is a state machine, and the states tell you who’s responsible for the next move.
The states are defined in a single enumeration.
module JobAutoCurationStatus
  INGESTED = "ingested"                   # normalized from the vendor payload; no curation yet
  PENDING = "pending"                     # curation required and not yet complete; the human queue
  APPROVED = "approved"                   # terminal: live in the marketplace
  REJECTED = "rejected"                   # terminal: kept out, with a Curation record saying why
  AUTO_TRANSITIONED = "auto_transitioned" # approved by the autocuration pipeline, not a human
end
ingested is the raw state, fresh from the ingestion pipeline. Vendor data has been normalized into a Job record but no curation has run. Nothing is shown to anyone.
pending means curation is required and hasn’t completed. Either the autocuration agent hasn’t run on this job, or it ran and stopped short: low confidence, an unresolved validation, an ambiguity that needs a human eye. Pending jobs are the human queue.
approved and rejected are terminal. The job is either in the marketplace or it isn’t, with a Curation record explaining why.
auto_transitioned is the state we added when the agent shipped. It’s approved plus a flag that the approval came from the autocuration pipeline rather than a human action. We keep the distinction because we audit the two populations separately, and because auditor feedback on auto-transitioned jobs is the signal we use to keep the agent honest in production. We want that seam visible in the data, not buried in a boolean on a metadata column.
What the enumeration lets us say cleanly: every job has one current state, every transition is logged, the human queue is a query (status = pending), and the agent runs against the same state machine the humans do. No parallel path. No shadow status.
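Assuming an ActiveRecord-backed Job model with an auto_curation_status column, those claims really are one-liners:

# The human queue and the audit slices are plain queries over one column.
human_queue   = Job.where(auto_curation_status: JobAutoCurationStatus::PENDING)
agent_actions = Job.where(auto_curation_status: JobAutoCurationStatus::AUTO_TRANSITIONED)
human_actions = Job.where(auto_curation_status: JobAutoCurationStatus::APPROVED)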
What makes curation hard at scale
Volume is the obvious one. A new vendor relationship can dump hundreds of jobs into the queue in a single ingestion run. Multiply that by the number of vendor relationships we hold and the cadence at which they refresh, and the inbound rate outpaces what a fixed-size curation team can clear. Queues that grow faster than they drain become first-in-never-out.
Consistency is harder than volume. The same position title means different things at different health systems. ICU-MED-3W at one facility might be a Medical-Surgical ICU on the third floor; at another it’s an outpatient infusion suite renamed in some old reorg and never updated in the VMS. Mapping that correctly takes institutional knowledge, or a feedback loop that captures it.
Rules evolve. A certification list changes. A new EMR rolls out. A floating policy gets renegotiated. When rules change, previously curated jobs at that program may or may not need re-curation, depending on which rules changed. The domain model has to support “re-curate this slice” as a first-class operation, not a manual cleanup.
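A sketch of what first-class re-curation can look like, under the same ActiveRecord assumption; uses_rule_types? is a hypothetical helper standing in for however the system indexes jobs by rule type.

# When a program's rules change, push only the affected approved jobs back
# into the pending queue rather than cleaning up by hand.
def recurate_slice!(program, changed_rule_types)
  program.jobs
         .where(auto_curation_status: JobAutoCurationStatus::APPROVED)
         .find_each do |job|
    next unless job.uses_rule_types?(changed_rule_types) # hypothetical helper
    job.update!(auto_curation_status: JobAutoCurationStatus::PENDING)
  end
end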
Ground truth is delayed. The only honest signal that a curation was right is whether the right clinician ended up submitted, qualified, and placed. That feedback arrives days or weeks later. The audit signals we use day-to-day---auditor thumbs up, auditor thumbs down with a note---are proxies, calibrated against the placement signal over longer windows.
None of these problems get simpler by automating the actor. They’re the same problems whether a human or an agent is doing the work. That is the whole reason the domain model has to come first.
Automate the actor, not the model
That is the architectural lesson, and it’s the reason this post exists as a prerequisite to Agents that fill out forms.
When we built v1 of the autocuration agent, we treated the LLM as the unit of work and the curation result as something the model produced. We sent the raw job in, asked for a fully curated record out, and hoped the schema would hold. It worked often enough to be misleading. When it failed, the failures weren’t localizable: the agent and the domain model were entangled inside one prompt.
When we rebuilt for v2, we inverted the relationship. The domain model came first---the curation form, the rule models, the validation contract, the state machine, sized and structured around the work itself, independent of who or what was doing it. The agent got bolted onto that model by exposing the form as a set of tools. The agent operates the form; the form enforces the model. The model exists whether or not an agent is in the room.
The phrase we keep coming back to: automate the actor, not the model. The work is the same work. What changes is who’s doing it.
A few things follow from that ordering.
One validation contract covers both actors. The error a human sees when they save an invalid combination is the same error the agent sees when it submits. There is no “what AI is allowed to do” rulebook that drifts from “what the curator UI lets a human do.” Constraints live in one place, owned by the engineers who own curation, not by the engineers who own the agent.
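A minimal sketch of what that one place can look like, reusing the Curation class sketched earlier; CurationForm and its interface are illustrative.

# The single entry point. Whatever fails validation here fails identically
# for a human saving the form and an agent calling the tool.
class CurationForm
  def submit(decision:, reason: nil, rules: [])
    curation = Curation.new(decision: decision, reason: reason) # raises on invalid combinations
    # ...map the unit, write details, persist curation and rules together...
    [curation, rules]
  end
end

form = CurationForm.new
form.submit(decision: "reject", reason: "duplicate") # human, via the Manage UI
form.submit(decision: "approve")                     # agent, via a tool call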
Auditing is uniform. A curation produced by an agent and a curation produced by a human are the same kind of artifact in the database. The audit tooling, the feedback queue, the eval harness, the metrics dashboard---all of them treat both populations as members of the same set, with an auto_transitioned flag to slice when needed. We aren’t maintaining two pipelines of analytics for the same domain.
New automation is incremental. The next agent we ship doesn’t invent a new way of writing to the model. Same form, same validations, same state machine. The cost of adding an automated actor is the cost of the agent itself, not the cost of building it a parallel domain. That leverage is what lets us treat Agents that fill out forms as a generalizable pattern rather than a one-off for curation.
--- Engineering