Convert PDF forms into clean JSON

Upload a PDF, extract its form fields, review the AI-proposed labels, and produce a validated Schema V3 structure ready for downstream automation.

01 Widget extraction

02 Human review

03 Schema V3

Form filler

Open completed forms and download filled PDFs.

Classic filler · V2 filler

Platform Demo

How CM-PDF works

CM-PDF is a human-in-the-loop pipeline for turning messy PDF forms into reliable structured data. AI does the heavy extraction work, while reviewers keep control over labels, widget ownership, table structure, and the final database update.

Upload a PDF form

The platform stores the source PDF, detects duplicates, and prepares it for extraction.
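
The page does not say how duplicates are detected. One common approach is to fingerprint the file contents, sketched below. The `UploadStore` class, its in-memory dictionary, and the choice of SHA-256 are all assumptions made for illustration, not the platform's actual storage layer.

```python
import hashlib


def pdf_fingerprint(data: bytes) -> str:
    """Return a hex SHA-256 digest of the raw PDF bytes."""
    return hashlib.sha256(data).hexdigest()


class UploadStore:
    """Hypothetical stand-in for the platform's PDF storage."""

    def __init__(self):
        # Maps content digest -> first filename seen with that content.
        self._by_hash = {}

    def add(self, filename: str, data: bytes):
        """Store the upload unless identical bytes were stored before.

        Returns (digest, is_new).
        """
        digest = pdf_fingerprint(data)
        if digest in self._by_hash:
            return digest, False
        self._by_hash[digest] = filename
        return digest, True


store = UploadStore()
first = store.add("intake.pdf", b"%PDF-1.7 example bytes")
second = store.add("intake-copy.pdf", b"%PDF-1.7 example bytes")
print(first[1], second[1])
```

Hashing the bytes catches exact re-uploads under a different filename. It would not catch the same form re-saved by another PDF tool, which changes the bytes.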

Extract PDF widgets

AcroForm fields are detected with page, type, bounds, and nearby OCR text for reviewer context.
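
As a rough illustration, an extracted widget could be held in a record like the one below. The `Widget` class and its field names are hypothetical. Only the `page1-widget1` ID pattern comes from the preview on this page, and the idea that the trailing number is a per-page index is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Widget:
    """Hypothetical record for one extracted AcroForm widget."""

    page: int            # page number as shown in the ID
    index: int           # assumed: position of the widget on its page
    field_type: str      # e.g. "text", "radio"
    bounds: tuple        # (x0, y0, x1, y1) in page coordinates
    nearby_text: str = ""  # OCR text close to the widget, for reviewers

    @property
    def widget_id(self) -> str:
        # Follows the "page1-widget1" pattern shown in the preview.
        return f"page{self.page}-widget{self.index}"


w = Widget(page=1, index=2, field_type="text",
           bounds=(72.0, 640.0, 300.0, 660.0),
           nearby_text="Date of birth")
print(w.widget_id)
```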

Link widgets to labels

A vision model proposes human-readable labels, then reviewers correct missed or ambiguous fields.

Build Schema V3

The form becomes clean JSON: sections, text, fields, groups, irregular tables, and repeat groups.

Review and update DB

Before the final output is saved, validation rejects any schema in which a widget's ownership is duplicated, unknown, or missing.
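
A minimal sketch of such an ownership check follows. It assumes that "duplicate" means a widget claimed more than once, "unknown" means a claimed ID that was never extracted from the PDF, and "missing" means an extracted widget that nothing claims. The function name and return shape are illustrative only.

```python
from collections import Counter


def check_widget_ownership(extracted_ids, claimed_ids):
    """Compare extracted widget IDs with the IDs claimed by schema items.

    extracted_ids: IDs found in the PDF.
    claimed_ids: IDs referenced by the schema, one entry per claim.
    Returns sorted problem lists; all empty means the save may proceed.
    """
    extracted = set(extracted_ids)
    counts = Counter(claimed_ids)
    return {
        "duplicate": sorted(w for w, n in counts.items() if n > 1),
        "unknown": sorted(w for w in counts if w not in extracted),
        "missing": sorted(extracted - set(counts)),
    }


problems = check_widget_ownership(
    extracted_ids=["page1-widget1", "page1-widget2", "page1-widget3"],
    claimed_ids=["page1-widget1", "page1-widget1", "page1-widget9"],
)
print(problems)
```

In this example `page1-widget1` is claimed twice, `page1-widget9` was never extracted, and widgets 2 and 3 have no owner, so all three lists are non-empty and the save would be blocked.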

Live workflow preview

Stage 3 Form Builder

18/18 widgets · Schema V3

PDF widgets · Page 1

page1-widget1 - First name

page1-widget2 - Date of birth

Yes

reviewer links fields to source widgets

Clean schema output, stored in MongoDB

{
  "schema_version": 3,
  "title": "Patient Intake",
  "pages": [{
    "page": 1,
    "items": [
      { "kind": "section",
        "label": "Applicant" },
      { "kind": "field",
        "label": "First name",
        "input_type": "text",
        "widgets": [
          { "widget_id": "page1-widget1" }
        ] },
      { "kind": "field",
        "label": "Consent?",
        "input_type": "radio",
        "choices": [
          { "label": "Yes",
            "widget_id": "page1-widget3" },
          { "label": "No",
            "widget_id": "page1-widget4" }
        ] }
    ]
  }]
}
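
For downstream automation, the sample above can be flattened into one row per widget. The sketch below handles only the two shapes visible in the sample, a `widgets` list and a `choices` list on `field` items. Groups, tables, and repeat groups are skipped because their layout is not shown here, and the helper name `iter_widget_claims` is hypothetical.

```python
import json

# Copy of the sample output shown above.
SAMPLE = """
{
  "schema_version": 3,
  "title": "Patient Intake",
  "pages": [{
    "page": 1,
    "items": [
      {"kind": "section", "label": "Applicant"},
      {"kind": "field", "label": "First name", "input_type": "text",
       "widgets": [{"widget_id": "page1-widget1"}]},
      {"kind": "field", "label": "Consent?", "input_type": "radio",
       "choices": [{"label": "Yes", "widget_id": "page1-widget3"},
                   {"label": "No", "widget_id": "page1-widget4"}]}
    ]
  }]
}
"""


def iter_widget_claims(doc):
    """Yield (widget_id, field_label, choice_label) for each field widget.

    choice_label is None for plain widgets and set for radio choices.
    """
    for page in doc.get("pages", []):
        for item in page.get("items", []):
            if item.get("kind") != "field":
                continue  # sections carry no widgets in the sample
            for widget in item.get("widgets", []):
                yield widget["widget_id"], item["label"], None
            for choice in item.get("choices", []):
                yield choice["widget_id"], item["label"], choice["label"]


for widget_id, label, choice in iter_widget_claims(json.loads(SAMPLE)):
    print(widget_id, label, choice)
```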

Precise Highlighting

Review every extracted PDF widget against the original page.

Smart Workspace

Move through extraction, linking, and schema review in one flow.

Universal Export

Save validated Schema V3 output for automation and training data.

Supports PDF files only
