Metadata & Filtering

Metadata is arbitrary key-value data you attach to each document at ingest time. Design your schema before you start ingesting — it determines how precisely you can scope queries later.

Attaching metadata at ingest

Any JSON object is valid. Use keys that reflect how you want to filter — by department, client, document type, date, or any dimension that matters to your application.

python

import httpx

httpx.post(
    "https://api.usejack.io/v1/ingest",
    headers={"Authorization": "Bearer jack_xxxxxxxx"},
    json={
        "documents": [
            {
                "text": "..contract text..",
                "metadata": {
                    "document_type": "contract",
                    "client_id":     "CLIENT-001",
                    "department":    "legal",
                    "year":          "2024",
                    "signed":        "true"
                }
            }
        ]
    }
)

All metadata values must be strings. If you store numbers or booleans, convert them to strings before ingest. For example, send "year": "2024", not "year": 2024.
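One way to apply the string-only rule is to pass metadata through a small conversion helper before building the ingest request. The sketch below is illustrative: `stringify_metadata` is a hypothetical helper, not part of the API, and rendering booleans as lowercase "true"/"false" is an assumption taken from the "signed": "true" example above.

```python
def stringify_metadata(metadata: dict) -> dict:
    """Return a copy of metadata with every value converted to a string."""
    result = {}
    for key, value in metadata.items():
        if isinstance(value, bool):
            # Check bool before int: in Python, bool is a subclass of int.
            # Lowercase "true"/"false" is an assumed convention.
            result[key] = "true" if value else "false"
        elif isinstance(value, str):
            result[key] = value
        elif isinstance(value, (int, float)):
            result[key] = str(value)
        else:
            # Nested objects, lists, and None have no obvious string form.
            raise TypeError(
                f"Unsupported metadata value for {key!r}: {type(value).__name__}"
            )
    return result


raw = {"document_type": "contract", "year": 2024, "signed": True}
clean = stringify_metadata(raw)
print(clean)  # {'document_type': 'contract', 'year': '2024', 'signed': 'true'}
```

Rejecting unsupported types, rather than calling str() on everything, keeps accidental values such as None or nested dicts from being stored as strings that no filter will ever match.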

Filtering at query time

Pass a filters object to POST /v1/query. Filters use AND logic — all specified keys must match. Documents without the filtered key are excluded from retrieval entirely.

python

# Each dict below is a JSON request body for POST /v1/query.

# Scope to one department
{"question": "What is the leave policy?", "filters": {"department": "HR"}}

# Scope to one document type
{"question": "Find termination clauses", "filters": {"document_type": "contract"}}

# Scope to one client
{"question": "What are the payment terms?", "filters": {"client_id": "CLIENT-001"}}

# Multiple filters — AND logic
{"question": "...", "filters": {"department": "legal", "year": "2024"}}

# No filter — searches across all documents in your org
{"question": "Summarise our refund policy"}
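The AND semantics described above can be modelled locally to check how a filter set will behave before you rely on it. This is only a sketch of the documented behaviour, not the service's implementation: `matches` and the sample documents are hypothetical, and exact, case-sensitive string equality is an assumption.

```python
def matches(metadata: dict, filters: dict) -> bool:
    """True if every filter key is present in metadata with an equal value."""
    return all(
        key in metadata and metadata[key] == value
        for key, value in filters.items()
    )


# Hypothetical documents, used only to illustrate the matching rules.
docs = [
    {"id": "a", "metadata": {"department": "legal", "year": "2024"}},
    {"id": "b", "metadata": {"department": "legal", "year": "2023"}},
    {"id": "c", "metadata": {"department": "HR"}},  # no "year" key
]

both = [d["id"] for d in docs if matches(d["metadata"], {"department": "legal", "year": "2024"})]
by_year = [d["id"] for d in docs if matches(d["metadata"], {"year": "2023"})]
unfiltered = [d["id"] for d in docs if matches(d["metadata"], {})]

print(both)        # ['a']
print(by_year)     # ['b']  (document "c" lacks "year", so it is excluded)
print(unfiltered)  # ['a', 'b', 'c']
```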

Schema design tips

Decide your schema before ingesting

You can't add metadata to documents after ingest without re-ingesting them. Plan your filter dimensions upfront.

Use consistent key names

Inconsistent casing or spelling breaks filtering. "Department" and "department" are different keys.
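A simple guard against key drift is to normalise key names in one place before ingest. The sketch below assumes a lowercase snake_case convention, matching the keys used in the examples on this page. `normalize_key` and `normalize_keys` are hypothetical helpers.

```python
import re


def normalize_key(key: str) -> str:
    """Lower-case a key and collapse runs of spaces or hyphens into underscores."""
    return re.sub(r"[\s\-]+", "_", key.strip().lower())


def normalize_keys(metadata: dict) -> dict:
    """Return a copy of metadata with normalised keys, rejecting collisions."""
    normalized = {}
    for key, value in metadata.items():
        new_key = normalize_key(key)
        if new_key in normalized:
            # e.g. "Department" and "department" supplied together
            raise ValueError(f"Duplicate key after normalisation: {new_key!r}")
        normalized[new_key] = value
    return normalized


fixed = normalize_keys({"Department": "legal", "Document Type": "contract"})
print(fixed)  # {'department': 'legal', 'document_type': 'contract'}
```

Values are left untouched here. Whether to normalise values as well (for example "HR" versus "hr") is a separate schema decision.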

Add a document_type key to everything

This is the most useful filter in practice. Values like "policy", "contract", "handbook", "runbook" let you scope queries precisely.

For multi-tenant applications

Add a "tenant_id" or "client_id" key. This lets you scope all queries to one customer's documents without needing separate jack organisations.
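For tenant scoping to be reliable, every query needs to carry the tenant filter. One option is to build query bodies through a single function that always sets it. The sketch below assumes the key is named "tenant_id". `build_tenant_query` is a hypothetical helper that only constructs the JSON body for POST /v1/query and does not send it.

```python
from typing import Optional


def build_tenant_query(
    question: str, tenant_id: str, extra_filters: Optional[dict] = None
) -> dict:
    """Build a query body that is always scoped to one tenant."""
    if not tenant_id:
        raise ValueError("tenant_id is required")
    filters = dict(extra_filters or {})
    # Set tenant_id last so caller-supplied filters cannot override it.
    filters["tenant_id"] = tenant_id
    return {"question": question, "filters": filters}


payload = build_tenant_query(
    "What are the payment terms?",
    tenant_id="acme-corp",
    extra_filters={"document_type": "contract", "tenant_id": "someone-else"},
)
print(payload["filters"])  # {'document_type': 'contract', 'tenant_id': 'acme-corp'}
```

Building the body in your own backend, with the tenant taken from the authenticated session rather than from user input, keeps one customer from querying another customer's documents.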

Example schema — legal SaaS

json

{
  "document_type": "contract",
  "client_id":     "acme-corp",
  "contract_type": "msa",
  "status":        "active",
  "year":          "2024",
  "jurisdiction":  "england-wales"
}

Example schema — internal knowledge base

json

{
  "document_type": "policy",
  "department":    "engineering",
  "audience":      "all-staff",
  "version":       "2025-Q1",
  "owner":         "people-team"
}
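Once a schema is agreed, a pre-ingest check can catch documents that would later be invisible to filters. The following is a minimal sketch for the internal knowledge base example. The required keys and allowed document_type values are assumptions chosen for illustration, and `validate_metadata` is a hypothetical helper.

```python
# Assumed schema rules for illustration; adjust to your own schema.
REQUIRED_KEYS = {"document_type", "department"}
ALLOWED_DOCUMENT_TYPES = {"policy", "contract", "handbook", "runbook"}


def validate_metadata(metadata: dict) -> list:
    """Return a list of human-readable problems; an empty list means valid."""
    problems = []
    for key in sorted(REQUIRED_KEYS - set(metadata)):
        problems.append(f"missing required key: {key}")
    for key, value in metadata.items():
        if not isinstance(value, str):
            problems.append(f"value for {key} is not a string")
    doc_type = metadata.get("document_type")
    if isinstance(doc_type, str) and doc_type not in ALLOWED_DOCUMENT_TYPES:
        problems.append(f"unknown document_type: {doc_type}")
    return problems


good = {
    "document_type": "policy",
    "department": "engineering",
    "audience": "all-staff",
    "version": "2025-Q1",
    "owner": "people-team",
}
bad = {"document_type": "memo", "year": 2025}

print(validate_metadata(good))  # []
print(validate_metadata(bad))
```

Because metadata cannot be changed after ingest without re-ingesting, running a check like this before each ingest call is cheaper than discovering gaps at query time.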
