Ingest

jack provides three endpoints for ingesting documents: raw text, file upload, and URL. All three are asynchronous and return document IDs that you use to track processing status.

POST /v1/ingest

The primary ingest endpoint. Send extracted text with metadata. Accepts up to 50 documents per request. Use this when you extract text yourself from your own storage or database.
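Because a single request accepts at most 50 documents, larger collections have to be split client-side. The following is a minimal sketch of one way to do that; `batch_documents` and `MAX_DOCS_PER_REQUEST` are hypothetical helper names, not part of the jack API.

```python
from typing import Iterator, List

MAX_DOCS_PER_REQUEST = 50  # per-request limit stated for POST /v1/ingest


def batch_documents(documents: List[dict], size: int = MAX_DOCS_PER_REQUEST) -> Iterator[List[dict]]:
    """Yield consecutive slices of at most `size` documents (hypothetical helper)."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(documents), size):
        yield documents[start:start + size]


# Example: 120 documents become three request bodies of 50, 50, and 20.
docs = [{"text": f"Document {i}", "metadata": {"source": "export"}} for i in range(120)]
payloads = [{"documents": batch} for batch in batch_documents(docs)]
print([len(p["documents"]) for p in payloads])  # [50, 50, 20]
```

Each element of `payloads` can then be sent as the JSON body of its own request.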

Request body

| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| documents | array | Yes | - | List of documents to ingest |
| documents[].text | string | Yes | - | The document text content |
| documents[].metadata | object | No | {} | Arbitrary key-value metadata (string values only) |
| chunk_size | integer | No | 512 | Token size per chunk (128–4096) |
| chunk_overlap | integer | No | 64 | Overlap tokens between chunks (0–512) |
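The constraints in the table above can be checked before a request is sent. This is a rough sketch under the documented limits only; `build_ingest_payload` is a hypothetical helper, and the server may enforce rules that are not listed here.

```python
def build_ingest_payload(documents, chunk_size=512, chunk_overlap=64):
    """Validate against the documented limits and return a JSON-ready dict.

    Hypothetical client-side helper; only the limits stated in the table
    (and the 50-document cap) are checked.
    """
    if len(documents) > 50:
        raise ValueError("at most 50 documents are accepted per request")
    if not 128 <= chunk_size <= 4096:
        raise ValueError("chunk_size must be between 128 and 4096")
    if not 0 <= chunk_overlap <= 512:
        raise ValueError("chunk_overlap must be between 0 and 512")
    for index, doc in enumerate(documents):
        if not isinstance(doc.get("text"), str):
            raise ValueError(f"documents[{index}].text must be a string")
        # Metadata is optional, but every value must be a string.
        for key, value in doc.get("metadata", {}).items():
            if not isinstance(value, str):
                raise ValueError(f"documents[{index}].metadata[{key!r}] must be a string")
    return {
        "documents": documents,
        "chunk_size": chunk_size,
        "chunk_overlap": chunk_overlap,
    }


payload = build_ingest_payload(
    [{"text": "Section 4.2", "metadata": {"department": "HR"}}],
    chunk_size=1024,
    chunk_overlap=128,
)
print(payload["chunk_size"], payload["chunk_overlap"])  # 1024 128
```

Failing early on the client avoids queuing a batch that would be rejected for an out-of-range chunk setting or a non-string metadata value.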

Request

python
import httpx

response = httpx.post(
    "https://api.usejack.io/v1/ingest",
    headers={"Authorization": "Bearer jack_xxxxxxxx"},
    json={
        "documents": [
            {
                "text": "Section 4.2 — Remote Work Policy. Employees are permitted to work remotely...",
                "metadata": {
                    "document_type": "policy",
                    "department": "HR",
                    "version": "2024-Q1"
                }
            },
            {
                "text": "Section 7.1 — Annual Leave. All full-time employees are entitled to 21 days...",
                "metadata": {
                    "document_type": "policy",
                    "department": "HR",
                    "version": "2024-Q1"
                }
            }
        ]
    }
)

print(response.json())

bash
curl -X POST https://api.usejack.io/v1/ingest \
  -H "Authorization: Bearer jack_xxxxxxxx" \
  -H "Content-Type: application/json" \
  -d '{
    "documents": [
      {"text": "...", "metadata": {"document_type": "policy", "department": "HR"}}
    ]
  }'

Response — 202 Accepted

json
{
  "status": "accepted",
  "org_id": "your-org-id",
  "queued": 2,
  "document_ids": [
    "f6c5ea1b-4d9b-4fbf-bb18-b10b97680340",
    "a1b2c3d4-5e6f-7890-abcd-ef1234567890"
  ]
}

Ingest is asynchronous. Save your document_ids — use them to poll status via GET /v1/documents/{id} and delete documents later.
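Polling for completion might look like the sketch below. The response shape of GET /v1/documents/{id} is not described in this section, so the status lookup is passed in as a callable, and the status names (`"processing"`, `"completed"`, `"failed"`), the interval, and the timeout are all assumptions made for illustration.

```python
import time
from typing import Callable


def wait_for_document(
    document_id: str,
    fetch_status: Callable[[str], str],
    terminal_states=frozenset({"completed", "failed"}),  # assumed status names
    interval: float = 2.0,   # assumed polling interval, in seconds
    timeout: float = 300.0,  # assumed overall deadline, in seconds
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Call `fetch_status` until it returns a terminal state or the deadline passes."""
    deadline = clock() + timeout
    while True:
        status = fetch_status(document_id)
        if status in terminal_states:
            return status
        if clock() >= deadline:
            raise TimeoutError(f"document {document_id} still {status!r} after {timeout}s")
        sleep(interval)


# Stubbed status source so the sketch runs without network access.
_responses = iter(["processing", "processing", "completed"])
result = wait_for_document(
    "f6c5ea1b-4d9b-4fbf-bb18-b10b97680340",
    fetch_status=lambda _id: next(_responses),
    sleep=lambda _seconds: None,  # skip real waiting in the example
)
print(result)  # completed
```

In real use, `fetch_status` would issue the GET request for the given ID and return whichever field of the response carries the processing state.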

POST /v1/ingest/file

Upload a file directly as multipart form data. jack extracts the text internally. Use this when you have files on disk or in memory rather than pre-extracted text.

Supported formats: .pdf .docx .txt .md · Max size: 100MB
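A pre-flight check against the format and size limits above can catch bad uploads before any bytes are sent. This is a sketch only: `check_upload` is a hypothetical helper, reading "100MB" as 100 × 1024 × 1024 bytes is an assumption, and so is the case-insensitive extension match.

```python
from pathlib import Path

SUPPORTED_EXTENSIONS = {".pdf", ".docx", ".txt", ".md"}
MAX_FILE_BYTES = 100 * 1024 * 1024  # assumption: "100MB" interpreted as 100 MiB


def check_upload(filename: str, size_bytes: int) -> None:
    """Raise ValueError if the file would violate the documented limits."""
    extension = Path(filename).suffix.lower()  # assumption: match is case-insensitive
    if extension not in SUPPORTED_EXTENSIONS:
        raise ValueError(f"unsupported format {extension!r}")
    if size_bytes > MAX_FILE_BYTES:
        raise ValueError(f"file is {size_bytes} bytes; limit is {MAX_FILE_BYTES}")


check_upload("employee-handbook.pdf", 2_500_000)  # within limits, returns None

try:
    check_upload("salaries.xlsx", 10_000)
except ValueError as error:
    print(error)  # unsupported format '.xlsx'
```

For a file on disk, `Path(filename).stat().st_size` gives the byte count to pass as `size_bytes`.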

Request

python
import httpx

with open("employee-handbook.pdf", "rb") as f:
    response = httpx.post(
        "https://api.usejack.io/v1/ingest/file",
        headers={"Authorization": "Bearer jack_xxxxxxxx"},
        files={"file": ("employee-handbook.pdf", f, "application/pdf")},
        data={
            "metadata": '{"document_type": "handbook", "department": "HR"}'
        }
    )

print(response.json()["document_ids"])

bash
curl -X POST https://api.usejack.io/v1/ingest/file \
  -H "Authorization: Bearer jack_xxxxxxxx" \
  -F "file=@employee-handbook.pdf" \
  -F 'metadata={"document_type": "handbook", "department": "HR"}'

Response — 202 Accepted

json
{
  "status": "accepted",
  "org_id": "your-org-id",
  "queued": 1,
  "document_ids": ["f6c5ea1b-4d9b-4fbf-bb18-b10b97680340"]
}

POST /v1/ingest/url

Pass a URL — jack downloads and indexes it server-side. Works with pre-signed S3 URLs, MinIO, Cloudflare R2, or any publicly reachable URL.

The URL must be reachable by jack's servers. Pre-signed URLs must remain valid for at least 60 seconds.
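For S3-style pre-signed URLs, the remaining validity can be estimated from the standard SigV4 query parameters `X-Amz-Date` and `X-Amz-Expires` before the URL is submitted. The sketch below assumes the URL was signed that way; `presigned_seconds_remaining` is a hypothetical helper, and the signature value in the example is a placeholder.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional
from urllib.parse import parse_qs, urlsplit

MIN_VALIDITY_SECONDS = 60  # minimum validity stated above


def presigned_seconds_remaining(url: str, now: Optional[datetime] = None) -> Optional[float]:
    """Return seconds until a SigV4 pre-signed URL expires.

    Returns None when the URL has no X-Amz-Date / X-Amz-Expires parameters,
    for example a plain public URL.
    """
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https"):
        raise ValueError("url must use http or https")
    query = parse_qs(parts.query)
    if "X-Amz-Date" not in query or "X-Amz-Expires" not in query:
        return None
    # X-Amz-Date is a UTC timestamp such as 20240115T120000Z.
    signed_at = datetime.strptime(query["X-Amz-Date"][0], "%Y%m%dT%H%M%SZ")
    signed_at = signed_at.replace(tzinfo=timezone.utc)
    expires_at = signed_at + timedelta(seconds=int(query["X-Amz-Expires"][0]))
    current = now or datetime.now(timezone.utc)
    return (expires_at - current).total_seconds()


url = (
    "https://your-bucket.s3.amazonaws.com/contracts/msa-001.pdf"
    "?X-Amz-Date=20240115T120000Z&X-Amz-Expires=900&X-Amz-Signature=placeholder"
)
remaining = presigned_seconds_remaining(url, now=datetime(2024, 1, 15, 12, 10, tzinfo=timezone.utc))
print(remaining)  # 300.0
print(remaining >= MIN_VALIDITY_SECONDS)  # True
```

If the result is below `MIN_VALIDITY_SECONDS`, generating a fresh pre-signed URL before calling the endpoint is the safer choice.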

Request body

| Field | Type | Required | Description |
|---|---|---|---|
| url | string | Yes | HTTP/HTTPS URL pointing to a supported file |
| metadata | object | No | Arbitrary key-value metadata (string values only) |

Request

python
import httpx

response = httpx.post(
    "https://api.usejack.io/v1/ingest/url",
    headers={"Authorization": "Bearer jack_xxxxxxxx"},
    json={
        "url": "https://your-bucket.s3.amazonaws.com/contracts/msa-001.pdf?X-Amz-Signature=...",
        "metadata": {
            "document_type": "contract",
            "client_id": "CLIENT-001"
        }
    }
)

print(response.json()["document_ids"])

bash
curl -X POST https://api.usejack.io/v1/ingest/url \
  -H "Authorization: Bearer jack_xxxxxxxx" \
  -H "Content-Type: application/json" \
  -d '{"url": "https://...", "metadata": {"document_type": "contract"}}'

Response — 202 Accepted

json
{
  "status": "accepted",
  "org_id": "your-org-id",
  "queued": 1,
  "document_ids": ["f6c5ea1b-4d9b-4fbf-bb18-b10b97680340"]
}