
Job Endpoints

Execute scrapes and retrieve results via HTTP.

Create job

Create a new scrape job using a strategy.
POST /api/jobs

Request body

{
  "strategy_id": "550e8400-e29b-41d4-a716-446655440000",
  "url": "https://example.com/products"
}
| Field | Type | Required | Description |
| --- | --- | --- | --- |
| strategy_id | string | Yes | Strategy UUID |
| url | string | Conditional | Single URL to scrape (use url OR urls, not both) |
| urls | array | Conditional | List of URLs to scrape as a batch |
| parameters | object | No | Override API parameters for this job (API strategies only) |
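
The "url OR urls, not both" rule can be checked client-side before sending the request. The following is a minimal sketch, not part of the API: `buildCreateJobBody` and its camelCase option names are hypothetical, and the server remains the authority on what it accepts.

```javascript
// Hypothetical helper (not part of the API): builds the JSON body for POST /api/jobs.
function buildCreateJobBody({ strategyId, url, urls, parameters } = {}) {
  if (!strategyId) {
    throw new Error('strategy_id is required');
  }
  const hasUrl = typeof url === 'string' && url.length > 0;
  const hasUrls = Array.isArray(urls) && urls.length > 0;
  // Exactly one of url / urls must be supplied.
  if (hasUrl === hasUrls) {
    throw new Error('Provide either url or urls, not both');
  }
  const body = { strategy_id: strategyId };
  if (hasUrl) body.url = url;
  if (hasUrls) body.urls = urls;
  // parameters is optional and only applies to API strategies.
  if (parameters !== undefined) body.parameters = parameters;
  return body;
}

const singleBody = buildCreateJobBody({
  strategyId: '550e8400-e29b-41d4-a716-446655440000',
  url: 'https://example.com/products',
});
console.log(JSON.stringify(singleBody, null, 2));
```

The returned object can be passed to `JSON.stringify` and sent as the request body.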

Response (single URL)

{
  "job_id": "660e8400-e29b-41d4-a716-446655440000",
  "status": "pending",
  "strategy_id": "550e8400-e29b-41d4-a716-446655440000",
  "url": "https://example.com/products",
  "created_at": "2025-01-15T10:30:00Z"
}

Response (batch URLs)

When using urls, the response includes a batch_id for tracking:
{
  "batch_id": "770e8400-e29b-41d4-a716-446655440000",
  "job_count": 3,
  "status": "pending",
  "strategy_id": "550e8400-e29b-41d4-a716-446655440000",
  "created_at": "2025-01-15T10:30:00Z"
}
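
Because the two response shapes differ, client code may want to normalize them. A small sketch, assuming only the fields shown in the two example responses above; `trackingInfo` is a hypothetical name.

```javascript
// Hypothetical helper: reports which ID to track from a POST /api/jobs response.
function trackingInfo(response) {
  // Batch responses carry batch_id and job_count instead of job_id.
  if (response.batch_id !== undefined) {
    return { kind: 'batch', id: response.batch_id, jobCount: response.job_count };
  }
  return { kind: 'job', id: response.job_id, jobCount: 1 };
}

console.log(trackingInfo({
  batch_id: '770e8400-e29b-41d4-a716-446655440000',
  job_count: 3,
  status: 'pending',
}));
```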

Example

# Single URL
curl -X POST https://api.meter.sh/api/jobs \
  -H "Authorization: Bearer sk_live_..." \
  -H "Content-Type: application/json" \
  -d '{
    "strategy_id": "550e8400-e29b-41d4-a716-446655440000",
    "url": "https://example.com/products"
  }'

# Batch URLs
curl -X POST https://api.meter.sh/api/jobs \
  -H "Authorization: Bearer sk_live_..." \
  -H "Content-Type: application/json" \
  -d '{
    "strategy_id": "550e8400-e29b-41d4-a716-446655440000",
    "urls": [
      "https://example.com/products/1",
      "https://example.com/products/2"
    ]
  }'

# With API parameters
curl -X POST https://api.meter.sh/api/jobs \
  -H "Authorization: Bearer sk_live_..." \
  -H "Content-Type: application/json" \
  -d '{
    "strategy_id": "550e8400-e29b-41d4-a716-446655440000",
    "url": "https://example.com/api/products",
    "parameters": {"page": 2, "limit": 50}
  }'

Execute job (synchronous)

Create a job and wait for completion. Returns results directly without polling.
POST /api/jobs/execute

Request body

{
  "strategy_id": "550e8400-e29b-41d4-a716-446655440000",
  "url": "https://example.com/products",
  "parameters": {"page": 1, "limit": 50}
}
| Field | Type | Required | Description |
| --- | --- | --- | --- |
| strategy_id | string | Yes | Strategy UUID |
| url | string | Yes | URL to scrape |
| parameters | object | No | Override API parameters (API strategies only) |

Response

{
  "job_id": "660e8400-e29b-41d4-a716-446655440000",
  "strategy_id": "550e8400-e29b-41d4-a716-446655440000",
  "url": "https://example.com/products",
  "status": "completed",
  "results": [
    {"name": "Product A", "price": "$19.99"},
    {"name": "Product B", "price": "$29.99"}
  ],
  "item_count": 2,
  "error": null,
  "started_at": "2025-01-15T10:30:01Z",
  "completed_at": "2025-01-15T10:30:08Z",
  "created_at": "2025-01-15T10:30:00Z"
}
This endpoint blocks until the job completes, up to a 1-hour timeout. Use the async POST /api/jobs endpoint for long-running scrapes or when you don’t need immediate results.

Example

curl -X POST https://api.meter.sh/api/jobs/execute \
  -H "Authorization: Bearer sk_live_..." \
  -H "Content-Type: application/json" \
  -d '{
    "strategy_id": "550e8400-e29b-41d4-a716-446655440000",
    "url": "https://example.com/products"
  }'

Error handling

If the job fails, the response has status set to failed, results set to null, and an error message describing the failure:
{
  "job_id": "660e8400-e29b-41d4-a716-446655440000",
  "status": "failed",
  "results": null,
  "error": "Connection timeout: target site did not respond",
  "completed_at": "2025-01-15T10:30:15Z"
}
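
One way to handle both outcomes in client code is to return results on success and throw on failure. This is a sketch under assumptions: `unwrapExecuteResponse` and `executeJob` are hypothetical names, and `fetchImpl` is an injectable parameter added here so the function can be exercised without network access.

```javascript
// Hypothetical helper: turns an execute response into results or a thrown error.
function unwrapExecuteResponse(job) {
  if (job.status === 'completed') {
    return job.results;
  }
  if (job.status === 'failed') {
    const err = new Error(job.error || 'Job failed without an error message');
    err.jobId = job.job_id; // keep the ID for later inspection via Get job
    throw err;
  }
  throw new Error(`Unexpected job status: ${job.status}`);
}

// Sketch of a caller for POST /api/jobs/execute.
async function executeJob(body, apiKey, fetchImpl = globalThis.fetch) {
  const response = await fetchImpl('https://api.meter.sh/api/jobs/execute', {
    method: 'POST',
    headers: {
      'Authorization': `Bearer ${apiKey}`,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify(body),
  });
  return unwrapExecuteResponse(await response.json());
}
```

A caller would wrap `executeJob(...)` in `try`/`catch` and read `err.message` for the failure reason.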

Get job

Get job status and results.
GET /api/jobs/{job_id}

Response

{
  "job_id": "660e8400-e29b-41d4-a716-446655440000",
  "status": "completed",
  "results": [
    {"name": "Product A", "price": "$19.99"},
    {"name": "Product B", "price": "$29.99"}
  ],
  "item_count": 2,
  "content_hash": "7f3d9a2b4c1e...",
  "structural_signature": {...},
  "started_at": "2025-01-15T10:30:05Z",
  "completed_at": "2025-01-15T10:30:12Z",
  "created_at": "2025-01-15T10:30:00Z"
}
Status values: pending, running, completed, failed

Example

curl https://api.meter.sh/api/jobs/660e8400-e29b-41d4-a716-446655440000 \
  -H "Authorization: Bearer sk_live_..."

List jobs

List jobs with optional filtering.
GET /api/jobs?strategy_id={id}&status={status}&limit=20&offset=0

Query parameters

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| strategy_id | string | No | Filter by strategy UUID |
| status | string | No | Filter by status |
| limit | integer | No | Max results (default: 20) |
| offset | integer | No | Results to skip (default: 0) |

Response

Array of job objects (same format as Get job).

Example

# All jobs
curl https://api.meter.sh/api/jobs \
  -H "Authorization: Bearer sk_live_..."

# Filter by strategy
curl "https://api.meter.sh/api/jobs?strategy_id=550e8400-e29b-41d4-a716-446655440000" \
  -H "Authorization: Bearer sk_live_..."

# Only failed jobs
curl "https://api.meter.sh/api/jobs?status=failed" \
  -H "Authorization: Bearer sk_live_..."
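
When combining filters with limit and offset, building the query string programmatically avoids escaping mistakes. A minimal sketch using the standard `URLSearchParams` class; `buildListJobsUrl` and its option names are hypothetical.

```javascript
// Hypothetical helper: builds the List jobs URL from optional filters.
function buildListJobsUrl({ strategyId, status, limit, offset } = {}) {
  const params = new URLSearchParams();
  // Only include parameters that were actually provided.
  if (strategyId !== undefined) params.set('strategy_id', strategyId);
  if (status !== undefined) params.set('status', status);
  if (limit !== undefined) params.set('limit', String(limit));
  if (offset !== undefined) params.set('offset', String(offset));
  const query = params.toString();
  return 'https://api.meter.sh/api/jobs' + (query ? `?${query}` : '');
}

// Third page of failed jobs, 20 per page (offset = page index * limit).
console.log(buildListJobsUrl({ status: 'failed', limit: 20, offset: 40 }));
// https://api.meter.sh/api/jobs?status=failed&limit=20&offset=40
```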

Compare jobs

Compare two jobs to detect changes.
POST /api/jobs/compare

Request body

{
  "job_id": "660e8400-e29b-41d4-a716-446655440000",
  "other_job_id": "770e8400-e29b-41d4-a716-446655440000"
}

Response

{
  "content_hash_match": false,
  "structural_match": true,
  "semantic_similarity": 0.95,
  "changes": [
    "Item count changed: 10 -> 12",
    "Field 'price' changed in 3 items"
  ]
}
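
The response fields can be reduced to a single label for alerting or logging. The classification below is an assumed reading of the fields, not documented behavior: it treats a content hash match as "no change", and a hash mismatch with matching structure as "same layout, different data". `classifyComparison` is a hypothetical name.

```javascript
// Hypothetical helper: reduces a compare response to one label.
// The interpretation of each field is an assumption, not documented behavior.
function classifyComparison(result) {
  if (result.content_hash_match) return 'unchanged';
  if (result.structural_match) return 'content-changed';
  return 'structure-changed';
}

const comparison = {
  content_hash_match: false,
  structural_match: true,
  semantic_similarity: 0.95,
  changes: ['Item count changed: 10 -> 12', "Field 'price' changed in 3 items"],
};
console.log(classifyComparison(comparison)); // content-changed
```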

Example

curl -X POST https://api.meter.sh/api/jobs/compare \
  -H "Authorization: Bearer sk_live_..." \
  -H "Content-Type: application/json" \
  -d '{
    "job_id": "660e8400-e29b-41d4-a716-446655440000",
    "other_job_id": "770e8400-e29b-41d4-a716-446655440000"
  }'

Compare manifest

Compare a manifest of known items against a job’s scrape results using fuzzy matching. Identifies items that were added, removed, or still present.
POST /api/jobs/{job_id}/compare-manifest

Request body

{
  "manifest": [
    {"name": "Acme Corp"},
    {"name": "Beta Industries"},
    {"name": "Gamma Solutions"}
  ],
  "match_fields": ["name"],
  "threshold": 80
}
| Field | Type | Required | Description |
| --- | --- | --- | --- |
| manifest | array | Yes | List of known items (objects with at least the match_fields keys) |
| match_fields | array | Yes | Field name(s) to fuzzy-match on (e.g., ["name"]) |
| threshold | number | No | Minimum match score 0-100 (default: 80) |

Response

{
  "matched": [
    {
      "manifest_item": {"name": "Acme Corp"},
      "scraped_item": {"name": "Acme Corporation", "website": "acme.com"},
      "score": 90.0,
      "matched_on": "name"
    },
    {
      "manifest_item": {"name": "Gamma Solutions"},
      "scraped_item": {"name": "Gamma Solutions Inc", "website": "gamma.com"},
      "score": 95.0,
      "matched_on": "name"
    }
  ],
  "added": [
    {"name": "Delta Partners", "website": "delta.com"}
  ],
  "removed": [
    {"name": "Beta Industries"}
  ],
  "summary": {
    "matched": 2,
    "added": 1,
    "removed": 1,
    "manifest_count": 3,
    "scraped_count": 3
  },
  "threshold_used": 80.0,
  "match_fields_used": ["name"],
  "job_id": "660e8400-e29b-41d4-a716-446655440000"
}
| Field | Description |
| --- | --- |
| matched | Items found in both manifest and scrape results, with confidence scores |
| added | Items found in scrape results but not in the manifest |
| removed | Items in the manifest but not found in scrape results |
| summary | Count summary of matched, added, removed, and totals |
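
A client can sanity-check a response by comparing the arrays against the summary counts. The sketch below assumes each manifest item matches at most one scraped item, so that matched + removed equals manifest_count and matched + added equals scraped_count; this holds for the example response but is an inference, not a documented guarantee. `checkManifestSummary` is a hypothetical name.

```javascript
// Hypothetical helper: returns a list of inconsistencies found in a
// compare-manifest response (an empty list means the counts line up).
function checkManifestSummary(response) {
  const { matched, added, removed, summary } = response;
  const problems = [];
  if (matched.length !== summary.matched) problems.push('matched count mismatch');
  if (added.length !== summary.added) problems.push('added count mismatch');
  if (removed.length !== summary.removed) problems.push('removed count mismatch');
  // Assumed identities (one-to-one matching), inferred from the example response.
  if (summary.matched + summary.removed !== summary.manifest_count) {
    problems.push('manifest_count does not equal matched + removed');
  }
  if (summary.matched + summary.added !== summary.scraped_count) {
    problems.push('scraped_count does not equal matched + added');
  }
  return problems;
}
```

For the example response above, the function returns an empty array.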

Example

curl -X POST https://api.meter.sh/api/jobs/660e8400-e29b-41d4-a716-446655440000/compare-manifest \
  -H "Authorization: Bearer sk_live_..." \
  -H "Content-Type: application/json" \
  -d '{
    "manifest": [
      {"name": "Acme Corp"},
      {"name": "Beta Industries"}
    ],
    "match_fields": ["name"],
    "threshold": 80
  }'
Use POST /api/strategies/{strategy_id}/compare-manifest instead if you want to automatically compare against the latest results without specifying a job ID. See Strategy Endpoints.

Get strategy history

Get timeline of all jobs for a strategy.
GET /api/strategies/{strategy_id}/history

Response

[
  {
    "job_id": "660e8400-e29b-41d4-a716-446655440000",
    "status": "completed",
    "item_count": 12,
    "has_changes": true,
    "created_at": "2025-01-15T10:30:00Z"
  },
  {
    "job_id": "770e8400-e29b-41d4-a716-446655440000",
    "status": "completed",
    "item_count": 10,
    "has_changes": false,
    "created_at": "2025-01-15T09:30:00Z"
  }
]

Example

curl https://api.meter.sh/api/strategies/550e8400-e29b-41d4-a716-446655440000/history \
  -H "Authorization: Bearer sk_live_..."
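
The history can feed the Compare jobs endpoint: find the most recent job that reports changes and pair it with the job before it. This is a sketch under two assumptions: that has_changes is relative to the preceding job in the timeline, and that only completed jobs are worth comparing. `compareBodyForLatestChange` is a hypothetical name.

```javascript
// Hypothetical helper: builds a POST /api/jobs/compare body from a history array.
// Returns null when no changed job with a predecessor exists.
function compareBodyForLatestChange(history) {
  // Keep completed jobs and sort newest first without mutating the input.
  const sorted = history
    .filter(job => job.status === 'completed')
    .sort((a, b) => Date.parse(b.created_at) - Date.parse(a.created_at));
  const index = sorted.findIndex(job => job.has_changes);
  if (index === -1 || index === sorted.length - 1) return null;
  return {
    job_id: sorted[index].job_id,
    other_job_id: sorted[index + 1].job_id,
  };
}
```

Applied to the example history above, this yields the same pair of IDs used in the Compare jobs request example.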

Polling for completion

If you want synchronous behavior without polling, use POST /api/jobs/execute instead. Otherwise, because jobs created with POST /api/jobs run asynchronously, poll the Get job endpoint until status is completed or failed:
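// timeoutMs is an illustrative safeguard so the loop cannot run forever
async function waitForJob(jobId, timeoutMs = 60 * 60 * 1000) {
  const deadline = Date.now() + timeoutMs;

  while (Date.now() < deadline) {
    const response = await fetch(`https://api.meter.sh/api/jobs/${jobId}`, {
      headers: {
        'Authorization': `Bearer ${process.env.METER_API_KEY}`
      }
    });

    // Surface HTTP errors (401, 404, 5xx) instead of parsing them as a job
    if (!response.ok) {
      throw new Error(`Get job failed with HTTP ${response.status}`);
    }

    const job = await response.json();

    if (job.status === 'completed') {
      return job.results;
    } else if (job.status === 'failed') {
      throw new Error(job.error);
    }

    // Still pending or running: wait 2 seconds before the next check
    await new Promise(resolve => setTimeout(resolve, 2000));
  }

  throw new Error(`Job ${jobId} did not finish within ${timeoutMs} ms`);
}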
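
For jobs that may run for a long time, a fixed interval can be replaced with a capped exponential backoff to reduce request volume. This is an optional variation, not something the API requires; `pollDelayMs`, the 2-second base, and the 30-second cap are illustrative choices.

```javascript
// Hypothetical helper: delay before poll number `attempt` (0-based),
// doubling from baseMs and capped at maxMs.
function pollDelayMs(attempt, baseMs = 2000, maxMs = 30000) {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

console.log([0, 1, 2, 3, 4].map(n => pollDelayMs(n)));
// [ 2000, 4000, 8000, 16000, 30000 ]
```

In a polling loop, the value would replace the fixed delay passed to `setTimeout`, with `attempt` incremented on each iteration.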

Error responses

| Status | Description |
| --- | --- |
| 400 | Invalid request (missing strategy_id or url) |
| 401 | Invalid or missing API key |
| 404 | Job or strategy not found |
| 500 | Internal server error |
| 503 | Service temporarily unavailable |
See REST API Errors for detailed error handling.
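
Client code often needs to decide whether a failed request is worth retrying. The sketch below maps the status codes in the table to their descriptions and marks 500 and 503 as retryable; that retry policy is an assumption made here, not a documented recommendation, and `describeError` is a hypothetical name.

```javascript
// Descriptions copied from the status table above.
const STATUS_MEANINGS = {
  400: 'Invalid request (missing strategy_id or url)',
  401: 'Invalid or missing API key',
  404: 'Job or strategy not found',
  500: 'Internal server error',
  503: 'Service temporarily unavailable',
};

// Hypothetical helper: summarizes an HTTP error status for logging and retry logic.
function describeError(status) {
  return {
    status,
    message: STATUS_MEANINGS[status] || 'Unexpected status',
    // Assumed policy: only server-side failures are retried.
    retryable: status === 500 || status === 503,
  };
}

console.log(describeError(503));
```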

Next steps

Schedule Endpoints: Automate job execution

Python SDK: Use the Python SDK with built-in polling

Jobs Concept: Learn about job lifecycle

Need help?

Email me at mckinnon@meter.sh