Workflow Endpoints

Create and manage multi-step scraping pipelines via HTTP.

Create workflow

Create a new workflow with nodes and edges.
POST /api/workflows

Request body

{
  "name": "Job Scraper",
  "description": "Scrape job listings then detail pages",
  "nodes": [
    {
      "node_key": "index",
      "strategy_id": "550e8400-e29b-41d4-a716-446655440000",
      "input_type": "static_urls",
      "static_urls": ["https://jobs.com/listings"]
    },
    {
      "node_key": "details",
      "strategy_id": "660e8400-e29b-41d4-a716-446655440000",
      "input_type": "upstream_urls",
      "url_field": "job_url"
    }
  ],
  "edges": [
    {
      "source_node_key": "index",
      "target_node_key": "details"
    }
  ]
}

Node fields

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| node_key | string | Yes | Unique identifier within the workflow |
| strategy_id | UUID | Yes | Strategy to use for scraping |
| input_type | string | Yes | static_urls, upstream_urls, or upstream_data |
| static_urls | array | Conditional | URLs for static_urls input type |
| url_field | string | Conditional | Field name for upstream_urls input type |
| static_parameters | object | No | API parameter overrides |
| parameter_config | object | No | Map upstream fields to strategy parameters |
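
The conditional rules in the table can be checked client-side before a request is sent. The sketch below assumes that "Conditional" means static_urls is needed only when input_type is static_urls, and url_field only when input_type is upstream_urls. validate_node is a hypothetical helper, not part of the API or the SDK.

```python
VALID_INPUT_TYPES = {"static_urls", "upstream_urls", "upstream_data"}


def validate_node(node: dict) -> list[str]:
    """Return a list of problems found in a node definition (empty if none)."""
    problems = []
    for field in ("node_key", "strategy_id", "input_type"):
        if not node.get(field):
            problems.append(f"missing required field: {field}")

    input_type = node.get("input_type")
    if input_type and input_type not in VALID_INPUT_TYPES:
        problems.append(f"unknown input_type: {input_type}")

    # Assumed reading of "Conditional": each field is needed only for its input type.
    if input_type == "static_urls" and not node.get("static_urls"):
        problems.append("static_urls is required when input_type is static_urls")
    if input_type == "upstream_urls" and not node.get("url_field"):
        problems.append("url_field is required when input_type is upstream_urls")
    return problems
```

An empty list means the node passed these local checks; the server may still apply rules that are not documented here.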

Edge fields

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| source_node_key | string | Yes | Source node identifier |
| target_node_key | string | Yes | Target node identifier |
| filter_config | object | No | Filter configuration (see below) |

Filter config

{
  "mode": "all",
  "conditions": [
    {"field": "category", "operator": "contains", "value": "tech", "case_sensitive": false}
  ]
}

| Operator | Description |
| --- | --- |
| contains | Field contains substring |
| not_contains | Field does not contain substring |
| equals | Exact match |
| not_equals | Not exact match |
| regex_match | Regex pattern match |
| exists | Field exists and is non-empty |
| not_exists | Field is missing or empty |
| gt | Greater than |
| lt | Less than |

Use mode: "all" for AND logic, mode: "any" for OR logic.
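
To make the operator and mode semantics concrete, here is a rough local model of how a filter could be evaluated against one upstream item. It illustrates the table above and is not the server's implementation. Several details are assumptions: gt and lt compare numerically, case_sensitive defaults to false when omitted, regex_match uses Python's re syntax, and an empty condition list lets everything through. passes_filter is a hypothetical name.

```python
import re


def _check(condition: dict, item: dict) -> bool:
    """Evaluate one condition against one upstream item."""
    op = condition["operator"]
    actual = item.get(condition["field"])
    expected = condition.get("value")
    present = actual not in (None, "", [], {})

    if op == "exists":
        return present
    if op == "not_exists":
        return not present
    if op in ("gt", "lt"):
        # Assumption: numeric comparison; non-numeric values never match.
        try:
            left, right = float(actual), float(expected)
        except (TypeError, ValueError):
            return False
        return left > right if op == "gt" else left < right

    text = str(actual) if present else ""
    target = str(expected)
    case_sensitive = condition.get("case_sensitive", False)  # assumed default
    if op == "regex_match":
        flags = 0 if case_sensitive else re.IGNORECASE
        return re.search(target, text, flags) is not None
    if not case_sensitive:
        text, target = text.lower(), target.lower()
    if op == "contains":
        return target in text
    if op == "not_contains":
        return target not in text
    if op == "equals":
        return text == target
    if op == "not_equals":
        return text != target
    raise ValueError(f"unknown operator: {op}")


def passes_filter(filter_config: dict, item: dict) -> bool:
    """Combine condition results with AND (mode "all") or OR (mode "any")."""
    results = [_check(c, item) for c in filter_config.get("conditions", [])]
    if not results:
        return True  # assumption: a filter with no conditions passes everything
    return all(results) if filter_config.get("mode", "all") == "all" else any(results)
```

With the filter shown above, an item whose category is "FinTech" passes and one whose category is "retail" does not.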

Response

{
  "id": "990e8400-e29b-41d4-a716-446655440000",
  "name": "Job Scraper",
  "description": "Scrape job listings then detail pages",
  "nodes": [...],
  "edges": [...],
  "created_at": "2025-01-15T10:30:00Z",
  "updated_at": "2025-01-15T10:30:00Z"
}

Example

curl -X POST https://api.meter.sh/api/workflows \
  -H "Authorization: Bearer sk_live_..." \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Job Scraper",
    "nodes": [
      {
        "node_key": "index",
        "strategy_id": "550e8400-e29b-41d4-a716-446655440000",
        "input_type": "static_urls",
        "static_urls": ["https://jobs.com/listings"]
      },
      {
        "node_key": "details",
        "strategy_id": "660e8400-e29b-41d4-a716-446655440000",
        "input_type": "upstream_urls",
        "url_field": "job_url"
      }
    ],
    "edges": [
      {"source_node_key": "index", "target_node_key": "details"}
    ]
  }'
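
The same request can be assembled in Python with only the standard library. This sketch builds the request object without sending it, so it can be inspected first. build_create_workflow_request is a hypothetical helper, and the API key is the same placeholder used in the curl example.

```python
import json
import urllib.request

API_BASE = "https://api.meter.sh"


def build_create_workflow_request(api_key: str, workflow: dict) -> urllib.request.Request:
    """Build (but do not send) the POST /api/workflows request."""
    return urllib.request.Request(
        url=f"{API_BASE}/api/workflows",
        data=json.dumps(workflow).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


workflow = {
    "name": "Job Scraper",
    "nodes": [
        {
            "node_key": "index",
            "strategy_id": "550e8400-e29b-41d4-a716-446655440000",
            "input_type": "static_urls",
            "static_urls": ["https://jobs.com/listings"],
        },
        {
            "node_key": "details",
            "strategy_id": "660e8400-e29b-41d4-a716-446655440000",
            "input_type": "upstream_urls",
            "url_field": "job_url",
        },
    ],
    "edges": [{"source_node_key": "index", "target_node_key": "details"}],
}

request = build_create_workflow_request("sk_live_...", workflow)
# To send it:
# with urllib.request.urlopen(request) as response:
#     created = json.load(response)
```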

Get workflow

GET /api/workflows/{workflow_id}
Returns workflow details including nodes and edges.

List workflows

GET /api/workflows?limit=50&offset=0

Query parameters

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| limit | integer | No | Max results (default: 50) |
| offset | integer | No | Results to skip (default: 0) |
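
One way to walk every page with limit and offset is sketched below. The response shape of the list endpoint is not documented in this section, so the sketch assumes each page parses to a plain list of workflows. fetch_page is a hypothetical callable you supply that performs GET /api/workflows?limit=...&offset=... and returns that list.

```python
from typing import Callable, Iterator


def iter_pages(fetch_page: Callable[[int, int], list], limit: int = 50) -> Iterator[dict]:
    """Yield every item by advancing offset until a short page is returned."""
    offset = 0
    while True:
        page = fetch_page(limit, offset)
        yield from page
        if len(page) < limit:
            # A page shorter than limit means there is nothing left to fetch.
            break
        offset += limit
```

If the total happens to be an exact multiple of limit, the loop makes one extra request that returns an empty page and then stops.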

Update workflow

Update workflow metadata.
PUT /api/workflows/{workflow_id}

Request body

{
  "name": "Updated Name",
  "description": "Updated description"
}

Delete workflow

Delete a workflow and all associated runs and schedules.
DELETE /api/workflows/{workflow_id}

Add node

Add a node to an existing workflow.
POST /api/workflows/{workflow_id}/nodes

Request body

{
  "node_key": "new_node",
  "strategy_id": "550e8400-e29b-41d4-a716-446655440000",
  "input_type": "upstream_urls",
  "url_field": "link"
}

Update node

PUT /api/workflows/{workflow_id}/nodes/{node_id}

Delete node

DELETE /api/workflows/{workflow_id}/nodes/{node_id}

Add edge

Connect two nodes.
POST /api/workflows/{workflow_id}/edges

Request body

{
  "source_node_key": "index",
  "target_node_key": "details",
  "filter_config": {
    "mode": "all",
    "conditions": [
      {"field": "category", "operator": "contains", "value": "tech"}
    ]
  }
}
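
Because the API rejects an invalid DAG structure with a 400, it can be useful to check locally that a new edge would not introduce a cycle. The sketch below uses Kahn's algorithm. The server's exact validation rules are not documented here, so treat this as an approximation. is_dag is a hypothetical helper.

```python
from collections import defaultdict, deque


def is_dag(node_keys: list[str], edges: list[dict]) -> bool:
    """Kahn's algorithm: True if the edges form no cycle over the given nodes."""
    indegree = {key: 0 for key in node_keys}
    children = defaultdict(list)
    for edge in edges:
        source, target = edge["source_node_key"], edge["target_node_key"]
        if source not in indegree or target not in indegree:
            return False  # edge references an unknown node
        children[source].append(target)
        indegree[target] += 1

    # Repeatedly remove nodes with no remaining incoming edges.
    ready = deque(key for key, degree in indegree.items() if degree == 0)
    visited = 0
    while ready:
        key = ready.popleft()
        visited += 1
        for child in children[key]:
            indegree[child] -= 1
            if indegree[child] == 0:
                ready.append(child)
    # Nodes left unvisited are part of a cycle.
    return visited == len(indegree)
```

Call it with the workflow's current node keys and its existing edges plus the candidate edge before posting.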

Delete edge

DELETE /api/workflows/{workflow_id}/edges/{edge_id}

Run workflow

Trigger a manual workflow run.
POST /api/workflows/{workflow_id}/run

Request body

{
  "force": false
}

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| force | boolean | No | Skip change detection and re-run all nodes (default: false) |

Response

{
  "id": "aa0e8400-e29b-41d4-a716-446655440000",
  "workflow_id": "990e8400-e29b-41d4-a716-446655440000",
  "status": "pending",
  "trigger": "manual",
  "created_at": "2025-01-15T10:30:00Z"
}

Example

curl -X POST https://api.meter.sh/api/workflows/990e8400-e29b-41d4-a716-446655440000/run \
  -H "Authorization: Bearer sk_live_..." \
  -H "Content-Type: application/json" \
  -d '{"force": false}'

Get workflow run

Get details of a specific run including node execution results.
GET /api/workflows/{workflow_id}/runs/{run_id}

Response

{
  "id": "aa0e8400-e29b-41d4-a716-446655440000",
  "workflow_id": "990e8400-e29b-41d4-a716-446655440000",
  "status": "completed",
  "trigger": "manual",
  "node_executions": [
    {
      "node_key": "index",
      "status": "completed",
      "job_id": "bb0e8400-e29b-41d4-a716-446655440000",
      "item_count": 25,
      "started_at": "2025-01-15T10:30:01Z",
      "completed_at": "2025-01-15T10:30:05Z"
    },
    {
      "node_key": "details",
      "status": "completed",
      "job_id": "cc0e8400-e29b-41d4-a716-446655440000",
      "item_count": 25,
      "started_at": "2025-01-15T10:30:06Z",
      "completed_at": "2025-01-15T10:30:30Z"
    }
  ],
  "created_at": "2025-01-15T10:30:00Z",
  "completed_at": "2025-01-15T10:30:30Z"
}

The run response includes node_executions with an item_count per node, but it does not include the scraped results inline. To fetch full results, use the output endpoint (GET /api/workflows/{workflow_id}/runs/latest/output).

Run status values: pending, running, completed, failed, partial, cancelled
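
The node_executions array carries enough information to report how long each node took. A small sketch, assuming the timestamps are always UTC in the format shown above and that an unfinished node may lack completed_at. summarize_run is a hypothetical helper.

```python
from datetime import datetime
from typing import Optional


def parse_ts(value: str) -> datetime:
    """Parse timestamps such as 2025-01-15T10:30:00Z into aware datetimes."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def summarize_run(run: dict) -> list[tuple[str, str, int, Optional[float]]]:
    """Return (node_key, status, item_count, seconds) for each node execution."""
    rows = []
    for node in run.get("node_executions", []):
        started, completed = node.get("started_at"), node.get("completed_at")
        seconds = None
        if started and completed:
            seconds = (parse_ts(completed) - parse_ts(started)).total_seconds()
        rows.append((node["node_key"], node["status"], node.get("item_count", 0), seconds))
    return rows
```

For the sample response above, this reports 4 seconds for the index node and 24 seconds for the details node.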

List workflow runs

GET /api/workflows/{workflow_id}/runs?limit=20&offset=0

Get latest workflow output

Get the most recent completed run’s results. By default, results are grouped by URL and then by strategy label.
GET /api/workflows/{workflow_id}/runs/latest/output

Query parameters

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| flat | boolean | No | Return flat per-URL results instead of grouped by strategy (default: false) |
| include_intermediate | boolean | No | Include outputs from all nodes, not just leaf nodes (default: false) |

Response (default — grouped by strategy)

{
  "workflow_id": "990e8400-e29b-41d4-a716-446655440000",
  "run_id": "aa0e8400-e29b-41d4-a716-446655440000",
  "status": "completed",
  "final_results_by_url_grouped": {
    "https://jobs.com/listings/software-engineer": {
      "default": [{"title": "Software Engineer", "salary": "$120k"}],
      "benefits": [{"health": "Yes", "dental": "Yes", "401k": "Yes"}]
    },
    "https://jobs.com/listings/product-manager": {
      "default": [{"title": "Product Manager", "salary": "$130k"}]
    }
  },
  "changed_since_previous": false,
  "completed_at": "2025-01-15T10:30:30Z"
}

Response with ?flat=true

{
  "workflow_id": "990e8400-e29b-41d4-a716-446655440000",
  "run_id": "aa0e8400-e29b-41d4-a716-446655440000",
  "status": "completed",
  "final_results_by_url": {
    "https://jobs.com/listings/software-engineer": [
      {"title": "Software Engineer", "salary": "$120k"},
      {"health": "Yes", "dental": "Yes", "401k": "Yes"}
    ],
    "https://jobs.com/listings/product-manager": [
      {"title": "Product Manager", "salary": "$130k"}
    ]
  },
  "changed_since_previous": false,
  "completed_at": "2025-01-15T10:30:30Z"
}
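
Comparing the two sample responses, the flat form looks like the grouped form with the strategy labels dropped and each URL's items concatenated in label order. Assuming that reading holds, a grouped response can be flattened client-side as sketched below. flatten_grouped is a hypothetical helper.

```python
def flatten_grouped(grouped: dict) -> dict:
    """Collapse {url: {label: [items]}} into {url: [items]}, keeping label order."""
    return {
        url: [item for items in by_label.values() for item in items]
        for url, by_label in grouped.items()
    }


grouped = {
    "https://jobs.com/listings/software-engineer": {
        "default": [{"title": "Software Engineer", "salary": "$120k"}],
        "benefits": [{"health": "Yes", "dental": "Yes", "401k": "Yes"}],
    },
    "https://jobs.com/listings/product-manager": {
        "default": [{"title": "Product Manager", "salary": "$130k"}],
    },
}
flat = flatten_grouped(grouped)
```

Here flat matches the final_results_by_url object from the ?flat=true sample.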

Response with ?include_intermediate=true

When include_intermediate is set, the response includes a node_outputs field with results from all nodes (not just leaf nodes):
{
  "node_outputs": {
    "index": [{"job_url": "https://jobs.com/listings/software-engineer", "title": "Software Engineer"}],
    "details": [{"title": "Software Engineer", "salary": "$120k"}]
  }
}

Cancel workflow run

Cancel a running workflow.
POST /api/workflows/{workflow_id}/runs/{run_id}/cancel

Workflow schedules

Create workflow schedule

POST /api/workflows/{workflow_id}/schedules

Request body

{
  "interval_seconds": 3600,
  "webhook_url": "https://your-app.com/webhook",
  "webhook_metadata": {"project": "my-project"},
  "webhook_secret": "whsec_your_secret_here",
  "webhook_type": "standard"
}

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| interval_seconds | integer | Conditional | Run every N seconds |
| cron_expression | string | Conditional | Cron expression |
| webhook_url | string | No | Webhook URL for results |
| webhook_metadata | object | No | Custom JSON metadata for webhook payloads |
| webhook_secret | string | No | Secret for X-Webhook-Secret header |
| webhook_type | string | No | standard or slack (default: standard) |

Provide either interval_seconds or cron_expression, not both.
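
The either/or rule is easy to enforce before the request goes out. The sketch below assumes that exactly one of the two timing fields must be present, which is one reading of "Conditional" together with "not both". build_schedule_payload is a hypothetical helper, and the five-field cron string in the usage note is only an illustration because the accepted cron syntax is not specified here.

```python
from typing import Optional

WEBHOOK_FIELDS = {"webhook_url", "webhook_metadata", "webhook_secret", "webhook_type"}


def build_schedule_payload(
    interval_seconds: Optional[int] = None,
    cron_expression: Optional[str] = None,
    **webhook_fields,
) -> dict:
    """Build a schedule body with exactly one of interval_seconds or cron_expression."""
    if (interval_seconds is None) == (cron_expression is None):
        raise ValueError("provide exactly one of interval_seconds or cron_expression")
    unknown = set(webhook_fields) - WEBHOOK_FIELDS
    if unknown:
        raise ValueError(f"unknown fields: {sorted(unknown)}")

    payload = dict(webhook_fields)
    if interval_seconds is not None:
        payload["interval_seconds"] = interval_seconds
    else:
        payload["cron_expression"] = cron_expression
    return payload
```

For example, build_schedule_payload(cron_expression="0 * * * *") returns a body containing only cron_expression, while passing both timing fields raises ValueError.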

List workflow schedules

GET /api/workflows/{workflow_id}/schedules

Update workflow schedule

PATCH /api/workflows/{workflow_id}/schedules/{schedule_id}
All fields are optional; include only the fields you want to change:
{
  "enabled": false,
  "interval_seconds": 7200
}

Delete workflow schedule

DELETE /api/workflows/{workflow_id}/schedules/{schedule_id}

Polling for run completion

Since workflow runs are asynchronous, poll the run endpoint until status is no longer pending or running. A finished run reports completed, failed, partial, or cancelled:
# Set this to the ID of your workflow
WORKFLOW_ID="990e8400-e29b-41d4-a716-446655440000"

# Start a run
RUN_ID=$(curl -s -X POST "https://api.meter.sh/api/workflows/$WORKFLOW_ID/run" \
  -H "Authorization: Bearer sk_live_..." \
  -H "Content-Type: application/json" \
  -d '{}' | jq -r '.id')

# Poll until the run leaves the pending/running states
while true; do
  STATUS=$(curl -s "https://api.meter.sh/api/workflows/$WORKFLOW_ID/runs/$RUN_ID" \
    -H "Authorization: Bearer sk_live_..." | jq -r '.status')

  echo "Status: $STATUS"
  if [ "$STATUS" != "pending" ] && [ "$STATUS" != "running" ]; then
    break
  fi
  sleep 5
done
Use the Python SDK’s run_workflow(wait=True) to handle polling automatically.
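
When the SDK is not available, a polling loop can be sketched with the standard library alone. This version treats every status other than pending and running as finished, since the status list also includes partial and cancelled, and it adds a timeout so the loop cannot run forever. wait_for_run and its parameters are hypothetical; fetch_run is a callable you supply that performs GET /api/workflows/{workflow_id}/runs/{run_id} and returns the parsed JSON.

```python
import time
from typing import Callable

IN_PROGRESS = {"pending", "running"}


def wait_for_run(
    fetch_run: Callable[[], dict],
    interval: float = 5.0,
    timeout: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll fetch_run() until the run leaves pending/running, or raise on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        run = fetch_run()
        if run.get("status") not in IN_PROGRESS:
            return run
        if time.monotonic() >= deadline:
            raise TimeoutError(f"run still {run.get('status')} after {timeout}s")
        sleep(interval)
```

The sleep parameter exists so the loop can be exercised in tests without real delays.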

Error responses

| Status | Description |
| --- | --- |
| 400 | Invalid request (missing required fields, invalid DAG structure) |
| 401 | Invalid or missing API key |
| 404 | Workflow, run, or node not found |
| 409 | Workflow is already running |
| 500 | Internal server error |
| 503 | Service temporarily unavailable |
See REST API Errors for detailed error handling.
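
A client usually needs to decide which of these statuses are worth retrying. The policy below is an assumption, not documented behavior: it retries only 500 and 503 with capped exponential backoff, and treats 4xx responses, including 409, as errors to surface to the caller. should_retry and backoff_seconds are hypothetical helpers.

```python
RETRYABLE_STATUSES = {500, 503}  # assumed policy: only server-side failures


def should_retry(status_code: int, attempt: int, max_attempts: int = 3) -> bool:
    """Return True if a request that failed on this attempt (0-based) should be retried."""
    return status_code in RETRYABLE_STATUSES and attempt < max_attempts


def backoff_seconds(attempt: int, base: float = 1.0, cap: float = 30.0) -> float:
    """Exponential backoff: base * 2^attempt, never more than cap."""
    return min(cap, base * 2 ** attempt)
```

A 409 is deliberately excluded: the workflow is already running, so waiting for the existing run is usually more useful than resubmitting.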

Need help?

Email me at mckinnon@meter.sh