Workflow Endpoints
Create and manage multi-step scraping pipelines via HTTP.
Create workflow
Create a new workflow with nodes and edges.
POST /api/workflows
Request body
{
"name": "Job Scraper",
"description": "Scrape job listings then detail pages",
"nodes": [
{
"node_key": "index",
"strategy_id": "550e8400-e29b-41d4-a716-446655440000",
"input_type": "static_urls",
"static_urls": ["https://jobs.com/listings"]
},
{
"node_key": "details",
"strategy_id": "660e8400-e29b-41d4-a716-446655440000",
"input_type": "upstream_urls",
"url_field": "job_url"
}
],
"edges": [
{
"source_node_key": "index",
"target_node_key": "details"
}
]
}
Node fields
| Field | Type | Required | Description |
|---|---|---|---|
| node_key | string | Yes | Unique identifier within the workflow |
| strategy_id | UUID | Yes | Strategy to use for scraping |
| input_type | string | Yes | static_urls, upstream_urls, or upstream_data |
| static_urls | array | Conditional | URLs for the static_urls input type |
| url_field | string | Conditional | Field name for the upstream_urls input type |
| static_parameters | object | No | API parameter overrides |
| parameter_config | object | No | Map upstream fields to strategy parameters |
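The shape of parameter_config is not specified beyond its one-line description above. As a purely hypothetical illustration (the parameter name search_term and upstream field job_title are invented for this sketch, not part of the documented API), a mapping from strategy parameter to upstream field might look like:

```json
{
  "parameter_config": {
    "search_term": "job_title"
  }
}
```

Check the strategy's own parameter names before relying on any particular shape here.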
Edge fields
| Field | Type | Required | Description |
|---|---|---|---|
| source_node_key | string | Yes | Source node identifier |
| target_node_key | string | Yes | Target node identifier |
| filter_config | object | No | Filter configuration (see below) |
Filter config
{
"mode": "all",
"conditions": [
{"field": "category", "operator": "contains", "value": "tech", "case_sensitive": false}
]
}
| Operator | Description |
|---|---|
| contains | Field contains substring |
| not_contains | Field does not contain substring |
| equals | Exact match |
| not_equals | Not exact match |
| regex_match | Regex pattern match |
| exists | Field exists and is non-empty |
| not_exists | Field is missing or empty |
| gt | Greater than |
| lt | Less than |
Use mode: "all" for AND logic, mode: "any" for OR logic.
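Filters are applied server-side, but the semantics are easy to mirror locally. This jq one-liner (jq is already used in the polling example below) reproduces the case-insensitive contains condition from the filter_config example above against two sample items:

```shell
# Mirror a case-insensitive "contains" condition locally (illustration only;
# the API evaluates filter_config on the server).
echo '[{"category":"Tech Jobs"},{"category":"Sales"}]' \
  | jq -c '[.[] | select(.category | ascii_downcase | contains("tech"))]'
```

Only the first item survives, since "sales" does not contain "tech".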
Response
{
"id": "990e8400-e29b-41d4-a716-446655440000",
"name": "Job Scraper",
"description": "Scrape job listings then detail pages",
"nodes": [...],
"edges": [...],
"created_at": "2025-01-15T10:30:00Z",
"updated_at": "2025-01-15T10:30:00Z"
}
Example
curl -X POST https://api.meter.sh/api/workflows \
-H "Authorization: Bearer sk_live_..." \
-H "Content-Type: application/json" \
-d '{
"name": "Job Scraper",
"nodes": [
{
"node_key": "index",
"strategy_id": "550e8400-e29b-41d4-a716-446655440000",
"input_type": "static_urls",
"static_urls": ["https://jobs.com/listings"]
},
{
"node_key": "details",
"strategy_id": "660e8400-e29b-41d4-a716-446655440000",
"input_type": "upstream_urls",
"url_field": "job_url"
}
],
"edges": [
{"source_node_key": "index", "target_node_key": "details"}
]
}'
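When scripting against the API, the id from the create response is what subsequent calls need. As a sketch, the response payload shown above is inlined here so the extraction can be tried without a live API key:

```shell
# Capture the new workflow id from a create response (sample payload inlined)
WORKFLOW_ID=$(echo '{"id":"990e8400-e29b-41d4-a716-446655440000","name":"Job Scraper"}' \
  | jq -r '.id')
echo "$WORKFLOW_ID"
```

In practice the echo would be replaced by the curl call above piped straight into jq.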
Get workflow
GET /api/workflows/{workflow_id}
Returns workflow details including nodes and edges.
List workflows
GET /api/workflows?limit=50&offset=0
Query parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| limit | integer | No | Max results (default: 50) |
| offset | integer | No | Results to skip (default: 0) |
Update workflow
Update workflow metadata.
PUT /api/workflows/{workflow_id}
Request body
{
"name": "Updated Name",
"description": "Updated description"
}
Delete workflow
Delete a workflow and all associated runs and schedules.
DELETE /api/workflows/{workflow_id}
Add node
Add a node to an existing workflow.
POST /api/workflows/{workflow_id}/nodes
Request body
{
"node_key": "new_node",
"strategy_id": "550e8400-e29b-41d4-a716-446655440000",
"input_type": "upstream_urls",
"url_field": "link"
}
Update node
PUT /api/workflows/{workflow_id}/nodes/{node_id}
Delete node
DELETE /api/workflows/{workflow_id}/nodes/{node_id}
Add edge
Connect two nodes.
POST /api/workflows/{workflow_id}/edges
Request body
{
"source_node_key": "index",
"target_node_key": "details",
"filter_config": {
"mode": "all",
"conditions": [
{"field": "category", "operator": "contains", "value": "tech"}
]
}
}
Delete edge
DELETE /api/workflows/{workflow_id}/edges/{edge_id}
Run workflow
Trigger a manual workflow run.
POST /api/workflows/{workflow_id}/run
Request body
| Field | Type | Required | Description |
|---|---|---|---|
| force | boolean | No | Skip change detection and re-run all nodes (default: false) |
Response
{
"id": "aa0e8400-e29b-41d4-a716-446655440000",
"workflow_id": "990e8400-e29b-41d4-a716-446655440000",
"status": "pending",
"trigger": "manual",
"created_at": "2025-01-15T10:30:00Z"
}
Example
curl -X POST https://api.meter.sh/api/workflows/990e8400-e29b-41d4-a716-446655440000/run \
-H "Authorization: Bearer sk_live_..." \
-H "Content-Type: application/json" \
-d '{"force": false}'
Get workflow run
Get details of a specific run including node execution results.
GET /api/workflows/{workflow_id}/runs/{run_id}
Response
{
"id": "aa0e8400-e29b-41d4-a716-446655440000",
"workflow_id": "990e8400-e29b-41d4-a716-446655440000",
"status": "completed",
"trigger": "manual",
"node_executions": [
{
"node_key": "index",
"status": "completed",
"job_id": "bb0e8400-e29b-41d4-a716-446655440000",
"item_count": 25,
"started_at": "2025-01-15T10:30:01Z",
"completed_at": "2025-01-15T10:30:05Z"
},
{
"node_key": "details",
"status": "completed",
"job_id": "cc0e8400-e29b-41d4-a716-446655440000",
"item_count": 25,
"started_at": "2025-01-15T10:30:06Z",
"completed_at": "2025-01-15T10:30:30Z"
}
],
"created_at": "2025-01-15T10:30:00Z",
"completed_at": "2025-01-15T10:30:30Z"
}
The run response includes node_executions with item_count per node but does not include inline results. Use the output endpoint to fetch full results.
Run status values: pending, running, completed, failed, partial, cancelled
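Because the run response carries a per-node item_count rather than inline results, a saved run can be summarized with a quick jq pass. This sketch uses a trimmed version of the response above:

```shell
# Sum item_count across node_executions from a saved run response
echo '{"status":"completed","node_executions":[{"node_key":"index","item_count":25},{"node_key":"details","item_count":25}]}' \
  | jq '[.node_executions[].item_count] | add'
```

Here the two nodes contribute 25 items each, for a total of 50.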
List workflow runs
GET /api/workflows/{workflow_id}/runs?limit=20&offset=0
Get latest workflow output
Get the most recent completed run’s results. By default, results are grouped by URL and then by strategy label.
GET /api/workflows/{workflow_id}/runs/latest/output
Query parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| flat | boolean | No | Return flat per-URL results instead of grouped by strategy (default: false) |
| include_intermediate | boolean | No | Include outputs from all nodes, not just leaf nodes (default: false) |
Response (default — grouped by strategy)
{
"workflow_id": "990e8400-e29b-41d4-a716-446655440000",
"run_id": "aa0e8400-e29b-41d4-a716-446655440000",
"status": "completed",
"final_results_by_url_grouped": {
"https://jobs.com/listings/software-engineer": {
"default": [{"title": "Software Engineer", "salary": "$120k"}],
"benefits": [{"health": "Yes", "dental": "Yes", "401k": "Yes"}]
},
"https://jobs.com/listings/product-manager": {
"default": [{"title": "Product Manager", "salary": "$130k"}]
}
},
"changed_since_previous": false,
"completed_at": "2025-01-15T10:30:30Z"
}
Response with ?flat=true
{
"workflow_id": "990e8400-e29b-41d4-a716-446655440000",
"run_id": "aa0e8400-e29b-41d4-a716-446655440000",
"status": "completed",
"final_results_by_url": {
"https://jobs.com/listings/software-engineer": [
{"title": "Software Engineer", "salary": "$120k"},
{"health": "Yes", "dental": "Yes", "401k": "Yes"}
],
"https://jobs.com/listings/product-manager": [
{"title": "Product Manager", "salary": "$130k"}
]
},
"changed_since_previous": false,
"completed_at": "2025-01-15T10:30:30Z"
}
When include_intermediate is set, the response includes a node_outputs field with results from all nodes (not just leaf nodes):
{
"node_outputs": {
"index": [{"job_url": "https://jobs.com/listings/software-engineer", "title": "Software Engineer"}],
"details": [{"title": "Software Engineer", "salary": "$120k"}]
}
}
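Since grouped output is keyed by URL, the set of scraped pages can be listed straight from a saved payload. This sketch uses a trimmed version of the grouped response above (note that jq's keys are emitted in sorted order, not scrape order):

```shell
# List result URLs from a grouped output payload (sample trimmed from above)
echo '{"final_results_by_url_grouped":{"https://jobs.com/listings/software-engineer":{"default":[{"title":"Software Engineer"}]},"https://jobs.com/listings/product-manager":{"default":[{"title":"Product Manager"}]}}}' \
  | jq -r '.final_results_by_url_grouped | keys[]'
```

With ?flat=true, the same idea works against final_results_by_url instead.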
Cancel workflow run
Cancel a running workflow.
POST /api/workflows/{workflow_id}/runs/{run_id}/cancel
Workflow schedules
Create workflow schedule
POST /api/workflows/{workflow_id}/schedules
Request body
{
"interval_seconds": 3600,
"webhook_url": "https://your-app.com/webhook",
"webhook_metadata": {"project": "my-project"},
"webhook_secret": "whsec_your_secret_here",
"webhook_type": "standard"
}
| Field | Type | Required | Description |
|---|---|---|---|
| interval_seconds | integer | Conditional | Run every N seconds |
| cron_expression | string | Conditional | Cron expression |
| webhook_url | string | No | Webhook URL for results |
| webhook_metadata | object | No | Custom JSON metadata for webhook payloads |
| webhook_secret | string | No | Secret for X-Webhook-Secret header |
| webhook_type | string | No | standard or slack (default: standard) |
Provide either interval_seconds or cron_expression, not both.
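The either/or rule above can be sanity-checked locally before sending a schedule body. This jq expression (a client-side sketch, not an API call) verifies that exactly one of the two timing fields is present:

```shell
# Verify exactly one of interval_seconds / cron_expression is set (XOR)
echo '{"interval_seconds":3600,"webhook_url":"https://your-app.com/webhook"}' \
  | jq -e 'has("interval_seconds") != has("cron_expression")' >/dev/null \
  && echo "schedule body ok"
```

A body with both fields (or neither) makes the jq expression false, so jq -e exits non-zero and the confirmation line is skipped.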
List workflow schedules
GET /api/workflows/{workflow_id}/schedules
Update workflow schedule
PATCH /api/workflows/{workflow_id}/schedules/{schedule_id}
All fields are optional. Include only fields to update:
{
"enabled": false,
"interval_seconds": 7200
}
Delete workflow schedule
DELETE /api/workflows/{workflow_id}/schedules/{schedule_id}
Polling for run completion
Since workflow runs are asynchronous, poll the run endpoint until the status reaches a terminal state (completed, failed, partial, or cancelled):
# Start a run
RUN_ID=$(curl -s -X POST https://api.meter.sh/api/workflows/{workflow_id}/run \
-H "Authorization: Bearer sk_live_..." \
-H "Content-Type: application/json" \
-d '{}' | jq -r '.id')
# Poll for completion
while true; do
STATUS=$(curl -s https://api.meter.sh/api/workflows/{workflow_id}/runs/$RUN_ID \
-H "Authorization: Bearer sk_live_..." | jq -r '.status')
echo "Status: $STATUS"
if [ "$STATUS" = "completed" ] || [ "$STATUS" = "failed" ]; then
break
fi
sleep 5
done
Use the Python SDK’s run_workflow(wait=True) to handle polling automatically.
Error responses
| Status | Description |
|---|---|
| 400 | Invalid request (missing required fields, invalid DAG structure) |
| 401 | Invalid or missing API key |
| 404 | Workflow, run, or node not found |
| 409 | Workflow is already running |
| 500 | Internal server error |
| 503 | Service temporarily unavailable |
See REST API Errors for detailed error handling.
Need help?
Email me at mckinnon@meter.sh