Workflow Endpoints
Create and manage multi-step scraping pipelines via HTTP.
Create workflow
Create a new workflow with nodes and edges.
POST /api/workflows
Request body
{
"name": "Job Scraper",
"description": "Scrape job listings then detail pages",
"nodes": [
{
"node_key": "index",
"strategy_id": "550e8400-e29b-41d4-a716-446655440000",
"input_type": "static_urls",
"static_urls": ["https://jobs.com/listings"]
},
{
"node_key": "details",
"strategy_id": "660e8400-e29b-41d4-a716-446655440000",
"input_type": "upstream_urls",
"url_field": "job_url"
}
],
"edges": [
{
"source_node_key": "index",
"target_node_key": "details"
}
]
}
Node fields
| Field | Type | Required | Description |
|---|---|---|---|
| node_key | string | Yes | Unique identifier within the workflow |
| strategy_id | UUID | Yes | Strategy to use for scraping |
| input_type | string | Yes | static_urls, upstream_urls, upstream_data, or trigger_only |
| static_urls | array | Conditional | URLs for static_urls input type |
| url_field | string | Conditional | Field name for upstream_urls input type |
| static_parameters | object | No | API parameter overrides |
| parameter_config | object | No | Map upstream fields to strategy parameters |
Edge fields
| Field | Type | Required | Description |
|---|---|---|---|
| source_node_key | string | Yes | Source node identifier |
| target_node_key | string | Yes | Target node identifier |
| filter_config | object | No | Filter configuration (see below) |
Filter config
{
"mode": "all",
"conditions": [
{"field": "category", "operator": "contains", "value": "tech", "case_sensitive": false}
]
}
| Operator | Description |
|---|---|
| contains | Field contains substring |
| not_contains | Field does not contain substring |
| equals | Exact match |
| not_equals | Not exact match |
| regex_match | Regex pattern match |
| exists | Field exists and is non-empty |
| not_exists | Field is missing or empty |
| gt | Greater than |
| lt | Less than |
Use mode: "all" for AND logic, mode: "any" for OR logic.
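The operator and mode semantics above can be sketched as a small client-side evaluator. This is an illustration of the documented behavior, not the server implementation; in particular, treating case_sensitive as true when omitted is an assumption here.

```python
# Illustrative sketch: evaluate a filter_config against one result item.
import re

def matches(item: dict, config: dict) -> bool:
    def check(cond: dict) -> bool:
        field, op = cond["field"], cond["operator"]
        raw = item.get(field)
        present = raw is not None and raw != ""  # "exists and is non-empty"
        if op == "exists":
            return present
        if op == "not_exists":
            return not present
        value = cond.get("value")
        text, needle = str(raw), str(value)
        if not cond.get("case_sensitive", True):  # default is an assumption
            text, needle = text.lower(), needle.lower()
        if op == "contains":
            return present and needle in text
        if op == "not_contains":
            return not present or needle not in text
        if op == "equals":
            return text == needle
        if op == "not_equals":
            return text != needle
        if op == "regex_match":
            return present and re.search(needle, text) is not None
        if op == "gt":
            return present and float(raw) > float(value)
        if op == "lt":
            return present and float(raw) < float(value)
        raise ValueError(f"unknown operator: {op}")

    results = (check(c) for c in config.get("conditions", []))
    # mode "all" = AND across conditions, mode "any" = OR
    return all(results) if config.get("mode", "all") == "all" else any(results)
```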
Response
{
"id": "990e8400-e29b-41d4-a716-446655440000",
"name": "Job Scraper",
"description": "Scrape job listings then detail pages",
"nodes": [...],
"edges": [...],
"created_at": "2025-01-15T10:30:00Z",
"updated_at": "2025-01-15T10:30:00Z"
}
Example
curl -X POST https://api.meter.sh/api/workflows \
-H "Authorization: Bearer sk_live_..." \
-H "Content-Type: application/json" \
-d '{
"name": "Job Scraper",
"nodes": [
{
"node_key": "index",
"strategy_id": "550e8400-e29b-41d4-a716-446655440000",
"input_type": "static_urls",
"static_urls": ["https://jobs.com/listings"]
},
{
"node_key": "details",
"strategy_id": "660e8400-e29b-41d4-a716-446655440000",
"input_type": "upstream_urls",
"url_field": "job_url"
}
],
"edges": [
{"source_node_key": "index", "target_node_key": "details"}
]
}'
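The API rejects malformed graphs with a 400 "invalid DAG structure" error (see Error responses). As a sketch of what that validation implies, a hypothetical client-side pre-check could verify unique node keys, valid edge references, and acyclicity before submitting:

```python
# Hypothetical client-side helper (not part of the API): sanity-check a
# workflow payload before POSTing it.
from collections import defaultdict, deque

def validate_workflow(payload: dict) -> list[str]:
    errors = []
    keys = [n["node_key"] for n in payload.get("nodes", [])]
    if len(keys) != len(set(keys)):
        errors.append("duplicate node_key")
    known = set(keys)
    indegree = {k: 0 for k in known}
    children = defaultdict(list)
    for e in payload.get("edges", []):
        src, dst = e["source_node_key"], e["target_node_key"]
        if src not in known or dst not in known:
            errors.append(f"edge references unknown node: {src} -> {dst}")
            continue
        children[src].append(dst)
        indegree[dst] += 1
    # Kahn's algorithm: if some node can never reach indegree 0,
    # the edges contain a cycle.
    queue = deque(k for k, d in indegree.items() if d == 0)
    seen = 0
    while queue:
        node = queue.popleft()
        seen += 1
        for child in children[node]:
            indegree[child] -= 1
            if indegree[child] == 0:
                queue.append(child)
    if seen != len(known):
        errors.append("cycle detected")
    return errors
```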
Get workflow
GET /api/workflows/{workflow_id}
Returns workflow details including nodes and edges.
List workflows
GET /api/workflows?limit=50&offset=0
Query parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| limit | integer | No | Max results (default: 50) |
| offset | integer | No | Results to skip (default: 0) |
Update workflow
Update workflow metadata.
PUT /api/workflows/{workflow_id}
Request body
{
"name": "Updated Name",
"description": "Updated description"
}
Delete workflow
Delete a workflow and all associated runs and schedules.
DELETE /api/workflows/{workflow_id}
Add node
Add a node to an existing workflow.
POST /api/workflows/{workflow_id}/nodes
Request body
{
"node_key": "new_node",
"strategy_id": "550e8400-e29b-41d4-a716-446655440000",
"input_type": "upstream_urls",
"url_field": "link"
}
Update node
PUT /api/workflows/{workflow_id}/nodes/{node_id}
Delete node
DELETE /api/workflows/{workflow_id}/nodes/{node_id}
Add edge
Connect two nodes.
POST /api/workflows/{workflow_id}/edges
Request body
{
"source_node_key": "index",
"target_node_key": "details",
"filter_config": {
"mode": "all",
"conditions": [
{"field": "category", "operator": "contains", "value": "tech"}
]
}
}
Delete edge
DELETE /api/workflows/{workflow_id}/edges/{edge_id}
Run workflow
Trigger a manual workflow run.
POST /api/workflows/{workflow_id}/run
Request body
| Field | Type | Required | Description |
|---|---|---|---|
| force | boolean | No | Skip change detection and re-run all nodes (default: false) |
Response
{
"id": "aa0e8400-e29b-41d4-a716-446655440000",
"workflow_id": "990e8400-e29b-41d4-a716-446655440000",
"status": "pending",
"trigger": "manual",
"created_at": "2025-01-15T10:30:00Z"
}
Example
curl -X POST https://api.meter.sh/api/workflows/990e8400-e29b-41d4-a716-446655440000/run \
-H "Authorization: Bearer sk_live_..." \
-H "Content-Type: application/json" \
-d '{"force": false}'
Get workflow run
Get details of a specific run including node execution results.
GET /api/workflows/{workflow_id}/runs/{run_id}
Response
{
"id": "aa0e8400-e29b-41d4-a716-446655440000",
"workflow_id": "990e8400-e29b-41d4-a716-446655440000",
"status": "completed",
"trigger": "manual",
"node_executions": [
{
"node_key": "index",
"status": "completed",
"job_id": "bb0e8400-e29b-41d4-a716-446655440000",
"item_count": 25,
"started_at": "2025-01-15T10:30:01Z",
"completed_at": "2025-01-15T10:30:05Z"
},
{
"node_key": "details",
"status": "completed",
"job_id": "cc0e8400-e29b-41d4-a716-446655440000",
"item_count": 25,
"started_at": "2025-01-15T10:30:06Z",
"completed_at": "2025-01-15T10:30:30Z"
}
],
"created_at": "2025-01-15T10:30:00Z",
"completed_at": "2025-01-15T10:30:30Z"
}
The run response includes node_executions with item_count per node but does not include inline results. Use the output endpoint to fetch full results.
Run status values: pending, running, completed, failed, partial, cancelled
List workflow runs
GET /api/workflows/{workflow_id}/runs?limit=20&offset=0
Get latest workflow output
Get the most recent completed run’s results. By default, results are grouped by URL and then by strategy label.
GET /api/workflows/{workflow_id}/runs/latest/output
Query parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| flat | boolean | No | Return flat per-URL results instead of grouped by strategy (default: false) |
| include_intermediate | boolean | No | Include outputs from all nodes, not just leaf nodes (default: false) |
Response (default — grouped by strategy)
{
"workflow_id": "990e8400-e29b-41d4-a716-446655440000",
"run_id": "aa0e8400-e29b-41d4-a716-446655440000",
"status": "completed",
"final_results_by_url_grouped": {
"https://jobs.com/listings/software-engineer": {
"default": [{"title": "Software Engineer", "salary": "$120k"}],
"benefits": [{"health": "Yes", "dental": "Yes", "401k": "Yes"}]
},
"https://jobs.com/listings/product-manager": {
"default": [{"title": "Product Manager", "salary": "$130k"}]
}
},
"changed_since_previous": false,
"completed_at": "2025-01-15T10:30:30Z"
}
Response with ?flat=true
{
"workflow_id": "990e8400-e29b-41d4-a716-446655440000",
"run_id": "aa0e8400-e29b-41d4-a716-446655440000",
"status": "completed",
"final_results_by_url": {
"https://jobs.com/listings/software-engineer": [
{"title": "Software Engineer", "salary": "$120k"},
{"health": "Yes", "dental": "Yes", "401k": "Yes"}
],
"https://jobs.com/listings/product-manager": [
{"title": "Product Manager", "salary": "$130k"}
]
},
"changed_since_previous": false,
"completed_at": "2025-01-15T10:30:30Z"
}
When include_intermediate is set, the response includes a node_outputs field with results from all nodes (not just leaf nodes):
{
"node_outputs": {
"index": [{"job_url": "https://jobs.com/listings/software-engineer", "title": "Software Engineer"}],
"details": [{"title": "Software Engineer", "salary": "$120k"}]
}
}
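The two response shapes are related mechanically: the flat form concatenates each URL's strategy groups. A sketch of that transformation (the ordering of groups within a URL is an assumption):

```python
# Sketch: derive the ?flat=true shape (final_results_by_url) from the default
# grouped shape (final_results_by_url_grouped) by concatenating each URL's
# strategy-label groups.
def flatten(grouped: dict) -> dict:
    return {
        url: [item for items in groups.values() for item in items]
        for url, groups in grouped.items()
    }
```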
Cancel workflow run
Cancel a running workflow.
POST /api/workflows/{workflow_id}/runs/{run_id}/cancel
Workflow schedules
Create workflow schedule
POST /api/workflows/{workflow_id}/schedules
Request body
{
"interval_seconds": 3600,
"webhook_url": "https://your-app.com/webhook",
"webhook_metadata": {"project": "my-project"},
"webhook_secret": "whsec_your_secret_here",
"webhook_type": "standard"
}
| Field | Type | Required | Description |
|---|---|---|---|
| interval_seconds | integer | Conditional | Run every N seconds |
| cron_expression | string | Conditional | Cron expression |
| webhook_url | string | No | Webhook URL for results |
| webhook_metadata | object | No | Custom JSON metadata for webhook payloads |
| webhook_secret | string | No | Secret for X-Webhook-Secret header |
| webhook_type | string | No | standard, slack, slack_workflow, or discord (default: standard). Auto-detected from URL if not specified |
Provide either interval_seconds or cron_expression, not both.
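That constraint can be pre-checked before sending the request; a minimal hypothetical client-side helper:

```python
# Hypothetical pre-flight check: a schedule payload must carry exactly one
# of interval_seconds or cron_expression, never both and never neither.
def check_schedule(payload: dict) -> None:
    has_interval = "interval_seconds" in payload
    has_cron = "cron_expression" in payload
    if has_interval == has_cron:  # both present, or both missing
        raise ValueError("provide exactly one of interval_seconds or cron_expression")
```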
List workflow schedules
GET /api/workflows/{workflow_id}/schedules
Update workflow schedule
PATCH /api/workflows/{workflow_id}/schedules/{schedule_id}
All fields are optional; include only the fields you want to update:
{
"enabled": false,
"interval_seconds": 7200
}
Delete workflow schedule
DELETE /api/workflows/{workflow_id}/schedules/{schedule_id}
Polling for run completion
Workflow runs are asynchronous, so poll the run endpoint until the status reaches a terminal value (completed, failed, partial, or cancelled):
# Start a run
RUN_ID=$(curl -s -X POST https://api.meter.sh/api/workflows/{workflow_id}/run \
-H "Authorization: Bearer sk_live_..." \
-H "Content-Type: application/json" \
-d '{}' | jq -r '.id')
# Poll for completion
while true; do
STATUS=$(curl -s https://api.meter.sh/api/workflows/{workflow_id}/runs/$RUN_ID \
-H "Authorization: Bearer sk_live_..." | jq -r '.status')
echo "Status: $STATUS"
if [ "$STATUS" = "completed" ] || [ "$STATUS" = "failed" ]; then
break
fi
sleep 5
done
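The same loop in Python, sketched with an injectable fetch function (which would typically GET /api/workflows/{workflow_id}/runs/{run_id} and return its status field) so the terminal-state logic is self-contained; the terminal statuses follow the run status values listed above:

```python
# Illustrative polling sketch; fetch_status is any zero-argument callable
# that returns the current run status string.
import time

TERMINAL = {"completed", "failed", "partial", "cancelled"}

def wait_for_run(fetch_status, interval: float = 5.0, timeout: float = 600.0) -> str:
    deadline = time.monotonic() + timeout
    while True:
        status = fetch_status()
        if status in TERMINAL:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"run still {status} after {timeout}s")
        time.sleep(interval)
```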
Use the Python SDK’s run_workflow(wait=True) to handle polling automatically.
Error responses
| Status | Description |
|---|---|
| 400 | Invalid request (missing required fields, invalid DAG structure) |
| 401 | Invalid or missing API key |
| 404 | Workflow, run, or node not found |
| 409 | Workflow is already running |
| 500 | Internal server error |
| 503 | Service temporarily unavailable |
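Of these, 503 is the one worth retrying automatically, since it signals a transient condition; the 4xx codes indicate a problem with the request itself. An illustrative retry sketch with exponential backoff (the retry policy is an assumption, not API guidance):

```python
# Illustrative retry wrapper; do_request is any zero-argument callable
# returning (status_code, body). Only 503 is treated as transient here.
import time

RETRYABLE = {503}

def call_with_retry(do_request, attempts: int = 3, base_delay: float = 1.0):
    for attempt in range(attempts):
        status, body = do_request()
        if status not in RETRYABLE:
            return status, body
        if attempt < attempts - 1:
            time.sleep(base_delay * 2 ** attempt)  # 1s, 2s, 4s, ...
    return status, body
```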
See REST API Errors for detailed error handling.
Next steps
Workflows Concept
Understand workflow architecture and patterns
Python SDK
Use the Python SDK for workflows
Schedule Endpoints
Compare with simple schedules
Webhooks
Receive workflow results via webhook
Need help?
Email me at mckinnon@meter.sh