Core Concepts

Meter is built around a few key abstractions that make web scraping and monitoring simple and cost-effective. Understanding these concepts will help you get the most out of the platform.

The big picture

1. Strategy: A reusable extraction plan generated by AI that defines how to scrape a website. Created once, used many times.

2. Job: A single execution of a scrape using a strategy. Jobs run asynchronously and return extracted data.

3. Schedule: Automated recurring jobs that run at specified intervals or cron times. Perfect for monitoring websites.

4. Workflow (optional): Chain strategies into DAG-based pipelines where the output of one scraper feeds into the next. Useful for multi-step scraping like index → detail pages.

5. Change Detection: Intelligent diffing that compares jobs to detect meaningful content changes, filtering out noise.

Key concepts

  • Strategies: Learn how AI-generated extraction strategies work and when to use them
  • Strategy Groups: Organize strategies for bulk scheduling and shared output schemas
  • Output Schemas: Define the exact JSON structure for extraction results
  • Filtering: Filter extraction results with conditions and operators
  • Jobs: Understand job execution, status checking, and result retrieval
  • Schedules: Set up automated monitoring with intervals or cron expressions
  • Workflows: Chain strategies into multi-step scraping pipelines
  • Change Detection: Discover how Meter detects meaningful content changes

How it all fits together

Example workflow

  1. Generate a strategy for extracting product data from an e-commerce site
  2. Create a schedule to scrape the site every hour
  3. Jobs run automatically, extracting current product data
  4. Changes are detected by comparing content hashes and structural signatures
  5. You get notified via webhook or pull from the changes API
  6. Update your database only with changed content
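Step 4 above compares content hashes to decide whether anything actually changed. A minimal sketch of that idea in plain Python (the hashing scheme here is illustrative, not Meter's actual implementation):

```python
import hashlib
import json

def content_hash(data: dict) -> str:
    """Hash extracted data with stable key ordering so identical
    content always produces the same digest."""
    canonical = json.dumps(data, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

# Only touch the database when the content actually changed (step 6).
previous = content_hash({"name": "Widget", "price": "9.99"})
current = content_hash({"name": "Widget", "price": "10.49"})

if current != previous:
    print("content changed, updating database")
```

Canonicalizing with `sort_keys=True` matters: two extractions with the same fields in a different order should hash identically.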

Cost model

Understanding Meter’s cost structure helps you optimize usage:
  Action                 Cost           Frequency
  Strategy generation    ~$0.02-0.06    Once per site/pattern
  Job execution          Included       Unlimited
  Change detection       Included       Automatic
  API calls              Included       Unlimited

Why strategy-based is cheaper

Traditional LLM scraping costs scale with usage:
  • Traditional: Pay per scrape ($0.02-0.10 each)
  • Meter: Pay once for strategy ($0.02-0.06), then scrape unlimited times at no extra cost
For a site scraped 100 times:
  • Traditional LLM scraping: $2-10
  • Meter: $0.02-0.06 (97-99% savings)
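The savings figure is simple arithmetic; a quick sketch using the low end of both price ranges from the table above:

```python
def traditional_cost(scrapes: int, per_scrape: float = 0.02) -> float:
    # Traditional LLM scraping: pay on every scrape.
    return scrapes * per_scrape

def meter_cost(scrapes: int, strategy_cost: float = 0.06) -> float:
    # Meter: one-time strategy generation, then jobs run at no extra cost.
    return strategy_cost

scrapes = 100
savings = 1 - meter_cost(scrapes) / traditional_cost(scrapes)
print(f"{savings:.0%} savings")  # 97% at the cheapest traditional rate
```

At the high end ($0.10 per traditional scrape against a $0.02 strategy), the same calculation gives 99.8% savings, which is where the 97-99% range comes from.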

Data model

Understanding the data model helps you work with the API:
User
  ├── Strategy Groups
  │     └── Strategies (grouped for bulk management)
  ├── Strategies
  │     ├── Preview Data (sample extraction)
  │     ├── Output Schema (optional)
  │     ├── Filter Config (optional)
  │     ├── Jobs
  │     │     ├── Results (extracted data)
  │     │     ├── Content Hash
  │     │     └── Structural Signature
  │     └── Schedules
  │           ├── Interval or Cron
  │           ├── Webhook URL (optional)
  │           └── Associated Jobs
  └── Workflows
        ├── Nodes (strategy + input config)
        ├── Edges (data flow + filters)
        ├── Runs (execution history)
        └── Schedules (interval or cron)

Best practices

If multiple pages have the same structure (e.g., product pages, blog posts), you can reuse the same strategy with different URLs:
# Generate the strategy once for the page pattern
strategy = client.generate_strategy(
    url="https://shop.com/product/123",
    description="Extract name, price, description"
)
strategy_id = strategy["id"]

# Reuse it for different products
job1 = client.create_job(strategy_id, "https://shop.com/product/123")
job2 = client.create_job(strategy_id, "https://shop.com/product/456")
Instead of webhooks for every change, poll the changes API periodically:
# Check once per hour for all changes
changes = client.get_schedule_changes(schedule_id)
if changes['count'] > 0:
    batch_process(changes['changes'])
This reduces webhook traffic and allows batching updates.
Faster intervals (15-30 min):
  • Stock prices, sports scores, breaking news
  • High-priority monitoring
Moderate intervals (1-6 hours):
  • E-commerce products, job listings
  • Most monitoring use cases
Slow intervals (daily):
  • Documentation, blog posts, policies
  • Low-frequency content
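One way to encode the guidance above is a small lookup from content type to polling interval. The categories and minute values here are illustrative defaults taken from the ranges above, not a Meter API:

```python
# Suggested polling intervals in minutes, following the guidance above.
INTERVALS = {
    "stock_prices": 15,        # fast: high-priority monitoring
    "breaking_news": 30,
    "ecommerce_products": 60,  # moderate: most monitoring use cases
    "job_listings": 360,
    "documentation": 1440,     # slow: low-frequency content
    "blog_posts": 1440,
}

def interval_minutes(content_type: str, default: int = 60) -> int:
    """Look up a sensible interval, defaulting to hourly."""
    return INTERVALS.get(content_type, default)
```

Centralizing these values makes it easy to tune all your schedules in one place as you learn how often each site actually changes.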
Jobs can fail if sites are down or block requests:
job = client.get_job(job_id)
if job['status'] == 'failed':
    print(f"Job failed: {job['error']}")
    # Retry logic, alerts, etc.
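For transient failures (site down, temporary block), retrying with exponential backoff is a common pattern. A sketch of such retry logic; `run_job` is a hypothetical callable standing in for creating a job and polling it to completion:

```python
import time

def run_with_retries(run_job, max_attempts=3, base_delay=1.0):
    """Retry a job runner with exponential backoff.

    run_job is any zero-argument callable returning a job dict with a
    'status' key; it stands in for create_job plus get_job polling.
    """
    for attempt in range(max_attempts):
        job = run_job()
        if job["status"] != "failed":
            return job
        if attempt < max_attempts - 1:
            time.sleep(base_delay * 2 ** attempt)  # 1s, 2s, 4s, ...
    raise RuntimeError(
        f"job failed after {max_attempts} attempts: {job.get('error')}"
    )
```

Backoff gives a struggling site room to recover instead of hammering it with immediate retries.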

Next steps

  • Strategies Deep Dive: Learn how AI generates extraction strategies
  • Jobs Deep Dive: Master job execution and result handling
  • Schedules Deep Dive: Set up automated monitoring
  • Workflows Deep Dive: Build multi-step scraping pipelines

Need help?

Email me at mckinnon@meter.sh