Core Concepts

Meter is built around a few key abstractions that make web scraping and monitoring simple and cost-effective. Understanding these concepts will help you get the most out of the platform.

The big picture

1. Strategy: A reusable extraction plan generated by AI that defines how to scrape a website. Created once, used many times.

2. Job: A single execution of a scrape using a strategy. Jobs run asynchronously and return extracted data.

3. Schedule: Automated recurring jobs that run at specified intervals or cron times. Perfect for monitoring websites.

4. Workflow (optional): Chain strategies into DAG-based pipelines where the output of one scraper feeds into the next. Useful for multi-step scraping like index → detail pages.

5. Change Detection: Intelligent diffing that compares jobs to detect meaningful content changes, filtering out noise.

Key concepts

  • Strategies: Learn how AI-generated extraction strategies work and when to use them
  • Strategy Groups: Organize strategies for bulk scheduling and shared output schemas
  • Output Schemas: Define the exact JSON structure for extraction results
  • Filtering: Filter extraction results with conditions and operators
  • Jobs: Understand job execution, status checking, and result retrieval
  • Schedules: Set up automated monitoring with intervals or cron expressions
  • Workflows: Chain strategies into multi-step scraping pipelines
  • Change Detection: Discover how Meter detects meaningful content changes

How it all fits together

Example workflow

  1. Generate a strategy for extracting product data from an e-commerce site
  2. Create a schedule to scrape the site every hour
  3. Jobs run automatically, extracting current product data
  4. Changes are detected by comparing content hashes and structural signatures
  5. You get notified via webhook or pull from the changes API
  6. Update your database only with changed content
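Step 4 above compares content hashes to decide whether anything actually changed. A minimal sketch of that idea in plain Python (the hashing scheme here is illustrative, not Meter's actual implementation):

```python
import hashlib
import json

def content_hash(data: dict) -> str:
    """Hash extracted data with stable key ordering so identical
    content always produces the same digest."""
    canonical = json.dumps(data, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

# Only touch the database when the content actually changed (step 6).
previous = content_hash({"name": "Widget", "price": "9.99"})
current = content_hash({"name": "Widget", "price": "10.49"})

if current != previous:
    print("content changed, updating database")
```

Canonicalizing with `sort_keys=True` matters: two extractions with the same fields in a different order should hash identically.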

Cost model

Understanding Meter’s cost structure helps you optimize usage:
  Action                 Cost           Frequency
  Strategy generation    ~$0.02-0.06    Once per site/pattern
  Job execution          Included       Unlimited
  Change detection       Included       Automatic
  API calls              Included       Unlimited

Why strategy-based is cheaper

Traditional LLM scraping costs scale with usage:
  • Traditional: Pay per scrape ($0.02-0.10 each)
  • Meter: Pay once for strategy ($0.02-0.06), then scrape unlimited times at no extra cost
For a site scraped 100 times:
  • Traditional LLM scraping: $2-10
  • Meter: $0.02-0.06 (97-99% savings)
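The savings figure is simple arithmetic; a quick sketch using the low end of both price ranges from the table above:

```python
def traditional_cost(scrapes: int, per_scrape: float = 0.02) -> float:
    # Traditional LLM scraping: pay on every scrape.
    return scrapes * per_scrape

def meter_cost(scrapes: int, strategy_cost: float = 0.06) -> float:
    # Meter: one-time strategy generation, then jobs run at no extra cost.
    return strategy_cost

scrapes = 100
savings = 1 - meter_cost(scrapes) / traditional_cost(scrapes)
print(f"{savings:.0%} savings")  # 97% at the cheapest traditional rate
```

At the high end ($0.10 per traditional scrape against a $0.02 strategy), the same calculation gives 99.8% savings, which is where the 97-99% range comes from.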

Data model

Understanding the data model helps you work with the API:
User
  ├── Strategy Groups
  │     └── Strategies (grouped for bulk management)
  ├── Strategies
  │     ├── Preview Data (sample extraction)
  │     ├── Output Schema (optional)
  │     ├── Filter Config (optional)
  │     ├── Jobs
  │     │     ├── Results (extracted data)
  │     │     ├── Content Hash
  │     │     └── Structural Signature
  │     └── Schedules
  │           ├── Interval or Cron
  │           ├── Webhook URL (optional)
  │           └── Associated Jobs
  └── Workflows
        ├── Nodes (strategy + input config)
        ├── Edges (data flow + filters)
        ├── Runs (execution history)
        └── Schedules (interval or cron)

Best practices

If multiple pages have the same structure (e.g., product pages, blog posts), you can reuse the same strategy with different URLs:
# Generate the strategy once for the page pattern
strategy = client.generate_strategy(
    url="https://shop.com/product/123",
    description="Extract name, price, description"
)
strategy_id = strategy["id"]

# Reuse it for different products
job1 = client.create_job(strategy_id, "https://shop.com/product/123")
job2 = client.create_job(strategy_id, "https://shop.com/product/456")
Instead of webhooks for every change, poll the changes API periodically:
# Check once per hour for all changes
changes = client.get_schedule_changes(schedule_id)
if changes['count'] > 0:
    batch_process(changes['changes'])
This reduces webhook traffic and allows batching updates.
Faster intervals (15-30 min):
  • Stock prices, sports scores, breaking news
  • High-priority monitoring
Moderate intervals (1-6 hours):
  • E-commerce products, job listings
  • Most monitoring use cases
Slow intervals (daily):
  • Documentation, blog posts, policies
  • Low-frequency content
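One way to encode the guidance above is a small lookup from content type to polling interval. The categories and minute values here are illustrative defaults taken from the ranges above, not a Meter API:

```python
# Suggested polling intervals in minutes, following the guidance above.
INTERVALS = {
    "stock_prices": 15,        # fast: high-priority monitoring
    "breaking_news": 30,
    "ecommerce_products": 60,  # moderate: most monitoring use cases
    "job_listings": 360,
    "documentation": 1440,     # slow: low-frequency content
    "blog_posts": 1440,
}

def interval_minutes(content_type: str, default: int = 60) -> int:
    """Look up a sensible interval, defaulting to hourly."""
    return INTERVALS.get(content_type, default)
```

Centralizing these values makes it easy to tune all your schedules in one place as you learn how often each site actually changes.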
Jobs can fail if sites are down or block requests:
job = client.get_job(job_id)
if job['status'] == 'failed':
    print(f"Job failed: {job['error']}")
    # Retry logic, alerts, etc.
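For transient failures (site down, temporary block), retrying with exponential backoff is a common pattern. A sketch of such retry logic; `run_job` is a hypothetical callable standing in for creating a job and polling it to completion:

```python
import time

def run_with_retries(run_job, max_attempts=3, base_delay=1.0):
    """Retry a job runner with exponential backoff.

    run_job is any zero-argument callable returning a job dict with a
    'status' key; it stands in for create_job plus get_job polling.
    """
    for attempt in range(max_attempts):
        job = run_job()
        if job["status"] != "failed":
            return job
        if attempt < max_attempts - 1:
            time.sleep(base_delay * 2 ** attempt)  # 1s, 2s, 4s, ...
    raise RuntimeError(
        f"job failed after {max_attempts} attempts: {job.get('error')}"
    )
```

Backoff gives a struggling site room to recover instead of hammering it with immediate retries.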

Next steps

  • Strategies Deep Dive: Learn how AI generates extraction strategies
  • Jobs Deep Dive: Master job execution and result handling
  • Schedules Deep Dive: Set up automated monitoring
  • Workflows Deep Dive: Build multi-step scraping pipelines

Need help?

Email me at mckinnon@meter.sh