
Core Concepts

Meter is built around a few key abstractions that make web scraping and monitoring simple and cost-effective. Understanding these concepts will help you get the most out of the platform.

The big picture

  1. Strategy: a reusable extraction plan generated by AI that defines how to scrape a website. Created once, used many times.
  2. Job: a single execution of a scrape using a strategy. Jobs run asynchronously and return extracted data.
  3. Schedule: automated recurring jobs that run at specified intervals or cron times. Perfect for monitoring websites.
  4. Change Detection: intelligent diffing that compares jobs to detect meaningful content changes, filtering out noise.


How it all fits together

Example workflow

  1. Generate a strategy for extracting product data from an e-commerce site
  2. Create a schedule to scrape the site every hour
  3. Jobs run automatically, extracting current product data
  4. Changes are detected by comparing content hashes and structural signatures
  5. You get notified via webhook or pull from the changes API
  6. Update your database only with changed content
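As a minimal sketch, the workflow above might look like this with the Python client used throughout this page. The create_schedule call, its parameters, and the "id" response fields are assumptions based on the data model below, not confirmed API names:

# Hedged end-to-end sketch; only generate_strategy, create_job, and
# get_schedule_changes appear elsewhere on this page, the rest is assumed.
strategy = client.generate_strategy(
    url="https://shop.com/product/123",
    description="Extract name, price, description"
)

# Assumed: a create_schedule call taking the strategy, an interval,
# and an optional webhook URL (mirrors the Schedules node in the data model)
schedule = client.create_schedule(
    strategy_id=strategy["id"],
    interval="1h",
    webhook_url="https://example.com/hooks/meter"
)

# Later: pull only what changed and update your own store
changes = client.get_schedule_changes(schedule["id"])
for change in changes["changes"]:
    update_database(change)  # your own persistence logic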

Cost model

Understanding Meter’s cost structure helps you optimize usage:
Action                  Cost            Frequency
Strategy generation     ~$0.02-0.06     Once per site/pattern
Job execution           Free*           Unlimited
Change detection        Free            Automatic
API calls               Free*           Unlimited
*During beta, all features are free. Production pricing will be announced before beta ends.

Why strategy-based is cheaper

Traditional LLM scraping costs scale with usage, while Meter's cost is fixed per strategy:
  • Traditional: Pay per scrape ($0.02-0.10 each)
  • Meter: Pay once for strategy ($0.02-0.06), then scrape unlimited times for free
For a site scraped 100 times:
  • Traditional LLM scraping: $2-10
  • Meter: $0.02-0.06 (97-99% savings)

Data model

Understanding the data model helps you work with the API:
User
  └── Strategies
        ├── Preview Data (sample extraction)
        ├── Jobs
        │     ├── Results (extracted data)
        │     ├── Content Hash
        │     └── Structural Signature
        └── Schedules
              ├── Interval or Cron
              ├── Webhook URL (optional)
              └── Associated Jobs

Best practices

If multiple pages have the same structure (e.g., product pages, blog posts), you can reuse the same strategy with different URLs:
# Generate once for the pattern
strategy = client.generate_strategy(
    url="https://shop.com/product/123",
    description="Extract name, price, description"
)
strategy_id = strategy["id"]  # assumes the response exposes the strategy's id

# Reuse for different products
job1 = client.create_job(strategy_id, "https://shop.com/product/123")
job2 = client.create_job(strategy_id, "https://shop.com/product/456")
Instead of handling a webhook for every individual change, poll the changes API periodically:
# Check once per hour for all changes
changes = client.get_schedule_changes(schedule_id)
if changes['count'] > 0:
    batch_process(changes['changes'])
This reduces webhook traffic and allows batching updates.
Match the schedule interval to how often the content actually changes (a configuration sketch follows this list).
Faster intervals (15-30 min):
  • Stock prices, sports scores, breaking news
  • High-priority monitoring
Moderate intervals (1-6 hours):
  • E-commerce products, job listings
  • Most monitoring use cases
Slow intervals (daily):
  • Documentation, blog posts, policies
  • Low-frequency content
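As a rough sketch, configuring those intervals might look like this; the create_schedule method, its interval and cron parameters, and their formats are assumptions based on the schedule fields in the data model, not documented calls:

# Hypothetical create_schedule usage; method and parameter names are assumptions.
# High-priority monitoring: every 15 minutes
client.create_schedule(strategy_id=strategy_id, interval="15m")

# Low-frequency content: once a day at 06:00 via a cron expression
client.create_schedule(strategy_id=strategy_id, cron="0 6 * * *")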
Jobs can fail if sites are down or block requests:
job = client.get_job(job_id)
if job['status'] == 'failed':
    print(f"Job failed: {job['error']}")
    # Retry logic, alerts, etc.
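One possible retry pattern, sketched under the assumption that the job record carries the strategy id and URL it ran with (those field names are not confirmed):

def retry_failed_job(client, job):
    # Re-submit a failed job with the same strategy and URL; 'strategy_id'
    # and 'url' on the job record are assumed field names.
    if job['status'] != 'failed':
        return job
    return client.create_job(job['strategy_id'], job['url'])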


Need help?

Email me at [email protected]