Strategies

A strategy is a reusable extraction plan that tells Meter how to extract data from a webpage. Think of it like a recipe: you create it once by describing what you want, and Meter’s AI figures out the exact selectors and extraction logic.

What is a strategy?

A strategy contains:

Extraction method: Either CSS Path (for traditional HTML) or API Path (for JavaScript-heavy sites)
Field definitions mapping selectors or API responses to your data fields
Extraction metadata like item containers, scopes, or API endpoints

Meter automatically detects which extraction method works best for each site, so you don’t need to choose. Once created, a strategy can be reused any number of times across similar pages, with no LLM cost after the initial generation.
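To make the three components listed above concrete, here is a purely illustrative sketch of how a strategy could be modeled. The class name, field names, and selectors are assumptions made up for this example; Meter's actual strategy schema may look quite different.

```python
from dataclasses import dataclass, field


# Hypothetical model for illustration only; not Meter's real schema.
@dataclass
class StrategySketch:
    method: str              # extraction method: "css_path" or "api_path"
    fields: dict[str, str]   # data field name -> selector or API response path
    metadata: dict[str, str] = field(default_factory=dict)  # e.g. item container, endpoint


# Invented selectors for an imaginary list of posts.
posts_sketch = StrategySketch(
    method="css_path",
    fields={"title": ".post-title", "score": ".post-score"},
    metadata={"item_container": "li.post"},
)
print(sorted(posts_sketch.fields))
```

The point of the sketch is only that a strategy is plain data: once the selectors or paths are known, running it again requires no AI involvement.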

How strategies are generated

Meter uses AI to analyze your target webpage and generate precise extraction strategies:

You provide: A URL and plain-English description of what to extract
Meter analyzes: The page structure, HTML patterns, and content layout
AI generates: CSS selectors or API extraction rules optimized for that page
You get: A reusable strategy plus preview data showing what was extracted

This approach combines the intelligence of AI setup with the speed and reliability of traditional scraping.

Example

from meter_sdk import MeterClient

client = MeterClient(api_key="sk_live_...")

# Generate a strategy
result = client.generate_strategy(
    url="https://news.ycombinator.com",
    description="Extract post titles and scores",
    name="HN Front Page"
)

# Check the preview
print(f"Extracted {len(result['preview_data'])} items")
for item in result['preview_data'][:3]:
    print(item)

# Output:
# {'title': 'Launch HN: ...', 'score': 42}
# {'title': 'Ask HN: ...', 'score': 15}
# ...

Extraction methods

Meter automatically selects the best extraction method for each site. You describe what you want, and Meter figures out how to get it.

CSS Path extraction

For traditional HTML pages, Meter generates CSS selectors that target the content you need. Best for:

Static HTML sites
Server-rendered pages
Sites with stable DOM structure
Blogs, news sites, and content pages

CSS Path extraction is fast and reliable for sites where content is present in the initial HTML response.
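The idea behind selector-based extraction can be sketched with the standard library alone. This example matches elements by class name, which is a simplified stand-in for full CSS selectors. The sample HTML, class names, and helper names are invented for illustration, and nothing here describes how Meter implements CSS Path extraction internally.

```python
from html.parser import HTMLParser


class ClassTextExtractor(HTMLParser):
    """Collect the text of every element carrying a given class name.

    Simplification: assumes matched elements contain no void tags such as <br>.
    """

    def __init__(self, class_name):
        super().__init__()
        self.class_name = class_name
        self.depth = 0       # > 0 while inside a matching element
        self.results = []

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if self.depth:
            self.depth += 1              # nested tag inside a match
        elif self.class_name in classes:
            self.depth = 1               # entering a new match
            self.results.append("")

    def handle_endtag(self, tag):
        if self.depth:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth:
            self.results[-1] += data


def extract_by_class(html, class_name):
    parser = ClassTextExtractor(class_name)
    parser.feed(html)
    return [text.strip() for text in parser.results]


# Invented static HTML: the content is already present in the initial response.
SAMPLE_HTML = """
<ul>
  <li class="post"><a class="title">First post</a> <span class="score">42</span></li>
  <li class="post"><a class="title">Second post</a> <span class="score">15</span></li>
</ul>
"""

titles = extract_by_class(SAMPLE_HTML, "title")
scores = [int(s) for s in extract_by_class(SAMPLE_HTML, "score")]
items = [{"title": t, "score": s} for t, s in zip(titles, scores)]
print(items)
```

This only works because the data is in the HTML itself; on a client-rendered page the same parser would find nothing.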

API Path extraction

For JavaScript-heavy sites, Meter automatically discovers the underlying APIs that power the page and extracts data directly from them.

Automatic detection

Meter identifies when a page relies on JavaScript to load its content.

API discovery

The data source APIs are automatically identified—no reverse engineering required.

Authentication handled

Any required tokens or session data are handled automatically.

Direct extraction

Data is extracted directly from API responses—cleaner and more reliable than parsing the DOM.

Best for:

Single-page applications (React, Vue, Angular)
Financial data sites
Dynamic dashboards
Sites with client-side rendering

API Path extraction often returns cleaner, more structured data than DOM scraping—and it’s more resilient to UI changes.

Automatic token handling

JavaScript-heavy sites often require authentication tokens to access their APIs. Meter handles this automatically—you don’t need to worry about the details.

CSRF Tokens

Many sites protect their APIs with CSRF tokens. Meter detects and includes these tokens automatically, so your extractions work without manual configuration.

Session Cookies

Session state is maintained across the extraction process. If a site requires cookies to access its APIs, Meter handles this for you.

Custom Headers

API keys, authorization headers, and other required headers are automatically included in requests.

Multi-API Chains

Some sites require multiple API calls in sequence. Meter handles these dependencies and chains requests in the correct order.
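As a rough picture of what this handling involves, the sketch below reads a CSRF token from a page and carries it, together with a session cookie, into the headers for a follow-up API call. The meta tag name and header name follow one common web-framework convention; the page HTML, cookie value, and function names are invented, and this is not a description of Meter's internals.

```python
import re


def find_csrf_token(html):
    """Return the content of a <meta name="csrf-token"> tag, or None if absent."""
    match = re.search(r'<meta\s+name="csrf-token"\s+content="([^"]+)"', html)
    return match.group(1) if match else None


def build_api_headers(page_html, session_cookie):
    """Second step of a chain: it depends on the token found in the first step."""
    headers = {"Accept": "application/json", "Cookie": session_cookie}
    token = find_csrf_token(page_html)
    if token:
        headers["X-CSRF-Token"] = token
    return headers


# Invented page and cookie for illustration.
PAGE_HTML = '<html><head><meta name="csrf-token" content="abc123"></head></html>'
headers = build_api_headers(PAGE_HTML, session_cookie="session=demo")
print(headers)
```

The ordering matters: the API request cannot be built until the page that issues the token has been loaded, which is the kind of dependency a multi-API chain has to respect.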

Real-world example: Financial data

Consider extracting stock quotes from a financial data site. When you visit the page, you see prices updating in real time, but the HTML source shows almost nothing. The data is loaded via JavaScript from a hidden API. With traditional scraping, you would need to:

Reverse-engineer the API endpoints
Figure out the authentication requirements
Handle CSRF tokens and session management
Parse the JSON response format

With Meter, you simply describe what you want: “Extract stock symbol, current price, and daily change.” Meter automatically:

Discovers the quote API endpoint
Captures any required authentication tokens
Extracts the structured data from API responses
Returns clean, normalized data

The result is faster extraction, cleaner data, and a strategy that’s resilient to UI redesigns—because you’re hitting the same API the site uses internally.
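The last two steps, pulling structured fields out of an API response and returning uniform records, might look something like this minimal sketch. The JSON body, the dotted-path notation, and the field mapping are all invented for illustration; real responses and Meter's own path format will differ.

```python
import json


def get_path(obj, path):
    """Follow a dotted path such as 'quote.price' through nested dicts and lists."""
    for key in path.split("."):
        obj = obj[int(key)] if isinstance(obj, list) else obj[key]
    return obj


# Invented API response body.
RESPONSE_BODY = """
{"data": {"items": [
  {"symbol": "ABC", "quote": {"price": 101.5, "change": -0.75}},
  {"symbol": "XYZ", "quote": {"price": 20.0, "change": 0.1}}
]}}
"""

# Hypothetical mapping from output field names to paths inside each item.
FIELD_PATHS = {"symbol": "symbol", "price": "quote.price", "daily_change": "quote.change"}

payload = json.loads(RESPONSE_BODY)
rows = [
    {name: get_path(item, path) for name, path in FIELD_PATHS.items()}
    for item in get_path(payload, "data.items")
]
print(rows)
```

Because the mapping targets the response structure rather than the page layout, a visual redesign that leaves the API untouched would not affect it.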

Strategy lifecycle

Generate: Create strategy with AI
Preview: Check the preview_data to verify extraction
Refine (optional): Provide feedback if something’s missing
Use: Run jobs with the strategy
Monitor: Check if results are still accurate over time
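The five steps above can be strung together in one function. This is a sketch under assumptions: it uses only the SDK calls shown elsewhere on this page, the `run_lifecycle` helper and its return shape are invented, and `StubClient` (including its `job_id` key) is a stand-in with canned responses so the example runs offline. Pass a real `MeterClient` in its place.

```python
def run_lifecycle(client, url, description, name, required_fields):
    """Walk one URL through generate, preview, refine, use, and monitor."""
    # 1. Generate
    result = client.generate_strategy(url=url, description=description, name=name)
    strategy_id = result["strategy_id"]

    # 2. Preview: compare the first preview item against the fields we need
    preview = result["preview_data"]
    present = set(preview[0]) if preview else set()
    missing = sorted(set(required_fields) - present)

    # 3. Refine (optional)
    if missing:
        client.refine_strategy(
            strategy_id=strategy_id,
            feedback=f"Also extract: {', '.join(missing)}",
        )

    # 4. Use
    job = client.create_job(strategy_id, url)

    # 5. Monitor (in practice, run this later, once jobs have had time to finish)
    failed = client.list_jobs(strategy_id=strategy_id, status="failed", limit=5)
    return {
        "strategy_id": strategy_id,
        "refined_for": missing,
        "job": job,
        "needs_attention": bool(failed),
    }


class StubClient:
    """Canned responses so the sketch runs offline; swap in a real MeterClient."""

    def generate_strategy(self, url, description, name):
        return {"strategy_id": "s1", "preview_data": [{"title": "Widget"}]}

    def refine_strategy(self, strategy_id, feedback):
        return {"preview_data": [{"title": "Widget", "price": "9.99"}]}

    def create_job(self, strategy_id, url):
        return {"job_id": "j1"}  # hypothetical response shape

    def list_jobs(self, strategy_id, status, limit):
        return []


outcome = run_lifecycle(
    StubClient(),
    url="https://shop.com/products",
    description="Extract product title and price",
    name="Product Scraper",
    required_fields=["title", "price"],
)
print(outcome)
```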

Refining strategies

If the initial extraction isn’t perfect, refine it with feedback:

# Initial generation
result = client.generate_strategy(
    url="https://shop.com/products",
    description="Extract product info",
    name="Product Scraper"
)

# Check preview - oops, missing images
print(result['preview_data'])  # No 'image' field

# Refine with feedback
refined = client.refine_strategy(
    strategy_id=result['strategy_id'],
    feedback="Also extract product images"
)

# Check again
print(refined['preview_data'])  # Now has 'image' field

Refinement uses cached HTML from the initial generation, so it’s fast and doesn’t re-fetch the page.

When to create new strategies

Create a new strategy when:

Different Site Structure

Each website layout needs its own strategy

Different Data Fields

Different extraction requirements need different strategies

Major Site Redesign

If a site changes its HTML structure significantly

Different Page Types

Product pages vs. category pages need separate strategies

Reusing strategies

You can reuse the same strategy across:

Multiple URLs on the same site (e.g., different products)
Pagination (if the structure is consistent)
Similar pages (if they share HTML structure)

# Generate once
strategy = client.generate_strategy(
    url="https://shop.com/product/123",
    description="Extract name, price, description"
)
strategy_id = strategy['strategy_id']

# Reuse for different products
job1 = client.create_job(strategy_id, "https://shop.com/product/123")
job2 = client.create_job(strategy_id, "https://shop.com/product/456")
job3 = client.create_job(strategy_id, "https://shop.com/product/789")

Strategy management

Listing strategies

# Get all strategies
strategies = client.list_strategies(limit=20)

for strategy in strategies:
    print(f"{strategy['name']}: {strategy['strategy_id']}")

Getting strategy details

strategy = client.get_strategy(strategy_id)

print(f"Name: {strategy['name']}")
print(f"Description: {strategy['description']}")
print(f"Created: {strategy['created_at']}")
print(f"Preview: {strategy['preview_data']}")

Deleting strategies

# Delete a strategy (also deletes associated jobs and schedules)
client.delete_strategy(strategy_id)

Deleting a strategy also deletes all associated jobs and schedules. This action cannot be undone.

Best practices

Use descriptive names

Give strategies clear names that describe their purpose:

Good: "HN Front Page - Titles and Scores"
Bad: "Strategy 1"

This helps when managing multiple strategies.

Test with refinement

Always check preview_data before creating jobs:

result = client.generate_strategy(...)

# Verify all required fields are present in the first preview item
required_fields = {'title', 'price', 'image'}
preview = result['preview_data']
# An empty preview means every required field is missing
actual_fields = set(preview[0].keys()) if preview else set()

missing = required_fields - actual_fields
if missing:
    # Ask Meter to add whatever is missing (sorted for a stable message)
    client.refine_strategy(
        strategy_id=result['strategy_id'],
        feedback=f"Also extract: {', '.join(sorted(missing))}"
    )

Be specific in descriptions

Provide clear, specific extraction instructions:

Good: “Extract product name, price with currency, main image URL, and stock availability from the product grid”
Bad: “Get products”

Specific descriptions lead to better strategies on the first try.

Monitor strategy health

Strategies can break if sites change their HTML:

# Check recent jobs for failures
jobs = client.list_jobs(
    strategy_id=strategy_id,
    status='failed',
    limit=5
)

if jobs:
    print(f"Strategy {strategy_id} may need updating")
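The same check can be run across every strategy in an account. The sketch below combines `list_strategies` and `list_jobs` as they appear on this page; the `find_unhealthy_strategies` helper is invented, and `StubClient` with its canned data (including the `job_id` key) is a stand-in so the example runs offline. Pass a real `MeterClient` instead.

```python
def find_unhealthy_strategies(client, max_strategies=20, max_failures=5):
    """Return (name, strategy_id, failure_count) for strategies with recent failed jobs."""
    unhealthy = []
    for strategy in client.list_strategies(limit=max_strategies):
        failed = client.list_jobs(
            strategy_id=strategy["strategy_id"],
            status="failed",
            limit=max_failures,
        )
        if failed:
            unhealthy.append((strategy["name"], strategy["strategy_id"], len(failed)))
    return unhealthy


class StubClient:
    """Stand-in with canned data so the sketch runs offline."""

    def list_strategies(self, limit):
        return [
            {"name": "HN Front Page", "strategy_id": "s1"},
            {"name": "Product Scraper", "strategy_id": "s2"},
        ][:limit]

    def list_jobs(self, strategy_id, status, limit):
        failures = {"s2": [{"job_id": "j9", "status": "failed"}]}
        return failures.get(strategy_id, [])[:limit]


report = find_unhealthy_strategies(StubClient())
for name, strategy_id, count in report:
    print(f"{name} ({strategy_id}): {count} recent failed job(s)")
```

A failed job does not always mean the strategy is broken (the site may simply have been down), so treat the report as a prompt to inspect recent results rather than as a verdict.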

Troubleshooting

Strategy generation fails

Possible causes:

URL is not accessible
Page requires authentication
Description is too vague

Solutions:

Verify the URL loads in a browser
For auth-required pages, contact support
Make your description more specific

Missing fields in preview

Problem: Some expected fields aren’t in preview_data

Solution: Use refinement:

client.refine_strategy(
    strategy_id=strategy_id,
    feedback="Also extract the product SKU and brand name"
)

Strategy stops working

Problem: Jobs that worked before now fail or return incorrect data

Cause: Website HTML structure changed

Solutions:

Generate a new strategy for the updated site
Update your jobs to use the new strategy
Delete the old strategy

API Path extraction returns empty data

Problem: Meter detected an API but returns no data

Possible causes:

The API requires authentication that expired
The site changed its API endpoints
Rate limiting is blocking requests

Solutions:

Generate a fresh strategy to capture new authentication tokens
If the site has changed significantly, the strategy may need regeneration
For rate-limited sites, reduce scrape frequency

Next steps

Create Jobs

Learn how to run scrapes using your strategies

Set Up Schedules

Automate scraping with recurring schedules

Python SDK Reference

Explore all strategy methods in the SDK

REST API Reference

View strategy endpoints in the REST API

Need help?

Email me at mckinnon@meter.sh
