Strategies

A strategy is a reusable extraction plan that tells Meter how to extract data from a webpage. Think of it like a recipe: you create it once by describing what you want, and Meter’s AI figures out the exact selectors and extraction logic.

What is a strategy?

A strategy contains:
  • CSS selectors for finding content on the page
  • Field definitions mapping selectors to data fields
  • Extraction metadata like item containers and scopes
Once created, a strategy can be reused any number of times across similar pages, with no LLM costs after the initial generation.
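
To make the three parts above concrete, here is a minimal sketch of how a strategy could be modelled as plain data. The class names, field names, and selectors (`Strategy`, `FieldDef`, `item_container`, `div.product-card`, and so on) are assumptions made up for illustration; Meter's actual schema may look different.

```python
from dataclasses import dataclass, field


@dataclass
class FieldDef:
    """Maps one output field to the CSS selector that locates it (hypothetical shape)."""
    name: str
    selector: str
    attribute: str = "text"  # what to read from the matched element, e.g. "text" or "src"


@dataclass
class Strategy:
    """Illustrative model only, not Meter's real strategy format."""
    name: str
    item_container: str  # selector matching each repeated item on the page
    fields: list[FieldDef] = field(default_factory=list)

    def field_names(self) -> list[str]:
        return [f.name for f in self.fields]


# Made-up selectors for an imaginary product listing page
products = Strategy(
    name="Product Scraper",
    item_container="div.product-card",
    fields=[
        FieldDef("title", "h2.title"),
        FieldDef("price", "span.price"),
        FieldDef("image", "img.thumb", attribute="src"),
    ],
)
print(products.field_names())
```

Because everything in this model is static selectors and field mappings, applying it to another page with the same structure needs no further AI calls, which is the property the reuse claim rests on.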

How strategies are generated

Meter uses AI to analyze your target webpage and generate precise extraction strategies:
  1. You provide: A URL and plain-English description of what to extract
  2. Meter analyzes: The page structure, HTML patterns, and content layout
  3. AI generates: CSS selectors and extraction rules optimized for that page
  4. You get: A reusable strategy plus preview data showing what was extracted
This approach combines the intelligence of AI setup with the speed and reliability of traditional scraping.

Example

from meter_sdk import MeterClient

client = MeterClient(api_key="sk_live_...")

# Generate a strategy
result = client.generate_strategy(
    url="https://news.ycombinator.com",
    description="Extract post titles and scores",
    name="HN Front Page"
)

# Check the preview
print(f"Extracted {len(result['preview_data'])} items")
for item in result['preview_data'][:3]:
    print(item)

# Output:
# {'title': 'Launch HN: ...', 'score': 42}
# {'title': 'Ask HN: ...', 'score': 15}
# ...

Strategy lifecycle

  1. Generate: Create strategy with AI
  2. Preview: Check the preview_data to verify extraction
  3. Refine (optional): Provide feedback if something’s missing
  4. Use: Run jobs with the strategy
  5. Monitor: Check if results are still accurate over time

Refining strategies

If the initial extraction isn’t perfect, refine it with feedback:
# Initial generation
result = client.generate_strategy(
    url="https://shop.com/products",
    description="Extract product info",
    name="Product Scraper"
)

# Check preview - oops, missing images
print(result['preview_data'])  # No 'image' field

# Refine with feedback
refined = client.refine_strategy(
    strategy_id=result['strategy_id'],
    feedback="Also extract product images"
)

# Check again
print(refined['preview_data'])  # Now has 'image' field
Refinement uses cached HTML from the initial generation, so it’s fast and doesn’t re-fetch the page.
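
Refinement can also be repeated until the preview contains everything you need. The helper below is a sketch, not part of the SDK: `missing_fields` and `refine_until_complete` are hypothetical names, and the code assumes that `refine_strategy` returns a dict with a `preview_data` list, as in the example above. A field counts as present if at least one previewed item has it.

```python
def missing_fields(preview_data, required_fields):
    """Return the required fields that appear in none of the previewed items."""
    present = set().union(*(item.keys() for item in preview_data))
    return set(required_fields) - present


def refine_until_complete(client, result, required_fields, max_rounds=3):
    """Refine a strategy until all required fields show up, or max_rounds is reached.

    `client` is any object exposing refine_strategy(strategy_id=..., feedback=...).
    Returns the most recent result; the caller should re-check missing_fields on it.
    """
    strategy_id = result["strategy_id"]  # taken from the initial generation result
    for _ in range(max_rounds):
        missing = missing_fields(result["preview_data"], required_fields)
        if not missing:
            break
        result = client.refine_strategy(
            strategy_id=strategy_id,
            feedback=f"Also extract: {', '.join(sorted(missing))}",
        )
    return result
```

Capping the number of rounds keeps the loop from running indefinitely when a field simply does not exist on the page.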

When to create new strategies

Create a new strategy when:
  • Different site structure: each website layout needs its own strategy
  • Different data fields: different extraction requirements need different strategies
  • Major site redesign: a site changes its HTML structure significantly
  • Different page types: product pages and category pages need separate strategies

Reusing strategies

You can reuse the same strategy across:
  • Multiple URLs on the same site (e.g., different products)
  • Pagination (if the structure is consistent)
  • Similar pages (if they share HTML structure)
# Generate once
strategy = client.generate_strategy(
    url="https://shop.com/product/123",
    description="Extract name, price, description"
)
strategy_id = strategy['strategy_id']

# Reuse for different products
job1 = client.create_job(strategy_id, "https://shop.com/product/123")
job2 = client.create_job(strategy_id, "https://shop.com/product/456")
job3 = client.create_job(strategy_id, "https://shop.com/product/789")
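
For longer URL lists, the same calls can be wrapped in a loop. This is a sketch under assumptions: `create_jobs_for_urls` is a hypothetical helper, and because the SDK's exception types are not documented on this page, it catches the generic `Exception`.

```python
def create_jobs_for_urls(client, strategy_id, urls):
    """Create one job per URL with the same strategy.

    Failures are collected rather than raised, so one bad URL does not
    stop the rest of the batch. Returns (jobs_by_url, errors_by_url).
    """
    jobs, errors = {}, {}
    for url in urls:
        try:
            jobs[url] = client.create_job(strategy_id, url)
        except Exception as exc:  # assumption: the SDK's specific error classes are unknown
            errors[url] = exc
    return jobs, errors
```

A call might look like `jobs, errors = create_jobs_for_urls(client, strategy_id, product_urls)`, followed by a check of `errors` before relying on the results.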

Strategy management

Listing strategies

# Get all strategies
strategies = client.list_strategies(limit=20)

for strategy in strategies:
    print(f"{strategy['name']}: {strategy['id']}")
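
Since jobs are created from a strategy ID, it is often handy to look one up by name. The function below is a hypothetical helper, assuming that `list_strategies` yields dicts with `name` and `id` keys as in the loop above. It only sees the strategies passed to it, so a small `limit` may hide a match.

```python
def find_strategy_id(strategies, name):
    """Return the id of the strategy with this exact name, or None if there is none.

    Raises ValueError when several strategies share the name, because
    picking one silently could run jobs against the wrong strategy.
    """
    matches = [s["id"] for s in strategies if s["name"] == name]
    if len(matches) > 1:
        raise ValueError(f"{len(matches)} strategies are named {name!r}")
    return matches[0] if matches else None
```

Usage would be along the lines of `find_strategy_id(client.list_strategies(limit=20), "HN Front Page")`.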

Getting strategy details

strategy = client.get_strategy(strategy_id)

print(f"Name: {strategy['name']}")
print(f"Description: {strategy['description']}")
print(f"Created: {strategy['created_at']}")
print(f"Preview: {strategy['preview_data']}")

Deleting strategies

# Delete a strategy (also deletes associated jobs and schedules)
client.delete_strategy(strategy_id)
Deleting a strategy also deletes all associated jobs and schedules. This action cannot be undone.

Best practices

Give strategies clear names that describe their purpose:
  • Good: "HN Front Page - Titles and Scores"
  • Bad: "Strategy 1"
This helps when managing multiple strategies.
Always check preview_data before creating jobs:
result = client.generate_strategy(...)

# Verify all required fields are present (an empty preview counts as missing everything)
required_fields = {'title', 'price', 'image'}
preview = result['preview_data']
actual_fields = set(preview[0].keys()) if preview else set()

if not required_fields.issubset(actual_fields):
    missing = required_fields - actual_fields
    client.refine_strategy(
        strategy_id=result['strategy_id'],
        feedback=f"Also extract: {', '.join(missing)}"
    )
Provide clear, specific extraction instructions:
  • Good: “Extract product name, price with currency, main image URL, and stock availability from the product grid”
  • Bad: “Get products”
Specific descriptions lead to better strategies on the first try.
Strategies can break if sites change their HTML:
# Check recent jobs for failures
jobs = client.list_jobs(
    strategy_id=strategy_id,
    status='failed',
    limit=5
)

if len(jobs) > 0:
    print(f"Strategy {strategy_id} may need updating")
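
The same check can be run across several strategies at once. This sketch assumes that `list_jobs` returns a list, as the `len(jobs)` call above implies; the helper name and the `max_failures` and `window` parameters are hypothetical.

```python
def strategies_needing_attention(client, strategy_ids, max_failures=0, window=5):
    """Return {strategy_id: failed_count} for strategies with too many failed jobs.

    For each strategy, up to `window` failed jobs are fetched; the strategy is
    flagged when that count exceeds `max_failures`.
    """
    flagged = {}
    for strategy_id in strategy_ids:
        failed = client.list_jobs(
            strategy_id=strategy_id,
            status="failed",
            limit=window,
        )
        if len(failed) > max_failures:
            flagged[strategy_id] = len(failed)
    return flagged
```

Raising `max_failures` above zero tolerates occasional transient failures, so a strategy is only flagged when failures start to accumulate.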

Troubleshooting

Problem: Strategy generation fails
Possible causes:
  • URL is not accessible
  • Page requires authentication
  • Description is too vague
Solutions:
  • Verify the URL loads in a browser
  • For auth-required pages, contact support
  • Make your description more specific
Problem: Some expected fields aren’t in preview_data
Solution: Use refinement:
client.refine_strategy(
    strategy_id=strategy_id,
    feedback="Also extract the product SKU and brand name"
)
Problem: Jobs that worked before now fail or return incorrect data
Cause: Website HTML structure changed
Solutions:
  1. Generate a new strategy for the updated site
  2. Update your jobs to use the new strategy
  3. Delete the old strategy

Need help?

Email me at [email protected]