For sites that load data via JavaScript APIs, use force_api=True to capture the underlying API:
    # Force API-based capture
    result = client.generate_strategy(
        url="https://api-heavy-site.com/listings",
        description="Extract all listing data",
        name="Listings API Scraper",
        force_api=True
    )

    # Check scraper type
    print(f"Scraper type: {result['scraper_type']}")  # 'api' or 'css'

    # For API strategies, available parameters are returned
    if result.get('api_parameters'):
        print(f"Available parameters: {result['api_parameters']}")
        # e.g., {'page': 1, 'limit': 20, 'category': 'all'}
API-based strategies capture underlying API calls instead of using CSS selectors.
This is useful for dynamic sites where data is loaded via JavaScript.
    # Get all strategies
    strategies = client.list_strategies()

    # Find by name
    product_strategies = [
        s for s in strategies
        if 'product' in s['name'].lower()
    ]

    # Most recent
    recent = client.list_strategies(limit=5, offset=0)
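Because `list_strategies` accepts `limit` and `offset`, a larger collection can be walked page by page. The following is a minimal sketch rather than part of the client: `iter_strategies` is a hypothetical helper, and it assumes that `list_strategies` returns a plain list and that a page shorter than `limit` marks the end of the results.

```python
from typing import Any, Callable, Dict, Iterator, List

Strategy = Dict[str, Any]


def iter_strategies(
    list_page: Callable[..., List[Strategy]],
    page_size: int = 50,
) -> Iterator[Strategy]:
    """Yield every strategy by requesting successive pages.

    `list_page` is any callable accepting `limit` and `offset` keyword
    arguments, for example `client.list_strategies`.
    """
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    offset = 0
    while True:
        page = list_page(limit=page_size, offset=offset)
        yield from page
        # Assumption: a short (or empty) page means there is nothing left.
        if len(page) < page_size:
            break
        offset += page_size
```

With a real client this could be used as `for s in iter_strategies(client.list_strategies): print(s['name'])`. Passing the method as a callable keeps the helper independent of the client class.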
    # Delete old strategies
    # Note: is_old() is a helper you supply; it is not defined in this snippet.
    strategies = client.list_strategies(limit=100)
    for strategy in strategies:
        # Delete if created more than 30 days ago
        if is_old(strategy['created_at']):
            client.delete_strategy(strategy['strategy_id'])
            print(f"Deleted {strategy['name']}")
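The cleanup snippet calls `is_old` without defining it. One possible implementation is sketched below. It assumes `created_at` is an ISO 8601 string (for example `2024-01-15T10:30:00Z`); the actual format is not stated here, so check a real response before relying on it. The `max_age_days` and `now` parameters are illustrative additions.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def is_old(
    created_at: str,
    max_age_days: int = 30,
    now: Optional[datetime] = None,
) -> bool:
    """Return True if `created_at` is more than `max_age_days` in the past.

    Assumption: `created_at` is an ISO 8601 string such as
    '2024-01-15T10:30:00Z' or '2024-01-15T10:30:00+00:00'.
    """
    # datetime.fromisoformat() rejects a trailing 'Z' before Python 3.11.
    if created_at.endswith("Z"):
        created_at = created_at[:-1] + "+00:00"
    created = datetime.fromisoformat(created_at)
    # Treat timestamps without an offset as UTC.
    if created.tzinfo is None:
        created = created.replace(tzinfo=timezone.utc)
    if now is None:
        now = datetime.now(timezone.utc)
    return now - created > timedelta(days=max_age_days)
```

The optional `now` argument exists only so the cutoff can be pinned in tests; in normal use it can be omitted.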