M3U Snapshots

EPGOAT Documentation - User Guides

M3U Snapshot Feature - Usage Guide

Status: Active
Last Updated: 2025-11-02
Related Docs: Command Reference, EPG Generation
Code Location: backend/epgoat/domain/parsers.py, backend/epgoat/cli/run_provider.py


Overview

The M3U snapshot feature captures the exact M3U file used during EPG generation, saving it with a timestamp for benchmarking purposes. This is critical when M3U files change dynamically and you need to test against identical data.

Quick Start

Save a snapshot during your next run:

python3 backend/epgoat/run_provider.py \
  --provider tps \
  --save-m3u-snapshot \
  --max-channels 5 \
  --force-refresh \
  --verbose

Output:

INFO: ✓ M3U snapshot saved: /Users/you/repo/dist/snapshots/tps-snapshot-20251030-143052.m3u
INFO:   Snapshot size: 524288 bytes (1234 lines)

Features

✅ Exact capture - Saves M3U immediately after fetch, before any processing
✅ Timestamped - Format: {provider}-snapshot-{YYYYMMDD-HHMMSS}.m3u
✅ Provider-tagged - Includes provider name in filename
✅ Works with URLs and local files - Handles both HTTP(S) and file:// inputs
✅ Auto-creates directory - dist/snapshots/ created automatically
✅ Git-ignored - Snapshots excluded from version control

File Locations

  • Snapshots directory: dist/snapshots/
  • Naming pattern: {provider}-snapshot-{timestamp}.m3u
  • Example: tps-snapshot-20251030-143052.m3u
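
Because the provider and capture time are encoded in the filename, a script can recover both without opening the file. The following is a minimal sketch, assuming only the naming pattern documented above; `parse_snapshot_name` and `latest_snapshot` are hypothetical helpers, not functions from the EPGOAT codebase.

```python
import re
from datetime import datetime
from typing import Iterable, Optional, Tuple

# Matches {provider}-snapshot-{YYYYMMDD-HHMMSS}.m3u as documented above.
SNAPSHOT_RE = re.compile(r"^(?P<provider>.+)-snapshot-(?P<stamp>\d{8}-\d{6})\.m3u$")
STAMP_FORMAT = "%Y%m%d-%H%M%S"


def parse_snapshot_name(name: str) -> Tuple[str, datetime]:
    """Split a snapshot filename into (provider, capture time)."""
    match = SNAPSHOT_RE.match(name)
    if match is None:
        raise ValueError(f"not a snapshot filename: {name!r}")
    return match["provider"], datetime.strptime(match["stamp"], STAMP_FORMAT)


def latest_snapshot(names: Iterable[str], provider: str) -> Optional[str]:
    """Return the newest snapshot filename for a provider, or None if absent."""
    dated = []
    for name in names:
        match = SNAPSHOT_RE.match(name)
        if match and match["provider"] == provider:
            dated.append((datetime.strptime(match["stamp"], STAMP_FORMAT), name))
    return max(dated)[1] if dated else None
```

For example, passing the names from `os.listdir("dist/snapshots")` to `latest_snapshot(..., "tps")` would pick the most recent tps capture by its embedded timestamp rather than by file modification time.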

Usage Scenarios

1. Baseline for API Matching Project

# Save baseline before starting API overhaul
python3 backend/epgoat/run_provider.py \
  --provider tps \
  --save-m3u-snapshot \
  --max-channels 1000

# Result: dist/snapshots/tps-snapshot-20251030-120000.m3u
# Use this exact file for benchmarking after Phase 7

2. Compare Runs Over Time

# Monday run
python3 backend/epgoat/run_provider.py --provider tps --save-m3u-snapshot
# → tps-snapshot-20251027-090000.m3u

# Friday run
python3 backend/epgoat/run_provider.py --provider tps --save-m3u-snapshot
# → tps-snapshot-20251031-090000.m3u

# Compare changes
diff dist/snapshots/tps-snapshot-20251027-090000.m3u \
     dist/snapshots/tps-snapshot-20251031-090000.m3u
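
A line-level diff can be noisy when a provider reorders channels or rotates stream URLs. As an alternative view, the sketch below compares only channel display names between two snapshots. It assumes the common extended-M3U convention that the display name follows the last comma on each `#EXTINF` line, so names that themselves contain commas would be truncated. The function names are illustrative, not part of EPGOAT.

```python
from typing import List, Set, Tuple


def channel_names(m3u_text: str) -> Set[str]:
    """Collect display names from #EXTINF lines (text after the last comma)."""
    names = set()
    for line in m3u_text.splitlines():
        if line.startswith("#EXTINF") and "," in line:
            names.add(line.rsplit(",", 1)[1].strip())
    return names


def compare_snapshots(old_text: str, new_text: str) -> Tuple[List[str], List[str]]:
    """Return (added, removed) channel names, each sorted alphabetically."""
    old, new = channel_names(old_text), channel_names(new_text)
    return sorted(new - old), sorted(old - new)
```

To use it on real files, read each snapshot with `pathlib.Path(...).read_text(encoding="utf-8")` and pass the two strings to `compare_snapshots`.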

3. Debug Specific Failures

# Save snapshot when failure occurs
python3 backend/epgoat/run_provider.py \
  --provider tps \
  --save-m3u-snapshot \
  --verbose

# Later: replay with exact same data
python3 backend/epgoat/application/epg_generator.py \
  --m3u dist/snapshots/tps-snapshot-20251030-143052.m3u \
  --tz "America/Chicago"

Implementation Details

Code Changes

  1. backend/epgoat/domain/parsers.py (lines 329-397)
     • Added save_snapshot and provider parameters to parse_m3u()
     • Saves M3U content immediately after fetch
     • Creates timestamped files in dist/snapshots/

  2. backend/epgoat/cli/run_provider.py (lines 550, 268, 272)
     • Added --save-m3u-snapshot CLI flag
     • Passes flag through to EPG generator
     • Passes provider name for filename tagging

  3. backend/epgoat/application/epg_generator.py (lines 501-503)
     • Reads save_m3u_snapshot flag from args
     • Passes to parse_m3u() during parsing

  4. .gitignore (line 61)
     • Added dist/snapshots/ to exclusions

What Gets Saved

  • Raw M3U content - Exactly as received from URL or file
  • No modifications - Before any parsing, filtering, or processing
  • Full file - Not truncated or altered (unlike clone_m3u, which modifies the playlist)
  • UTF-8 encoding - Standard text encoding
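
The save step described above can be sketched roughly as follows. This is not the code in parsers.py; it is a minimal illustration, assuming the already-fetched M3U text is available as a string. `save_raw_snapshot` and `describe_snapshot` are hypothetical names.

```python
from datetime import datetime
from pathlib import Path
from typing import Optional


def save_raw_snapshot(content: str, provider: str,
                      out_dir: Path = Path("dist/snapshots"),
                      now: Optional[datetime] = None) -> Path:
    """Write unmodified M3U text to a timestamped file and return its path."""
    stamp = (now or datetime.now()).strftime("%Y%m%d-%H%M%S")
    out_dir.mkdir(parents=True, exist_ok=True)  # create the directory if missing
    path = out_dir / f"{provider}-snapshot-{stamp}.m3u"
    # newline="" stops Python from translating line endings on write,
    # so the saved text matches what was received.
    with path.open("w", encoding="utf-8", newline="") as handle:
        handle.write(content)
    return path


def describe_snapshot(path: Path) -> str:
    """Format a size summary similar to the log line shown in Quick Start."""
    data = path.read_bytes()
    return f"Snapshot size: {len(data)} bytes ({len(data.splitlines())} lines)"
```

Writing before any parsing is what keeps the snapshot independent of later filtering such as --max-channels; the `now` parameter exists only to make the sketch testable with a fixed timestamp.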

Performance Impact

  • Negligible - Only adds ~1-5ms per run
  • Disk usage - Typical M3U: 500KB-2MB per snapshot
  • No API calls - Uses already-fetched content

Comparison with clone_m3u

| Feature | --save-m3u-snapshot | backend/epgoat/utilities/clone_m3u.py |
| --- | --- | --- |
| Timing | Immediately after fetch | After EPG generation |
| Content | Raw, unmodified | Modified (tvg-ids added) |
| Re-fetch | No | Yes (could be different) |
| Purpose | Benchmarking/debugging | Stable IDs for players |
| When to use | Need exact source data | Need stable channel IDs |

Best Practices

For Benchmarking (Your Use Case)

# 1. Save baseline snapshot
python3 backend/epgoat/run_provider.py \
  --provider tps \
  --save-m3u-snapshot \
  --max-channels 1000 \
  --force-refresh

# 2. Note the snapshot filename from logs
# → tps-snapshot-20251030-120000.m3u

# 3. Copy to baselines directory for safe keeping
mkdir -p dist/baselines
cp dist/snapshots/tps-snapshot-20251030-120000.m3u \
   dist/baselines/tps-baseline-phase0.m3u

# 4. After Phase 7, test against same data
python3 backend/epgoat/application/epg_generator.py \
  --m3u dist/baselines/tps-baseline-phase0.m3u \
  --tz "America/Chicago"
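
Before a benchmark run, it can be worth confirming that the baseline copy is still byte-identical to the snapshot it came from. One way to do that is to compare SHA-256 digests, as in the sketch below. The helper names are illustrative and nothing here is part of EPGOAT itself.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 65536) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_content(first: Path, second: Path) -> bool:
    """True when both files have identical bytes (by SHA-256)."""
    return sha256_of(first) == sha256_of(second)
```

Recording the digest of dist/baselines/tps-baseline-phase0.m3u alongside benchmark results also makes it possible to tell later which input a given result was produced from.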

Cleanup Old Snapshots

# Keep only last 7 days
find dist/snapshots -name "*.m3u" -mtime +7 -delete

# Keep only last 10 snapshots per provider
ls -t dist/snapshots/tps-*.m3u | tail -n +11 | xargs rm -f
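
The `ls -t` command above handles a single provider. A retention policy that covers every provider at once could look like the following sketch. Note the assumption: it orders snapshots by the timestamp embedded in the filename rather than by file modification time, so the result can differ from `ls -t` if files were copied or touched. `snapshots_to_delete` and `prune` are hypothetical helpers.

```python
import re
from collections import defaultdict
from pathlib import Path
from typing import Iterable, List

SNAPSHOT_RE = re.compile(r"^(?P<provider>.+)-snapshot-(?P<stamp>\d{8}-\d{6})\.m3u$")


def snapshots_to_delete(names: Iterable[str], keep: int = 10) -> List[str]:
    """List snapshot filenames beyond the newest `keep` for each provider."""
    by_provider = defaultdict(list)
    for name in names:
        match = SNAPSHOT_RE.match(name)
        if match:  # ignore files that do not follow the naming pattern
            by_provider[match["provider"]].append((match["stamp"], name))
    doomed = []
    for entries in by_provider.values():
        entries.sort(reverse=True)  # YYYYMMDD-HHMMSS sorts chronologically as text
        doomed.extend(name for _, name in entries[keep:])
    return sorted(doomed)


def prune(directory: Path, keep: int = 10) -> List[str]:
    """Delete old snapshots in `directory` and return the removed filenames."""
    doomed = snapshots_to_delete((p.name for p in directory.glob("*.m3u")), keep)
    for name in doomed:
        (directory / name).unlink()
    return doomed
```

Calling `snapshots_to_delete` first gives a dry run; `prune(Path("dist/snapshots"))` performs the deletion.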

Troubleshooting

Snapshot not created?

Check the logs for:

ERROR: Failed to fetch M3U from URL: ...

If the fetch or validation fails, the snapshot is not saved.

Wrong provider name in filename?

Ensure --provider is passed:

python3 backend/epgoat/run_provider.py --provider tps --save-m3u-snapshot
#                                      ^^^^^^^^^^^^^^

Snapshot directory not created?

Directory is auto-created. If failing, check permissions:

mkdir -p dist/snapshots
chmod 755 dist/snapshots

Examples

Minimal Command

python3 backend/epgoat/run_provider.py \
  --provider tps \
  --save-m3u-snapshot

With Additional Options

python3 backend/epgoat/run_provider.py \
  --provider tps \
  --save-m3u-snapshot \
  --max-channels 1000 \
  --force-refresh \
  --verbose \
  --debug-matching \
  --date 2025-10-30

With Custom M3U URL

python3 backend/epgoat/run_provider.py \
  --provider custom \
  --save-m3u-snapshot \
  --m3u "https://example.com/playlist.m3u"

FAQ

Q: Does this slow down the run?
A: No, negligible impact (~1-5ms). The M3U is already fetched; we just save it.

Q: Can I use this with local M3U files?
A: Yes! It will copy the file with a timestamp.

Q: Do I need this every run?
A: No, only when you need to capture the exact data for later replay/benchmarking.

Q: How much disk space do snapshots use?
A: Typical M3U: 500KB-2MB. 100 snapshots ≈ 50-200MB.

Q: Are snapshots committed to git?
A: No, dist/snapshots/ is in .gitignore.


Pro Tip: For your API matching project, save a snapshot NOW before any changes. This will be your "Phase 0 baseline" to compare against after Phase 7 completion!