Common Workflows & Operations

EPGOAT Documentation - AI Reference (Educational)

Common Workflows & Operations (Educational Version)

Note: This is the educational, human-readable version with examples and detailed explanations. For the AI-optimized version, see 1-CONTEXT/_WORKFLOWS.md.


Common Workflows

Purpose: Quick reference for daily operations
Token Budget: ~2K tokens (part of 50K Layer 1 budget)
Last Updated: 2025-11-13

Full Details: Documentation/04-Guides/ (step-by-step guides with screenshots)


EPG Generation (Primary Workflow)

Generate EPG for Specific Provider + Date

cd backend/epgoat
python cli/run_provider.py --provider tps --date 2025-11-10

Flags:

  • --provider <slug>: Provider slug (e.g., tps, necro, trex)
  • --date <YYYY-MM-DD>: Target date (default: today)
  • --output <path>: Output file path (default: output/<provider>_<date>.xml)
  • --dry-run: Skip database writes

Output: XMLTV file at backend/epgoat/output/tps_2025-11-10.xml

💡 EPG Generation - Complete Example

# ============================================================
# Generate EPG for specific provider and date
# ============================================================

# Navigate to backend directory:
cd backend/epgoat

# Basic generation (today's date):
python cli/run_provider.py --provider tps

# Specify date:
python cli/run_provider.py --provider tps --date 2025-11-10

# Custom output file:
python cli/run_provider.py \
    --provider tps \
    --date 2025-11-10 \
    --output /tmp/custom_epg.xml

# Dry run (no database writes):
python cli/run_provider.py --provider tps --dry-run

# ============================================================
# Expected Output
# ============================================================

# Console output:
# [2025-11-10 14:30:15] Loading provider config: tps
# [2025-11-10 14:30:16] Config loaded from YAML cache (53x faster)
# [2025-11-10 14:30:16] Fetching M3U playlist...
# [2025-11-10 14:30:18] Fetched 12,847 channels
# [2025-11-10 14:30:18] VOD filter: removed 11,782 channels (91.7%)
# [2025-11-10 14:30:18] Processing 1,065 channels...
#
# Enrichment Progress:
# ████████████████████████████████████████ 1065/1065 (100%)
#
# Match Statistics:
# ├─ Total channels: 1,065
# ├─ Matched: 987 (92.7%)
# ├─ Unmatched: 78 (7.3%)
# └─ Average time: 42ms per channel
#
# Handler Performance:
# ├─ EnhancedMatchCache: 823 matches (83.4%, 0ms avg)
# ├─ LocalDatabase: 142 matches (14.4%, 23ms avg)
# ├─ Regex: 22 matches (2.2%, 3ms avg)
# └─ API: 0 matches (0%, 0ms avg)
#
# [2025-11-10 14:31:04] Generated XMLTV: 987 programmes
# [2025-11-10 14:31:06] Saved matches to database
# [2025-11-10 14:31:06] EPG file: output/tps_2025-11-10.xml
# [2025-11-10 14:31:06] Total time: 51 seconds
# [2025-11-10 14:31:06] ✅ EPG generation complete

# ============================================================
# Output File Structure
# ============================================================

# output/tps_2025-11-10.xml contains:
#
# <?xml version="1.0" encoding="UTF-8"?>
# <!DOCTYPE tv SYSTEM "xmltv.dtd">
# <tv generator-info-name="EPGOAT">
#
#   <!-- Channel definitions -->
#   <channel id="nba-01-lakers">
#     <display-name>NBA 01: Lakers vs Celtics</display-name>
#     <icon src="https://logo.epgoat.tv/nba.png" />
#   </channel>
#
#   <!-- Programme listings -->
#   <programme start="20251110183000 -0600" stop="20251110190000 -0600" channel="nba-01-lakers">
#     <title>NBA Pre-Game Show</title>
#     <sub-title>Lakers vs Celtics Preview</sub-title>
#     <category>Sports</category>
#     <category>Basketball</category>
#   </programme>
#
#   <programme start="20251110190000 -0600" stop="20251110220000 -0600" channel="nba-01-lakers">
#     <title>Lakers vs Celtics</title>
#     <sub-title>NBA - Regular Season</sub-title>
#     <desc>Los Angeles Lakers vs Boston Celtics</desc>
#     <category>Sports</category>
#     <category>Basketball</category>
#     <icon src="https://logo.epgoat.tv/nba.png" />
#   </programme>
#
#   <programme start="20251110220000 -0600" stop="20251110223000 -0600" channel="nba-01-lakers">
#     <title>NBA Post-Game Analysis</title>
#     <sub-title>Lakers vs Celtics Highlights</sub-title>
#     <category>Sports</category>
#     <category>Basketball</category>
#   </programme>
#
# </tv>

Key Features:

  1. Automatic Caching: Config cached in YAML (53x faster loads)
  2. VOD Filtering: 91.7% reduction (movies/TV shows removed)
  3. Multi-Handler Pipeline: 7 handlers try to match each channel
  4. Performance Metrics: Hit rates, timing, success rates
  5. Three-Block Schedule: Pre-game (30min), Live (3h), Post-game (30min)

Performance Breakdown:

  • Config loading: 1s (YAML cache)
  • M3U fetch: 2s (HTTP GET)
  • VOD filter: 2s (91.7% reduction)
  • Enrichment: 45s (1,065 channels @ 42ms each)
  • XMLTV generation: 1s
  • Total: ~51 seconds

Common Flags:

  • --provider <slug>: Provider to process (required)
  • --date <YYYY-MM-DD>: Target date (default: today)
  • --output <path>: Output file path (default: output/<provider>_<date>.xml)
  • --dry-run: Skip database writes (testing)

📊 Complete EPG Generation Workflow

flowchart TD
    Start([Start EPG Generation]) --> LoadConfig[Load Provider Config<br/>YAML cache or Database]
    LoadConfig --> FetchM3U[Fetch M3U Playlist<br/>HTTP GET from provider]

    FetchM3U --> VODFilter[VOD Filter<br/>91.7% reduction]
    VODFilter --> ParseChannels[Parse Channels<br/>Extract metadata]

    ParseChannels --> Preprocessing[Preprocessing<br/>Teams, Sport, League, Time]

    Preprocessing --> Handler1[Handler 1: Cache Lookup]
    Handler1 -->|HIT| Match[Match Found]
    Handler1 -->|MISS| Handler2[Handler 2: Event Cache]
    Handler2 -->|HIT| Match
    Handler2 -->|MISS| Handler3[Handler 3: Database]
    Handler3 -->|HIT| Match
    Handler3 -->|MISS| Handler4[Handler 4: Regex]
    Handler4 -->|HIT| Match
    Handler4 -->|MISS| Handler5[Handler 5: Cross-Provider]
    Handler5 -->|HIT| Match
    Handler5 -->|MISS| Handler6[Handler 6: API]
    Handler6 -->|HIT| Match
    Handler6 -->|MISS| Handler7[Handler 7: LLM]
    Handler7 -->|HIT| Match
    Handler7 -->|MISS| NoMatch[No Match]

    Match --> Schedule[Generate Schedule<br/>Pre/Live/Post blocks]
    NoMatch --> Schedule

    Schedule --> XMLTV[Generate XMLTV XML]
    XMLTV --> SaveDB[Save Matches to DB]
    SaveDB --> Output[Write EPG File]
    Output --> Stats[Log Statistics]
    Stats --> End([End])

    style Start fill:#90EE90
    style Match fill:#32CD32
    style NoMatch fill:#FF6B6B
    style Output fill:#87CEEB
    style End fill:#90EE90

Full EPG Generation Pipeline: Input M3U → Output XMLTV (30-120 seconds)

Key Stages:

  1. Config Loading (1-5s): Load provider patterns, VOD filters, TVG-IDs
  2. M3U Fetch (2-10s): HTTP GET from provider URL
  3. VOD Filter (1-3s): Remove 91.7% of channels (movies, TV shows)
  4. Preprocessing (2-5s): Extract teams, sport, league, time from channel names
  5. Enrichment (20-90s): 7-handler matching pipeline
  6. Scheduling (1-3s): Generate pre-game, live, post-game blocks
  7. XMLTV Generation (1-3s): Convert to XMLTV XML format
  8. Database Save (2-5s): Cache matches for future runs

Run Command:

cd backend/epgoat
python cli/run_provider.py --provider tps --date 2025-11-10


What Happens (Internal Flow)

  1. Load provider config from config/providers/<provider>.yml (or fetch from DB)
  2. Fetch M3U playlist from provider URL
  3. Filter VOD channels (91.7% reduction via vod_detector.py)
  4. Parse channels using provider-specific patterns
  5. Extract teams, times, sport families from channel names
  6. Run 7-handler enrichment pipeline:
     Enhanced Match Cache → Event Details Cache → Local DB → Regex → Cross-Provider → API → LLM
  7. Generate XMLTV with programme blocks (pre-event, live, post-event)
  8. Write to output file
  9. Save match results to database

Typical Runtime: 30-120 seconds (depends on provider size, cache hit rate)

📖 EPG Generation Internals - What Really Happens

When you run python cli/run_provider.py --provider tps, here's the complete execution flow:

Phase 1: Initialization (1-5 seconds)

  1. Load environment variables (.env file)
  2. Initialize database connection (Supabase PostgreSQL)
  3. Load provider config:
     • Check YAML cache (config/providers/tps.yml)
     • If cache exists and < 24h old: load from YAML (0.1s)
     • If cache missing/stale: fetch from database (5s), write YAML
  4. Initialize services:
     • EnhancedMatchCache (in-memory)
     • CrossProviderCache (in-memory)
     • HandlerFactory (creates 7 handlers)
     • EnrichmentPipeline (with handlers)

Phase 2: M3U Fetching (2-10 seconds)

  1. HTTP GET request to provider M3U URL
  2. Parse M3U format:
     • Extract #EXTINF lines (channel metadata)
     • Extract URLs (stream links)
     • Build M3UEntry objects
  3. Create Channel objects (wrap M3UEntry with provider context)
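A minimal sketch of the two-line #EXTINF/URL parse described above. The real M3UEntry model carries more fields (tvg-id, group-title, logo); this only pulls out the display name and stream URL.

```python
import re

# "#EXTINF:-1 tvg-id=\"x\" group-title=\"Sports\",NBA 01: Lakers vs Celtics"
# → duration, optional attributes (up to the comma), then the display name.
EXTINF_RE = re.compile(r'#EXTINF:-?\d+(?:\s+(?P<attrs>[^,]*))?,(?P<name>.*)')

def parse_m3u(text):
    """Pair each #EXTINF metadata line with the stream URL that follows it."""
    entries, pending = [], None
    for line in text.splitlines():
        line = line.strip()
        m = EXTINF_RE.match(line)
        if m:
            pending = m.group('name').strip()
        elif line and not line.startswith('#') and pending is not None:
            entries.append({'name': pending, 'url': line})
            pending = None
    return entries
```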

Phase 3: VOD Filtering (1-3 seconds)

  1. Load VOD filter patterns from provider config
  2. Apply patterns to channel names:
     • Match "1080p", "4K", "HD" (quality indicators)
     • Match movie/TV show keywords
     • Match specific channel name patterns
  3. Remove matched channels (91.7% reduction)
  4. Result: ~1,000 sports channels from ~12,000 total
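The filtering step amounts to one combined regex over channel names. The patterns below are illustrative examples, not the provider-specific patterns stored in the database:

```python
import re

# Example VOD indicators: quality tags, movie/series keywords, SxxExx episodes.
VOD_PATTERNS = [r'\b(?:1080p|4K)\b', r'\b(?:Movie|Film|Series)\b', r'S\d{2}E\d{2}']
VOD_RE = re.compile('|'.join(VOD_PATTERNS), re.IGNORECASE)

def filter_vod(channel_names):
    """Keep only channels whose names match none of the VOD patterns."""
    return [name for name in channel_names if not VOD_RE.search(name)]
```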

Phase 4: Preprocessing (2-5 seconds)

  1. Extract channel families:
     • "NBA 01: Lakers vs Celtics" → family="NBA"
     • "NFL RedZone" → family="NFL"
  2. Extract team names:
     • "Lakers vs Celtics" → team1="Lakers", team2="Celtics"
  3. Detect sport:
     • family="NBA" → sport="Basketball"
  4. Infer league:
     • family="NBA", teams=["Lakers", "Celtics"] → league="NBA"
  5. Extract time:
     • "7:00 PM ET" → datetime(19, 0, tzinfo=ET)
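The family/team/time extraction can be sketched as one regex over the "FAMILY NN: Team1 vs Team2 H:MM AM/PM ET" shape shown above. This is a simplified stand-in; the real preprocessing stage handles many more name variants.

```python
import re
from datetime import time

# Hypothetical single-shape matcher for names like
# "NBA 01: Lakers vs Celtics 7:00 PM ET" (time portion optional).
NAME_RE = re.compile(
    r'^(?P<family>[A-Za-z ]+?)\s+\d+\s*:\s*'
    r'(?P<team1>.+?)\s+vs\.?\s+(?P<team2>.+?)'
    r'(?:\s+(?P<hour>\d{1,2}):(?P<minute>\d{2})\s*(?P<ampm>AM|PM)\s+ET)?$'
)

def preprocess(name):
    """Return family, teams, and (if present) the local start time."""
    m = NAME_RE.match(name)
    if not m:
        return None
    result = {'family': m.group('family'),
              'team1': m.group('team1'),
              'team2': m.group('team2')}
    if m.group('hour'):
        hour = int(m.group('hour')) % 12
        if m.group('ampm') == 'PM':
            hour += 12
        result['start'] = time(hour, int(m.group('minute')))
    return result
```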

Phase 5: Enrichment Pipeline (20-90 seconds)

For each channel:

  1. Handler 1: Enhanced Match Cache (0ms)
     • Lookup by tvg-id OR channel_name
     • O(1) dict lookup
     • 83% hit rate (same-day re-processing)
     • If HIT: return cached match, skip remaining handlers

  2. Handler 2: Event Details Cache (0ms)
     • Cached event details from previous lookups
     • O(1) dict lookup
     • 5% additional hit rate

  3. Handler 3: Local Database (10-50ms)
     • SQL query: events by date, league, teams
     • Database index lookup
     • 10% additional hit rate

  4. Handler 4: Regex Matcher (1-5ms)
     • Pattern matching on channel name
     • Multi-stage: exact → fuzzy → team extraction
     • 2% additional hit rate

  5. Handler 5: Cross-Provider Cache (0ms)
     • Learn from other providers (shared matches)
     • O(1) dict lookup
     • 0.5% additional hit rate

  6. Handler 6: API Handler (100-500ms)
     • HTTP request to TheSportsDB API
     • Search by teams, date, league
     • Rarely used (expensive)
     • 0.5% additional hit rate

  7. Handler 7: LLM Fallback (1000-3000ms)
     • Claude API call
     • Natural language matching
     • Very rarely used (expensive)
     • 0.1% additional hit rate
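The pipeline above is a chain of responsibility: handlers run in priority order and the first HIT short-circuits the rest. The classes below are toy stand-ins for the real handlers, just to show the control flow and early exit:

```python
# Toy handlers: `match` returns a result dict on HIT, or None on MISS.
class CacheHandler:
    name = 'cache'
    def __init__(self, cache):
        self.cache = cache
    def match(self, channel):
        return self.cache.get(channel)          # O(1) dict lookup

class RegexHandler:
    name = 'regex'
    def match(self, channel):
        # Stand-in heuristic: any "Team1 vs Team2" name counts as a match.
        if 'vs' in channel:
            return {'event': channel, 'matched_by': self.name}
        return None

def enrich(channel, handlers):
    """Try handlers in priority order; stop at the first match (early exit)."""
    for handler in handlers:
        result = handler.match(channel)
        if result is not None:
            return result
    return None                                 # fell through all handlers
```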

Phase 6: Scheduling (1-3 seconds)

For each matched channel:

  1. Determine sport-specific duration:
     • NBA/NHL: 180 minutes (3 hours)
     • NFL: 210 minutes (3.5 hours)
     • Soccer: 120 minutes (2 hours)
     • MLB: 180 minutes (3 hours)

  2. Generate three programme blocks:
     • Pre-game: 30 minutes before event start
     • Live: Event duration (varies by sport)
     • Post-game: 30 minutes after event end

  3. Create Programme objects with XMLTV metadata:
     • Title, subtitle, description
     • Category (Sports → Basketball)
     • Icon (sport logo)
     • Start/stop times (UTC with offset)
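A minimal sketch of the three-block schedule, using the durations listed above. The dict-based programme shape is illustrative; the real code builds Programme objects with full XMLTV metadata.

```python
from datetime import datetime, timedelta

# Live durations in minutes, per the sport-specific table above.
LIVE_MINUTES = {'NBA': 180, 'NHL': 180, 'NFL': 210, 'Soccer': 120, 'MLB': 180}

def build_blocks(title, sport, start):
    """Return pre-game / live / post-game blocks around the event start."""
    live = timedelta(minutes=LIVE_MINUTES.get(sport, 180))
    pad = timedelta(minutes=30)                 # 30-minute pre/post blocks
    return [
        {'title': f'{title} Pre-Game', 'start': start - pad, 'stop': start},
        {'title': title, 'start': start, 'stop': start + live},
        {'title': f'{title} Post-Game', 'start': start + live, 'stop': start + live + pad},
    ]
```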

Phase 7: XMLTV Generation (1-3 seconds)

  1. Build XML document:
     • Header (generator info)
     • Channel definitions (tvg-id, display name, icon)
     • Programme listings (start, stop, title, desc, category)

  2. Format times:
     • Convert to XMLTV format: "20251110193000 -0600"
     • Include timezone offset

  3. Write to file: output/<provider>_<date>.xml
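The "20251110193000 -0600" timestamp format maps directly onto a strftime pattern for timezone-aware datetimes, since Python's `%z` renders fixed offsets as ±HHMM:

```python
from datetime import datetime, timezone, timedelta

def xmltv_time(dt):
    """Format an aware datetime as an XMLTV timestamp, e.g. '20251110193000 -0600'."""
    return dt.strftime('%Y%m%d%H%M%S %z')
```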

Phase 8: Database Save (2-5 seconds)

  1. Bulk insert matches to match_cache table
  2. Update unmatched_channels table (channels with no match)
  3. Update provider statistics (match rate, avg time)

Phase 9: Reporting (0 seconds)

  1. Calculate statistics:
     • Total channels processed
     • Match rate (92.7%)
     • Handler performance (hit rates, timings)
     • Total execution time

  2. Log to console and file

Total Time: 30-120 seconds (varies by provider size, cache hit rate)

Performance Optimizations:

  • YAML config caching: 53x faster than database
  • VOD filtering: 91.7% channel reduction
  • Enhanced match cache: 83% hit rate (O(1) lookup)
  • Early exit: Stop at first match (no wasted processing)
  • Bulk database operations: Single transaction for all matches

Database Refresh

Refresh Events from TheSportsDB

cd backend/epgoat
python utilities/refresh_event_db_v2.py --date 2025-11-10

Flags:

  • --date <YYYY-MM-DD>: Fetch events for this date
  • --days <N>: Fetch N days starting from date (default: 1)
  • --force: Force refresh even if events exist

What Happens:

  1. Query TheSportsDB API for events on date
  2. Deduplicate events (92% reduction)
  3. Bulk insert into events table
  4. Update participants, event_participants tables
  5. Cache event details for matching pipeline

Typical Runtime: 10-30 seconds per day
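The deduplication step collapses events that share the same identity key. The exact key is an assumption here; this sketch uses (league, date, home, away) and keeps the first record seen:

```python
def dedupe_events(events):
    """Drop events whose (league, date, home, away) key was already seen."""
    seen, unique = set(), []
    for ev in events:
        key = (ev['league'], ev['date'], ev['home'], ev['away'])
        if key not in seen:
            seen.add(key)
            unique.append(ev)
    return unique
```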

💡 Database Commands - Complete Guide

# ============================================================
# Refresh Events from TheSportsDB API
# ============================================================
cd backend/epgoat
python utilities/refresh_event_db_v2.py --date 2025-11-10

# Output:
# [2025-11-10 14:00:00] Fetching events for 2025-11-10...
# [2025-11-10 14:00:02] Fetched 1,247 events from API
# [2025-11-10 14:00:02] Deduplicating events...
# [2025-11-10 14:00:03] Removed 1,147 duplicates (92% reduction)
# [2025-11-10 14:00:03] Inserting 100 new events...
# [2025-11-10 14:00:05] Updated 23 existing events
# [2025-11-10 14:00:05] Total events in database: 12,456
# [2025-11-10 14:00:05] ✅ Event refresh complete

# Refresh multiple days:
python utilities/refresh_event_db_v2.py --date 2025-11-10 --days 7

# Force refresh (ignore existing):
python utilities/refresh_event_db_v2.py --date 2025-11-10 --force

# ============================================================
# Refresh Leagues Data
# ============================================================
python utilities/refresh_leagues.py

# Output:
# [2025-11-10 14:00:00] Fetching leagues from TheSportsDB...
# [2025-11-10 14:00:02] Fetched 247 leagues
# [2025-11-10 14:00:02] Updating database...
# [2025-11-10 14:00:04] Added 12 new leagues
# [2025-11-10 14:00:04] Updated 235 existing leagues
# [2025-11-10 14:00:04] Total leagues: 247
# [2025-11-10 14:00:04] ✅ League refresh complete

# ============================================================
# Run Database Migrations
# ============================================================
cd backend/epgoat/infrastructure/database
python migration_runner.py

# Output:
# [2025-11-10 14:00:00] Checking for pending migrations...
# [2025-11-10 14:00:00] Current schema version: 017
# [2025-11-10 14:00:00] Found 1 pending migration: 018_add_new_table.sql
# [2025-11-10 14:00:00] Running migration 018...
# [2025-11-10 14:00:01] ✅ Migration 018 applied successfully
# [2025-11-10 14:00:01] Schema version: 018

# ============================================================
# Create New Migration
# ============================================================
cd migrations/
vi 019_add_cost_tracking.sql

# File content:
# -- Migration 019: Add cost tracking table
# -- Created: 2025-11-10
#
# CREATE TABLE IF NOT EXISTS api_costs (
#     id INTEGER PRIMARY KEY,
#     provider TEXT NOT NULL,
#     api_name TEXT NOT NULL,
#     cost_usd REAL NOT NULL,
#     request_count INTEGER NOT NULL,
#     created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
# );
#
# CREATE INDEX idx_api_costs_provider ON api_costs(provider);
# CREATE INDEX idx_api_costs_created_at ON api_costs(created_at);

# Run the migration:
cd ../
python migration_runner.py

# ============================================================
# Query Database (Example SQL Queries)
# ============================================================

# View recent events:
# SELECT * FROM events WHERE event_date >= '2025-11-10' LIMIT 10;

# View unmatched channels:
# SELECT channel_name, provider_id, attempt_count
# FROM unmatched_channels
# WHERE record_status = 'active'
# ORDER BY attempt_count DESC
# LIMIT 50;

# View provider patterns:
# SELECT pattern, sport_family, priority, match_count
# FROM provider_patterns
# WHERE provider_id = 1 AND is_active = 1
# ORDER BY priority DESC;

# View cache performance:
# SELECT
#     COUNT(*) as total_matches,
#     AVG(confidence) as avg_confidence,
#     matched_by
# FROM match_cache
# WHERE created_at >= datetime('now', '-1 day')
# GROUP BY matched_by;

# ============================================================
# Clear Cache (Force Re-Matching)
# ============================================================

# Clear match cache older than 7 days:
# DELETE FROM match_cache WHERE created_at < datetime('now', '-7 days');

# Clear all cache (force complete re-match):
# DELETE FROM match_cache;
# -- Note: This is a rare operation. Usually not needed.

# ============================================================
# Backup Database (Manual)
# ============================================================

# Using Supabase dashboard:
# 1. Go to https://supabase.com/dashboard
# 2. Select project: epgoat-events
# 3. Navigate to: Settings → Database → Backups
# 4. Download backup file

Database Operations Summary:

Operation         Command                    Frequency
Refresh events    refresh_event_db_v2.py     Daily
Refresh leagues   refresh_leagues.py         Weekly
Run migrations    migration_runner.py        As needed
Clear cache       SQL DELETE                 Rarely

Event Refresh Strategy:

  • Run daily before EPG generation
  • Fetches events for next 7 days
  • Deduplicates automatically (92% reduction)
  • Updates existing events (if data changed)

Migration Best Practices:

  • Number migrations sequentially (018, 019, 020...)
  • Never modify existing migrations
  • Test migrations on copy of database first
  • Document breaking changes

Cache Management:

  • Match cache: 24-48h TTL (automatic)
  • Event details cache: 7-day TTL (automatic)
  • Manual clear only when debugging

📊 Database Migration Workflow

flowchart TD
    Change[Need Schema Change] --> Create[Create Migration File<br/>018_your_change.sql]

    Create --> Write[Write SQL<br/>CREATE/ALTER/INSERT]

    Write --> Test[Test Migration Locally]
    Test --> Works{Works?}
    Works -->|No| Debug[Debug SQL]
    Debug --> Write

    Works -->|Yes| Run[Run Migration<br/>python migration_runner.py]

    Run --> Verify[Verify Schema<br/>Check tables/columns]

    Verify --> Correct{Correct?}
    Correct -->|No| Rollback[Rollback Migration]
    Rollback --> Write

    Correct -->|Yes| Commit[Commit Migration File]
    Commit --> Push[Git Push]
    Push --> ProdDeploy[Deploy to Production]

    ProdDeploy --> ProdRun[Run Migration in Prod<br/>Automated or Manual]

    style Change fill:#E8F5E9
    style ProdRun fill:#90EE90
    style Debug fill:#FFB6C1
    style Rollback fill:#FFB6C1

Migration Process: Create → Test → Run → Verify → Commit → Deploy

Commands:

# Create migration:
cd backend/epgoat/infrastructure/database/migrations
vi 018_add_new_table.sql

# Run migration:
cd ../
python migration_runner.py

# Verify:
python -c "from connection import get_database_connection; ..."

Migration Naming: <number>_<snake_case_description>.sql

Best Practices:

  • Never modify existing migrations (create new one)
  • Test both UP and DOWN (if applicable)
  • Use transactions for safety
  • Document breaking changes

Refresh Leagues

cd backend/epgoat
python utilities/refresh_leagues.py

What Happens: Fetches all leagues from TheSportsDB, updates leagues table


Testing

💡 Testing Commands - Complete Guide

# ============================================================
# Run All Tests (from project root)
# ============================================================
make test

# Output:
# ============================= test session starts ==============================
# platform linux -- Python 3.11.0, pytest-7.4.0
# collected 784 items
#
# backend/epgoat/tests/test_patterns.py .................... [ 10%]
# backend/epgoat/tests/test_parsers.py ..................... [ 25%]
# backend/epgoat/tests/test_models.py ...................... [ 35%]
# backend/epgoat/tests/test_enrichment.py .................. [ 60%]
# backend/epgoat/tests/test_integration.py ................. [ 80%]
# backend/epgoat/tests/test_services.py .................... [100%]
#
# ========================= 770 passed, 14 failed in 28.3s =======================

# ============================================================
# Run Specific Test File
# ============================================================
cd backend/epgoat
pytest tests/test_patterns.py -v

# Output:
# tests/test_patterns.py::test_nba_pattern_matches PASSED
# tests/test_patterns.py::test_nfl_pattern_matches PASSED
# tests/test_patterns.py::test_nhl_pattern_matches PASSED
# tests/test_patterns.py::test_invalid_pattern_no_match PASSED

# ============================================================
# Run Specific Test Function
# ============================================================
pytest tests/test_patterns.py::test_nba_pattern_matches -v

# Output:
# tests/test_patterns.py::test_nba_pattern_matches PASSED

# ============================================================
# Run Tests Matching Keyword
# ============================================================
pytest -k "cache" -v

# Output:
# tests/test_enrichment.py::test_enhanced_cache_hit PASSED
# tests/test_enrichment.py::test_enhanced_cache_miss PASSED
# tests/test_enrichment.py::test_cross_provider_cache PASSED

# ============================================================
# Run Tests with Coverage Report
# ============================================================
make test-coverage

# Output:
# ============================= test session starts ==============================
# collected 784 items
# ... tests run ...
#
# ---------- coverage: platform linux, python 3.11.0 ----------
# Name                                          Stmts   Miss  Cover
# -----------------------------------------------------------------
# backend/epgoat/domain/models.py                 150      2    99%
# backend/epgoat/domain/patterns.py               100      0   100%
# backend/epgoat/services/enrichment/pipeline.py  200      5    98%
# backend/epgoat/services/enhanced_match_cache.py 120      3    98%
# -----------------------------------------------------------------
# TOTAL                                          5420     89    98%
#
# HTML coverage report: htmlcov/index.html

# ============================================================
# Run Integration Tests Only
# ============================================================
pytest tests/test_integration.py -v

# Output:
# tests/test_integration.py::test_full_epg_generation PASSED
# tests/test_integration.py::test_provider_onboarding PASSED

# ============================================================
# Run Tests in Parallel (faster)
# ============================================================
pytest -n auto  # Uses all CPU cores

# Output:
# gw0 [784] / gw1 [784] / gw2 [784] / gw3 [784]
# ... tests complete in 8.2s (vs 28.3s sequential) ...

# ============================================================
# Run Tests with Verbose Failures
# ============================================================
pytest -vv --tb=long

# Shows full stack traces and variable values on failure

# ============================================================
# Run Linting
# ============================================================
make lint

# Output:
# Running ruff...
# backend/epgoat/services/enrichment/pipeline.py:45:1: E302 expected 2 blank lines
# backend/epgoat/domain/models.py:120:80: E501 line too long (85 > 100)
#
# Found 2 issues
# Run 'make format' to auto-fix

# ============================================================
# Run Type Checking
# ============================================================
make type-check

# Output:
# Running mypy...
# backend/epgoat/services/enrichment/pipeline.py:45: error: Missing return type annotation
# backend/epgoat/domain/models.py:120: error: Incompatible return type
#
# Found 2 errors in 2 files

# ============================================================
# Run All CI Checks (full validation)
# ============================================================
make ci

# Runs: test + lint + type-check + validate
# Output:
# ✅ Tests passed (770/784)
# ✅ Linting passed
# ✅ Type checking passed
# ✅ Configuration validation passed
# ✅ CI checks complete

Test Commands Summary:

Command                   Purpose             Speed
make test                 All tests           28s
pytest tests/test_X.py    Specific file       2-5s
pytest -k "keyword"       Keyword match       Varies
make test-coverage        Coverage report     30s
make lint                 Code style          3s
make type-check           Type validation     5s
make ci                   Full validation     40s

Test Organization:

  • Unit tests: tests/test_*.py (fast, isolated)
  • Integration tests: tests/test_integration.py (slower, multi-component)
  • Fixtures: tests/conftest.py (reusable test data)

Coverage Goals:

  • Minimum: 80%
  • Target: 90%+
  • Current: 98.2%

📊 Testing Workflow - Development to CI

flowchart LR
    Write[Write Code] --> Local[Run Tests Locally]

    Local --> Unit{Unit Tests Pass?}
    Unit -->|No| Fix1[Fix Code]
    Fix1 --> Local

    Unit -->|Yes| Lint{Lint Pass?}
    Lint -->|No| Fix2[Fix Style]
    Fix2 --> Local

    Lint -->|Yes| Types{Type Check Pass?}
    Types -->|No| Fix3[Add Type Hints]
    Fix3 --> Local

    Types -->|Yes| Commit[Git Commit]
    Commit --> Push[Git Push]

    Push --> CI[GitHub Actions CI]
    CI --> CITests[Run All Tests]
    CI --> CILint[Run Linting]
    CI --> CITypes[Run Type Check]

    CITests --> CIResult{All Pass?}
    CILint --> CIResult
    CITypes --> CIResult

    CIResult -->|No| Fix4[Fix Issues]
    Fix4 --> Write

    CIResult -->|Yes| Review[Code Review]
    Review --> Merge[Merge to Main]

    style Write fill:#E8F5E9
    style Merge fill:#90EE90
    style Fix1 fill:#FFB6C1
    style Fix2 fill:#FFB6C1
    style Fix3 fill:#FFB6C1
    style Fix4 fill:#FFB6C1

Development Workflow: Write → Test → Lint → Type Check → Commit → CI → Merge

Local Commands:

# Run all tests:
make test

# Run linting:
make lint

# Run type checking:
make type-check

# Run all CI checks:
make ci

CI Pipeline (GitHub Actions):

  • Runs on every push and pull request
  • Must pass before merge allowed
  • Same commands as local development

Run All Tests

make test  # From project root

What Runs:

  • pytest with coverage
  • 784 tests (98.2% passing)
  • Coverage report

Run Specific Test File

cd backend/epgoat
pytest tests/test_patterns.py -v

Run Specific Test Function

pytest tests/test_patterns.py::test_pattern_matching -v

Run with Coverage

make test-coverage  # From project root

Output: HTML coverage report at htmlcov/index.html

Run Integration Tests Only

pytest tests/test_integration.py -v

Code Quality

Lint All Code

make lint  # From project root

What Runs:

  • Ruff (linting)
  • Black (format check)
  • isort (import check)

Format Code

make format  # From project root

What Runs:

  • Black (auto-format)
  • isort (auto-sort imports)

Type Check

make type-check  # From project root

What Runs: mypy (strict mode)

Run All CI Checks

make ci  # From project root

What Runs: test + lint + type-check + validate


Provider Onboarding

Onboard New Provider (Pattern Discovery)

cd backend/epgoat
python cli/onboard_provider.py \
    --provider necro \
    --m3u-url "https://example.com/playlist.m3u" \
    --dry-run

Flags:

  • --provider <slug>: Provider slug (lowercase, alphanumeric)
  • --m3u-url <url>: M3U playlist URL
  • --dry-run: Don't write to database (preview only)
  • --skip-verification: Skip LLM pattern verification

What Happens:

  1. Fetch M3U playlist
  2. Filter VOD channels
  3. Auto-discover channel name patterns (PREFIX + digit detection)
  4. Group channels by pattern templates (e.g., "NBA ~% :", "NFL ~% |")
  5. Filter patterns by frequency (≥5 matches)
  6. (Optional) LLM verification of pattern logic (Claude Haiku 3.5, 80% confidence threshold):
     • Verifies each pattern is a valid numbered series (not one-off show titles)
     • Separates passed patterns (added to DB) from failed patterns (rejected)
  7. Write passed patterns to database (provider_patterns table with LLM metadata)
  8. (Automatic) Create GitHub validation issue for manual review:
     • Summary statistics (pass/fail rates)
     • High-priority patterns (high-confidence failures, low-confidence passes)
     • Top 20 failed patterns with SQL INSERT queries
     • Top 20 passed patterns with SQL DELETE queries
     • Links to full R2 reports (JSON, CSV, SQL)
     • Auto-closes old validation issues for same provider
  9. Generate initial EPG for validation

Output:

  • Console summary (total channels, detected patterns, LLM stats)
  • Database records (provider_patterns table)
  • GitHub issue URL (if LLM verification enabled and SKIP_GITHUB_ISSUE != true)
  • R2 reports: validation-reports/{provider}_{timestamp}.{json,csv,sql}
  • Initial EPG file: dist/{provider}_onboarding_{date}.xml

Typical Runtime: 60-180 seconds (depends on playlist size and LLM verification)

📊 Provider Onboarding Workflow

flowchart TD
    Start([New Provider]) --> GetM3U[Get M3U Playlist URL]
    GetM3U --> RunOnboard[Run Onboarding CLI]

    RunOnboard --> Fetch[Fetch M3U Playlist]
    Fetch --> FilterVOD[Filter VOD Channels]

    FilterVOD --> Analyze[Analyze Channel Names<br/>Detect Patterns]

    Analyze --> Detect[Auto-Detect Patterns<br/>PREFIX + digit]

    Detect --> Group[Group by Template<br/>Frequency ≥5]

    Group --> Verify{LLM Verification?}
    Verify -->|Yes| LLM[Claude API Verify<br/>80% confidence]
    Verify -->|No| SaveDB

    LLM --> Confidence{Confidence ≥ 80%?}
    Confidence -->|Yes| SaveDB[Save Patterns to DB]
    Confidence -->|No| Issue[Create GitHub Issue<br/>Manual Review]

    SaveDB --> GenerateYAML[Generate YAML Config]
    GenerateYAML --> Test[Test EPG Generation]

    Test --> Works{Works?}
    Works -->|No| Manual[Manual Pattern Tuning]
    Manual --> SaveDB

    Works -->|Yes| Done([Onboarding Complete])

    style Start fill:#90EE90
    style Done fill:#90EE90
    style Manual fill:#FFD700
    style Issue fill:#FFB6C1

Provider Onboarding: Automated pattern discovery + LLM verification

Command:

cd backend/epgoat
python cli/onboard_provider.py \
    --provider necro \
    --m3u-url "https://example.com/playlist.m3u" \
    --dry-run

Flags:

  • --provider <slug>: Provider identifier (lowercase)
  • --m3u-url <url>: M3U playlist URL
  • --dry-run: Preview only (don't save)
  • --skip-verification: Skip LLM verification

Output:

  • Detected patterns (console)
  • YAML config file (config/providers/<slug>.yml)
  • GitHub issue (if low confidence patterns)

Typical Results:

  • 50-100 patterns detected
  • 80-90% confidence from LLM
  • 5-10 patterns need manual review

📖 Provider Onboarding Internals - Pattern Discovery

When you run python cli/onboard_provider.py --provider necro --m3u-url <url>, here's the complete auto-discovery process:

Phase 1: M3U Analysis (5-10 seconds)

  1. Fetch M3U playlist from URL
  2. Parse all channels (~10,000-15,000 channels typical)
  3. Filter VOD channels using universal patterns
  4. Extract channel names for analysis

Phase 2: Pattern Detection (10-20 seconds)

  1. Tokenization: Split channel names into tokens
     - "NBA 01: Lakers vs Celtics" → ["NBA", "01", ":", "Lakers", "vs", "Celtics"]
     - "NFL RedZone" → ["NFL", "RedZone"]
     - "Premier League 12 | Arsenal vs Chelsea" → ["Premier", "League", "12", "|", ...]

  2. Prefix Detection: Find common prefixes followed by digits
     - "NBA 01", "NBA 02", "NBA 03" → Prefix: "NBA"
     - "NFL 01", "NFL 02" → Prefix: "NFL"
     - "Premier League 01", "Premier League 02" → Prefix: "Premier League"

  3. Template Generation: Create regex templates (~ = prefix, ` = digit, % = payload)
     - "NBA 01: ..." → Template: "NBA ~`% :"
     - "NFL RedZone" → Template: "NFL ~"
     - "Premier League 12 | ..." → Template: "Premier League ~`% |"

  4. Frequency Counting: Group by template, count occurrences
     - "NBA ~`% :" → 72 channels
     - "NFL ~`% :" → 45 channels
     - "NHL ~`% :" → 38 channels
     - "Premier League ~`% |" → 28 channels

  5. Threshold Filtering: Keep templates with ≥5 matches
     - "NBA ~`% :" → 72 matches (KEEP)
     - "Rare Sport ~`% :" → 2 matches (DROP)
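Steps 2-5 above can be sketched in a few lines. This is a simplified version that only recognizes the "prefix + number + separator" shape with the notation used above (~ = prefix, ` = digit, % = payload); the real detector handles far more channel-name shapes, and the threshold here is lowered so the tiny sample produces output.

```python
import re
from collections import Counter

def to_template(name):
    """Collapse 'PREFIX NN :' / 'PREFIX NN |' names into a coarse template."""
    m = re.match(r"^([A-Za-z ]+?)\s+(\d+)\s*([:|])", name)
    if not m:
        return None
    prefix, _, sep = m.groups()
    return f"{prefix} ~`% {sep}"

names = [
    "NBA 01: Lakers vs Celtics", "NBA 02: Suns vs Heat",
    "NFL 01: Bills vs Jets",
    "Premier League 12 | Arsenal vs Chelsea",
]
counts = Counter(t for t in map(to_template, names) if t)
MIN_MATCHES = 2  # the real threshold is >=5 per the docs
kept = {t: c for t, c in counts.items() if c >= MIN_MATCHES}
print(kept)  # {'NBA ~`% :': 2}
```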

Phase 3: LLM Verification (Optional, 20-40 seconds)

If --skip-verification NOT used:

  1. Prompt Construction: Build Claude API prompt. For each template, Claude analyzes:
     - Is this a sports channel pattern?
     - What sport/league is it?
     - Is the regex pattern correct?
     - Confidence score (0-100%)

  2. API Call: Send to Claude API (20-30s for 50-100 patterns)

  3. Response Parsing: Extract verification results
     ```json
     {
       "pattern": "NBA ~`% :",
       "sport_family": "NBA",
       "is_valid": true,
       "confidence": 95,
       "reasoning": "Clear NBA pattern with channel number and game payload"
     }
     ```

  4. Confidence Filtering:
     - Confidence ≥80%: Auto-save to database
     - Confidence <80%: Flag for manual review (GitHub issue)
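The confidence split in the last step is just a threshold filter over the parsed verification records; a minimal sketch with made-up sample data:

```python
# Parsed verification records (sample data shaped like the JSON above)
verified = [
    {"pattern": r"^NBA\s+\d+\s*:?", "sport_family": "NBA", "confidence": 95},
    {"pattern": r"^XFL\s+\d+\s*:?", "sport_family": "XFL", "confidence": 65},
]

THRESHOLD = 80  # per the docs: >=80% auto-save, <80% manual review
auto_save = [v for v in verified if v["confidence"] >= THRESHOLD]
needs_review = [v for v in verified if v["confidence"] < THRESHOLD]
print(len(auto_save), len(needs_review))  # 1 1
```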

Phase 4: Database Save (5-10 seconds)

  1. Create Provider Record (if not exists):
     ```sql
     INSERT INTO providers (slug, name, m3u_url)
     VALUES ('necro', 'Necro IPTV', 'https://...');
     ```

  2. Save Patterns:
     ```sql
     INSERT INTO provider_patterns (provider_id, pattern, sport_family, priority)
     VALUES
         (2, '^NBA\s+\d+\s*:?', 'NBA', 100),
         (2, '^NFL\s+\d+\s*:?', 'NFL', 100),
         (2, '^NHL\s+\d+\s*:?', 'NHL', 100);
     ```

  3. Save VOD Filters:
     ```sql
     INSERT INTO vod_filter_patterns (provider_id, pattern, priority)
     VALUES
         (2, '1080p|4K|HD', 100),
         (2, 'Movie|Film', 90);
     ```

  4. Save TVG-ID Mappings (if any discovered):
     ```sql
     INSERT INTO tvg_id_mappings (provider_id, tvg_id, sport_family)
     VALUES (2, 'nba-channel-1', 'NBA');
     ```

Phase 5: YAML Generation (1-2 seconds)

  1. Fetch from Database: Get all provider data
  2. Build YAML Structure:
     ```yaml
     provider:
       slug: necro
       name: Necro IPTV
       m3u_url: https://...

     patterns:
       - pattern: '^NBA\s+\d+\s*:?'
         sport_family: NBA
         priority: 100
         match_count: 72

     vod_filter_patterns:
       - pattern: '1080p|4K|HD'
         priority: 100

     tvg_id_mappings:
       nba-channel-1: NBA
     ```
  3. Write File: `config/providers/necro.yml`
  4. Set Cache TTL: 24 hours

Phase 6: GitHub Issue Creation (Optional, 5-10 seconds)

If low-confidence patterns found:

  1. Generate Issue Body:
     ```markdown
     ## Low-Confidence Patterns for Provider: necro

     The following patterns need manual review (confidence <80%):

     ### Pattern 1: ^XFL\s+\d+\s*:?
     - Sport Family: XFL (guessed)
     - Confidence: 65%
     - Sample Channels:
       - "XFL 01: Dragons vs Guardians"
       - "XFL 02: Renegades vs Roughnecks"
     - Reasoning: Uncommon league, pattern format unclear

     ### Action Required
     - [ ] Verify sport family
     - [ ] Confirm regex pattern
     - [ ] Update database if needed
     ```

  2. Create Issue via GitHub API:
     ```
     POST /repos/owner/repo/issues
     {
       "title": "Provider Onboarding: necro - Manual Review Required",
       "body": "...",
       "labels": ["provider-onboarding", "needs-review"]
     }
     ```
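The issue-creation call maps onto the standard GitHub REST endpoint (`POST /repos/{owner}/{repo}/issues`). The sketch below is a hypothetical helper, not the project's actual code: the payload builder is separated out so it can be tested without a network call.

```python
import json
import urllib.request

def build_issue_payload(provider, patterns):
    """Build the GitHub issue payload for low-confidence patterns."""
    lines = [f"## Low-Confidence Patterns for Provider: {provider}", ""]
    for p in patterns:
        lines.append(f"- `{p['pattern']}` (confidence: {p['confidence']}%)")
    return {
        "title": f"Provider Onboarding: {provider} - Manual Review Required",
        "body": "\n".join(lines),
        "labels": ["provider-onboarding", "needs-review"],
    }

def create_issue(owner, repo, token, payload):
    """POST the payload to the GitHub issues endpoint; returns the issue number."""
    req = urllib.request.Request(
        f"https://api.github.com/repos/{owner}/{repo}/issues",
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {token}",
            "Accept": "application/vnd.github+json",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)["number"]
```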

Output Summary:

```
Provider Onboarding Complete: necro
=====================================

M3U Analysis:
├─ Total channels: 12,847
├─ Sports channels: 1,243 (after VOD filter)
└─ Processing time: 8.2s

Pattern Detection:
├─ Templates generated: 127
├─ Templates after frequency filter: 68
└─ Processing time: 12.5s

LLM Verification:
├─ Patterns verified: 68
├─ High confidence (≥80%): 62
├─ Manual review required: 6
└─ Processing time: 28.3s

Database Save:
├─ Patterns saved: 62
├─ VOD filters saved: 15
├─ TVG-ID mappings saved: 23
└─ Processing time: 6.1s

YAML Generation:
├─ File: config/providers/necro.yml
├─ Size: 14.2 KB
└─ Cache TTL: 24 hours

GitHub Issue:
├─ Issue #127: Provider Onboarding: necro - Manual Review Required
└─ 6 patterns flagged for review

Total Time: 55.1 seconds
✅ Onboarding complete!
```

Next Steps:

  1. Review GitHub issue for low-confidence patterns
  2. Test EPG generation: python cli/run_provider.py --provider necro
  3. Tune patterns based on match rates
  4. Iterate until 90%+ match rate achieved

Database Migrations

Run Pending Migrations

```bash
cd backend/epgoat/infrastructure/database
python migration_runner.py
```

What Happens:
1. Check current schema version (`schema_migrations` table)
2. Find unapplied migrations in `migrations/` folder
3. Run migrations in order (001 → 002 → 003...)
4. Record applied migrations
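The steps above can be sketched as a minimal runner. This is an illustration against SQLite (the project's database is Supabase/Postgres), and the function name is assumed rather than taken from `migration_runner.py`:

```python
import sqlite3
from pathlib import Path

def run_pending_migrations(db_path, migrations_dir):
    """Apply numbered .sql files not yet recorded in schema_migrations."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS schema_migrations (version TEXT PRIMARY KEY)"
    )
    applied = {
        row[0] for row in conn.execute("SELECT version FROM schema_migrations")
    }
    # Sorted glob yields 001 -> 002 -> 003... for zero-padded filenames
    for path in sorted(Path(migrations_dir).glob("*.sql")):
        version = path.name.split("_")[0]  # "018" from "018_your_change_name.sql"
        if version in applied:
            continue
        conn.executescript(path.read_text())
        conn.execute(
            "INSERT INTO schema_migrations (version) VALUES (?)", (version,)
        )
        conn.commit()
        print(f"Applied {path.name}")
    conn.close()
```

Running it twice is safe: already-recorded versions are skipped.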

Create New Migration

  1. Create file: backend/epgoat/infrastructure/database/migrations/018_your_change_name.sql
  2. Write SQL (CREATE/ALTER/INSERT statements)
  3. Run migration: python migration_runner.py
  4. Commit migration file to Git

Naming: `<number>_<snake_case_description>.sql`

Testing: Test both UP and DOWN (if applicable)


Git Workflow

Typical Development Flow

```bash
# 1. Create feature branch
git checkout -b feat/add-new-sport

# 2. Make changes
# ... edit files ...

# 3. Run quality checks
make ci  # test + lint + type-check

# 4. Stage changes
git add <files>

# 5. Commit (hook enforces Conventional Commits)
git commit -m "feat(config): add cricket sport emoji and category"

# 6. Push to remote
git push -u origin feat/add-new-sport

# 7. Create pull request (if team workflow)
gh pr create --title "Add cricket sport support" --body "Adds cricket emoji and XMLTV category"
```

Commit Message Format

Required: `<type>(<scope>): <description>`

Valid types: feat, fix, docs, refactor, test, chore, style, perf, ci, build

Examples:

```
feat(api): add event search endpoint
fix(parser): handle missing tvg-id tags
docs: update EPG generation guide
refactor(services): extract caching logic
test(parser): add tests for M3U parsing
```

Hook Enforcement: Commit hook rejects invalid messages
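The hook's check amounts to a regex over the first line of the message. A sketch, assuming only the type and scope rules listed above (the real hook may enforce more, e.g. subject length):

```python
import re

# <type>(<scope>)?: <description>, with the valid types from the docs
COMMIT_RE = re.compile(
    r"^(feat|fix|docs|refactor|test|chore|style|perf|ci|build)"
    r"(\([a-z0-9-]+\))?: .+"
)

def is_valid_commit_message(msg):
    """Validate the first line of a commit message against Conventional Commits."""
    return bool(COMMIT_RE.match(msg.splitlines()[0]))

print(is_valid_commit_message("feat(config): add cricket sport emoji"))  # True
print(is_valid_commit_message("added cricket stuff"))                    # False
```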

📊 Git Workflow - Feature Branch to Merge

```mermaid
flowchart TD
    Main[Main Branch] --> Branch[Create Feature Branch<br/>feat/add-cricket]

    Branch --> Code[Write Code]
    Code --> Test[Run Tests<br/>make ci]

    Test --> Pass{Tests Pass?}
    Pass -->|No| Fix[Fix Issues]
    Fix --> Code

    Pass -->|Yes| Stage[Git Add Files]
    Stage --> Commit["Git Commit<br/>feat(config): add cricket"]

    Commit --> Hook{Commit Hook Pass?}
    Hook -->|No| FixMsg[Fix Commit Message]
    FixMsg --> Commit

    Hook -->|Yes| Push[Git Push]
    Push --> PR[Create Pull Request]

    PR --> Review[Code Review]
    Review --> Approved{Approved?}

    Approved -->|No| Changes[Address Feedback]
    Changes --> Code

    Approved -->|Yes| Merge[Merge to Main]
    Merge --> Delete[Delete Feature Branch]

    style Main fill:#90EE90
    style Merge fill:#90EE90
    style Fix fill:#FFB6C1
    style FixMsg fill:#FFB6C1
```

*Git Feature Branch Workflow: Branch → Code → Test → Commit → PR → Merge*

Commands:

```bash
# 1. Create branch:
git checkout -b feat/add-cricket

# 2. Make changes:
vi backend/config/sport_emojis.yml

# 3. Test:
make ci

# 4. Commit:
git add backend/config/sport_emojis.yml
git commit -m "feat(config): add cricket sport emoji"

# 5. Push:
git push -u origin feat/add-cricket

# 6. Create PR:
gh pr create --title "Add cricket sport support" --body "Adds cricket emoji and category"
```

Commit Message Format: `<type>(<scope>): <description>`

Valid Types: feat, fix, docs, refactor, test, chore, style, perf, ci, build


Configuration Management

Update Provider Config

Option 1: Edit YAML directly (faster for dev)

```bash
vi backend/config/providers/tps.yml
```

Changes apply after the 24-hour cache TTL expires or after a service restart.

Option 2: Update database (persists, auto-caches)

```sql
-- Add new pattern to provider
INSERT INTO provider_patterns (provider_id, pattern, sport_family, priority)
VALUES (1, '^NHL\s+\d+\s*:?', 'NHL', 100);
```

Changes auto-sync to YAML cache within 24 hours.
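The 24-hour TTL check can be sketched as a file-age test. The function name is illustrative; the real loader would also fall back to the database and rewrite the YAML when the cache is stale or missing.

```python
import time
from pathlib import Path

CACHE_TTL_SECONDS = 24 * 3600  # 24-hour TTL described above

def cache_is_fresh(yaml_path):
    """Treat the cached YAML as valid if it was written within the TTL."""
    path = Path(yaml_path)
    if not path.exists():
        return False
    age_seconds = time.time() - path.stat().st_mtime
    return age_seconds < CACHE_TTL_SECONDS
```

On a stale or missing cache the loader would query the database and regenerate the YAML file.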

Add New Sport

  1. Add emoji in `backend/config/sport_emojis.yml`:
     ```yaml
     Cricket: 🏏
     ```

  2. Add category in `backend/config/sport_categories.yml`:
     ```yaml
     Cricket:
       primary: Sports
       secondary: Cricket
     ```

  3. Add pattern (if needed): Provider config or `channel_patterns.yml`

  4. Commit: `git commit -m "feat(config): add cricket sport support"`

Add Family-League Mapping

Edit backend/config/family_mappings/universal.yml:

```yaml
direct_mappings:
  IPL: CRICKET_IPL  # Indian Premier League

team_based_mappings:
  - family_prefix: "PSL"
    league_candidates:
      - CRICKET_PSL  # Pakistan Super League
    team_keywords:
      - "Karachi"
      - "Lahore"
```
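A hypothetical resolver for these mappings: direct mappings win outright, while team-based mappings require a keyword hit in the channel payload. The function and argument names are illustrative, not the project's actual API.

```python
def resolve_league(family, payload, direct, team_based):
    """Map a sport family + channel payload to a league code, or None."""
    if family in direct:
        return direct[family]
    for rule in team_based:
        if family == rule["family_prefix"]:
            # Team-based rule: require a team keyword in the payload
            if any(kw.lower() in payload.lower() for kw in rule["team_keywords"]):
                return rule["league_candidates"][0]
    return None

direct = {"IPL": "CRICKET_IPL"}
team_based = [{
    "family_prefix": "PSL",
    "league_candidates": ["CRICKET_PSL"],
    "team_keywords": ["Karachi", "Lahore"],
}]
print(resolve_league("PSL", "Karachi Kings vs Lahore Qalandars", direct, team_based))
# CRICKET_PSL
```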

Debugging

📊 Debugging Workflow - Issue to Fix

```mermaid
flowchart TD
    Issue[Report: No Matches Found] --> Reproduce[Reproduce Locally]

    Reproduce --> Logs[Check Logs]
    Logs --> Identify{Identify Stage}

    Identify -->|VOD Filter| VODDebug[Check VOD Patterns<br/>Too aggressive?]
    Identify -->|Parsing| ParseDebug[Check M3U Parsing<br/>Missing fields?]
    Identify -->|Matching| MatchDebug[Check Enrichment<br/>Which handler failed?]
    Identify -->|XMLTV| XMLDebug[Check XMLTV Generation<br/>Invalid format?]

    VODDebug --> Fix[Fix Issue]
    ParseDebug --> Fix
    MatchDebug --> Fix
    XMLDebug --> Fix

    Fix --> Test[Test Fix Locally]
    Test --> Works{Works?}
    Works -->|No| Fix

    Works -->|Yes| Commit[Commit Fix]
    Commit --> PR[Create PR]
    PR --> Merge[Merge]

    style Issue fill:#FFB6C1
    style Merge fill:#90EE90
```

*Debugging Process: Reproduce → Identify → Fix → Test → Commit*

Debug Commands:

```bash
# Enable debug logging:
export DEBUG_MATCHING=1
python cli/run_provider.py --provider tps

# Run with dry-run (no DB writes):
python cli/run_provider.py --provider tps --dry-run

# Check specific handler:
pytest tests/test_enrichment.py::test_regex_handler -v

# View cache performance:
# Look for: "EnhancedMatchCache: hit_rate=85.2%"
```

Common Issues:
1. No matches: VOD filter too aggressive
2. Wrong matches: Pattern regex incorrect
3. Slow performance: Cache not working
4. Missing data: Database not refreshed

Enable Debug Logging

```python
import logging
logging.basicConfig(level=logging.DEBUG)
```

View Match Debug Info

Set environment variable:

```bash
export DEBUG_MATCHING=1
python cli/run_provider.py --provider tps
```

Inspect Database

Supabase Dashboard:
- URL: https://supabase.com/dashboard
- Project: epgoat-events
- SQL Editor: Run custom queries
- Table Editor: View/edit data

Local SQL (if using migrations locally):

```bash
cd backend/epgoat/infrastructure/database
python -c "from connection import get_database_connection; conn = get_database_connection(); print(conn.execute('SELECT COUNT(*) FROM events').fetchone())"
```

Check Cache Performance

Look for log lines:

```
EnhancedMatchCache: hit_rate=85.2% (142/167 lookups)
CrossProviderCache: hit_rate=23.4% (12/52 lookups)
```
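The hit-rate figures in these log lines are just `hits / lookups`. A minimal cache that reports them in the same format (the class name and API here are illustrative, not the project's `EnhancedMatchCache`):

```python
class HitRateCache:
    """Toy cache that tracks and reports hit rate like the log lines above."""

    def __init__(self):
        self._store = {}
        self.hits = 0
        self.lookups = 0

    def get(self, key):
        self.lookups += 1
        if key in self._store:
            self.hits += 1
            return self._store[key]
        return None

    def put(self, key, value):
        self._store[key] = value

    def stats(self):
        rate = 100.0 * self.hits / self.lookups if self.lookups else 0.0
        return f"hit_rate={rate:.1f}% ({self.hits}/{self.lookups} lookups)"

cache = HitRateCache()
cache.put("tvg:nba-1", 12345)
cache.get("tvg:nba-1")   # hit
cache.get("tvg:unknown") # miss
print(cache.stats())  # hit_rate=50.0% (1/2 lookups)
```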

Common Operations

Generate EPG for Today (Default Date)

```bash
cd backend/epgoat
python cli/run_provider.py --provider tps
```

Generate EPG for Tomorrow

```bash
python cli/run_provider.py --provider tps --date $(date -d tomorrow +%Y-%m-%d)
```

Generate EPG + Skip Database Writes (Test Run)

```bash
python cli/run_provider.py --provider tps --dry-run
```

View Unmatched Channels

```sql
SELECT channel_name, provider_id, last_attempted_at
FROM unmatched_channels
WHERE record_status = 'active'
ORDER BY attempt_count DESC
LIMIT 50;
```

Clear Match Cache (Force Re-Matching)

```sql
DELETE FROM match_cache WHERE created_at < datetime('now', '-7 days');
```

View Provider Patterns

```sql
SELECT pattern, sport_family, priority, match_count
FROM provider_patterns
WHERE provider_id = 1 AND is_active = 1
ORDER BY priority DESC, match_count DESC;
```

Troubleshooting

⚠️ Module Import Errors - Wrong Directory

Problem: Running scripts from wrong directory causes "ModuleNotFoundError". Python can't find modules when current directory is wrong.

**Solution**:
```bash
# ❌ BAD - Running from project root
~/epgoat-internal$ python cli/run_provider.py --provider tps
# Error: ModuleNotFoundError: No module named 'backend'

# ❌ BAD - Running from wrong subdirectory
~/epgoat-internal/backend$ python cli/run_provider.py --provider tps
# Error: ModuleNotFoundError: No module named 'epgoat'

# ✅ GOOD - Run from backend/epgoat directory
cd backend/epgoat
python cli/run_provider.py --provider tps
# Works! ✅

# ✅ ALSO GOOD - Use absolute paths
cd ~/epgoat-internal
python backend/epgoat/cli/run_provider.py --provider tps
# Works! ✅
```
**Why This Happens**:
- Python adds current directory to import path
- Imports like `from backend.epgoat.domain import models` expect to be run from project root OR backend/epgoat
- EPGOAT scripts expect to run from `backend/epgoat/` directory

**Quick Fix**:
```bash
# Always navigate first:
cd backend/epgoat

# Then run scripts:
python cli/run_provider.py --provider tps
python utilities/refresh_event_db_v2.py --date 2025-11-10
```

### "Module not found" Error

Fix: Run from correct directory

```bash
cd backend/epgoat  # Must be here
python cli/run_provider.py --provider tps
```

### "Database connection failed"

Fix: Check `.env` file has `SUPABASE_URL` and `SUPABASE_KEY`

⚠️ Missing Environment Variables

Problem: Scripts fail with cryptic errors when environment variables not set. Database connections fail, API calls return 401 Unauthorized.

**Solution**:
```bash
# ❌ Symptom - Database connection fails
$ python cli/run_provider.py --provider tps
# Error: supabase.client.ClientError: No URL provided

# ❌ Symptom - API returns 401
$ python utilities/refresh_event_db_v2.py --date 2025-11-10
# Error: requests.exceptions.HTTPError: 401 Unauthorized

# ✅ Solution - Copy .env.example to .env
cp .env.example .env

# ✅ Edit .env with real values:
vi .env

# Required variables (.env):
# THESPORTSDB_API_KEY=your_api_key_here
# SUPABASE_URL=https://your-project.supabase.co
# SUPABASE_KEY=your_service_role_key_here
# CLAUDE_API_KEY=sk-ant-your_claude_key

# ✅ Verify environment variables loaded:
python -c "import os; from dotenv import load_dotenv; load_dotenv(); print('API Key:', os.environ.get('THESPORTSDB_API_KEY')[:10] + '...')"
# Output: API Key: 1234567890...

# ✅ Now scripts work:
python cli/run_provider.py --provider tps
# Works! ✅
```

**Setup Checklist**:
1. ✅ Copy `.env.example` to `.env`
2. ✅ Fill in all required API keys
3. ✅ Never commit `.env` to Git (it's in `.gitignore`)
4. ✅ Use different keys for staging vs production
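A minimal pre-flight check for the required variables (the helper name is illustrative, not part of the codebase) fails fast instead of surfacing a cryptic 401 later:

```python
import os

# Required variables per the checklist above
REQUIRED_VARS = [
    "THESPORTSDB_API_KEY",
    "SUPABASE_URL",
    "SUPABASE_KEY",
    "CLAUDE_API_KEY",
]

def check_env(environ=os.environ):
    """Return the required variables that are missing or empty."""
    return [name for name in REQUIRED_VARS if not environ.get(name)]

# Example with only one variable set: three names come back missing
missing = check_env({"SUPABASE_URL": "https://x.supabase.co"})
print(missing)
```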



### "No channels matched"

**Causes**:
1. VOD filter too aggressive → Check `vod_filter_patterns`
2. Missing provider patterns → Run onboarding: `python cli/onboard_provider.py`
3. Pattern syntax error → Check regex validity

**Debug**: Run with `--dry-run` and inspect console output

### Tests Failing After Code Change

1. Run linting: `make lint` (fix code style)
2. Run type check: `make type-check` (fix type hints)
3. Check test error messages (usually clear about what's wrong)
4. If fixture issue: Check `tests/conftest.py`


### ⚠️ Tests Failing After Code Change

**Problem**: Made a change, now tests are failing. How do I diagnose and fix?


**Solution**:
```bash
# Step 1: Run tests to see what failed
make test

# Output shows failures:
# FAILED tests/test_enrichment.py::test_cache_hit - AssertionError: Expected 12345, got None

# Step 2: Run specific test with verbose output
pytest tests/test_enrichment.py::test_cache_hit -vv

# Output shows detailed error:
# def test_cache_hit():
#     cache = EnhancedMatchCache(expiration_hours=24)
#     cache.store_match(...)
#     result = cache.find_match(tvg_id="test-id")
# >   assert result.matched_event_id == 12345
# E   AttributeError: 'NoneType' object has no attribute 'matched_event_id'

# Step 3: Check what changed
git diff backend/epgoat/services/enhanced_match_cache.py

# Step 4: Identify issue
# Found: Changed cache key format from "tvg:id" to "id"
# But test still uses old format!

# Step 5: Fix the issue
# Option A: Update code to match tests (if tests are correct)
# Option B: Update tests to match code (if code is correct)

# Step 6: Run tests again
pytest tests/test_enrichment.py::test_cache_hit -v
# PASSED ✅

# Step 7: Run full test suite
make test
# All tests pass ✅

# Step 8: Run linting and type checking
make lint
make type-check
# All checks pass ✅

# Step 9: Commit
git add .
git commit -m "fix(cache): update cache key format"
```

Common Test Failure Causes:

  1. Changed Function Signature: Update test calls
  2. Changed Return Type: Update test assertions
  3. Changed Behavior: Update test expectations
  4. Missing Mock: Add mock for new dependency
  5. Wrong Test Data: Update test fixtures

Debugging Tips:
- Use `-vv` for verbose output
- Use `--pdb` to drop into debugger on failure
- Use `print()` statements in tests (temporary)
- Check test fixtures in `tests/conftest.py`


CI/CD (Not Yet Implemented)

Planned GitHub Actions Workflows:

  • generate-epg.yml: Daily EPG generation (cron @ 6 AM UTC)
  • refresh-events-db.yml: Weekly event DB refresh (cron @ Sunday nights)
  • cleanup-old-epg.yml: Daily housekeeping (cron @ 2 AM UTC)
  • test.yml: CI tests on push/PR ✅ (EXISTS)

See: .github/workflows/ for existing workflow


Quick Reference Card

| Task | Command |
|------|---------|
| Generate EPG | `cd backend/epgoat && python cli/run_provider.py --provider tps` |
| Refresh events | `python utilities/refresh_event_db_v2.py --date 2025-11-10` |
| Run tests | `make test` (from root) |
| Lint code | `make lint` (from root) |
| Format code | `make format` (from root) |
| Type check | `make type-check` (from root) |
| Run CI | `make ci` (from root) |
| Onboard provider | `python cli/onboard_provider.py --provider necro --m3u-url "https://..."` |
| Run migrations | `cd infrastructure/database && python migration_runner.py` |

For Details: See Documentation/04-Guides/ (EPG-Generation.md, Testing-Guide.md, etc.)