Team Sync Automation Guide

Quick Reference: How to use automated team synchronization with GitHub Actions
Last Updated: 2025-11-09

Overview

Team synchronization runs automatically every day at 3 AM UTC. It syncs team data from the TheSportsDB and ESPN APIs and, as of 2025-11-06, auto-resolves 100% of detected conflicts.

Status: ✅ Fully operational (as of 2025-11-06)

  • Automated Sync: Daily at 3 AM UTC
  • Success Rate: 100% (after ESPN constraint fix)
  • Conflict Resolution: 100% auto-resolved
  • Monitoring: Health checks twice daily


Automated Workflows

1. Daily Team Sync

File: .github/workflows/team-sync.yml
Schedule: Every day at 3 AM UTC (10 PM ET / 7 PM PT on the previous day during standard time; one hour later during daylight saving time)
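The local-time equivalents of the 3 AM UTC schedule shift by an hour when daylight saving time is in effect. The following is a minimal sketch using only the Python standard library; the function name `local_run_times` is illustrative and not part of the sync scripts.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo


def local_run_times(day: datetime) -> dict[str, str]:
    """Return the 03:00 UTC run on `day` expressed in US Eastern and Pacific time."""
    run_utc = day.replace(hour=3, minute=0, second=0, microsecond=0, tzinfo=timezone.utc)
    zones = {"ET": "America/New_York", "PT": "America/Los_Angeles"}
    return {
        label: run_utc.astimezone(ZoneInfo(name)).strftime("%Y-%m-%d %H:%M %Z")
        for label, name in zones.items()
    }


# Standard time: the run lands on the previous evening in US time zones.
print(local_run_times(datetime(2025, 11, 9)))
# {'ET': '2025-11-08 22:00 EST', 'PT': '2025-11-08 19:00 PST'}

# Daylight saving time: the same UTC schedule is one hour later locally.
print(local_run_times(datetime(2025, 7, 9)))
# {'ET': '2025-07-08 23:00 EDT', 'PT': '2025-07-08 20:00 PDT'}
```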

What it does:

  • Syncs teams from all supported sports (Basketball, American Football, College Football)
  • Detects and auto-resolves conflicts
  • Updates existing teams with API IDs
  • Creates new teams as needed
  • Logs all changes for audit

Manual Trigger (via GitHub UI):

  1. Go to Actions → Team Sync - Automated
  2. Click Run workflow
  3. Optional parameters:
     • sports: Comma-separated list (e.g., Basketball,Football) or leave empty for all
     • force: Skip interval check (useful for testing)
     • dry_run: Preview changes without applying

2. Health Monitoring

File: .github/workflows/team-sync-monitor.yml
Schedule: Twice daily at 4 AM and 4 PM UTC

What it does:

  • Checks sync statistics and success rate
  • Counts pending conflicts
  • Auto-creates GitHub issues if:
     • Success rate drops below 80%
     • Pending conflicts exceed 50
  • Provides summary in workflow run
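The alert thresholds can be expressed as a small check. This is a sketch of the decision logic only, not the monitor workflow's actual code: the names `Alert` and `evaluate_health` are hypothetical, while the thresholds and issue labels are the ones this guide lists.

```python
from dataclasses import dataclass

SUCCESS_RATE_FLOOR = 80.0       # alert when the success rate (percent) drops below this
PENDING_CONFLICT_CEILING = 50   # alert when pending conflicts exceed this


@dataclass
class Alert:
    title: str
    labels: tuple[str, ...]


def evaluate_health(success_rate: float, pending_conflicts: int) -> list[Alert]:
    """Return the alerts that the documented thresholds would trigger."""
    alerts = []
    if success_rate < SUCCESS_RATE_FLOOR:
        alerts.append(Alert(f"Low success rate: {success_rate:.1f}%", ("team-sync", "urgent")))
    if pending_conflicts > PENDING_CONFLICT_CEILING:
        alerts.append(
            Alert(f"High pending conflicts: {pending_conflicts}", ("team-sync", "needs-attention"))
        )
    return alerts


for alert in evaluate_health(success_rate=70.0, pending_conflicts=51):
    print(alert.title, alert.labels)
```

Both comparisons are strict, so a success rate of exactly 80% or exactly 50 pending conflicts does not raise an alert in this sketch.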


Local Usage

Run Sync Locally

# Navigate to backend
cd backend/epgoat

# Activate virtual environment
source ../../venv/bin/activate

# Load Supabase environment variables
source ../../.env.supabase

# Run sync for all sports
python utilities/scheduled_team_sync.py

# Run for specific sport
python utilities/scheduled_team_sync.py --sports Basketball

# Dry run (preview only)
python utilities/scheduled_team_sync.py --dry-run

# Force run (skip interval check)
python utilities/scheduled_team_sync.py --force

# Check statistics
python utilities/scheduled_team_sync.py --stats

Check Sync Statistics

# View last 30 days
python utilities/scheduled_team_sync.py --stats

# Example output:
============================================================
Sync Statistics (Last 30 days)
============================================================
Total runs:        10
Completed:         10
Failed:            0
Success rate:      100.0%
Teams created:     8
Teams updated:     628
Conflicts:         1293
Avg duration:      234s
============================================================
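The summary figures above are simple aggregates over the run history. The sketch below shows one way to derive them; the record fields `status` and `duration_seconds` are assumptions for illustration and may not match the real team_sync_runs columns.

```python
def summarize_runs(runs: list[dict]) -> dict:
    """Aggregate run records into the headline statistics."""
    total = len(runs)
    completed = sum(1 for run in runs if run["status"] == "completed")
    failed = sum(1 for run in runs if run["status"] == "failed")
    total_duration = sum(run["duration_seconds"] for run in runs)
    return {
        "total_runs": total,
        "completed": completed,
        "failed": failed,
        # Success rate is completed runs as a percentage of all runs.
        "success_rate": round(100.0 * completed / total, 1) if total else 0.0,
        "avg_duration_seconds": round(total_duration / total) if total else 0,
    }


# Hypothetical sample records, not real sync data.
sample = [
    {"status": "completed", "duration_seconds": 230},
    {"status": "completed", "duration_seconds": 238},
]
print(summarize_runs(sample))
```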

Manage Conflicts

# List all conflicts
python utilities/manage_conflicts.py list

# List pending conflicts only
python utilities/manage_conflicts.py list --status pending

# Show specific conflict details
python utilities/manage_conflicts.py show <conflict-id>

# Manually resolve conflict
python utilities/manage_conflicts.py resolve <conflict-id> --action merge

# Export conflicts to CSV
python utilities/manage_conflicts.py export conflicts.csv

Required GitHub Secrets

Location: Repository Settings → Secrets and variables → Actions

The following secrets must be configured for workflows to run:

| Secret Name | Description | Where to Find |
|---|---|---|
| SUPABASE_URL | Supabase project URL | Supabase Dashboard → Project Settings → API |
| SUPABASE_SERVICE_ROLE_KEY | Service role key (full access) | Supabase Dashboard → Project Settings → API → Service Role (secret) |
| THESPORTSDB_API_KEY | TheSportsDB API key | TheSportsDB Account → API Keys |

Verification: Check if secrets are set via GitHub UI or CLI:

gh secret list

Monitoring & Alerts

Where to Check Status

  1. GitHub Actions Tab
     • View workflow runs: https://github.com/YOUR_ORG/YOUR_REPO/actions
     • Filter by workflow: "Team Sync - Automated" or "Team Sync - Monitor"
     • Check logs for detailed output

  2. GitHub Issues (Auto-created alerts)
     • High pending conflicts (>50): Issue with labels team-sync, needs-attention
     • Low success rate (<80%): Issue with labels team-sync, urgent

  3. Supabase Database
     • Table: team_sync_runs - All sync run history
     • Table: team_sync_conflicts - Conflict detection logs
     • Query via Supabase Dashboard → SQL Editor
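Besides the SQL Editor, the two tables can also be read through Supabase's REST interface. The sketch below builds such a request with the standard library. The helper names are hypothetical, and the `status` column with a `pending` value is an assumption based on the `--status pending` filter of manage_conflicts.py.

```python
import json
import urllib.parse
import urllib.request


def build_query(base_url: str, api_key: str, table: str, **filters: str) -> urllib.request.Request:
    """Build a read-only REST request for a Supabase table."""
    params = {"select": "*", **filters}
    query = urllib.parse.urlencode(params, safe="*")
    url = f"{base_url.rstrip('/')}/rest/v1/{table}?{query}"
    headers = {"apikey": api_key, "Authorization": f"Bearer {api_key}"}
    return urllib.request.Request(url, headers=headers)


def fetch_rows(request: urllib.request.Request) -> list:
    """Send the request and decode the JSON rows (performs a network call)."""
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


# Placeholder values; in practice read SUPABASE_URL and the key from the environment.
req = build_query(
    "https://example.supabase.co",
    "service-role-key",
    "team_sync_conflicts",
    status="eq.pending",  # assumed column name
    limit="5",
)
print(req.full_url)
```

Calling `fetch_rows(req)` with real credentials would return the matching rows. Because the service role key has full access, it should stay out of source control and shell history.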

Workflow Run Summary

Each workflow run provides a summary:

## Team Sync Health Check

- **Total Runs**: 10
- **Success Rate**: 100.0%
- **Pending Conflicts**: 0

✅ **Status**: Healthy

Troubleshooting

Workflow Failed

  1. Check workflow logs in GitHub Actions
  2. Common causes:
     • API rate limits (wait and retry)
     • Network timeout (automatically retried)
     • Missing secrets (check configuration)
  3. Manual retry: Click "Re-run failed jobs" in GitHub UI

High Conflict Count

  1. Check conflict types:
     python utilities/manage_conflicts.py list

  2. Review auto-resolution rules:
     python utilities/manage_conflicts.py list-rules

  3. Manually resolve if needed:
     python utilities/manage_conflicts.py resolve <id> --action merge

Low Success Rate

  1. Review recent error logs in failed workflow runs
  2. Check API connectivity:
     • TheSportsDB API status
     • ESPN API availability
  3. Verify environment variables are set correctly
  4. Run manual sync with verbose logging:
     python utilities/sync_teams_from_apis.py --all-sports --conflict-detection --auto-resolve
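For the environment-variable check, a short script can report which values are missing from the current shell. This sketch assumes the local variable names mirror the GitHub secret names listed in this guide; adjust `REQUIRED_VARS` if .env.supabase uses different names.

```python
import os

# Assumed to match the GitHub secret names.
REQUIRED_VARS = ("SUPABASE_URL", "SUPABASE_SERVICE_ROLE_KEY", "THESPORTSDB_API_KEY")


def missing_variables(environ=os.environ) -> list[str]:
    """Return the required variable names that are unset or blank."""
    return [name for name in REQUIRED_VARS if not environ.get(name, "").strip()]


missing = missing_variables()
if missing:
    print("Missing: " + ", ".join(missing))
else:
    print("All required variables are set.")
```

The script prints only variable names, never their values, so its output is safe to paste into an issue.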

Understanding the Sync Process

Workflow Steps

  1. Discovery: Fetch teams from TheSportsDB and ESPN APIs
  2. Conflict Detection: Check for duplicates, mismatches, and naming conflicts
  3. Auto-Resolution: Apply priority-based rules to resolve conflicts
  4. Database Updates: Create new teams, update existing teams with API IDs
  5. Tracking: Log sync run with statistics and outcomes
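The conflict detection step can be illustrated with duplicate detection by normalized name. This is a simplified sketch: the normalization rule (lowercase, ASCII letters and digits only, collapsed whitespace) and the record fields are assumptions, and the real sync scripts may normalize differently.

```python
import re
from collections import defaultdict


def normalize_name(name: str) -> str:
    """Lowercase, replace punctuation with spaces, and collapse whitespace (assumed rule)."""
    cleaned = re.sub(r"[^a-z0-9 ]+", " ", name.lower())
    return " ".join(cleaned.split())


def find_duplicate_names(teams: list[dict]) -> dict[str, list[dict]]:
    """Group teams by normalized name and keep only groups with more than one member."""
    groups = defaultdict(list)
    for team in teams:
        groups[normalize_name(team["name"])].append(team)
    return {key: members for key, members in groups.items() if len(members) > 1}


# Hypothetical records from the two sources.
teams = [
    {"name": "Boston Celtics", "source": "thesportsdb"},
    {"name": "boston  celtics.", "source": "espn"},
    {"name": "Auburn Tigers", "source": "espn"},
]
for key, members in find_duplicate_names(teams).items():
    print(key, [m["source"] for m in members])
```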

Conflict Types

| Type | Description | Auto-Resolution |
|---|---|---|
| duplicate_normalized_name | Two teams with same normalized name | ✅ Merge into existing |
| api_id_mismatch | Team has conflicting API IDs | ✅ Prefer TheSportsDB |
| canonical_name_conflict | Different canonical names for same team | ✅ Prefer TheSportsDB |
| alias_conflict | Team name could be alias | ⚠️ Requires manual review |

Auto-Resolution Rules

Rules are applied in priority order (lowest number = highest priority):

  1. Priority 10: Merge ESPN teams into existing TheSportsDB teams by normalized name
  2. Priority 20: Prefer TheSportsDB ID when API IDs conflict
  3. Priority 30: Prefer TheSportsDB canonical name when names conflict
  4. Priority 40: Alias conflicts require manual review

Performance Metrics

Current Performance (as of 2025-11-06):

  • Avg Sync Duration: 3-4 minutes (all sports)
  • Conflict Auto-Resolution: 100%
  • Teams Synced: 148 total
  • API Coverage: 98% (ESPN), 52% (TheSportsDB)
  • Success Rate: 100% (after ESPN constraint fix)

Expected Growth:

  • More leagues added → Longer sync times
  • More teams created → Higher conflict detection
  • Additional sports → Proportional increase in duration


Recent Changes

2025-11-06: ESPN Team ID Constraint Removed

Problem: ESPN reuses team IDs across leagues (e.g., ID "2" for Boston Celtics and Auburn Tigers)

Solution: Removed unique constraint on espn_team_id

Impact:

  • ✅ 100% success rate (was 40% with 59 errors)
  • ✅ All sports can now sync without errors
  • ✅ 33 ESPN IDs now properly shared across teams

Related: See ADR-003
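The collision behind this change is easy to reproduce in miniature. In the sketch below the team names and the shared ID "2" come from the example above, while the `sport` values are assumptions added for illustration. The composite key shows one way to tell such rows apart, not necessarily how the schema does it.

```python
# Hypothetical rows; only the names and the shared ESPN ID come from this guide.
teams = [
    {"name": "Boston Celtics", "sport": "Basketball", "espn_team_id": "2"},
    {"name": "Auburn Tigers", "sport": "College Football", "espn_team_id": "2"},
]

# Keyed by ESPN ID alone, the second team overwrites the first. A unique
# constraint on espn_team_id rejects the second row for the same reason.
by_id = {team["espn_team_id"]: team["name"] for team in teams}

# Keyed by (sport, ESPN ID), both teams remain addressable.
by_sport_and_id = {(team["sport"], team["espn_team_id"]): team["name"] for team in teams}

print(len(by_id), len(by_sport_and_id))  # 1 2
```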


Next Steps

  1. Monitor tomorrow's 3 AM UTC run (first automated run after fixes)
  2. Watch for GitHub issue alerts (should be none if healthy)
  3. Review conflict patterns after a week of automated runs
  4. Tune auto-resolution rules if needed based on conflict types

Support

Questions or Issues?

  • Create GitHub issue with label team-sync
  • Check workflow logs in Actions tab
  • Run python utilities/scheduled_team_sync.py --stats for diagnostics

Documentation:

  • ADR-003: ESPN Team ID Constraint Removal
  • Migration 013: Database constraint removal
  • Sync scripts: backend/epgoat/utilities/