Change Detector Architecture
Go-based service that monitors scraped websites for content changes and triggers re-scraping
Internal Components
Service Configuration
Replicas: 1
Workers: 100 concurrent
Language: Go (pgx)
Network Mode: service:vpn
Database Tables
tracked_pages: Pages being monitored
page_changes: Change event log
website_scrape_jobs: Re-scrape triggers
Pipeline Position
Upstream
Website Scraper (tracked pages)
Downstream
Website Scraper (re-scrape jobs)
Component Breakdown
Scheduler
Fetches batches of 50 pages that are due for a check, selected by last_check_at. Each page is rechecked on a 24-hour interval.
Worker Pool
100 concurrent HTTP workers, each request with a 10s timeout. Requests rotate across 5 user agent variants, and each page gets up to 3 retries.
Hash Comparator
SHA-256 content hashing. Compares each new hash against the stored last_content_hash to detect changes.
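The comparison step can be illustrated with a short sketch. It assumes the hash is stored as a lowercase hex string and that an empty last_content_hash means the page has no baseline yet, so the first check records a hash without reporting a change; neither detail is stated in this document. The function names contentHash and hasChanged are hypothetical.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
)

// contentHash returns the SHA-256 digest of body as a lowercase hex string.
func contentHash(body []byte) string {
	sum := sha256.Sum256(body)
	return hex.EncodeToString(sum[:])
}

// hasChanged hashes the freshly fetched body and compares it with the
// stored hash. It returns the new hash so the caller can persist it.
func hasChanged(lastContentHash string, body []byte) (newHash string, changed bool) {
	newHash = contentHash(body)
	// Assumption: no stored hash means this is the first check, so the
	// new hash becomes the baseline and no change is reported.
	if lastContentHash == "" {
		return newHash, false
	}
	return newHash, newHash != lastContentHash
}
```

Hashing the raw response body means that any byte-level difference, including dynamic markup such as timestamps, registers as a change. Whether the service normalizes content before hashing is not specified here.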
Re-scrape Job Creator
Creates website_scrape_jobs when changes are detected. Records change events in page_changes table.