Problem
parseDiscoverPageHTML relies on Kickstarter's [data-project] JSON data attribute. Kickstarter is a React SPA and its HTML structure can change without notice. If the attribute disappears, the parser silently returns an empty slice; the AI fallback may also fail. The nightly cron logs the count but there is no alert or metric that fires when 0 campaigns are upserted for a category.
// kickstarter_parser.go
doc.Find("[data-project]").Each(func(i int, s *goquery.Selection) {
	// if this selector matches nothing, campaigns == []
})
// Fallback to HTML structure parse (also fragile)
if len(campaigns) == 0 {
	campaigns = parseFromHTMLStructure(doc)
}
// cron.go: only logs, never alerts
if len(campaigns) == 0 {
	break
}
Expected Behaviour
- If a full crawl run upserts fewer than N campaigns total (e.g. < 50), emit a structured error log or push an internal alert
- Distinguish between "category has 0 live campaigns" and "parsing failed"
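One possible way to keep the two cases apart is to have the parser return an explicit outcome rather than a bare empty slice. The sketch below is only an illustration: `Outcome`, `classify`, and the `emptyStateSeen` flag are hypothetical names, and it assumes the crawler has some way to confirm that a page genuinely lists nothing (how to detect that on Kickstarter's pages is not specified here).

```go
package main

import "fmt"

// Outcome describes what a single category crawl produced.
// All names here are hypothetical and not taken from the existing code.
type Outcome int

const (
	OutcomeOK            Outcome = iota // at least one campaign parsed
	OutcomeEmptyCategory                // page confirmed there is nothing to list
	OutcomeParseFailed                  // nothing parsed and no confirmation of emptiness
)

func (o Outcome) String() string {
	switch o {
	case OutcomeOK:
		return "ok"
	case OutcomeEmptyCategory:
		return "empty_category"
	default:
		return "parse_failed"
	}
}

// classify assumes the caller can detect an explicit empty-state marker
// on the page (emptyStateSeen). How that detection works is left open.
func classify(parsed int, emptyStateSeen bool) Outcome {
	switch {
	case parsed > 0:
		return OutcomeOK
	case emptyStateSeen:
		return OutcomeEmptyCategory
	default:
		return OutcomeParseFailed
	}
}

func main() {
	fmt.Println(classify(12, false))
	fmt.Println(classify(0, true))
	fmt.Println(classify(0, false))
}
```

Treating "zero results without confirmation" as a failure by default errs on the side of alerting, which seems to match the intent of this issue.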
Proposed Fix
- Add a post-crawl sanity check: if upserted == 0 across all categories, log ERROR (not just INFO)
- Optionally track consecutive zero-result runs and surface them in a health endpoint
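A minimal sketch of how the sanity check and the consecutive-run counter might fit together, using the standard `log/slog` package (Go 1.21+). `CrawlHealth`, `RecordRun`, and `minExpectedUpserts` are hypothetical names, and the threshold of 50 is taken from the example figure above rather than from any measured baseline.

```go
package main

import (
	"fmt"
	"log/slog"
	"net/http"
	"sync"
)

// minExpectedUpserts is an assumed threshold; tune it to real crawl volumes.
const minExpectedUpserts = 50

// CrawlHealth tracks how many crawl runs in a row fell below the threshold.
type CrawlHealth struct {
	mu                 sync.Mutex
	consecutiveLowRuns int
}

// RecordRun sums the per-category upsert counts, logs at ERROR when the
// total is below the threshold, and reports whether the run looked healthy.
func (h *CrawlHealth) RecordRun(upsertedByCategory map[string]int) bool {
	total := 0
	for _, n := range upsertedByCategory {
		total += n
	}

	h.mu.Lock()
	defer h.mu.Unlock()

	if total < minExpectedUpserts {
		h.consecutiveLowRuns++
		slog.Error("crawl upserted fewer campaigns than expected",
			"total", total,
			"threshold", minExpectedUpserts,
			"consecutive_low_runs", h.consecutiveLowRuns)
		return false
	}
	h.consecutiveLowRuns = 0
	slog.Info("crawl completed", "total", total)
	return true
}

// ConsecutiveLowRuns returns the current streak of low-result runs.
func (h *CrawlHealth) ConsecutiveLowRuns() int {
	h.mu.Lock()
	defer h.mu.Unlock()
	return h.consecutiveLowRuns
}

// ServeHTTP exposes the streak so a health endpoint can surface it.
func (h *CrawlHealth) ServeHTTP(w http.ResponseWriter, r *http.Request) {
	n := h.ConsecutiveLowRuns()
	if n > 0 {
		http.Error(w, fmt.Sprintf("consecutive low-result crawl runs: %d", n),
			http.StatusServiceUnavailable)
		return
	}
	fmt.Fprintln(w, "ok")
}

func main() {
	h := &CrawlHealth{}
	healthy := h.RecordRun(map[string]int{"art": 0, "games": 0})
	fmt.Println(healthy, h.ConsecutiveLowRuns())
}
```

The cron job would call `RecordRun` once after all categories finish, and the handler could be mounted on whatever health route the service already exposes.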