Open
Labels: Priority: high, enhancement, help wanted
Description
It turns out that we took out the Housing-Affordability APIs during an infrastructure change. Shame on me for missing that. It was a subtle mistake, and after so many changes that had no destructive impact, I just didn't thoroughly evaluate the API health.
Mistakes like this are normal, and no distributed system like ours should have to rely on humans remembering to validate every piece of the stack every time a change rolls out.
We need to find a low- or no-cost monitoring solution for our production assets that lets us achieve the following:
- validate the health of each endpoint similarly to our smoke tests - e.g. do we get a 200 from each container (aka is the web server running)? Do we receive compliant JSON from each endpoint (aka is the Django app answering with something it got from the database)?
- canary queries - e.g. is there a specific query for each endpoint that will remain mostly stable, and will demonstrate that the database is returning expected data?
- validate the health of the React apps - e.g. do we get a 200 from each React app? Do we receive a reasonable "HTML response" (or some other lightweight way to show the React app is sending valid data to the requesting browser)?
- validate the database listener - do we get a response on 5432? Is there a way to show that each database is up and responding (without having to hard-code creds in our testing harness)?
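To make the checks above concrete, here is a minimal sketch in Python using only the standard library. It is not a proposal for the final tool, just an illustration of what each probe could look like. All function names are hypothetical, and the idea that "compliant JSON" means a top-level object or array is an assumption about our endpoints. Note that the TCP probe needs no credentials, but it only shows that something is listening on the port, not that the database can answer queries.

```python
import http.client
import json
import socket
import urllib.request


def is_compliant_json(body: bytes) -> bool:
    """Return True if the body parses as a JSON object or array (assumed shape)."""
    try:
        parsed = json.loads(body)
    except ValueError:  # covers JSONDecodeError and UnicodeDecodeError
        return False
    return isinstance(parsed, (dict, list))


def looks_like_html(body: bytes) -> bool:
    """Lightweight check that a React app served an HTML shell."""
    return b"<html" in body.lower()


def check_http(url: str, validate, timeout: float = 5.0) -> bool:
    """Return True if the URL answers 200 and its body passes `validate`.

    `validate` is any callable taking the raw body bytes, so a canary
    check for a specific query can be plugged in the same way.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200 and bool(validate(resp.read()))
    except (OSError, http.client.HTTPException):
        # URLError, HTTPError (non-2xx), timeouts and refused connections
        # are all OSError subclasses.
        return False


def check_tcp_listener(host: str, port: int = 5432, timeout: float = 3.0) -> bool:
    """Return True if something accepts TCP connections on host:port.

    Requires no credentials; proves only that the listener is up.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

A scheduled job could call `check_http(url, is_compliant_json)` for each API endpoint, `check_http(url, looks_like_html)` for each React app, and `check_tcp_listener(host)` for each database host, then alert on any `False`. Proving that a database actually returns data is probably better done indirectly through the canary queries against the API, which keeps credentials out of the test harness.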