Bulk export of audit data #4842

Merged
jadudm merged 22 commits into jr/source-of-truth/main from sk_bulk_export_audit on Apr 7, 2025

@gsa-suk commented Mar 31, 2025

This PR addresses the bulk data export discussed in #4827. The export_data_audit management command exports data from the audit.audit table to S3 as CSV files in the dissemination format.
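
As an illustration of the CSV step only, here is a minimal sketch of serializing rows to CSV bytes in memory before an upload. It is not the code from this PR: the function name rows_to_csv_bytes and the column names are hypothetical, and the actual columns are whatever the dissemination format defines.

```python
import csv
import io


def rows_to_csv_bytes(rows, fieldnames):
    """Serialize an iterable of dict rows to UTF-8 CSV bytes with a header row."""
    buffer = io.StringIO()
    # extrasaction="ignore" drops keys that are not listed in fieldnames.
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, extrasaction="ignore")
    writer.writeheader()
    for row in rows:
        writer.writerow(row)
    return buffer.getvalue().encode("utf-8")


# Hypothetical sample rows; real column names come from the dissemination format.
sample_rows = [
    {"report_id": "EXAMPLE-0000000001", "audit_year": 2016},
    {"report_id": "EXAMPLE-0000000002", "audit_year": 2016},
]
payload = rows_to_csv_bytes(sample_rows, ["report_id", "audit_year"])
print(payload.decode("utf-8"))
```

Building the payload in memory keeps the sketch small; for large tables a streaming or chunked approach would likely be a better fit.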

Testing:

  1. Pull the branch and build the app from scratch.
  2. In a terminal, run 'docker compose run web python manage.py export_data_audit --year all'.
  3. Dissemination data from 2016 onwards is exported to S3. The console output looks like this:
Screenshot 2025-04-02 at 3 52 21 PM
  4. Open the MinIO console and verify that the files exist in the gsa-fac-private-s3/bulk_export folder. The CSV files in S3 look like this:
Screenshot 2025-04-02 at 3 50 34 PM
  5. To export a single audit year and the corresponding federal year, run 'docker compose run web python manage.py export_data_audit --year 2016' in a terminal.

In the terminal:

Screenshot 2025-04-02 at 3 56 04 PM

In the MinIO console:
Screenshot 2025-04-03 at 1 19 38 PM
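
The two invocations in the testing steps ("--year all" and "--year 2016") suggest an argument that accepts either the literal "all" or a single year. The following is a rough sketch of how such an argument could be parsed and expanded, not the command's actual implementation. The helper names parse_year and resolve_years are hypothetical, and using the current calendar year as the upper bound for "all" is an assumption; only the 2016 starting year comes from the description above.

```python
import argparse
from datetime import date

# From the PR description: "--year all" exports data from 2016 onwards.
FIRST_EXPORT_YEAR = 2016


def parse_year(value):
    """argparse type callback: accept the literal "all" or an integer year."""
    if value == "all":
        return value
    try:
        return int(value)
    except ValueError:
        raise argparse.ArgumentTypeError(f"expected 'all' or a year, got {value!r}")


def resolve_years(year_arg, current_year=None):
    """Expand the parsed --year value into a list of years to export."""
    if current_year is None:
        # Assumption: "all" runs through the current calendar year.
        current_year = date.today().year
    if year_arg == "all":
        return list(range(FIRST_EXPORT_YEAR, current_year + 1))
    return [year_arg]


parser = argparse.ArgumentParser(prog="export_data_audit")
parser.add_argument("--year", type=parse_year, required=True)

args = parser.parse_args(["--year", "2016"])
print(resolve_years(args.year))
```

In a Django management command the same add_argument call would typically live in the command's add_arguments method.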

PR Checklist: Submitter

  • Link to an issue if possible. If there’s no issue, describe what your branch does. Even if there is an issue, a brief description in the PR is still useful.
  • List any special steps reviewers have to follow to test the PR. For example, adding a local environment variable, creating a local test file, etc.
  • For extra credit, submit a screen recording like this one.
  • Make sure you’ve merged main into your branch shortly before creating the PR. (You should also be merging main into your branch regularly during development.)
  • Make sure you’ve accounted for any migrations. When you’re about to create the PR, bring up the application locally and then run git status | grep migrations. If there are any results, you probably need to add them to the branch for the PR. Your PR should have only one new migration file for each of the component apps, except in rare circumstances; you may need to delete some and re-run python manage.py makemigrations to reduce the number to one. (Also, unless in exceptional circumstances, your PR should not delete any migration files.)
  • Make sure that whatever feature you’re adding has tests that cover the feature. This includes test coverage to make sure that the previous workflow still works, if applicable.
  • Make sure the full-submission.cy.js Cypress test passes, if applicable.
  • Do manual testing locally. Our tests are not good enough yet to allow us to skip this step. If that’s not applicable for some reason, check this box.
  • Verify that no Git surgery was necessary, or, if it was necessary at any point, repeat the testing after it’s finished.
  • Once a PR is merged, keep an eye on it until it’s deployed to dev, and do enough testing on dev to verify that it deployed successfully, the feature works as expected, and the happy path for the broad feature area (such as submission) still works.
  • Ensure that prior to merging, the working branch is up to date with main and the terraform plan is what you expect.

PR Checklist: Reviewer

  • Pull the branch to your local environment and run make docker-clean; make docker-first-run && docker compose up; then run docker compose exec web /bin/bash -c "python manage.py test"
  • Manually test out the changes locally, or check this box to verify that it wasn’t applicable in this case.
  • Check that the PR has appropriate tests. Look out for changes in HTML/JS/JSON Schema logic that may need to be captured in Python tests even though the logic isn’t in Python.
  • Verify that no Git surgery is necessary at any point (such as during a merge party), or, if it was, repeat the testing after it’s finished.

The larger the PR, the stricter we should be about these points.

Pre Merge Checklist: Merger

  • Ensure that prior to approving, the terraform plan is what we expect it to be. -/+ resource "null_resource" "cors_header" should be destroying and recreating itself and ~ resource "cloudfoundry_app" "clamav_api" might be updating its sha256 for the fac-file-scanner and fac-av-${ENV} by default.
  • Ensure that the branch is up to date with main.
  • Ensure that a terraform plan has been recently generated for the pull request.

@gsa-suk requested a review from a team as a code owner March 31, 2025 20:52
@gsa-suk marked this pull request as draft March 31, 2025 20:52
@jadudm mentioned this pull request Apr 3, 2025

@jrothacker left a comment

I feel like there could be a lot of improvements; let me know if you want to pair on some of this. I am particularly worried about the usage of fac_accepted_date and tribal data.

Comment thread .github/workflows/export-audit-data-to-csv.yml
Comment thread .github/workflows/export-audit-data-to-csv.yml
Comment thread .github/workflows/export-audit-data-to-csv.yml
Comment thread backend/support/management/commands/export_data_audit.py
Comment thread backend/support/management/commands/export_data_audit.py
Comment thread backend/support/export_audit_sql.py
Comment thread backend/support/export_audit_sql.py
Comment thread backend/support/management/commands/export_data_audit.py
Comment thread backend/support/export_audit_sql.py
@jadudm merged commit bbf79c9 into jr/source-of-truth/main Apr 7, 2025
@jadudm deleted the sk_bulk_export_audit branch April 7, 2025 17:05

4 participants