Add .iterator() to unbounded querysets #7940

Open
foozleface wants to merge 1 commit into specify:main from calacademy-research:cas/perf-iterators-7864

@foozleface
Collaborator

Fixes #7864
Contributed by @foozleface

Approximately 80 callsites in the backend use .all() without .iterator(), so when those querysets are evaluated Django stores every row in the QuerySet's internal result cache. For large tables, this loads the entire result set into memory at once. This PR adds .iterator(chunk_size=2000) to 9 high-impact paths where unbounded querysets are iterated.
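As a rough illustration of the difference described above, the following sketch uses only the standard-library sqlite3 module rather than Django itself. It is an analogy, not the code in this PR: the `prep` table, its columns, and the `iter_rows` helper are all hypothetical. It contrasts materializing every row up front with fetching rows in chunks of 2000.

```python
import sqlite3

CHUNK_SIZE = 2000  # same value as the chunk_size used in this PR


def iter_rows(cursor, chunk_size=CHUNK_SIZE):
    """Yield rows one at a time, fetching at most chunk_size rows per batch."""
    while True:
        rows = cursor.fetchmany(chunk_size)
        if not rows:
            return
        yield from rows


# Hypothetical in-memory table standing in for a large backend table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE prep (id INTEGER PRIMARY KEY, quantity INTEGER)")
conn.executemany(
    "INSERT INTO prep (quantity) VALUES (?)",
    ((i % 5,) for i in range(5000)),
)

# Cached style: the full result set is held in a list before any work starts.
all_rows = conn.execute("SELECT id, quantity FROM prep").fetchall()
total_cached = sum(quantity for _id, quantity in all_rows)

# Streamed style: at most CHUNK_SIZE rows are held by the helper at a time.
cursor = conn.execute("SELECT id, quantity FROM prep")
total_streamed = sum(quantity for _id, quantity in iter_rows(cursor))

print(total_cached == total_streamed)
```

Both styles produce the same total; only the number of rows held in memory at once differs. In Django, the trade-off is that a queryset consumed through .iterator() is not cached, so iterating it a second time runs the query again.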

Implementation

  • Add .iterator(chunk_size=2000) to COG prep consolidation queries in cog_preps.py (2 callsites)
  • Add .iterator(chunk_size=2000) to role policy serialization in permissions/views.py
  • Add .iterator(chunk_size=2000) to deaccession total calculation in calculated_fields.py
  • Add .iterator(chunk_size=2000) to dependent to-many serialization in api/serializers.py
  • Add .iterator(chunk_size=2000) to tree definition rank loading in trees/views.py
  • Fix batch edit date-part field name handling to use date-part-aware lookup keys, preventing mislabeled column headers for temporal fields
  • Fix workbench upload to not overwrite explicit createdbyagent values in upload_table.py
  • Add tests verifying .iterator() is called on the target querysets
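One way a test could verify that .iterator() is called, sketched with unittest.mock and no database. This is not the test module added in this PR: `total_quantity` is a hypothetical stand-in for one of the changed code paths, and the mocked queryset only mimics the single method the function touches.

```python
from types import SimpleNamespace
from unittest.mock import MagicMock


def total_quantity(queryset):
    """Hypothetical consumer: sums `quantity` while streaming the queryset."""
    return sum(row.quantity for row in queryset.iterator(chunk_size=2000))


def test_total_quantity_uses_iterator():
    # The mock stands in for a Django QuerySet; only .iterator() is configured.
    queryset = MagicMock()
    queryset.iterator.return_value = iter(
        [SimpleNamespace(quantity=2), SimpleNamespace(quantity=3)]
    )

    assert total_quantity(queryset) == 5

    # Fails if the code path iterates the queryset without .iterator(),
    # or calls it with a different chunk size.
    queryset.iterator.assert_called_once_with(chunk_size=2000)


test_total_quantity_uses_iterator()
```

A mock-based check like this confirms the call is made but says nothing about actual memory use, which is why the manual monitoring step in the testing instructions is still worthwhile.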

Testing instructions

  • Run batch edit with date fields (catalogedDate Full Date, catalogedDate Month, etc.) and verify column headers are correct
  • Run a workbench upload that explicitly sets createdbyagent and verify the value is preserved
  • Run the test suite: python manage.py test specifyweb.specify.tests.test_queryset_iterators
  • Monitor memory usage during large query exports or COG prep operations
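For the memory check, one possible approach is to compare peak allocations with the standard-library tracemalloc module. The sketch below does not touch Django or the Specify backend; the `rows` generator is a hypothetical stand-in for a large result set, and `peak_bytes` is an illustrative helper. It only shows the shape of the measurement.

```python
import tracemalloc


def peak_bytes(consume):
    """Run consume() and return the peak traced allocation in bytes."""
    tracemalloc.start()
    try:
        consume()
        _current, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return peak


def rows(n):
    """Hypothetical row source standing in for a large result set."""
    for i in range(n):
        yield {"id": i, "name": f"row-{i}"}


def consume_cached():
    cached = list(rows(50_000))  # every row is alive at the same time
    return sum(row["id"] for row in cached)


def consume_streamed():
    return sum(row["id"] for row in rows(50_000))  # one row alive at a time


cached_peak = peak_bytes(consume_cached)
streamed_peak = peak_bytes(consume_streamed)
print(f"cached peak:   {cached_peak} bytes")
print(f"streamed peak: {streamed_peak} bytes")
```

Against a real deployment, the same helper could wrap an export or COG prep call before and after this change, though process-level tools may give a more complete picture because tracemalloc only traces allocations made through Python's allocator.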

Development

Successfully merging this pull request may close these issues.

Use iterators to evaluate large Django QuerySets
