Make large queries scalable #91

@schlessera

Description

For the commands that are dealing with a potentially huge number of elements (like wp post list), we are currently limited by available memory.

Right now, if you work on a site with 10'000 posts and run wp post list, the command first executes a query whose entire result set of 10'000 posts is loaded into memory. Then, depending on the --format you choose, that result set is processed further, e.g. calculating the maximum width of every column for tables. Only then is the output generated.

As long as we have operations that require processing the entire result set, we are liable to run into the memory limit.

To decouple memory usage from the number of elements, we'd need a pagination mechanism: an overridable default chunk size (pagination size) that is also used for calculations like the table column dimensions.
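The core of such a mechanism is a generator that fetches one page of results at a time, so only a single chunk ever sits in memory. A minimal sketch in Python (illustrative only, not WP-CLI's actual API; `fetch_page` is a hypothetical callback that runs the underlying query with LIMIT/OFFSET):

```python
def iterate_in_chunks(fetch_page, chunk_size=1000):
    """Yield successive chunks of results, querying one page at a time.

    `fetch_page(offset, limit)` is a hypothetical callback that executes
    the underlying DB query; only one chunk is held in memory at once.
    """
    offset = 0
    while True:
        chunk = fetch_page(offset, chunk_size)
        if not chunk:
            return
        yield chunk
        offset += len(chunk)
```

Because the generator is lazy, peak memory is bounded by `chunk_size` regardless of how many rows the full query would return.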

So, with a default chunk size of 1000, we'd query the first 1000 elements and then calculate the table dimensions for those. Each subsequent chunk would restart with a new header and fresh table column dimension calculations.
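Per-chunk table rendering could then look like the following sketch (again illustrative Python, not the formatter WP-CLI actually uses): column widths are computed from the current chunk only, and the header is re-emitted for every chunk, so no step depends on the full result set.

```python
def render_table_chunk(rows, headers):
    """Render one chunk as a plain table.

    Widths are calculated from this chunk alone, and the header is
    repeated per chunk, so memory use is independent of the total count.
    """
    widths = [len(h) for h in headers]
    for row in rows:
        for i, cell in enumerate(row):
            widths[i] = max(widths[i], len(str(cell)))
    lines = ["  ".join(h.ljust(w) for h, w in zip(headers, widths))]
    for row in rows:
        lines.append("  ".join(str(c).ljust(w) for c, w in zip(row, widths)))
    return "\n".join(lines)
```

The visible trade-off is exactly the BC change described above: column widths may differ between chunks, since each chunk is measured independently.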

For other formats, like csv, no calculations rely on the entire dataset, so chunking the DB results can be done transparently, without any backward-compatibility (BC) break.
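For csv the transparency is easy to see: write the header once, then append each chunk's rows as they arrive, and the output is byte-for-byte identical to formatting the whole result set at once. A sketch (hypothetical helper names):

```python
import csv

def write_csv_chunks(out, headers, chunks):
    """Stream CSV output chunk by chunk.

    The header is written once up front; each chunk's rows are appended
    as they arrive, producing output identical to the all-at-once path.
    """
    writer = csv.writer(out)
    writer.writerow(headers)
    for chunk in chunks:
        writer.writerows(chunk)
```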

For some formats, like json, the logic might be a bit more complicated, but should still be doable.
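The extra complication for json is that chunks can't simply be concatenated: the array bracket must be opened once, commas have to be inserted between elements even across chunk boundaries, and the bracket closed at the end. One way to sketch it (illustrative Python, not WP-CLI's formatter):

```python
import json

def write_json_chunks(out, chunks):
    """Stream a single JSON array across chunks.

    Emits '[' once, tracks whether a comma is needed across chunk
    boundaries, serializes each element individually, and closes with ']'.
    """
    out.write("[")
    first = True
    for chunk in chunks:
        for item in chunk:
            if not first:
                out.write(",")
            out.write(json.dumps(item))
            first = False
    out.write("]")
```

The result parses as one valid JSON array, so consumers see no difference from the unchunked output.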
