Skip to content

Conversation

@jayckaiser
Copy link
Collaborator

No description provided.



def get(self, limit: Optional[int] = None, *, params: Optional[dict] = None, **kwargs) -> List[dict]:
def get(self, url: Optional[str] = None, limit: Optional[int] = None, *, params: Optional[dict] = None, **kwargs) -> requests.Response:
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should keep the same List[dict] output as the current code.

Could we instead allow the token argument to be passed into this method?

Copy link
Collaborator Author

@jayckaiser jayckaiser Jan 23, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Option 1: Call session.get_response() separately in get and get_pages. (decouple)

Option 2: Pass token as an optional argument to get. If passed, change output type from List[dict] to Union[List[dict], Tuple[List[dict], Optional[str]]].

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I went with option 1 in my most recent updates for more output predictability and consistency + break get_pages into get_pages_offset and get_pages_cursor. I do recognize that there's a bit of repetition between the 2 getters, but since cursor-paging has different API requirements, headers used, compatibility rules, and fallback behaviors than off-set paging, I feel like this decomposition is justified. If you have any code design suggestion, that'd be great!



elif cursor_paging and partitioning:
ods_version = tuple(map(int, self.client.get_ods_version().split(".")[:2]))
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should move the version-check to the top of the method and fail immediately if any disallowed arguments are passed.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess with the bifurcation of get_pages(), our notes about disallowed arguments don't apply for now? I did put some built-in checkpoints for get_pages_cursor so that:

  • If ODS version is not 7.3, the paging method will switch to reverse-offset (Up to debate whether this should be the desired behavior or if it should error out completely)
  • Since deletes and key changes are not supported by cursor paging, if the client tries to fetch them via get_pages_cursor() → throws warning and switch to reverse-offset (Again, up to debate if we should require more manual input from client if their requests are incompatible with the new feature)

token = result.headers.get("Next-Page-Token")
return results

results = [element for sublist in Parallel(n_jobs=len(paged_tokens), backend="threading")(delayed(partitioning_with_token)(token) for token in paged_tokens) for element in sublist]
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll need to dive deeper into this approach. There is a future where we enable async in the edfi_api_client and will need this method to be compatible.

Copy link
Collaborator Author

@jayckaiser jayckaiser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've left a bunch of comments for things to change, most minor. I want to make sure that we:

  • maintain backwards compatibility
  • minimize code-hopping

I'd also request going through and making sure whitespace is kept the same between this branch and main, just to minimize the number of lines changed.

paged_params['limit'] = limit

# Fall back to reverse-offset paging if incompatible with cursor paging
def _fall_back_to_pages_by_offset():
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd rather put all conditional logic on choosing the internal helper method into get_rows() instead of defining a fallback. The choice of pagination should be made and declared immediately.


### Prepare pagination variables
### First request should not have any `page_token` and `page_size` defined
paged_params.init_page_by_token(page_token = None, page_size = None)
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Page_size should be defined regardless, since we always set a default value.


### Prepare pagination variables
### First request should not have any `page_token` and `page_size` defined
paged_params.init_page_by_token(page_token = None, page_size = None)
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's consolidate 317 and 332 and put it at the start of the while-loop. We can define page_token = None above the loop for the first pass.

"""
# Override init params if passed
paged_params = EdFiParams(params or self.params).copy()
logging.info(f"[Get {self.component}] Endpoint: {self.url}")
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can remove this logging statement.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants