Skip to content

Conversation

@shunping
Copy link
Collaborator

@shunping shunping commented Mar 8, 2025

  • Add the ptransform components.
  • Add detectors: z-score, robust z-score and IQR.
  • Move threshold DoFns out of threshold.py
  • Add more docstrings
  • Minor refactoring of existing code

- Detectors include z-score, robust z-score and IQR.
- Move threshold DoFns out of threshold.py
- Add more docstrings
- Minor refactor of existing code
@github-actions github-actions bot added the python label Mar 8, 2025
@shunping shunping self-assigned this Mar 8, 2025
@shunping shunping added this to the 2.64.0 Release milestone Mar 8, 2025
@shunping shunping requested a review from damccorm March 8, 2025 03:54
@shunping shunping marked this pull request as ready for review March 8, 2025 03:54
@shunping shunping force-pushed the anomaly-detection-4 branch from 5901a43 to fe32cc2 Compare March 8, 2025 05:24
@github-actions
Copy link
Contributor

github-actions bot commented Mar 8, 2025

Assigning reviewers. If you would like to opt out of this review, comment assign to next reviewer:

R: @tvalentyn for label python.

Available commands:

  • stop reviewer notifications - opt out of the automated review tooling
  • remind me after tests pass - tag the comment author after tests pass
  • waiting on author - shift the attention set back to the author (any comment or push by the author will return the attention set to the reviewers)

The PR bot will only process comments in the main thread (not review comments).

@shunping shunping force-pushed the anomaly-detection-4 branch 2 times, most recently from 76d2728 to 7b55027 Compare March 8, 2025 19:45
@shunping shunping force-pushed the anomaly-detection-4 branch from 7b55027 to f18bf73 Compare March 8, 2025 20:16
Copy link
Contributor

@damccorm damccorm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR is quite large (~2,000 lines), could we try splitting it up a bit? Specifically, it includes several pieces of functionality (new detectors and trackers, main PTransform) which could be parallelized, and it mixes in significant amounts of refactoring which make it hard to tell what is a functional/non-functional change. If we could split those components up into their own PRs it would be much easier to review well.

You might be able to chunk it and just open a few PRs at once

yield k1, (k2, self._apply_threshold_to_predictions(result))


class StatefulThresholdDoFn(BaseThresholdDoFn):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason to move this out of thresholds.py? It seems more natural here.

If it is just a refactor, could we split that into a separate PR? This one is already quite large

@shunping
Copy link
Collaborator Author

Sure. Let me split it into smaller ones. Thanks!

@shunping shunping closed this Mar 10, 2025
@shunping
Copy link
Collaborator Author

Closing this PR as it is now split into three: #34232, #34234, #34235

@shunping shunping changed the title [AnomalyDetection] Add transforms and detectors. [Obsolete][AnomalyDetection] Add transforms and detectors. Mar 11, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants