[AnomalyDetection] Call RunInference for custom detectors #34286

shunping · 2025-03-13T20:24:57Z

Depends on #34285

shunping · 2025-03-14T02:14:20Z

github-actions · 2025-03-14T02:15:36Z

Stopping reviewer notifications for this pull request: review requested by someone other than the bot, ceding control. If you'd like to restart, comment assign set of reviewers


damccorm

I didn't do a deep dive on the PR because I'm a bit skeptical about the high-level design. I think we can simplify the implementation and end up with a more coherent user experience by limiting the types of model handlers we support.

damccorm · 2025-03-14T14:08:11Z

sdks/python/apache_beam/ml/anomaly/detectors/custom.py

+
+def _to_numpy_array(row: beam.Row):
+  """Converts an Apache Beam Row to a NumPy array."""
+  return numpy.array(list(row))


I don't think this adapter really makes sense. For example, let's say you have a row like:

Row(a=1, b=2, c=3)

These are all different features, and it is unlikely that you actually want to treat them as a single numpy array. Even if you do, there's no real guarantee that the values will end up in the right order.

Worse, this row could actually be:

Row(a=1, b=2, c='foo')

where we only want to run anomaly detection against a (we solve this elsewhere, in beam/sdks/python/apache_beam/ml/anomaly/transforms.py, line 78 in 863e293):

x = beam.Row(**{f: getattr(data, f) for f in self._underlying._features})
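To make the concern concrete, here is a minimal sketch. The `Event` class is a hypothetical stand-in for a `beam.Row` (it is a plain `NamedTuple`, not the Beam type), and `project` is an illustrative helper that mirrors the name-based projection in the quoted `transforms.py` line; neither name comes from the PR.

```python
from typing import NamedTuple, Sequence


class Event(NamedTuple):
    """Hypothetical stand-in for a beam.Row with mixed-type fields."""
    a: int
    b: int
    c: str


def project(row, features: Sequence[str]) -> dict:
    """Keeps only the named fields, in the order given by `features`."""
    return {f: getattr(row, f) for f in features}


row = Event(a=1, b=2, c='foo')

# Positional conversion pulls in every field, including the non-numeric one.
everything = list(row)

# Name-based projection keeps only the fields the detector was configured for.
selected = project(row, ['a'])
```

With a positional conversion the string field ends up in the model input, whereas the projection makes the choice of features explicit.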

I've expressed this less strongly before, but seeing this in practice I really think that we're better off just supporting ModelHandler[beam.Row, float] and making users handle the conversion from row to input/output types.

I think a bunch of these adapters either don't really make sense or are overly opinionated in unpredictable ways that will be hard for users to reason about, and there is a much easier path for them to define the exact behavior they want (with_preprocess_fn/with_postprocess_fn).

This is similarly true with the postprocessing. There is no single way that models will output an anomaly prediction, and it often may require some light postprocessing which can be pretty custom.

I agree with you that it will be much simpler to only support ModelHandler[beam.Row, float], but I am also hesitant to put the entire adapter burden on users, which could add friction to adopting the new transform.

With that said, I think instead of putting those functions in the SDK, maybe we can show them in examples later. WDYT?

Yeah, examples seem reasonable. I think taking the burden away from users would be great, but in practice I don't think their model preprocessing steps will be predictable enough to do that. Users will also be accustomed to having some simple preprocessing steps, and this fits in neatly with that.
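As a rough illustration of what such an example might look like, the sketch below shows the kind of user-defined conversion functions discussed above. Everything in it is an assumption made for illustration: `Reading` is a `NamedTuple` standing in for a `beam.Row`, `toy_model` stands in for real inference, and the function names are made up. In a real pipeline the two conversion functions would presumably be attached to a model handler through with_preprocess_fn and with_postprocess_fn.

```python
from typing import Any, Mapping, NamedTuple


class Reading(NamedTuple):
    """Hypothetical stand-in for a beam.Row of named features."""
    temperature: float
    pressure: float


# The user states the feature order explicitly instead of relying on row order.
FEATURE_ORDER = ('temperature', 'pressure')


def row_to_features(row) -> list:
    """Preprocessing: picks the named fields and converts them to floats."""
    return [float(getattr(row, name)) for name in FEATURE_ORDER]


def output_to_score(output: Mapping[str, Any]) -> float:
    """Postprocessing: reduces a model-specific output to one anomaly score."""
    return float(output['score'])


def toy_model(features):
    """Toy scorer standing in for real inference: L1 distance from a center."""
    center = [20.0, 1.0]
    return {'score': sum(abs(x - c) for x, c in zip(features, center))}


reading = Reading(temperature=25.0, pressure=1.5)
score = output_to_score(toy_model(row_to_features(reading)))
```

Keeping both conversions in user code means the feature order and the meaning of the score are decided by the person who knows the model, which is the trade-off the thread settles on.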

damccorm · 2025-03-14T14:14:36Z

sdks/python/apache_beam/ml/anomaly/detectors/custom.py

+
+
+@specifiable
+class CustomDetector(AnomalyDetector):


If we're using RunInference, this should probably be called OfflineDetector or something similar. CustomDetector could also include online detectors that users define themselves.

Sure. That makes sense.

shunping · 2025-03-17T03:06:51Z

> I didn't do a deep dive on the PR because I'm a bit skeptical about the high level of what this looks like. I think we can simplify the implementation and end up with a more coherent user experience by limiting the types of model handlers we support.

This PR has been superseded by #34310 and #34311.

github-actions bot added the python label Mar 13, 2025

shunping force-pushed the anomaly-detection-5-3 branch from d4d8953 to cb3bb70 Compare March 14, 2025 02:06

shunping mentioned this pull request Mar 14, 2025

[AnomalyDetection] Add Custom Detector to support custom model handlers #34285

Closed

shunping marked this pull request as ready for review March 14, 2025 02:14

shunping self-assigned this Mar 14, 2025

shunping added this to the 2.64.0 Release milestone Mar 14, 2025

shunping force-pushed the anomaly-detection-5-3 branch 2 times, most recently from 7b3218f to 670b948 Compare March 14, 2025 03:39

shunping added 3 commits March 14, 2025 08:43

Add custom detector to support model handlers from RunInference.

f38a6eb

Fix tests on pytorch.

ce275a6

- torch.Tensor(n) gives an uninitialized 1-D tensor with n elements, which often shows up as zeros: i.e. Tensor(0, 0, ... 0).
- torch.tensor(n) gives a 0-dimensional (scalar) tensor with value n: i.e. Tensor(n).

Expand the pipeline with RunInference when custom detectors are used.

863e293

shunping force-pushed the anomaly-detection-5-3 branch from 670b948 to 863e293 Compare March 14, 2025 12:49

damccorm reviewed Mar 14, 2025


shunping closed this Mar 17, 2025
