Deduce synthetic carry events for event providers which don't annotate them #396

Closed
lodevt wants to merge 0 commits into PySport:master from lodevt:master

lodevt commented Jan 7, 2025

Addressing Issue #379

I tried to follow an approach similar to the one in socceraction to add carry events for providers that don't annotate them (https://github.com/ML-KULeuven/socceraction/blob/af297a490fc521d27622b947e93fa15c5e0092da/socceraction/spadl/base.py#L38).

The implementation and tests are not complete yet. I think a good way to test this is to remove the carry events from a StatsBomb dataset and then compare the generated carries with the carries from the original dataset. An accuracy of 100% is not achievable, but the generated carries should still closely match the original ones.
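To make the idea concrete, here is a minimal sketch of a gap-based carry heuristic of this kind. It is not the code from this PR or from socceraction: the `Event` class, the `deduce_carries` function, and the maximum length and duration thresholds are hypothetical and chosen for illustration only. The 3 meter minimum matches the `min_carry_length_meters` value mentioned later in this thread, and the `carry-<previous_event_id>` id follows the convention suggested further down.

```python
from dataclasses import dataclass
from math import hypot
from typing import List, Tuple


@dataclass
class Event:
    """Hypothetical, simplified event; not kloppy's Event model."""
    event_id: str
    event_type: str
    team_id: str
    player_id: str
    timestamp: float                # seconds, when the event starts
    end_timestamp: float            # seconds, when the event ends
    start: Tuple[float, float]      # meters
    end: Tuple[float, float]        # meters, ball location after the event


def deduce_carries(
    events: List[Event],
    min_carry_length_meters: float = 3.0,     # value mentioned in this thread
    max_carry_length_meters: float = 60.0,    # assumption for illustration
    max_carry_duration_seconds: float = 10.0, # assumption for illustration
) -> List[Event]:
    """Insert a synthetic carry between two consecutive same-team events
    when the ball moved far enough between them within a short time."""
    result: List[Event] = []
    for i, event in enumerate(events):
        result.append(event)
        if i + 1 == len(events):
            break
        nxt = events[i + 1]
        distance = hypot(nxt.start[0] - event.end[0], nxt.start[1] - event.end[1])
        duration = nxt.timestamp - event.end_timestamp
        if (
            event.team_id == nxt.team_id
            and min_carry_length_meters <= distance <= max_carry_length_meters
            and 0 <= duration <= max_carry_duration_seconds
        ):
            result.append(
                Event(
                    event_id=f"carry-{event.event_id}",
                    event_type="CARRY",
                    team_id=nxt.team_id,
                    player_id=nxt.player_id,  # the player who acts next carried the ball
                    timestamp=event.end_timestamp,
                    end_timestamp=nxt.timestamp,
                    start=event.end,
                    end=nxt.start,
                )
            )
    return result
```

In this sketch the carry is attributed to the player of the following event, since that player must have moved the ball from where the previous event ended to where the next one starts.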

I am happy to hear any feedback while further implementing this.


lodevt commented Jan 8, 2025

To test this, I removed the carry events from a StatsBomb dataset and used the removed events as the ground truth against which to compare the generated carries.

True positives are generated carries for which a matching StatsBomb carry by the same player is found within 5 seconds.
False positives are generated carries for which no matching StatsBomb carry is found.
False negatives are StatsBomb carries longer than min_carry_length_meters (3 meters) that have no match among the generated carries.
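The matching rules above could be sketched roughly as follows. This is an illustration, not the test code from this PR: the `Carry` class and `evaluate` function are hypothetical, "within 5 seconds" is assumed to be inclusive and measured between carry start times, and the matching is assumed not to be one-to-one (a reference carry may match more than one generated carry).

```python
from dataclasses import dataclass
from math import hypot
from typing import List, Set, Tuple


@dataclass(frozen=True)
class Carry:
    """Hypothetical minimal carry record used only for this sketch."""
    player_id: str
    timestamp: float            # seconds, start of the carry
    start: Tuple[float, float]  # meters
    end: Tuple[float, float]    # meters


def carry_length(carry: Carry) -> float:
    return hypot(carry.end[0] - carry.start[0], carry.end[1] - carry.start[1])


def evaluate(
    generated: List[Carry],
    reference: List[Carry],
    max_time_diff_seconds: float = 5.0,
    min_carry_length_meters: float = 3.0,
) -> Tuple[int, int, int]:
    """Return (TP, FP, FN) following the definitions in this comment."""
    tp = fp = 0
    matched_reference: Set[int] = set()
    for gen in generated:
        # Look for a reference carry by the same player close in time.
        match = next(
            (
                i
                for i, ref in enumerate(reference)
                if ref.player_id == gen.player_id
                and abs(ref.timestamp - gen.timestamp) <= max_time_diff_seconds
            ),
            None,
        )
        if match is None:
            fp += 1
        else:
            tp += 1
            matched_reference.add(match)
    # Only unmatched reference carries longer than the minimum count as misses.
    fn = sum(
        1
        for i, ref in enumerate(reference)
        if i not in matched_reference and carry_length(ref) > min_carry_length_meters
    )
    return tp, fp, fn
```

Short reference carries are excluded from the false negatives because the deducer is never expected to produce them.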

The current implementation reaches an accuracy of around 89% (466 TP, 38 FP, 18 FN).
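For reference, the 89% figure is consistent with computing TP / (TP + FP + FN) from the reported counts; the comment does not state the formula, so this is an inference. The sketch below (the function name `summarize` is hypothetical) also derives precision and recall from the same counts.

```python
from typing import Dict


def summarize(tp: int, fp: int, fn: int) -> Dict[str, float]:
    """Derive summary metrics from true/false positive and false negative counts."""
    return {
        # Share of correct matches among all generated and missed carries.
        "overall": tp / (tp + fp + fn),
        # Share of generated carries that match a reference carry.
        "precision": tp / (tp + fp),
        # Share of (sufficiently long) reference carries that were recovered.
        "recall": tp / (tp + fn),
    }


metrics = summarize(tp=466, fp=38, fn=18)
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

With the reported counts, precision comes out at roughly 92% and recall at roughly 96%.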

Comment thread kloppy/domain/models/event.py Outdated
Comment thread kloppy/domain/services/event_deducers/carry.py Outdated

lodevt commented Jan 16, 2025

What do you think would be the best convention to follow to generate the event_ids of the synthetic carries?

The StatsBomb deserializer follows the convention '<event_type>-<previous_event_id>' (see duels and ball out), and the Wyscout V3 deserializer similarly uses 'synthetic-<team_id>-<previous_event_id>' for synthetic formation change events (see formation change).

Based on this, I suggest using 'carry-<previous_event_id>' as the event_id for the synthetic carries.
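As a small illustration of the suggested convention (the helper name `synthetic_carry_id` is hypothetical and not part of kloppy):

```python
def synthetic_carry_id(previous_event_id: str) -> str:
    """Build the id of a synthetic carry from the id of the event preceding it."""
    return f"carry-{previous_event_id}"


# Ids stay unique as long as at most one carry is generated after each event.
ids = [synthetic_carry_id(event_id) for event_id in ["a1", "b2", "c3"]]
print(ids)
```

Because the id is derived from the preceding event, a carry can be traced back to the event that triggered it without storing an extra reference.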
