Deduce synthetic carry events for event providers which don't annotate them #396
lodevt wants to merge 0 commits into PySport:master
To test this, I removed the carry events from a StatsBomb dataset and used them as the ground truth to compare the generated carries against. A true positive is a generated carry for which a matching StatsBomb carry by the same player is found within 5 seconds. The current implementation reaches an accuracy of around 89% (466 TP, 38 FP, 18 FN), which corresponds to TP / (TP + FP + FN) = 466 / 522.
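For reference, the comparison described above could look roughly like the sketch below. This is not the test code from this PR: the `Carry` class, `match_carries`, and `summarize` are hypothetical names, the greedy one-to-one matching is an assumption, and periods are ignored for brevity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Carry:
    """Hypothetical minimal carry record used only for this comparison."""
    player_id: str
    timestamp: float  # seconds since the start of the period


def match_carries(generated, ground_truth, tolerance=5.0):
    """Return (TP, FP, FN) for generated carries versus ground-truth carries.

    A generated carry counts as a true positive when an unmatched
    ground-truth carry by the same player lies within `tolerance` seconds.
    Each ground-truth carry is matched at most once (an assumption).
    """
    unmatched = list(ground_truth)
    tp = 0
    for carry in generated:
        candidates = [
            truth for truth in unmatched
            if truth.player_id == carry.player_id
            and abs(truth.timestamp - carry.timestamp) <= tolerance
        ]
        if candidates:
            # Consume the closest candidate so it cannot be matched twice.
            closest = min(
                candidates, key=lambda t: abs(t.timestamp - carry.timestamp)
            )
            unmatched.remove(closest)
            tp += 1
    fp = len(generated) - tp  # generated carries without a match
    fn = len(unmatched)       # ground-truth carries that were never matched
    return tp, fp, fn


def summarize(tp, fp, fn):
    """Derive precision, recall, and the TP / (TP + FP + FN) ratio."""
    return {
        "precision": tp / (tp + fp),
        "recall": tp / (tp + fn),
        "overall": tp / (tp + fp + fn),
    }
```

With the counts reported above, `summarize(466, 38, 18)` gives a precision of about 0.92, a recall of about 0.96, and an overall ratio of about 0.89.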
What do you think is the best convention for generating the event_ids of the synthetic carries? The StatsBomb deserializer follows the convention '<event_type>-<previous_event_id>' (see duels and ball out), and the Wyscout V3 deserializer similarly uses 'synthetic-<team_id>-<previous_event_id>' for synthetic formation change events (see formation change). Based on this, I suggest using 'carry-<previous_event_id>' as the event_id for the synthetic carries.
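As a small illustration of the suggested convention, here is a sketch with a hypothetical helper name (`synthetic_carry_id` is not part of kloppy):

```python
def synthetic_carry_id(previous_event_id: str) -> str:
    """Build the suggested 'carry-<previous_event_id>' identifier.

    Hypothetical helper: it assumes at most one synthetic carry is
    generated after any given event, so the id stays unique as long as
    the provider's own event ids are unique.
    """
    return f"carry-{previous_event_id}"
```

For example, a carry generated after the event with id `abc-123` would get the id `carry-abc-123`.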
Addressing Issue #379
I tried to follow an approach similar to the one in socceraction to add carry events for providers that don't annotate them (https://github.com/ML-KULeuven/socceraction/blob/af297a490fc521d27622b947e93fa15c5e0092da/socceraction/spadl/base.py#L38).
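To make the general idea concrete, here is a minimal sketch of deducing carries from gaps between consecutive events. It is neither the socceraction code nor the code in this PR: the `Event` class and `deduce_carries` are hypothetical, the thresholds are placeholder values (metres and seconds), and using the previous event's timestamp as the carry start is a simplification.

```python
from dataclasses import dataclass
from math import hypot
from typing import List, Tuple


@dataclass
class Event:
    """Hypothetical simplified event; not kloppy's event model."""
    event_id: str
    team_id: str
    player_id: str
    timestamp: float             # seconds since the start of the period
    start: Tuple[float, float]   # (x, y) where the event starts, in metres
    end: Tuple[float, float]     # (x, y) where the event ends, in metres


def deduce_carries(
    events: List[Event],
    min_length: float = 3.0,     # placeholder threshold
    max_length: float = 60.0,    # placeholder threshold
    max_duration: float = 10.0,  # placeholder threshold
) -> List[Event]:
    """Insert a carry where the ball moved between two same-team events."""
    carries = []
    for prev, nxt in zip(events, events[1:]):
        if prev.team_id != nxt.team_id:
            continue  # possession changed, so no carry is assumed
        duration = nxt.timestamp - prev.timestamp
        distance = hypot(nxt.start[0] - prev.end[0], nxt.start[1] - prev.end[1])
        if 0 <= duration <= max_duration and min_length <= distance <= max_length:
            carries.append(
                Event(
                    event_id=f"carry-{prev.event_id}",
                    team_id=nxt.team_id,
                    # The player performing the next event carried the ball.
                    player_id=nxt.player_id,
                    # Simplification: a real implementation would use the
                    # time at which the previous event ends.
                    timestamp=prev.timestamp,
                    start=prev.end,
                    end=nxt.start,
                )
            )
    return carries
```

The gap between where one event ends and where the next one starts is what signals a carry; the thresholds only filter out gaps that are too short, too long, or too slow to be plausible.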
The implementation and tests are not complete yet. I think a good way to test this is to remove the carry events from a StatsBomb dataset and then compare the generated carries with the carries from the original dataset. A 100% match is not achievable, but the generated carries should still closely resemble the original ones.
I am happy to hear any feedback while further implementing this.