Boost parsing performance by abelfodil · Pull Request #67 · PolyCortex/polydodo

abelfodil · 2020-11-08T16:05:11Z

No description provided.

WilliamHarvey97

Review done by Claudia; William acted as secretary.

WilliamHarvey97 · 2020-11-08T20:36:34Z

+                          usecols=retained_columns
+                          ).to_numpy()

-        if len(line_splitted) < CYTON_TOTAL_NB_CHANNELS:


This check made it possible to detect a problem in the uploaded file. For example, if the Cyton shuts off and comes back on momentarily, there will be two comments marking the start of the recording. See the documentation on %STOP AT and %START AT.
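To make the concern concrete, here is a minimal sketch of how an interrupted recording could be detected by counting start markers. It assumes that comment lines begin with `%` and contain the literal text `START AT`; the function name and the sample lines are hypothetical and are not taken from the PR or from a real Cyton file.

```python
def count_recording_starts(lines):
    """Count comment lines that mark the start of a recording segment.

    Assumption: such lines begin with '%' and contain 'START AT'.
    """
    return sum(
        1
        for line in lines
        if line.startswith("%") and "START AT" in line.upper()
    )


# Illustrative content only, not real Cyton output.
sample = [
    "%START AT",
    "00,1A2B3C",
    "%STOP AT",
    "%START AT",
    "01,1A2B3D",
]

starts = count_recording_starts(sample)
# More than one start marker suggests the board restarted mid-recording.
is_contiguous = starts <= 1
```

With this sample, `starts` is 2, so the recording would be flagged as non-contiguous.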

Maybe just put a try/except around read_csv for the comment lines where the retained_columns will not be present. We could then return a 400 error with the explanation in the body.

Otherwise, do we drop the lines that fail? pandas handles that well.

idk, if samples are missing, the file is probably not worth much.

The lines, you mean? Yes, it's a bit annoying. If there is only a stop of about 30 seconds, we can drop those lines. With a stop longer than 5 minutes, simply dropping the lines would not really work, since the interruption is not negligible. We also do not support a non-contiguous night sequence, neither in the classification nor in the visualizations. That is why I was thinking of rejecting the file in that case.
Either way, it's an edge case; for now we can drop the bad lines.
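As an illustration of dropping bad lines, the sketch below reads a small in-memory CSV and discards rows where a retained column is missing. It is not the PR's implementation: the column indices and the sample data are hypothetical, and `dropna` is used here because a line with too few fields is padded with missing values by `read_csv`.

```python
import io

import pandas as pd

# Illustrative data: the second line is truncated and lacks its last field.
raw = "0,AA,BB\n1,CC\n2,DD,EE\n"

frame = pd.read_csv(
    io.StringIO(raw),
    header=None,      # the data has no header row
    usecols=[1, 2],   # hypothetical retained columns
    dtype=str,        # keep the values as strings
)

# Short lines are filled with missing values, so dropping rows that
# contain any missing value removes the incomplete samples.
clean = frame.dropna()
dropped = len(frame) - len(clean)
```

Keeping `dropped` around makes it possible to log or report how many samples were discarded.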

Ok, I caught the pandas error and now raise ClassificationError.
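A rough sketch of that approach follows. The `ClassificationError` class below is a local stand-in, since the project's own definition is not shown in this PR. The function name `parse_samples`, its parameters, and the `comment="%"` option are also assumptions made for illustration. Mapping the error to an HTTP 400 response would happen in the web layer and is not shown.

```python
import io

import pandas as pd


class ClassificationError(Exception):
    """Local stand-in for the project's error type (definition not shown in the PR)."""


def parse_samples(text, retained_columns):
    """Parse CSV text into a NumPy array, keeping only the retained columns."""
    try:
        frame = pd.read_csv(
            io.StringIO(text),
            header=None,
            comment="%",               # assumption: comment lines start with '%'
            usecols=retained_columns,
            dtype=str,
        )
    except ValueError as error:
        # pandas.errors.ParserError and EmptyDataError both subclass ValueError.
        raise ClassificationError(f"Invalid file: {error}") from error
    return frame.to_numpy()


samples = parse_samples("%START AT\n0,AA,BB\n1,CC,DD\n", [1, 2])
```

Chaining with `from error` keeps the original pandas message available, which can help when writing the explanation into the response body.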

abelfodil added 3 commits November 8, 2020 10:32

Use pandas to boost parsing performance by ~58% (from ~26s to ~11s parsing time)

623e008

Remove branching for ~11% boost in performance (~11.1 to ~9.8s parse time)

4923ebc

Simplify hex string parsing and boost performance by ~18% (9.8s to 8.1s parse time)

eadef42
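For readers unfamiliar with hex sample parsing, one simple way to turn a hex string into a signed integer is shown below. This is only a sketch and not necessarily what the commit does: the function name is hypothetical, and the 24-bit two's-complement width is an assumption about the sample format.

```python
def hex_to_signed(value, bits=24):
    """Convert a hex string to a signed integer using two's complement.

    Assumption: samples are `bits`-wide two's-complement values.
    """
    raw = int(value, 16)
    # Values with the top bit set represent negative numbers.
    if raw >= 1 << (bits - 1):
        raw -= 1 << bits
    return raw


largest = hex_to_signed("7FFFFF")   # largest positive 24-bit value
minus_one = hex_to_signed("FFFFFF")  # all bits set
```

Relying on the built-in `int(value, 16)` avoids per-character branching, which is the kind of simplification that tends to help in a tight parsing loop.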

WilliamHarvey97 reviewed Nov 8, 2020

Remove TODO

1d16d62

abelfodil marked this pull request as ready for review November 8, 2020 22:48

WilliamHarvey97 approved these changes Nov 8, 2020

Catch and raise ClassificationError

ddee023

abelfodil merged commit 4011387 into master Nov 8, 2020

abelfodil deleted the performance branch November 8, 2020 23:17

abelfodil merged 5 commits into master from performance


3 participants
