Add NetEaseCrowd dataset #101
Codecov Report
All modified and coverable lines are covered by tests ✅

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #101      +/-   ##
==========================================
+ Coverage   92.80%   92.96%   +0.15%
==========================================
  Files          47       47
  Lines        2070     2216     +146
==========================================
+ Hits         1921     2060     +139
- Misses        149      156       +7
pilot7747
left a comment
Hi @shenxiangzhuang! Thank you for contributing this dataset. Lgtm
Besides the CI test, I also tried using this dataset for categorical aggregation, and it works well:

from crowdkit.aggregation import DawidSkene
from crowdkit.datasets import load_dataset

# df holds the crowd annotations, gt holds the ground-truth labels
df, gt = load_dataset('netease_crowd')
# Run Dawid-Skene for 10 iterations
ds = DawidSkene(n_iter=10)
result = ds.fit_predict(df)
print(len(result))
# 999799
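Beyond checking the number of aggregated labels, one could also compare the aggregated labels with the ground truth. The following is only a minimal sketch under stated assumptions: it uses a tiny toy sample instead of the real dataset, plain majority vote instead of Dawid-Skene, and the helper names `majority_vote` and `accuracy` are hypothetical, not part of Crowd-Kit. The (task, worker, label) layout is assumed to mirror the annotation format used above.

```python
from collections import Counter, defaultdict

# Toy annotations standing in for the real dataset: (task, worker, label).
annotations = [
    ("t1", "w1", "cat"), ("t1", "w2", "cat"), ("t1", "w3", "dog"),
    ("t2", "w1", "dog"), ("t2", "w2", "dog"),
    ("t3", "w1", "cat"), ("t3", "w2", "dog"), ("t3", "w3", "dog"),
]
# Toy ground truth: task -> true label.
ground_truth = {"t1": "cat", "t2": "dog", "t3": "cat"}


def majority_vote(rows):
    """Pick the most frequent label per task (hypothetical helper)."""
    votes = defaultdict(Counter)
    for task, _worker, label in rows:
        votes[task][label] += 1
    return {task: counts.most_common(1)[0][0] for task, counts in votes.items()}


def accuracy(predicted, truth):
    """Share of tasks with known truth whose predicted label matches."""
    shared = [task for task in truth if task in predicted]
    if not shared:
        return 0.0
    return sum(predicted[task] == truth[task] for task in shared) / len(shared)


predicted = majority_vote(annotations)
score = accuracy(predicted, ground_truth)
print(predicted)
print(round(score, 3))
```

With the real data, the same comparison would be made between `result` and `gt` on the tasks they share.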
dustalov
left a comment
Thank you for a very well-done PR! I noticed a small imperfection in the dataset metadata. Could you please check my suggestion?
Co-authored-by: Dmitry Ustalov <dmitry.ustalov@gmail.com>
Thanks a lot for your careful review!
Great job, thank you again!
Checklist
Dataset info
Adding our open-source dataset, NetEaseCrowd (https://github.com/fuxiAIlab/NetEaseCrowd-Dataset).