GRA-122: Data loader implementation by MaxVorosh · Pull Request #67 · SPbuMinecraft/GraphicalEditorForNN

MaxVorosh · 2023-12-09T20:27:03Z

MaxVorosh · 2023-12-09T20:30:26Z

The PR turned out large, sorry...
Please pay special attention to how I assemble the Blob, because I may have misunderstood the principle there. That is roughly one method per class. The rest is working with the data as vectors. I wrote a couple of tests for that part (specifically for the vectors). I hope they help clarify how to use what I wrote.

lpetrov02

I left a lot of non-essential comments marked WHATEVER (optional nitpicks), but overall everything looks great.

lpetrov02 · 2023-12-13T06:51:48Z

server/api/CsvLoader.h

    static std::vector<std::vector<float>> load_csv(std::string path);
+    static std::vector<std::pair<std::string, float>> load_labels(std::string path);


It seems these methods could be made const.
But that is up to you, since we agreed to stick to the WHATEVER principle.
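One caveat on the suggestion above: in C++ a static member function has no `this` pointer and therefore cannot be const-qualified, so the closest equivalent is probably to take the argument by const reference. A minimal sketch of that reading follows; the class name `CsvLoaderSketch`, the function `parse_labels`, and the `name,label` line format are all assumptions made for illustration, not code from the PR.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Sketch only: class name, function name and the "name,label" line format
// are assumptions, not the PR's actual CsvLoader.
struct CsvLoaderSketch {
    // Static member functions cannot be marked const (there is no `this`),
    // so const-correctness here means taking the argument by const reference.
    static std::vector<std::pair<std::string, float>> parse_labels(const std::string& text) {
        std::vector<std::pair<std::string, float>> result;
        std::istringstream in(text);
        std::string line;
        while (std::getline(in, line)) {
            const std::size_t comma = line.find(',');
            if (comma == std::string::npos) {
                continue;  // skip lines without a separator
            }
            result.emplace_back(line.substr(0, comma),
                                std::stof(line.substr(comma + 1)));
        }
        return result;
    }
};
```

Passing `const std::string&` also avoids copying the string on every call, which the by-value `std::string path` signature in the diff does.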

lpetrov02 · 2023-12-13T09:38:28Z

server/api/DataMarker.cpp

+#include "Blob.h"
+
+DataMarker::DataMarker(std::string path, FileExtension type, int percentage_for_train, std::size_t batch_size) {
+    if (percentage_for_train > 100 || percentage_for_train < 0) {


Wouldn't a float be better here? Of course, hardly anyone needs exactly 20.5% for the test set, but it looks like this could be made more flexible practically for free (and that is how torch/sklearn do it).
BUT! Since we are in WHATEVER mode, feel free to ignore this; the current version is fine too.
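For illustration, a float-based split might look roughly like the sketch below. The helper `train_size` and its parameter names are hypothetical and do not appear in the PR; the fraction is assumed to lie in [0, 1] and the result is truncated towards zero.

```cpp
#include <cstddef>
#include <stdexcept>

// Hypothetical helper, not part of the PR: number of training samples when
// `train_fraction` (assumed to be in [0, 1]) of `total` items goes to training.
std::size_t train_size(std::size_t total, float train_fraction) {
    // The negated form also rejects NaN, for which every comparison is false.
    if (!(train_fraction >= 0.0f && train_fraction <= 1.0f)) {
        throw std::invalid_argument("train_fraction must be in [0, 1]");
    }
    // Truncation: a fractional sample is left out of the training part.
    return static_cast<std::size_t>(static_cast<double>(total) * train_fraction);
}
```

With this signature `train_size(10, 0.75f)` yields 7, and an out-of-range value such as `1.5f` is rejected the same way the integer version rejects values outside 0..100.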

lpetrov02 · 2023-12-13T10:41:51Z

server/api/UnshuffledImgLoader.h

+
+class UnshuffledImgLoader: public UnshuffledDataLoader {
+private:
+    std::vector<std::pair<std::string, float>> data;


Here and in the other places too: this pair is used many times. Wouldn't it be better to write a small struct with clearly named fields?
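A possible shape for such a struct is sketched below. The name `LabeledImage` and the field names `path` and `label` are assumptions; the diff shows only `std::pair<std::string, float>`, so what each element means is a guess based on `load_labels`.

```cpp
#include <string>
#include <vector>

// Hypothetical replacement for std::pair<std::string, float>.
// The field meanings are assumptions based on the load_labels signature.
struct LabeledImage {
    std::string path;  // assumed: file name of the image
    float label;       // assumed: target value for that image
};

// Example consumer: collects the labels without touching .first/.second.
std::vector<float> labels_of(const std::vector<LabeledImage>& items) {
    std::vector<float> labels;
    labels.reserve(items.size());
    for (const LabeledImage& item : items) {
        labels.push_back(item.label);
    }
    return labels;
}
```

Call sites then read `item.path` and `item.label` instead of `item.first` and `item.second`, which is the readability gain the comment is asking for.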

lpetrov02 · 2023-12-13T11:02:45Z

server/api/DataLoader.cpp

+    auto dims = shape.getDims();
+    int data_size = 1;
+    for (int i = 0; i < dims.size(); ++i) {
+        data_size *= dims[i];
+    }


I believe I saw a size() method on Shape that does exactly this.
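The project's `Shape` class is not shown in this thread, so the following is only a sketch of what such a method would compute: the product of all dimensions. The free function `element_count` and the `std::vector<std::size_t>` argument type are assumptions.

```cpp
#include <cstddef>
#include <functional>
#include <numeric>
#include <vector>

// Hypothetical stand-in for Shape::size(): product of all dimensions.
// An empty list of dimensions yields 1, matching the loop in the diff.
std::size_t element_count(const std::vector<std::size_t>& dims) {
    return std::accumulate(dims.begin(), dims.end(), std::size_t{1},
                           std::multiplies<std::size_t>());
}
```

Using `std::size_t` throughout also sidesteps the signed/unsigned comparison in `for (int i = 0; i < dims.size(); ++i)`.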

lpetrov02 · 2023-12-13T11:32:15Z

server/api/DataLoader.cpp

+    }
+    data.resize(data_size, 0);
+    int cur_data = 0;
+    for (int i = index; i < index + batch_size; ++i) {


[WHATEVER]
As far as I understand, index is the position from which we read the batch. Personally, I would find it more convenient to pass the batch number, and then the loop would look like
for (int i = index * batch_size; i < (index + 1) * batch_size; ++i)
But really, WHATEVER; it is better to leave it as is, since it works.
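The batch-number variant can be sketched as a small helper that turns a batch number into a half-open index range. The function `batch_range` is hypothetical, and clamping the range to the dataset size is an addition for the last, possibly shorter, batch; neither the loop in the diff nor the loop in the comment does that.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>

// Hypothetical helper: half-open range [begin, end) of sample indices for
// batch number `batch_number`. Clamping to `total` is an assumption added
// here so that the last batch may be shorter than batch_size.
std::pair<std::size_t, std::size_t> batch_range(std::size_t batch_number,
                                                std::size_t batch_size,
                                                std::size_t total) {
    const std::size_t begin = std::min(batch_number * batch_size, total);
    const std::size_t end = std::min(begin + batch_size, total);
    return {begin, end};
}
```

A caller would then iterate `for (std::size_t i = range.first; i < range.second; ++i)`, which is the loop from the comment with the bounds computed in one place.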

Data loader implementation

* Started migration from Data2dLayer to DataLayer
* Makes preparations for metrics logging on python
* Functionality for c++ http added, but not working yet
* Adds saving train metrics
* Adds saving train metrics and responding with PNG
* Adaptates code for new 4D blob
* cpprest CI support
* Add load possibility for zip
* Add load possibility for png on predict
* GRA-122: Data loader implementation (#67)
  * Data loader implementation
* ID-154: Loss type selection (#70)
  * Add loss type selection
  * Add loss type selection
  * Remove layer-class loss
  * Clean up Loss type
  * Make format
* ID-171: Fix input selection (#69)
  * Fix input selection
  * Clean up fix input selection
* Change train and predict for zip file case
* Starts fixing train
* Fixes train with dataloader
* It's not fucking working :( (x3)
* Input selection bug fix
* server train fix
* Fixes train and predcit
* Follow up review
* Follow up review
* ID-167: Upload zip (#74)
* Fixes graph tests
* Fixes DataLayer
* Make formats
* Fix bug (not deleting connection after deleting layer)
* Minor changes in cpp_server
* tests fix
* Fix order of addition inputs in layer

Co-authored-by: lpetrov02 <lpetrov02@mail.ru>
Co-authored-by: Artem Goldenberg <58527023+Artem-Goldenberg@users.noreply.github.com>
Co-authored-by: Artem Goldenberg <st087953@student.spbu.ru>
Co-authored-by: MaxVorosh <ma_voroshilov@mail.ru>
Co-authored-by: Voroshilov Maksim <47945698+MaxVorosh@users.noreply.github.com>

MaxVorosh added 6 commits December 7, 2023 11:06

First implementation

1470733

Add tests for csv loading

5db37f4

Add unshuffled loader for zip

89650b5

Add test for image

d539cdc

Add batches

866bb3c

Add batches

30f5e25

MaxVorosh requested review from Artem-Goldenberg, Maxon081102 and lpetrov02 December 9, 2023 20:30

Add shuffle by seed, manual shuffle and fix bug

aecc66e

lpetrov02 approved these changes Dec 13, 2023

MaxVorosh merged commit 52d4fa3 into main Dec 13, 2023

MaxVorosh deleted the task122_data_loader branch December 13, 2023 19:00

lpetrov02 pushed a commit that referenced this pull request Dec 15, 2023

GRA-122: Data loader implementation (#67)

4cafb33

Data loader implementation

lpetrov02 pushed a commit that referenced this pull request Dec 15, 2023

GRA-122: Data loader implementation (#67)

b16e7af

Data loader implementation

MaxVorosh merged 7 commits into main from task122_data_loader
