Conversation

Thanks for a great PR. Sparsity is a very promising topic.

Unfortunately I don't have a good comparison available. I could also commit a data layer that converts the sparse data into a dense blob before loading it into the network. This could be used to directly compare the two approaches.
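The comparison idea above could be sketched as follows: expand sparse row data into a dense row-major buffer before feeding the network, so the dense and sparse paths can be benchmarked on identical inputs. All names here are illustrative, not Caffe's actual layer API.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: densify one sparse row (values + column indices)
// into a zero-filled dense vector of width `cols`.
std::vector<float> SparseRowToDense(int cols,
                                    const std::vector<float>& values,
                                    const std::vector<int>& columns) {
  std::vector<float> dense(cols, 0.0f);
  for (std::size_t i = 0; i < values.size(); ++i)
    dense[columns[i]] = values[i];  // scatter non-zeros into place
  return dense;
}
```

A data layer built on this would let the exact same network consume either representation, isolating the cost of the sparse path.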

Great PR. I've tried to use this patch to train on my sparse data. However, I found there were many memory re-allocations: the number of non-zeros (nnz) differs between mini-batches, which requires reshaping the prefetch_data each time. All that memory alloc/free may add extra time on top of the forward/backward computation. Is there any plan to address this? Thanks
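One common way to mitigate the reallocation cost described above is to grow the nnz buffers only when a batch exceeds the current capacity, over-allocating slightly so small fluctuations reuse existing storage. This is a minimal sketch under that assumption; `SparseBuffer` and `Reshape` are illustrative names, not the PR's actual SparseBlob API.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical nnz-capacity buffer: reallocates only on growth.
class SparseBuffer {
 public:
  // Resize the logical nnz; returns true iff a reallocation happened.
  bool Reshape(std::size_t nnz) {
    nnz_ = nnz;
    if (nnz > capacity_) {
      // Over-allocate by ~1.5x so nearby batch sizes hit this path rarely.
      capacity_ = nnz + nnz / 2;
      values_.resize(capacity_);
      indices_.resize(capacity_);
      return true;
    }
    return false;  // fits in existing storage, no alloc/free
  }
  std::size_t nnz() const { return nnz_; }
  std::size_t capacity() const { return capacity_; }

 private:
  std::size_t nnz_ = 0, capacity_ = 0;
  std::vector<float> values_;
  std::vector<int> indices_;
};
```

With a scheme like this, only batches with a record-high nnz pay the reallocation cost; all others reuse the prefetch buffers.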

What do you think of this project https://github.com/btgraham/SparseConvNet/?

Guys, may I know the progress or plan for merging this sparse support into Caffe master?

@yangjunpro I'm interested and can help on / test a merge.

This PR is very useful. We need it to scale to sparse NLP entries, so I have reworked it to be up to date with our (customized) Caffe master, here: https://github.com/beniz/caffe/tree/master_dd_integ_sparse It's been a bit of a struggle to refactor the code (1), but everything is working fine on both CPU and GPU (e.g. @alemagnani's initial example based on 20 newsgroups), as well as the unit tests. I understand that the maintainers either could not find the time to integrate this great piece by @alemagnani or judged that it should be implemented otherwise, but for us this is a great addition, and we will therefore maintain it against Caffe. I could provide a PR or a patch against current. FTR, at this point there's nothing we can quantify about the experience reported by @buaaliyi, but we will try to provide informative measures. Thanks @alemagnani for the great initial work!

(1) this includes the templating of

This PR is a replacement of #937, rebased to master and with an added example.
This adds basic support for sparse data in CSR format. The main addition is a SparseBlob to store the sparse data, plus an extension to InnerProduct that handles both dense and sparse inputs depending on what is presented. A new data layer is added to read sparse data from DBs.
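As background, the CSR (compressed sparse row) layout mentioned above, and the kind of sparse-times-dense product an InnerProduct-style layer would run on it, can be sketched as follows. The struct and function names are illustrative only, not the actual SparseBlob API.

```cpp
#include <cassert>
#include <vector>

// CSR stores only the non-zeros, row by row, plus row boundaries.
struct CsrMatrix {
  int rows, cols;
  std::vector<float> values;   // non-zero values, in row order
  std::vector<int> columns;    // column index of each stored value
  std::vector<int> ptr;        // ptr[r]..ptr[r+1] delimit row r (size rows+1)
};

// y = A * x, touching only the nnz entries of A.
std::vector<float> CsrMatVec(const CsrMatrix& a, const std::vector<float>& x) {
  std::vector<float> y(a.rows, 0.0f);
  for (int r = 0; r < a.rows; ++r)
    for (int i = a.ptr[r]; i < a.ptr[r + 1]; ++i)
      y[r] += a.values[i] * x[a.columns[i]];
  return y;
}
```

For example, the matrix [[1,0,2],[0,3,0]] is stored as values {1,2,3}, columns {0,2,1}, ptr {0,2,3}; multiplying it by a dense vector costs O(nnz) rather than O(rows*cols), which is the whole point of presenting sparse data to the layer.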
Some more details: