
Proposition: After the OpenCL branch maturing for over 3 years, maybe it's time to give Caffe a bump #6427

Closed

naibaf7 wants to merge 6150 commits into master from opencl

Conversation

@naibaf7 (Member) commented Jun 12, 2018

The OpenCL branch has been maturing for 3 years. In that time, it has accumulated many features that are absent from the main branch. Notably, there are many features (and more to come) that would positively differentiate Caffe from other frameworks such as TensorFlow, Caffe2 and Torch. Caffe has been lacking such features lately.

I'm also standing at a crossroads now, where adding more features to my branch would ultimately make it too difficult to merge upstream changes from the master branch into the opencl branch.

A selection of core features:

  • Virtual pointers and complete device abstraction.
  • Multiplexer supporting different BLAS libraries on GPUs and CPUs.
  • A single, device-abstracted code path for the GPU layers.
  • Runtime compiled kernels for CUDA and OpenCL with a SQLite3 kernel cache.
  • LibDNN with support for convolution, pooling and a complete BLAS, which can act as a drop-in replacement for cuDNN and runs on all OpenCL and CUDA GPUs.
  • Extended Python interface.
  • Vastly improved NetSpec.
  • Quantized data types (INT8, INT16) and half-precision float (FP16).
  • Quantization with support for pseudo-quantization, quantization parameter estimation and mixed-precision neural networks.
  • Gemmlowp support on the CPU (INT8 inference).
  • Topological MALIS-loss layer.
  • Mixture-of-experts (MOE) layer.
  • Explicit support for AMD, NVIDIA, ARM Mali and Intel GPUs.
  • CPU OpenCL support.
  • Extended OpenMP support.
  • Dropping the Makefile build system.
  • CMake cross-compile support for Android and ARM (Raspberry Pi, Asus Tinkerboard).
  • Reduced memory inference.
  • Runtime-compiled kernels mean no more conflicts between NVCC and the host compiler, making builds much faster and easier.
  • Forward and backward compatibility to existing networks and trained weights.
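The runtime-compiled kernel cache listed above can be pictured roughly like this. The following is a minimal, hypothetical Python sketch (the branch's actual implementation is in C++ and compiles real OpenCL/CUDA kernels): compiled kernel binaries are stored in an SQLite3 database keyed by a hash of the device name and kernel source, so each kernel is compiled at most once per device and reused across runs.

```python
import hashlib
import sqlite3

class KernelCache:
    """Toy sketch of an SQLite3-backed cache for runtime-compiled kernels."""

    def __init__(self, path=":memory:"):
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS kernels (key TEXT PRIMARY KEY, binary BLOB)"
        )
        self.compile_count = 0  # counts real compilations (cache misses)

    def _compile(self, source):
        # Stand-in for a real OpenCL/CUDA runtime compilation step.
        self.compile_count += 1
        return source.encode("utf-8")[::-1]  # pretend this is a device binary

    def get(self, device_name, source):
        # Key on device + source, so a device change triggers recompilation.
        key = hashlib.sha256((device_name + source).encode()).hexdigest()
        row = self.db.execute(
            "SELECT binary FROM kernels WHERE key = ?", (key,)
        ).fetchone()
        if row is not None:
            return row[0]  # cache hit: no compilation needed
        binary = self._compile(source)
        self.db.execute(
            "INSERT INTO kernels (key, binary) VALUES (?, ?)", (key, binary)
        )
        self.db.commit()
        return binary

cache = KernelCache()
src = "__kernel void axpy(...) { /* ... */ }"
first = cache.get("gfx900", src)   # compiles and stores the kernel
second = cache.get("gfx900", src)  # served from the SQLite3 cache
```

With an on-disk database path instead of `:memory:`, the cache survives process restarts, which is what removes NVCC from the build entirely.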

Upcoming features:

  • Raspberry Pi VideoCore IV support
  • Writing GPU layers in Python

Currently broken features:

  • NCCL
  • Multi-GPU
  • HDF5 (in some cases)
  • Some runtests (easy to fix)

Missing/incomplete features:

  • Some unit tests for quantized types are not implemented yet
  • Quantization is only implemented for some layers (LeNet and AlexNet work quantized).
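For context, the pseudo-quantization and quantization-parameter estimation mentioned in the feature list generally work along these lines. This is a generic, hypothetical Python sketch of symmetric INT8 quantization, not code from this branch: estimate a scale from the observed value range, quantize to int8, then dequantize back to float so the rounding error can be simulated inside an otherwise float pipeline.

```python
def estimate_scale(values, num_bits=8):
    """Estimate a symmetric quantization scale from observed values."""
    max_abs = max(abs(v) for v in values)
    qmax = 2 ** (num_bits - 1) - 1  # 127 for INT8
    return max_abs / qmax if max_abs > 0 else 1.0

def quantize(values, scale, num_bits=8):
    """Map floats to integers clamped to [-qmax, qmax]."""
    qmax = 2 ** (num_bits - 1) - 1
    return [max(-qmax, min(qmax, round(v / scale))) for v in values]

def pseudo_quantize(values, num_bits=8):
    """Quantize and immediately dequantize: float in, float out,
    but carrying the INT8 rounding error."""
    scale = estimate_scale(values, num_bits)
    return [q * scale for q in quantize(values, scale, num_bits)]

activations = [0.0, 0.5, -1.27, 1.27]
fq = pseudo_quantize(activations)  # floats snapped onto an INT8 grid
```

The unimplemented unit tests would presumably check exactly this kind of round-trip property: that the pseudo-quantized values stay within half a quantization step of the originals.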

So I hope we can get some maintainers to help me hunt down the remaining few bugs in this branch and make it ready for a merge into the master branch (perhaps by December?).

I hope this finds consideration among other Caffe maintainers. I will follow up with all changes and additions made to this branch once the core maintainers agree to go down this path for the future of Caffe.

cypof and others added 30 commits August 16, 2017 18:24
Mention SKX support
The logic for setting the library RPATH checks whether or not
${CMAKE_INSTALL_PREFIX}/lib is a system directory, and if not
adds it to the library RPATH. However, caffe does not install
to ${CMAKE_INSTALL_PREFIX}/lib, it installs to
${CMAKE_INSTALL_PREFIX}/${CMAKE_INSTALL_LIBDIR} (from
GNUInstallDirs). CMAKE_INSTALL_LIBDIR may be something like
"lib/x86_64-linux-gnu"
Fix division operator for Python 3 compatibility in classifier.py
The pointers could be used by CUDA wrapper libraries in Python such as
PyCUDA, gnumpy, Theano etc.
This line is needed for Ubuntu 16.04:

    sudo apt-get install libgflags-dev libgoogle-glog-dev liblmdb-dev

For reference:

* https://github.com/BVLC/caffe/wiki/Ubuntu-16.04-or-15.10-Installation-Guide
* https://youtu.be/DnIs4DRjNL4
Expose GPU pointers to Python
fix bilinear filler (and make constant filler more strict, as it should be)
Signed-off-by: Finnian Anderson <get@finnian.io>
…op_k, no need for fancy priority_queue etc. (2) GPU implementation
…ch-1

[docs] packages needed by Ubuntu 16.04, not just Ubuntu 14.04
Add absolute tolerance to test_net.py to prevent random Travis fails
fixes "NameError: name 'int_tp' is not defined" when running plot_training_log.py
Fix default mode warning in io.resize_image
Fixed bug where make distribute duplicates python files
fix web demo install instruction link
Fix: mean shape incompatible with input shape
@naibaf7 naibaf7 requested review from jeffdonahue and shelhamer June 12, 2018 05:56
@naibaf7 naibaf7 self-assigned this Jun 12, 2018
@naibaf7 naibaf7 requested a review from Noiredd June 12, 2018 05:57
@BVLC BVLC deleted a comment from skn123 Jun 26, 2018
@shelhamer (Member)
@naibaf7 Thank you for dedicating so much effort and engineering to OpenCL Caffe! This branch has only improved since the old times of #2610. Throughout the discussion and comments you have always contributed code, and that takes a lot of time and thought.

I have clearly not had the bandwidth for Caffe development that I once did. I still have my sights on a 1.1 to include module layers (and by extension Halide layers), because these are excellent directions that I was always interested in—among many other good directions, of course—but it has been a long while now. While I still hope to rally the time to take care of last bits of grooming and merge those, I do not expect to be able to contribute as necessary to make a difference in the ~100k line patch that you are hammering on here.

That said, do not let that be an obstacle to your hacking! The only constraint I would advise is that existing capabilities not go missing, like multi-GPU training, HDF5 IO, and so forth. Whether this line of development is a branch, fork, or merged in master can be up to you and the others who keep pushing, pulling, reviewing, and of course brewing ☕

@TechnikEmpire

"Thanks for all the hard work" - Lets it rot as a dangling PR forever.

@Coderx7 (Contributor) commented Oct 11, 2018

@naibaf7: what happened? What's the ultimate decision?

@TechnikEmpire

I have to say that right now, there are compiler issues. MSVC rejects the function instantiation macros used in some layers.

@naibaf7 (Member, Author) commented Oct 11, 2018

@TechnikEmpire
Currently I do not have time to continue developing it.
Nor does interest seem big enough to do so, and since this was an academic exercise, I doubt it will go on as it is.
However, I am looking into porting the Caffe OpenCL code into LibDNN, and then trying to get PyTorch onto that. But that's not certain yet.
Don't expect any news short-term.

@TechnikEmpire commented Oct 11, 2018

Thanks for your contribution @naibaf7. It's unfortunate, but NVIDIA has a stranglehold on deep learning, and now Intel has joined the ranks, even undermining OpenCV by making it depend on proprietary acceleration that doesn't even work on AMD CPUs.

People like you with your contributions are opening these technologies to the general public while others hold back progress to line their corporate pockets. Will watch your personal repos. Thanks again.

@skn123 commented Oct 12, 2018

@naibaf7 This was indeed a welcome commit. Can you please check the PR that I made? If a few issues related to "double" are solved, the code will be fully functional. However, there is an HSA version of Caffe being developed under ROCm! I am waiting for that. The older version of the code is also in my personal repo and should work without a hitch.

@naibaf7 (Member, Author) commented Oct 13, 2018

Thanks for your support. Stay tuned for future developments in deep learning from me; it may just take a while longer. :)

@willyd willyd mentioned this pull request Nov 28, 2018
5 tasks
@naibaf7 naibaf7 closed this Jan 10, 2023
@dragonQian

This comment was marked as off-topic.
