75 commits
c941969
Wrap the CPU and GPU math functions in math backend classes
kloudkl May 14, 2014
540f103
Add the math backend to the Layer base class
kloudkl May 14, 2014
ab5ffab
Add device type independent getters to Blob
kloudkl May 14, 2014
e347813
Remove tab from the code and reformat using google style
kloudkl May 15, 2014
8b11f51
Allow Layer::Forward and Backward to be overridden
kloudkl May 15, 2014
88ba49b
Use zero as the default return values of Blob data and diff methods
kloudkl May 15, 2014
e534e50
Add and test device type ignorant Forward and Backward in ConcatLayer
kloudkl May 15, 2014
54b86aa
Add default implementations of Layer::Forward_cpu and Backward_cpu
kloudkl May 15, 2014
0e65f7f
Directly implement device neutral Forward and Backward in ConcatLayer
kloudkl May 15, 2014
3d2e16e
Generalize the math backend classes into device wrapper classes
kloudkl May 24, 2014
faef1ff
Add Device::copy_from_cpu for the data layers
kloudkl May 25, 2014
c10fa56
Unify the CPU and the GPU Forward of the DataLayer
kloudkl May 25, 2014
a38d30a
Unify the CPU and the GPU Forward of the ImageDataLayer
kloudkl May 25, 2014
e362333
Unify the CPU and the GPU Forward of the HDF5DataLayer
kloudkl May 25, 2014
37e3f67
Unify the CPU and the GPU Forward & Backward of the HDF5OutputDataLayer
kloudkl May 25, 2014
881a728
Merge the CPU and the GPU Backward of the data layers
kloudkl May 25, 2014
dd703f9
Consolidate the CPU and GPU Forward of the WindowDataLayer
kloudkl May 25, 2014
8fc3e1d
Deduplicate the CPU and the GPU Forward & Backward of the FlattenLayer
kloudkl May 25, 2014
d80ba9f
Use the newly implemented caffe_gpu_{add,sub} in the GPU device wrapper
kloudkl May 26, 2014
7bf9e67
Replace caffe_gpu_{copy+axpy} with sub in SigmoidCrossEntropyLossLayer
kloudkl May 26, 2014
3ff2afb
Unify the CPU/GPU Forward/Backward of the SigmoidCrossEntropyLossLayer
kloudkl May 26, 2014
071a5d0
Merge the CPU/GPU Forward/Backward of the SoftmaxWithLossLayer
kloudkl May 26, 2014
26f7b34
Use {const, mutable}_{data, diff} in the unified Forward/Backward
kloudkl May 26, 2014
fc675ca
Unify the CPU/GPU Forward/Backward of the InnerProductLayer
kloudkl May 26, 2014
7a3faf0
Unify the CPU/GPU Forward/Backward of the SplitLayer
kloudkl May 26, 2014
9d9b1c3
Unify the CPU/GPU Forward/Backward of the EltwiseLayer
kloudkl May 26, 2014
5d3be7a
Add im2col and col2im to wrap im2col_{cpu, gpu} and col2im_{cpu, gpu}
kloudkl May 26, 2014
b2a90d9
Unify the CPU/GPU versions of Forward/Backward of the ConvolutionLayer
kloudkl May 26, 2014
d622419
Unify the CPU/GPU versions of Forward/Backward of the Im2colLayer
kloudkl May 26, 2014
5089d44
Unify the CPU/GPU versions of Forward/Backward of the PowerLayer
kloudkl May 26, 2014
5c468a1
Move im2col and col2im into the device wrapper classes
kloudkl May 26, 2014
2662987
Update the include guard of the util/device.hpp
kloudkl Jun 8, 2014
8c3d26a
Add OpenCLDevice header file and to_clblasTranspose inline function
kloudkl Jun 8, 2014
1a2f04c
Add macros and get error string functions for the OpenCL device
kloudkl Jun 8, 2014
dbb0cc5
Implement OpenCLDevice::gemm
kloudkl Jun 8, 2014
d9add06
Split OpenCLDevice<double>::gemm into float and double
kloudkl Jun 8, 2014
5d26453
Add OpenCLDevice macro ARRAY & CLBALS_TRAILING_ARGS, edit CREATE_CL_MEM
kloudkl Jun 8, 2014
fc14be2
Simplify OpenCLDevice::gemm with the new macros
kloudkl Jun 8, 2014
15c54e6
Implement OpenCLDevice<float/double>::gemv
kloudkl Jun 8, 2014
bc8da2a
Implement OpenCLDevice::axpy and fix gemm, gemv
kloudkl Jun 8, 2014
c67b937
Implement OpenCLDevice::axpby
kloudkl Jun 8, 2014
10f48f0
Implement OpenCLDevice<>::copy
kloudkl Jun 8, 2014
d0aecd4
Implement OpenCLDevice<Dtype>::copy_from_cpu with clEnqueueWriteBuffer
kloudkl Jun 8, 2014
9094297
Implement OpenCLDevice<Dtype>::set with clEnqueueFillBuffer
kloudkl Jun 8, 2014
95371e2
Implement OpenCLDevice<Dtype>::scale with copy and scal
kloudkl Jun 8, 2014
1c996f4
Add OPENCL_KERNEL_LOOP and DEFINE_AND_INSTANTIATE_OPENCL_BINARY_FUNC
kloudkl Jun 8, 2014
c0b3f7f
Declare, define and instantiate caffe_opencl_{add, sub, mul, div}
kloudkl Jun 8, 2014
238a712
Use caffe_opencl_{add,sub,mul,div} in OpenCLDevice::{add,sub,mul,div}
kloudkl Jun 8, 2014
a4bf96b
Add the macro DEFINE_AND_INSTANTIATE_OPENCL_UNARY_FUNC
kloudkl Jun 8, 2014
cf8e21e
Define and instantiate caffe_opencl_{sqr, exp, sign, sgnbit, fabs}
kloudkl Jun 8, 2014
a245ecc
Use caffe_opencl_{sqr,exp,sign,sgnbit,fabs} in OpenCLDevice::{...}
kloudkl Jun 8, 2014
3cb16cc
Move the definitions of OpenCLDevice unary & binary methods into macros
kloudkl Jun 8, 2014
65342d8
Replace Caffe::opencl_queue with OpenCLDevice<Dtype>::cl_command_queue
kloudkl Jun 8, 2014
a07ec18
Add a new Brew::OPENCL and OpenCL Device in DeviceFactory
kloudkl Jun 8, 2014
b0dc9b3
Device wrapper methods no longer pure virtual, default not implemented
kloudkl Jun 18, 2014
821c74d
DeviceFactory opts out OpenCLDevice by default, users can opt in
kloudkl Jun 18, 2014
dc7e05f
Add variables in Makefile and config to build OpenCL related codes
kloudkl Jun 18, 2014
20eaad8
Fix all the issues in OpenCLDevice that prevent successful building
kloudkl Jun 18, 2014
18c136e
Fix the rebase errors introduced when merge conflicts are resolved
kloudkl Jun 18, 2014
7b619ce
OpenCLDevice supports multiple platforms and devices
kloudkl Jun 18, 2014
22acd5d
Initialize and finalize clBLAS in the OpenCLDevice
kloudkl Jun 18, 2014
ba4cf00
Split math functions and global device statuses out of OpenCLDevice
kloudkl Jun 18, 2014
2294bc9
Implement the macros to define the OpenCL kernels for the math functions
kloudkl Jun 19, 2014
be9549a
Implement the macros to define the OpenCL kernels for the math functions
kloudkl Jun 19, 2014
1dfc70f
Using Shared Context for Multiple OpenCL Devices
kloudkl Jun 19, 2014
11d7d96
Implement OpenCLSyncedMemory keeping the API of SyncedMemory
kloudkl Jun 19, 2014
4694626
Replace clEnqueue{Read,Write}Buffer with MapBuffer/UnmapMemObject
kloudkl Jun 19, 2014
5688b2c
Add tests for OpenCL math functions
kloudkl Jun 19, 2014
1fe92d0
Add common abstract base class for SyncedMemory and OpenCLSyncedMemory
kloudkl Jun 19, 2014
26917dc
Add the factory function to produce synced memory
kloudkl Jun 20, 2014
c31fbaa
Add tests for OpenCL synced memory
kloudkl Jun 19, 2014
2dd1d7d
Opt out OpenCL codes with the macro USE_OPENCL
kloudkl Jun 28, 2014
1605431
Replace all the blob data getters with the device independent version
kloudkl Jun 28, 2014
e20f8a7
Dynamically get the device to perform the math computations
kloudkl Jun 28, 2014
8f868c1
Implement device independent data getters for the SyncedMemory
kloudkl Jun 28, 2014
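Taken together, the commits above factor the per-device math out of the layers: the CPU and GPU math functions are wrapped in backend classes, the wrappers are generalized into device classes, the per-layer Forward_cpu/Forward_gpu and Backward_cpu/Backward_gpu pairs are collapsed into single device-neutral Forward/Backward methods, and an OpenCLDevice plus OpenCLSyncedMemory are added behind the opt-in USE_OPENCL macro. The sketch below is a rough reconstruction of that wrapper interface from the commit messages alone; the class name Device, the GetDevice factory, and the exact signatures are assumptions, not code copied from the PR.

// Rough sketch of the device wrapper described by the commit messages above.
// Names and signatures are assumptions; only the existence of gemm, gemv,
// axpy, copy, copy_from_cpu, set, im2col/col2im, etc. is implied by the PR.
#include "caffe/common.hpp"               // NOT_IMPLEMENTED, Caffe::mode()
#include "caffe/util/math_functions.hpp"  // CBLAS_TRANSPOSE via cblas

namespace caffe {

template <typename Dtype>
class Device {
 public:
  virtual ~Device() {}
  // Virtual but not pure (commit b0dc9b3): a backend may leave operations
  // it does not support unimplemented.
  virtual void gemm(const CBLAS_TRANSPOSE TransA, const CBLAS_TRANSPOSE TransB,
      const int M, const int N, const int K, const Dtype alpha,
      const Dtype* A, const Dtype* B, const Dtype beta, Dtype* C) {
    NOT_IMPLEMENTED;
  }
  virtual void axpy(const int N, const Dtype alpha,
      const Dtype* X, Dtype* Y) { NOT_IMPLEMENTED; }
  virtual void copy(const int N, const Dtype* X, Dtype* Y) { NOT_IMPLEMENTED; }
  virtual void copy_from_cpu(const int N, const Dtype* X, Dtype* Y) {
    NOT_IMPLEMENTED;
  }
  virtual void set(const int N, const Dtype alpha, Dtype* X) {
    NOT_IMPLEMENTED;
  }
  // ... gemv, axpby, add, sub, mul, div, im2col, col2im, and so on.
};

// Hypothetical factory: return the wrapper that matches Caffe::mode().
template <typename Dtype>
Device<Dtype>* GetDevice();

}  // namespace caffe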
8 changes: 8 additions & 0 deletions Makefile
@@ -224,6 +224,14 @@ endif
INCLUDE_DIRS += $(BLAS_INCLUDE)
LIBRARY_DIRS += $(BLAS_LIB)

OPENCL ?= 0
ifeq ($(OPENCL), 1)
INCLUDE_DIRS += $(OPENCL_INCLUDE_DIR) $(CLBLAS_INCLUDE_DIR)
LIBRARY_DIRS += $(OPENCL_LIB_DIR) $(CLBLAS_LIB_DIR)
LIBRARIES += $(OPENCL_LIBS) $(CLBLAS_LIBS)
COMMON_FLAGS += -DUSE_OPENCL
endif

# Complete build flags.
COMMON_FLAGS += $(foreach includedir,$(INCLUDE_DIRS),-I$(includedir))
CXXFLAGS += -pthread -fPIC $(COMMON_FLAGS) $(WARNINGS)
7 changes: 7 additions & 0 deletions Makefile.config.example
@@ -46,6 +46,13 @@ PYTHON_INCLUDE := /usr/local/include/python2.7 \
PYTHON_LIB := /usr/local/lib
# PYTHON_LIB := $(HOME)/anaconda/lib

OPENCL_INCLUDE_DIR := /opt/AMDAPP/include/
OPENCL_LIB_DIR := /opt/AMDAPP/lib/x86_64/
OPENCL_LIBS := OpenCL
CLBLAS_INCLUDE_DIR := /home/user/Codes/clBLAS/src/package/include
CLBLAS_LIB_DIR := /home/user/Codes/clBLAS/src/package/lib64
CLBLAS_LIBS := clBLAS

# Whatever else you find you need goes here.
INCLUDE_DIRS := $(PYTHON_INCLUDE) /usr/local/include
LIBRARY_DIRS := $(PYTHON_LIB) /usr/local/lib /usr/lib
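With the paths above set in Makefile.config (the AMD APP SDK and clBLAS locations shown are examples for a local install), OpenCL support stays opt-in: it is compiled only when building with "make OPENCL=1", which overrides the "OPENCL ?= 0" default in the Makefile hunk above and adds the OpenCL/clBLAS include, library, and linker settings together with -DUSE_OPENCL.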
6 changes: 6 additions & 0 deletions include/caffe/blob.hpp
@@ -71,6 +71,12 @@ class Blob {
Dtype* mutable_gpu_data();
Dtype* mutable_cpu_diff();
Dtype* mutable_gpu_diff();

const Dtype* const_data() const;
const Dtype* const_diff() const;
Dtype* mutable_data();
Dtype* mutable_diff();

void Update();
void FromProto(const BlobProto& proto);
void ToProto(BlobProto* proto, bool write_diff = false) const;
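The four new getters are the device-type-independent accessors that the later commits switch the layers and fillers onto. Their bodies live in src/caffe/blob.cpp and are not part of this hunk; the following is a minimal sketch of how such a getter could dispatch on the active mode (the body is an assumption, not the PR's code):

// Hypothetical dispatch for the device-independent getter.
template <typename Dtype>
const Dtype* Blob<Dtype>::const_data() const {
  switch (Caffe::mode()) {
  case Caffe::CPU:
    return cpu_data();
  case Caffe::GPU:
    return gpu_data();
  default:
    // The OPENCL_* modes added in common.hpp below would return the
    // OpenCL-backed pointer here.
    return cpu_data();
  }
}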
2 changes: 1 addition & 1 deletion include/caffe/common.hpp
@@ -74,7 +74,7 @@ class Caffe {
}
return *singleton_;
}
enum Brew { CPU, GPU };
enum Brew { CPU, GPU, OPENCL_CPU, OPENCL_GPU, OPENCL_ALL };
enum Phase { TRAIN, TEST };


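The three OPENCL_* values extend the existing mode switch, so selecting the OpenCL backend uses the same call as selecting CPU or GPU. A minimal usage sketch (set_mode is the pre-existing Caffe API; which physical device each OPENCL_* value maps to is decided by the DeviceFactory added in commit a07ec18):

#include "caffe/common.hpp"

int main() {
  // Pick the backend before running a net; OPENCL_GPU is one of the
  // Brew values added in the hunk above.
  caffe::Caffe::set_mode(caffe::Caffe::OPENCL_GPU);
  return 0;
}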
83 changes: 27 additions & 56 deletions include/caffe/data_layers.hpp
@@ -31,6 +31,10 @@ class HDF5OutputLayer : public Layer<Dtype> {
virtual ~HDF5OutputLayer();
virtual void SetUp(const vector<Blob<Dtype>*>& bottom,
vector<Blob<Dtype>*>* top) {}
virtual Dtype Forward(const vector<Blob<Dtype>*>& bottom,
vector<Blob<Dtype>*>* top);
virtual void Backward(const vector<Blob<Dtype>*>& top,
const vector<bool>& propagate_down, vector<Blob<Dtype>*>* bottom) { return; }

virtual inline LayerParameter_LayerType type() const {
return LayerParameter_LayerType_HDF5_OUTPUT;
@@ -42,14 +46,6 @@ class HDF5OutputLayer : public Layer<Dtype> {
inline std::string file_name() const { return file_name_; }

protected:
virtual Dtype Forward_cpu(const vector<Blob<Dtype>*>& bottom,
vector<Blob<Dtype>*>* top);
virtual Dtype Forward_gpu(const vector<Blob<Dtype>*>& bottom,
vector<Blob<Dtype>*>* top);
virtual void Backward_cpu(const vector<Blob<Dtype>*>& top,
const vector<bool>& propagate_down, vector<Blob<Dtype>*>* bottom);
virtual void Backward_gpu(const vector<Blob<Dtype>*>& top,
const vector<bool>& propagate_down, vector<Blob<Dtype>*>* bottom);
virtual void SaveBlobs();

std::string file_name_;
@@ -67,6 +63,10 @@ class HDF5DataLayer : public Layer<Dtype> {
virtual ~HDF5DataLayer();
virtual void SetUp(const vector<Blob<Dtype>*>& bottom,
vector<Blob<Dtype>*>* top);
virtual Dtype Forward(const vector<Blob<Dtype>*>& bottom,
vector<Blob<Dtype>*>* top);
virtual void Backward(const vector<Blob<Dtype>*>& top,
const vector<bool>& propagate_down, vector<Blob<Dtype>*>* bottom) { return; }

virtual inline LayerParameter_LayerType type() const {
return LayerParameter_LayerType_HDF5_DATA;
@@ -75,14 +75,6 @@ class HDF5DataLayer : public Layer<Dtype> {
virtual inline int ExactNumTopBlobs() const { return 2; }

protected:
virtual Dtype Forward_cpu(const vector<Blob<Dtype>*>& bottom,
vector<Blob<Dtype>*>* top);
virtual Dtype Forward_gpu(const vector<Blob<Dtype>*>& bottom,
vector<Blob<Dtype>*>* top);
virtual void Backward_cpu(const vector<Blob<Dtype>*>& top,
const vector<bool>& propagate_down, vector<Blob<Dtype>*>* bottom) {}
virtual void Backward_gpu(const vector<Blob<Dtype>*>& top,
const vector<bool>& propagate_down, vector<Blob<Dtype>*>* bottom) {}
virtual void LoadHDF5FileData(const char* filename);

std::vector<std::string> hdf_filenames_;
@@ -111,6 +103,10 @@ class DataLayer : public Layer<Dtype> {
virtual ~DataLayer();
virtual void SetUp(const vector<Blob<Dtype>*>& bottom,
vector<Blob<Dtype>*>* top);
virtual Dtype Forward(const vector<Blob<Dtype>*>& bottom,
vector<Blob<Dtype>*>* top);
virtual void Backward(const vector<Blob<Dtype>*>& top,
const vector<bool>& propagate_down, vector<Blob<Dtype>*>* bottom) { return; }

virtual inline LayerParameter_LayerType type() const {
return LayerParameter_LayerType_DATA;
@@ -120,15 +116,6 @@ class DataLayer : public Layer<Dtype> {
virtual inline int MaxTopBlobs() const { return 2; }

protected:
virtual Dtype Forward_cpu(const vector<Blob<Dtype>*>& bottom,
vector<Blob<Dtype>*>* top);
virtual Dtype Forward_gpu(const vector<Blob<Dtype>*>& bottom,
vector<Blob<Dtype>*>* top);
virtual void Backward_cpu(const vector<Blob<Dtype>*>& top,
const vector<bool>& propagate_down, vector<Blob<Dtype>*>* bottom) {}
virtual void Backward_gpu(const vector<Blob<Dtype>*>& top,
const vector<bool>& propagate_down, vector<Blob<Dtype>*>* bottom) {}

virtual void CreatePrefetchThread();
virtual void JoinPrefetchThread();
virtual unsigned int PrefetchRand();
@@ -170,15 +157,12 @@ class DummyDataLayer : public Layer<Dtype> {
}
virtual inline int ExactNumBottomBlobs() const { return 0; }
virtual inline int MinTopBlobs() const { return 1; }

protected:
virtual Dtype Forward_cpu(const vector<Blob<Dtype>*>& bottom,
virtual Dtype Forward(const vector<Blob<Dtype>*>& bottom,
vector<Blob<Dtype>*>* top);
virtual void Backward_cpu(const vector<Blob<Dtype>*>& top,
const vector<bool>& propagate_down, vector<Blob<Dtype>*>* bottom) {}
virtual void Backward_gpu(const vector<Blob<Dtype>*>& top,
virtual void Backward(const vector<Blob<Dtype>*>& top,
const vector<bool>& propagate_down, vector<Blob<Dtype>*>* bottom) {}

protected:
vector<shared_ptr<Filler<Dtype> > > fillers_;
vector<bool> refill_;
};
@@ -198,6 +182,10 @@ class ImageDataLayer : public Layer<Dtype> {
virtual ~ImageDataLayer();
virtual void SetUp(const vector<Blob<Dtype>*>& bottom,
vector<Blob<Dtype>*>* top);
virtual Dtype Forward(const vector<Blob<Dtype>*>& bottom,
vector<Blob<Dtype>*>* top);
virtual void Backward(const vector<Blob<Dtype>*>& top,
const vector<bool>& propagate_down, vector<Blob<Dtype>*>* bottom) { return; }

virtual inline LayerParameter_LayerType type() const {
return LayerParameter_LayerType_IMAGE_DATA;
@@ -206,15 +194,6 @@ class ImageDataLayer : public Layer<Dtype> {
virtual inline int ExactNumTopBlobs() const { return 2; }

protected:
virtual Dtype Forward_cpu(const vector<Blob<Dtype>*>& bottom,
vector<Blob<Dtype>*>* top);
virtual Dtype Forward_gpu(const vector<Blob<Dtype>*>& bottom,
vector<Blob<Dtype>*>* top);
virtual void Backward_cpu(const vector<Blob<Dtype>*>& top,
const vector<bool>& propagate_down, vector<Blob<Dtype>*>* bottom) {}
virtual void Backward_gpu(const vector<Blob<Dtype>*>& top,
const vector<bool>& propagate_down, vector<Blob<Dtype>*>* bottom) {}

virtual void ShuffleImages();

virtual void CreatePrefetchThread();
@@ -244,6 +223,10 @@ class MemoryDataLayer : public Layer<Dtype> {
: Layer<Dtype>(param) {}
virtual void SetUp(const vector<Blob<Dtype>*>& bottom,
vector<Blob<Dtype>*>* top);
virtual Dtype Forward(const vector<Blob<Dtype>*>& bottom,
vector<Blob<Dtype>*>* top);
virtual void Backward(const vector<Blob<Dtype>*>& top,
const vector<bool>& propagate_down, vector<Blob<Dtype>*>* bottom) {}

virtual inline LayerParameter_LayerType type() const {
return LayerParameter_LayerType_MEMORY_DATA;
@@ -260,13 +243,6 @@ class MemoryDataLayer : public Layer<Dtype> {
int batch_size() { return batch_size_; }

protected:
virtual Dtype Forward_cpu(const vector<Blob<Dtype>*>& bottom,
vector<Blob<Dtype>*>* top);
virtual void Backward_cpu(const vector<Blob<Dtype>*>& top,
const vector<bool>& propagate_down, vector<Blob<Dtype>*>* bottom) {}
virtual void Backward_gpu(const vector<Blob<Dtype>*>& top,
const vector<bool>& propagate_down, vector<Blob<Dtype>*>* bottom) {}

Dtype* data_;
Dtype* labels_;
int datum_channels_;
@@ -293,6 +269,10 @@ class WindowDataLayer : public Layer<Dtype> {
virtual ~WindowDataLayer();
virtual void SetUp(const vector<Blob<Dtype>*>& bottom,
vector<Blob<Dtype>*>* top);
virtual Dtype Forward(const vector<Blob<Dtype>*>& bottom,
vector<Blob<Dtype>*>* top);
virtual void Backward(const vector<Blob<Dtype>*>& top,
const vector<bool>& propagate_down, vector<Blob<Dtype>*>* bottom) { return; }

virtual inline LayerParameter_LayerType type() const {
return LayerParameter_LayerType_WINDOW_DATA;
@@ -301,15 +281,6 @@ class WindowDataLayer : public Layer<Dtype> {
virtual inline int ExactNumTopBlobs() const { return 2; }

protected:
virtual Dtype Forward_cpu(const vector<Blob<Dtype>*>& bottom,
vector<Blob<Dtype>*>* top);
virtual Dtype Forward_gpu(const vector<Blob<Dtype>*>& bottom,
vector<Blob<Dtype>*>* top);
virtual void Backward_cpu(const vector<Blob<Dtype>*>& top,
const vector<bool>& propagate_down, vector<Blob<Dtype>*>* bottom) {}
virtual void Backward_gpu(const vector<Blob<Dtype>*>& top,
const vector<bool>& propagate_down, vector<Blob<Dtype>*>* bottom) {}

virtual void CreatePrefetchThread();
virtual void JoinPrefetchThread();
virtual unsigned int PrefetchRand();
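Across the data layers the pattern is the same: the Forward_cpu/Forward_gpu pair and the empty Backward_cpu/Backward_gpu pair are replaced by one virtual Forward and a no-op Backward declared directly in the public section. The actual bodies are in the corresponding .cpp files, not in this header diff; the sketch below only illustrates the shape a unified DataLayer::Forward could take, with the prefetch_data_/prefetch_label_ members and the device_ pointer assumed rather than taken from the PR:

// Hypothetical shape of the unified Forward; the real body lives in
// src/caffe/layers/data_layer.cpp and is not shown in this header diff.
template <typename Dtype>
Dtype DataLayer<Dtype>::Forward(const vector<Blob<Dtype>*>& bottom,
    vector<Blob<Dtype>*>* top) {
  JoinPrefetchThread();  // wait for the batch prefetched in the background
  // One device-independent copy replaces the separate CPU and GPU bodies
  // (commit faef1ff adds Device::copy_from_cpu for exactly this purpose).
  this->device_->copy_from_cpu(prefetch_data_.count(),
      prefetch_data_.cpu_data(), (*top)[0]->mutable_data());
  this->device_->copy_from_cpu(prefetch_label_.count(),
      prefetch_label_.cpu_data(), (*top)[1]->mutable_data());
  CreatePrefetchThread();  // start fetching the next batch
  return Dtype(0.);
}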
16 changes: 8 additions & 8 deletions include/caffe/filler.hpp
@@ -34,7 +34,7 @@ class ConstantFiller : public Filler<Dtype> {
explicit ConstantFiller(const FillerParameter& param)
: Filler<Dtype>(param) {}
virtual void Fill(Blob<Dtype>* blob) {
Dtype* data = blob->mutable_cpu_data();
Dtype* data = blob->mutable_data();
const int count = blob->count();
const Dtype value = this->filler_param_.value();
CHECK(count);
@@ -54,7 +54,7 @@ class UniformFiller : public Filler<Dtype> {
virtual void Fill(Blob<Dtype>* blob) {
CHECK(blob->count());
caffe_rng_uniform<Dtype>(blob->count(), Dtype(this->filler_param_.min()),
Dtype(this->filler_param_.max()), blob->mutable_cpu_data());
Dtype(this->filler_param_.max()), blob->mutable_data());
CHECK_EQ(this->filler_param_.sparse(), -1)
<< "Sparsity not supported by this Filler.";
}
@@ -66,10 +66,10 @@ class GaussianFiller : public Filler<Dtype> {
explicit GaussianFiller(const FillerParameter& param)
: Filler<Dtype>(param) {}
virtual void Fill(Blob<Dtype>* blob) {
Dtype* data = blob->mutable_cpu_data();
Dtype* data = blob->mutable_data();
CHECK(blob->count());
caffe_rng_gaussian<Dtype>(blob->count(), Dtype(this->filler_param_.mean()),
Dtype(this->filler_param_.std()), blob->mutable_cpu_data());
Dtype(this->filler_param_.std()), blob->mutable_data());
int sparse = this->filler_param_.sparse();
CHECK_GE(sparse, -1);
if (sparse >= 0) {
@@ -82,7 +82,7 @@ class GaussianFiller : public Filler<Dtype> {
int num_inputs = blob->height();
Dtype non_zero_probability = Dtype(sparse) / Dtype(num_inputs);
rand_vec_.reset(new SyncedMemory(blob->count() * sizeof(int)));
int* mask = reinterpret_cast<int*>(rand_vec_->mutable_cpu_data());
int* mask = reinterpret_cast<int*>(rand_vec_->mutable_data());
caffe_rng_bernoulli(blob->count(), non_zero_probability, mask);
for (int i = 0; i < blob->count(); ++i) {
data[i] *= mask[i];
@@ -100,9 +100,9 @@ class PositiveUnitballFiller : public Filler<Dtype> {
explicit PositiveUnitballFiller(const FillerParameter& param)
: Filler<Dtype>(param) {}
virtual void Fill(Blob<Dtype>* blob) {
Dtype* data = blob->mutable_cpu_data();
Dtype* data = blob->mutable_data();
DCHECK(blob->count());
caffe_rng_uniform<Dtype>(blob->count(), 0, 1, blob->mutable_cpu_data());
caffe_rng_uniform<Dtype>(blob->count(), 0, 1, blob->mutable_data());
// We expect the filler to not be called very frequently, so we will
// just use a simple implementation
int dim = blob->count() / blob->num();
@@ -139,7 +139,7 @@ class XavierFiller : public Filler<Dtype> {
int fan_in = blob->count() / blob->num();
Dtype scale = sqrt(Dtype(3) / fan_in);
caffe_rng_uniform<Dtype>(blob->count(), -scale, scale,
blob->mutable_cpu_data());
blob->mutable_data());
CHECK_EQ(this->filler_param_.sparse(), -1)
<< "Sparsity not supported by this Filler.";
}
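The filler changes are mechanical: every mutable_cpu_data() call becomes the device-independent mutable_data() added to Blob above, so the same filler code serves whichever backend is active. A short usage sketch using the standard filler API (unchanged by this PR):

#include "caffe/blob.hpp"
#include "caffe/filler.hpp"

void InitWeights(caffe::Blob<float>* weights) {
  caffe::FillerParameter param;
  param.set_std(0.01f);
  caffe::GaussianFiller<float> filler(param);
  // After this PR the filler writes through weights->mutable_data(), so no
  // cpu/gpu-specific call is needed here.
  filler.Fill(weights);
}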
13 changes: 7 additions & 6 deletions include/caffe/layer.hpp
@@ -9,6 +9,7 @@
#include "caffe/blob.hpp"
#include "caffe/common.hpp"
#include "caffe/proto/caffe.pb.h"
#include "caffe/util/device.hpp"

using std::string;
using std::vector;
@@ -43,9 +44,9 @@ class Layer {
// Forward and backward wrappers. You should implement the cpu and
// gpu specific implementations instead, and should not change these
// functions.
inline Dtype Forward(const vector<Blob<Dtype>*>& bottom,
virtual Dtype Forward(const vector<Blob<Dtype>*>& bottom,
vector<Blob<Dtype>*>* top);
inline void Backward(const vector<Blob<Dtype>*>& top,
virtual void Backward(const vector<Blob<Dtype>*>& top,
const vector<bool>& propagate_down,
vector<Blob<Dtype>*>* bottom);

@@ -101,7 +102,7 @@ class Layer {
// Forward functions: compute the layer output
// (and loss layers return the loss; other layers return the dummy value 0.)
virtual Dtype Forward_cpu(const vector<Blob<Dtype>*>& bottom,
vector<Blob<Dtype>*>* top) = 0;
vector<Blob<Dtype>*>* top) { return static_cast<Dtype>(0); }
// If no gpu code is provided, we will simply use cpu code.
virtual Dtype Forward_gpu(const vector<Blob<Dtype>*>& bottom,
vector<Blob<Dtype>*>* top) {
@@ -113,7 +114,7 @@
// for the bottom blobs if propagate_down is true.
virtual void Backward_cpu(const vector<Blob<Dtype>*>& top,
const vector<bool>& propagate_down,
vector<Blob<Dtype>*>* bottom) = 0;
vector<Blob<Dtype>*>* bottom) { return; }
virtual void Backward_gpu(const vector<Blob<Dtype>*>& top,
const vector<bool>& propagate_down,
vector<Blob<Dtype>*>* bottom) {
@@ -165,7 +166,7 @@ class Layer {
// gpu specific implementations instead, and should not change these
// functions.
template <typename Dtype>
inline Dtype Layer<Dtype>::Forward(const vector<Blob<Dtype>*>& bottom,
Dtype Layer<Dtype>::Forward(const vector<Blob<Dtype>*>& bottom,
vector<Blob<Dtype>*>* top) {
switch (Caffe::mode()) {
case Caffe::CPU:
@@ -179,7 +180,7 @@ inline Dtype Layer<Dtype>::Forward(const vector<Blob<Dtype>*>& bottom,
}

template <typename Dtype>
inline void Layer<Dtype>::Backward(const vector<Blob<Dtype>*>& top,
void Layer<Dtype>::Backward(const vector<Blob<Dtype>*>& top,
const vector<bool>& propagate_down,
vector<Blob<Dtype>*>* bottom) {
switch (Caffe::mode()) {
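Making Forward and Backward virtual (and moving their definitions out of line) is what allows a layer to supply one device-neutral implementation instead of the _cpu/_gpu pair, while layers that do not override them still fall through the mode switch shown above; the new default Forward_cpu/Backward_cpu bodies also mean a derived layer is no longer forced to implement the CPU pair. A minimal sketch of a layer written against the new interface (a hypothetical IdentityLayer; the device_ member and the wrapper's copy method are assumed names consistent with the commit messages, not code from the PR):

#include <vector>
#include "caffe/blob.hpp"
#include "caffe/layer.hpp"

namespace caffe {

// Hypothetical device-neutral layer built on the now-virtual Forward/Backward.
template <typename Dtype>
class IdentityLayer : public Layer<Dtype> {
 public:
  explicit IdentityLayer(const LayerParameter& param) : Layer<Dtype>(param) {}
  virtual void SetUp(const std::vector<Blob<Dtype>*>& bottom,
      std::vector<Blob<Dtype>*>* top) {
    (*top)[0]->Reshape(bottom[0]->num(), bottom[0]->channels(),
        bottom[0]->height(), bottom[0]->width());
  }
  // One device-neutral Forward/Backward instead of the _cpu/_gpu pair.
  virtual Dtype Forward(const std::vector<Blob<Dtype>*>& bottom,
      std::vector<Blob<Dtype>*>* top) {
    this->device_->copy(bottom[0]->count(), bottom[0]->const_data(),
        (*top)[0]->mutable_data());
    return Dtype(0.);
  }
  virtual void Backward(const std::vector<Blob<Dtype>*>& top,
      const std::vector<bool>& propagate_down,
      std::vector<Blob<Dtype>*>* bottom) {
    if (propagate_down[0]) {
      this->device_->copy(top[0]->count(), top[0]->const_diff(),
          (*bottom)[0]->mutable_diff());
    }
  }
};

}  // namespace caffe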