From a7e300925cb658b7c3288cb19bd82b74496d4723 Mon Sep 17 00:00:00 2001 From: muli Date: Mon, 14 Sep 2015 23:48:57 -0400 Subject: [PATCH] update doc --- doc/index.md | 3 + doc/python/io.md | 12 -- doc/python/kvstore.md | 3 - doc/python/ndarray.md | 209 -------------------- doc/python/python_api.md | 29 +++ doc/python/python_guide.md | 392 ++++++++++++++++++++++++++++++++++--- doc/python/symbol.md | 169 ---------------- 7 files changed, 402 insertions(+), 415 deletions(-) delete mode 100644 doc/python/io.md delete mode 100644 doc/python/kvstore.md delete mode 100644 doc/python/ndarray.md create mode 100644 doc/python/python_api.md delete mode 100644 doc/python/symbol.md diff --git a/doc/index.md b/doc/index.md index 9d3b7c0e8805..82887d661717 100644 --- a/doc/index.md +++ b/doc/index.md @@ -3,7 +3,10 @@ MXNet Documentation Contents -------- + + * [Python User Guide](python/python_guide.md) +* [Python API](python/python_api.md) * [C++ Developer Guide](cpp/cpp_guide.md) * [Doxygen Version of C++ API](https://mxnet.readthedocs.org/en/latest/doxygen) diff --git a/doc/python/io.md b/doc/python/io.md deleted file mode 100644 index 933945563339..000000000000 --- a/doc/python/io.md +++ /dev/null @@ -1,12 +0,0 @@ -## Data Input and Output - -Mxnet handles IO for you by implementing data iterators. -It is like an iterable class in python, you can traverse the data using a for loop. - - -## IO API Reference - -```eval_rst -.. automodule:: mxnet.io - :members: -``` diff --git a/doc/python/kvstore.md b/doc/python/kvstore.md deleted file mode 100644 index ffd3835b2f0e..000000000000 --- a/doc/python/kvstore.md +++ /dev/null @@ -1,3 +0,0 @@ -# Distributed Key-value Store - -TODO diff --git a/doc/python/ndarray.md b/doc/python/ndarray.md deleted file mode 100644 index 0a342f60428b..000000000000 --- a/doc/python/ndarray.md +++ /dev/null @@ -1,209 +0,0 @@ -# NDArray: Numpy style tensor computations on CPU/GPU - -`NDArray` is the basic operation unit in MXNet for matrix and tensor -computations. It is similar to `numpy.ndarray`, but with two additional -features: - -1. **multiple devices**: all operations can be run on various devices including -CPU and GPU -2. **automatic parallelization**: all operations are automatically executed in - parallel with each other - -## Create and Initialization - -We can create `NDArray` on either GPU or GPU - -```python ->>> import mxnet as mx ->>> a = mx.nd.empty((2, 3)) # create a 2-by-3 matrix on cpu ->>> b = mx.nd.empty((2, 3), mx.gpu()) # create a 2-by-3 matrix on gpu 0 ->>> c = mx.nd.empty((2, 3), mx.gpu(2)) # create a 2-by-3 matrix on gpu 2 ->>> c.shape # get shape -(2L, 3L) ->>> c.context # get device info -Context(device_type=gpu, device_id=2) -``` - -They can be initialized by various ways: - -```python ->>> a = mx.nd.zeros((2, 3)) # create a 2-by-3 matrix and filled with 0 ->>> b = mx.nd.ones((2, 3)) # create a 2-by-3 matrix and filled with 1 ->>> b[:] = 2 # assign all elements of b with 2 -``` - -We can copy the value from one to anther, even if they sit on different devices - -```python ->>> a = mx.nd.ones((2, 3)) ->>> b = mx.nd.zeros((2, 3), mx.gpu()) ->>> a.copyto(b) # copy data from cpu to gpu -``` - -We can also convert `NDArray` to `numpy.ndarray` - -```python ->>> a = mx.nd.ones((2, 3)) ->>> b = a.asnumpy() ->>> type(b) - ->>> print b -[[ 1. 1. 1.] - [ 1. 1. 
1.]] -``` - -and verse vice - -```python ->>> a = mx.nd.empty((2, 3)) ->>> a[:] = np.random.uniform(-0.1, 0.1, a.shape) ->>> print a.asnumpy() -[[-0.06821112 -0.03704893 0.06688045] - [ 0.09947646 -0.07700162 0.07681718]] -``` - -## Basic Operations - -### Elemental-wise operations - -In default, `NDArray` performs elemental-wise operations: - -```python ->>> a = mx.nd.ones((2, 3)) * 2 ->>> b = mx.nd.ones((2, 3)) * 4 ->>> print a.asnumpy() -[[ 4. 4. 4.] - [ 4. 4. 4.]] ->>> c = a + b ->>> print c.asnumpy() -[[ 6. 6. 6.] - [ 6. 6. 6.]] ->>> d = a * b ->>> print d.asnumpy() -[[ 8. 8. 8.] - [ 8. 8. 8.]] -``` - -If two `NDArray` sit on different devices, we need explicitly move them into the -same one. The following example performing computations on GPU 0: - -```python ->>> a = mx.nd.ones((2, 3)) * 2 ->>> b = mx.nd.ones((2, 3), mx.gpu()) * 3 ->>> c = a.copyto(mx.gpu()) * b ->>> print c.asnumpy() -[[ 6. 6. 6.] - [ 6. 6. 6.]] -``` - -### Indexing - -TODO - -### Linear Algebra - -TODO - -## Load and Save - -There are two ways to save data to (load from) disks easily. The first way uses -`pickle`. `NDArray` is pickle compatible, which means you can simply pickle the -NArray like what you did with `numpy.ndarray`. - -```python ->>> import mxnet as mx ->>> import pickle as pkl - ->>> a = mx.nd.ones((2, 3)) * 2 ->>> data = pkl.dumps(a) ->>> b = pkl.loads(data) ->>> print b.asnumpy() -[[ 2. 2. 2.] - [ 2. 2. 2.]] -``` - -On the second way, we directly dump a list of `NDArray` into disk in binary format. - -```python ->>> a = mx.nd.ones((2,3))*2 ->>> b = mx.nd.ones((2,3))*3 ->>> mx.nd.save('mydata.bin', [a, b]) ->>> c = mx.nd.load('mydata.bin') ->>> print c[0].asnumpy() -[[ 2. 2. 2.] - [ 2. 2. 2.]] ->>> print c[1].asnumpy() -[[ 3. 3. 3.] - [ 3. 3. 3.]] -``` - -We can also dump a dict. - -```python ->>> mx.nd.save('mydata.bin', {'a':a, 'b':b}) ->>> c = mx.nd.load('mydata.bin') ->>> print c['a'].asnumpy() -[[ 2. 2. 2.] - [ 2. 2. 2.]] ->>> print c['b'].asnumpy() -[[ 3. 3. 3.] - [ 3. 3. 3.]] -``` - -In addition, we have setup the distributed filesystem such as S3 and HDFS, we -can directly save to and load from them. For example: - -```python ->>> mx.nd.save('s3://mybucket/mydata.bin', [a,b]) ->>> mx.nd.save('hdfs///users/myname/mydata.bin', [a,b]) -``` - -## Parallelization - -The operations of `NDArray` are executed by third libraries such as `cblas`, -`mkl`, and `cuda`. In default, each operation is executed by multi-threads. In -addition, `NDArray` can execute operations in parallel. It is desirable when we -use multiple resources such as CPU, GPU cards, and CPU-to-GPU memory bandwidth. - -For example, if we write `a += 1` followed by `b += 1`, and `a` is on CPU while -`b` is on GPU, then want to execute them in parallel to improve the -efficiency. Furthermore, data copy between CPU and GPU are also expensive, we -hope to run it parallel with other computations as well. - -However, finding the codes can be executed in parallel by eye is hard. In the -following example, `a+=1` and `c*=3` can be executed in parallel, but `a+=1` and -`b*=3` should be in sequential. - -```python -a = mx.nd.ones((2,3)) -b = a -c = a.copyto(mx.cpu()) -a += 1 -b *= 3 -c *= 3 -``` - -Luckily, MXNet can automatically resolve the dependencies and -execute operations in parallel with correctness guaranteed. In other words, we -can write program as by assuming there is only a single thread, while MXNet will -automatically dispatch it into multi-devices, such as multi GPU cards or multi -machines. 
-
-It is achieved by lazy evaluation. Any operation we write down is issued into a
-internal DAG engine, and then returned. For example, if we run `a += 1`, it
-returns immediately after pushing the plus operator to the engine. This
-asynchronous allows us to push more operators to the engine, so it can determine
-the read and write dependency and find a best way to execute them in
-parallel.
-
-The actual computations are finished if we want to copy the results into some
-other place, such as `print a.asnumpy()` or `mx.nd.save([a])`. Therefore, if we
-want to write highly parallelized codes, we only need to postpone when we need
-the results.
-
-## NDArray API
-
-```eval_rst
-.. automodule:: mxnet.ndarray
-    :members:
-```
diff --git a/doc/python/python_api.md b/doc/python/python_api.md
new file mode 100644
index 000000000000..12b2af026f8b
--- /dev/null
+++ b/doc/python/python_api.md
@@ -0,0 +1,29 @@
+# MXNet Python API
+
+## NDArray API
+
+```eval_rst
+.. automodule:: mxnet.ndarray
+    :members:
+```
+
+## Symbol API
+
+```eval_rst
+.. automodule:: mxnet.symbol
+    :members:
+```
+
+## Executor API
+
+```eval_rst
+.. automodule:: mxnet.executor
+    :members:
+```
+
+## IO API
+
+```eval_rst
+.. automodule:: mxnet.io
+    :members:
+```
diff --git a/doc/python/python_guide.md b/doc/python/python_guide.md
index 440131eacb3c..f3c70af4709b 100644
--- a/doc/python/python_guide.md
+++ b/doc/python/python_guide.md
@@ -4,37 +4,385 @@
 This page gives a general overview of the MXNet python package. MXNet contains a
 mixed flavor of elements you might need to bake flexible and efficient
 applications. There are mainly three concepts in MXNet:
 
-* Numpy style `NDArray` offers matrix and tensor computations on both CPU and
+* Numpy style [NDArray](#ndarray-numpy-style-tensor-computations-on-cpu-gpu) offers matrix and tensor computations on both CPU and
 GPU, with automatic parallelization
-* `Symbol` makes defining a neural network extremely easy, and it provides
+* [Symbol](#symbolic-and-automatic-differentiation) makes defining a neural network extremely easy, and it provides
 automatic differentiation.
-* `KVStore` allows data synchronization between multi-GPUs and multi-machine
-  easily
+* [KVStore](#distributed-key-value-store) allows easy data synchronization
+  between multiple GPUs and multiple machines
 
-**Table of contents**
+## NDArray: Numpy style tensor computations on CPU/GPU
 
-```eval_rst
-.. toctree::
-  :maxdepth: 2
+`NDArray` is the basic operation unit in MXNet for matrix and tensor
+computations. It is similar to `numpy.ndarray`, but with two additional
+features:
 
-  ndarray
-  symbol
-  kvstore
-  io
-```
+1. **multiple devices**: all operations can be run on various devices including
+CPU and GPU
+2. **automatic parallelization**: all operations are automatically executed in
+   parallel with each other
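+
+Many examples below place arrays on a GPU with `mx.gpu()`. They assume MXNet
+was built with CUDA support and that a GPU is present; on a CPU-only build
+they will raise an error. A minimal sketch for keeping such code portable
+(the try/except probe below is our own convention, not an official API) is:
+
+```python
+import mxnet as mx
+
+try:
+    # force a tiny allocation on GPU 0; asnumpy() makes the lazy
+    # execution engine actually run it, surfacing any error here
+    mx.nd.zeros((1,), mx.gpu()).asnumpy()
+    ctx = mx.gpu()
+except Exception:
+    ctx = mx.cpu()
+
+a = mx.nd.ones((2, 3), ctx)  # lands on whichever device was selected
+```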
+
+### Create and Initialization
+
+We can create `NDArray` on either CPU or GPU:
+
+```python
+>>> import mxnet as mx
+>>> a = mx.nd.empty((2, 3)) # create a 2-by-3 matrix on cpu
+>>> b = mx.nd.empty((2, 3), mx.gpu()) # create a 2-by-3 matrix on gpu 0
+>>> c = mx.nd.empty((2, 3), mx.gpu(2)) # create a 2-by-3 matrix on gpu 2
+>>> c.shape # get shape
+(2L, 3L)
+>>> c.context # get device info
+Context(device_type=gpu, device_id=2)
+```
+
+They can be initialized in various ways:
+
+```python
+>>> a = mx.nd.zeros((2, 3)) # create a 2-by-3 matrix filled with 0
+>>> b = mx.nd.ones((2, 3))  # create a 2-by-3 matrix filled with 1
+>>> b[:] = 2 # assign all elements of b the value 2
+```
+
+We can copy the value from one `NDArray` to another, even if they sit on
+different devices:
+
+```python
+>>> a = mx.nd.ones((2, 3))
+>>> b = mx.nd.zeros((2, 3), mx.gpu())
+>>> a.copyto(b) # copy data from cpu to gpu
+```
+
+We can also convert an `NDArray` to a `numpy.ndarray`:
+
+```python
+>>> a = mx.nd.ones((2, 3))
+>>> b = a.asnumpy()
+>>> type(b)
+<type 'numpy.ndarray'>
+>>> print b
+[[ 1.  1.  1.]
+ [ 1.  1.  1.]]
+```
+
+and vice versa:
+
+```python
+>>> import numpy as np
+>>> a = mx.nd.empty((2, 3))
+>>> a[:] = np.random.uniform(-0.1, 0.1, a.shape)
+>>> print a.asnumpy()
+[[-0.06821112 -0.03704893  0.06688045]
+ [ 0.09947646 -0.07700162  0.07681718]]
+```
+
+### Basic Operations
+
+#### Element-wise operations
+
+By default, `NDArray` performs element-wise operations:
+
+```python
+>>> a = mx.nd.ones((2, 3)) * 2
+>>> b = mx.nd.ones((2, 3)) * 4
+>>> print a.asnumpy()
+[[ 2.  2.  2.]
+ [ 2.  2.  2.]]
+>>> c = a + b
+>>> print c.asnumpy()
+[[ 6.  6.  6.]
+ [ 6.  6.  6.]]
+>>> d = a * b
+>>> print d.asnumpy()
+[[ 8.  8.  8.]
+ [ 8.  8.  8.]]
+```
+
+If two `NDArray` objects sit on different devices, we need to explicitly move
+them to the same one. The following example performs computations on GPU 0:
+
+```python
+>>> a = mx.nd.ones((2, 3)) * 2
+>>> b = mx.nd.ones((2, 3), mx.gpu()) * 3
+>>> c = a.copyto(mx.gpu()) * b
+>>> print c.asnumpy()
+[[ 6.  6.  6.]
+ [ 6.  6.  6.]]
+```
+
+#### Indexing
+
+TODO
+
+#### Linear Algebra
+
+TODO
+
+### Load and Save
+
+There are two easy ways to save data to (and load it from) disk. The first
+uses `pickle`. `NDArray` is pickle compatible, which means you can simply
+pickle an `NDArray` as you would a `numpy.ndarray`:
+
+```python
+>>> import mxnet as mx
+>>> import pickle as pkl
+
+>>> a = mx.nd.ones((2, 3)) * 2
+>>> data = pkl.dumps(a)
+>>> b = pkl.loads(data)
+>>> print b.asnumpy()
+[[ 2.  2.  2.]
+ [ 2.  2.  2.]]
+```
+
+The second way is to directly dump a list of `NDArray` to disk in binary
+format:
+
+```python
+>>> a = mx.nd.ones((2,3))*2
+>>> b = mx.nd.ones((2,3))*3
+>>> mx.nd.save('mydata.bin', [a, b])
+>>> c = mx.nd.load('mydata.bin')
+>>> print c[0].asnumpy()
+[[ 2.  2.  2.]
+ [ 2.  2.  2.]]
+>>> print c[1].asnumpy()
+[[ 3.  3.  3.]
+ [ 3.  3.  3.]]
+```
+
+We can also dump a dict:
+
+```python
+>>> mx.nd.save('mydata.bin', {'a':a, 'b':b})
+>>> c = mx.nd.load('mydata.bin')
+>>> print c['a'].asnumpy()
+[[ 2.  2.  2.]
+ [ 2.  2.  2.]]
+>>> print c['b'].asnumpy()
+[[ 3.  3.  3.]
+ [ 3.  3.  3.]]
+```
+
+In addition, MXNet is set up to work with distributed filesystems such as S3
+and HDFS, so we can directly save to and load from them. For example:
+
+```python
+>>> mx.nd.save('s3://mybucket/mydata.bin', [a,b])
+>>> mx.nd.save('hdfs:///users/myname/mydata.bin', [a,b])
+```
+
+### Parallelization
+
+The operations of `NDArray` are executed by third-party libraries such as
+`cblas`, `mkl`, and `cuda`. By default, each operation is executed with
+multiple threads. In addition, `NDArray` can execute operations in parallel,
+which is desirable when we use multiple resources such as CPUs, GPU cards, and
+CPU-to-GPU memory bandwidth.
+
+For example, if we write `a += 1` followed by `b += 1`, and `a` is on CPU
+while `b` is on GPU, then we want to execute them in parallel to improve
+efficiency. Furthermore, data copies between CPU and GPU are also expensive,
+so we hope to run them in parallel with other computations as well.
+
+However, finding by eye which code can be executed in parallel is hard. In the
+following example, `a += 1` and `c *= 3` can be executed in parallel, but
+`a += 1` and `b *= 3` must be executed sequentially.
+
+```python
+a = mx.nd.ones((2,3))
+b = a
+c = a.copyto(mx.cpu())
+a += 1
+b *= 3
+c *= 3
+```
+
+Luckily, MXNet can automatically resolve the dependencies and execute
+operations in parallel with correctness guaranteed. In other words, we can
+write a program as if it were single-threaded, and MXNet will automatically
+dispatch it onto multiple devices, such as multiple GPU cards or multiple
+machines.
+
+This is achieved by lazy evaluation. Every operation we write down is issued
+to an internal DAG engine, and the call then returns. For example, if we run
+`a += 1`, it returns immediately after pushing the plus operator to the
+engine. This asynchrony allows us to push more operators to the engine, so the
+engine can determine the read and write dependencies and find the best way to
+execute the operations in parallel.
+
+The actual computations are finished when we copy the results to some other
+place, such as `print a.asnumpy()` or `mx.nd.save([a])`. Therefore, to write
+highly parallelized code, we only need to postpone asking for the results.
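+
+To see the lazy evaluation in action, one can time how quickly operations
+return versus how long the results take to arrive. The sketch below is
+illustrative only; absolute timings depend on your machine:
+
+```python
+import time
+import mxnet as mx
+
+a = mx.nd.ones((1000, 1000))
+
+tic = time.time()
+for i in range(10):
+    a += 1             # returns as soon as the op is queued in the engine
+print 'pushed 10 ops in %f sec' % (time.time() - tic)
+
+tic = time.time()
+b = a.asnumpy()        # copying out blocks until all queued work is done
+print 'results ready in %f sec' % (time.time() - tic)
+```
+
+The first timer measures only how long it takes to queue the additions; the
+second includes the actual computation, since `asnumpy()` must wait for it.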
+
+## Symbolic and Automatic Differentiation
+
+Now you have seen the power of MXNet's NDArray. It seems interesting, and we
+are ready to build some real deep learning. Hmm, this seems to be really
+exciting, but wait, do we need to build things from scratch? Do we need to
+re-implement all the layers in deep learning toolkits such as
+[CXXNet](https://github.com/dmlc/cxxnet) on top of NDArray? Well, you do not
+have to. There is a Symbolic API in MXNet that readily helps you do all of
+this.
+
+More importantly, the Symbolic API is designed to bring in the advantages of
+C++ static layers (operators) to ***maximally optimize performance and memory
+usage***, doing even better than CXXNet. Sounds exciting? Let us get started
+on this.
+
+### Creating Symbols
+
+A common way to create a neural network is to create it via some kind of
+configuration file or API. The following code creates the configuration of a
+two-layer perceptron.
+
+```python
+import mxnet.symbol as sym
+data = sym.Variable('data')
+net = sym.FullyConnected(data=data, name='fc1', num_hidden=128)
+net = sym.Activation(data=net, name='relu1', act_type="relu")
+net = sym.FullyConnected(data=net, name='fc2', num_hidden=10)
+net = sym.Softmax(data=net, name='sm')
+```
+
+If you are familiar with tools such as cxxnet or caffe, the ```Symbol``` object
+is like a configuration file that specifies the network structure. If you are
+more familiar with tools like theano, the ```Symbol``` object is something
+that defines the computation graph. Basically, it creates a computation graph
+that defines the forward pass of the neural network.
+
+The Configuration API allows you to define the computation graph via
+composition. If you have not used symbolic configuration tools like theano
+before, one thing to note is that the ```net``` can also be viewed as a
+function that has input arguments.
+
+You can get the list of arguments by calling ```Symbol.list_arguments```.
+
+```python
+>>> net.list_arguments()
+['data', 'fc1_weight', 'fc1_bias', 'fc2_weight', 'fc2_bias']
+```
+
+In our example, you can find that the arguments contain the parameters of each
+layer, as well as the input data. One thing worth noticing is that argument
+names like ```fc1_weight``` are automatically generated when they are not
+specified during the creation of fc1. You can also specify them explicitly,
+as in the following code.
+
+```python
+>>> import mxnet.symbol as sym
+>>> data = sym.Variable('data')
+>>> w = sym.Variable('myweight')
+>>> net = sym.FullyConnected(data=data, weight=w,
+                             name='fc1', num_hidden=128)
+>>> net.list_arguments()
+['data', 'myweight', 'fc1_bias']
+```
+
+Besides coarse-grained neural network operators such as FullyConnected and
+Convolution, MXNet also provides fine-grained operations such as element-wise
+addition and multiplication. The following example first performs an
+element-wise add between two symbols, then feeds the result to the
+FullyConnected operator.
+
+```python
+>>> import mxnet.symbol as sym
+>>> lhs = sym.Variable('data1')
+>>> rhs = sym.Variable('data2')
+>>> net = sym.FullyConnected(data=lhs + rhs,
+                             name='fc1', num_hidden=128)
+>>> net.list_arguments()
+['data1', 'data2', 'fc1_weight', 'fc1_bias']
+```
+
+### More Complicated Composition
+
+In the previous example, symbols are constructed in a forward compositional
+way. Besides that, you can also treat composed symbols as functions, and apply
+them to existing symbols.
+
+```python
+>>> import mxnet.symbol as sym
+>>> data = sym.Variable('data')
+>>> net = sym.FullyConnected(data=data,
+                             name='fc1', num_hidden=128)
+>>> net.list_arguments()
+['data', 'fc1_weight', 'fc1_bias']
+>>> data2 = sym.Variable('data2')
+>>> in_net = sym.FullyConnected(data=data2,
+                                name='in', num_hidden=128)
+>>> composed_net = net(data=in_net, name='compose')
+>>> composed_net.list_arguments()
+['data2', 'in_weight', 'in_bias', 'compose_fc1_weight', 'compose_fc1_bias']
+```
+
+In the above example, net is used as a function to apply to the existing
+symbol ```in_net```; in the resulting composed_net, the original ```data```
+is replaced by ```in_net```. This is useful when you want to change the input
+of a network to another structure.
+
+### Shape Inference
+
+Now we have defined the computation graph. A common problem with a computation
+graph is to figure out the shape of each parameter. Usually, we want to know
+the shapes of all the weights, biases, and outputs.
+
+You can use ```Symbol.infer_shape``` to do that. The shape inference function
+allows you to pass in the shapes of the arguments that you know, and it will
+try to infer the shapes of all arguments and outputs.
+
+```python
+>>> import mxnet.symbol as sym
+>>> data = sym.Variable('data')
+>>> net = sym.FullyConnected(data=data, name='fc1',
+                             num_hidden=10)
+>>> arg_shape, out_shape = net.infer_shape(data=(100, 100))
+>>> dict(zip(net.list_arguments(), arg_shape))
+{'data': (100, 100), 'fc1_weight': (10, 100), 'fc1_bias': (10,)}
+>>> out_shape
+[(100, 10)]
+```
+
+In common practice, you only need to provide the shape of the input data, and
+the function will automatically infer the shapes of all the parameters. You
+can also provide more shape information, such as the shapes of weights. The
+```infer_shape``` function will detect inconsistencies in the shapes and raise
+an error if some of them are inconsistent.
+
+### Bind the Symbols
+
+Symbols are configuration objects that represent a computation graph (a
+configuration of a neural network). So far we have introduced how to build up
+the computation graph (i.e. a configuration). The remaining question is, how
+can we do computation using the defined graph?
+
+TODO.
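+
+While this section is still to be written, the sketch below shows roughly
+what binding looks like, assuming an executor interface along the lines of
+`Symbol.bind(ctx, args)` returning an object with `forward()` and `outputs`.
+Treat the exact names and signatures as illustrative rather than
+authoritative, and see the Executor API reference for the real interface.
+
+```python
+>>> import mxnet as mx
+>>> # reuse `net` from the shape inference example above
+>>> arg_shapes, out_shapes = net.infer_shape(data=(100, 100))
+>>> # allocate one NDArray per argument, in list_arguments() order
+>>> args = [mx.nd.zeros(shape) for shape in arg_shapes]
+>>> ex = net.bind(mx.cpu(), args)  # bind the graph to concrete arrays
+>>> ex.forward()                   # run the graph on the bound arrays
+>>> ex.outputs[0].shape            # one output NDArray per graph output
+(100L, 10L)
+```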
+
+### How Efficient is the Symbolic API?
+
+In short, it is designed to be very efficient in both memory and runtime.
+
+The major reason for us to introduce the Symbolic API is to bring the
+efficient C++ operations of powerful toolkits such as cxxnet and caffe
+together with the flexible dynamic NDArray operations. All the memory and
+computation resources are allocated statically during bind, to maximize
+runtime performance and memory utilization.
+
+The coarse-grained operators are equivalent to cxxnet layers, which are
+extremely efficient. We also provide fine-grained operators for more flexible
+composition. Because we also do more in-place memory allocation, MXNet can be
+***more memory efficient*** than cxxnet, and achieves the same runtime with
+greater flexibility.
+
+## Distributed Key-value Store
+
+TODO.
+
+## How to Choose between APIs
-
-
-
-
-
-
+
+You can mix them all as much as you like. Here are some guidelines:
+
+* Use the Symbolic API and coarse-grained operators to create established
+  structures.
+* Use fine-grained operators to extend parts of a more flexible symbolic
+  graph.
+* Do some dynamic NDArray tricks, which are even more flexible, between the
+  calls of forward and backward on executors, as sketched below.
-
-
-
-
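+
+As a concrete (and purely illustrative) sketch of that last point, here is a
+hand-rolled SGD step done with plain NDArray arithmetic between executor
+calls. It assumes an executor `ex` that was bound with gradient arrays in
+addition to the arguments, and `arg_arrays`/`grad_arrays` attributes as in
+the Executor API reference; the learning rate and the `is_train` flag are
+placeholders, not guaranteed parts of the interface:
+
+```python
+# one hypothetical training step mixing symbolic and NDArray styles
+ex.forward(is_train=True)
+ex.backward()
+for w, g in zip(ex.arg_arrays, ex.grad_arrays):
+    if g is None:
+        continue               # e.g. the input data needs no update
+    w[:] = w - 0.01 * g        # plain NDArray math as the SGD update
+```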
+
+We believe that different approaches offer different levels of flexibility and
+efficiency. Normally you do not need to be flexible in all parts of the
+network, so we allow you to use the fast optimized parts and compose them
+flexibly with fine-grained operators or dynamic NDArray. We believe such a
+mixture allows you to build the deep learning architecture of your choice both
+efficiently and flexibly. To mix is to maximize performance and flexibility.
diff --git a/doc/python/symbol.md b/doc/python/symbol.md
deleted file mode 100644
index b306ddc9b7c0..000000000000
--- a/doc/python/symbol.md
+++ /dev/null
@@ -1,169 +0,0 @@
-# Symbolic and Automatic Differentiation
-
-Now you have seen the power of NArray of MXNet. It seems to be interesting and
-we are ready to build some real deep learning. Hmm, this seems to be really
-exciting, but wait, do we need to build things from scratch? It seems that we
-need to re-implement all the layers in deep learning toolkits such as
-[CXXNet](https://github.com/dmlc/cxxnet) in NArray? Well, you do not have
-to. There is a Symbolic API in MXNet that readily helps you to do all these.
- -More importantly, the Symbolic API is designed to bring in the advantage of C++ -static layers(operators) to ***maximumly optimizes the performance and memory*** -that is even better than CXXNet. Sounds exciting? Let us get started on this. - -## Creating Symbols - -A common way to create a neural network is to create it via some way of -configuration file or API. The following code creates a configuration two layer -perceptrons. - -```python -import mxnet.symbol as sym -data = sym.Variable('data') -net = sym.FullyConnected(data=data, name='fc1', num_hidden=128) -net = sym.Activation(data=net, name='relu1', act_type="relu") -net = sym.FullyConnected(data=net, name='fc2', num_hidden=10) -net = sym.Softmax(data=net, name = 'sm') -``` - -If you are familiar with tools such as cxxnet or caffe, the ```Symbol``` object -is like configuration files that configures the network structure. If you are -more familiar with tools like theano, the ```Symbol``` object something that -defines the computation graph. Basically, it creates a computation graph that -defines the forward pass of neural network. - -The Configuration API allows you to define the computation graph via -compositions. If you have not used symbolic configuration tools like theano -before, one thing to note is that the ```net``` can also be viewed as function -that have input arguments. - -You can get the list of arguments by calling ```Symbol.list_arguments```. - -```python ->>> net.list_arguments() -['data', 'fc1_weight', 'fc1_bias', 'fc2_weight', 'fc2_bias'] -``` - -In our example, you can find that the arguments contains the parameters in each -layer, as well as input data. One thing that worth noticing is that the -argument names like ```fc1_weight``` are automatically generated because it was -not specified in creation of fc1. You can also specify it explicitly, like the -following code. - -```python ->>> import mxnet.symbol as sym ->>> data = sym.Variable('data') ->>> w = sym.Variable('myweight') ->>> net = sym.FullyConnected(data=data, weight=w, - name='fc1', num_hidden=128) ->>> net.list_arguments() -['data', 'myweight', 'fc1_bias'] -``` - -Besides the coarse grained neuralnet operators such as FullyConnected, -Convolution. MXNet also provides fine graned operations such as elementwise -add, multiplications. The following example first performs an elementwise add -between two symbols, then feed them to the FullyConnected operator. - -```python ->>> import mxnet.symbol as sym ->>> lhs = sym.Variable('data1') ->>> rhs = sym.Variable('data2') ->>> net = sym.FullyConnected(data=lhs + rhs, - name='fc1', num_hidden=128) ->>> net.list_arguments() -['data1', 'data2', 'fc1_weight', 'fc1_bias'] -``` - -## More Complicated Composition - -In the previous example, Symbols are constructed in a forward compositional way. -Besides doing things in a forward compistion way. You can also treat composed -symbols as functions, and apply them to existing symbols. 
- -```python ->>> import mxnet.symbol as sym ->>> data = sym.Variable('data') ->>> net = sym.FullyConnected(data=data, - name='fc1', num_hidden=128) ->>> net.list_arguments() -['data', 'fc1_weight', 'fc1_bias'] ->>> data2 = sym.Variable('data2') ->>> in_net = sym.FullyConnected(data=data, - name='in', num_hidden=128) ->>> composed_net = net(data=in_net, name='compose') ->>> composed_net.list_arguments() -['data2', 'in_weight', 'in_bias', 'compose_fc1_weight', 'compose_fc1_bias'] -``` - -In the above example, net is used a function to apply to an existing symbol -```in_net```, the resulting composed_net will replace the original ```data``` by -the the in_net instead. This is useful when you want to change the input of some -neural-net to be other structure. - -## Shape Inference - -Now we have defined the computation graph. A common problem in the computation -graph, is to figure out shapes of each parameters. Usually, we want to know the -shape of all the weights, bias and outputs. - -You can use ```Symbol.infer_shape``` to do that. THe shape inference function -allows you to pass in shapes of arguments that you know, -and it will try to infer the shapes of all arguments and outputs. - -```python ->>> import mxnet.symbol as sym ->>> data = sym.Variable('data') ->>> net = sym.FullyConnected(data=data, name='fc1', - num_hidden=10) ->>> arg_shape, out_shape = net.infer_shape(data=(100, 100)) ->>> dict(zip(net.list_arguments(), arg_shape)) -{'data': (100, 100), 'fc1_weight': (10, 100), 'fc1_bias': (10,)} ->>> out_shape -[(100, 10)] -``` - -In common practice, you only need to provide the shape of input data, and it -will automatically infers the shape of all the parameters. You can always also -provide more shape information, such as shape of weights. The ```infer_shape``` -will detect if there is inconsitency in the shapes, and raise an Error if some -of them are inconsistent. - -## Bind the Symbols - -Symbols are configuration objects that represents a computation graph (a -configuration of neuralnet). So far we have introduced how to build up the -computation graph (i.e. a configuration). The remaining question is, how we can -do computation using the defined graph. - -TODO. - -## How Efficient is Symbolic API - -In short, they design to be very efficienct in both memory and runtime. - -The major reason for us to introduce Symbolic API, is to bring the efficient C++ -operations in powerful toolkits such as cxxnet and caffe together with the -flexible dynamic NArray operations. All the memory and computation resources are -allocated statically during Bind, to maximize the runtime performance and memory -utilization. - -The coarse grained operators are equivalent to cxxnet layers, which are -extremely efficient. We also provide fine grained operators for more flexible -composition. Because we are also doing more inplace memory allocation, mxnet can -be ***more memory efficient*** than cxxnet, and gets to same runtime, with -greater flexiblity. - -## Symbol API - -```eval_rst -.. automodule:: mxnet.symbol - :members: -``` - - -## Executor API -```eval_rst -.. automodule:: mxnet.executor - :members: -```