Problem with the exponential linear unit (ELU) #743
Description
I found that ELU is already implemented in mshadow_op.h, so I added it to activation-inl.h and tried to test it.
However, training seems to diverge after a few epochs.
I use the network from Section 4.2 of the ELU paper; part of my training log follows.
2015-11-29 04:05:03,875 Node[0] Epoch[2] Train-accuracy=0.350284
2015-11-29 04:05:03,875 Node[0] Epoch[2] Time cost=79.385
2015-11-29 04:05:09,214 Node[0] Epoch[2] Validation-accuracy=0.413662
2015-11-29 04:05:19,073 Node[0] Iter[3] Batch [50] Speed: 655.08 samples/sec
2015-11-29 04:05:29,146 Node[0] Iter[3] Batch [100] Speed: 635.41 samples/sec
2015-11-29 04:05:39,230 Node[0] Iter[3] Batch [150] Speed: 634.68 samples/sec
2015-11-29 04:05:49,465 Node[0] Iter[3] Batch [200] Speed: 625.30 samples/sec
2015-11-29 04:05:59,699 Node[0] Iter[3] Batch [250] Speed: 625.40 samples/sec
2015-11-29 04:06:09,897 Node[0] Iter[3] Batch [300] Speed: 627.58 samples/sec
2015-11-29 04:06:20,124 Node[0] Iter[3] Batch [350] Speed: 625.79 samples/sec
2015-11-29 04:06:28,728 Node[0] Epoch[3] Train-accuracy=0.372582
2015-11-29 04:06:28,728 Node[0] Epoch[3] Time cost=79.514
2015-11-29 04:06:34,069 Node[0] Epoch[3] Validation-accuracy=0.418069
2015-11-29 04:06:44,051 Node[0] Iter[4] Batch [50] Speed: 645.99 samples/sec
2015-11-29 04:06:54,273 Node[0] Iter[4] Batch [100] Speed: 626.14 samples/sec
2015-11-29 04:07:04,505 Node[0] Iter[4] Batch [150] Speed: 625.51 samples/sec
2015-11-29 04:07:14,749 Node[0] Iter[4] Batch [200] Speed: 624.71 samples/sec
2015-11-29 04:07:24,988 Node[0] Iter[4] Batch [250] Speed: 625.13 samples/sec
2015-11-29 04:07:35,207 Node[0] Iter[4] Batch [300] Speed: 626.25 samples/sec
2015-11-29 04:07:45,444 Node[0] Iter[4] Batch [350] Speed: 625.23 samples/sec
2015-11-29 04:07:54,027 Node[0] Epoch[4] Train-accuracy=0.293998
2015-11-29 04:07:54,027 Node[0] Epoch[4] Time cost=79.958
2015-11-29 04:07:59,366 Node[0] Epoch[4] Validation-accuracy=0.153646
2015-11-29 04:08:09,329 Node[0] Iter[5] Batch [50] Speed: 648.67 samples/sec
2015-11-29 04:08:19,419 Node[0] Iter[5] Batch [100] Speed: 634.29 samples/sec
2015-11-29 04:08:29,292 Node[0] Iter[5] Batch [150] Speed: 648.30 samples/sec
2015-11-29 04:08:39,158 Node[0] Iter[5] Batch [200] Speed: 648.67 samples/sec
2015-11-29 04:08:49,020 Node[0] Iter[5] Batch [250] Speed: 648.96 samples/sec
2015-11-29 04:08:58,878 Node[0] Iter[5] Batch [300] Speed: 649.25 samples/sec
2015-11-29 04:09:08,664 Node[0] Iter[5] Batch [350] Speed: 653.99 samples/sec
2015-11-29 04:09:16,865 Node[0] Epoch[5] Train-accuracy=0.103061
2015-11-29 04:09:16,866 Node[0] Epoch[5] Time cost=77.499
2015-11-29 04:09:21,915 Node[0] Epoch[5] Validation-accuracy=0.100060
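
A drop to roughly 10% accuracy like the one in the log can have many causes (learning rate, initialization, a mismatch between forward and backward pass). One cheap thing to rule out is the ELU gradient itself. The sketch below is not the mshadow code; it is a standalone Python check, and the function names and `ALPHA = 1.0` are assumptions made for illustration. It compares two analytic forms of the ELU derivative, one written in terms of the input and one in terms of the output, against a finite-difference estimate.

```python
import math

ALPHA = 1.0  # assumed value; the issue does not state which alpha was used


def elu(x, alpha=ALPHA):
    """ELU forward pass: x for x > 0, alpha * (exp(x) - 1) otherwise."""
    return x if x > 0 else alpha * (math.exp(x) - 1.0)


def elu_grad_from_input(x, alpha=ALPHA):
    """Derivative of ELU expressed in terms of the input x."""
    return 1.0 if x > 0 else alpha * math.exp(x)


def elu_grad_from_output(y, alpha=ALPHA):
    """Derivative of ELU expressed in terms of the output y = elu(x).

    For x <= 0, y = alpha * (exp(x) - 1), so alpha * exp(x) = y + alpha.
    """
    return 1.0 if y > 0 else y + alpha


def numeric_grad(f, x, eps=1e-6):
    """Central finite-difference estimate of df/dx at x."""
    return (f(x + eps) - f(x - eps)) / (2.0 * eps)


def check(points=(-3.0, -1.0, -0.1, 0.5, 2.0), tol=1e-5):
    """Verify that both analytic forms agree with the numeric estimate."""
    for x in points:
        ref = numeric_grad(elu, x)
        assert abs(elu_grad_from_input(x) - ref) < tol
        assert abs(elu_grad_from_output(elu(x)) - ref) < tol
    return True


if __name__ == "__main__":
    print("gradient check passed:", check())
```

The two forms are only interchangeable if each receives the value it expects. Feeding the input `x` into the output-based formula gives `x + alpha`, which is negative for `x < -alpha` and therefore flips the sign of the gradient. Whether that is what happens here depends on which tensor activation-inl.h passes to the gradient op, which the issue does not show.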