Hi,
I was testing several BLAS implementations to see the performance difference. I'm using the MNIST dataset as instructed in its tutorial, but with max_iter set to 1000.
I just discovered that training (using train_lenet.sh) is significantly slower on the dev branch than on the master branch. I tested on two different machines. The results below are from an Intel Xeon W3530 (Nehalem) CPU, training in CPU mode. Is this an expected slowdown caused by some implementation change?
atlas-sse3 - fedora 19 x86_64 (dev branch)
-------------------------------------------------------
I0828 17:59:20.025907 20321 solver.cpp:302] Test net output #0: accuracy = 0.9788
I0828 17:59:20.025959 20321 solver.cpp:302] Test net output #1: loss = 0.0642497 (* 1 = 0.0642497 loss)
I0828 17:59:20.025979 20321 solver.cpp:237] Optimization Done.
I0828 17:59:20.025987 20321 caffe.cpp:113] Optimization Done.
real    6m11.887s
user    6m31.207s
sys     0m1.324s
atlas-sse3 - fedora 19 x86_64 (master branch)
-----------------------------------------------------------
I0828 18:06:28.892992 11738 solver.cpp:270] Test score #0: 0.9776
I0828 18:06:28.893049 11738 solver.cpp:270] Test score #1: 0.0670089
I0828 18:06:28.893060 11738 solver.cpp:218] Optimization Done.
I0828 18:06:28.893131 11738 caffe.cpp:113] Optimization Done.
real    4m6.125s
user    4m5.772s
sys     0m0.140s
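
To put a number on the gap, here is a small sketch that converts the two wall-clock (real) times reported above to seconds and computes their ratio. The helper name parse_time is my own and not part of Caffe, and it assumes the shell's usual `<minutes>m<seconds>s` format for timing output.

```python
import re


def parse_time(value):
    """Convert a shell timing value such as '6m11.887s' to seconds.

    Hypothetical helper for illustration; assumes '<minutes>m<seconds>s'.
    """
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", value)
    if match is None:
        raise ValueError(f"unrecognized time value: {value!r}")
    minutes, seconds = match.groups()
    return int(minutes) * 60 + float(seconds)


# Wall-clock times copied from the two runs above.
dev_real = parse_time("6m11.887s")
master_real = parse_time("4m6.125s")

# Ratio > 1 means the dev branch took longer than master.
slowdown = dev_real / master_real
print(f"dev: {dev_real:.3f}s, master: {master_real:.3f}s, ratio: {slowdown:.2f}x")
```

On these two runs the dev branch takes roughly 1.5 times as long as master for the same 1000 iterations.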