CPU info:
    CPU Model Name: Intel(R) Xeon(R) CPU E5-2690 v3 @ 2.60GHz
    Hardware threads: 12
    Total Memory: 57700428 kB
-------------------------------------------------------------------
=== Running mpiexec -n 4 /home/ubuntu/workspace/build/gpu/release/bin/cntk configFile=/home/ubuntu/workspace/Tests/EndToEndTests/ParallelTraining/NoQuantization/DoublePrecision/../../SimpleMultiGPU.cntk currentDirectory=/home/ubuntu/workspace/Tests/EndToEndTests/ParallelTraining/Data RunDir=/tmp/cntk-test-20171211223423.932710/ParallelTraining/NoQuantization_DoublePrecision@release_cpu DataDir=/home/ubuntu/workspace/Tests/EndToEndTests/ParallelTraining/Data ConfigDir=/home/ubuntu/workspace/Tests/EndToEndTests/ParallelTraining/NoQuantization/DoublePrecision/../.. OutputDir=/tmp/cntk-test-20171211223423.932710/ParallelTraining/NoQuantization_DoublePrecision@release_cpu DeviceId=-1 timestamping=true numCPUThreads=3 precision=double SimpleMultiGPU=[SGD=[ParallelTrain=[DataParallelSGD=[gradientBits=64]]]] stderr=/tmp/cntk-test-20171211223423.932710/ParallelTraining/NoQuantization_DoublePrecision@release_cpu/stderr
CNTK 2.3.1+ (HEAD f4f0f8, Dec 11 2017 18:34:12) at 2017/12/12 15:02:30

/home/ubuntu/workspace/build/gpu/release/bin/cntk  configFile=/home/ubuntu/workspace/Tests/EndToEndTests/ParallelTraining/NoQuantization/DoublePrecision/../../SimpleMultiGPU.cntk  currentDirectory=/home/ubuntu/workspace/Tests/EndToEndTests/ParallelTraining/Data  RunDir=/tmp/cntk-test-20171211223423.932710/ParallelTraining/NoQuantization_DoublePrecision@release_cpu  DataDir=/home/ubuntu/workspace/Tests/EndToEndTests/ParallelTraining/Data  ConfigDir=/home/ubuntu/workspace/Tests/EndToEndTests/ParallelTraining/NoQuantization/DoublePrecision/../..  OutputDir=/tmp/cntk-test-20171211223423.932710/ParallelTraining/NoQuantization_DoublePrecision@release_cpu  DeviceId=-1  timestamping=true  numCPUThreads=3  precision=double  SimpleMultiGPU=[SGD=[ParallelTrain=[DataParallelSGD=[gradientBits=64]]]]  stderr=/tmp/cntk-test-20171211223423.932710/ParallelTraining/NoQuantization_DoublePrecision@release_cpu/stderr
CNTK 2.3.1+ (HEAD f4f0f8, Dec 11 2017 18:34:12) at 2017/12/12 15:02:30

/home/ubuntu/workspace/build/gpu/release/bin/cntk  configFile=/home/ubuntu/workspace/Tests/EndToEndTests/ParallelTraining/NoQuantization/DoublePrecision/../../SimpleMultiGPU.cntk  currentDirectory=/home/ubuntu/workspace/Tests/EndToEndTests/ParallelTraining/Data  RunDir=/tmp/cntk-test-20171211223423.932710/ParallelTraining/NoQuantization_DoublePrecision@release_cpu  DataDir=/home/ubuntu/workspace/Tests/EndToEndTests/ParallelTraining/Data  ConfigDir=/home/ubuntu/workspace/Tests/EndToEndTests/ParallelTraining/NoQuantization/DoublePrecision/../..  OutputDir=/tmp/cntk-test-20171211223423.932710/ParallelTraining/NoQuantization_DoublePrecision@release_cpu  DeviceId=-1  timestamping=true  numCPUThreads=3  precision=double  SimpleMultiGPU=[SGD=[ParallelTrain=[DataParallelSGD=[gradientBits=64]]]]  stderr=/tmp/cntk-test-20171211223423.932710/ParallelTraining/NoQuantization_DoublePrecision@release_cpu/stderr
Changed current directory to /home/ubuntu/workspace/Tests/EndToEndTests/ParallelTraining/Data
Changed current directory to /home/ubuntu/workspace/Tests/EndToEndTests/ParallelTraining/Data
CNTK 2.3.1+ (HEAD f4f0f8, Dec 11 2017 18:34:12) at 2017/12/12 15:02:30

/home/ubuntu/workspace/build/gpu/release/bin/cntk  configFile=/home/ubuntu/workspace/Tests/EndToEndTests/ParallelTraining/NoQuantization/DoublePrecision/../../SimpleMultiGPU.cntk  currentDirectory=/home/ubuntu/workspace/Tests/EndToEndTests/ParallelTraining/Data  RunDir=/tmp/cntk-test-20171211223423.932710/ParallelTraining/NoQuantization_DoublePrecision@release_cpu  DataDir=/home/ubuntu/workspace/Tests/EndToEndTests/ParallelTraining/Data  ConfigDir=/home/ubuntu/workspace/Tests/EndToEndTests/ParallelTraining/NoQuantization/DoublePrecision/../..  OutputDir=/tmp/cntk-test-20171211223423.932710/ParallelTraining/NoQuantization_DoublePrecision@release_cpu  DeviceId=-1  timestamping=true  numCPUThreads=3  precision=double  SimpleMultiGPU=[SGD=[ParallelTrain=[DataParallelSGD=[gradientBits=64]]]]  stderr=/tmp/cntk-test-20171211223423.932710/ParallelTraining/NoQuantization_DoublePrecision@release_cpu/stderr
Changed current directory to /home/ubuntu/workspace/Tests/EndToEndTests/ParallelTraining/Data
CNTK 2.3.1+ (HEAD f4f0f8, Dec 11 2017 18:34:12) at 2017/12/12 15:02:30

/home/ubuntu/workspace/build/gpu/release/bin/cntk  configFile=/home/ubuntu/workspace/Tests/EndToEndTests/ParallelTraining/NoQuantization/DoublePrecision/../../SimpleMultiGPU.cntk  currentDirectory=/home/ubuntu/workspace/Tests/EndToEndTests/ParallelTraining/Data  RunDir=/tmp/cntk-test-20171211223423.932710/ParallelTraining/NoQuantization_DoublePrecision@release_cpu  DataDir=/home/ubuntu/workspace/Tests/EndToEndTests/ParallelTraining/Data  ConfigDir=/home/ubuntu/workspace/Tests/EndToEndTests/ParallelTraining/NoQuantization/DoublePrecision/../..  OutputDir=/tmp/cntk-test-20171211223423.932710/ParallelTraining/NoQuantization_DoublePrecision@release_cpu  DeviceId=-1  timestamping=true  numCPUThreads=3  precision=double  SimpleMultiGPU=[SGD=[ParallelTrain=[DataParallelSGD=[gradientBits=64]]]]  stderr=/tmp/cntk-test-20171211223423.932710/ParallelTraining/NoQuantization_DoublePrecision@release_cpu/stderr
Changed current directory to /home/ubuntu/workspace/Tests/EndToEndTests/ParallelTraining/Data
--------------------------------------------------------------------------
[[3793,1],2]: A high-performance Open MPI point-to-point messaging module
was unable to find any relevant network interfaces:

Module: OpenFabrics (openib)
  Host: fdb4dbbde386

Another transport will be used instead, although this may result in
lower performance.
--------------------------------------------------------------------------
ping [requestnodes (before change)]: 4 nodes pinging each other
ping [requestnodes (before change)]: 4 nodes pinging each other
ping [requestnodes (before change)]: 4 nodes pinging each other
ping [requestnodes (before change)]: 4 nodes pinging each other
ping [requestnodes (after change)]: 4 nodes pinging each other
ping [requestnodes (after change)]: 4 nodes pinging each other
ping [requestnodes (after change)]: 4 nodes pinging each other
requestnodes [MPIWrapperMpi]: using 4 out of 4 MPI nodes on a single host (4 requested); we (3) are in (participating)
ping [mpihelper]: 4 nodes pinging each other
ping [requestnodes (after change)]: 4 nodes pinging each other
requestnodes [MPIWrapperMpi]: using 4 out of 4 MPI nodes on a single host (4 requested); we (0) are in (participating)
ping [mpihelper]: 4 nodes pinging each other
requestnodes [MPIWrapperMpi]: using 4 out of 4 MPI nodes on a single host (4 requested); we (1) are in (participating)
ping [mpihelper]: 4 nodes pinging each other
requestnodes [MPIWrapperMpi]: using 4 out of 4 MPI nodes on a single host (4 requested); we (2) are in (participating)
ping [mpihelper]: 4 nodes pinging each other
12/12/2017 15:02:30: Redirecting stderr to file /tmp/cntk-test-20171211223423.932710/ParallelTraining/NoQuantization_DoublePrecision@release_cpu/stderr_SimpleMultiGPU.logrank0
12/12/2017 15:02:31: Redirecting stderr to file /tmp/cntk-test-20171211223423.932710/ParallelTraining/NoQuantization_DoublePrecision@release_cpu/stderr_SimpleMultiGPU.logrank1
12/12/2017 15:02:31: Redirecting stderr to file /tmp/cntk-test-20171211223423.932710/ParallelTraining/NoQuantization_DoublePrecision@release_cpu/stderr_SimpleMultiGPU.logrank2
12/12/2017 15:02:32: Redirecting stderr to file /tmp/cntk-test-20171211223423.932710/ParallelTraining/NoQuantization_DoublePrecision@release_cpu/stderr_SimpleMultiGPU.logrank3
[fdb4dbbde386:45295] 3 more processes have sent help message help-mpi-btl-base.txt / btl:no-nics
[fdb4dbbde386:45295] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
MPI Rank 0: CNTK 2.3.1+ (HEAD f4f0f8, Dec 11 2017 18:34:12) at 2017/12/12 15:02:30
MPI Rank 0: 
MPI Rank 0: /home/ubuntu/workspace/build/gpu/release/bin/cntk  configFile=/home/ubuntu/workspace/Tests/EndToEndTests/ParallelTraining/NoQuantization/DoublePrecision/../../SimpleMultiGPU.cntk  currentDirectory=/home/ubuntu/workspace/Tests/EndToEndTests/ParallelTraining/Data  RunDir=/tmp/cntk-test-20171211223423.932710/ParallelTraining/NoQuantization_DoublePrecision@release_cpu  DataDir=/home/ubuntu/workspace/Tests/EndToEndTests/ParallelTraining/Data  ConfigDir=/home/ubuntu/workspace/Tests/EndToEndTests/ParallelTraining/NoQuantization/DoublePrecision/../..  OutputDir=/tmp/cntk-test-20171211223423.932710/ParallelTraining/NoQuantization_DoublePrecision@release_cpu  DeviceId=-1  timestamping=true  numCPUThreads=3  precision=double  SimpleMultiGPU=[SGD=[ParallelTrain=[DataParallelSGD=[gradientBits=64]]]]  stderr=/tmp/cntk-test-20171211223423.932710/ParallelTraining/NoQuantization_DoublePrecision@release_cpu/stderr
MPI Rank 0: 12/12/2017 15:02:30: -------------------------------------------------------------------
MPI Rank 0: 12/12/2017 15:02:30: Build info: 
MPI Rank 0: 
MPI Rank 0: 12/12/2017 15:02:30: 		Built time: Dec 11 2017 18:28:39
MPI Rank 0: 12/12/2017 15:02:30: 		Last modified date: Wed Nov 15 09:27:10 2017
MPI Rank 0: 12/12/2017 15:02:30: 		Build type: release
MPI Rank 0: 12/12/2017 15:02:30: 		Build target: GPU
MPI Rank 0: 12/12/2017 15:02:30: 		With ASGD: yes
MPI Rank 0: 12/12/2017 15:02:30: 		Math lib: mkl
MPI Rank 0: 12/12/2017 15:02:30: 		CUDA version: 9.0.0
MPI Rank 0: 12/12/2017 15:02:30: 		CUDNN version: 7.0.4
MPI Rank 0: 12/12/2017 15:02:30: 		Build Branch: HEAD
MPI Rank 0: 12/12/2017 15:02:30: 		Build SHA1: f4f0f82eabcc482dbd03af1f946a44ae2b8b97bf
MPI Rank 0: 12/12/2017 15:02:30: 		MPI distribution: Open MPI
MPI Rank 0: 12/12/2017 15:02:30: 		MPI version: 1.10.7
MPI Rank 0: 12/12/2017 15:02:30: -------------------------------------------------------------------
MPI Rank 0: 12/12/2017 15:02:30: -------------------------------------------------------------------
MPI Rank 0: 12/12/2017 15:02:30: GPU info:
MPI Rank 0: 
MPI Rank 0: 12/12/2017 15:02:30: 		Device[0]: cores = 3072; computeCapability = 5.2; type = "Tesla M60"; total memory = 8123 MB; free memory = 8112 MB
MPI Rank 0: 12/12/2017 15:02:30: -------------------------------------------------------------------
MPI Rank 0: 12/12/2017 15:02:30: Using 3 CPU threads.
MPI Rank 0: 
MPI Rank 0: 12/12/2017 15:02:30: ##############################################################################
MPI Rank 0: 12/12/2017 15:02:30: #                                                                            #
MPI Rank 0: 12/12/2017 15:02:30: # SimpleMultiGPU command (train action)                                      #
MPI Rank 0: 12/12/2017 15:02:30: #                                                                            #
MPI Rank 0: 12/12/2017 15:02:30: ##############################################################################
MPI Rank 0: 
MPI Rank 0: 12/12/2017 15:02:30: 
MPI Rank 0: Creating virgin network.
MPI Rank 0: SimpleNetworkBuilder Using CPU
MPI Rank 0: 12/12/2017 15:02:30: 
MPI Rank 0: Model has 25 nodes. Using CPU.
MPI Rank 0: 
MPI Rank 0: 12/12/2017 15:02:30: Training criterion:   CrossEntropyWithSoftmax = CrossEntropyWithSoftmax
MPI Rank 0: 12/12/2017 15:02:30: Evaluation criterion: EvalClassificationError = ClassificationError
MPI Rank 0: 
MPI Rank 0: 
MPI Rank 0: Allocating matrices for forward and/or backward propagation.
MPI Rank 0: 
MPI Rank 0: Gradient Memory Aliasing: 4 are aliased.
MPI Rank 0: 	W2*H1 (gradient) reuses HLast (gradient)
MPI Rank 0: 	W1*H1 (gradient) reuses W1*H1+B1 (gradient)
MPI Rank 0: 
MPI Rank 0: Memory Sharing: Out of 40 matrices, 21 are shared as 5, and 19 are not shared.
MPI Rank 0: 
MPI Rank 0: Here are the ones that share memory:
MPI Rank 0: 	{ PosteriorProb : [2 x 1 x *]
MPI Rank 0: 	  ScaledLogLikelihood : [2 x 1 x *] }
MPI Rank 0: 	{ HLast : [2 x 1 x *] (gradient)
MPI Rank 0: 	  W0 : [50 x 2] (gradient)
MPI Rank 0: 	  W0*features+B0 : [50 x 1 x *] (gradient)
MPI Rank 0: 	  W1*H1 : [50 x 1 x *] (gradient)
MPI Rank 0: 	  W1*H1+B1 : [50 x 1 x *]
MPI Rank 0: 	  W1*H1+B1 : [50 x 1 x *] (gradient)
MPI Rank 0: 	  W2*H1 : [2 x 1 x *]
MPI Rank 0: 	  W2*H1 : [2 x 1 x *] (gradient) }
MPI Rank 0: 	{ B0 : [50 x 1] (gradient)
MPI Rank 0: 	  H1 : [50 x 1 x *] }
MPI Rank 0: 	{ H2 : [50 x 1 x *]
MPI Rank 0: 	  W0*features+B0 : [50 x 1 x *]
MPI Rank 0: 	  W1 : [50 x 50] (gradient)
MPI Rank 0: 	  W1*H1 : [50 x 1 x *] }
MPI Rank 0: 	{ H1 : [50 x 1 x *] (gradient)
MPI Rank 0: 	  H2 : [50 x 1 x *] (gradient)
MPI Rank 0: 	  HLast : [2 x 1 x *]
MPI Rank 0: 	  W0*features : [50 x *]
MPI Rank 0: 	  W0*features : [50 x *] (gradient) }
MPI Rank 0: 
MPI Rank 0: Here are the ones that don't share memory:
MPI Rank 0: 	{MeanOfFeatures : [2]}
MPI Rank 0: 	{features : [2 x *]}
MPI Rank 0: 	{InvStdOfFeatures : [2]}
MPI Rank 0: 	{W0 : [50 x 2]}
MPI Rank 0: 	{B0 : [50 x 1]}
MPI Rank 0: 	{W1 : [50 x 50]}
MPI Rank 0: 	{B1 : [50 x 1]}
MPI Rank 0: 	{W2 : [2 x 50]}
MPI Rank 0: 	{B2 : [2 x 1]}
MPI Rank 0: 	{labels : [2 x *]}
MPI Rank 0: 	{Prior : [2]}
MPI Rank 0: 	{EvalClassificationError : [1]}
MPI Rank 0: 	{CrossEntropyWithSoftmax : [1]}
MPI Rank 0: 	{LogOfPrior : [2]}
MPI Rank 0: 	{CrossEntropyWithSoftmax : [1] (gradient)}
MPI Rank 0: 	{MVNormalizedFeatures : [2 x *]}
MPI Rank 0: 	{B2 : [2 x 1] (gradient)}
MPI Rank 0: 	{B1 : [50 x 1] (gradient)}
MPI Rank 0: 	{W2 : [2 x 50] (gradient)}
MPI Rank 0: 
MPI Rank 0: 
MPI Rank 0: 12/12/2017 15:02:30: Training 2802 parameters in 6 out of 6 parameter tensors and 15 nodes with gradient:
MPI Rank 0: 
MPI Rank 0: 12/12/2017 15:02:30: 	Node 'B0' (LearnableParameter operation) : [50 x 1]
MPI Rank 0: 12/12/2017 15:02:30: 	Node 'B1' (LearnableParameter operation) : [50 x 1]
MPI Rank 0: 12/12/2017 15:02:30: 	Node 'B2' (LearnableParameter operation) : [2 x 1]
MPI Rank 0: 12/12/2017 15:02:30: 	Node 'W0' (LearnableParameter operation) : [50 x 2]
MPI Rank 0: 12/12/2017 15:02:30: 	Node 'W1' (LearnableParameter operation) : [50 x 50]
MPI Rank 0: 12/12/2017 15:02:30: 	Node 'W2' (LearnableParameter operation) : [2 x 50]
MPI Rank 0: 
MPI Rank 0: Initializing dataParallelSGD with FP64 aggregation.
MPI Rank 0: NcclComm: disabled, at least one rank using CPU device
MPI Rank 0: 
MPI Rank 0: 12/12/2017 15:02:32: Precomputing --> 3 PreCompute nodes found.
MPI Rank 0: 
MPI Rank 0: 12/12/2017 15:02:32: 	MeanOfFeatures = Mean()
MPI Rank 0: 12/12/2017 15:02:32: 	InvStdOfFeatures = InvStdDev()
MPI Rank 0: 12/12/2017 15:02:32: 	Prior = Mean()
MPI Rank 0: 
MPI Rank 0: 12/12/2017 15:02:33: Precomputing --> Completed.
MPI Rank 0: 
MPI Rank 0: 
MPI Rank 0: 12/12/2017 15:02:33: Starting Epoch 1: learning rate per sample = 0.020000  effective momentum = 0.900000  momentum as time constant = 237.3 samples
MPI Rank 0: 
MPI Rank 0: 12/12/2017 15:02:33: Starting minibatch loop, DataParallelSGD training (myRank = 0, numNodes = 4, numGradientBits = 64), distributed reading is ENABLED.
MPI Rank 0: 12/12/2017 15:02:33:  Epoch[ 1 of 4]-Minibatch[   1-  10]: CrossEntropyWithSoftmax = 0.69973268 * 250; EvalClassificationError = 0.50400000 * 250; time = 0.4052s; samplesPerSecond = 617.0
MPI Rank 0: 12/12/2017 15:02:34:  Epoch[ 1 of 4]-Minibatch[  11-  20]: CrossEntropyWithSoftmax = 0.71436905 * 250; EvalClassificationError = 0.52000000 * 250; time = 0.1983s; samplesPerSecond = 1260.8
MPI Rank 0: 12/12/2017 15:02:34:  Epoch[ 1 of 4]-Minibatch[  21-  30]: CrossEntropyWithSoftmax = 0.72871054 * 250; EvalClassificationError = 0.47600000 * 250; time = 0.2003s; samplesPerSecond = 1248.4
MPI Rank 0: 12/12/2017 15:02:34:  Epoch[ 1 of 4]-Minibatch[  31-  40]: CrossEntropyWithSoftmax = 0.70038993 * 250; EvalClassificationError = 0.52400000 * 250; time = 0.1999s; samplesPerSecond = 1250.4
MPI Rank 0: 12/12/2017 15:02:34:  Epoch[ 1 of 4]-Minibatch[  41-  50]: CrossEntropyWithSoftmax = 0.70593818 * 250; EvalClassificationError = 0.54000000 * 250; time = 0.4011s; samplesPerSecond = 623.2
MPI Rank 0: 12/12/2017 15:02:35:  Epoch[ 1 of 4]-Minibatch[  51-  60]: CrossEntropyWithSoftmax = 0.71604646 * 250; EvalClassificationError = 0.47600000 * 250; time = 0.2388s; samplesPerSecond = 1047.1
MPI Rank 0: 12/12/2017 15:02:35:  Epoch[ 1 of 4]-Minibatch[  61-  70]: CrossEntropyWithSoftmax = 0.72247949 * 250; EvalClassificationError = 0.48000000 * 250; time = 0.2352s; samplesPerSecond = 1063.0
MPI Rank 0: 12/12/2017 15:02:35:  Epoch[ 1 of 4]-Minibatch[  71-  80]: CrossEntropyWithSoftmax = 0.79884413 * 250; EvalClassificationError = 0.47600000 * 250; time = 0.2257s; samplesPerSecond = 1107.8
MPI Rank 0: 12/12/2017 15:02:35:  Epoch[ 1 of 4]-Minibatch[  81-  90]: CrossEntropyWithSoftmax = 0.69622447 * 250; EvalClassificationError = 0.46800000 * 250; time = 0.1229s; samplesPerSecond = 2033.5
MPI Rank 0: 12/12/2017 15:02:35:  Epoch[ 1 of 4]-Minibatch[  91- 100]: CrossEntropyWithSoftmax = 0.70749459 * 250; EvalClassificationError = 0.49200000 * 250; time = 0.1879s; samplesPerSecond = 1330.2
MPI Rank 0: 12/12/2017 15:02:36:  Epoch[ 1 of 4]-Minibatch[ 101- 110]: CrossEntropyWithSoftmax = 0.71485824 * 250; EvalClassificationError = 0.55200000 * 250; time = 0.2668s; samplesPerSecond = 936.9
MPI Rank 0: 12/12/2017 15:02:36:  Epoch[ 1 of 4]-Minibatch[ 111- 120]: CrossEntropyWithSoftmax = 0.69579152 * 250; EvalClassificationError = 0.43600000 * 250; time = 0.2330s; samplesPerSecond = 1072.9
MPI Rank 0: 12/12/2017 15:02:36:  Epoch[ 1 of 4]-Minibatch[ 121- 130]: CrossEntropyWithSoftmax = 0.70174138 * 250; EvalClassificationError = 0.44000000 * 250; time = 0.2024s; samplesPerSecond = 1235.0
MPI Rank 0: 12/12/2017 15:02:37:  Epoch[ 1 of 4]-Minibatch[ 131- 140]: CrossEntropyWithSoftmax = 0.71926586 * 250; EvalClassificationError = 0.54800000 * 250; time = 0.4453s; samplesPerSecond = 561.4
MPI Rank 0: 12/12/2017 15:02:37:  Epoch[ 1 of 4]-Minibatch[ 141- 150]: CrossEntropyWithSoftmax = 0.72009917 * 250; EvalClassificationError = 0.48800000 * 250; time = 0.2232s; samplesPerSecond = 1120.0
MPI Rank 0: 12/12/2017 15:02:37:  Epoch[ 1 of 4]-Minibatch[ 151- 160]: CrossEntropyWithSoftmax = 0.71854573 * 250; EvalClassificationError = 0.55200000 * 250; time = 0.2128s; samplesPerSecond = 1174.9
MPI Rank 0: 12/12/2017 15:02:37:  Epoch[ 1 of 4]-Minibatch[ 161- 170]: CrossEntropyWithSoftmax = 0.74083729 * 250; EvalClassificationError = 0.50000000 * 250; time = 0.2253s; samplesPerSecond = 1109.7
MPI Rank 0: 12/12/2017 15:02:37:  Epoch[ 1 of 4]-Minibatch[ 171- 180]: CrossEntropyWithSoftmax = 0.71762852 * 250; EvalClassificationError = 0.51600000 * 250; time = 0.1746s; samplesPerSecond = 1431.9
MPI Rank 0: 12/12/2017 15:02:38:  Epoch[ 1 of 4]-Minibatch[ 181- 190]: CrossEntropyWithSoftmax = 0.71530686 * 250; EvalClassificationError = 0.48400000 * 250; time = 0.1878s; samplesPerSecond = 1331.1
MPI Rank 0: 12/12/2017 15:02:38:  Epoch[ 1 of 4]-Minibatch[ 191- 200]: CrossEntropyWithSoftmax = 0.71768617 * 250; EvalClassificationError = 0.53200000 * 250; time = 0.1800s; samplesPerSecond = 1388.5
MPI Rank 0: 12/12/2017 15:02:38:  Epoch[ 1 of 4]-Minibatch[ 201- 210]: CrossEntropyWithSoftmax = 0.71515312 * 250; EvalClassificationError = 0.53600000 * 250; time = 0.1604s; samplesPerSecond = 1558.9
MPI Rank 0: 12/12/2017 15:02:38:  Epoch[ 1 of 4]-Minibatch[ 211- 220]: CrossEntropyWithSoftmax = 0.72047060 * 250; EvalClassificationError = 0.52400000 * 250; time = 0.1978s; samplesPerSecond = 1263.8
MPI Rank 0: 12/12/2017 15:02:38:  Epoch[ 1 of 4]-Minibatch[ 221- 230]: CrossEntropyWithSoftmax = 0.72033071 * 250; EvalClassificationError = 0.50800000 * 250; time = 0.1576s; samplesPerSecond = 1585.9
MPI Rank 0: 12/12/2017 15:02:39:  Epoch[ 1 of 4]-Minibatch[ 231- 240]: CrossEntropyWithSoftmax = 0.71295324 * 250; EvalClassificationError = 0.51200000 * 250; time = 0.4627s; samplesPerSecond = 540.3
MPI Rank 0: 12/12/2017 15:02:39:  Epoch[ 1 of 4]-Minibatch[ 241- 250]: CrossEntropyWithSoftmax = 0.69737817 * 250; EvalClassificationError = 0.53200000 * 250; time = 0.2302s; samplesPerSecond = 1085.8
MPI Rank 0: 12/12/2017 15:02:39:  Epoch[ 1 of 4]-Minibatch[ 251- 260]: CrossEntropyWithSoftmax = 0.70251892 * 250; EvalClassificationError = 0.48800000 * 250; time = 0.3668s; samplesPerSecond = 681.6
MPI Rank 0: 12/12/2017 15:02:40:  Epoch[ 1 of 4]-Minibatch[ 261- 270]: CrossEntropyWithSoftmax = 0.70879704 * 250; EvalClassificationError = 0.54400000 * 250; time = 0.2376s; samplesPerSecond = 1052.4
MPI Rank 0: 12/12/2017 15:02:40:  Epoch[ 1 of 4]-Minibatch[ 271- 280]: CrossEntropyWithSoftmax = 0.69856459 * 250; EvalClassificationError = 0.52800000 * 250; time = 0.1823s; samplesPerSecond = 1371.5
MPI Rank 0: 12/12/2017 15:02:40:  Epoch[ 1 of 4]-Minibatch[ 281- 290]: CrossEntropyWithSoftmax = 0.69425907 * 250; EvalClassificationError = 0.44800000 * 250; time = 0.2000s; samplesPerSecond = 1250.3
MPI Rank 0: 12/12/2017 15:02:40:  Epoch[ 1 of 4]-Minibatch[ 291- 300]: CrossEntropyWithSoftmax = 0.69599736 * 250; EvalClassificationError = 0.49600000 * 250; time = 0.1664s; samplesPerSecond = 1502.6
MPI Rank 0: 12/12/2017 15:02:40:  Epoch[ 1 of 4]-Minibatch[ 301- 310]: CrossEntropyWithSoftmax = 0.69591177 * 250; EvalClassificationError = 0.54000000 * 250; time = 0.1750s; samplesPerSecond = 1428.8
MPI Rank 0: 12/12/2017 15:02:41:  Epoch[ 1 of 4]-Minibatch[ 311- 320]: CrossEntropyWithSoftmax = 0.69133098 * 250; EvalClassificationError = 0.40000000 * 250; time = 0.2385s; samplesPerSecond = 1048.3
MPI Rank 0: 12/12/2017 15:02:41:  Epoch[ 1 of 4]-Minibatch[ 321- 330]: CrossEntropyWithSoftmax = 0.69822648 * 250; EvalClassificationError = 0.46800000 * 250; time = 0.4431s; samplesPerSecond = 564.2
MPI Rank 0: 12/12/2017 15:02:41:  Epoch[ 1 of 4]-Minibatch[ 331- 340]: CrossEntropyWithSoftmax = 0.71031539 * 250; EvalClassificationError = 0.50400000 * 250; time = 0.2477s; samplesPerSecond = 1009.1
MPI Rank 0: 12/12/2017 15:02:41:  Epoch[ 1 of 4]-Minibatch[ 341- 350]: CrossEntropyWithSoftmax = 0.70097460 * 250; EvalClassificationError = 0.50000000 * 250; time = 0.2090s; samplesPerSecond = 1196.3
MPI Rank 0: 12/12/2017 15:02:42:  Epoch[ 1 of 4]-Minibatch[ 351- 360]: CrossEntropyWithSoftmax = 0.68927867 * 250; EvalClassificationError = 0.45200000 * 250; time = 0.1870s; samplesPerSecond = 1337.1
MPI Rank 0: 12/12/2017 15:02:42:  Epoch[ 1 of 4]-Minibatch[ 361- 370]: CrossEntropyWithSoftmax = 0.68908389 * 250; EvalClassificationError = 0.50000000 * 250; time = 0.1867s; samplesPerSecond = 1339.0
MPI Rank 0: 12/12/2017 15:02:42:  Epoch[ 1 of 4]-Minibatch[ 371- 380]: CrossEntropyWithSoftmax = 0.67796901 * 250; EvalClassificationError = 0.45600000 * 250; time = 0.1799s; samplesPerSecond = 1389.6
MPI Rank 0: 12/12/2017 15:02:42:  Epoch[ 1 of 4]-Minibatch[ 381- 390]: CrossEntropyWithSoftmax = 0.67863593 * 250; EvalClassificationError = 0.38400000 * 250; time = 0.2452s; samplesPerSecond = 1019.5
MPI Rank 0: 12/12/2017 15:02:42:  Epoch[ 1 of 4]-Minibatch[ 391- 400]: CrossEntropyWithSoftmax = 0.67150936 * 250; EvalClassificationError = 0.42800000 * 250; time = 0.2168s; samplesPerSecond = 1152.9
MPI Rank 0: 12/12/2017 15:02:43: Finished Epoch[ 1 of 4]: [Training] CrossEntropyWithSoftmax = 0.70804123 * 10000; EvalClassificationError = 0.49380000 * 10000; totalSamplesSeen = 10000; learningRatePerSample = 0.02; epochTime=9.491s
MPI Rank 0: 12/12/2017 15:02:43: SGD: Saving checkpoint model '/tmp/cntk-test-20171211223423.932710/ParallelTraining/NoQuantization_DoublePrecision@release_cpu/models/Simple.dnn.1'
MPI Rank 0: 
MPI Rank 0: 12/12/2017 15:02:43: Starting Epoch 2: learning rate per sample = 0.008000  effective momentum = 0.900000  momentum as time constant = 237.3 samples
MPI Rank 0: 
MPI Rank 0: 12/12/2017 15:02:43: Starting minibatch loop, DataParallelSGD training (myRank = 0, numNodes = 4, numGradientBits = 64), distributed reading is ENABLED.
MPI Rank 0: 12/12/2017 15:02:43:  Epoch[ 2 of 4]-Minibatch[   1-  10, 2.50%]: CrossEntropyWithSoftmax = 0.69566490 * 250; EvalClassificationError = 0.49600000 * 250; time = 0.1897s; samplesPerSecond = 1317.5
MPI Rank 0: 12/12/2017 15:02:43:  Epoch[ 2 of 4]-Minibatch[  11-  20, 5.00%]: CrossEntropyWithSoftmax = 0.64058119 * 250; EvalClassificationError = 0.22400000 * 250; time = 0.4258s; samplesPerSecond = 587.2
MPI Rank 0: 12/12/2017 15:02:43:  Epoch[ 2 of 4]-Minibatch[  21-  30, 7.50%]: CrossEntropyWithSoftmax = 0.62577202 * 250; EvalClassificationError = 0.30400000 * 250; time = 0.2814s; samplesPerSecond = 888.5
MPI Rank 0: 12/12/2017 15:02:44:  Epoch[ 2 of 4]-Minibatch[  31-  40, 10.00%]: CrossEntropyWithSoftmax = 0.62974783 * 250; EvalClassificationError = 0.34000000 * 250; time = 0.2768s; samplesPerSecond = 903.3
MPI Rank 0: 12/12/2017 15:02:44:  Epoch[ 2 of 4]-Minibatch[  41-  50, 12.50%]: CrossEntropyWithSoftmax = 0.60705897 * 250; EvalClassificationError = 0.22800000 * 250; time = 0.3239s; samplesPerSecond = 771.8
MPI Rank 0: 12/12/2017 15:02:44:  Epoch[ 2 of 4]-Minibatch[  51-  60, 15.00%]: CrossEntropyWithSoftmax = 0.59038668 * 250; EvalClassificationError = 0.18000000 * 250; time = 0.2188s; samplesPerSecond = 1142.8
MPI Rank 0: 12/12/2017 15:02:45:  Epoch[ 2 of 4]-Minibatch[  61-  70, 17.50%]: CrossEntropyWithSoftmax = 0.55033195 * 250; EvalClassificationError = 0.08800000 * 250; time = 0.2783s; samplesPerSecond = 898.4
MPI Rank 0: 12/12/2017 15:02:45:  Epoch[ 2 of 4]-Minibatch[  71-  80, 20.00%]: CrossEntropyWithSoftmax = 0.53624170 * 250; EvalClassificationError = 0.23200000 * 250; time = 0.2549s; samplesPerSecond = 981.0
MPI Rank 0: 12/12/2017 15:02:45:  Epoch[ 2 of 4]-Minibatch[  81-  90, 22.50%]: CrossEntropyWithSoftmax = 0.48688308 * 250; EvalClassificationError = 0.12000000 * 250; time = 0.2678s; samplesPerSecond = 933.4
MPI Rank 0: 12/12/2017 15:02:45:  Epoch[ 2 of 4]-Minibatch[  91- 100, 25.00%]: CrossEntropyWithSoftmax = 0.43212926 * 250; EvalClassificationError = 0.08400000 * 250; time = 0.3449s; samplesPerSecond = 724.9
MPI Rank 0: 12/12/2017 15:02:46:  Epoch[ 2 of 4]-Minibatch[ 101- 110, 27.50%]: CrossEntropyWithSoftmax = 0.38559516 * 250; EvalClassificationError = 0.05600000 * 250; time = 0.2430s; samplesPerSecond = 1028.8
MPI Rank 0: 12/12/2017 15:02:46:  Epoch[ 2 of 4]-Minibatch[ 111- 120, 30.00%]: CrossEntropyWithSoftmax = 0.34249535 * 250; EvalClassificationError = 0.06800000 * 250; time = 0.3439s; samplesPerSecond = 726.9
MPI Rank 0: 12/12/2017 15:02:46:  Epoch[ 2 of 4]-Minibatch[ 121- 130, 32.50%]: CrossEntropyWithSoftmax = 0.28670698 * 250; EvalClassificationError = 0.06800000 * 250; time = 0.3079s; samplesPerSecond = 811.9
MPI Rank 0: 12/12/2017 15:02:47:  Epoch[ 2 of 4]-Minibatch[ 131- 140, 35.00%]: CrossEntropyWithSoftmax = 0.26990400 * 250; EvalClassificationError = 0.09200000 * 250; time = 0.2519s; samplesPerSecond = 992.3
MPI Rank 0: 12/12/2017 15:02:47:  Epoch[ 2 of 4]-Minibatch[ 141- 150, 37.50%]: CrossEntropyWithSoftmax = 0.23285507 * 250; EvalClassificationError = 0.06400000 * 250; time = 0.2081s; samplesPerSecond = 1201.3
MPI Rank 0: 12/12/2017 15:02:47:  Epoch[ 2 of 4]-Minibatch[ 151- 160, 40.00%]: CrossEntropyWithSoftmax = 0.25464189 * 250; EvalClassificationError = 0.08400000 * 250; time = 0.2607s; samplesPerSecond = 959.1
MPI Rank 0: 12/12/2017 15:02:47:  Epoch[ 2 of 4]-Minibatch[ 161- 170, 42.50%]: CrossEntropyWithSoftmax = 0.21253995 * 250; EvalClassificationError = 0.07200000 * 250; time = 0.3790s; samplesPerSecond = 659.6
MPI Rank 0: 12/12/2017 15:02:48:  Epoch[ 2 of 4]-Minibatch[ 171- 180, 45.00%]: CrossEntropyWithSoftmax = 0.18708213 * 250; EvalClassificationError = 0.06400000 * 250; time = 0.5119s; samplesPerSecond = 488.4
MPI Rank 0: 12/12/2017 15:02:48:  Epoch[ 2 of 4]-Minibatch[ 181- 190, 47.50%]: CrossEntropyWithSoftmax = 0.21363034 * 250; EvalClassificationError = 0.10000000 * 250; time = 0.3687s; samplesPerSecond = 678.0
MPI Rank 0: 12/12/2017 15:02:49:  Epoch[ 2 of 4]-Minibatch[ 191- 200, 50.00%]: CrossEntropyWithSoftmax = 0.23505436 * 250; EvalClassificationError = 0.08800000 * 250; time = 0.2281s; samplesPerSecond = 1096.0
MPI Rank 0: 12/12/2017 15:02:49:  Epoch[ 2 of 4]-Minibatch[ 201- 210, 52.50%]: CrossEntropyWithSoftmax = 0.20180377 * 250; EvalClassificationError = 0.08000000 * 250; time = 0.1647s; samplesPerSecond = 1518.0
MPI Rank 0: 12/12/2017 15:02:49:  Epoch[ 2 of 4]-Minibatch[ 211- 220, 55.00%]: CrossEntropyWithSoftmax = 0.19780589 * 250; EvalClassificationError = 0.07600000 * 250; time = 0.1373s; samplesPerSecond = 1821.2
MPI Rank 0: 12/12/2017 15:02:49:  Epoch[ 2 of 4]-Minibatch[ 221- 230, 57.50%]: CrossEntropyWithSoftmax = 0.16131109 * 250; EvalClassificationError = 0.06400000 * 250; time = 0.3582s; samplesPerSecond = 697.9
MPI Rank 0: 12/12/2017 15:02:50:  Epoch[ 2 of 4]-Minibatch[ 231- 240, 60.00%]: CrossEntropyWithSoftmax = 0.16479151 * 250; EvalClassificationError = 0.08400000 * 250; time = 0.5462s; samplesPerSecond = 457.7
MPI Rank 0: 12/12/2017 15:02:50:  Epoch[ 2 of 4]-Minibatch[ 241- 250, 62.50%]: CrossEntropyWithSoftmax = 0.20226364 * 250; EvalClassificationError = 0.11600000 * 250; time = 0.3986s; samplesPerSecond = 627.2
MPI Rank 0: 12/12/2017 15:02:50:  Epoch[ 2 of 4]-Minibatch[ 251- 260, 65.00%]: CrossEntropyWithSoftmax = 0.14809078 * 250; EvalClassificationError = 0.07200000 * 250; time = 0.2972s; samplesPerSecond = 841.2
MPI Rank 0: 12/12/2017 15:02:51:  Epoch[ 2 of 4]-Minibatch[ 261- 270, 67.50%]: CrossEntropyWithSoftmax = 0.19001813 * 250; EvalClassificationError = 0.11200000 * 250; time = 0.3371s; samplesPerSecond = 741.7
MPI Rank 0: 12/12/2017 15:02:51:  Epoch[ 2 of 4]-Minibatch[ 271- 280, 70.00%]: CrossEntropyWithSoftmax = 0.19616890 * 250; EvalClassificationError = 0.08000000 * 250; time = 0.2864s; samplesPerSecond = 872.8
MPI Rank 0: 12/12/2017 15:02:51:  Epoch[ 2 of 4]-Minibatch[ 281- 290, 72.50%]: CrossEntropyWithSoftmax = 0.17887468 * 250; EvalClassificationError = 0.07200000 * 250; time = 0.3232s; samplesPerSecond = 773.6
MPI Rank 0: 12/12/2017 15:02:52:  Epoch[ 2 of 4]-Minibatch[ 291- 300, 75.00%]: CrossEntropyWithSoftmax = 0.14040410 * 250; EvalClassificationError = 0.04400000 * 250; time = 0.5597s; samplesPerSecond = 446.7
MPI Rank 0: 12/12/2017 15:02:52:  Epoch[ 2 of 4]-Minibatch[ 301- 310, 77.50%]: CrossEntropyWithSoftmax = 0.17935152 * 250; EvalClassificationError = 0.09200000 * 250; time = 0.3574s; samplesPerSecond = 699.4
MPI Rank 0: 12/12/2017 15:02:53:  Epoch[ 2 of 4]-Minibatch[ 311- 320, 80.00%]: CrossEntropyWithSoftmax = 0.13249072 * 250; EvalClassificationError = 0.05600000 * 250; time = 0.3653s; samplesPerSecond = 684.3
MPI Rank 0: 12/12/2017 15:02:53:  Epoch[ 2 of 4]-Minibatch[ 321- 330, 82.50%]: CrossEntropyWithSoftmax = 0.15483358 * 250; EvalClassificationError = 0.06000000 * 250; time = 0.3480s; samplesPerSecond = 718.5
MPI Rank 0: 12/12/2017 15:02:53:  Epoch[ 2 of 4]-Minibatch[ 331- 340, 85.00%]: CrossEntropyWithSoftmax = 0.19796159 * 250; EvalClassificationError = 0.09200000 * 250; time = 0.2932s; samplesPerSecond = 852.7
MPI Rank 0: 12/12/2017 15:02:54:  Epoch[ 2 of 4]-Minibatch[ 341- 350, 87.50%]: CrossEntropyWithSoftmax = 0.13179462 * 250; EvalClassificationError = 0.05200000 * 250; time = 0.2607s; samplesPerSecond = 958.9
MPI Rank 0: 12/12/2017 15:02:54:  Epoch[ 2 of 4]-Minibatch[ 351- 360, 90.00%]: CrossEntropyWithSoftmax = 0.14028323 * 250; EvalClassificationError = 0.05600000 * 250; time = 0.4134s; samplesPerSecond = 604.7
MPI Rank 0: 12/12/2017 15:02:54:  Epoch[ 2 of 4]-Minibatch[ 361- 370, 92.50%]: CrossEntropyWithSoftmax = 0.12849508 * 250; EvalClassificationError = 0.06000000 * 250; time = 0.2778s; samplesPerSecond = 900.0
MPI Rank 0: 12/12/2017 15:02:54:  Epoch[ 2 of 4]-Minibatch[ 371- 380, 95.00%]: CrossEntropyWithSoftmax = 0.16702669 * 250; EvalClassificationError = 0.09600000 * 250; time = 0.1260s; samplesPerSecond = 1984.5
MPI Rank 0: 12/12/2017 15:02:55:  Epoch[ 2 of 4]-Minibatch[ 381- 390, 97.50%]: CrossEntropyWithSoftmax = 0.20390304 * 250; EvalClassificationError = 0.11200000 * 250; time = 0.2592s; samplesPerSecond = 964.6
MPI Rank 0: 12/12/2017 15:02:55:  Epoch[ 2 of 4]-Minibatch[ 391- 400, 100.00%]: CrossEntropyWithSoftmax = 0.14594790 * 250; EvalClassificationError = 0.06800000 * 250; time = 0.2101s; samplesPerSecond = 1190.1
MPI Rank 0: 12/12/2017 15:02:55: Finished Epoch[ 2 of 4]: [Training] CrossEntropyWithSoftmax = 0.29447308 * 10000; EvalClassificationError = 0.11490000 * 10000; totalSamplesSeen = 20000; learningRatePerSample = 0.0080000004; epochTime=12.3189s
MPI Rank 0: 12/12/2017 15:02:55: SGD: Saving checkpoint model '/tmp/cntk-test-20171211223423.932710/ParallelTraining/NoQuantization_DoublePrecision@release_cpu/models/Simple.dnn.2'
MPI Rank 0: 
MPI Rank 0: 12/12/2017 15:02:55: Starting Epoch 3: learning rate per sample = 0.008000  effective momentum = 0.900000  momentum as time constant = 237.3 samples
MPI Rank 0: 
MPI Rank 0: 12/12/2017 15:02:55: Starting minibatch loop, DataParallelSGD training (myRank = 0, numNodes = 4, numGradientBits = 64), distributed reading is ENABLED.
MPI Rank 0: 12/12/2017 15:02:55:  Epoch[ 3 of 4]-Minibatch[   1-  10, 2.50%]: CrossEntropyWithSoftmax = 0.12813296 * 250; EvalClassificationError = 0.05600000 * 250; time = 0.4326s; samplesPerSecond = 578.0
MPI Rank 0: 12/12/2017 15:02:56:  Epoch[ 3 of 4]-Minibatch[  11-  20, 5.00%]: CrossEntropyWithSoftmax = 0.17615627 * 250; EvalClassificationError = 0.08800000 * 250; time = 0.3326s; samplesPerSecond = 751.7
MPI Rank 0: 12/12/2017 15:02:56:  Epoch[ 3 of 4]-Minibatch[  21-  30, 7.50%]: CrossEntropyWithSoftmax = 0.14587002 * 250; EvalClassificationError = 0.07600000 * 250; time = 0.4391s; samplesPerSecond = 569.4
MPI Rank 0: 12/12/2017 15:02:56:  Epoch[ 3 of 4]-Minibatch[  31-  40, 10.00%]: CrossEntropyWithSoftmax = 0.15938467 * 250; EvalClassificationError = 0.06400000 * 250; time = 0.3009s; samplesPerSecond = 830.7
MPI Rank 0: 12/12/2017 15:02:57:  Epoch[ 3 of 4]-Minibatch[  41-  50, 12.50%]: CrossEntropyWithSoftmax = 0.17100049 * 250; EvalClassificationError = 0.10000000 * 250; time = 0.5174s; samplesPerSecond = 483.2
MPI Rank 0: 12/12/2017 15:02:57:  Epoch[ 3 of 4]-Minibatch[  51-  60, 15.00%]: CrossEntropyWithSoftmax = 0.18281055 * 250; EvalClassificationError = 0.08000000 * 250; time = 0.3152s; samplesPerSecond = 793.2
MPI Rank 0: 12/12/2017 15:02:58:  Epoch[ 3 of 4]-Minibatch[  61-  70, 17.50%]: CrossEntropyWithSoftmax = 0.14781537 * 250; EvalClassificationError = 0.07200000 * 250; time = 0.2894s; samplesPerSecond = 863.9
MPI Rank 0: 12/12/2017 15:02:58:  Epoch[ 3 of 4]-Minibatch[  71-  80, 20.00%]: CrossEntropyWithSoftmax = 0.18045490 * 250; EvalClassificationError = 0.09600000 * 250; time = 0.3744s; samplesPerSecond = 667.7
MPI Rank 0: 12/12/2017 15:02:58:  Epoch[ 3 of 4]-Minibatch[  81-  90, 22.50%]: CrossEntropyWithSoftmax = 0.15847199 * 250; EvalClassificationError = 0.07600000 * 250; time = 0.5491s; samplesPerSecond = 455.3
MPI Rank 0: 12/12/2017 15:02:59:  Epoch[ 3 of 4]-Minibatch[  91- 100, 25.00%]: CrossEntropyWithSoftmax = 0.14513057 * 250; EvalClassificationError = 0.07200000 * 250; time = 0.3007s; samplesPerSecond = 831.4
MPI Rank 0: 12/12/2017 15:02:59:  Epoch[ 3 of 4]-Minibatch[ 101- 110, 27.50%]: CrossEntropyWithSoftmax = 0.13519578 * 250; EvalClassificationError = 0.05200000 * 250; time = 0.4109s; samplesPerSecond = 608.5
MPI Rank 0: 12/12/2017 15:02:59:  Epoch[ 3 of 4]-Minibatch[ 111- 120, 30.00%]: CrossEntropyWithSoftmax = 0.13723644 * 250; EvalClassificationError = 0.06000000 * 250; time = 0.3436s; samplesPerSecond = 727.6
MPI Rank 0: 12/12/2017 15:03:00:  Epoch[ 3 of 4]-Minibatch[ 121- 130, 32.50%]: CrossEntropyWithSoftmax = 0.11692067 * 250; EvalClassificationError = 0.05600000 * 250; time = 0.3306s; samplesPerSecond = 756.2
MPI Rank 0: 12/12/2017 15:03:00:  Epoch[ 3 of 4]-Minibatch[ 131- 140, 35.00%]: CrossEntropyWithSoftmax = 0.16729043 * 250; EvalClassificationError = 0.08800000 * 250; time = 0.2498s; samplesPerSecond = 1001.0
MPI Rank 0: 12/12/2017 15:03:01:  Epoch[ 3 of 4]-Minibatch[ 141- 150, 37.50%]: CrossEntropyWithSoftmax = 0.12836481 * 250; EvalClassificationError = 0.04800000 * 250; time = 0.6240s; samplesPerSecond = 400.6
MPI Rank 0: 12/12/2017 15:03:01:  Epoch[ 3 of 4]-Minibatch[ 151- 160, 40.00%]: CrossEntropyWithSoftmax = 0.17320383 * 250; EvalClassificationError = 0.08000000 * 250; time = 0.1880s; samplesPerSecond = 1329.8
MPI Rank 0: 12/12/2017 15:03:01:  Epoch[ 3 of 4]-Minibatch[ 161- 170, 42.50%]: CrossEntropyWithSoftmax = 0.17634559 * 250; EvalClassificationError = 0.09600000 * 250; time = 0.3532s; samplesPerSecond = 707.9
MPI Rank 0: 12/12/2017 15:03:02:  Epoch[ 3 of 4]-Minibatch[ 171- 180, 45.00%]: CrossEntropyWithSoftmax = 0.14124514 * 250; EvalClassificationError = 0.06400000 * 250; time = 0.3226s; samplesPerSecond = 775.0
MPI Rank 0: 12/12/2017 15:03:02:  Epoch[ 3 of 4]-Minibatch[ 181- 190, 47.50%]: CrossEntropyWithSoftmax = 0.19167718 * 250; EvalClassificationError = 0.10000000 * 250; time = 0.2679s; samplesPerSecond = 933.1
MPI Rank 0: 12/12/2017 15:03:02:  Epoch[ 3 of 4]-Minibatch[ 191- 200, 50.00%]: CrossEntropyWithSoftmax = 0.20913003 * 250; EvalClassificationError = 0.10000000 * 250; time = 0.2599s; samplesPerSecond = 961.8
MPI Rank 0: 12/12/2017 15:03:02:  Epoch[ 3 of 4]-Minibatch[ 201- 210, 52.50%]: CrossEntropyWithSoftmax = 0.18460750 * 250; EvalClassificationError = 0.08000000 * 250; time = 0.2519s; samplesPerSecond = 992.3
MPI Rank 0: 12/12/2017 15:03:03:  Epoch[ 3 of 4]-Minibatch[ 211- 220, 55.00%]: CrossEntropyWithSoftmax = 0.18188216 * 250; EvalClassificationError = 0.07600000 * 250; time = 0.5039s; samplesPerSecond = 496.1
MPI Rank 0: 12/12/2017 15:03:03:  Epoch[ 3 of 4]-Minibatch[ 221- 230, 57.50%]: CrossEntropyWithSoftmax = 0.14069101 * 250; EvalClassificationError = 0.06000000 * 250; time = 0.2919s; samplesPerSecond = 856.4
MPI Rank 0: 12/12/2017 15:03:03:  Epoch[ 3 of 4]-Minibatch[ 231- 240, 60.00%]: CrossEntropyWithSoftmax = 0.14812247 * 250; EvalClassificationError = 0.07600000 * 250; time = 0.3239s; samplesPerSecond = 771.7
MPI Rank 0: 12/12/2017 15:03:04:  Epoch[ 3 of 4]-Minibatch[ 241- 250, 62.50%]: CrossEntropyWithSoftmax = 0.20274092 * 250; EvalClassificationError = 0.11600000 * 250; time = 0.2239s; samplesPerSecond = 1116.4
MPI Rank 0: 12/12/2017 15:03:04:  Epoch[ 3 of 4]-Minibatch[ 251- 260, 65.00%]: CrossEntropyWithSoftmax = 0.12887866 * 250; EvalClassificationError = 0.07200000 * 250; time = 0.2159s; samplesPerSecond = 1157.8
MPI Rank 0: 12/12/2017 15:03:04:  Epoch[ 3 of 4]-Minibatch[ 261- 270, 67.50%]: CrossEntropyWithSoftmax = 0.18595256 * 250; EvalClassificationError = 0.11600000 * 250; time = 0.2076s; samplesPerSecond = 1204.3
MPI Rank 0: 12/12/2017 15:03:04:  Epoch[ 3 of 4]-Minibatch[ 271- 280, 70.00%]: CrossEntropyWithSoftmax = 0.19565326 * 250; EvalClassificationError = 0.08800000 * 250; time = 0.2563s; samplesPerSecond = 975.5
MPI Rank 0: 12/12/2017 15:03:05:  Epoch[ 3 of 4]-Minibatch[ 281- 290, 72.50%]: CrossEntropyWithSoftmax = 0.16678525 * 250; EvalClassificationError = 0.06800000 * 250; time = 0.4119s; samplesPerSecond = 606.9
MPI Rank 0: 12/12/2017 15:03:05:  Epoch[ 3 of 4]-Minibatch[ 291- 300, 75.00%]: CrossEntropyWithSoftmax = 0.12552459 * 250; EvalClassificationError = 0.04800000 * 250; time = 0.1988s; samplesPerSecond = 1257.3
MPI Rank 0: 12/12/2017 15:03:05:  Epoch[ 3 of 4]-Minibatch[ 301- 310, 77.50%]: CrossEntropyWithSoftmax = 0.17414175 * 250; EvalClassificationError = 0.08800000 * 250; time = 0.1650s; samplesPerSecond = 1514.8
MPI Rank 0: 12/12/2017 15:03:05:  Epoch[ 3 of 4]-Minibatch[ 311- 320, 80.00%]: CrossEntropyWithSoftmax = 0.12295855 * 250; EvalClassificationError = 0.05200000 * 250; time = 0.1559s; samplesPerSecond = 1603.7
MPI Rank 0: 12/12/2017 15:03:06:  Epoch[ 3 of 4]-Minibatch[ 321- 330, 82.50%]: CrossEntropyWithSoftmax = 0.14757012 * 250; EvalClassificationError = 0.06000000 * 250; time = 0.2159s; samplesPerSecond = 1157.8
MPI Rank 0: 12/12/2017 15:03:06:  Epoch[ 3 of 4]-Minibatch[ 331- 340, 85.00%]: CrossEntropyWithSoftmax = 0.19785856 * 250; EvalClassificationError = 0.09600000 * 250; time = 0.1919s; samplesPerSecond = 1302.6
MPI Rank 0: 12/12/2017 15:03:06:  Epoch[ 3 of 4]-Minibatch[ 341- 350, 87.50%]: CrossEntropyWithSoftmax = 0.12600285 * 250; EvalClassificationError = 0.05200000 * 250; time = 0.2080s; samplesPerSecond = 1202.2
MPI Rank 0: 12/12/2017 15:03:06:  Epoch[ 3 of 4]-Minibatch[ 351- 360, 90.00%]: CrossEntropyWithSoftmax = 0.13742899 * 250; EvalClassificationError = 0.05600000 * 250; time = 0.2239s; samplesPerSecond = 1116.5
MPI Rank 0: 12/12/2017 15:03:06:  Epoch[ 3 of 4]-Minibatch[ 361- 370, 92.50%]: CrossEntropyWithSoftmax = 0.12847649 * 250; EvalClassificationError = 0.06000000 * 250; time = 0.1519s; samplesPerSecond = 1645.3
MPI Rank 0: 12/12/2017 15:03:06:  Epoch[ 3 of 4]-Minibatch[ 371- 380, 95.00%]: CrossEntropyWithSoftmax = 0.16652416 * 250; EvalClassificationError = 0.09600000 * 250; time = 0.2039s; samplesPerSecond = 1225.9
MPI Rank 0: 12/12/2017 15:03:07:  Epoch[ 3 of 4]-Minibatch[ 381- 390, 97.50%]: CrossEntropyWithSoftmax = 0.20675721 * 250; EvalClassificationError = 0.11200000 * 250; time = 0.2239s; samplesPerSecond = 1116.5
MPI Rank 0: 12/12/2017 15:03:07:  Epoch[ 3 of 4]-Minibatch[ 391- 400, 100.00%]: CrossEntropyWithSoftmax = 0.14562268 * 250; EvalClassificationError = 0.06400000 * 250; time = 0.4239s; samplesPerSecond = 589.7
MPI Rank 0: 12/12/2017 15:03:07: Finished Epoch[ 3 of 4]: [Training] CrossEntropyWithSoftmax = 0.15965044 * 10000; EvalClassificationError = 0.07650000 * 10000; totalSamplesSeen = 30000; learningRatePerSample = 0.0080000004; epochTime=12.2591s
MPI Rank 0: 12/12/2017 15:03:07: SGD: Saving checkpoint model '/tmp/cntk-test-20171211223423.932710/ParallelTraining/NoQuantization_DoublePrecision@release_cpu/models/Simple.dnn.3'
MPI Rank 0: 
MPI Rank 0: 12/12/2017 15:03:07: Starting Epoch 4: learning rate per sample = 0.008000  effective momentum = 0.900000  momentum as time constant = 237.3 samples
MPI Rank 0: 
MPI Rank 0: 12/12/2017 15:03:07: Starting minibatch loop, DataParallelSGD training (myRank = 0, numNodes = 4, numGradientBits = 64), distributed reading is ENABLED.
MPI Rank 0: 12/12/2017 15:03:07:  Epoch[ 4 of 4]-Minibatch[   1-  10, 2.50%]: CrossEntropyWithSoftmax = 0.12392293 * 250; EvalClassificationError = 0.06000000 * 250; time = 0.2824s; samplesPerSecond = 885.3
MPI Rank 0: 12/12/2017 15:03:08:  Epoch[ 4 of 4]-Minibatch[  11-  20, 5.00%]: CrossEntropyWithSoftmax = 0.18033422 * 250; EvalClassificationError = 0.09600000 * 250; time = 0.2119s; samplesPerSecond = 1179.6
MPI Rank 0: 12/12/2017 15:03:08:  Epoch[ 4 of 4]-Minibatch[  21-  30, 7.50%]: CrossEntropyWithSoftmax = 0.14284000 * 250; EvalClassificationError = 0.07600000 * 250; time = 0.2680s; samplesPerSecond = 932.9
MPI Rank 0: 12/12/2017 15:03:08:  Epoch[ 4 of 4]-Minibatch[  31-  40, 10.00%]: CrossEntropyWithSoftmax = 0.15662491 * 250; EvalClassificationError = 0.06400000 * 250; time = 0.1801s; samplesPerSecond = 1388.0
MPI Rank 0: 12/12/2017 15:03:08:  Epoch[ 4 of 4]-Minibatch[  41-  50, 12.50%]: CrossEntropyWithSoftmax = 0.16985801 * 250; EvalClassificationError = 0.09600000 * 250; time = 0.2357s; samplesPerSecond = 1060.6
MPI Rank 0: 12/12/2017 15:03:09:  Epoch[ 4 of 4]-Minibatch[  51-  60, 15.00%]: CrossEntropyWithSoftmax = 0.18190608 * 250; EvalClassificationError = 0.08000000 * 250; time = 0.1799s; samplesPerSecond = 1389.6
MPI Rank 0: 12/12/2017 15:03:09:  Epoch[ 4 of 4]-Minibatch[  61-  70, 17.50%]: CrossEntropyWithSoftmax = 0.14495470 * 250; EvalClassificationError = 0.07200000 * 250; time = 0.2119s; samplesPerSecond = 1179.6
MPI Rank 0: 12/12/2017 15:03:09:  Epoch[ 4 of 4]-Minibatch[  71-  80, 20.00%]: CrossEntropyWithSoftmax = 0.18022154 * 250; EvalClassificationError = 0.09600000 * 250; time = 0.3119s; samplesPerSecond = 801.4
MPI Rank 0: 12/12/2017 15:03:09:  Epoch[ 4 of 4]-Minibatch[  81-  90, 22.50%]: CrossEntropyWithSoftmax = 0.15852461 * 250; EvalClassificationError = 0.07200000 * 250; time = 0.2564s; samplesPerSecond = 975.2
MPI Rank 0: 12/12/2017 15:03:10:  Epoch[ 4 of 4]-Minibatch[  91- 100, 25.00%]: CrossEntropyWithSoftmax = 0.14466589 * 250; EvalClassificationError = 0.07200000 * 250; time = 0.2346s; samplesPerSecond = 1065.7
MPI Rank 0: 12/12/2017 15:03:10:  Epoch[ 4 of 4]-Minibatch[ 101- 110, 27.50%]: CrossEntropyWithSoftmax = 0.13346404 * 250; EvalClassificationError = 0.05200000 * 250; time = 0.2649s; samplesPerSecond = 943.8
MPI Rank 0: 12/12/2017 15:03:10:  Epoch[ 4 of 4]-Minibatch[ 111- 120, 30.00%]: CrossEntropyWithSoftmax = 0.13683061 * 250; EvalClassificationError = 0.06400000 * 250; time = 0.1999s; samplesPerSecond = 1250.7
MPI Rank 0: 12/12/2017 15:03:10:  Epoch[ 4 of 4]-Minibatch[ 121- 130, 32.50%]: CrossEntropyWithSoftmax = 0.11589011 * 250; EvalClassificationError = 0.05600000 * 250; time = 0.2055s; samplesPerSecond = 1216.7
MPI Rank 0: 12/12/2017 15:03:10:  Epoch[ 4 of 4]-Minibatch[ 131- 140, 35.00%]: CrossEntropyWithSoftmax = 0.16881193 * 250; EvalClassificationError = 0.08800000 * 250; time = 0.1639s; samplesPerSecond = 1525.4
MPI Rank 0: 12/12/2017 15:03:11:  Epoch[ 4 of 4]-Minibatch[ 141- 150, 37.50%]: CrossEntropyWithSoftmax = 0.12736965 * 250; EvalClassificationError = 0.04800000 * 250; time = 0.2144s; samplesPerSecond = 1165.8
MPI Rank 0: 12/12/2017 15:03:11:  Epoch[ 4 of 4]-Minibatch[ 151- 160, 40.00%]: CrossEntropyWithSoftmax = 0.17123603 * 250; EvalClassificationError = 0.08400000 * 250; time = 0.1762s; samplesPerSecond = 1418.8
MPI Rank 0: 12/12/2017 15:03:11:  Epoch[ 4 of 4]-Minibatch[ 161- 170, 42.50%]: CrossEntropyWithSoftmax = 0.17706403 * 250; EvalClassificationError = 0.10000000 * 250; time = 0.1556s; samplesPerSecond = 1606.3
MPI Rank 0: 12/12/2017 15:03:11:  Epoch[ 4 of 4]-Minibatch[ 171- 180, 45.00%]: CrossEntropyWithSoftmax = 0.14104103 * 250; EvalClassificationError = 0.06400000 * 250; time = 0.4279s; samplesPerSecond = 584.2
MPI Rank 0: 12/12/2017 15:03:12:  Epoch[ 4 of 4]-Minibatch[ 181- 190, 47.50%]: CrossEntropyWithSoftmax = 0.19313360 * 250; EvalClassificationError = 0.10000000 * 250; time = 0.2919s; samplesPerSecond = 856.4
MPI Rank 0: 12/12/2017 15:03:12:  Epoch[ 4 of 4]-Minibatch[ 191- 200, 50.00%]: CrossEntropyWithSoftmax = 0.20870745 * 250; EvalClassificationError = 0.10000000 * 250; time = 0.1716s; samplesPerSecond = 1457.1
MPI Rank 0: 12/12/2017 15:03:12:  Epoch[ 4 of 4]-Minibatch[ 201- 210, 52.50%]: CrossEntropyWithSoftmax = 0.18510294 * 250; EvalClassificationError = 0.08000000 * 250; time = 0.2563s; samplesPerSecond = 975.5
MPI Rank 0: 12/12/2017 15:03:12:  Epoch[ 4 of 4]-Minibatch[ 211- 220, 55.00%]: CrossEntropyWithSoftmax = 0.18167137 * 250; EvalClassificationError = 0.07600000 * 250; time = 0.2479s; samplesPerSecond = 1008.4
MPI Rank 0: 12/12/2017 15:03:13:  Epoch[ 4 of 4]-Minibatch[ 221- 230, 57.50%]: CrossEntropyWithSoftmax = 0.14026276 * 250; EvalClassificationError = 0.05600000 * 250; time = 0.2080s; samplesPerSecond = 1202.1
MPI Rank 0: 12/12/2017 15:03:13:  Epoch[ 4 of 4]-Minibatch[ 231- 240, 60.00%]: CrossEntropyWithSoftmax = 0.14811532 * 250; EvalClassificationError = 0.07600000 * 250; time = 0.2359s; samplesPerSecond = 1059.8
MPI Rank 0: 12/12/2017 15:03:13:  Epoch[ 4 of 4]-Minibatch[ 241- 250, 62.50%]: CrossEntropyWithSoftmax = 0.20368129 * 250; EvalClassificationError = 0.11600000 * 250; time = 0.2559s; samplesPerSecond = 976.8
MPI Rank 0: 12/12/2017 15:03:13:  Epoch[ 4 of 4]-Minibatch[ 251- 260, 65.00%]: CrossEntropyWithSoftmax = 0.12819272 * 250; EvalClassificationError = 0.07200000 * 250; time = 0.1879s; samplesPerSecond = 1330.5
MPI Rank 0: 12/12/2017 15:03:14:  Epoch[ 4 of 4]-Minibatch[ 261- 270, 67.50%]: CrossEntropyWithSoftmax = 0.18632901 * 250; EvalClassificationError = 0.11600000 * 250; time = 0.3440s; samplesPerSecond = 726.8
MPI Rank 0: 12/12/2017 15:03:14:  Epoch[ 4 of 4]-Minibatch[ 271- 280, 70.00%]: CrossEntropyWithSoftmax = 0.19568750 * 250; EvalClassificationError = 0.08800000 * 250; time = 0.2879s; samplesPerSecond = 868.4
MPI Rank 0: 12/12/2017 15:03:14:  Epoch[ 4 of 4]-Minibatch[ 281- 290, 72.50%]: CrossEntropyWithSoftmax = 0.16449544 * 250; EvalClassificationError = 0.06800000 * 250; time = 0.1959s; samplesPerSecond = 1276.1
MPI Rank 0: 12/12/2017 15:03:14:  Epoch[ 4 of 4]-Minibatch[ 291- 300, 75.00%]: CrossEntropyWithSoftmax = 0.12454886 * 250; EvalClassificationError = 0.04400000 * 250; time = 0.2399s; samplesPerSecond = 1042.0
MPI Rank 0: 12/12/2017 15:03:14:  Epoch[ 4 of 4]-Minibatch[ 301- 310, 77.50%]: CrossEntropyWithSoftmax = 0.17307192 * 250; EvalClassificationError = 0.08400000 * 250; time = 0.2239s; samplesPerSecond = 1116.4
MPI Rank 0: 12/12/2017 15:03:15:  Epoch[ 4 of 4]-Minibatch[ 311- 320, 80.00%]: CrossEntropyWithSoftmax = 0.12249522 * 250; EvalClassificationError = 0.05200000 * 250; time = 0.2159s; samplesPerSecond = 1157.8
MPI Rank 0: 12/12/2017 15:03:15:  Epoch[ 4 of 4]-Minibatch[ 321- 330, 82.50%]: CrossEntropyWithSoftmax = 0.14709682 * 250; EvalClassificationError = 0.06000000 * 250; time = 0.1884s; samplesPerSecond = 1327.2
MPI Rank 0: 12/12/2017 15:03:15:  Epoch[ 4 of 4]-Minibatch[ 331- 340, 85.00%]: CrossEntropyWithSoftmax = 0.19789048 * 250; EvalClassificationError = 0.09200000 * 250; time = 0.2438s; samplesPerSecond = 1025.6
MPI Rank 0: 12/12/2017 15:03:15:  Epoch[ 4 of 4]-Minibatch[ 341- 350, 87.50%]: CrossEntropyWithSoftmax = 0.12572171 * 250; EvalClassificationError = 0.05200000 * 250; time = 0.1837s; samplesPerSecond = 1361.2
MPI Rank 0: 12/12/2017 15:03:16:  Epoch[ 4 of 4]-Minibatch[ 351- 360, 90.00%]: CrossEntropyWithSoftmax = 0.13732392 * 250; EvalClassificationError = 0.05600000 * 250; time = 0.3239s; samplesPerSecond = 771.7
MPI Rank 0: 12/12/2017 15:03:16:  Epoch[ 4 of 4]-Minibatch[ 361- 370, 92.50%]: CrossEntropyWithSoftmax = 0.12857569 * 250; EvalClassificationError = 0.06000000 * 250; time = 0.3123s; samplesPerSecond = 800.6
MPI Rank 0: 12/12/2017 15:03:16:  Epoch[ 4 of 4]-Minibatch[ 371- 380, 95.00%]: CrossEntropyWithSoftmax = 0.16653116 * 250; EvalClassificationError = 0.09600000 * 250; time = 0.1836s; samplesPerSecond = 1361.8
MPI Rank 0: 12/12/2017 15:03:16:  Epoch[ 4 of 4]-Minibatch[ 381- 390, 97.50%]: CrossEntropyWithSoftmax = 0.20715348 * 250; EvalClassificationError = 0.11600000 * 250; time = 0.2799s; samplesPerSecond = 893.0
MPI Rank 0: 12/12/2017 15:03:17:  Epoch[ 4 of 4]-Minibatch[ 391- 400, 100.00%]: CrossEntropyWithSoftmax = 0.14571730 * 250; EvalClassificationError = 0.06400000 * 250; time = 0.2621s; samplesPerSecond = 953.8
MPI Rank 0: 12/12/2017 15:03:17: Finished Epoch[ 4 of 4]: [Training] CrossEntropyWithSoftmax = 0.15917666 * 10000; EvalClassificationError = 0.07660000 * 10000; totalSamplesSeen = 40000; learningRatePerSample = 0.0080000004; epochTime=9.53105s
MPI Rank 0: 12/12/2017 15:03:17: SGD: Saving checkpoint model '/tmp/cntk-test-20171211223423.932710/ParallelTraining/NoQuantization_DoublePrecision@release_cpu/models/Simple.dnn'
MPI Rank 0: 
MPI Rank 0: 12/12/2017 15:03:17: Action "train" complete.
MPI Rank 0: 
MPI Rank 0: 12/12/2017 15:03:17: __COMPLETED__
MPI Rank 1: CNTK 2.3.1+ (HEAD f4f0f8, Dec 11 2017 18:34:12) at 2017/12/12 15:02:30
MPI Rank 1: 
MPI Rank 1: /home/ubuntu/workspace/build/gpu/release/bin/cntk  configFile=/home/ubuntu/workspace/Tests/EndToEndTests/ParallelTraining/NoQuantization/DoublePrecision/../../SimpleMultiGPU.cntk  currentDirectory=/home/ubuntu/workspace/Tests/EndToEndTests/ParallelTraining/Data  RunDir=/tmp/cntk-test-20171211223423.932710/ParallelTraining/NoQuantization_DoublePrecision@release_cpu  DataDir=/home/ubuntu/workspace/Tests/EndToEndTests/ParallelTraining/Data  ConfigDir=/home/ubuntu/workspace/Tests/EndToEndTests/ParallelTraining/NoQuantization/DoublePrecision/../..  OutputDir=/tmp/cntk-test-20171211223423.932710/ParallelTraining/NoQuantization_DoublePrecision@release_cpu  DeviceId=-1  timestamping=true  numCPUThreads=3  precision=double  SimpleMultiGPU=[SGD=[ParallelTrain=[DataParallelSGD=[gradientBits=64]]]]  stderr=/tmp/cntk-test-20171211223423.932710/ParallelTraining/NoQuantization_DoublePrecision@release_cpu/stderr
MPI Rank 1: 12/12/2017 15:02:31: -------------------------------------------------------------------
MPI Rank 1: 12/12/2017 15:02:31: Build info: 
MPI Rank 1: 
MPI Rank 1: 12/12/2017 15:02:31: 		Built time: Dec 11 2017 18:28:39
MPI Rank 1: 12/12/2017 15:02:31: 		Last modified date: Wed Nov 15 09:27:10 2017
MPI Rank 1: 12/12/2017 15:02:31: 		Build type: release
MPI Rank 1: 12/12/2017 15:02:31: 		Build target: GPU
MPI Rank 1: 12/12/2017 15:02:31: 		With ASGD: yes
MPI Rank 1: 12/12/2017 15:02:31: 		Math lib: mkl
MPI Rank 1: 12/12/2017 15:02:31: 		CUDA version: 9.0.0
MPI Rank 1: 12/12/2017 15:02:31: 		CUDNN version: 7.0.4
MPI Rank 1: 12/12/2017 15:02:31: 		Build Branch: HEAD
MPI Rank 1: 12/12/2017 15:02:31: 		Build SHA1: f4f0f82eabcc482dbd03af1f946a44ae2b8b97bf
MPI Rank 1: 12/12/2017 15:02:31: 		MPI distribution: Open MPI
MPI Rank 1: 12/12/2017 15:02:31: 		MPI version: 1.10.7
MPI Rank 1: 12/12/2017 15:02:31: -------------------------------------------------------------------
MPI Rank 1: 12/12/2017 15:02:31: -------------------------------------------------------------------
MPI Rank 1: 12/12/2017 15:02:31: GPU info:
MPI Rank 1: 
MPI Rank 1: 12/12/2017 15:02:31: 		Device[0]: cores = 3072; computeCapability = 5.2; type = "Tesla M60"; total memory = 8123 MB; free memory = 8029 MB
MPI Rank 1: 12/12/2017 15:02:31: -------------------------------------------------------------------
MPI Rank 1: 12/12/2017 15:02:31: Using 3 CPU threads.
MPI Rank 1: 
MPI Rank 1: 12/12/2017 15:02:31: ##############################################################################
MPI Rank 1: 12/12/2017 15:02:31: #                                                                            #
MPI Rank 1: 12/12/2017 15:02:31: # SimpleMultiGPU command (train action)                                      #
MPI Rank 1: 12/12/2017 15:02:31: #                                                                            #
MPI Rank 1: 12/12/2017 15:02:31: ##############################################################################
MPI Rank 1: 
MPI Rank 1: 12/12/2017 15:02:31: 
MPI Rank 1: Creating virgin network.
MPI Rank 1: SimpleNetworkBuilder Using CPU
MPI Rank 1: 12/12/2017 15:02:31: 
MPI Rank 1: Model has 25 nodes. Using CPU.
MPI Rank 1: 
MPI Rank 1: 12/12/2017 15:02:31: Training criterion:   CrossEntropyWithSoftmax = CrossEntropyWithSoftmax
MPI Rank 1: 12/12/2017 15:02:31: Evaluation criterion: EvalClassificationError = ClassificationError
MPI Rank 1: 
MPI Rank 1: 
MPI Rank 1: Allocating matrices for forward and/or backward propagation.
MPI Rank 1: 
MPI Rank 1: Gradient Memory Aliasing: 4 are aliased.
MPI Rank 1: 	W2*H1 (gradient) reuses HLast (gradient)
MPI Rank 1: 	W1*H1 (gradient) reuses W1*H1+B1 (gradient)
MPI Rank 1: 
MPI Rank 1: Memory Sharing: Out of 40 matrices, 21 are shared as 5, and 19 are not shared.
MPI Rank 1: 
MPI Rank 1: Here are the ones that share memory:
MPI Rank 1: 	{ PosteriorProb : [2 x 1 x *]
MPI Rank 1: 	  ScaledLogLikelihood : [2 x 1 x *] }
MPI Rank 1: 	{ HLast : [2 x 1 x *] (gradient)
MPI Rank 1: 	  W0 : [50 x 2] (gradient)
MPI Rank 1: 	  W0*features+B0 : [50 x 1 x *] (gradient)
MPI Rank 1: 	  W1*H1 : [50 x 1 x *] (gradient)
MPI Rank 1: 	  W1*H1+B1 : [50 x 1 x *]
MPI Rank 1: 	  W1*H1+B1 : [50 x 1 x *] (gradient)
MPI Rank 1: 	  W2*H1 : [2 x 1 x *]
MPI Rank 1: 	  W2*H1 : [2 x 1 x *] (gradient) }
MPI Rank 1: 	{ B0 : [50 x 1] (gradient)
MPI Rank 1: 	  H1 : [50 x 1 x *] }
MPI Rank 1: 	{ H2 : [50 x 1 x *]
MPI Rank 1: 	  W0*features+B0 : [50 x 1 x *]
MPI Rank 1: 	  W1 : [50 x 50] (gradient)
MPI Rank 1: 	  W1*H1 : [50 x 1 x *] }
MPI Rank 1: 	{ H1 : [50 x 1 x *] (gradient)
MPI Rank 1: 	  H2 : [50 x 1 x *] (gradient)
MPI Rank 1: 	  HLast : [2 x 1 x *]
MPI Rank 1: 	  W0*features : [50 x *]
MPI Rank 1: 	  W0*features : [50 x *] (gradient) }
MPI Rank 1: 
MPI Rank 1: Here are the ones that don't share memory:
MPI Rank 1: 	{MeanOfFeatures : [2]}
MPI Rank 1: 	{features : [2 x *]}
MPI Rank 1: 	{InvStdOfFeatures : [2]}
MPI Rank 1: 	{W0 : [50 x 2]}
MPI Rank 1: 	{B0 : [50 x 1]}
MPI Rank 1: 	{W1 : [50 x 50]}
MPI Rank 1: 	{B1 : [50 x 1]}
MPI Rank 1: 	{W2 : [2 x 50]}
MPI Rank 1: 	{B2 : [2 x 1]}
MPI Rank 1: 	{labels : [2 x *]}
MPI Rank 1: 	{Prior : [2]}
MPI Rank 1: 	{EvalClassificationError : [1]}
MPI Rank 1: 	{CrossEntropyWithSoftmax : [1]}
MPI Rank 1: 	{LogOfPrior : [2]}
MPI Rank 1: 	{CrossEntropyWithSoftmax : [1] (gradient)}
MPI Rank 1: 	{MVNormalizedFeatures : [2 x *]}
MPI Rank 1: 	{B2 : [2 x 1] (gradient)}
MPI Rank 1: 	{B1 : [50 x 1] (gradient)}
MPI Rank 1: 	{W2 : [2 x 50] (gradient)}
MPI Rank 1: 
MPI Rank 1: 
MPI Rank 1: 12/12/2017 15:02:31: Training 2802 parameters in 6 out of 6 parameter tensors and 15 nodes with gradient:
MPI Rank 1: 
MPI Rank 1: 12/12/2017 15:02:31: 	Node 'B0' (LearnableParameter operation) : [50 x 1]
MPI Rank 1: 12/12/2017 15:02:31: 	Node 'B1' (LearnableParameter operation) : [50 x 1]
MPI Rank 1: 12/12/2017 15:02:31: 	Node 'B2' (LearnableParameter operation) : [2 x 1]
MPI Rank 1: 12/12/2017 15:02:31: 	Node 'W0' (LearnableParameter operation) : [50 x 2]
MPI Rank 1: 12/12/2017 15:02:31: 	Node 'W1' (LearnableParameter operation) : [50 x 50]
MPI Rank 1: 12/12/2017 15:02:31: 	Node 'W2' (LearnableParameter operation) : [2 x 50]
MPI Rank 1: 
MPI Rank 1: Initializing dataParallelSGD with FP64 aggregation.
MPI Rank 1: NcclComm: disabled, at least one rank using CPU device
MPI Rank 1: 
MPI Rank 1: 12/12/2017 15:02:32: Precomputing --> 3 PreCompute nodes found.
MPI Rank 1: 
MPI Rank 1: 12/12/2017 15:02:32: 	MeanOfFeatures = Mean()
MPI Rank 1: 12/12/2017 15:02:32: 	InvStdOfFeatures = InvStdDev()
MPI Rank 1: 12/12/2017 15:02:32: 	Prior = Mean()
MPI Rank 1: 
MPI Rank 1: 12/12/2017 15:02:33: Precomputing --> Completed.
MPI Rank 1: 
MPI Rank 1: 
MPI Rank 1: 12/12/2017 15:02:33: Starting Epoch 1: learning rate per sample = 0.020000  effective momentum = 0.900000  momentum as time constant = 237.3 samples
MPI Rank 1: 
MPI Rank 1: 12/12/2017 15:02:33: Starting minibatch loop, DataParallelSGD training (myRank = 1, numNodes = 4, numGradientBits = 64), distributed reading is ENABLED.
MPI Rank 1: 12/12/2017 15:02:33:  Epoch[ 1 of 4]-Minibatch[   1-  10]: CrossEntropyWithSoftmax = 0.69973268 * 250; EvalClassificationError = 0.50400000 * 250; time = 0.4017s; samplesPerSecond = 622.3
MPI Rank 1: 12/12/2017 15:02:34:  Epoch[ 1 of 4]-Minibatch[  11-  20]: CrossEntropyWithSoftmax = 0.71436905 * 250; EvalClassificationError = 0.52000000 * 250; time = 0.2089s; samplesPerSecond = 1196.9
MPI Rank 1: 12/12/2017 15:02:34:  Epoch[ 1 of 4]-Minibatch[  21-  30]: CrossEntropyWithSoftmax = 0.72871054 * 250; EvalClassificationError = 0.47600000 * 250; time = 0.1887s; samplesPerSecond = 1324.6
MPI Rank 1: 12/12/2017 15:02:34:  Epoch[ 1 of 4]-Minibatch[  31-  40]: CrossEntropyWithSoftmax = 0.70038993 * 250; EvalClassificationError = 0.52400000 * 250; time = 0.2000s; samplesPerSecond = 1250.2
MPI Rank 1: 12/12/2017 15:02:34:  Epoch[ 1 of 4]-Minibatch[  41-  50]: CrossEntropyWithSoftmax = 0.70593818 * 250; EvalClassificationError = 0.54000000 * 250; time = 0.4101s; samplesPerSecond = 609.5
MPI Rank 1: 12/12/2017 15:02:35:  Epoch[ 1 of 4]-Minibatch[  51-  60]: CrossEntropyWithSoftmax = 0.71604646 * 250; EvalClassificationError = 0.47600000 * 250; time = 0.2325s; samplesPerSecond = 1075.4
MPI Rank 1: 12/12/2017 15:02:35:  Epoch[ 1 of 4]-Minibatch[  61-  70]: CrossEntropyWithSoftmax = 0.72247949 * 250; EvalClassificationError = 0.48000000 * 250; time = 0.2572s; samplesPerSecond = 972.1
MPI Rank 1: 12/12/2017 15:02:35:  Epoch[ 1 of 4]-Minibatch[  71-  80]: CrossEntropyWithSoftmax = 0.79884413 * 250; EvalClassificationError = 0.47600000 * 250; time = 0.1999s; samplesPerSecond = 1250.3
MPI Rank 1: 12/12/2017 15:02:35:  Epoch[ 1 of 4]-Minibatch[  81-  90]: CrossEntropyWithSoftmax = 0.69622447 * 250; EvalClassificationError = 0.46800000 * 250; time = 0.1195s; samplesPerSecond = 2092.2
MPI Rank 1: 12/12/2017 15:02:35:  Epoch[ 1 of 4]-Minibatch[  91- 100]: CrossEntropyWithSoftmax = 0.70749459 * 250; EvalClassificationError = 0.49200000 * 250; time = 0.2021s; samplesPerSecond = 1236.9
MPI Rank 1: 12/12/2017 15:02:36:  Epoch[ 1 of 4]-Minibatch[ 101- 110]: CrossEntropyWithSoftmax = 0.71485824 * 250; EvalClassificationError = 0.55200000 * 250; time = 0.2580s; samplesPerSecond = 968.9
MPI Rank 1: 12/12/2017 15:02:36:  Epoch[ 1 of 4]-Minibatch[ 111- 120]: CrossEntropyWithSoftmax = 0.69579152 * 250; EvalClassificationError = 0.43600000 * 250; time = 0.2281s; samplesPerSecond = 1095.8
MPI Rank 1: 12/12/2017 15:02:36:  Epoch[ 1 of 4]-Minibatch[ 121- 130]: CrossEntropyWithSoftmax = 0.70174138 * 250; EvalClassificationError = 0.44000000 * 250; time = 0.2199s; samplesPerSecond = 1137.0
MPI Rank 1: 12/12/2017 15:02:37:  Epoch[ 1 of 4]-Minibatch[ 131- 140]: CrossEntropyWithSoftmax = 0.71926586 * 250; EvalClassificationError = 0.54800000 * 250; time = 0.4399s; samplesPerSecond = 568.3
MPI Rank 1: 12/12/2017 15:02:37:  Epoch[ 1 of 4]-Minibatch[ 141- 150]: CrossEntropyWithSoftmax = 0.72009917 * 250; EvalClassificationError = 0.48800000 * 250; time = 0.2359s; samplesPerSecond = 1059.6
MPI Rank 1: 12/12/2017 15:02:37:  Epoch[ 1 of 4]-Minibatch[ 151- 160]: CrossEntropyWithSoftmax = 0.71854573 * 250; EvalClassificationError = 0.55200000 * 250; time = 0.1893s; samplesPerSecond = 1320.6
MPI Rank 1: 12/12/2017 15:02:37:  Epoch[ 1 of 4]-Minibatch[ 161- 170]: CrossEntropyWithSoftmax = 0.74083729 * 250; EvalClassificationError = 0.50000000 * 250; time = 0.2345s; samplesPerSecond = 1065.9
MPI Rank 1: 12/12/2017 15:02:37:  Epoch[ 1 of 4]-Minibatch[ 171- 180]: CrossEntropyWithSoftmax = 0.71762852 * 250; EvalClassificationError = 0.51600000 * 250; time = 0.1879s; samplesPerSecond = 1330.2
MPI Rank 1: 12/12/2017 15:02:38:  Epoch[ 1 of 4]-Minibatch[ 181- 190]: CrossEntropyWithSoftmax = 0.71530686 * 250; EvalClassificationError = 0.48400000 * 250; time = 0.1717s; samplesPerSecond = 1456.3
MPI Rank 1: 12/12/2017 15:02:38:  Epoch[ 1 of 4]-Minibatch[ 191- 200]: CrossEntropyWithSoftmax = 0.71768617 * 250; EvalClassificationError = 0.53200000 * 250; time = 0.1762s; samplesPerSecond = 1418.8
MPI Rank 1: 12/12/2017 15:02:38:  Epoch[ 1 of 4]-Minibatch[ 201- 210]: CrossEntropyWithSoftmax = 0.71515312 * 250; EvalClassificationError = 0.53600000 * 250; time = 0.1589s; samplesPerSecond = 1573.5
MPI Rank 1: 12/12/2017 15:02:38:  Epoch[ 1 of 4]-Minibatch[ 211- 220]: CrossEntropyWithSoftmax = 0.72047060 * 250; EvalClassificationError = 0.52400000 * 250; time = 0.2046s; samplesPerSecond = 1221.7
MPI Rank 1: 12/12/2017 15:02:38:  Epoch[ 1 of 4]-Minibatch[ 221- 230]: CrossEntropyWithSoftmax = 0.72033071 * 250; EvalClassificationError = 0.50800000 * 250; time = 0.1625s; samplesPerSecond = 1538.5
MPI Rank 1: 12/12/2017 15:02:39:  Epoch[ 1 of 4]-Minibatch[ 231- 240]: CrossEntropyWithSoftmax = 0.71295324 * 250; EvalClassificationError = 0.51200000 * 250; time = 0.4578s; samplesPerSecond = 546.1
MPI Rank 1: 12/12/2017 15:02:39:  Epoch[ 1 of 4]-Minibatch[ 241- 250]: CrossEntropyWithSoftmax = 0.69737817 * 250; EvalClassificationError = 0.53200000 * 250; time = 0.2519s; samplesPerSecond = 992.5
MPI Rank 1: 12/12/2017 15:02:39:  Epoch[ 1 of 4]-Minibatch[ 251- 260]: CrossEntropyWithSoftmax = 0.70251892 * 250; EvalClassificationError = 0.48800000 * 250; time = 0.3599s; samplesPerSecond = 694.5
MPI Rank 1: 12/12/2017 15:02:40:  Epoch[ 1 of 4]-Minibatch[ 261- 270]: CrossEntropyWithSoftmax = 0.70879704 * 250; EvalClassificationError = 0.54400000 * 250; time = 0.2228s; samplesPerSecond = 1122.0
MPI Rank 1: 12/12/2017 15:02:40:  Epoch[ 1 of 4]-Minibatch[ 271- 280]: CrossEntropyWithSoftmax = 0.69856459 * 250; EvalClassificationError = 0.52800000 * 250; time = 0.1851s; samplesPerSecond = 1350.3
MPI Rank 1: 12/12/2017 15:02:40:  Epoch[ 1 of 4]-Minibatch[ 281- 290]: CrossEntropyWithSoftmax = 0.69425907 * 250; EvalClassificationError = 0.44800000 * 250; time = 0.1920s; samplesPerSecond = 1302.2
MPI Rank 1: 12/12/2017 15:02:40:  Epoch[ 1 of 4]-Minibatch[ 291- 300]: CrossEntropyWithSoftmax = 0.69599736 * 250; EvalClassificationError = 0.49600000 * 250; time = 0.1798s; samplesPerSecond = 1390.4
MPI Rank 1: 12/12/2017 15:02:40:  Epoch[ 1 of 4]-Minibatch[ 301- 310]: CrossEntropyWithSoftmax = 0.69591177 * 250; EvalClassificationError = 0.54000000 * 250; time = 0.1667s; samplesPerSecond = 1499.7
MPI Rank 1: 12/12/2017 15:02:41:  Epoch[ 1 of 4]-Minibatch[ 311- 320]: CrossEntropyWithSoftmax = 0.69133098 * 250; EvalClassificationError = 0.40000000 * 250; time = 0.2613s; samplesPerSecond = 956.9
MPI Rank 1: 12/12/2017 15:02:41:  Epoch[ 1 of 4]-Minibatch[ 321- 330]: CrossEntropyWithSoftmax = 0.69822648 * 250; EvalClassificationError = 0.46800000 * 250; time = 0.4162s; samplesPerSecond = 600.6
MPI Rank 1: 12/12/2017 15:02:41:  Epoch[ 1 of 4]-Minibatch[ 331- 340]: CrossEntropyWithSoftmax = 0.71031539 * 250; EvalClassificationError = 0.50400000 * 250; time = 0.2555s; samplesPerSecond = 978.3
MPI Rank 1: 12/12/2017 15:02:41:  Epoch[ 1 of 4]-Minibatch[ 341- 350]: CrossEntropyWithSoftmax = 0.70097460 * 250; EvalClassificationError = 0.50000000 * 250; time = 0.2000s; samplesPerSecond = 1249.9
MPI Rank 1: 12/12/2017 15:02:42:  Epoch[ 1 of 4]-Minibatch[ 351- 360]: CrossEntropyWithSoftmax = 0.68927867 * 250; EvalClassificationError = 0.45200000 * 250; time = 0.1959s; samplesPerSecond = 1276.5
MPI Rank 1: 12/12/2017 15:02:42:  Epoch[ 1 of 4]-Minibatch[ 361- 370]: CrossEntropyWithSoftmax = 0.68908389 * 250; EvalClassificationError = 0.50000000 * 250; time = 0.1999s; samplesPerSecond = 1250.4
MPI Rank 1: 12/12/2017 15:02:42:  Epoch[ 1 of 4]-Minibatch[ 371- 380]: CrossEntropyWithSoftmax = 0.67796901 * 250; EvalClassificationError = 0.45600000 * 250; time = 0.1799s; samplesPerSecond = 1389.5
MPI Rank 1: 12/12/2017 15:02:42:  Epoch[ 1 of 4]-Minibatch[ 381- 390]: CrossEntropyWithSoftmax = 0.67863593 * 250; EvalClassificationError = 0.38400000 * 250; time = 0.2359s; samplesPerSecond = 1059.6
MPI Rank 1: 12/12/2017 15:02:43:  Epoch[ 1 of 4]-Minibatch[ 391- 400]: CrossEntropyWithSoftmax = 0.67150936 * 250; EvalClassificationError = 0.42800000 * 250; time = 0.2359s; samplesPerSecond = 1059.6
MPI Rank 1: 12/12/2017 15:02:43: Finished Epoch[ 1 of 4]: [Training] CrossEntropyWithSoftmax = 0.70804123 * 10000; EvalClassificationError = 0.49380000 * 10000; totalSamplesSeen = 10000; learningRatePerSample = 0.02; epochTime=9.49385s
MPI Rank 1: 
MPI Rank 1: 12/12/2017 15:02:43: Starting Epoch 2: learning rate per sample = 0.008000  effective momentum = 0.900000  momentum as time constant = 237.3 samples
MPI Rank 1: 
MPI Rank 1: 12/12/2017 15:02:43: Starting minibatch loop, DataParallelSGD training (myRank = 1, numNodes = 4, numGradientBits = 64), distributed reading is ENABLED.
MPI Rank 1: 12/12/2017 15:02:43:  Epoch[ 2 of 4]-Minibatch[   1-  10, 2.50%]: CrossEntropyWithSoftmax = 0.69566490 * 250; EvalClassificationError = 0.49600000 * 250; time = 0.1835s; samplesPerSecond = 1362.3
MPI Rank 1: 12/12/2017 15:02:43:  Epoch[ 2 of 4]-Minibatch[  11-  20, 5.00%]: CrossEntropyWithSoftmax = 0.64058119 * 250; EvalClassificationError = 0.22400000 * 250; time = 0.4203s; samplesPerSecond = 594.8
MPI Rank 1: 12/12/2017 15:02:43:  Epoch[ 2 of 4]-Minibatch[  21-  30, 7.50%]: CrossEntropyWithSoftmax = 0.62577202 * 250; EvalClassificationError = 0.30400000 * 250; time = 0.2916s; samplesPerSecond = 857.3
MPI Rank 1: 12/12/2017 15:02:44:  Epoch[ 2 of 4]-Minibatch[  31-  40, 10.00%]: CrossEntropyWithSoftmax = 0.62974783 * 250; EvalClassificationError = 0.34000000 * 250; time = 0.2650s; samplesPerSecond = 943.3
MPI Rank 1: 12/12/2017 15:02:44:  Epoch[ 2 of 4]-Minibatch[  41-  50, 12.50%]: CrossEntropyWithSoftmax = 0.60705897 * 250; EvalClassificationError = 0.22800000 * 250; time = 0.3468s; samplesPerSecond = 720.8
MPI Rank 1: 12/12/2017 15:02:44:  Epoch[ 2 of 4]-Minibatch[  51-  60, 15.00%]: CrossEntropyWithSoftmax = 0.59038668 * 250; EvalClassificationError = 0.18000000 * 250; time = 0.2051s; samplesPerSecond = 1219.2
MPI Rank 1: 12/12/2017 15:02:45:  Epoch[ 2 of 4]-Minibatch[  61-  70, 17.50%]: CrossEntropyWithSoftmax = 0.55033195 * 250; EvalClassificationError = 0.08800000 * 250; time = 0.2828s; samplesPerSecond = 884.0
MPI Rank 1: 12/12/2017 15:02:45:  Epoch[ 2 of 4]-Minibatch[  71-  80, 20.00%]: CrossEntropyWithSoftmax = 0.53624170 * 250; EvalClassificationError = 0.23200000 * 250; time = 0.2439s; samplesPerSecond = 1024.9
MPI Rank 1: 12/12/2017 15:02:45:  Epoch[ 2 of 4]-Minibatch[  81-  90, 22.50%]: CrossEntropyWithSoftmax = 0.48688308 * 250; EvalClassificationError = 0.12000000 * 250; time = 0.3279s; samplesPerSecond = 762.4
MPI Rank 1: 12/12/2017 15:02:45:  Epoch[ 2 of 4]-Minibatch[  91- 100, 25.00%]: CrossEntropyWithSoftmax = 0.43212926 * 250; EvalClassificationError = 0.08400000 * 250; time = 0.2891s; samplesPerSecond = 864.8
MPI Rank 1: 12/12/2017 15:02:46:  Epoch[ 2 of 4]-Minibatch[ 101- 110, 27.50%]: CrossEntropyWithSoftmax = 0.38559516 * 250; EvalClassificationError = 0.05600000 * 250; time = 0.2348s; samplesPerSecond = 1064.7
MPI Rank 1: 12/12/2017 15:02:46:  Epoch[ 2 of 4]-Minibatch[ 111- 120, 30.00%]: CrossEntropyWithSoftmax = 0.34249535 * 250; EvalClassificationError = 0.06800000 * 250; time = 0.3519s; samplesPerSecond = 710.3
MPI Rank 1: 12/12/2017 15:02:46:  Epoch[ 2 of 4]-Minibatch[ 121- 130, 32.50%]: CrossEntropyWithSoftmax = 0.28670698 * 250; EvalClassificationError = 0.06800000 * 250; time = 0.2919s; samplesPerSecond = 856.5
MPI Rank 1: 12/12/2017 15:02:47:  Epoch[ 2 of 4]-Minibatch[ 131- 140, 35.00%]: CrossEntropyWithSoftmax = 0.26990400 * 250; EvalClassificationError = 0.09200000 * 250; time = 0.2656s; samplesPerSecond = 941.4
MPI Rank 1: 12/12/2017 15:02:47:  Epoch[ 2 of 4]-Minibatch[ 141- 150, 37.50%]: CrossEntropyWithSoftmax = 0.23285507 * 250; EvalClassificationError = 0.06400000 * 250; time = 0.1986s; samplesPerSecond = 1258.8
MPI Rank 1: 12/12/2017 15:02:47:  Epoch[ 2 of 4]-Minibatch[ 151- 160, 40.00%]: CrossEntropyWithSoftmax = 0.25464189 * 250; EvalClassificationError = 0.08400000 * 250; time = 0.2734s; samplesPerSecond = 914.3
MPI Rank 1: 12/12/2017 15:02:47:  Epoch[ 2 of 4]-Minibatch[ 161- 170, 42.50%]: CrossEntropyWithSoftmax = 0.21253995 * 250; EvalClassificationError = 0.07200000 * 250; time = 0.3743s; samplesPerSecond = 667.9
MPI Rank 1: 12/12/2017 15:02:48:  Epoch[ 2 of 4]-Minibatch[ 171- 180, 45.00%]: CrossEntropyWithSoftmax = 0.18708213 * 250; EvalClassificationError = 0.06400000 * 250; time = 0.5278s; samplesPerSecond = 473.7
MPI Rank 1: 12/12/2017 15:02:48:  Epoch[ 2 of 4]-Minibatch[ 181- 190, 47.50%]: CrossEntropyWithSoftmax = 0.21363034 * 250; EvalClassificationError = 0.10000000 * 250; time = 0.3599s; samplesPerSecond = 694.6
MPI Rank 1: 12/12/2017 15:02:49:  Epoch[ 2 of 4]-Minibatch[ 191- 200, 50.00%]: CrossEntropyWithSoftmax = 0.23505436 * 250; EvalClassificationError = 0.08800000 * 250; time = 0.2441s; samplesPerSecond = 1024.0
MPI Rank 1: 12/12/2017 15:02:49:  Epoch[ 2 of 4]-Minibatch[ 201- 210, 52.50%]: CrossEntropyWithSoftmax = 0.20180377 * 250; EvalClassificationError = 0.08000000 * 250; time = 0.1479s; samplesPerSecond = 1690.4
MPI Rank 1: 12/12/2017 15:02:49:  Epoch[ 2 of 4]-Minibatch[ 211- 220, 55.00%]: CrossEntropyWithSoftmax = 0.19780589 * 250; EvalClassificationError = 0.07600000 * 250; time = 0.1329s; samplesPerSecond = 1881.4
MPI Rank 1: 12/12/2017 15:02:49:  Epoch[ 2 of 4]-Minibatch[ 221- 230, 57.50%]: CrossEntropyWithSoftmax = 0.16131109 * 250; EvalClassificationError = 0.06400000 * 250; time = 0.3628s; samplesPerSecond = 689.1
MPI Rank 1: 12/12/2017 15:02:50:  Epoch[ 2 of 4]-Minibatch[ 231- 240, 60.00%]: CrossEntropyWithSoftmax = 0.16479151 * 250; EvalClassificationError = 0.08400000 * 250; time = 0.5560s; samplesPerSecond = 449.7
MPI Rank 1: 12/12/2017 15:02:50:  Epoch[ 2 of 4]-Minibatch[ 241- 250, 62.50%]: CrossEntropyWithSoftmax = 0.20226364 * 250; EvalClassificationError = 0.11600000 * 250; time = 0.3959s; samplesPerSecond = 631.5
MPI Rank 1: 12/12/2017 15:02:50:  Epoch[ 2 of 4]-Minibatch[ 251- 260, 65.00%]: CrossEntropyWithSoftmax = 0.14809078 * 250; EvalClassificationError = 0.07200000 * 250; time = 0.3039s; samplesPerSecond = 822.6
MPI Rank 1: 12/12/2017 15:02:51:  Epoch[ 2 of 4]-Minibatch[ 261- 270, 67.50%]: CrossEntropyWithSoftmax = 0.19001813 * 250; EvalClassificationError = 0.11200000 * 250; time = 0.3390s; samplesPerSecond = 737.4
MPI Rank 1: 12/12/2017 15:02:51:  Epoch[ 2 of 4]-Minibatch[ 271- 280, 70.00%]: CrossEntropyWithSoftmax = 0.19616890 * 250; EvalClassificationError = 0.08000000 * 250; time = 0.2768s; samplesPerSecond = 903.1
MPI Rank 1: 12/12/2017 15:02:51:  Epoch[ 2 of 4]-Minibatch[ 281- 290, 72.50%]: CrossEntropyWithSoftmax = 0.17887468 * 250; EvalClassificationError = 0.07200000 * 250; time = 0.3120s; samplesPerSecond = 801.4
MPI Rank 1: 12/12/2017 15:02:52:  Epoch[ 2 of 4]-Minibatch[ 291- 300, 75.00%]: CrossEntropyWithSoftmax = 0.14040410 * 250; EvalClassificationError = 0.04400000 * 250; time = 0.5799s; samplesPerSecond = 431.1
MPI Rank 1: 12/12/2017 15:02:52:  Epoch[ 2 of 4]-Minibatch[ 301- 310, 77.50%]: CrossEntropyWithSoftmax = 0.17935152 * 250; EvalClassificationError = 0.09200000 * 250; time = 0.3559s; samplesPerSecond = 702.4
MPI Rank 1: 12/12/2017 15:02:53:  Epoch[ 2 of 4]-Minibatch[ 311- 320, 80.00%]: CrossEntropyWithSoftmax = 0.13249072 * 250; EvalClassificationError = 0.05600000 * 250; time = 0.3559s; samplesPerSecond = 702.4
MPI Rank 1: 12/12/2017 15:02:53:  Epoch[ 2 of 4]-Minibatch[ 321- 330, 82.50%]: CrossEntropyWithSoftmax = 0.15483358 * 250; EvalClassificationError = 0.06000000 * 250; time = 0.3439s; samplesPerSecond = 726.9
MPI Rank 1: 12/12/2017 15:02:53:  Epoch[ 2 of 4]-Minibatch[ 331- 340, 85.00%]: CrossEntropyWithSoftmax = 0.19796159 * 250; EvalClassificationError = 0.09200000 * 250; time = 0.2903s; samplesPerSecond = 861.1
MPI Rank 1: 12/12/2017 15:02:54:  Epoch[ 2 of 4]-Minibatch[ 341- 350, 87.50%]: CrossEntropyWithSoftmax = 0.13179462 * 250; EvalClassificationError = 0.05200000 * 250; time = 0.2695s; samplesPerSecond = 927.8
MPI Rank 1: 12/12/2017 15:02:54:  Epoch[ 2 of 4]-Minibatch[ 351- 360, 90.00%]: CrossEntropyWithSoftmax = 0.14028323 * 250; EvalClassificationError = 0.05600000 * 250; time = 0.4087s; samplesPerSecond = 611.8
MPI Rank 1: 12/12/2017 15:02:54:  Epoch[ 2 of 4]-Minibatch[ 361- 370, 92.50%]: CrossEntropyWithSoftmax = 0.12849508 * 250; EvalClassificationError = 0.06000000 * 250; time = 0.2877s; samplesPerSecond = 868.8
MPI Rank 1: 12/12/2017 15:02:54:  Epoch[ 2 of 4]-Minibatch[ 371- 380, 95.00%]: CrossEntropyWithSoftmax = 0.16702669 * 250; EvalClassificationError = 0.09600000 * 250; time = 0.1121s; samplesPerSecond = 2229.2
MPI Rank 1: 12/12/2017 15:02:55:  Epoch[ 2 of 4]-Minibatch[ 381- 390, 97.50%]: CrossEntropyWithSoftmax = 0.20390304 * 250; EvalClassificationError = 0.11200000 * 250; time = 0.2551s; samplesPerSecond = 980.1
MPI Rank 1: 12/12/2017 15:02:55:  Epoch[ 2 of 4]-Minibatch[ 391- 400, 100.00%]: CrossEntropyWithSoftmax = 0.14594790 * 250; EvalClassificationError = 0.06800000 * 250; time = 0.2137s; samplesPerSecond = 1169.7
MPI Rank 1: 12/12/2017 15:02:55: Finished Epoch[ 2 of 4]: [Training] CrossEntropyWithSoftmax = 0.29447308 * 10000; EvalClassificationError = 0.11490000 * 10000; totalSamplesSeen = 20000; learningRatePerSample = 0.0080000004; epochTime=12.3123s
MPI Rank 1: 
MPI Rank 1: 12/12/2017 15:02:55: Starting Epoch 3: learning rate per sample = 0.008000  effective momentum = 0.900000  momentum as time constant = 237.3 samples
MPI Rank 1: 
MPI Rank 1: 12/12/2017 15:02:55: Starting minibatch loop, DataParallelSGD training (myRank = 1, numNodes = 4, numGradientBits = 64), distributed reading is ENABLED.
MPI Rank 1: 12/12/2017 15:02:55:  Epoch[ 3 of 4]-Minibatch[   1-  10, 2.50%]: CrossEntropyWithSoftmax = 0.12813296 * 250; EvalClassificationError = 0.05600000 * 250; time = 0.4540s; samplesPerSecond = 550.6
MPI Rank 1: 12/12/2017 15:02:56:  Epoch[ 3 of 4]-Minibatch[  11-  20, 5.00%]: CrossEntropyWithSoftmax = 0.17615627 * 250; EvalClassificationError = 0.08800000 * 250; time = 0.3117s; samplesPerSecond = 802.1
MPI Rank 1: 12/12/2017 15:02:56:  Epoch[ 3 of 4]-Minibatch[  21-  30, 7.50%]: CrossEntropyWithSoftmax = 0.14587002 * 250; EvalClassificationError = 0.07600000 * 250; time = 0.4653s; samplesPerSecond = 537.2
MPI Rank 1: 12/12/2017 15:02:56:  Epoch[ 3 of 4]-Minibatch[  31-  40, 10.00%]: CrossEntropyWithSoftmax = 0.15938467 * 250; EvalClassificationError = 0.06400000 * 250; time = 0.2857s; samplesPerSecond = 875.2
MPI Rank 1: 12/12/2017 15:02:57:  Epoch[ 3 of 4]-Minibatch[  41-  50, 12.50%]: CrossEntropyWithSoftmax = 0.17100049 * 250; EvalClassificationError = 0.10000000 * 250; time = 0.5024s; samplesPerSecond = 497.6
MPI Rank 1: 12/12/2017 15:02:57:  Epoch[ 3 of 4]-Minibatch[  51-  60, 15.00%]: CrossEntropyWithSoftmax = 0.18281055 * 250; EvalClassificationError = 0.08000000 * 250; time = 0.3288s; samplesPerSecond = 760.4
MPI Rank 1: 12/12/2017 15:02:58:  Epoch[ 3 of 4]-Minibatch[  61-  70, 17.50%]: CrossEntropyWithSoftmax = 0.14781537 * 250; EvalClassificationError = 0.07200000 * 250; time = 0.2899s; samplesPerSecond = 862.3
MPI Rank 1: 12/12/2017 15:02:58:  Epoch[ 3 of 4]-Minibatch[  71-  80, 20.00%]: CrossEntropyWithSoftmax = 0.18045490 * 250; EvalClassificationError = 0.09600000 * 250; time = 0.3637s; samplesPerSecond = 687.4
MPI Rank 1: 12/12/2017 15:02:58:  Epoch[ 3 of 4]-Minibatch[  81-  90, 22.50%]: CrossEntropyWithSoftmax = 0.15847199 * 250; EvalClassificationError = 0.07600000 * 250; time = 0.5427s; samplesPerSecond = 460.7
MPI Rank 1: 12/12/2017 15:02:59:  Epoch[ 3 of 4]-Minibatch[  91- 100, 25.00%]: CrossEntropyWithSoftmax = 0.14513057 * 250; EvalClassificationError = 0.07200000 * 250; time = 0.3306s; samplesPerSecond = 756.2
MPI Rank 1: 12/12/2017 15:02:59:  Epoch[ 3 of 4]-Minibatch[ 101- 110, 27.50%]: CrossEntropyWithSoftmax = 0.13519578 * 250; EvalClassificationError = 0.05200000 * 250; time = 0.3878s; samplesPerSecond = 644.7
MPI Rank 1: 12/12/2017 15:02:59:  Epoch[ 3 of 4]-Minibatch[ 111- 120, 30.00%]: CrossEntropyWithSoftmax = 0.13723644 * 250; EvalClassificationError = 0.06000000 * 250; time = 0.3432s; samplesPerSecond = 728.5
MPI Rank 1: 12/12/2017 15:03:00:  Epoch[ 3 of 4]-Minibatch[ 121- 130, 32.50%]: CrossEntropyWithSoftmax = 0.11692067 * 250; EvalClassificationError = 0.05600000 * 250; time = 0.3425s; samplesPerSecond = 729.9
MPI Rank 1: 12/12/2017 15:03:00:  Epoch[ 3 of 4]-Minibatch[ 131- 140, 35.00%]: CrossEntropyWithSoftmax = 0.16729043 * 250; EvalClassificationError = 0.08800000 * 250; time = 0.2421s; samplesPerSecond = 1032.5
MPI Rank 1: 12/12/2017 15:03:01:  Epoch[ 3 of 4]-Minibatch[ 141- 150, 37.50%]: CrossEntropyWithSoftmax = 0.12836481 * 250; EvalClassificationError = 0.04800000 * 250; time = 0.6018s; samplesPerSecond = 415.4
MPI Rank 1: 12/12/2017 15:03:01:  Epoch[ 3 of 4]-Minibatch[ 151- 160, 40.00%]: CrossEntropyWithSoftmax = 0.17320383 * 250; EvalClassificationError = 0.08000000 * 250; time = 0.2194s; samplesPerSecond = 1139.4
MPI Rank 1: 12/12/2017 15:03:01:  Epoch[ 3 of 4]-Minibatch[ 161- 170, 42.50%]: CrossEntropyWithSoftmax = 0.17634559 * 250; EvalClassificationError = 0.09600000 * 250; time = 0.3504s; samplesPerSecond = 713.5
MPI Rank 1: 12/12/2017 15:03:02:  Epoch[ 3 of 4]-Minibatch[ 171- 180, 45.00%]: CrossEntropyWithSoftmax = 0.14124514 * 250; EvalClassificationError = 0.06400000 * 250; time = 0.3079s; samplesPerSecond = 811.9
MPI Rank 1: 12/12/2017 15:03:02:  Epoch[ 3 of 4]-Minibatch[ 181- 190, 47.50%]: CrossEntropyWithSoftmax = 0.19167718 * 250; EvalClassificationError = 0.10000000 * 250; time = 0.2679s; samplesPerSecond = 933.1
MPI Rank 1: 12/12/2017 15:03:02:  Epoch[ 3 of 4]-Minibatch[ 191- 200, 50.00%]: CrossEntropyWithSoftmax = 0.20913003 * 250; EvalClassificationError = 0.10000000 * 250; time = 0.2611s; samplesPerSecond = 957.6
MPI Rank 1: 12/12/2017 15:03:02:  Epoch[ 3 of 4]-Minibatch[ 201- 210, 52.50%]: CrossEntropyWithSoftmax = 0.18460750 * 250; EvalClassificationError = 0.08000000 * 250; time = 0.2769s; samplesPerSecond = 902.9
MPI Rank 1: 12/12/2017 15:03:03:  Epoch[ 3 of 4]-Minibatch[ 211- 220, 55.00%]: CrossEntropyWithSoftmax = 0.18188216 * 250; EvalClassificationError = 0.07600000 * 250; time = 0.5018s; samplesPerSecond = 498.2
MPI Rank 1: 12/12/2017 15:03:03:  Epoch[ 3 of 4]-Minibatch[ 221- 230, 57.50%]: CrossEntropyWithSoftmax = 0.14069101 * 250; EvalClassificationError = 0.06000000 * 250; time = 0.2805s; samplesPerSecond = 891.2
MPI Rank 1: 12/12/2017 15:03:03:  Epoch[ 3 of 4]-Minibatch[ 231- 240, 60.00%]: CrossEntropyWithSoftmax = 0.14812247 * 250; EvalClassificationError = 0.07600000 * 250; time = 0.3074s; samplesPerSecond = 813.3
MPI Rank 1: 12/12/2017 15:03:04:  Epoch[ 3 of 4]-Minibatch[ 241- 250, 62.50%]: CrossEntropyWithSoftmax = 0.20274092 * 250; EvalClassificationError = 0.11600000 * 250; time = 0.2279s; samplesPerSecond = 1097.0
MPI Rank 1: 12/12/2017 15:03:04:  Epoch[ 3 of 4]-Minibatch[ 251- 260, 65.00%]: CrossEntropyWithSoftmax = 0.12887866 * 250; EvalClassificationError = 0.07200000 * 250; time = 0.2162s; samplesPerSecond = 1156.6
MPI Rank 1: 12/12/2017 15:03:04:  Epoch[ 3 of 4]-Minibatch[ 261- 270, 67.50%]: CrossEntropyWithSoftmax = 0.18595256 * 250; EvalClassificationError = 0.11600000 * 250; time = 0.2157s; samplesPerSecond = 1159.0
MPI Rank 1: 12/12/2017 15:03:04:  Epoch[ 3 of 4]-Minibatch[ 271- 280, 70.00%]: CrossEntropyWithSoftmax = 0.19565326 * 250; EvalClassificationError = 0.08800000 * 250; time = 0.2409s; samplesPerSecond = 1037.9
MPI Rank 1: 12/12/2017 15:03:05:  Epoch[ 3 of 4]-Minibatch[ 281- 290, 72.50%]: CrossEntropyWithSoftmax = 0.16678525 * 250; EvalClassificationError = 0.06800000 * 250; time = 0.4190s; samplesPerSecond = 596.7
MPI Rank 1: 12/12/2017 15:03:05:  Epoch[ 3 of 4]-Minibatch[ 291- 300, 75.00%]: CrossEntropyWithSoftmax = 0.12552459 * 250; EvalClassificationError = 0.04800000 * 250; time = 0.2039s; samplesPerSecond = 1225.9
MPI Rank 1: 12/12/2017 15:03:05:  Epoch[ 3 of 4]-Minibatch[ 301- 310, 77.50%]: CrossEntropyWithSoftmax = 0.17414175 * 250; EvalClassificationError = 0.08800000 * 250; time = 0.1583s; samplesPerSecond = 1579.4
MPI Rank 1: 12/12/2017 15:03:05:  Epoch[ 3 of 4]-Minibatch[ 311- 320, 80.00%]: CrossEntropyWithSoftmax = 0.12295855 * 250; EvalClassificationError = 0.05200000 * 250; time = 0.1496s; samplesPerSecond = 1671.6
MPI Rank 1: 12/12/2017 15:03:06:  Epoch[ 3 of 4]-Minibatch[ 321- 330, 82.50%]: CrossEntropyWithSoftmax = 0.14757012 * 250; EvalClassificationError = 0.06000000 * 250; time = 0.2194s; samplesPerSecond = 1139.4
MPI Rank 1: 12/12/2017 15:03:06:  Epoch[ 3 of 4]-Minibatch[ 331- 340, 85.00%]: CrossEntropyWithSoftmax = 0.19785856 * 250; EvalClassificationError = 0.09600000 * 250; time = 0.1965s; samplesPerSecond = 1272.6
MPI Rank 1: 12/12/2017 15:03:06:  Epoch[ 3 of 4]-Minibatch[ 341- 350, 87.50%]: CrossEntropyWithSoftmax = 0.12600285 * 250; EvalClassificationError = 0.05200000 * 250; time = 0.2008s; samplesPerSecond = 1244.8
MPI Rank 1: 12/12/2017 15:03:06:  Epoch[ 3 of 4]-Minibatch[ 351- 360, 90.00%]: CrossEntropyWithSoftmax = 0.13742899 * 250; EvalClassificationError = 0.05600000 * 250; time = 0.2230s; samplesPerSecond = 1121.0
MPI Rank 1: 12/12/2017 15:03:06:  Epoch[ 3 of 4]-Minibatch[ 361- 370, 92.50%]: CrossEntropyWithSoftmax = 0.12847649 * 250; EvalClassificationError = 0.06000000 * 250; time = 0.1675s; samplesPerSecond = 1492.9
MPI Rank 1: 12/12/2017 15:03:06:  Epoch[ 3 of 4]-Minibatch[ 371- 380, 95.00%]: CrossEntropyWithSoftmax = 0.16652416 * 250; EvalClassificationError = 0.09600000 * 250; time = 0.1963s; samplesPerSecond = 1273.5
MPI Rank 1: 12/12/2017 15:03:07:  Epoch[ 3 of 4]-Minibatch[ 381- 390, 97.50%]: CrossEntropyWithSoftmax = 0.20675721 * 250; EvalClassificationError = 0.11200000 * 250; time = 0.2365s; samplesPerSecond = 1056.9
MPI Rank 1: 12/12/2017 15:03:07:  Epoch[ 3 of 4]-Minibatch[ 391- 400, 100.00%]: CrossEntropyWithSoftmax = 0.14562268 * 250; EvalClassificationError = 0.06400000 * 250; time = 0.3978s; samplesPerSecond = 628.4
MPI Rank 1: 12/12/2017 15:03:07: Finished Epoch[ 3 of 4]: [Training] CrossEntropyWithSoftmax = 0.15965044 * 10000; EvalClassificationError = 0.07650000 * 10000; totalSamplesSeen = 30000; learningRatePerSample = 0.0080000004; epochTime=12.2698s
MPI Rank 1: 
MPI Rank 1: 12/12/2017 15:03:07: Starting Epoch 4: learning rate per sample = 0.008000  effective momentum = 0.900000  momentum as time constant = 237.3 samples
MPI Rank 1: 
MPI Rank 1: 12/12/2017 15:03:07: Starting minibatch loop, DataParallelSGD training (myRank = 1, numNodes = 4, numGradientBits = 64), distributed reading is ENABLED.
MPI Rank 1: 12/12/2017 15:03:07:  Epoch[ 4 of 4]-Minibatch[   1-  10, 2.50%]: CrossEntropyWithSoftmax = 0.12392293 * 250; EvalClassificationError = 0.06000000 * 250; time = 0.2664s; samplesPerSecond = 938.3
MPI Rank 1: 12/12/2017 15:03:08:  Epoch[ 4 of 4]-Minibatch[  11-  20, 5.00%]: CrossEntropyWithSoftmax = 0.18033422 * 250; EvalClassificationError = 0.09600000 * 250; time = 0.2197s; samplesPerSecond = 1137.8
MPI Rank 1: 12/12/2017 15:03:08:  Epoch[ 4 of 4]-Minibatch[  21-  30, 7.50%]: CrossEntropyWithSoftmax = 0.14284000 * 250; EvalClassificationError = 0.07600000 * 250; time = 0.2668s; samplesPerSecond = 937.0
MPI Rank 1: 12/12/2017 15:03:08:  Epoch[ 4 of 4]-Minibatch[  31-  40, 10.00%]: CrossEntropyWithSoftmax = 0.15662491 * 250; EvalClassificationError = 0.06400000 * 250; time = 0.1853s; samplesPerSecond = 1349.3
MPI Rank 1: 12/12/2017 15:03:08:  Epoch[ 4 of 4]-Minibatch[  41-  50, 12.50%]: CrossEntropyWithSoftmax = 0.16985801 * 250; EvalClassificationError = 0.09600000 * 250; time = 0.2319s; samplesPerSecond = 1078.1
MPI Rank 1: 12/12/2017 15:03:09:  Epoch[ 4 of 4]-Minibatch[  51-  60, 15.00%]: CrossEntropyWithSoftmax = 0.18190608 * 250; EvalClassificationError = 0.08000000 * 250; time = 0.1859s; samplesPerSecond = 1344.8
MPI Rank 1: 12/12/2017 15:03:09:  Epoch[ 4 of 4]-Minibatch[  61-  70, 17.50%]: CrossEntropyWithSoftmax = 0.14495470 * 250; EvalClassificationError = 0.07200000 * 250; time = 0.1971s; samplesPerSecond = 1268.5
MPI Rank 1: 12/12/2017 15:03:09:  Epoch[ 4 of 4]-Minibatch[  71-  80, 20.00%]: CrossEntropyWithSoftmax = 0.18022154 * 250; EvalClassificationError = 0.09600000 * 250; time = 0.3328s; samplesPerSecond = 751.1
MPI Rank 1: 12/12/2017 15:03:09:  Epoch[ 4 of 4]-Minibatch[  81-  90, 22.50%]: CrossEntropyWithSoftmax = 0.15852461 * 250; EvalClassificationError = 0.07200000 * 250; time = 0.2524s; samplesPerSecond = 990.5
MPI Rank 1: 12/12/2017 15:03:10:  Epoch[ 4 of 4]-Minibatch[  91- 100, 25.00%]: CrossEntropyWithSoftmax = 0.14466589 * 250; EvalClassificationError = 0.07200000 * 250; time = 0.2335s; samplesPerSecond = 1070.7
MPI Rank 1: 12/12/2017 15:03:10:  Epoch[ 4 of 4]-Minibatch[ 101- 110, 27.50%]: CrossEntropyWithSoftmax = 0.13346404 * 250; EvalClassificationError = 0.05200000 * 250; time = 0.2579s; samplesPerSecond = 969.2
MPI Rank 1: 12/12/2017 15:03:10:  Epoch[ 4 of 4]-Minibatch[ 111- 120, 30.00%]: CrossEntropyWithSoftmax = 0.13683061 * 250; EvalClassificationError = 0.06400000 * 250; time = 0.2080s; samplesPerSecond = 1201.8
MPI Rank 1: 12/12/2017 15:03:10:  Epoch[ 4 of 4]-Minibatch[ 121- 130, 32.50%]: CrossEntropyWithSoftmax = 0.11589011 * 250; EvalClassificationError = 0.05600000 * 250; time = 0.2010s; samplesPerSecond = 1244.0
MPI Rank 1: 12/12/2017 15:03:10:  Epoch[ 4 of 4]-Minibatch[ 131- 140, 35.00%]: CrossEntropyWithSoftmax = 0.16881193 * 250; EvalClassificationError = 0.08800000 * 250; time = 0.1707s; samplesPerSecond = 1464.6
MPI Rank 1: 12/12/2017 15:03:11:  Epoch[ 4 of 4]-Minibatch[ 141- 150, 37.50%]: CrossEntropyWithSoftmax = 0.12736965 * 250; EvalClassificationError = 0.04800000 * 250; time = 0.2119s; samplesPerSecond = 1179.7
MPI Rank 1: 12/12/2017 15:03:11:  Epoch[ 4 of 4]-Minibatch[ 151- 160, 40.00%]: CrossEntropyWithSoftmax = 0.17123603 * 250; EvalClassificationError = 0.08400000 * 250; time = 0.1681s; samplesPerSecond = 1487.0
MPI Rank 1: 12/12/2017 15:03:11:  Epoch[ 4 of 4]-Minibatch[ 161- 170, 42.50%]: CrossEntropyWithSoftmax = 0.17706403 * 250; EvalClassificationError = 0.10000000 * 250; time = 0.1584s; samplesPerSecond = 1577.9
MPI Rank 1: 12/12/2017 15:03:11:  Epoch[ 4 of 4]-Minibatch[ 171- 180, 45.00%]: CrossEntropyWithSoftmax = 0.14104103 * 250; EvalClassificationError = 0.06400000 * 250; time = 0.4221s; samplesPerSecond = 592.3
MPI Rank 1: 12/12/2017 15:03:12:  Epoch[ 4 of 4]-Minibatch[ 181- 190, 47.50%]: CrossEntropyWithSoftmax = 0.19313360 * 250; EvalClassificationError = 0.10000000 * 250; time = 0.2911s; samplesPerSecond = 858.7
MPI Rank 1: 12/12/2017 15:03:12:  Epoch[ 4 of 4]-Minibatch[ 191- 200, 50.00%]: CrossEntropyWithSoftmax = 0.20870745 * 250; EvalClassificationError = 0.10000000 * 250; time = 0.1799s; samplesPerSecond = 1390.0
MPI Rank 1: 12/12/2017 15:03:12:  Epoch[ 4 of 4]-Minibatch[ 201- 210, 52.50%]: CrossEntropyWithSoftmax = 0.18510294 * 250; EvalClassificationError = 0.08000000 * 250; time = 0.2719s; samplesPerSecond = 919.4
MPI Rank 1: 12/12/2017 15:03:12:  Epoch[ 4 of 4]-Minibatch[ 211- 220, 55.00%]: CrossEntropyWithSoftmax = 0.18167137 * 250; EvalClassificationError = 0.07600000 * 250; time = 0.2239s; samplesPerSecond = 1116.4
MPI Rank 1: 12/12/2017 15:03:13:  Epoch[ 4 of 4]-Minibatch[ 221- 230, 57.50%]: CrossEntropyWithSoftmax = 0.14026276 * 250; EvalClassificationError = 0.05600000 * 250; time = 0.2279s; samplesPerSecond = 1096.8
MPI Rank 1: 12/12/2017 15:03:13:  Epoch[ 4 of 4]-Minibatch[ 231- 240, 60.00%]: CrossEntropyWithSoftmax = 0.14811532 * 250; EvalClassificationError = 0.07600000 * 250; time = 0.2106s; samplesPerSecond = 1187.1
MPI Rank 1: 12/12/2017 15:03:13:  Epoch[ 4 of 4]-Minibatch[ 241- 250, 62.50%]: CrossEntropyWithSoftmax = 0.20368129 * 250; EvalClassificationError = 0.11600000 * 250; time = 0.2733s; samplesPerSecond = 914.8
MPI Rank 1: 12/12/2017 15:03:13:  Epoch[ 4 of 4]-Minibatch[ 251- 260, 65.00%]: CrossEntropyWithSoftmax = 0.12819272 * 250; EvalClassificationError = 0.07200000 * 250; time = 0.1931s; samplesPerSecond = 1294.8
MPI Rank 1: 12/12/2017 15:03:14:  Epoch[ 4 of 4]-Minibatch[ 261- 270, 67.50%]: CrossEntropyWithSoftmax = 0.18632901 * 250; EvalClassificationError = 0.11600000 * 250; time = 0.3409s; samplesPerSecond = 733.3
MPI Rank 1: 12/12/2017 15:03:14:  Epoch[ 4 of 4]-Minibatch[ 271- 280, 70.00%]: CrossEntropyWithSoftmax = 0.19568750 * 250; EvalClassificationError = 0.08800000 * 250; time = 0.2778s; samplesPerSecond = 900.0
MPI Rank 1: 12/12/2017 15:03:14:  Epoch[ 4 of 4]-Minibatch[ 281- 290, 72.50%]: CrossEntropyWithSoftmax = 0.16449544 * 250; EvalClassificationError = 0.06800000 * 250; time = 0.1759s; samplesPerSecond = 1421.5
MPI Rank 1: 12/12/2017 15:03:14:  Epoch[ 4 of 4]-Minibatch[ 291- 300, 75.00%]: CrossEntropyWithSoftmax = 0.12454886 * 250; EvalClassificationError = 0.04400000 * 250; time = 0.2719s; samplesPerSecond = 919.4
MPI Rank 1: 12/12/2017 15:03:14:  Epoch[ 4 of 4]-Minibatch[ 301- 310, 77.50%]: CrossEntropyWithSoftmax = 0.17307192 * 250; EvalClassificationError = 0.08400000 * 250; time = 0.2127s; samplesPerSecond = 1175.4
MPI Rank 1: 12/12/2017 15:03:15:  Epoch[ 4 of 4]-Minibatch[ 311- 320, 80.00%]: CrossEntropyWithSoftmax = 0.12249522 * 250; EvalClassificationError = 0.05200000 * 250; time = 0.2232s; samplesPerSecond = 1120.2
MPI Rank 1: 12/12/2017 15:03:15:  Epoch[ 4 of 4]-Minibatch[ 321- 330, 82.50%]: CrossEntropyWithSoftmax = 0.14709682 * 250; EvalClassificationError = 0.06000000 * 250; time = 0.1845s; samplesPerSecond = 1355.2
MPI Rank 1: 12/12/2017 15:03:15:  Epoch[ 4 of 4]-Minibatch[ 331- 340, 85.00%]: CrossEntropyWithSoftmax = 0.19789048 * 250; EvalClassificationError = 0.09200000 * 250; time = 0.2482s; samplesPerSecond = 1007.1
MPI Rank 1: 12/12/2017 15:03:15:  Epoch[ 4 of 4]-Minibatch[ 341- 350, 87.50%]: CrossEntropyWithSoftmax = 0.12572171 * 250; EvalClassificationError = 0.05200000 * 250; time = 0.1846s; samplesPerSecond = 1354.2
MPI Rank 1: 12/12/2017 15:03:16:  Epoch[ 4 of 4]-Minibatch[ 351- 360, 90.00%]: CrossEntropyWithSoftmax = 0.13732392 * 250; EvalClassificationError = 0.05600000 * 250; time = 0.3015s; samplesPerSecond = 829.2
MPI Rank 1: 12/12/2017 15:03:16:  Epoch[ 4 of 4]-Minibatch[ 361- 370, 92.50%]: CrossEntropyWithSoftmax = 0.12857569 * 250; EvalClassificationError = 0.06000000 * 250; time = 0.3328s; samplesPerSecond = 751.1
MPI Rank 1: 12/12/2017 15:03:16:  Epoch[ 4 of 4]-Minibatch[ 371- 380, 95.00%]: CrossEntropyWithSoftmax = 0.16653116 * 250; EvalClassificationError = 0.09600000 * 250; time = 0.1679s; samplesPerSecond = 1488.7
MPI Rank 1: 12/12/2017 15:03:16:  Epoch[ 4 of 4]-Minibatch[ 381- 390, 97.50%]: CrossEntropyWithSoftmax = 0.20715348 * 250; EvalClassificationError = 0.11600000 * 250; time = 0.2831s; samplesPerSecond = 883.2
MPI Rank 1: 12/12/2017 15:03:17:  Epoch[ 4 of 4]-Minibatch[ 391- 400, 100.00%]: CrossEntropyWithSoftmax = 0.14571730 * 250; EvalClassificationError = 0.06400000 * 250; time = 0.2688s; samplesPerSecond = 930.2
MPI Rank 1: 12/12/2017 15:03:17: Finished Epoch[ 4 of 4]: [Training] CrossEntropyWithSoftmax = 0.15917666 * 10000; EvalClassificationError = 0.07660000 * 10000; totalSamplesSeen = 40000; learningRatePerSample = 0.0080000004; epochTime=9.52707s
MPI Rank 1: 
MPI Rank 1: 12/12/2017 15:03:17: Action "train" complete.
MPI Rank 1: 
MPI Rank 1: 12/12/2017 15:03:17: __COMPLETED__
MPI Rank 2: CNTK 2.3.1+ (HEAD f4f0f8, Dec 11 2017 18:34:12) at 2017/12/12 15:02:30
MPI Rank 2: 
MPI Rank 2: /home/ubuntu/workspace/build/gpu/release/bin/cntk  configFile=/home/ubuntu/workspace/Tests/EndToEndTests/ParallelTraining/NoQuantization/DoublePrecision/../../SimpleMultiGPU.cntk  currentDirectory=/home/ubuntu/workspace/Tests/EndToEndTests/ParallelTraining/Data  RunDir=/tmp/cntk-test-20171211223423.932710/ParallelTraining/NoQuantization_DoublePrecision@release_cpu  DataDir=/home/ubuntu/workspace/Tests/EndToEndTests/ParallelTraining/Data  ConfigDir=/home/ubuntu/workspace/Tests/EndToEndTests/ParallelTraining/NoQuantization/DoublePrecision/../..  OutputDir=/tmp/cntk-test-20171211223423.932710/ParallelTraining/NoQuantization_DoublePrecision@release_cpu  DeviceId=-1  timestamping=true  numCPUThreads=3  precision=double  SimpleMultiGPU=[SGD=[ParallelTrain=[DataParallelSGD=[gradientBits=64]]]]  stderr=/tmp/cntk-test-20171211223423.932710/ParallelTraining/NoQuantization_DoublePrecision@release_cpu/stderr
MPI Rank 2: 12/12/2017 15:02:31: -------------------------------------------------------------------
MPI Rank 2: 12/12/2017 15:02:31: Build info: 
MPI Rank 2: 
MPI Rank 2: 12/12/2017 15:02:31: 		Built time: Dec 11 2017 18:28:39
MPI Rank 2: 12/12/2017 15:02:31: 		Last modified date: Wed Nov 15 09:27:10 2017
MPI Rank 2: 12/12/2017 15:02:31: 		Build type: release
MPI Rank 2: 12/12/2017 15:02:31: 		Build target: GPU
MPI Rank 2: 12/12/2017 15:02:31: 		With ASGD: yes
MPI Rank 2: 12/12/2017 15:02:31: 		Math lib: mkl
MPI Rank 2: 12/12/2017 15:02:31: 		CUDA version: 9.0.0
MPI Rank 2: 12/12/2017 15:02:31: 		CUDNN version: 7.0.4
MPI Rank 2: 12/12/2017 15:02:31: 		Build Branch: HEAD
MPI Rank 2: 12/12/2017 15:02:31: 		Build SHA1: f4f0f82eabcc482dbd03af1f946a44ae2b8b97bf
MPI Rank 2: 12/12/2017 15:02:31: 		MPI distribution: Open MPI
MPI Rank 2: 12/12/2017 15:02:31: 		MPI version: 1.10.7
MPI Rank 2: 12/12/2017 15:02:31: -------------------------------------------------------------------
MPI Rank 2: 12/12/2017 15:02:31: -------------------------------------------------------------------
MPI Rank 2: 12/12/2017 15:02:31: GPU info:
MPI Rank 2: 
MPI Rank 2: 12/12/2017 15:02:31: 		Device[0]: cores = 3072; computeCapability = 5.2; type = "Tesla M60"; total memory = 8123 MB; free memory = 7947 MB
MPI Rank 2: 12/12/2017 15:02:31: -------------------------------------------------------------------
MPI Rank 2: 12/12/2017 15:02:31: Using 3 CPU threads.
MPI Rank 2: 
MPI Rank 2: 12/12/2017 15:02:31: ##############################################################################
MPI Rank 2: 12/12/2017 15:02:31: #                                                                            #
MPI Rank 2: 12/12/2017 15:02:31: # SimpleMultiGPU command (train action)                                      #
MPI Rank 2: 12/12/2017 15:02:31: #                                                                            #
MPI Rank 2: 12/12/2017 15:02:31: ##############################################################################
MPI Rank 2: 
MPI Rank 2: 12/12/2017 15:02:31: 
MPI Rank 2: Creating virgin network.
MPI Rank 2: SimpleNetworkBuilder Using CPU
MPI Rank 2: 12/12/2017 15:02:31: 
MPI Rank 2: Model has 25 nodes. Using CPU.
MPI Rank 2: 
MPI Rank 2: 12/12/2017 15:02:31: Training criterion:   CrossEntropyWithSoftmax = CrossEntropyWithSoftmax
MPI Rank 2: 12/12/2017 15:02:31: Evaluation criterion: EvalClassificationError = ClassificationError
MPI Rank 2: 
MPI Rank 2: 
MPI Rank 2: Allocating matrices for forward and/or backward propagation.
MPI Rank 2: 
MPI Rank 2: Gradient Memory Aliasing: 4 are aliased.
MPI Rank 2: 	W2*H1 (gradient) reuses HLast (gradient)
MPI Rank 2: 	W1*H1 (gradient) reuses W1*H1+B1 (gradient)
MPI Rank 2: 
MPI Rank 2: Memory Sharing: Out of 40 matrices, 21 are shared as 5, and 19 are not shared.
MPI Rank 2: 
MPI Rank 2: Here are the ones that share memory:
MPI Rank 2: 	{ PosteriorProb : [2 x 1 x *]
MPI Rank 2: 	  ScaledLogLikelihood : [2 x 1 x *] }
MPI Rank 2: 	{ HLast : [2 x 1 x *] (gradient)
MPI Rank 2: 	  W0 : [50 x 2] (gradient)
MPI Rank 2: 	  W0*features+B0 : [50 x 1 x *] (gradient)
MPI Rank 2: 	  W1*H1 : [50 x 1 x *] (gradient)
MPI Rank 2: 	  W1*H1+B1 : [50 x 1 x *]
MPI Rank 2: 	  W1*H1+B1 : [50 x 1 x *] (gradient)
MPI Rank 2: 	  W2*H1 : [2 x 1 x *]
MPI Rank 2: 	  W2*H1 : [2 x 1 x *] (gradient) }
MPI Rank 2: 	{ B0 : [50 x 1] (gradient)
MPI Rank 2: 	  H1 : [50 x 1 x *] }
MPI Rank 2: 	{ H2 : [50 x 1 x *]
MPI Rank 2: 	  W0*features+B0 : [50 x 1 x *]
MPI Rank 2: 	  W1 : [50 x 50] (gradient)
MPI Rank 2: 	  W1*H1 : [50 x 1 x *] }
MPI Rank 2: 	{ H1 : [50 x 1 x *] (gradient)
MPI Rank 2: 	  H2 : [50 x 1 x *] (gradient)
MPI Rank 2: 	  HLast : [2 x 1 x *]
MPI Rank 2: 	  W0*features : [50 x *]
MPI Rank 2: 	  W0*features : [50 x *] (gradient) }
MPI Rank 2: 
MPI Rank 2: Here are the ones that don't share memory:
MPI Rank 2: 	{MeanOfFeatures : [2]}
MPI Rank 2: 	{features : [2 x *]}
MPI Rank 2: 	{InvStdOfFeatures : [2]}
MPI Rank 2: 	{W0 : [50 x 2]}
MPI Rank 2: 	{B0 : [50 x 1]}
MPI Rank 2: 	{W1 : [50 x 50]}
MPI Rank 2: 	{B1 : [50 x 1]}
MPI Rank 2: 	{W2 : [2 x 50]}
MPI Rank 2: 	{B2 : [2 x 1]}
MPI Rank 2: 	{labels : [2 x *]}
MPI Rank 2: 	{Prior : [2]}
MPI Rank 2: 	{EvalClassificationError : [1]}
MPI Rank 2: 	{CrossEntropyWithSoftmax : [1]}
MPI Rank 2: 	{LogOfPrior : [2]}
MPI Rank 2: 	{CrossEntropyWithSoftmax : [1] (gradient)}
MPI Rank 2: 	{MVNormalizedFeatures : [2 x *]}
MPI Rank 2: 	{B2 : [2 x 1] (gradient)}
MPI Rank 2: 	{B1 : [50 x 1] (gradient)}
MPI Rank 2: 	{W2 : [2 x 50] (gradient)}
MPI Rank 2: 
MPI Rank 2: 
MPI Rank 2: 12/12/2017 15:02:31: Training 2802 parameters in 6 out of 6 parameter tensors and 15 nodes with gradient:
MPI Rank 2: 
MPI Rank 2: 12/12/2017 15:02:31: 	Node 'B0' (LearnableParameter operation) : [50 x 1]
MPI Rank 2: 12/12/2017 15:02:31: 	Node 'B1' (LearnableParameter operation) : [50 x 1]
MPI Rank 2: 12/12/2017 15:02:31: 	Node 'B2' (LearnableParameter operation) : [2 x 1]
MPI Rank 2: 12/12/2017 15:02:31: 	Node 'W0' (LearnableParameter operation) : [50 x 2]
MPI Rank 2: 12/12/2017 15:02:31: 	Node 'W1' (LearnableParameter operation) : [50 x 50]
MPI Rank 2: 12/12/2017 15:02:31: 	Node 'W2' (LearnableParameter operation) : [2 x 50]
MPI Rank 2: 
MPI Rank 2: Initializing dataParallelSGD with FP64 aggregation.
MPI Rank 2: NcclComm: disabled, at least one rank using CPU device
MPI Rank 2: 
MPI Rank 2: 12/12/2017 15:02:32: Precomputing --> 3 PreCompute nodes found.
MPI Rank 2: 
MPI Rank 2: 12/12/2017 15:02:32: 	MeanOfFeatures = Mean()
MPI Rank 2: 12/12/2017 15:02:32: 	InvStdOfFeatures = InvStdDev()
MPI Rank 2: 12/12/2017 15:02:32: 	Prior = Mean()
MPI Rank 2: 
MPI Rank 2: 12/12/2017 15:02:33: Precomputing --> Completed.
MPI Rank 2: 
MPI Rank 2: 
MPI Rank 2: 12/12/2017 15:02:33: Starting Epoch 1: learning rate per sample = 0.020000  effective momentum = 0.900000  momentum as time constant = 237.3 samples
MPI Rank 2: 
MPI Rank 2: 12/12/2017 15:02:33: Starting minibatch loop, DataParallelSGD training (myRank = 2, numNodes = 4, numGradientBits = 64), distributed reading is ENABLED.
MPI Rank 2: 12/12/2017 15:02:33:  Epoch[ 1 of 4]-Minibatch[   1-  10]: CrossEntropyWithSoftmax = 0.69973268 * 250; EvalClassificationError = 0.50400000 * 250; time = 0.4078s; samplesPerSecond = 613.1
MPI Rank 2: 12/12/2017 15:02:34:  Epoch[ 1 of 4]-Minibatch[  11-  20]: CrossEntropyWithSoftmax = 0.71436905 * 250; EvalClassificationError = 0.52000000 * 250; time = 0.2080s; samplesPerSecond = 1202.2
MPI Rank 2: 12/12/2017 15:02:34:  Epoch[ 1 of 4]-Minibatch[  21-  30]: CrossEntropyWithSoftmax = 0.72871054 * 250; EvalClassificationError = 0.47600000 * 250; time = 0.1799s; samplesPerSecond = 1389.6
MPI Rank 2: 12/12/2017 15:02:34:  Epoch[ 1 of 4]-Minibatch[  31-  40]: CrossEntropyWithSoftmax = 0.70038993 * 250; EvalClassificationError = 0.52400000 * 250; time = 0.2077s; samplesPerSecond = 1203.5
MPI Rank 2: 12/12/2017 15:02:34:  Epoch[ 1 of 4]-Minibatch[  41-  50]: CrossEntropyWithSoftmax = 0.70593818 * 250; EvalClassificationError = 0.54000000 * 250; time = 0.4013s; samplesPerSecond = 623.0
MPI Rank 2: 12/12/2017 15:02:35:  Epoch[ 1 of 4]-Minibatch[  51-  60]: CrossEntropyWithSoftmax = 0.71604646 * 250; EvalClassificationError = 0.47600000 * 250; time = 0.2443s; samplesPerSecond = 1023.3
MPI Rank 2: 12/12/2017 15:02:35:  Epoch[ 1 of 4]-Minibatch[  61-  70]: CrossEntropyWithSoftmax = 0.72247949 * 250; EvalClassificationError = 0.48000000 * 250; time = 0.2344s; samplesPerSecond = 1066.4
MPI Rank 2: 12/12/2017 15:02:35:  Epoch[ 1 of 4]-Minibatch[  71-  80]: CrossEntropyWithSoftmax = 0.79884413 * 250; EvalClassificationError = 0.47600000 * 250; time = 0.2101s; samplesPerSecond = 1190.0
MPI Rank 2: 12/12/2017 15:02:35:  Epoch[ 1 of 4]-Minibatch[  81-  90]: CrossEntropyWithSoftmax = 0.69622447 * 250; EvalClassificationError = 0.46800000 * 250; time = 0.1298s; samplesPerSecond = 1926.1
MPI Rank 2: 12/12/2017 15:02:35:  Epoch[ 1 of 4]-Minibatch[  91- 100]: CrossEntropyWithSoftmax = 0.70749459 * 250; EvalClassificationError = 0.49200000 * 250; time = 0.1884s; samplesPerSecond = 1327.1
MPI Rank 2: 12/12/2017 15:02:36:  Epoch[ 1 of 4]-Minibatch[ 101- 110]: CrossEntropyWithSoftmax = 0.71485824 * 250; EvalClassificationError = 0.55200000 * 250; time = 0.2642s; samplesPerSecond = 946.2
MPI Rank 2: 12/12/2017 15:02:36:  Epoch[ 1 of 4]-Minibatch[ 111- 120]: CrossEntropyWithSoftmax = 0.69579152 * 250; EvalClassificationError = 0.43600000 * 250; time = 0.2355s; samplesPerSecond = 1061.4
MPI Rank 2: 12/12/2017 15:02:36:  Epoch[ 1 of 4]-Minibatch[ 121- 130]: CrossEntropyWithSoftmax = 0.70174138 * 250; EvalClassificationError = 0.44000000 * 250; time = 0.2116s; samplesPerSecond = 1181.6
MPI Rank 2: 12/12/2017 15:02:37:  Epoch[ 1 of 4]-Minibatch[ 131- 140]: CrossEntropyWithSoftmax = 0.71926586 * 250; EvalClassificationError = 0.54800000 * 250; time = 0.4506s; samplesPerSecond = 554.8
MPI Rank 2: 12/12/2017 15:02:37:  Epoch[ 1 of 4]-Minibatch[ 141- 150]: CrossEntropyWithSoftmax = 0.72009917 * 250; EvalClassificationError = 0.48800000 * 250; time = 0.2121s; samplesPerSecond = 1178.4
MPI Rank 2: 12/12/2017 15:02:37:  Epoch[ 1 of 4]-Minibatch[ 151- 160]: CrossEntropyWithSoftmax = 0.71854573 * 250; EvalClassificationError = 0.55200000 * 250; time = 0.2090s; samplesPerSecond = 1196.1
MPI Rank 2: 12/12/2017 15:02:37:  Epoch[ 1 of 4]-Minibatch[ 161- 170]: CrossEntropyWithSoftmax = 0.74083729 * 250; EvalClassificationError = 0.50000000 * 250; time = 0.2430s; samplesPerSecond = 1028.7
MPI Rank 2: 12/12/2017 15:02:37:  Epoch[ 1 of 4]-Minibatch[ 171- 180]: CrossEntropyWithSoftmax = 0.71762852 * 250; EvalClassificationError = 0.51600000 * 250; time = 0.1574s; samplesPerSecond = 1587.9
MPI Rank 2: 12/12/2017 15:02:38:  Epoch[ 1 of 4]-Minibatch[ 181- 190]: CrossEntropyWithSoftmax = 0.71530686 * 250; EvalClassificationError = 0.48400000 * 250; time = 0.1953s; samplesPerSecond = 1279.9
MPI Rank 2: 12/12/2017 15:02:38:  Epoch[ 1 of 4]-Minibatch[ 191- 200]: CrossEntropyWithSoftmax = 0.71768617 * 250; EvalClassificationError = 0.53200000 * 250; time = 0.1657s; samplesPerSecond = 1508.6
MPI Rank 2: 12/12/2017 15:02:38:  Epoch[ 1 of 4]-Minibatch[ 201- 210]: CrossEntropyWithSoftmax = 0.71515312 * 250; EvalClassificationError = 0.53600000 * 250; time = 0.1742s; samplesPerSecond = 1435.5
MPI Rank 2: 12/12/2017 15:02:38:  Epoch[ 1 of 4]-Minibatch[ 211- 220]: CrossEntropyWithSoftmax = 0.72047060 * 250; EvalClassificationError = 0.52400000 * 250; time = 0.1877s; samplesPerSecond = 1332.2
MPI Rank 2: 12/12/2017 15:02:38:  Epoch[ 1 of 4]-Minibatch[ 221- 230]: CrossEntropyWithSoftmax = 0.72033071 * 250; EvalClassificationError = 0.50800000 * 250; time = 0.1642s; samplesPerSecond = 1522.8
MPI Rank 2: 12/12/2017 15:02:39:  Epoch[ 1 of 4]-Minibatch[ 231- 240]: CrossEntropyWithSoftmax = 0.71295324 * 250; EvalClassificationError = 0.51200000 * 250; time = 0.4627s; samplesPerSecond = 540.3
MPI Rank 2: 12/12/2017 15:02:39:  Epoch[ 1 of 4]-Minibatch[ 241- 250]: CrossEntropyWithSoftmax = 0.69737817 * 250; EvalClassificationError = 0.53200000 * 250; time = 0.2198s; samplesPerSecond = 1137.3
MPI Rank 2: 12/12/2017 15:02:39:  Epoch[ 1 of 4]-Minibatch[ 251- 260]: CrossEntropyWithSoftmax = 0.70251892 * 250; EvalClassificationError = 0.48800000 * 250; time = 0.3773s; samplesPerSecond = 662.6
MPI Rank 2: 12/12/2017 15:02:40:  Epoch[ 1 of 4]-Minibatch[ 261- 270]: CrossEntropyWithSoftmax = 0.70879704 * 250; EvalClassificationError = 0.54400000 * 250; time = 0.2361s; samplesPerSecond = 1058.9
MPI Rank 2: 12/12/2017 15:02:40:  Epoch[ 1 of 4]-Minibatch[ 271- 280]: CrossEntropyWithSoftmax = 0.69856459 * 250; EvalClassificationError = 0.52800000 * 250; time = 0.1836s; samplesPerSecond = 1361.3
MPI Rank 2: 12/12/2017 15:02:40:  Epoch[ 1 of 4]-Minibatch[ 281- 290]: CrossEntropyWithSoftmax = 0.69425907 * 250; EvalClassificationError = 0.44800000 * 250; time = 0.1959s; samplesPerSecond = 1276.4
MPI Rank 2: 12/12/2017 15:02:40:  Epoch[ 1 of 4]-Minibatch[ 291- 300]: CrossEntropyWithSoftmax = 0.69599736 * 250; EvalClassificationError = 0.49600000 * 250; time = 0.1650s; samplesPerSecond = 1515.3
MPI Rank 2: 12/12/2017 15:02:40:  Epoch[ 1 of 4]-Minibatch[ 301- 310]: CrossEntropyWithSoftmax = 0.69591177 * 250; EvalClassificationError = 0.54000000 * 250; time = 0.1804s; samplesPerSecond = 1385.6
MPI Rank 2: 12/12/2017 15:02:41:  Epoch[ 1 of 4]-Minibatch[ 311- 320]: CrossEntropyWithSoftmax = 0.69133098 * 250; EvalClassificationError = 0.40000000 * 250; time = 0.2307s; samplesPerSecond = 1083.8
MPI Rank 2: 12/12/2017 15:02:41:  Epoch[ 1 of 4]-Minibatch[ 321- 330]: CrossEntropyWithSoftmax = 0.69822648 * 250; EvalClassificationError = 0.46800000 * 250; time = 0.4460s; samplesPerSecond = 560.6
MPI Rank 2: 12/12/2017 15:02:41:  Epoch[ 1 of 4]-Minibatch[ 331- 340]: CrossEntropyWithSoftmax = 0.71031539 * 250; EvalClassificationError = 0.50400000 * 250; time = 0.2465s; samplesPerSecond = 1014.0
MPI Rank 2: 12/12/2017 15:02:41:  Epoch[ 1 of 4]-Minibatch[ 341- 350]: CrossEntropyWithSoftmax = 0.70097460 * 250; EvalClassificationError = 0.50000000 * 250; time = 0.2151s; samplesPerSecond = 1162.0
MPI Rank 2: 12/12/2017 15:02:42:  Epoch[ 1 of 4]-Minibatch[ 351- 360]: CrossEntropyWithSoftmax = 0.68927867 * 250; EvalClassificationError = 0.45200000 * 250; time = 0.1827s; samplesPerSecond = 1368.5
MPI Rank 2: 12/12/2017 15:02:42:  Epoch[ 1 of 4]-Minibatch[ 361- 370]: CrossEntropyWithSoftmax = 0.68908389 * 250; EvalClassificationError = 0.50000000 * 250; time = 0.1909s; samplesPerSecond = 1309.5
MPI Rank 2: 12/12/2017 15:02:42:  Epoch[ 1 of 4]-Minibatch[ 371- 380]: CrossEntropyWithSoftmax = 0.67796901 * 250; EvalClassificationError = 0.45600000 * 250; time = 0.1816s; samplesPerSecond = 1376.4
MPI Rank 2: 12/12/2017 15:02:42:  Epoch[ 1 of 4]-Minibatch[ 381- 390]: CrossEntropyWithSoftmax = 0.67863593 * 250; EvalClassificationError = 0.38400000 * 250; time = 0.2378s; samplesPerSecond = 1051.5
MPI Rank 2: 12/12/2017 15:02:42:  Epoch[ 1 of 4]-Minibatch[ 391- 400]: CrossEntropyWithSoftmax = 0.67150936 * 250; EvalClassificationError = 0.42800000 * 250; time = 0.2077s; samplesPerSecond = 1203.5
MPI Rank 2: 12/12/2017 15:02:43: Finished Epoch[ 1 of 4]: [Training] CrossEntropyWithSoftmax = 0.70804123 * 10000; EvalClassificationError = 0.49380000 * 10000; totalSamplesSeen = 10000; learningRatePerSample = 0.02; epochTime=9.491s
MPI Rank 2: 
MPI Rank 2: 12/12/2017 15:02:43: Starting Epoch 2: learning rate per sample = 0.008000  effective momentum = 0.900000  momentum as time constant = 237.3 samples
MPI Rank 2: 
MPI Rank 2: 12/12/2017 15:02:43: Starting minibatch loop, DataParallelSGD training (myRank = 2, numNodes = 4, numGradientBits = 64), distributed reading is ENABLED.
MPI Rank 2: 12/12/2017 15:02:43:  Epoch[ 2 of 4]-Minibatch[   1-  10, 2.50%]: CrossEntropyWithSoftmax = 0.69566490 * 250; EvalClassificationError = 0.49600000 * 250; time = 0.1896s; samplesPerSecond = 1318.9
MPI Rank 2: 12/12/2017 15:02:43:  Epoch[ 2 of 4]-Minibatch[  11-  20, 5.00%]: CrossEntropyWithSoftmax = 0.64058119 * 250; EvalClassificationError = 0.22400000 * 250; time = 0.4177s; samplesPerSecond = 598.6
MPI Rank 2: 12/12/2017 15:02:43:  Epoch[ 2 of 4]-Minibatch[  21-  30, 7.50%]: CrossEntropyWithSoftmax = 0.62577202 * 250; EvalClassificationError = 0.30400000 * 250; time = 0.2884s; samplesPerSecond = 866.7
MPI Rank 2: 12/12/2017 15:02:44:  Epoch[ 2 of 4]-Minibatch[  31-  40, 10.00%]: CrossEntropyWithSoftmax = 0.62974783 * 250; EvalClassificationError = 0.34000000 * 250; time = 0.2740s; samplesPerSecond = 912.4
MPI Rank 2: 12/12/2017 15:02:44:  Epoch[ 2 of 4]-Minibatch[  41-  50, 12.50%]: CrossEntropyWithSoftmax = 0.60705897 * 250; EvalClassificationError = 0.22800000 * 250; time = 0.3279s; samplesPerSecond = 762.4
MPI Rank 2: 12/12/2017 15:02:44:  Epoch[ 2 of 4]-Minibatch[  51-  60, 15.00%]: CrossEntropyWithSoftmax = 0.59038668 * 250; EvalClassificationError = 0.18000000 * 250; time = 0.2164s; samplesPerSecond = 1155.3
MPI Rank 2: 12/12/2017 15:02:45:  Epoch[ 2 of 4]-Minibatch[  61-  70, 17.50%]: CrossEntropyWithSoftmax = 0.55033195 * 250; EvalClassificationError = 0.08800000 * 250; time = 0.2804s; samplesPerSecond = 891.7
MPI Rank 2: 12/12/2017 15:02:45:  Epoch[ 2 of 4]-Minibatch[  71-  80, 20.00%]: CrossEntropyWithSoftmax = 0.53624170 * 250; EvalClassificationError = 0.23200000 * 250; time = 0.2391s; samplesPerSecond = 1045.5
MPI Rank 2: 12/12/2017 15:02:45:  Epoch[ 2 of 4]-Minibatch[  81-  90, 22.50%]: CrossEntropyWithSoftmax = 0.48688308 * 250; EvalClassificationError = 0.12000000 * 250; time = 0.2519s; samplesPerSecond = 992.6
MPI Rank 2: 12/12/2017 15:02:45:  Epoch[ 2 of 4]-Minibatch[  91- 100, 25.00%]: CrossEntropyWithSoftmax = 0.43212926 * 250; EvalClassificationError = 0.08400000 * 250; time = 0.3940s; samplesPerSecond = 634.5
MPI Rank 2: 12/12/2017 15:02:46:  Epoch[ 2 of 4]-Minibatch[ 101- 110, 27.50%]: CrossEntropyWithSoftmax = 0.38559516 * 250; EvalClassificationError = 0.05600000 * 250; time = 0.2275s; samplesPerSecond = 1098.7
MPI Rank 2: 12/12/2017 15:02:46:  Epoch[ 2 of 4]-Minibatch[ 111- 120, 30.00%]: CrossEntropyWithSoftmax = 0.34249535 * 250; EvalClassificationError = 0.06800000 * 250; time = 0.3343s; samplesPerSecond = 747.9
MPI Rank 2: 12/12/2017 15:02:46:  Epoch[ 2 of 4]-Minibatch[ 121- 130, 32.50%]: CrossEntropyWithSoftmax = 0.28670698 * 250; EvalClassificationError = 0.06800000 * 250; time = 0.2946s; samplesPerSecond = 848.5
MPI Rank 2: 12/12/2017 15:02:47:  Epoch[ 2 of 4]-Minibatch[ 131- 140, 35.00%]: CrossEntropyWithSoftmax = 0.26990400 * 250; EvalClassificationError = 0.09200000 * 250; time = 0.2612s; samplesPerSecond = 957.2
MPI Rank 2: 12/12/2017 15:02:47:  Epoch[ 2 of 4]-Minibatch[ 141- 150, 37.50%]: CrossEntropyWithSoftmax = 0.23285507 * 250; EvalClassificationError = 0.06400000 * 250; time = 0.2130s; samplesPerSecond = 1174.0
MPI Rank 2: 12/12/2017 15:02:47:  Epoch[ 2 of 4]-Minibatch[ 151- 160, 40.00%]: CrossEntropyWithSoftmax = 0.25464189 * 250; EvalClassificationError = 0.08400000 * 250; time = 0.2709s; samplesPerSecond = 922.9
MPI Rank 2: 12/12/2017 15:02:47:  Epoch[ 2 of 4]-Minibatch[ 161- 170, 42.50%]: CrossEntropyWithSoftmax = 0.21253995 * 250; EvalClassificationError = 0.07200000 * 250; time = 0.3721s; samplesPerSecond = 671.8
MPI Rank 2: 12/12/2017 15:02:48:  Epoch[ 2 of 4]-Minibatch[ 171- 180, 45.00%]: CrossEntropyWithSoftmax = 0.18708213 * 250; EvalClassificationError = 0.06400000 * 250; time = 0.5081s; samplesPerSecond = 492.1
MPI Rank 2: 12/12/2017 15:02:48:  Epoch[ 2 of 4]-Minibatch[ 181- 190, 47.50%]: CrossEntropyWithSoftmax = 0.21363034 * 250; EvalClassificationError = 0.10000000 * 250; time = 0.3752s; samplesPerSecond = 666.3
MPI Rank 2: 12/12/2017 15:02:49:  Epoch[ 2 of 4]-Minibatch[ 191- 200, 50.00%]: CrossEntropyWithSoftmax = 0.23505436 * 250; EvalClassificationError = 0.08800000 * 250; time = 0.2290s; samplesPerSecond = 1091.9
MPI Rank 2: 12/12/2017 15:02:49:  Epoch[ 2 of 4]-Minibatch[ 201- 210, 52.50%]: CrossEntropyWithSoftmax = 0.20180377 * 250; EvalClassificationError = 0.08000000 * 250; time = 0.1620s; samplesPerSecond = 1543.2
MPI Rank 2: 12/12/2017 15:02:49:  Epoch[ 2 of 4]-Minibatch[ 211- 220, 55.00%]: CrossEntropyWithSoftmax = 0.19780589 * 250; EvalClassificationError = 0.07600000 * 250; time = 0.1381s; samplesPerSecond = 1810.4
MPI Rank 2: 12/12/2017 15:02:49:  Epoch[ 2 of 4]-Minibatch[ 221- 230, 57.50%]: CrossEntropyWithSoftmax = 0.16131109 * 250; EvalClassificationError = 0.06400000 * 250; time = 0.3510s; samplesPerSecond = 712.2
MPI Rank 2: 12/12/2017 15:02:50:  Epoch[ 2 of 4]-Minibatch[ 231- 240, 60.00%]: CrossEntropyWithSoftmax = 0.16479151 * 250; EvalClassificationError = 0.08400000 * 250; time = 0.5448s; samplesPerSecond = 458.9
MPI Rank 2: 12/12/2017 15:02:50:  Epoch[ 2 of 4]-Minibatch[ 241- 250, 62.50%]: CrossEntropyWithSoftmax = 0.20226364 * 250; EvalClassificationError = 0.11600000 * 250; time = 0.4270s; samplesPerSecond = 585.5
MPI Rank 2: 12/12/2017 15:02:50:  Epoch[ 2 of 4]-Minibatch[ 251- 260, 65.00%]: CrossEntropyWithSoftmax = 0.14809078 * 250; EvalClassificationError = 0.07200000 * 250; time = 0.2719s; samplesPerSecond = 919.3
MPI Rank 2: 12/12/2017 15:02:51:  Epoch[ 2 of 4]-Minibatch[ 261- 270, 67.50%]: CrossEntropyWithSoftmax = 0.19001813 * 250; EvalClassificationError = 0.11200000 * 250; time = 0.3421s; samplesPerSecond = 730.8
MPI Rank 2: 12/12/2017 15:02:51:  Epoch[ 2 of 4]-Minibatch[ 271- 280, 70.00%]: CrossEntropyWithSoftmax = 0.19616890 * 250; EvalClassificationError = 0.08000000 * 250; time = 0.2872s; samplesPerSecond = 870.6
MPI Rank 2: 12/12/2017 15:02:51:  Epoch[ 2 of 4]-Minibatch[ 281- 290, 72.50%]: CrossEntropyWithSoftmax = 0.17887468 * 250; EvalClassificationError = 0.07200000 * 250; time = 0.3227s; samplesPerSecond = 774.6
MPI Rank 2: 12/12/2017 15:02:52:  Epoch[ 2 of 4]-Minibatch[ 291- 300, 75.00%]: CrossEntropyWithSoftmax = 0.14040410 * 250; EvalClassificationError = 0.04400000 * 250; time = 0.5508s; samplesPerSecond = 453.9
MPI Rank 2: 12/12/2017 15:02:52:  Epoch[ 2 of 4]-Minibatch[ 301- 310, 77.50%]: CrossEntropyWithSoftmax = 0.17935152 * 250; EvalClassificationError = 0.09200000 * 250; time = 0.3627s; samplesPerSecond = 689.3
MPI Rank 2: 12/12/2017 15:02:53:  Epoch[ 2 of 4]-Minibatch[ 311- 320, 80.00%]: CrossEntropyWithSoftmax = 0.13249072 * 250; EvalClassificationError = 0.05600000 * 250; time = 0.3620s; samplesPerSecond = 690.5
MPI Rank 2: 12/12/2017 15:02:53:  Epoch[ 2 of 4]-Minibatch[ 321- 330, 82.50%]: CrossEntropyWithSoftmax = 0.15483358 * 250; EvalClassificationError = 0.06000000 * 250; time = 0.3514s; samplesPerSecond = 711.4
MPI Rank 2: 12/12/2017 15:02:53:  Epoch[ 2 of 4]-Minibatch[ 331- 340, 85.00%]: CrossEntropyWithSoftmax = 0.19796159 * 250; EvalClassificationError = 0.09200000 * 250; time = 0.2984s; samplesPerSecond = 837.7
MPI Rank 2: 12/12/2017 15:02:54:  Epoch[ 2 of 4]-Minibatch[ 341- 350, 87.50%]: CrossEntropyWithSoftmax = 0.13179462 * 250; EvalClassificationError = 0.05200000 * 250; time = 0.2599s; samplesPerSecond = 962.0
MPI Rank 2: 12/12/2017 15:02:54:  Epoch[ 2 of 4]-Minibatch[ 351- 360, 90.00%]: CrossEntropyWithSoftmax = 0.14028323 * 250; EvalClassificationError = 0.05600000 * 250; time = 0.4339s; samplesPerSecond = 576.1
MPI Rank 2: 12/12/2017 15:02:54:  Epoch[ 2 of 4]-Minibatch[ 361- 370, 92.50%]: CrossEntropyWithSoftmax = 0.12849508 * 250; EvalClassificationError = 0.06000000 * 250; time = 0.2555s; samplesPerSecond = 978.4
MPI Rank 2: 12/12/2017 15:02:54:  Epoch[ 2 of 4]-Minibatch[ 371- 380, 95.00%]: CrossEntropyWithSoftmax = 0.16702669 * 250; EvalClassificationError = 0.09600000 * 250; time = 0.1289s; samplesPerSecond = 1940.2
MPI Rank 2: 12/12/2017 15:02:55:  Epoch[ 2 of 4]-Minibatch[ 381- 390, 97.50%]: CrossEntropyWithSoftmax = 0.20390304 * 250; EvalClassificationError = 0.11200000 * 250; time = 0.2712s; samplesPerSecond = 921.8
MPI Rank 2: 12/12/2017 15:02:55:  Epoch[ 2 of 4]-Minibatch[ 391- 400, 100.00%]: CrossEntropyWithSoftmax = 0.14594790 * 250; EvalClassificationError = 0.06800000 * 250; time = 0.2039s; samplesPerSecond = 1226.2
MPI Rank 2: 12/12/2017 15:02:55: Finished Epoch[ 2 of 4]: [Training] CrossEntropyWithSoftmax = 0.29447308 * 10000; EvalClassificationError = 0.11490000 * 10000; totalSamplesSeen = 20000; learningRatePerSample = 0.0080000004; epochTime=12.3102s
MPI Rank 2: 
MPI Rank 2: 12/12/2017 15:02:55: Starting Epoch 3: learning rate per sample = 0.008000  effective momentum = 0.900000  momentum as time constant = 237.3 samples
MPI Rank 2: 
MPI Rank 2: 12/12/2017 15:02:55: Starting minibatch loop, DataParallelSGD training (myRank = 2, numNodes = 4, numGradientBits = 64), distributed reading is ENABLED.
MPI Rank 2: 12/12/2017 15:02:55:  Epoch[ 3 of 4]-Minibatch[   1-  10, 2.50%]: CrossEntropyWithSoftmax = 0.12813296 * 250; EvalClassificationError = 0.05600000 * 250; time = 0.4254s; samplesPerSecond = 587.7
MPI Rank 2: 12/12/2017 15:02:56:  Epoch[ 3 of 4]-Minibatch[  11-  20, 5.00%]: CrossEntropyWithSoftmax = 0.17615627 * 250; EvalClassificationError = 0.08800000 * 250; time = 0.3399s; samplesPerSecond = 735.5
MPI Rank 2: 12/12/2017 15:02:56:  Epoch[ 3 of 4]-Minibatch[  21-  30, 7.50%]: CrossEntropyWithSoftmax = 0.14587002 * 250; EvalClassificationError = 0.07600000 * 250; time = 0.4331s; samplesPerSecond = 577.3
MPI Rank 2: 12/12/2017 15:02:56:  Epoch[ 3 of 4]-Minibatch[  31-  40, 10.00%]: CrossEntropyWithSoftmax = 0.15938467 * 250; EvalClassificationError = 0.06400000 * 250; time = 0.3030s; samplesPerSecond = 825.1
MPI Rank 2: 12/12/2017 15:02:57:  Epoch[ 3 of 4]-Minibatch[  41-  50, 12.50%]: CrossEntropyWithSoftmax = 0.17100049 * 250; EvalClassificationError = 0.10000000 * 250; time = 0.5121s; samplesPerSecond = 488.1
MPI Rank 2: 12/12/2017 15:02:57:  Epoch[ 3 of 4]-Minibatch[  51-  60, 15.00%]: CrossEntropyWithSoftmax = 0.18281055 * 250; EvalClassificationError = 0.08000000 * 250; time = 0.3197s; samplesPerSecond = 781.9
MPI Rank 2: 12/12/2017 15:02:58:  Epoch[ 3 of 4]-Minibatch[  61-  70, 17.50%]: CrossEntropyWithSoftmax = 0.14781537 * 250; EvalClassificationError = 0.07200000 * 250; time = 0.2917s; samplesPerSecond = 857.0
MPI Rank 2: 12/12/2017 15:02:58:  Epoch[ 3 of 4]-Minibatch[  71-  80, 20.00%]: CrossEntropyWithSoftmax = 0.18045490 * 250; EvalClassificationError = 0.09600000 * 250; time = 0.3645s; samplesPerSecond = 685.9
MPI Rank 2: 12/12/2017 15:02:58:  Epoch[ 3 of 4]-Minibatch[  81-  90, 22.50%]: CrossEntropyWithSoftmax = 0.15847199 * 250; EvalClassificationError = 0.07600000 * 250; time = 0.5295s; samplesPerSecond = 472.1
MPI Rank 2: 12/12/2017 15:02:59:  Epoch[ 3 of 4]-Minibatch[  91- 100, 25.00%]: CrossEntropyWithSoftmax = 0.14513057 * 250; EvalClassificationError = 0.07200000 * 250; time = 0.3314s; samplesPerSecond = 754.5
MPI Rank 2: 12/12/2017 15:02:59:  Epoch[ 3 of 4]-Minibatch[ 101- 110, 27.50%]: CrossEntropyWithSoftmax = 0.13519578 * 250; EvalClassificationError = 0.05200000 * 250; time = 0.4060s; samplesPerSecond = 615.8
MPI Rank 2: 12/12/2017 15:02:59:  Epoch[ 3 of 4]-Minibatch[ 111- 120, 30.00%]: CrossEntropyWithSoftmax = 0.13723644 * 250; EvalClassificationError = 0.06000000 * 250; time = 0.3244s; samplesPerSecond = 770.8
MPI Rank 2: 12/12/2017 15:03:00:  Epoch[ 3 of 4]-Minibatch[ 121- 130, 32.50%]: CrossEntropyWithSoftmax = 0.11692067 * 250; EvalClassificationError = 0.05600000 * 250; time = 0.3478s; samplesPerSecond = 718.8
MPI Rank 2: 12/12/2017 15:03:00:  Epoch[ 3 of 4]-Minibatch[ 131- 140, 35.00%]: CrossEntropyWithSoftmax = 0.16729043 * 250; EvalClassificationError = 0.08800000 * 250; time = 0.2759s; samplesPerSecond = 906.2
MPI Rank 2: 12/12/2017 15:03:01:  Epoch[ 3 of 4]-Minibatch[ 141- 150, 37.50%]: CrossEntropyWithSoftmax = 0.12836481 * 250; EvalClassificationError = 0.04800000 * 250; time = 0.5818s; samplesPerSecond = 429.7
MPI Rank 2: 12/12/2017 15:03:01:  Epoch[ 3 of 4]-Minibatch[ 151- 160, 40.00%]: CrossEntropyWithSoftmax = 0.17320383 * 250; EvalClassificationError = 0.08000000 * 250; time = 0.2060s; samplesPerSecond = 1213.4
MPI Rank 2: 12/12/2017 15:03:01:  Epoch[ 3 of 4]-Minibatch[ 161- 170, 42.50%]: CrossEntropyWithSoftmax = 0.17634559 * 250; EvalClassificationError = 0.09600000 * 250; time = 0.3533s; samplesPerSecond = 707.7
MPI Rank 2: 12/12/2017 15:03:02:  Epoch[ 3 of 4]-Minibatch[ 171- 180, 45.00%]: CrossEntropyWithSoftmax = 0.14124514 * 250; EvalClassificationError = 0.06400000 * 250; time = 0.3109s; samplesPerSecond = 804.1
MPI Rank 2: 12/12/2017 15:03:02:  Epoch[ 3 of 4]-Minibatch[ 181- 190, 47.50%]: CrossEntropyWithSoftmax = 0.19167718 * 250; EvalClassificationError = 0.10000000 * 250; time = 0.2656s; samplesPerSecond = 941.2
MPI Rank 2: 12/12/2017 15:03:02:  Epoch[ 3 of 4]-Minibatch[ 191- 200, 50.00%]: CrossEntropyWithSoftmax = 0.20913003 * 250; EvalClassificationError = 0.10000000 * 250; time = 0.2640s; samplesPerSecond = 947.1
MPI Rank 2: 12/12/2017 15:03:02:  Epoch[ 3 of 4]-Minibatch[ 201- 210, 52.50%]: CrossEntropyWithSoftmax = 0.18460750 * 250; EvalClassificationError = 0.08000000 * 250; time = 0.2379s; samplesPerSecond = 1051.1
MPI Rank 2: 12/12/2017 15:03:03:  Epoch[ 3 of 4]-Minibatch[ 211- 220, 55.00%]: CrossEntropyWithSoftmax = 0.18188216 * 250; EvalClassificationError = 0.07600000 * 250; time = 0.5249s; samplesPerSecond = 476.2
MPI Rank 2: 12/12/2017 15:03:03:  Epoch[ 3 of 4]-Minibatch[ 221- 230, 57.50%]: CrossEntropyWithSoftmax = 0.14069101 * 250; EvalClassificationError = 0.06000000 * 250; time = 0.2948s; samplesPerSecond = 848.1
MPI Rank 2: 12/12/2017 15:03:03:  Epoch[ 3 of 4]-Minibatch[ 231- 240, 60.00%]: CrossEntropyWithSoftmax = 0.14812247 * 250; EvalClassificationError = 0.07600000 * 250; time = 0.3126s; samplesPerSecond = 799.7
MPI Rank 2: 12/12/2017 15:03:04:  Epoch[ 3 of 4]-Minibatch[ 241- 250, 62.50%]: CrossEntropyWithSoftmax = 0.20274092 * 250; EvalClassificationError = 0.11600000 * 250; time = 0.2247s; samplesPerSecond = 1112.3
MPI Rank 2: 12/12/2017 15:03:04:  Epoch[ 3 of 4]-Minibatch[ 251- 260, 65.00%]: CrossEntropyWithSoftmax = 0.12887866 * 250; EvalClassificationError = 0.07200000 * 250; time = 0.2187s; samplesPerSecond = 1143.3
MPI Rank 2: 12/12/2017 15:03:04:  Epoch[ 3 of 4]-Minibatch[ 261- 270, 67.50%]: CrossEntropyWithSoftmax = 0.18595256 * 250; EvalClassificationError = 0.11600000 * 250; time = 0.2152s; samplesPerSecond = 1161.7
MPI Rank 2: 12/12/2017 15:03:04:  Epoch[ 3 of 4]-Minibatch[ 271- 280, 70.00%]: CrossEntropyWithSoftmax = 0.19565326 * 250; EvalClassificationError = 0.08800000 * 250; time = 0.2429s; samplesPerSecond = 1029.0
MPI Rank 2: 12/12/2017 15:03:05:  Epoch[ 3 of 4]-Minibatch[ 281- 290, 72.50%]: CrossEntropyWithSoftmax = 0.16678525 * 250; EvalClassificationError = 0.06800000 * 250; time = 0.4175s; samplesPerSecond = 598.7
MPI Rank 2: 12/12/2017 15:03:05:  Epoch[ 3 of 4]-Minibatch[ 291- 300, 75.00%]: CrossEntropyWithSoftmax = 0.12552459 * 250; EvalClassificationError = 0.04800000 * 250; time = 0.2074s; samplesPerSecond = 1205.3
MPI Rank 2: 12/12/2017 15:03:05:  Epoch[ 3 of 4]-Minibatch[ 301- 310, 77.50%]: CrossEntropyWithSoftmax = 0.17414175 * 250; EvalClassificationError = 0.08800000 * 250; time = 0.1644s; samplesPerSecond = 1521.0
MPI Rank 2: 12/12/2017 15:03:05:  Epoch[ 3 of 4]-Minibatch[ 311- 320, 80.00%]: CrossEntropyWithSoftmax = 0.12295855 * 250; EvalClassificationError = 0.05200000 * 250; time = 0.1384s; samplesPerSecond = 1806.1
MPI Rank 2: 12/12/2017 15:03:06:  Epoch[ 3 of 4]-Minibatch[ 321- 330, 82.50%]: CrossEntropyWithSoftmax = 0.14757012 * 250; EvalClassificationError = 0.06000000 * 250; time = 0.2257s; samplesPerSecond = 1107.6
MPI Rank 2: 12/12/2017 15:03:06:  Epoch[ 3 of 4]-Minibatch[ 331- 340, 85.00%]: CrossEntropyWithSoftmax = 0.19785856 * 250; EvalClassificationError = 0.09600000 * 250; time = 0.1917s; samplesPerSecond = 1304.4
MPI Rank 2: 12/12/2017 15:03:06:  Epoch[ 3 of 4]-Minibatch[ 341- 350, 87.50%]: CrossEntropyWithSoftmax = 0.12600285 * 250; EvalClassificationError = 0.05200000 * 250; time = 0.2097s; samplesPerSecond = 1192.4
MPI Rank 2: 12/12/2017 15:03:06:  Epoch[ 3 of 4]-Minibatch[ 351- 360, 90.00%]: CrossEntropyWithSoftmax = 0.13742899 * 250; EvalClassificationError = 0.05600000 * 250; time = 0.2135s; samplesPerSecond = 1171.1
MPI Rank 2: 12/12/2017 15:03:06:  Epoch[ 3 of 4]-Minibatch[ 361- 370, 92.50%]: CrossEntropyWithSoftmax = 0.12847649 * 250; EvalClassificationError = 0.06000000 * 250; time = 0.1743s; samplesPerSecond = 1434.0
MPI Rank 2: 12/12/2017 15:03:06:  Epoch[ 3 of 4]-Minibatch[ 371- 380, 95.00%]: CrossEntropyWithSoftmax = 0.16652416 * 250; EvalClassificationError = 0.09600000 * 250; time = 0.1945s; samplesPerSecond = 1285.1
MPI Rank 2: 12/12/2017 15:03:07:  Epoch[ 3 of 4]-Minibatch[ 381- 390, 97.50%]: CrossEntropyWithSoftmax = 0.20675721 * 250; EvalClassificationError = 0.11200000 * 250; time = 0.2215s; samplesPerSecond = 1128.6
MPI Rank 2: 12/12/2017 15:03:07:  Epoch[ 3 of 4]-Minibatch[ 391- 400, 100.00%]: CrossEntropyWithSoftmax = 0.14562268 * 250; EvalClassificationError = 0.06400000 * 250; time = 0.4184s; samplesPerSecond = 597.5
MPI Rank 2: 12/12/2017 15:03:07: Finished Epoch[ 3 of 4]: [Training] CrossEntropyWithSoftmax = 0.15965044 * 10000; EvalClassificationError = 0.07650000 * 10000; totalSamplesSeen = 30000; learningRatePerSample = 0.0080000004; epochTime=12.2533s
MPI Rank 2: 
MPI Rank 2: 12/12/2017 15:03:07: Starting Epoch 4: learning rate per sample = 0.008000  effective momentum = 0.900000  momentum as time constant = 237.3 samples
MPI Rank 2: 
MPI Rank 2: 12/12/2017 15:03:07: Starting minibatch loop, DataParallelSGD training (myRank = 2, numNodes = 4, numGradientBits = 64), distributed reading is ENABLED.
MPI Rank 2: 12/12/2017 15:03:07:  Epoch[ 4 of 4]-Minibatch[   1-  10, 2.50%]: CrossEntropyWithSoftmax = 0.12392293 * 250; EvalClassificationError = 0.06000000 * 250; time = 0.2745s; samplesPerSecond = 910.9
MPI Rank 2: 12/12/2017 15:03:08:  Epoch[ 4 of 4]-Minibatch[  11-  20, 5.00%]: CrossEntropyWithSoftmax = 0.18033422 * 250; EvalClassificationError = 0.09600000 * 250; time = 0.2160s; samplesPerSecond = 1157.6
MPI Rank 2: 12/12/2017 15:03:08:  Epoch[ 4 of 4]-Minibatch[  21-  30, 7.50%]: CrossEntropyWithSoftmax = 0.14284000 * 250; EvalClassificationError = 0.07600000 * 250; time = 0.2714s; samplesPerSecond = 921.2
MPI Rank 2: 12/12/2017 15:03:08:  Epoch[ 4 of 4]-Minibatch[  31-  40, 10.00%]: CrossEntropyWithSoftmax = 0.15662491 * 250; EvalClassificationError = 0.06400000 * 250; time = 0.1803s; samplesPerSecond = 1386.5
MPI Rank 2: 12/12/2017 15:03:08:  Epoch[ 4 of 4]-Minibatch[  41-  50, 12.50%]: CrossEntropyWithSoftmax = 0.16985801 * 250; EvalClassificationError = 0.09600000 * 250; time = 0.2398s; samplesPerSecond = 1042.5
MPI Rank 2: 12/12/2017 15:03:09:  Epoch[ 4 of 4]-Minibatch[  51-  60, 15.00%]: CrossEntropyWithSoftmax = 0.18190608 * 250; EvalClassificationError = 0.08000000 * 250; time = 0.1799s; samplesPerSecond = 1389.5
MPI Rank 2: 12/12/2017 15:03:09:  Epoch[ 4 of 4]-Minibatch[  61-  70, 17.50%]: CrossEntropyWithSoftmax = 0.14495470 * 250; EvalClassificationError = 0.07200000 * 250; time = 0.1974s; samplesPerSecond = 1266.5
MPI Rank 2: 12/12/2017 15:03:09:  Epoch[ 4 of 4]-Minibatch[  71-  80, 20.00%]: CrossEntropyWithSoftmax = 0.18022154 * 250; EvalClassificationError = 0.09600000 * 250; time = 0.3265s; samplesPerSecond = 765.7
MPI Rank 2: 12/12/2017 15:03:09:  Epoch[ 4 of 4]-Minibatch[  81-  90, 22.50%]: CrossEntropyWithSoftmax = 0.15852461 * 250; EvalClassificationError = 0.07200000 * 250; time = 0.2631s; samplesPerSecond = 950.3
MPI Rank 2: 12/12/2017 15:03:10:  Epoch[ 4 of 4]-Minibatch[  91- 100, 25.00%]: CrossEntropyWithSoftmax = 0.14466589 * 250; EvalClassificationError = 0.07200000 * 250; time = 0.2238s; samplesPerSecond = 1116.9
MPI Rank 2: 12/12/2017 15:03:10:  Epoch[ 4 of 4]-Minibatch[ 101- 110, 27.50%]: CrossEntropyWithSoftmax = 0.13346404 * 250; EvalClassificationError = 0.05200000 * 250; time = 0.2689s; samplesPerSecond = 929.7
MPI Rank 2: 12/12/2017 15:03:10:  Epoch[ 4 of 4]-Minibatch[ 111- 120, 30.00%]: CrossEntropyWithSoftmax = 0.13683061 * 250; EvalClassificationError = 0.06400000 * 250; time = 0.2039s; samplesPerSecond = 1225.9
MPI Rank 2: 12/12/2017 15:03:10:  Epoch[ 4 of 4]-Minibatch[ 121- 130, 32.50%]: CrossEntropyWithSoftmax = 0.11589011 * 250; EvalClassificationError = 0.05600000 * 250; time = 0.2030s; samplesPerSecond = 1231.5
MPI Rank 2: 12/12/2017 15:03:10:  Epoch[ 4 of 4]-Minibatch[ 131- 140, 35.00%]: CrossEntropyWithSoftmax = 0.16881193 * 250; EvalClassificationError = 0.08800000 * 250; time = 0.1557s; samplesPerSecond = 1605.7
MPI Rank 2: 12/12/2017 15:03:11:  Epoch[ 4 of 4]-Minibatch[ 141- 150, 37.50%]: CrossEntropyWithSoftmax = 0.12736965 * 250; EvalClassificationError = 0.04800000 * 250; time = 0.2051s; samplesPerSecond = 1219.0
MPI Rank 2: 12/12/2017 15:03:11:  Epoch[ 4 of 4]-Minibatch[ 151- 160, 40.00%]: CrossEntropyWithSoftmax = 0.17123603 * 250; EvalClassificationError = 0.08400000 * 250; time = 0.1864s; samplesPerSecond = 1341.0
MPI Rank 2: 12/12/2017 15:03:11:  Epoch[ 4 of 4]-Minibatch[ 161- 170, 42.50%]: CrossEntropyWithSoftmax = 0.17706403 * 250; EvalClassificationError = 0.10000000 * 250; time = 0.1616s; samplesPerSecond = 1546.9
MPI Rank 2: 12/12/2017 15:03:11:  Epoch[ 4 of 4]-Minibatch[ 171- 180, 45.00%]: CrossEntropyWithSoftmax = 0.14104103 * 250; EvalClassificationError = 0.06400000 * 250; time = 0.4113s; samplesPerSecond = 607.8
MPI Rank 2: 12/12/2017 15:03:12:  Epoch[ 4 of 4]-Minibatch[ 181- 190, 47.50%]: CrossEntropyWithSoftmax = 0.19313360 * 250; EvalClassificationError = 0.10000000 * 250; time = 0.2992s; samplesPerSecond = 835.5
MPI Rank 2: 12/12/2017 15:03:12:  Epoch[ 4 of 4]-Minibatch[ 191- 200, 50.00%]: CrossEntropyWithSoftmax = 0.20870745 * 250; EvalClassificationError = 0.10000000 * 250; time = 0.1727s; samplesPerSecond = 1447.7
MPI Rank 2: 12/12/2017 15:03:12:  Epoch[ 4 of 4]-Minibatch[ 201- 210, 52.50%]: CrossEntropyWithSoftmax = 0.18510294 * 250; EvalClassificationError = 0.08000000 * 250; time = 0.2544s; samplesPerSecond = 982.9
MPI Rank 2: 12/12/2017 15:03:12:  Epoch[ 4 of 4]-Minibatch[ 211- 220, 55.00%]: CrossEntropyWithSoftmax = 0.18167137 * 250; EvalClassificationError = 0.07600000 * 250; time = 0.2420s; samplesPerSecond = 1033.2
MPI Rank 2: 12/12/2017 15:03:13:  Epoch[ 4 of 4]-Minibatch[ 221- 230, 57.50%]: CrossEntropyWithSoftmax = 0.14026276 * 250; EvalClassificationError = 0.05600000 * 250; time = 0.2273s; samplesPerSecond = 1100.0
MPI Rank 2: 12/12/2017 15:03:13:  Epoch[ 4 of 4]-Minibatch[ 231- 240, 60.00%]: CrossEntropyWithSoftmax = 0.14811532 * 250; EvalClassificationError = 0.07600000 * 250; time = 0.2213s; samplesPerSecond = 1129.6
MPI Rank 2: 12/12/2017 15:03:13:  Epoch[ 4 of 4]-Minibatch[ 241- 250, 62.50%]: CrossEntropyWithSoftmax = 0.20368129 * 250; EvalClassificationError = 0.11600000 * 250; time = 0.2511s; samplesPerSecond = 995.6
MPI Rank 2: 12/12/2017 15:03:13:  Epoch[ 4 of 4]-Minibatch[ 251- 260, 65.00%]: CrossEntropyWithSoftmax = 0.12819272 * 250; EvalClassificationError = 0.07200000 * 250; time = 0.1989s; samplesPerSecond = 1257.2
MPI Rank 2: 12/12/2017 15:03:14:  Epoch[ 4 of 4]-Minibatch[ 261- 270, 67.50%]: CrossEntropyWithSoftmax = 0.18632901 * 250; EvalClassificationError = 0.11600000 * 250; time = 0.3827s; samplesPerSecond = 653.3
MPI Rank 2: 12/12/2017 15:03:14:  Epoch[ 4 of 4]-Minibatch[ 271- 280, 70.00%]: CrossEntropyWithSoftmax = 0.19568750 * 250; EvalClassificationError = 0.08800000 * 250; time = 0.2545s; samplesPerSecond = 982.3
MPI Rank 2: 12/12/2017 15:03:14:  Epoch[ 4 of 4]-Minibatch[ 281- 290, 72.50%]: CrossEntropyWithSoftmax = 0.16449544 * 250; EvalClassificationError = 0.06800000 * 250; time = 0.1836s; samplesPerSecond = 1361.7
MPI Rank 2: 12/12/2017 15:03:14:  Epoch[ 4 of 4]-Minibatch[ 291- 300, 75.00%]: CrossEntropyWithSoftmax = 0.12454886 * 250; EvalClassificationError = 0.04400000 * 250; time = 0.2519s; samplesPerSecond = 992.3
MPI Rank 2: 12/12/2017 15:03:14:  Epoch[ 4 of 4]-Minibatch[ 301- 310, 77.50%]: CrossEntropyWithSoftmax = 0.17307192 * 250; EvalClassificationError = 0.08400000 * 250; time = 0.2119s; samplesPerSecond = 1179.7
MPI Rank 2: 12/12/2017 15:03:15:  Epoch[ 4 of 4]-Minibatch[ 311- 320, 80.00%]: CrossEntropyWithSoftmax = 0.12249522 * 250; EvalClassificationError = 0.05200000 * 250; time = 0.2119s; samplesPerSecond = 1179.6
MPI Rank 2: 12/12/2017 15:03:15:  Epoch[ 4 of 4]-Minibatch[ 321- 330, 82.50%]: CrossEntropyWithSoftmax = 0.14709682 * 250; EvalClassificationError = 0.06000000 * 250; time = 0.2003s; samplesPerSecond = 1248.2
MPI Rank 2: 12/12/2017 15:03:15:  Epoch[ 4 of 4]-Minibatch[ 331- 340, 85.00%]: CrossEntropyWithSoftmax = 0.19789048 * 250; EvalClassificationError = 0.09200000 * 250; time = 0.2276s; samplesPerSecond = 1098.6
MPI Rank 2: 12/12/2017 15:03:15:  Epoch[ 4 of 4]-Minibatch[ 341- 350, 87.50%]: CrossEntropyWithSoftmax = 0.12572171 * 250; EvalClassificationError = 0.05200000 * 250; time = 0.1959s; samplesPerSecond = 1276.0
MPI Rank 2: 12/12/2017 15:03:16:  Epoch[ 4 of 4]-Minibatch[ 351- 360, 90.00%]: CrossEntropyWithSoftmax = 0.13732392 * 250; EvalClassificationError = 0.05600000 * 250; time = 0.3107s; samplesPerSecond = 804.6
MPI Rank 2: 12/12/2017 15:03:16:  Epoch[ 4 of 4]-Minibatch[ 361- 370, 92.50%]: CrossEntropyWithSoftmax = 0.12857569 * 250; EvalClassificationError = 0.06000000 * 250; time = 0.3253s; samplesPerSecond = 768.5
MPI Rank 2: 12/12/2017 15:03:16:  Epoch[ 4 of 4]-Minibatch[ 371- 380, 95.00%]: CrossEntropyWithSoftmax = 0.16653116 * 250; EvalClassificationError = 0.09600000 * 250; time = 0.1709s; samplesPerSecond = 1462.7
MPI Rank 2: 12/12/2017 15:03:16:  Epoch[ 4 of 4]-Minibatch[ 381- 390, 97.50%]: CrossEntropyWithSoftmax = 0.20715348 * 250; EvalClassificationError = 0.11600000 * 250; time = 0.2880s; samplesPerSecond = 868.0
MPI Rank 2: 12/12/2017 15:03:17:  Epoch[ 4 of 4]-Minibatch[ 391- 400, 100.00%]: CrossEntropyWithSoftmax = 0.14571730 * 250; EvalClassificationError = 0.06400000 * 250; time = 0.2602s; samplesPerSecond = 960.9
MPI Rank 2: 12/12/2017 15:03:17: Finished Epoch[ 4 of 4]: [Training] CrossEntropyWithSoftmax = 0.15917666 * 10000; EvalClassificationError = 0.07660000 * 10000; totalSamplesSeen = 40000; learningRatePerSample = 0.0080000004; epochTime=9.53105s
MPI Rank 2: 
MPI Rank 2: 12/12/2017 15:03:17: Action "train" complete.
MPI Rank 2: 
MPI Rank 2: 12/12/2017 15:03:17: __COMPLETED__
MPI Rank 3: CNTK 2.3.1+ (HEAD f4f0f8, Dec 11 2017 18:34:12) at 2017/12/12 15:02:30
MPI Rank 3: 
MPI Rank 3: /home/ubuntu/workspace/build/gpu/release/bin/cntk  configFile=/home/ubuntu/workspace/Tests/EndToEndTests/ParallelTraining/NoQuantization/DoublePrecision/../../SimpleMultiGPU.cntk  currentDirectory=/home/ubuntu/workspace/Tests/EndToEndTests/ParallelTraining/Data  RunDir=/tmp/cntk-test-20171211223423.932710/ParallelTraining/NoQuantization_DoublePrecision@release_cpu  DataDir=/home/ubuntu/workspace/Tests/EndToEndTests/ParallelTraining/Data  ConfigDir=/home/ubuntu/workspace/Tests/EndToEndTests/ParallelTraining/NoQuantization/DoublePrecision/../..  OutputDir=/tmp/cntk-test-20171211223423.932710/ParallelTraining/NoQuantization_DoublePrecision@release_cpu  DeviceId=-1  timestamping=true  numCPUThreads=3  precision=double  SimpleMultiGPU=[SGD=[ParallelTrain=[DataParallelSGD=[gradientBits=64]]]]  stderr=/tmp/cntk-test-20171211223423.932710/ParallelTraining/NoQuantization_DoublePrecision@release_cpu/stderr
MPI Rank 3: 12/12/2017 15:02:32: -------------------------------------------------------------------
MPI Rank 3: 12/12/2017 15:02:32: Build info: 
MPI Rank 3: 
MPI Rank 3: 12/12/2017 15:02:32: 		Built time: Dec 11 2017 18:28:39
MPI Rank 3: 12/12/2017 15:02:32: 		Last modified date: Wed Nov 15 09:27:10 2017
MPI Rank 3: 12/12/2017 15:02:32: 		Build type: release
MPI Rank 3: 12/12/2017 15:02:32: 		Build target: GPU
MPI Rank 3: 12/12/2017 15:02:32: 		With ASGD: yes
MPI Rank 3: 12/12/2017 15:02:32: 		Math lib: mkl
MPI Rank 3: 12/12/2017 15:02:32: 		CUDA version: 9.0.0
MPI Rank 3: 12/12/2017 15:02:32: 		CUDNN version: 7.0.4
MPI Rank 3: 12/12/2017 15:02:32: 		Build Branch: HEAD
MPI Rank 3: 12/12/2017 15:02:32: 		Build SHA1: f4f0f82eabcc482dbd03af1f946a44ae2b8b97bf
MPI Rank 3: 12/12/2017 15:02:32: 		MPI distribution: Open MPI
MPI Rank 3: 12/12/2017 15:02:32: 		MPI version: 1.10.7
MPI Rank 3: 12/12/2017 15:02:32: -------------------------------------------------------------------
MPI Rank 3: 12/12/2017 15:02:32: -------------------------------------------------------------------
MPI Rank 3: 12/12/2017 15:02:32: GPU info:
MPI Rank 3: 
MPI Rank 3: 12/12/2017 15:02:32: 		Device[0]: cores = 3072; computeCapability = 5.2; type = "Tesla M60"; total memory = 8123 MB; free memory = 7864 MB
MPI Rank 3: 12/12/2017 15:02:32: -------------------------------------------------------------------
MPI Rank 3: 12/12/2017 15:02:32: Using 3 CPU threads.
MPI Rank 3: 
MPI Rank 3: 12/12/2017 15:02:32: ##############################################################################
MPI Rank 3: 12/12/2017 15:02:32: #                                                                            #
MPI Rank 3: 12/12/2017 15:02:32: # SimpleMultiGPU command (train action)                                      #
MPI Rank 3: 12/12/2017 15:02:32: #                                                                            #
MPI Rank 3: 12/12/2017 15:02:32: ##############################################################################
MPI Rank 3: 
MPI Rank 3: 12/12/2017 15:02:32: 
MPI Rank 3: Creating virgin network.
MPI Rank 3: SimpleNetworkBuilder Using CPU
MPI Rank 3: 12/12/2017 15:02:32: 
MPI Rank 3: Model has 25 nodes. Using CPU.
MPI Rank 3: 
MPI Rank 3: 12/12/2017 15:02:32: Training criterion:   CrossEntropyWithSoftmax = CrossEntropyWithSoftmax
MPI Rank 3: 12/12/2017 15:02:32: Evaluation criterion: EvalClassificationError = ClassificationError
MPI Rank 3: 
MPI Rank 3: 
MPI Rank 3: Allocating matrices for forward and/or backward propagation.
MPI Rank 3: 
MPI Rank 3: Gradient Memory Aliasing: 4 are aliased.
MPI Rank 3: 	W2*H1 (gradient) reuses HLast (gradient)
MPI Rank 3: 	W1*H1 (gradient) reuses W1*H1+B1 (gradient)
MPI Rank 3: 
MPI Rank 3: Memory Sharing: Out of 40 matrices, 21 are shared as 5, and 19 are not shared.
MPI Rank 3: 
MPI Rank 3: Here are the ones that share memory:
MPI Rank 3: 	{ PosteriorProb : [2 x 1 x *]
MPI Rank 3: 	  ScaledLogLikelihood : [2 x 1 x *] }
MPI Rank 3: 	{ HLast : [2 x 1 x *] (gradient)
MPI Rank 3: 	  W0 : [50 x 2] (gradient)
MPI Rank 3: 	  W0*features+B0 : [50 x 1 x *] (gradient)
MPI Rank 3: 	  W1*H1 : [50 x 1 x *] (gradient)
MPI Rank 3: 	  W1*H1+B1 : [50 x 1 x *]
MPI Rank 3: 	  W1*H1+B1 : [50 x 1 x *] (gradient)
MPI Rank 3: 	  W2*H1 : [2 x 1 x *]
MPI Rank 3: 	  W2*H1 : [2 x 1 x *] (gradient) }
MPI Rank 3: 	{ B0 : [50 x 1] (gradient)
MPI Rank 3: 	  H1 : [50 x 1 x *] }
MPI Rank 3: 	{ H2 : [50 x 1 x *]
MPI Rank 3: 	  W0*features+B0 : [50 x 1 x *]
MPI Rank 3: 	  W1 : [50 x 50] (gradient)
MPI Rank 3: 	  W1*H1 : [50 x 1 x *] }
MPI Rank 3: 	{ H1 : [50 x 1 x *] (gradient)
MPI Rank 3: 	  H2 : [50 x 1 x *] (gradient)
MPI Rank 3: 	  HLast : [2 x 1 x *]
MPI Rank 3: 	  W0*features : [50 x *]
MPI Rank 3: 	  W0*features : [50 x *] (gradient) }
MPI Rank 3: 
MPI Rank 3: Here are the ones that don't share memory:
MPI Rank 3: 	{features : [2 x *]}
MPI Rank 3: 	{MeanOfFeatures : [2]}
MPI Rank 3: 	{InvStdOfFeatures : [2]}
MPI Rank 3: 	{W0 : [50 x 2]}
MPI Rank 3: 	{B0 : [50 x 1]}
MPI Rank 3: 	{W1 : [50 x 50]}
MPI Rank 3: 	{B1 : [50 x 1]}
MPI Rank 3: 	{W2 : [2 x 50]}
MPI Rank 3: 	{B2 : [2 x 1]}
MPI Rank 3: 	{labels : [2 x *]}
MPI Rank 3: 	{Prior : [2]}
MPI Rank 3: 	{EvalClassificationError : [1]}
MPI Rank 3: 	{CrossEntropyWithSoftmax : [1]}
MPI Rank 3: 	{LogOfPrior : [2]}
MPI Rank 3: 	{CrossEntropyWithSoftmax : [1] (gradient)}
MPI Rank 3: 	{MVNormalizedFeatures : [2 x *]}
MPI Rank 3: 	{B2 : [2 x 1] (gradient)}
MPI Rank 3: 	{B1 : [50 x 1] (gradient)}
MPI Rank 3: 	{W2 : [2 x 50] (gradient)}
MPI Rank 3: 
MPI Rank 3: 
MPI Rank 3: 12/12/2017 15:02:32: Training 2802 parameters in 6 out of 6 parameter tensors and 15 nodes with gradient:
MPI Rank 3: 
MPI Rank 3: 12/12/2017 15:02:32: 	Node 'B0' (LearnableParameter operation) : [50 x 1]
MPI Rank 3: 12/12/2017 15:02:32: 	Node 'B1' (LearnableParameter operation) : [50 x 1]
MPI Rank 3: 12/12/2017 15:02:32: 	Node 'B2' (LearnableParameter operation) : [2 x 1]
MPI Rank 3: 12/12/2017 15:02:32: 	Node 'W0' (LearnableParameter operation) : [50 x 2]
MPI Rank 3: 12/12/2017 15:02:32: 	Node 'W1' (LearnableParameter operation) : [50 x 50]
MPI Rank 3: 12/12/2017 15:02:32: 	Node 'W2' (LearnableParameter operation) : [2 x 50]
MPI Rank 3: 
MPI Rank 3: Initializing dataParallelSGD with FP64 aggregation.
MPI Rank 3: NcclComm: disabled, at least one rank using CPU device
MPI Rank 3: 
MPI Rank 3: 12/12/2017 15:02:32: Precomputing --> 3 PreCompute nodes found.
MPI Rank 3: 
MPI Rank 3: 12/12/2017 15:02:32: 	MeanOfFeatures = Mean()
MPI Rank 3: 12/12/2017 15:02:32: 	InvStdOfFeatures = InvStdDev()
MPI Rank 3: 12/12/2017 15:02:32: 	Prior = Mean()
MPI Rank 3: 
MPI Rank 3: 12/12/2017 15:02:32: Precomputing --> Completed.
MPI Rank 3: 
MPI Rank 3: 
MPI Rank 3: 12/12/2017 15:02:33: Starting Epoch 1: learning rate per sample = 0.020000  effective momentum = 0.900000  momentum as time constant = 237.3 samples
MPI Rank 3: 
MPI Rank 3: 12/12/2017 15:02:33: Starting minibatch loop, DataParallelSGD training (myRank = 3, numNodes = 4, numGradientBits = 64), distributed reading is ENABLED.
MPI Rank 3: 12/12/2017 15:02:33:  Epoch[ 1 of 4]-Minibatch[   1-  10]: CrossEntropyWithSoftmax = 0.69973268 * 250; EvalClassificationError = 0.50400000 * 250; time = 0.4020s; samplesPerSecond = 621.8
MPI Rank 3: 12/12/2017 15:02:34:  Epoch[ 1 of 4]-Minibatch[  11-  20]: CrossEntropyWithSoftmax = 0.71436905 * 250; EvalClassificationError = 0.52000000 * 250; time = 0.2013s; samplesPerSecond = 1242.2
MPI Rank 3: 12/12/2017 15:02:34:  Epoch[ 1 of 4]-Minibatch[  21-  30]: CrossEntropyWithSoftmax = 0.72871054 * 250; EvalClassificationError = 0.47600000 * 250; time = 0.1915s; samplesPerSecond = 1305.7
MPI Rank 3: 12/12/2017 15:02:34:  Epoch[ 1 of 4]-Minibatch[  31-  40]: CrossEntropyWithSoftmax = 0.70038993 * 250; EvalClassificationError = 0.52400000 * 250; time = 0.2047s; samplesPerSecond = 1221.2
MPI Rank 3: 12/12/2017 15:02:34:  Epoch[ 1 of 4]-Minibatch[  41-  50]: CrossEntropyWithSoftmax = 0.70593818 * 250; EvalClassificationError = 0.54000000 * 250; time = 0.4135s; samplesPerSecond = 604.5
MPI Rank 3: 12/12/2017 15:02:35:  Epoch[ 1 of 4]-Minibatch[  51-  60]: CrossEntropyWithSoftmax = 0.71604646 * 250; EvalClassificationError = 0.47600000 * 250; time = 0.2253s; samplesPerSecond = 1109.7
MPI Rank 3: 12/12/2017 15:02:35:  Epoch[ 1 of 4]-Minibatch[  61-  70]: CrossEntropyWithSoftmax = 0.72247949 * 250; EvalClassificationError = 0.48000000 * 250; time = 0.2338s; samplesPerSecond = 1069.3
MPI Rank 3: 12/12/2017 15:02:35:  Epoch[ 1 of 4]-Minibatch[  71-  80]: CrossEntropyWithSoftmax = 0.79884413 * 250; EvalClassificationError = 0.47600000 * 250; time = 0.2179s; samplesPerSecond = 1147.4
MPI Rank 3: 12/12/2017 15:02:35:  Epoch[ 1 of 4]-Minibatch[  81-  90]: CrossEntropyWithSoftmax = 0.69622447 * 250; EvalClassificationError = 0.46800000 * 250; time = 0.1233s; samplesPerSecond = 2028.3
MPI Rank 3: 12/12/2017 15:02:35:  Epoch[ 1 of 4]-Minibatch[  91- 100]: CrossEntropyWithSoftmax = 0.70749459 * 250; EvalClassificationError = 0.49200000 * 250; time = 0.2003s; samplesPerSecond = 1248.4
MPI Rank 3: 12/12/2017 15:02:36:  Epoch[ 1 of 4]-Minibatch[ 101- 110]: CrossEntropyWithSoftmax = 0.71485824 * 250; EvalClassificationError = 0.55200000 * 250; time = 0.2583s; samplesPerSecond = 967.7
MPI Rank 3: 12/12/2017 15:02:36:  Epoch[ 1 of 4]-Minibatch[ 111- 120]: CrossEntropyWithSoftmax = 0.69579152 * 250; EvalClassificationError = 0.43600000 * 250; time = 0.2286s; samplesPerSecond = 1093.5
MPI Rank 3: 12/12/2017 15:02:36:  Epoch[ 1 of 4]-Minibatch[ 121- 130]: CrossEntropyWithSoftmax = 0.70174138 * 250; EvalClassificationError = 0.44000000 * 250; time = 0.2036s; samplesPerSecond = 1228.1
MPI Rank 3: 12/12/2017 15:02:37:  Epoch[ 1 of 4]-Minibatch[ 131- 140]: CrossEntropyWithSoftmax = 0.71926586 * 250; EvalClassificationError = 0.54800000 * 250; time = 0.4441s; samplesPerSecond = 563.0
MPI Rank 3: 12/12/2017 15:02:37:  Epoch[ 1 of 4]-Minibatch[ 141- 150]: CrossEntropyWithSoftmax = 0.72009917 * 250; EvalClassificationError = 0.48800000 * 250; time = 0.2295s; samplesPerSecond = 1089.4
MPI Rank 3: 12/12/2017 15:02:37:  Epoch[ 1 of 4]-Minibatch[ 151- 160]: CrossEntropyWithSoftmax = 0.71854573 * 250; EvalClassificationError = 0.55200000 * 250; time = 0.1999s; samplesPerSecond = 1250.9
MPI Rank 3: 12/12/2017 15:02:37:  Epoch[ 1 of 4]-Minibatch[ 161- 170]: CrossEntropyWithSoftmax = 0.74083729 * 250; EvalClassificationError = 0.50000000 * 250; time = 0.2317s; samplesPerSecond = 1079.2
MPI Rank 3: 12/12/2017 15:02:37:  Epoch[ 1 of 4]-Minibatch[ 171- 180]: CrossEntropyWithSoftmax = 0.71762852 * 250; EvalClassificationError = 0.51600000 * 250; time = 0.1850s; samplesPerSecond = 1351.2
MPI Rank 3: 12/12/2017 15:02:38:  Epoch[ 1 of 4]-Minibatch[ 181- 190]: CrossEntropyWithSoftmax = 0.71530686 * 250; EvalClassificationError = 0.48400000 * 250; time = 0.1788s; samplesPerSecond = 1398.4
MPI Rank 3: 12/12/2017 15:02:38:  Epoch[ 1 of 4]-Minibatch[ 191- 200]: CrossEntropyWithSoftmax = 0.71768617 * 250; EvalClassificationError = 0.53200000 * 250; time = 0.1725s; samplesPerSecond = 1448.9
MPI Rank 3: 12/12/2017 15:02:38:  Epoch[ 1 of 4]-Minibatch[ 201- 210]: CrossEntropyWithSoftmax = 0.71515312 * 250; EvalClassificationError = 0.53600000 * 250; time = 0.1691s; samplesPerSecond = 1478.4
MPI Rank 3: 12/12/2017 15:02:38:  Epoch[ 1 of 4]-Minibatch[ 211- 220]: CrossEntropyWithSoftmax = 0.72047060 * 250; EvalClassificationError = 0.52400000 * 250; time = 0.1981s; samplesPerSecond = 1262.1
MPI Rank 3: 12/12/2017 15:02:38:  Epoch[ 1 of 4]-Minibatch[ 221- 230]: CrossEntropyWithSoftmax = 0.72033071 * 250; EvalClassificationError = 0.50800000 * 250; time = 0.1555s; samplesPerSecond = 1608.2
MPI Rank 3: 12/12/2017 15:02:39:  Epoch[ 1 of 4]-Minibatch[ 231- 240]: CrossEntropyWithSoftmax = 0.71295324 * 250; EvalClassificationError = 0.51200000 * 250; time = 0.4722s; samplesPerSecond = 529.5
MPI Rank 3: 12/12/2017 15:02:39:  Epoch[ 1 of 4]-Minibatch[ 241- 250]: CrossEntropyWithSoftmax = 0.69737817 * 250; EvalClassificationError = 0.53200000 * 250; time = 0.2202s; samplesPerSecond = 1135.3
MPI Rank 3: 12/12/2017 15:02:39:  Epoch[ 1 of 4]-Minibatch[ 251- 260]: CrossEntropyWithSoftmax = 0.70251892 * 250; EvalClassificationError = 0.48800000 * 250; time = 0.3776s; samplesPerSecond = 662.1
MPI Rank 3: 12/12/2017 15:02:40:  Epoch[ 1 of 4]-Minibatch[ 261- 270]: CrossEntropyWithSoftmax = 0.70879704 * 250; EvalClassificationError = 0.54400000 * 250; time = 0.2268s; samplesPerSecond = 1102.1
MPI Rank 3: 12/12/2017 15:02:40:  Epoch[ 1 of 4]-Minibatch[ 271- 280]: CrossEntropyWithSoftmax = 0.69856459 * 250; EvalClassificationError = 0.52800000 * 250; time = 0.1836s; samplesPerSecond = 1361.3
MPI Rank 3: 12/12/2017 15:02:40:  Epoch[ 1 of 4]-Minibatch[ 281- 290]: CrossEntropyWithSoftmax = 0.69425907 * 250; EvalClassificationError = 0.44800000 * 250; time = 0.1888s; samplesPerSecond = 1324.0
MPI Rank 3: 12/12/2017 15:02:40:  Epoch[ 1 of 4]-Minibatch[ 291- 300]: CrossEntropyWithSoftmax = 0.69599736 * 250; EvalClassificationError = 0.49600000 * 250; time = 0.1704s; samplesPerSecond = 1466.8
MPI Rank 3: 12/12/2017 15:02:40:  Epoch[ 1 of 4]-Minibatch[ 301- 310]: CrossEntropyWithSoftmax = 0.69591177 * 250; EvalClassificationError = 0.54000000 * 250; time = 0.1750s; samplesPerSecond = 1428.8
MPI Rank 3: 12/12/2017 15:02:41:  Epoch[ 1 of 4]-Minibatch[ 311- 320]: CrossEntropyWithSoftmax = 0.69133098 * 250; EvalClassificationError = 0.40000000 * 250; time = 0.2465s; samplesPerSecond = 1014.4
MPI Rank 3: 12/12/2017 15:02:41:  Epoch[ 1 of 4]-Minibatch[ 321- 330]: CrossEntropyWithSoftmax = 0.69822648 * 250; EvalClassificationError = 0.46800000 * 250; time = 0.4326s; samplesPerSecond = 577.9
MPI Rank 3: 12/12/2017 15:02:41:  Epoch[ 1 of 4]-Minibatch[ 331- 340]: CrossEntropyWithSoftmax = 0.71031539 * 250; EvalClassificationError = 0.50400000 * 250; time = 0.2469s; samplesPerSecond = 1012.6
MPI Rank 3: 12/12/2017 15:02:41:  Epoch[ 1 of 4]-Minibatch[ 341- 350]: CrossEntropyWithSoftmax = 0.70097460 * 250; EvalClassificationError = 0.50000000 * 250; time = 0.2081s; samplesPerSecond = 1201.2
MPI Rank 3: 12/12/2017 15:02:42:  Epoch[ 1 of 4]-Minibatch[ 351- 360]: CrossEntropyWithSoftmax = 0.68927867 * 250; EvalClassificationError = 0.45200000 * 250; time = 0.1871s; samplesPerSecond = 1336.0
MPI Rank 3: 12/12/2017 15:02:42:  Epoch[ 1 of 4]-Minibatch[ 361- 370]: CrossEntropyWithSoftmax = 0.68908389 * 250; EvalClassificationError = 0.50000000 * 250; time = 0.1906s; samplesPerSecond = 1311.9
MPI Rank 3: 12/12/2017 15:02:42:  Epoch[ 1 of 4]-Minibatch[ 371- 380]: CrossEntropyWithSoftmax = 0.67796901 * 250; EvalClassificationError = 0.45600000 * 250; time = 0.1758s; samplesPerSecond = 1421.8
MPI Rank 3: 12/12/2017 15:02:42:  Epoch[ 1 of 4]-Minibatch[ 381- 390]: CrossEntropyWithSoftmax = 0.67863593 * 250; EvalClassificationError = 0.38400000 * 250; time = 0.2493s; samplesPerSecond = 1002.9
MPI Rank 3: 12/12/2017 15:02:43:  Epoch[ 1 of 4]-Minibatch[ 391- 400]: CrossEntropyWithSoftmax = 0.67150936 * 250; EvalClassificationError = 0.42800000 * 250; time = 0.2323s; samplesPerSecond = 1076.1
MPI Rank 3: 12/12/2017 15:02:43: Finished Epoch[ 1 of 4]: [Training] CrossEntropyWithSoftmax = 0.70804123 * 10000; EvalClassificationError = 0.49380000 * 10000; totalSamplesSeen = 10000; learningRatePerSample = 0.02; epochTime=9.49413s
MPI Rank 3: 
MPI Rank 3: 12/12/2017 15:02:43: Starting Epoch 2: learning rate per sample = 0.008000  effective momentum = 0.900000  momentum as time constant = 237.3 samples
MPI Rank 3: 
MPI Rank 3: 12/12/2017 15:02:43: Starting minibatch loop, DataParallelSGD training (myRank = 3, numNodes = 4, numGradientBits = 64), distributed reading is ENABLED.
MPI Rank 3: 12/12/2017 15:02:43:  Epoch[ 2 of 4]-Minibatch[   1-  10, 2.50%]: CrossEntropyWithSoftmax = 0.69566490 * 250; EvalClassificationError = 0.49600000 * 250; time = 0.1894s; samplesPerSecond = 1320.0
MPI Rank 3: 12/12/2017 15:02:43:  Epoch[ 2 of 4]-Minibatch[  11-  20, 5.00%]: CrossEntropyWithSoftmax = 0.64058119 * 250; EvalClassificationError = 0.22400000 * 250; time = 0.4243s; samplesPerSecond = 589.3
MPI Rank 3: 12/12/2017 15:02:43:  Epoch[ 2 of 4]-Minibatch[  21-  30, 7.50%]: CrossEntropyWithSoftmax = 0.62577202 * 250; EvalClassificationError = 0.30400000 * 250; time = 0.2959s; samplesPerSecond = 844.7
MPI Rank 3: 12/12/2017 15:02:44:  Epoch[ 2 of 4]-Minibatch[  31-  40, 10.00%]: CrossEntropyWithSoftmax = 0.62974783 * 250; EvalClassificationError = 0.34000000 * 250; time = 0.2826s; samplesPerSecond = 884.6
MPI Rank 3: 12/12/2017 15:02:44:  Epoch[ 2 of 4]-Minibatch[  41-  50, 12.50%]: CrossEntropyWithSoftmax = 0.60705897 * 250; EvalClassificationError = 0.22800000 * 250; time = 0.3144s; samplesPerSecond = 795.2
MPI Rank 3: 12/12/2017 15:02:44:  Epoch[ 2 of 4]-Minibatch[  51-  60, 15.00%]: CrossEntropyWithSoftmax = 0.59038668 * 250; EvalClassificationError = 0.18000000 * 250; time = 0.2096s; samplesPerSecond = 1193.0
MPI Rank 3: 12/12/2017 15:02:45:  Epoch[ 2 of 4]-Minibatch[  61-  70, 17.50%]: CrossEntropyWithSoftmax = 0.55033195 * 250; EvalClassificationError = 0.08800000 * 250; time = 0.2730s; samplesPerSecond = 915.8
MPI Rank 3: 12/12/2017 15:02:45:  Epoch[ 2 of 4]-Minibatch[  71-  80, 20.00%]: CrossEntropyWithSoftmax = 0.53624170 * 250; EvalClassificationError = 0.23200000 * 250; time = 0.2476s; samplesPerSecond = 1009.7
MPI Rank 3: 12/12/2017 15:02:45:  Epoch[ 2 of 4]-Minibatch[  81-  90, 22.50%]: CrossEntropyWithSoftmax = 0.48688308 * 250; EvalClassificationError = 0.12000000 * 250; time = 0.2713s; samplesPerSecond = 921.5
MPI Rank 3: 12/12/2017 15:02:45:  Epoch[ 2 of 4]-Minibatch[  91- 100, 25.00%]: CrossEntropyWithSoftmax = 0.43212926 * 250; EvalClassificationError = 0.08400000 * 250; time = 0.3440s; samplesPerSecond = 726.7
MPI Rank 3: 12/12/2017 15:02:46:  Epoch[ 2 of 4]-Minibatch[ 101- 110, 27.50%]: CrossEntropyWithSoftmax = 0.38559516 * 250; EvalClassificationError = 0.05600000 * 250; time = 0.2398s; samplesPerSecond = 1042.4
MPI Rank 3: 12/12/2017 15:02:46:  Epoch[ 2 of 4]-Minibatch[ 111- 120, 30.00%]: CrossEntropyWithSoftmax = 0.34249535 * 250; EvalClassificationError = 0.06800000 * 250; time = 0.3479s; samplesPerSecond = 718.7
MPI Rank 3: 12/12/2017 15:02:46:  Epoch[ 2 of 4]-Minibatch[ 121- 130, 32.50%]: CrossEntropyWithSoftmax = 0.28670698 * 250; EvalClassificationError = 0.06800000 * 250; time = 0.2953s; samplesPerSecond = 846.6
MPI Rank 3: 12/12/2017 15:02:47:  Epoch[ 2 of 4]-Minibatch[ 131- 140, 35.00%]: CrossEntropyWithSoftmax = 0.26990400 * 250; EvalClassificationError = 0.09200000 * 250; time = 0.2592s; samplesPerSecond = 964.7
MPI Rank 3: 12/12/2017 15:02:47:  Epoch[ 2 of 4]-Minibatch[ 141- 150, 37.50%]: CrossEntropyWithSoftmax = 0.23285507 * 250; EvalClassificationError = 0.06400000 * 250; time = 0.2053s; samplesPerSecond = 1217.5
MPI Rank 3: 12/12/2017 15:02:47:  Epoch[ 2 of 4]-Minibatch[ 151- 160, 40.00%]: CrossEntropyWithSoftmax = 0.25464189 * 250; EvalClassificationError = 0.08400000 * 250; time = 0.2726s; samplesPerSecond = 917.0
MPI Rank 3: 12/12/2017 15:02:47:  Epoch[ 2 of 4]-Minibatch[ 161- 170, 42.50%]: CrossEntropyWithSoftmax = 0.21253995 * 250; EvalClassificationError = 0.07200000 * 250; time = 0.3752s; samplesPerSecond = 666.4
MPI Rank 3: 12/12/2017 15:02:48:  Epoch[ 2 of 4]-Minibatch[ 171- 180, 45.00%]: CrossEntropyWithSoftmax = 0.18708213 * 250; EvalClassificationError = 0.06400000 * 250; time = 0.5278s; samplesPerSecond = 473.7
MPI Rank 3: 12/12/2017 15:02:48:  Epoch[ 2 of 4]-Minibatch[ 181- 190, 47.50%]: CrossEntropyWithSoftmax = 0.21363034 * 250; EvalClassificationError = 0.10000000 * 250; time = 0.3556s; samplesPerSecond = 703.1
MPI Rank 3: 12/12/2017 15:02:49:  Epoch[ 2 of 4]-Minibatch[ 191- 200, 50.00%]: CrossEntropyWithSoftmax = 0.23505436 * 250; EvalClassificationError = 0.08800000 * 250; time = 0.2314s; samplesPerSecond = 1080.4
MPI Rank 3: 12/12/2017 15:02:49:  Epoch[ 2 of 4]-Minibatch[ 201- 210, 52.50%]: CrossEntropyWithSoftmax = 0.20180377 * 250; EvalClassificationError = 0.08000000 * 250; time = 0.1603s; samplesPerSecond = 1559.8
MPI Rank 3: 12/12/2017 15:02:49:  Epoch[ 2 of 4]-Minibatch[ 211- 220, 55.00%]: CrossEntropyWithSoftmax = 0.19780589 * 250; EvalClassificationError = 0.07600000 * 250; time = 0.1373s; samplesPerSecond = 1820.4
MPI Rank 3: 12/12/2017 15:02:49:  Epoch[ 2 of 4]-Minibatch[ 221- 230, 57.50%]: CrossEntropyWithSoftmax = 0.16131109 * 250; EvalClassificationError = 0.06400000 * 250; time = 0.3574s; samplesPerSecond = 699.4
MPI Rank 3: 12/12/2017 15:02:50:  Epoch[ 2 of 4]-Minibatch[ 231- 240, 60.00%]: CrossEntropyWithSoftmax = 0.16479151 * 250; EvalClassificationError = 0.08400000 * 250; time = 0.5638s; samplesPerSecond = 443.4
MPI Rank 3: 12/12/2017 15:02:50:  Epoch[ 2 of 4]-Minibatch[ 241- 250, 62.50%]: CrossEntropyWithSoftmax = 0.20226364 * 250; EvalClassificationError = 0.11600000 * 250; time = 0.3802s; samplesPerSecond = 657.6
MPI Rank 3: 12/12/2017 15:02:50:  Epoch[ 2 of 4]-Minibatch[ 251- 260, 65.00%]: CrossEntropyWithSoftmax = 0.14809078 * 250; EvalClassificationError = 0.07200000 * 250; time = 0.3007s; samplesPerSecond = 831.5
MPI Rank 3: 12/12/2017 15:02:51:  Epoch[ 2 of 4]-Minibatch[ 261- 270, 67.50%]: CrossEntropyWithSoftmax = 0.19001813 * 250; EvalClassificationError = 0.11200000 * 250; time = 0.3347s; samplesPerSecond = 747.0
MPI Rank 3: 12/12/2017 15:02:51:  Epoch[ 2 of 4]-Minibatch[ 271- 280, 70.00%]: CrossEntropyWithSoftmax = 0.19616890 * 250; EvalClassificationError = 0.08000000 * 250; time = 0.2872s; samplesPerSecond = 870.5
MPI Rank 3: 12/12/2017 15:02:51:  Epoch[ 2 of 4]-Minibatch[ 281- 290, 72.50%]: CrossEntropyWithSoftmax = 0.17887468 * 250; EvalClassificationError = 0.07200000 * 250; time = 0.3225s; samplesPerSecond = 775.2
MPI Rank 3: 12/12/2017 15:02:52:  Epoch[ 2 of 4]-Minibatch[ 291- 300, 75.00%]: CrossEntropyWithSoftmax = 0.14040410 * 250; EvalClassificationError = 0.04400000 * 250; time = 0.5653s; samplesPerSecond = 442.2
MPI Rank 3: 12/12/2017 15:02:52:  Epoch[ 2 of 4]-Minibatch[ 301- 310, 77.50%]: CrossEntropyWithSoftmax = 0.17935152 * 250; EvalClassificationError = 0.09200000 * 250; time = 0.3483s; samplesPerSecond = 717.7
MPI Rank 3: 12/12/2017 15:02:53:  Epoch[ 2 of 4]-Minibatch[ 311- 320, 80.00%]: CrossEntropyWithSoftmax = 0.13249072 * 250; EvalClassificationError = 0.05600000 * 250; time = 0.3701s; samplesPerSecond = 675.5
MPI Rank 3: 12/12/2017 15:02:53:  Epoch[ 2 of 4]-Minibatch[ 321- 330, 82.50%]: CrossEntropyWithSoftmax = 0.15483358 * 250; EvalClassificationError = 0.06000000 * 250; time = 0.3433s; samplesPerSecond = 728.2
MPI Rank 3: 12/12/2017 15:02:53:  Epoch[ 2 of 4]-Minibatch[ 331- 340, 85.00%]: CrossEntropyWithSoftmax = 0.19796159 * 250; EvalClassificationError = 0.09200000 * 250; time = 0.2985s; samplesPerSecond = 837.5
MPI Rank 3: 12/12/2017 15:02:54:  Epoch[ 2 of 4]-Minibatch[ 341- 350, 87.50%]: CrossEntropyWithSoftmax = 0.13179462 * 250; EvalClassificationError = 0.05200000 * 250; time = 0.2599s; samplesPerSecond = 961.7
MPI Rank 3: 12/12/2017 15:02:54:  Epoch[ 2 of 4]-Minibatch[ 351- 360, 90.00%]: CrossEntropyWithSoftmax = 0.14028323 * 250; EvalClassificationError = 0.05600000 * 250; time = 0.4379s; samplesPerSecond = 571.0
MPI Rank 3: 12/12/2017 15:02:54:  Epoch[ 2 of 4]-Minibatch[ 361- 370, 92.50%]: CrossEntropyWithSoftmax = 0.12849508 * 250; EvalClassificationError = 0.06000000 * 250; time = 0.2634s; samplesPerSecond = 949.2
MPI Rank 3: 12/12/2017 15:02:54:  Epoch[ 2 of 4]-Minibatch[ 371- 380, 95.00%]: CrossEntropyWithSoftmax = 0.16702669 * 250; EvalClassificationError = 0.09600000 * 250; time = 0.1194s; samplesPerSecond = 2094.6
MPI Rank 3: 12/12/2017 15:02:55:  Epoch[ 2 of 4]-Minibatch[ 381- 390, 97.50%]: CrossEntropyWithSoftmax = 0.20390304 * 250; EvalClassificationError = 0.11200000 * 250; time = 0.2846s; samplesPerSecond = 878.4
MPI Rank 3: 12/12/2017 15:02:55:  Epoch[ 2 of 4]-Minibatch[ 391- 400, 100.00%]: CrossEntropyWithSoftmax = 0.14594790 * 250; EvalClassificationError = 0.06800000 * 250; time = 0.1821s; samplesPerSecond = 1372.6
MPI Rank 3: 12/12/2017 15:02:55: Finished Epoch[ 2 of 4]: [Training] CrossEntropyWithSoftmax = 0.29447308 * 10000; EvalClassificationError = 0.11490000 * 10000; totalSamplesSeen = 20000; learningRatePerSample = 0.0080000004; epochTime=12.3184s
MPI Rank 3: 
MPI Rank 3: 12/12/2017 15:02:55: Starting Epoch 3: learning rate per sample = 0.008000  effective momentum = 0.900000  momentum as time constant = 237.3 samples
MPI Rank 3: 
MPI Rank 3: 12/12/2017 15:02:55: Starting minibatch loop, DataParallelSGD training (myRank = 3, numNodes = 4, numGradientBits = 64), distributed reading is ENABLED.
MPI Rank 3: 12/12/2017 15:02:55:  Epoch[ 3 of 4]-Minibatch[   1-  10, 2.50%]: CrossEntropyWithSoftmax = 0.12813296 * 250; EvalClassificationError = 0.05600000 * 250; time = 0.4426s; samplesPerSecond = 564.8
MPI Rank 3: 12/12/2017 15:02:56:  Epoch[ 3 of 4]-Minibatch[  11-  20, 5.00%]: CrossEntropyWithSoftmax = 0.17615627 * 250; EvalClassificationError = 0.08800000 * 250; time = 0.3114s; samplesPerSecond = 802.8
MPI Rank 3: 12/12/2017 15:02:56:  Epoch[ 3 of 4]-Minibatch[  21-  30, 7.50%]: CrossEntropyWithSoftmax = 0.14587002 * 250; EvalClassificationError = 0.07600000 * 250; time = 0.4548s; samplesPerSecond = 549.7
MPI Rank 3: 12/12/2017 15:02:56:  Epoch[ 3 of 4]-Minibatch[  31-  40, 10.00%]: CrossEntropyWithSoftmax = 0.15938467 * 250; EvalClassificationError = 0.06400000 * 250; time = 0.2949s; samplesPerSecond = 847.7
MPI Rank 3: 12/12/2017 15:02:57:  Epoch[ 3 of 4]-Minibatch[  41-  50, 12.50%]: CrossEntropyWithSoftmax = 0.17100049 * 250; EvalClassificationError = 0.10000000 * 250; time = 0.5040s; samplesPerSecond = 496.0
MPI Rank 3: 12/12/2017 15:02:57:  Epoch[ 3 of 4]-Minibatch[  51-  60, 15.00%]: CrossEntropyWithSoftmax = 0.18281055 * 250; EvalClassificationError = 0.08000000 * 250; time = 0.3437s; samplesPerSecond = 727.3
MPI Rank 3: 12/12/2017 15:02:58:  Epoch[ 3 of 4]-Minibatch[  61-  70, 17.50%]: CrossEntropyWithSoftmax = 0.14781537 * 250; EvalClassificationError = 0.07200000 * 250; time = 0.2946s; samplesPerSecond = 848.6
MPI Rank 3: 12/12/2017 15:02:58:  Epoch[ 3 of 4]-Minibatch[  71-  80, 20.00%]: CrossEntropyWithSoftmax = 0.18045490 * 250; EvalClassificationError = 0.09600000 * 250; time = 0.3439s; samplesPerSecond = 727.0
MPI Rank 3: 12/12/2017 15:02:58:  Epoch[ 3 of 4]-Minibatch[  81-  90, 22.50%]: CrossEntropyWithSoftmax = 0.15847199 * 250; EvalClassificationError = 0.07600000 * 250; time = 0.5294s; samplesPerSecond = 472.3
MPI Rank 3: 12/12/2017 15:02:59:  Epoch[ 3 of 4]-Minibatch[  91- 100, 25.00%]: CrossEntropyWithSoftmax = 0.14513057 * 250; EvalClassificationError = 0.07200000 * 250; time = 0.3440s; samplesPerSecond = 726.7
MPI Rank 3: 12/12/2017 15:02:59:  Epoch[ 3 of 4]-Minibatch[ 101- 110, 27.50%]: CrossEntropyWithSoftmax = 0.13519578 * 250; EvalClassificationError = 0.05200000 * 250; time = 0.3957s; samplesPerSecond = 631.7
MPI Rank 3: 12/12/2017 15:02:59:  Epoch[ 3 of 4]-Minibatch[ 111- 120, 30.00%]: CrossEntropyWithSoftmax = 0.13723644 * 250; EvalClassificationError = 0.06000000 * 250; time = 0.3359s; samplesPerSecond = 744.2
MPI Rank 3: 12/12/2017 15:03:00:  Epoch[ 3 of 4]-Minibatch[ 121- 130, 32.50%]: CrossEntropyWithSoftmax = 0.11692067 * 250; EvalClassificationError = 0.05600000 * 250; time = 0.3519s; samplesPerSecond = 710.4
MPI Rank 3: 12/12/2017 15:03:00:  Epoch[ 3 of 4]-Minibatch[ 131- 140, 35.00%]: CrossEntropyWithSoftmax = 0.16729043 * 250; EvalClassificationError = 0.08800000 * 250; time = 0.2278s; samplesPerSecond = 1097.3
MPI Rank 3: 12/12/2017 15:03:01:  Epoch[ 3 of 4]-Minibatch[ 141- 150, 37.50%]: CrossEntropyWithSoftmax = 0.12836481 * 250; EvalClassificationError = 0.04800000 * 250; time = 0.6062s; samplesPerSecond = 412.4
MPI Rank 3: 12/12/2017 15:03:01:  Epoch[ 3 of 4]-Minibatch[ 151- 160, 40.00%]: CrossEntropyWithSoftmax = 0.17320383 * 250; EvalClassificationError = 0.08000000 * 250; time = 0.2192s; samplesPerSecond = 1140.3
MPI Rank 3: 12/12/2017 15:03:01:  Epoch[ 3 of 4]-Minibatch[ 161- 170, 42.50%]: CrossEntropyWithSoftmax = 0.17634559 * 250; EvalClassificationError = 0.09600000 * 250; time = 0.3585s; samplesPerSecond = 697.4
MPI Rank 3: 12/12/2017 15:03:02:  Epoch[ 3 of 4]-Minibatch[ 171- 180, 45.00%]: CrossEntropyWithSoftmax = 0.14124514 * 250; EvalClassificationError = 0.06400000 * 250; time = 0.2945s; samplesPerSecond = 848.8
MPI Rank 3: 12/12/2017 15:03:02:  Epoch[ 3 of 4]-Minibatch[ 181- 190, 47.50%]: CrossEntropyWithSoftmax = 0.19167718 * 250; EvalClassificationError = 0.10000000 * 250; time = 0.2638s; samplesPerSecond = 947.5
MPI Rank 3: 12/12/2017 15:03:02:  Epoch[ 3 of 4]-Minibatch[ 191- 200, 50.00%]: CrossEntropyWithSoftmax = 0.20913003 * 250; EvalClassificationError = 0.10000000 * 250; time = 0.2642s; samplesPerSecond = 946.3
MPI Rank 3: 12/12/2017 15:03:02:  Epoch[ 3 of 4]-Minibatch[ 201- 210, 52.50%]: CrossEntropyWithSoftmax = 0.18460750 * 250; EvalClassificationError = 0.08000000 * 250; time = 0.2532s; samplesPerSecond = 987.4
MPI Rank 3: 12/12/2017 15:03:03:  Epoch[ 3 of 4]-Minibatch[ 211- 220, 55.00%]: CrossEntropyWithSoftmax = 0.18188216 * 250; EvalClassificationError = 0.07600000 * 250; time = 0.5237s; samplesPerSecond = 477.4
MPI Rank 3: 12/12/2017 15:03:03:  Epoch[ 3 of 4]-Minibatch[ 221- 230, 57.50%]: CrossEntropyWithSoftmax = 0.14069101 * 250; EvalClassificationError = 0.06000000 * 250; time = 0.2844s; samplesPerSecond = 879.0
MPI Rank 3: 12/12/2017 15:03:03:  Epoch[ 3 of 4]-Minibatch[ 231- 240, 60.00%]: CrossEntropyWithSoftmax = 0.14812247 * 250; EvalClassificationError = 0.07600000 * 250; time = 0.3098s; samplesPerSecond = 806.9
MPI Rank 3: 12/12/2017 15:03:04:  Epoch[ 3 of 4]-Minibatch[ 241- 250, 62.50%]: CrossEntropyWithSoftmax = 0.20274092 * 250; EvalClassificationError = 0.11600000 * 250; time = 0.2253s; samplesPerSecond = 1109.5
MPI Rank 3: 12/12/2017 15:03:04:  Epoch[ 3 of 4]-Minibatch[ 251- 260, 65.00%]: CrossEntropyWithSoftmax = 0.12887866 * 250; EvalClassificationError = 0.07200000 * 250; time = 0.2170s; samplesPerSecond = 1152.0
MPI Rank 3: 12/12/2017 15:03:04:  Epoch[ 3 of 4]-Minibatch[ 261- 270, 67.50%]: CrossEntropyWithSoftmax = 0.18595256 * 250; EvalClassificationError = 0.11600000 * 250; time = 0.2146s; samplesPerSecond = 1164.9
MPI Rank 3: 12/12/2017 15:03:04:  Epoch[ 3 of 4]-Minibatch[ 271- 280, 70.00%]: CrossEntropyWithSoftmax = 0.19565326 * 250; EvalClassificationError = 0.08800000 * 250; time = 0.2429s; samplesPerSecond = 1029.1
MPI Rank 3: 12/12/2017 15:03:05:  Epoch[ 3 of 4]-Minibatch[ 281- 290, 72.50%]: CrossEntropyWithSoftmax = 0.16678525 * 250; EvalClassificationError = 0.06800000 * 250; time = 0.4208s; samplesPerSecond = 594.1
MPI Rank 3: 12/12/2017 15:03:05:  Epoch[ 3 of 4]-Minibatch[ 291- 300, 75.00%]: CrossEntropyWithSoftmax = 0.12552459 * 250; EvalClassificationError = 0.04800000 * 250; time = 0.2130s; samplesPerSecond = 1173.9
MPI Rank 3: 12/12/2017 15:03:05:  Epoch[ 3 of 4]-Minibatch[ 301- 310, 77.50%]: CrossEntropyWithSoftmax = 0.17414175 * 250; EvalClassificationError = 0.08800000 * 250; time = 0.1488s; samplesPerSecond = 1679.8
MPI Rank 3: 12/12/2017 15:03:05:  Epoch[ 3 of 4]-Minibatch[ 311- 320, 80.00%]: CrossEntropyWithSoftmax = 0.12295855 * 250; EvalClassificationError = 0.05200000 * 250; time = 0.1485s; samplesPerSecond = 1683.3
MPI Rank 3: 12/12/2017 15:03:06:  Epoch[ 3 of 4]-Minibatch[ 321- 330, 82.50%]: CrossEntropyWithSoftmax = 0.14757012 * 250; EvalClassificationError = 0.06000000 * 250; time = 0.2273s; samplesPerSecond = 1100.1
MPI Rank 3: 12/12/2017 15:03:06:  Epoch[ 3 of 4]-Minibatch[ 331- 340, 85.00%]: CrossEntropyWithSoftmax = 0.19785856 * 250; EvalClassificationError = 0.09600000 * 250; time = 0.1962s; samplesPerSecond = 1274.2
MPI Rank 3: 12/12/2017 15:03:06:  Epoch[ 3 of 4]-Minibatch[ 341- 350, 87.50%]: CrossEntropyWithSoftmax = 0.12600285 * 250; EvalClassificationError = 0.05200000 * 250; time = 0.2012s; samplesPerSecond = 1242.4
MPI Rank 3: 12/12/2017 15:03:06:  Epoch[ 3 of 4]-Minibatch[ 351- 360, 90.00%]: CrossEntropyWithSoftmax = 0.13742899 * 250; EvalClassificationError = 0.05600000 * 250; time = 0.2124s; samplesPerSecond = 1176.9
MPI Rank 3: 12/12/2017 15:03:06:  Epoch[ 3 of 4]-Minibatch[ 361- 370, 92.50%]: CrossEntropyWithSoftmax = 0.12847649 * 250; EvalClassificationError = 0.06000000 * 250; time = 0.1723s; samplesPerSecond = 1451.0
MPI Rank 3: 12/12/2017 15:03:06:  Epoch[ 3 of 4]-Minibatch[ 371- 380, 95.00%]: CrossEntropyWithSoftmax = 0.16652416 * 250; EvalClassificationError = 0.09600000 * 250; time = 0.2056s; samplesPerSecond = 1216.0
MPI Rank 3: 12/12/2017 15:03:07:  Epoch[ 3 of 4]-Minibatch[ 381- 390, 97.50%]: CrossEntropyWithSoftmax = 0.20675721 * 250; EvalClassificationError = 0.11200000 * 250; time = 0.2204s; samplesPerSecond = 1134.5
MPI Rank 3: 12/12/2017 15:03:07:  Epoch[ 3 of 4]-Minibatch[ 391- 400, 100.00%]: CrossEntropyWithSoftmax = 0.14562268 * 250; EvalClassificationError = 0.06400000 * 250; time = 0.4046s; samplesPerSecond = 617.9
MPI Rank 3: 12/12/2017 15:03:07: Finished Epoch[ 3 of 4]: [Training] CrossEntropyWithSoftmax = 0.15965044 * 10000; EvalClassificationError = 0.07650000 * 10000; totalSamplesSeen = 30000; learningRatePerSample = 0.0080000004; epochTime=12.2587s
MPI Rank 3: 
MPI Rank 3: 12/12/2017 15:03:07: Starting Epoch 4: learning rate per sample = 0.008000  effective momentum = 0.900000  momentum as time constant = 237.3 samples
MPI Rank 3: 
MPI Rank 3: 12/12/2017 15:03:07: Starting minibatch loop, DataParallelSGD training (myRank = 3, numNodes = 4, numGradientBits = 64), distributed reading is ENABLED.
MPI Rank 3: 12/12/2017 15:03:07:  Epoch[ 4 of 4]-Minibatch[   1-  10, 2.50%]: CrossEntropyWithSoftmax = 0.12392293 * 250; EvalClassificationError = 0.06000000 * 250; time = 0.2664s; samplesPerSecond = 938.3
MPI Rank 3: 12/12/2017 15:03:08:  Epoch[ 4 of 4]-Minibatch[  11-  20, 5.00%]: CrossEntropyWithSoftmax = 0.18033422 * 250; EvalClassificationError = 0.09600000 * 250; time = 0.2267s; samplesPerSecond = 1102.7
MPI Rank 3: 12/12/2017 15:03:08:  Epoch[ 4 of 4]-Minibatch[  21-  30, 7.50%]: CrossEntropyWithSoftmax = 0.14284000 * 250; EvalClassificationError = 0.07600000 * 250; time = 0.2661s; samplesPerSecond = 939.5
MPI Rank 3: 12/12/2017 15:03:08:  Epoch[ 4 of 4]-Minibatch[  31-  40, 10.00%]: CrossEntropyWithSoftmax = 0.15662491 * 250; EvalClassificationError = 0.06400000 * 250; time = 0.1832s; samplesPerSecond = 1364.9
MPI Rank 3: 12/12/2017 15:03:08:  Epoch[ 4 of 4]-Minibatch[  41-  50, 12.50%]: CrossEntropyWithSoftmax = 0.16985801 * 250; EvalClassificationError = 0.09600000 * 250; time = 0.2277s; samplesPerSecond = 1098.0
MPI Rank 3: 12/12/2017 15:03:09:  Epoch[ 4 of 4]-Minibatch[  51-  60, 15.00%]: CrossEntropyWithSoftmax = 0.18190608 * 250; EvalClassificationError = 0.08000000 * 250; time = 0.1785s; samplesPerSecond = 1400.7
MPI Rank 3: 12/12/2017 15:03:09:  Epoch[ 4 of 4]-Minibatch[  61-  70, 17.50%]: CrossEntropyWithSoftmax = 0.14495470 * 250; EvalClassificationError = 0.07200000 * 250; time = 0.1992s; samplesPerSecond = 1254.8
MPI Rank 3: 12/12/2017 15:03:09:  Epoch[ 4 of 4]-Minibatch[  71-  80, 20.00%]: CrossEntropyWithSoftmax = 0.18022154 * 250; EvalClassificationError = 0.09600000 * 250; time = 0.3329s; samplesPerSecond = 751.0
MPI Rank 3: 12/12/2017 15:03:09:  Epoch[ 4 of 4]-Minibatch[  81-  90, 22.50%]: CrossEntropyWithSoftmax = 0.15852461 * 250; EvalClassificationError = 0.07200000 * 250; time = 0.2482s; samplesPerSecond = 1007.1
MPI Rank 3: 12/12/2017 15:03:10:  Epoch[ 4 of 4]-Minibatch[  91- 100, 25.00%]: CrossEntropyWithSoftmax = 0.14466589 * 250; EvalClassificationError = 0.07200000 * 250; time = 0.2370s; samplesPerSecond = 1054.7
MPI Rank 3: 12/12/2017 15:03:10:  Epoch[ 4 of 4]-Minibatch[ 101- 110, 27.50%]: CrossEntropyWithSoftmax = 0.13346404 * 250; EvalClassificationError = 0.05200000 * 250; time = 0.2895s; samplesPerSecond = 863.5
MPI Rank 3: 12/12/2017 15:03:10:  Epoch[ 4 of 4]-Minibatch[ 111- 120, 30.00%]: CrossEntropyWithSoftmax = 0.13683061 * 250; EvalClassificationError = 0.06400000 * 250; time = 0.1772s; samplesPerSecond = 1411.0
MPI Rank 3: 12/12/2017 15:03:10:  Epoch[ 4 of 4]-Minibatch[ 121- 130, 32.50%]: CrossEntropyWithSoftmax = 0.11589011 * 250; EvalClassificationError = 0.05600000 * 250; time = 0.2010s; samplesPerSecond = 1244.0
MPI Rank 3: 12/12/2017 15:03:10:  Epoch[ 4 of 4]-Minibatch[ 131- 140, 35.00%]: CrossEntropyWithSoftmax = 0.16881193 * 250; EvalClassificationError = 0.08800000 * 250; time = 0.1773s; samplesPerSecond = 1409.8
MPI Rank 3: 12/12/2017 15:03:11:  Epoch[ 4 of 4]-Minibatch[ 141- 150, 37.50%]: CrossEntropyWithSoftmax = 0.12736965 * 250; EvalClassificationError = 0.04800000 * 250; time = 0.1975s; samplesPerSecond = 1265.6
MPI Rank 3: 12/12/2017 15:03:11:  Epoch[ 4 of 4]-Minibatch[ 151- 160, 40.00%]: CrossEntropyWithSoftmax = 0.17123603 * 250; EvalClassificationError = 0.08400000 * 250; time = 0.1759s; samplesPerSecond = 1421.5
MPI Rank 3: 12/12/2017 15:03:11:  Epoch[ 4 of 4]-Minibatch[ 161- 170, 42.50%]: CrossEntropyWithSoftmax = 0.17706403 * 250; EvalClassificationError = 0.10000000 * 250; time = 0.1614s; samplesPerSecond = 1548.7
MPI Rank 3: 12/12/2017 15:03:11:  Epoch[ 4 of 4]-Minibatch[ 171- 180, 45.00%]: CrossEntropyWithSoftmax = 0.14104103 * 250; EvalClassificationError = 0.06400000 * 250; time = 0.4190s; samplesPerSecond = 596.7
MPI Rank 3: 12/12/2017 15:03:12:  Epoch[ 4 of 4]-Minibatch[ 181- 190, 47.50%]: CrossEntropyWithSoftmax = 0.19313360 * 250; EvalClassificationError = 0.10000000 * 250; time = 0.2910s; samplesPerSecond = 859.2
MPI Rank 3: 12/12/2017 15:03:12:  Epoch[ 4 of 4]-Minibatch[ 191- 200, 50.00%]: CrossEntropyWithSoftmax = 0.20870745 * 250; EvalClassificationError = 0.10000000 * 250; time = 0.1756s; samplesPerSecond = 1424.1
MPI Rank 3: 12/12/2017 15:03:12:  Epoch[ 4 of 4]-Minibatch[ 201- 210, 52.50%]: CrossEntropyWithSoftmax = 0.18510294 * 250; EvalClassificationError = 0.08000000 * 250; time = 0.2675s; samplesPerSecond = 934.7
MPI Rank 3: 12/12/2017 15:03:12:  Epoch[ 4 of 4]-Minibatch[ 211- 220, 55.00%]: CrossEntropyWithSoftmax = 0.18167137 * 250; EvalClassificationError = 0.07600000 * 250; time = 0.2321s; samplesPerSecond = 1077.2
MPI Rank 3: 12/12/2017 15:03:13:  Epoch[ 4 of 4]-Minibatch[ 221- 230, 57.50%]: CrossEntropyWithSoftmax = 0.14026276 * 250; EvalClassificationError = 0.05600000 * 250; time = 0.2219s; samplesPerSecond = 1126.9
MPI Rank 3: 12/12/2017 15:03:13:  Epoch[ 4 of 4]-Minibatch[ 231- 240, 60.00%]: CrossEntropyWithSoftmax = 0.14811532 * 250; EvalClassificationError = 0.07600000 * 250; time = 0.2173s; samplesPerSecond = 1150.7
MPI Rank 3: 12/12/2017 15:03:13:  Epoch[ 4 of 4]-Minibatch[ 241- 250, 62.50%]: CrossEntropyWithSoftmax = 0.20368129 * 250; EvalClassificationError = 0.11600000 * 250; time = 0.2663s; samplesPerSecond = 938.6
MPI Rank 3: 12/12/2017 15:03:13:  Epoch[ 4 of 4]-Minibatch[ 251- 260, 65.00%]: CrossEntropyWithSoftmax = 0.12819272 * 250; EvalClassificationError = 0.07200000 * 250; time = 0.2029s; samplesPerSecond = 1231.9
MPI Rank 3: 12/12/2017 15:03:14:  Epoch[ 4 of 4]-Minibatch[ 261- 270, 67.50%]: CrossEntropyWithSoftmax = 0.18632901 * 250; EvalClassificationError = 0.11600000 * 250; time = 0.3374s; samplesPerSecond = 741.0
MPI Rank 3: 12/12/2017 15:03:14:  Epoch[ 4 of 4]-Minibatch[ 271- 280, 70.00%]: CrossEntropyWithSoftmax = 0.19568750 * 250; EvalClassificationError = 0.08800000 * 250; time = 0.2876s; samplesPerSecond = 869.3
MPI Rank 3: 12/12/2017 15:03:14:  Epoch[ 4 of 4]-Minibatch[ 281- 290, 72.50%]: CrossEntropyWithSoftmax = 0.16449544 * 250; EvalClassificationError = 0.06800000 * 250; time = 0.1654s; samplesPerSecond = 1511.7
MPI Rank 3: 12/12/2017 15:03:14:  Epoch[ 4 of 4]-Minibatch[ 291- 300, 75.00%]: CrossEntropyWithSoftmax = 0.12454886 * 250; EvalClassificationError = 0.04400000 * 250; time = 0.2703s; samplesPerSecond = 924.9
MPI Rank 3: 12/12/2017 15:03:14:  Epoch[ 4 of 4]-Minibatch[ 301- 310, 77.50%]: CrossEntropyWithSoftmax = 0.17307192 * 250; EvalClassificationError = 0.08400000 * 250; time = 0.2134s; samplesPerSecond = 1171.6
MPI Rank 3: 12/12/2017 15:03:15:  Epoch[ 4 of 4]-Minibatch[ 311- 320, 80.00%]: CrossEntropyWithSoftmax = 0.12249522 * 250; EvalClassificationError = 0.05200000 * 250; time = 0.2178s; samplesPerSecond = 1147.7
MPI Rank 3: 12/12/2017 15:03:15:  Epoch[ 4 of 4]-Minibatch[ 321- 330, 82.50%]: CrossEntropyWithSoftmax = 0.14709682 * 250; EvalClassificationError = 0.06000000 * 250; time = 0.1967s; samplesPerSecond = 1270.8
MPI Rank 3: 12/12/2017 15:03:15:  Epoch[ 4 of 4]-Minibatch[ 331- 340, 85.00%]: CrossEntropyWithSoftmax = 0.19789048 * 250; EvalClassificationError = 0.09200000 * 250; time = 0.2437s; samplesPerSecond = 1026.1
MPI Rank 3: 12/12/2017 15:03:15:  Epoch[ 4 of 4]-Minibatch[ 341- 350, 87.50%]: CrossEntropyWithSoftmax = 0.12572171 * 250; EvalClassificationError = 0.05200000 * 250; time = 0.1871s; samplesPerSecond = 1335.9
MPI Rank 3: 12/12/2017 15:03:16:  Epoch[ 4 of 4]-Minibatch[ 351- 360, 90.00%]: CrossEntropyWithSoftmax = 0.13732392 * 250; EvalClassificationError = 0.05600000 * 250; time = 0.2988s; samplesPerSecond = 836.8
MPI Rank 3: 12/12/2017 15:03:16:  Epoch[ 4 of 4]-Minibatch[ 361- 370, 92.50%]: CrossEntropyWithSoftmax = 0.12857569 * 250; EvalClassificationError = 0.06000000 * 250; time = 0.3255s; samplesPerSecond = 768.1
MPI Rank 3: 12/12/2017 15:03:16:  Epoch[ 4 of 4]-Minibatch[ 371- 380, 95.00%]: CrossEntropyWithSoftmax = 0.16653116 * 250; EvalClassificationError = 0.09600000 * 250; time = 0.1707s; samplesPerSecond = 1464.9
MPI Rank 3: 12/12/2017 15:03:16:  Epoch[ 4 of 4]-Minibatch[ 381- 390, 97.50%]: CrossEntropyWithSoftmax = 0.20715348 * 250; EvalClassificationError = 0.11600000 * 250; time = 0.2886s; samplesPerSecond = 866.4
MPI Rank 3: 12/12/2017 15:03:17:  Epoch[ 4 of 4]-Minibatch[ 391- 400, 100.00%]: CrossEntropyWithSoftmax = 0.14571730 * 250; EvalClassificationError = 0.06400000 * 250; time = 0.2622s; samplesPerSecond = 953.5
MPI Rank 3: 12/12/2017 15:03:17: Finished Epoch[ 4 of 4]: [Training] CrossEntropyWithSoftmax = 0.15917666 * 10000; EvalClassificationError = 0.07660000 * 10000; totalSamplesSeen = 40000; learningRatePerSample = 0.0080000004; epochTime=9.52707s
MPI Rank 3: 
MPI Rank 3: 12/12/2017 15:03:17: Action "train" complete.
MPI Rank 3: 
MPI Rank 3: 12/12/2017 15:03:17: __COMPLETED__