CPU info:
    CPU Model Name: Intel(R) Xeon(R) CPU E5-2690 v3 @ 2.60GHz
    Hardware threads: 12
    Total Memory: 57700428 kB
-------------------------------------------------------------------
=== Running mpiexec -n 4 /home/ubuntu/workspace/build/gpu/release/bin/cntk configFile=/home/ubuntu/workspace/Tests/EndToEndTests/ParallelTraining/NoQuantization/SinglePrecision/../../SimpleMultiGPU.cntk currentDirectory=/home/ubuntu/workspace/Tests/EndToEndTests/ParallelTraining/Data RunDir=/tmp/cntk-test-20180116190516.17566/ParallelTraining/NoQuantization_SinglePrecision@release_gpu DataDir=/home/ubuntu/workspace/Tests/EndToEndTests/ParallelTraining/Data ConfigDir=/home/ubuntu/workspace/Tests/EndToEndTests/ParallelTraining/NoQuantization/SinglePrecision/../.. OutputDir=/tmp/cntk-test-20180116190516.17566/ParallelTraining/NoQuantization_SinglePrecision@release_gpu DeviceId=0 timestamping=true numCPUThreads=3 precision=float SimpleMultiGPU=[SGD=[ParallelTrain=[DataParallelSGD=[gradientBits=32]]]] stderr=/tmp/cntk-test-20180116190516.17566/ParallelTraining/NoQuantization_SinglePrecision@release_gpu/stderr
CNTK 2.3.1+ (HEAD c4c2ce, Jan 16 2018 16:21:59) at 2018/01/16 19:05:24

/home/ubuntu/workspace/build/gpu/release/bin/cntk  configFile=/home/ubuntu/workspace/Tests/EndToEndTests/ParallelTraining/NoQuantization/SinglePrecision/../../SimpleMultiGPU.cntk  currentDirectory=/home/ubuntu/workspace/Tests/EndToEndTests/ParallelTraining/Data  RunDir=/tmp/cntk-test-20180116190516.17566/ParallelTraining/NoQuantization_SinglePrecision@release_gpu  DataDir=/home/ubuntu/workspace/Tests/EndToEndTests/ParallelTraining/Data  ConfigDir=/home/ubuntu/workspace/Tests/EndToEndTests/ParallelTraining/NoQuantization/SinglePrecision/../..  OutputDir=/tmp/cntk-test-20180116190516.17566/ParallelTraining/NoQuantization_SinglePrecision@release_gpu  DeviceId=0  timestamping=true  numCPUThreads=3  precision=float  SimpleMultiGPU=[SGD=[ParallelTrain=[DataParallelSGD=[gradientBits=32]]]]  stderr=/tmp/cntk-test-20180116190516.17566/ParallelTraining/NoQuantization_SinglePrecision@release_gpu/stderr
Changed current directory to /home/ubuntu/workspace/Tests/EndToEndTests/ParallelTraining/Data
CNTK 2.3.1+ (HEAD c4c2ce, Jan 16 2018 16:21:59) at 2018/01/16 19:05:24

/home/ubuntu/workspace/build/gpu/release/bin/cntk  configFile=/home/ubuntu/workspace/Tests/EndToEndTests/ParallelTraining/NoQuantization/SinglePrecision/../../SimpleMultiGPU.cntk  currentDirectory=/home/ubuntu/workspace/Tests/EndToEndTests/ParallelTraining/Data  RunDir=/tmp/cntk-test-20180116190516.17566/ParallelTraining/NoQuantization_SinglePrecision@release_gpu  DataDir=/home/ubuntu/workspace/Tests/EndToEndTests/ParallelTraining/Data  ConfigDir=/home/ubuntu/workspace/Tests/EndToEndTests/ParallelTraining/NoQuantization/SinglePrecision/../..  OutputDir=/tmp/cntk-test-20180116190516.17566/ParallelTraining/NoQuantization_SinglePrecision@release_gpu  DeviceId=0  timestamping=true  numCPUThreads=3  precision=float  SimpleMultiGPU=[SGD=[ParallelTrain=[DataParallelSGD=[gradientBits=32]]]]  stderr=/tmp/cntk-test-20180116190516.17566/ParallelTraining/NoQuantization_SinglePrecision@release_gpu/stderr
Changed current directory to /home/ubuntu/workspace/Tests/EndToEndTests/ParallelTraining/Data
CNTK 2.3.1+ (HEAD c4c2ce, Jan 16 2018 16:21:59) at 2018/01/16 19:05:24

/home/ubuntu/workspace/build/gpu/release/bin/cntk  configFile=/home/ubuntu/workspace/Tests/EndToEndTests/ParallelTraining/NoQuantization/SinglePrecision/../../SimpleMultiGPU.cntk  currentDirectory=/home/ubuntu/workspace/Tests/EndToEndTests/ParallelTraining/Data  RunDir=/tmp/cntk-test-20180116190516.17566/ParallelTraining/NoQuantization_SinglePrecision@release_gpu  DataDir=/home/ubuntu/workspace/Tests/EndToEndTests/ParallelTraining/Data  ConfigDir=/home/ubuntu/workspace/Tests/EndToEndTests/ParallelTraining/NoQuantization/SinglePrecision/../..  OutputDir=/tmp/cntk-test-20180116190516.17566/ParallelTraining/NoQuantization_SinglePrecision@release_gpu  DeviceId=0  timestamping=true  numCPUThreads=3  precision=float  SimpleMultiGPU=[SGD=[ParallelTrain=[DataParallelSGD=[gradientBits=32]]]]  stderr=/tmp/cntk-test-20180116190516.17566/ParallelTraining/NoQuantization_SinglePrecision@release_gpu/stderr
Changed current directory to /home/ubuntu/workspace/Tests/EndToEndTests/ParallelTraining/Data
CNTK 2.3.1+ (HEAD c4c2ce, Jan 16 2018 16:21:59) at 2018/01/16 19:05:24

/home/ubuntu/workspace/build/gpu/release/bin/cntk  configFile=/home/ubuntu/workspace/Tests/EndToEndTests/ParallelTraining/NoQuantization/SinglePrecision/../../SimpleMultiGPU.cntk  currentDirectory=/home/ubuntu/workspace/Tests/EndToEndTests/ParallelTraining/Data  RunDir=/tmp/cntk-test-20180116190516.17566/ParallelTraining/NoQuantization_SinglePrecision@release_gpu  DataDir=/home/ubuntu/workspace/Tests/EndToEndTests/ParallelTraining/Data  ConfigDir=/home/ubuntu/workspace/Tests/EndToEndTests/ParallelTraining/NoQuantization/SinglePrecision/../..  OutputDir=/tmp/cntk-test-20180116190516.17566/ParallelTraining/NoQuantization_SinglePrecision@release_gpu  DeviceId=0  timestamping=true  numCPUThreads=3  precision=float  SimpleMultiGPU=[SGD=[ParallelTrain=[DataParallelSGD=[gradientBits=32]]]]  stderr=/tmp/cntk-test-20180116190516.17566/ParallelTraining/NoQuantization_SinglePrecision@release_gpu/stderr
Changed current directory to /home/ubuntu/workspace/Tests/EndToEndTests/ParallelTraining/Data
--------------------------------------------------------------------------
[[39249,1],0]: A high-performance Open MPI point-to-point messaging module
was unable to find any relevant network interfaces:

Module: OpenFabrics (openib)
  Host: 7fee1579d8b2

Another transport will be used instead, although this may result in
lower performance.
--------------------------------------------------------------------------
ping [requestnodes (before change)]: 4 nodes pinging each other
ping [requestnodes (before change)]: 4 nodes pinging each other
ping [requestnodes (before change)]: 4 nodes pinging each other
ping [requestnodes (before change)]: 4 nodes pinging each other
ping [requestnodes (after change)]: 4 nodes pinging each other
ping [requestnodes (after change)]: 4 nodes pinging each other
ping [requestnodes (after change)]: 4 nodes pinging each other
ping [requestnodes (after change)]: 4 nodes pinging each other
requestnodes [MPIWrapperMpi]: using 4 out of 4 MPI nodes on a single host (4 requested); we (3) are in (participating)
ping [mpihelper]: 4 nodes pinging each other
requestnodes [MPIWrapperMpi]: using 4 out of 4 MPI nodes on a single host (4 requested); we (0) are in (participating)
ping [mpihelper]: 4 nodes pinging each other
requestnodes [MPIWrapperMpi]: using 4 out of 4 MPI nodes on a single host (4 requested); we (2) are in (participating)
ping [mpihelper]: 4 nodes pinging each other
requestnodes [MPIWrapperMpi]: using 4 out of 4 MPI nodes on a single host (4 requested); we (1) are in (participating)
ping [mpihelper]: 4 nodes pinging each other
01/16/2018 19:05:24: Redirecting stderr to file /tmp/cntk-test-20180116190516.17566/ParallelTraining/NoQuantization_SinglePrecision@release_gpu/stderr_SimpleMultiGPU.logrank0
01/16/2018 19:05:25: Redirecting stderr to file /tmp/cntk-test-20180116190516.17566/ParallelTraining/NoQuantization_SinglePrecision@release_gpu/stderr_SimpleMultiGPU.logrank1
01/16/2018 19:05:25: Redirecting stderr to file /tmp/cntk-test-20180116190516.17566/ParallelTraining/NoQuantization_SinglePrecision@release_gpu/stderr_SimpleMultiGPU.logrank2
01/16/2018 19:05:26: Redirecting stderr to file /tmp/cntk-test-20180116190516.17566/ParallelTraining/NoQuantization_SinglePrecision@release_gpu/stderr_SimpleMultiGPU.logrank3
[7fee1579d8b2:08151] 3 more processes have sent help message help-mpi-btl-base.txt / btl:no-nics
[7fee1579d8b2:08151] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
MPI Rank 0: CNTK 2.3.1+ (HEAD c4c2ce, Jan 16 2018 16:21:59) at 2018/01/16 19:05:24
MPI Rank 0: 
MPI Rank 0: /home/ubuntu/workspace/build/gpu/release/bin/cntk  configFile=/home/ubuntu/workspace/Tests/EndToEndTests/ParallelTraining/NoQuantization/SinglePrecision/../../SimpleMultiGPU.cntk  currentDirectory=/home/ubuntu/workspace/Tests/EndToEndTests/ParallelTraining/Data  RunDir=/tmp/cntk-test-20180116190516.17566/ParallelTraining/NoQuantization_SinglePrecision@release_gpu  DataDir=/home/ubuntu/workspace/Tests/EndToEndTests/ParallelTraining/Data  ConfigDir=/home/ubuntu/workspace/Tests/EndToEndTests/ParallelTraining/NoQuantization/SinglePrecision/../..  OutputDir=/tmp/cntk-test-20180116190516.17566/ParallelTraining/NoQuantization_SinglePrecision@release_gpu  DeviceId=0  timestamping=true  numCPUThreads=3  precision=float  SimpleMultiGPU=[SGD=[ParallelTrain=[DataParallelSGD=[gradientBits=32]]]]  stderr=/tmp/cntk-test-20180116190516.17566/ParallelTraining/NoQuantization_SinglePrecision@release_gpu/stderr
MPI Rank 0: 01/16/2018 19:05:24: -------------------------------------------------------------------
MPI Rank 0: 01/16/2018 19:05:24: Build info: 
MPI Rank 0: 
MPI Rank 0: 01/16/2018 19:05:24: 		Built time: Jan 16 2018 16:15:42
MPI Rank 0: 01/16/2018 19:05:24: 		Last modified date: Tue Jan 16 16:13:51 2018
MPI Rank 0: 01/16/2018 19:05:24: 		Build type: release
MPI Rank 0: 01/16/2018 19:05:24: 		Build target: GPU
MPI Rank 0: 01/16/2018 19:05:24: 		With ASGD: yes
MPI Rank 0: 01/16/2018 19:05:24: 		Math lib: mkl
MPI Rank 0: 01/16/2018 19:05:24: 		CUDA version: 9.0.0
MPI Rank 0: 01/16/2018 19:05:24: 		CUDNN version: 7.0.4
MPI Rank 0: 01/16/2018 19:05:24: 		Build Branch: HEAD
MPI Rank 0: 01/16/2018 19:05:24: 		Build SHA1: c4c2ce8c6e89b5c32e4d07523081283417bcfc6d
MPI Rank 0: 01/16/2018 19:05:24: 		MPI distribution: Open MPI
MPI Rank 0: 01/16/2018 19:05:24: 		MPI version: 1.10.7
MPI Rank 0: 01/16/2018 19:05:24: -------------------------------------------------------------------
MPI Rank 0: 01/16/2018 19:05:24: -------------------------------------------------------------------
MPI Rank 0: 01/16/2018 19:05:24: GPU info:
MPI Rank 0: 
MPI Rank 0: 01/16/2018 19:05:24: 		Device[0]: cores = 3072; computeCapability = 5.2; type = "Tesla M60"; total memory = 8123 MB; free memory = 8112 MB
MPI Rank 0: 01/16/2018 19:05:24: -------------------------------------------------------------------
MPI Rank 0: 01/16/2018 19:05:24: Using 3 CPU threads.
MPI Rank 0: 
MPI Rank 0: 01/16/2018 19:05:24: ##############################################################################
MPI Rank 0: 01/16/2018 19:05:24: #                                                                            #
MPI Rank 0: 01/16/2018 19:05:24: # SimpleMultiGPU command (train action)                                      #
MPI Rank 0: 01/16/2018 19:05:24: #                                                                            #
MPI Rank 0: 01/16/2018 19:05:24: ##############################################################################
MPI Rank 0: 
MPI Rank 0: 01/16/2018 19:05:24: 
MPI Rank 0: Creating virgin network.
MPI Rank 0: SimpleNetworkBuilder Using GPU 0
MPI Rank 0: 01/16/2018 19:05:25: 
MPI Rank 0: Model has 25 nodes. Using GPU 0.
MPI Rank 0: 
MPI Rank 0: 01/16/2018 19:05:25: Training criterion:   CrossEntropyWithSoftmax = CrossEntropyWithSoftmax
MPI Rank 0: 01/16/2018 19:05:25: Evaluation criterion: EvalClassificationError = ClassificationError
MPI Rank 0: 
MPI Rank 0: 
MPI Rank 0: Allocating matrices for forward and/or backward propagation.
MPI Rank 0: 
MPI Rank 0: Gradient Memory Aliasing: 4 are aliased.
MPI Rank 0: 	W2*H1 (gradient) reuses HLast (gradient)
MPI Rank 0: 	W1*H1 (gradient) reuses W1*H1+B1 (gradient)
MPI Rank 0: 
MPI Rank 0: Memory Sharing: Out of 40 matrices, 21 are shared as 5, and 19 are not shared.
MPI Rank 0: 
MPI Rank 0: Here are the ones that share memory:
MPI Rank 0: 	{ PosteriorProb : [2 x 1 x *]
MPI Rank 0: 	  ScaledLogLikelihood : [2 x 1 x *] }
MPI Rank 0: 	{ H2 : [50 x 1 x *]
MPI Rank 0: 	  W0*features+B0 : [50 x 1 x *]
MPI Rank 0: 	  W1 : [50 x 50] (gradient)
MPI Rank 0: 	  W1*H1 : [50 x 1 x *] }
MPI Rank 0: 	{ B0 : [50 x 1] (gradient)
MPI Rank 0: 	  H1 : [50 x 1 x *] }
MPI Rank 0: 	{ HLast : [2 x 1 x *] (gradient)
MPI Rank 0: 	  W0 : [50 x 2] (gradient)
MPI Rank 0: 	  W0*features+B0 : [50 x 1 x *] (gradient)
MPI Rank 0: 	  W1*H1 : [50 x 1 x *] (gradient)
MPI Rank 0: 	  W1*H1+B1 : [50 x 1 x *]
MPI Rank 0: 	  W1*H1+B1 : [50 x 1 x *] (gradient)
MPI Rank 0: 	  W2*H1 : [2 x 1 x *]
MPI Rank 0: 	  W2*H1 : [2 x 1 x *] (gradient) }
MPI Rank 0: 	{ H1 : [50 x 1 x *] (gradient)
MPI Rank 0: 	  H2 : [50 x 1 x *] (gradient)
MPI Rank 0: 	  HLast : [2 x 1 x *]
MPI Rank 0: 	  W0*features : [50 x *]
MPI Rank 0: 	  W0*features : [50 x *] (gradient) }
MPI Rank 0: 
MPI Rank 0: Here are the ones that don't share memory:
MPI Rank 0: 	{EvalClassificationError : [1]}
MPI Rank 0: 	{CrossEntropyWithSoftmax : [1]}
MPI Rank 0: 	{LogOfPrior : [2]}
MPI Rank 0: 	{W2 : [2 x 50] (gradient)}
MPI Rank 0: 	{MVNormalizedFeatures : [2 x *]}
MPI Rank 0: 	{B2 : [2 x 1] (gradient)}
MPI Rank 0: 	{B1 : [50 x 1] (gradient)}
MPI Rank 0: 	{CrossEntropyWithSoftmax : [1] (gradient)}
MPI Rank 0: 	{W2 : [2 x 50]}
MPI Rank 0: 	{B2 : [2 x 1]}
MPI Rank 0: 	{labels : [2 x *]}
MPI Rank 0: 	{Prior : [2]}
MPI Rank 0: 	{B0 : [50 x 1]}
MPI Rank 0: 	{W1 : [50 x 50]}
MPI Rank 0: 	{B1 : [50 x 1]}
MPI Rank 0: 	{InvStdOfFeatures : [2]}
MPI Rank 0: 	{W0 : [50 x 2]}
MPI Rank 0: 	{MeanOfFeatures : [2]}
MPI Rank 0: 	{features : [2 x *]}
MPI Rank 0: 
MPI Rank 0: 
MPI Rank 0: 01/16/2018 19:05:25: Training 2802 parameters in 6 out of 6 parameter tensors and 15 nodes with gradient:
MPI Rank 0: 
MPI Rank 0: 01/16/2018 19:05:25: 	Node 'B0' (LearnableParameter operation) : [50 x 1]
MPI Rank 0: 01/16/2018 19:05:25: 	Node 'B1' (LearnableParameter operation) : [50 x 1]
MPI Rank 0: 01/16/2018 19:05:25: 	Node 'B2' (LearnableParameter operation) : [2 x 1]
MPI Rank 0: 01/16/2018 19:05:25: 	Node 'W0' (LearnableParameter operation) : [50 x 2]
MPI Rank 0: 01/16/2018 19:05:25: 	Node 'W1' (LearnableParameter operation) : [50 x 50]
MPI Rank 0: 01/16/2018 19:05:25: 	Node 'W2' (LearnableParameter operation) : [2 x 50]
MPI Rank 0: 
MPI Rank 0: Initializing dataParallelSGD with FP32 aggregation.
MPI Rank 0: NcclComm: disabled, same device used by more than one rank
MPI Rank 0: 
MPI Rank 0: 01/16/2018 19:05:26: Precomputing --> 3 PreCompute nodes found.
MPI Rank 0: 
MPI Rank 0: 01/16/2018 19:05:26: 	MeanOfFeatures = Mean()
MPI Rank 0: 01/16/2018 19:05:26: 	InvStdOfFeatures = InvStdDev()
MPI Rank 0: 01/16/2018 19:05:26: 	Prior = Mean()
MPI Rank 0: 
MPI Rank 0: 01/16/2018 19:05:26: Precomputing --> Completed.
MPI Rank 0: 
MPI Rank 0: 
MPI Rank 0: 01/16/2018 19:05:27: Starting Epoch 1: learning rate per sample = 0.020000  effective momentum = 0.900000  momentum as time constant = 237.3 samples
MPI Rank 0: 
MPI Rank 0: 01/16/2018 19:05:27: Starting minibatch loop, DataParallelSGD training (myRank = 0, numNodes = 4, numGradientBits = 32), distributed reading is ENABLED.
MPI Rank 0: 01/16/2018 19:05:27:  Epoch[ 1 of 4]-Minibatch[   1-  10]: CrossEntropyWithSoftmax = 0.70007977 * 250; EvalClassificationError = 0.52400000 * 250; time = 0.0280s; samplesPerSecond = 8927.2
MPI Rank 0: 01/16/2018 19:05:27:  Epoch[ 1 of 4]-Minibatch[  11-  20]: CrossEntropyWithSoftmax = 0.71514543 * 250; EvalClassificationError = 0.52000000 * 250; time = 0.0426s; samplesPerSecond = 5866.6
MPI Rank 0: 01/16/2018 19:05:27:  Epoch[ 1 of 4]-Minibatch[  21-  30]: CrossEntropyWithSoftmax = 0.72945593 * 250; EvalClassificationError = 0.47600000 * 250; time = 0.0313s; samplesPerSecond = 7998.1
MPI Rank 0: 01/16/2018 19:05:27:  Epoch[ 1 of 4]-Minibatch[  31-  40]: CrossEntropyWithSoftmax = 0.70079058 * 250; EvalClassificationError = 0.52400000 * 250; time = 0.0281s; samplesPerSecond = 8905.8
MPI Rank 0: 01/16/2018 19:05:27:  Epoch[ 1 of 4]-Minibatch[  41-  50]: CrossEntropyWithSoftmax = 0.70605617 * 250; EvalClassificationError = 0.54000000 * 250; time = 0.0221s; samplesPerSecond = 11304.3
MPI Rank 0: 01/16/2018 19:05:27:  Epoch[ 1 of 4]-Minibatch[  51-  60]: CrossEntropyWithSoftmax = 0.71572398 * 250; EvalClassificationError = 0.47600000 * 250; time = 0.0280s; samplesPerSecond = 8936.7
MPI Rank 0: 01/16/2018 19:05:27:  Epoch[ 1 of 4]-Minibatch[  61-  70]: CrossEntropyWithSoftmax = 0.72149850 * 250; EvalClassificationError = 0.48000000 * 250; time = 0.0278s; samplesPerSecond = 8981.1
MPI Rank 0: 01/16/2018 19:05:27:  Epoch[ 1 of 4]-Minibatch[  71-  80]: CrossEntropyWithSoftmax = 0.79845604 * 250; EvalClassificationError = 0.47600000 * 250; time = 0.0299s; samplesPerSecond = 8357.2
MPI Rank 0: 01/16/2018 19:05:27:  Epoch[ 1 of 4]-Minibatch[  81-  90]: CrossEntropyWithSoftmax = 0.69665185 * 250; EvalClassificationError = 0.46800000 * 250; time = 0.0258s; samplesPerSecond = 9673.9
MPI Rank 0: 01/16/2018 19:05:27:  Epoch[ 1 of 4]-Minibatch[  91- 100]: CrossEntropyWithSoftmax = 0.70723325 * 250; EvalClassificationError = 0.49200000 * 250; time = 0.0276s; samplesPerSecond = 9063.6
MPI Rank 0: 01/16/2018 19:05:27:  Epoch[ 1 of 4]-Minibatch[ 101- 110]: CrossEntropyWithSoftmax = 0.71420345 * 250; EvalClassificationError = 0.55200000 * 250; time = 0.0313s; samplesPerSecond = 7975.0
MPI Rank 0: 01/16/2018 19:05:27:  Epoch[ 1 of 4]-Minibatch[ 111- 120]: CrossEntropyWithSoftmax = 0.69535258 * 250; EvalClassificationError = 0.43600000 * 250; time = 0.0304s; samplesPerSecond = 8212.0
MPI Rank 0: 01/16/2018 19:05:27:  Epoch[ 1 of 4]-Minibatch[ 121- 130]: CrossEntropyWithSoftmax = 0.70078532 * 250; EvalClassificationError = 0.44000000 * 250; time = 0.0282s; samplesPerSecond = 8876.4
MPI Rank 0: 01/16/2018 19:05:27:  Epoch[ 1 of 4]-Minibatch[ 131- 140]: CrossEntropyWithSoftmax = 0.71857914 * 250; EvalClassificationError = 0.54800000 * 250; time = 0.0298s; samplesPerSecond = 8387.0
MPI Rank 0: 01/16/2018 19:05:27:  Epoch[ 1 of 4]-Minibatch[ 141- 150]: CrossEntropyWithSoftmax = 0.72088357 * 250; EvalClassificationError = 0.48800000 * 250; time = 0.0223s; samplesPerSecond = 11213.7
MPI Rank 0: 01/16/2018 19:05:27:  Epoch[ 1 of 4]-Minibatch[ 151- 160]: CrossEntropyWithSoftmax = 0.71798840 * 250; EvalClassificationError = 0.55200000 * 250; time = 0.0238s; samplesPerSecond = 10484.9
MPI Rank 0: 01/16/2018 19:05:27:  Epoch[ 1 of 4]-Minibatch[ 161- 170]: CrossEntropyWithSoftmax = 0.74162164 * 250; EvalClassificationError = 0.50000000 * 250; time = 0.0257s; samplesPerSecond = 9712.3
MPI Rank 0: 01/16/2018 19:05:27:  Epoch[ 1 of 4]-Minibatch[ 171- 180]: CrossEntropyWithSoftmax = 0.71835127 * 250; EvalClassificationError = 0.51600000 * 250; time = 0.0228s; samplesPerSecond = 10971.4
MPI Rank 0: 01/16/2018 19:05:27:  Epoch[ 1 of 4]-Minibatch[ 181- 190]: CrossEntropyWithSoftmax = 0.71529461 * 250; EvalClassificationError = 0.48400000 * 250; time = 0.0253s; samplesPerSecond = 9881.8
MPI Rank 0: 01/16/2018 19:05:27:  Epoch[ 1 of 4]-Minibatch[ 191- 200]: CrossEntropyWithSoftmax = 0.71727656 * 250; EvalClassificationError = 0.53200000 * 250; time = 0.0279s; samplesPerSecond = 8965.0
MPI Rank 0: 01/16/2018 19:05:27:  Epoch[ 1 of 4]-Minibatch[ 201- 210]: CrossEntropyWithSoftmax = 0.71745516 * 250; EvalClassificationError = 0.50400000 * 250; time = 0.0218s; samplesPerSecond = 11462.1
MPI Rank 0: 01/16/2018 19:05:27:  Epoch[ 1 of 4]-Minibatch[ 211- 220]: CrossEntropyWithSoftmax = 0.72088398 * 250; EvalClassificationError = 0.50000000 * 250; time = 0.0243s; samplesPerSecond = 10307.5
MPI Rank 0: 01/16/2018 19:05:27:  Epoch[ 1 of 4]-Minibatch[ 221- 230]: CrossEntropyWithSoftmax = 0.72006809 * 250; EvalClassificationError = 0.50800000 * 250; time = 0.0293s; samplesPerSecond = 8538.0
MPI Rank 0: 01/16/2018 19:05:27:  Epoch[ 1 of 4]-Minibatch[ 231- 240]: CrossEntropyWithSoftmax = 0.71275468 * 250; EvalClassificationError = 0.51200000 * 250; time = 0.0286s; samplesPerSecond = 8744.0
MPI Rank 0: 01/16/2018 19:05:27:  Epoch[ 1 of 4]-Minibatch[ 241- 250]: CrossEntropyWithSoftmax = 0.69644781 * 250; EvalClassificationError = 0.50400000 * 250; time = 0.0278s; samplesPerSecond = 8992.3
MPI Rank 0: 01/16/2018 19:05:27:  Epoch[ 1 of 4]-Minibatch[ 251- 260]: CrossEntropyWithSoftmax = 0.70129698 * 250; EvalClassificationError = 0.51200000 * 250; time = 0.0275s; samplesPerSecond = 9106.0
MPI Rank 0: 01/16/2018 19:05:27:  Epoch[ 1 of 4]-Minibatch[ 261- 270]: CrossEntropyWithSoftmax = 0.70768095 * 250; EvalClassificationError = 0.54400000 * 250; time = 0.0279s; samplesPerSecond = 8955.6
MPI Rank 0: 01/16/2018 19:05:27:  Epoch[ 1 of 4]-Minibatch[ 271- 280]: CrossEntropyWithSoftmax = 0.69744379 * 250; EvalClassificationError = 0.52800000 * 250; time = 0.0227s; samplesPerSecond = 10997.9
MPI Rank 0: 01/16/2018 19:05:27:  Epoch[ 1 of 4]-Minibatch[ 281- 290]: CrossEntropyWithSoftmax = 0.69266187 * 250; EvalClassificationError = 0.44800000 * 250; time = 0.0330s; samplesPerSecond = 7566.3
MPI Rank 0: 01/16/2018 19:05:27:  Epoch[ 1 of 4]-Minibatch[ 291- 300]: CrossEntropyWithSoftmax = 0.69347266 * 250; EvalClassificationError = 0.49600000 * 250; time = 0.0284s; samplesPerSecond = 8789.3
MPI Rank 0: 01/16/2018 19:05:27:  Epoch[ 1 of 4]-Minibatch[ 301- 310]: CrossEntropyWithSoftmax = 0.69257409 * 250; EvalClassificationError = 0.54000000 * 250; time = 0.0278s; samplesPerSecond = 9007.5
MPI Rank 0: 01/16/2018 19:05:27:  Epoch[ 1 of 4]-Minibatch[ 311- 320]: CrossEntropyWithSoftmax = 0.68625741 * 250; EvalClassificationError = 0.38000000 * 250; time = 0.0237s; samplesPerSecond = 10553.4
MPI Rank 0: 01/16/2018 19:05:28:  Epoch[ 1 of 4]-Minibatch[ 321- 330]: CrossEntropyWithSoftmax = 0.69064011 * 250; EvalClassificationError = 0.46800000 * 250; time = 0.0260s; samplesPerSecond = 9602.1
MPI Rank 0: 01/16/2018 19:05:28:  Epoch[ 1 of 4]-Minibatch[ 331- 340]: CrossEntropyWithSoftmax = 0.70192153 * 250; EvalClassificationError = 0.46000000 * 250; time = 0.0243s; samplesPerSecond = 10270.3
MPI Rank 0: 01/16/2018 19:05:28:  Epoch[ 1 of 4]-Minibatch[ 341- 350]: CrossEntropyWithSoftmax = 0.69058912 * 250; EvalClassificationError = 0.52000000 * 250; time = 0.0285s; samplesPerSecond = 8782.1
MPI Rank 0: 01/16/2018 19:05:28:  Epoch[ 1 of 4]-Minibatch[ 351- 360]: CrossEntropyWithSoftmax = 0.67041492 * 250; EvalClassificationError = 0.39200000 * 250; time = 0.0370s; samplesPerSecond = 6757.3
MPI Rank 0: 01/16/2018 19:05:28:  Epoch[ 1 of 4]-Minibatch[ 361- 370]: CrossEntropyWithSoftmax = 0.65913973 * 250; EvalClassificationError = 0.35600000 * 250; time = 0.0221s; samplesPerSecond = 11305.1
MPI Rank 0: 01/16/2018 19:05:28:  Epoch[ 1 of 4]-Minibatch[ 371- 380]: CrossEntropyWithSoftmax = 0.63919877 * 250; EvalClassificationError = 0.36400000 * 250; time = 0.0216s; samplesPerSecond = 11596.4
MPI Rank 0: 01/16/2018 19:05:28:  Epoch[ 1 of 4]-Minibatch[ 381- 390]: CrossEntropyWithSoftmax = 0.61293885 * 250; EvalClassificationError = 0.19200000 * 250; time = 0.0301s; samplesPerSecond = 8298.8
MPI Rank 0: 01/16/2018 19:05:28:  Epoch[ 1 of 4]-Minibatch[ 391- 400]: CrossEntropyWithSoftmax = 0.55255355 * 250; EvalClassificationError = 0.18800000 * 250; time = 0.0301s; samplesPerSecond = 8307.6
MPI Rank 0: 01/16/2018 19:05:28: Finished Epoch[ 1 of 4]: [Training] CrossEntropyWithSoftmax = 0.70019555 * 10000; EvalClassificationError = 0.47350000 * 10000; totalSamplesSeen = 10000; learningRatePerSample = 0.02; epochTime=1.10558s
MPI Rank 0: 01/16/2018 19:05:28: SGD: Saving checkpoint model '/tmp/cntk-test-20180116190516.17566/ParallelTraining/NoQuantization_SinglePrecision@release_gpu/models/Simple.dnn.1'
MPI Rank 0: 
MPI Rank 0: 01/16/2018 19:05:28: Starting Epoch 2: learning rate per sample = 0.008000  effective momentum = 0.900000  momentum as time constant = 237.3 samples
MPI Rank 0: 
MPI Rank 0: 01/16/2018 19:05:28: Starting minibatch loop, DataParallelSGD training (myRank = 0, numNodes = 4, numGradientBits = 32), distributed reading is ENABLED.
MPI Rank 0: 01/16/2018 19:05:28:  Epoch[ 2 of 4]-Minibatch[   1-  10, 2.50%]: CrossEntropyWithSoftmax = 0.50774630 * 250; EvalClassificationError = 0.24000000 * 250; time = 0.0252s; samplesPerSecond = 9902.2
MPI Rank 0: 01/16/2018 19:05:28:  Epoch[ 2 of 4]-Minibatch[  11-  20, 5.00%]: CrossEntropyWithSoftmax = 0.43388933 * 250; EvalClassificationError = 0.11200000 * 250; time = 0.0263s; samplesPerSecond = 9505.5
MPI Rank 0: 01/16/2018 19:05:28:  Epoch[ 2 of 4]-Minibatch[  21-  30, 7.50%]: CrossEntropyWithSoftmax = 0.36674877 * 250; EvalClassificationError = 0.08800000 * 250; time = 0.0246s; samplesPerSecond = 10176.1
MPI Rank 0: 01/16/2018 19:05:28:  Epoch[ 2 of 4]-Minibatch[  31-  40, 10.00%]: CrossEntropyWithSoftmax = 0.33768770 * 250; EvalClassificationError = 0.06800000 * 250; time = 0.0314s; samplesPerSecond = 7973.5
MPI Rank 0: 01/16/2018 19:05:28:  Epoch[ 2 of 4]-Minibatch[  41-  50, 12.50%]: CrossEntropyWithSoftmax = 0.30320952 * 250; EvalClassificationError = 0.08000000 * 250; time = 0.0252s; samplesPerSecond = 9925.6
MPI Rank 0: 01/16/2018 19:05:28:  Epoch[ 2 of 4]-Minibatch[  51-  60, 15.00%]: CrossEntropyWithSoftmax = 0.29576047 * 250; EvalClassificationError = 0.09200000 * 250; time = 0.0310s; samplesPerSecond = 8058.8
MPI Rank 0: 01/16/2018 19:05:28:  Epoch[ 2 of 4]-Minibatch[  61-  70, 17.50%]: CrossEntropyWithSoftmax = 0.24924496 * 250; EvalClassificationError = 0.07600000 * 250; time = 0.0276s; samplesPerSecond = 9048.5
MPI Rank 0: 01/16/2018 19:05:28:  Epoch[ 2 of 4]-Minibatch[  71-  80, 20.00%]: CrossEntropyWithSoftmax = 0.24632418 * 250; EvalClassificationError = 0.10400000 * 250; time = 0.0200s; samplesPerSecond = 12485.8
MPI Rank 0: 01/16/2018 19:05:28:  Epoch[ 2 of 4]-Minibatch[  81-  90, 22.50%]: CrossEntropyWithSoftmax = 0.20943161 * 250; EvalClassificationError = 0.08400000 * 250; time = 0.0240s; samplesPerSecond = 10425.9
MPI Rank 0: 01/16/2018 19:05:28:  Epoch[ 2 of 4]-Minibatch[  91- 100, 25.00%]: CrossEntropyWithSoftmax = 0.19115999 * 250; EvalClassificationError = 0.07200000 * 250; time = 0.0258s; samplesPerSecond = 9697.4
MPI Rank 0: 01/16/2018 19:05:28:  Epoch[ 2 of 4]-Minibatch[ 101- 110, 27.50%]: CrossEntropyWithSoftmax = 0.17923233 * 250; EvalClassificationError = 0.05200000 * 250; time = 0.0235s; samplesPerSecond = 10640.8
MPI Rank 0: 01/16/2018 19:05:28:  Epoch[ 2 of 4]-Minibatch[ 111- 120, 30.00%]: CrossEntropyWithSoftmax = 0.17075425 * 250; EvalClassificationError = 0.06400000 * 250; time = 0.0285s; samplesPerSecond = 8761.6
MPI Rank 0: 01/16/2018 19:05:28:  Epoch[ 2 of 4]-Minibatch[ 121- 130, 32.50%]: CrossEntropyWithSoftmax = 0.14442373 * 250; EvalClassificationError = 0.05600000 * 250; time = 0.0238s; samplesPerSecond = 10490.4
MPI Rank 0: 01/16/2018 19:05:28:  Epoch[ 2 of 4]-Minibatch[ 131- 140, 35.00%]: CrossEntropyWithSoftmax = 0.17753820 * 250; EvalClassificationError = 0.08800000 * 250; time = 0.0238s; samplesPerSecond = 10496.2
MPI Rank 0: 01/16/2018 19:05:28:  Epoch[ 2 of 4]-Minibatch[ 141- 150, 37.50%]: CrossEntropyWithSoftmax = 0.15087857 * 250; EvalClassificationError = 0.05200000 * 250; time = 0.0301s; samplesPerSecond = 8308.8
MPI Rank 0: 01/16/2018 19:05:28:  Epoch[ 2 of 4]-Minibatch[ 151- 160, 40.00%]: CrossEntropyWithSoftmax = 0.19253022 * 250; EvalClassificationError = 0.08000000 * 250; time = 0.0320s; samplesPerSecond = 7814.2
MPI Rank 0: 01/16/2018 19:05:28:  Epoch[ 2 of 4]-Minibatch[ 161- 170, 42.50%]: CrossEntropyWithSoftmax = 0.17830684 * 250; EvalClassificationError = 0.08800000 * 250; time = 0.0351s; samplesPerSecond = 7117.0
MPI Rank 0: 01/16/2018 19:05:28:  Epoch[ 2 of 4]-Minibatch[ 171- 180, 45.00%]: CrossEntropyWithSoftmax = 0.15115429 * 250; EvalClassificationError = 0.06400000 * 250; time = 0.0224s; samplesPerSecond = 11154.6
MPI Rank 0: 01/16/2018 19:05:28:  Epoch[ 2 of 4]-Minibatch[ 181- 190, 47.50%]: CrossEntropyWithSoftmax = 0.19135969 * 250; EvalClassificationError = 0.10000000 * 250; time = 0.0200s; samplesPerSecond = 12528.7
MPI Rank 0: 01/16/2018 19:05:28:  Epoch[ 2 of 4]-Minibatch[ 191- 200, 50.00%]: CrossEntropyWithSoftmax = 0.21491485 * 250; EvalClassificationError = 0.10400000 * 250; time = 0.0286s; samplesPerSecond = 8740.5
MPI Rank 0: 01/16/2018 19:05:28:  Epoch[ 2 of 4]-Minibatch[ 201- 210, 52.50%]: CrossEntropyWithSoftmax = 0.18682346 * 250; EvalClassificationError = 0.08400000 * 250; time = 0.0235s; samplesPerSecond = 10617.2
MPI Rank 0: 01/16/2018 19:05:28:  Epoch[ 2 of 4]-Minibatch[ 211- 220, 55.00%]: CrossEntropyWithSoftmax = 0.18483206 * 250; EvalClassificationError = 0.07600000 * 250; time = 0.0216s; samplesPerSecond = 11567.9
MPI Rank 0: 01/16/2018 19:05:28:  Epoch[ 2 of 4]-Minibatch[ 221- 230, 57.50%]: CrossEntropyWithSoftmax = 0.14684504 * 250; EvalClassificationError = 0.06000000 * 250; time = 0.0300s; samplesPerSecond = 8343.7
MPI Rank 0: 01/16/2018 19:05:28:  Epoch[ 2 of 4]-Minibatch[ 231- 240, 60.00%]: CrossEntropyWithSoftmax = 0.15322117 * 250; EvalClassificationError = 0.07600000 * 250; time = 0.0300s; samplesPerSecond = 8321.0
MPI Rank 0: 01/16/2018 19:05:28:  Epoch[ 2 of 4]-Minibatch[ 241- 250, 62.50%]: CrossEntropyWithSoftmax = 0.19882571 * 250; EvalClassificationError = 0.11600000 * 250; time = 0.0318s; samplesPerSecond = 7872.7
MPI Rank 0: 01/16/2018 19:05:28:  Epoch[ 2 of 4]-Minibatch[ 251- 260, 65.00%]: CrossEntropyWithSoftmax = 0.13683833 * 250; EvalClassificationError = 0.07200000 * 250; time = 0.0259s; samplesPerSecond = 9634.5
MPI Rank 0: 01/16/2018 19:05:28:  Epoch[ 2 of 4]-Minibatch[ 261- 270, 67.50%]: CrossEntropyWithSoftmax = 0.18621189 * 250; EvalClassificationError = 0.11600000 * 250; time = 0.0234s; samplesPerSecond = 10680.4
MPI Rank 0: 01/16/2018 19:05:28:  Epoch[ 2 of 4]-Minibatch[ 271- 280, 70.00%]: CrossEntropyWithSoftmax = 0.19408050 * 250; EvalClassificationError = 0.08000000 * 250; time = 0.0325s; samplesPerSecond = 7691.4
MPI Rank 0: 01/16/2018 19:05:28:  Epoch[ 2 of 4]-Minibatch[ 281- 290, 72.50%]: CrossEntropyWithSoftmax = 0.17298137 * 250; EvalClassificationError = 0.07200000 * 250; time = 0.0266s; samplesPerSecond = 9410.3
MPI Rank 0: 01/16/2018 19:05:29:  Epoch[ 2 of 4]-Minibatch[ 291- 300, 75.00%]: CrossEntropyWithSoftmax = 0.13265130 * 250; EvalClassificationError = 0.04800000 * 250; time = 0.0285s; samplesPerSecond = 8769.8
MPI Rank 0: 01/16/2018 19:05:29:  Epoch[ 2 of 4]-Minibatch[ 301- 310, 77.50%]: CrossEntropyWithSoftmax = 0.17627179 * 250; EvalClassificationError = 0.09200000 * 250; time = 0.0314s; samplesPerSecond = 7959.4
MPI Rank 0: 01/16/2018 19:05:29:  Epoch[ 2 of 4]-Minibatch[ 311- 320, 80.00%]: CrossEntropyWithSoftmax = 0.12734628 * 250; EvalClassificationError = 0.05600000 * 250; time = 0.0262s; samplesPerSecond = 9558.0
MPI Rank 0: 01/16/2018 19:05:29:  Epoch[ 2 of 4]-Minibatch[ 321- 330, 82.50%]: CrossEntropyWithSoftmax = 0.15108452 * 250; EvalClassificationError = 0.05600000 * 250; time = 0.0252s; samplesPerSecond = 9928.7
MPI Rank 0: 01/16/2018 19:05:29:  Epoch[ 2 of 4]-Minibatch[ 331- 340, 85.00%]: CrossEntropyWithSoftmax = 0.19729184 * 250; EvalClassificationError = 0.09200000 * 250; time = 0.0265s; samplesPerSecond = 9447.2
MPI Rank 0: 01/16/2018 19:05:29:  Epoch[ 2 of 4]-Minibatch[ 341- 350, 87.50%]: CrossEntropyWithSoftmax = 0.12857333 * 250; EvalClassificationError = 0.05200000 * 250; time = 0.0411s; samplesPerSecond = 6086.9
MPI Rank 0: 01/16/2018 19:05:29:  Epoch[ 2 of 4]-Minibatch[ 351- 360, 90.00%]: CrossEntropyWithSoftmax = 0.13867804 * 250; EvalClassificationError = 0.06000000 * 250; time = 0.0263s; samplesPerSecond = 9507.0
MPI Rank 0: 01/16/2018 19:05:29:  Epoch[ 2 of 4]-Minibatch[ 361- 370, 92.50%]: CrossEntropyWithSoftmax = 0.12786050 * 250; EvalClassificationError = 0.06000000 * 250; time = 0.0241s; samplesPerSecond = 10384.8
MPI Rank 0: 01/16/2018 19:05:29:  Epoch[ 2 of 4]-Minibatch[ 371- 380, 95.00%]: CrossEntropyWithSoftmax = 0.16643303 * 250; EvalClassificationError = 0.09600000 * 250; time = 0.0317s; samplesPerSecond = 7876.0
MPI Rank 0: 01/16/2018 19:05:29:  Epoch[ 2 of 4]-Minibatch[ 381- 390, 97.50%]: CrossEntropyWithSoftmax = 0.20440407 * 250; EvalClassificationError = 0.11200000 * 250; time = 0.0259s; samplesPerSecond = 9651.0
MPI Rank 0: 01/16/2018 19:05:29:  Epoch[ 2 of 4]-Minibatch[ 391- 400, 100.00%]: CrossEntropyWithSoftmax = 0.14566238 * 250; EvalClassificationError = 0.06800000 * 250; time = 0.0206s; samplesPerSecond = 12119.8
MPI Rank 0: 01/16/2018 19:05:29: Finished Epoch[ 2 of 4]: [Training] CrossEntropyWithSoftmax = 0.20373031 * 10000; EvalClassificationError = 0.08270000 * 10000; totalSamplesSeen = 20000; learningRatePerSample = 0.0080000004; epochTime=1.08571s
MPI Rank 0: 01/16/2018 19:05:29: SGD: Saving checkpoint model '/tmp/cntk-test-20180116190516.17566/ParallelTraining/NoQuantization_SinglePrecision@release_gpu/models/Simple.dnn.2'
MPI Rank 0: 
MPI Rank 0: 01/16/2018 19:05:29: Starting Epoch 3: learning rate per sample = 0.008000  effective momentum = 0.900000  momentum as time constant = 237.3 samples
MPI Rank 0: 
MPI Rank 0: 01/16/2018 19:05:29: Starting minibatch loop, DataParallelSGD training (myRank = 0, numNodes = 4, numGradientBits = 32), distributed reading is ENABLED.
MPI Rank 0: 01/16/2018 19:05:29:  Epoch[ 3 of 4]-Minibatch[   1-  10, 2.50%]: CrossEntropyWithSoftmax = 0.12590085 * 250; EvalClassificationError = 0.05600000 * 250; time = 0.0249s; samplesPerSecond = 10034.9
MPI Rank 0: 01/16/2018 19:05:29:  Epoch[ 3 of 4]-Minibatch[  11-  20, 5.00%]: CrossEntropyWithSoftmax = 0.17780229 * 250; EvalClassificationError = 0.09600000 * 250; time = 0.0225s; samplesPerSecond = 11095.7
MPI Rank 0: 01/16/2018 19:05:29:  Epoch[ 3 of 4]-Minibatch[  21-  30, 7.50%]: CrossEntropyWithSoftmax = 0.14417637 * 250; EvalClassificationError = 0.07600000 * 250; time = 0.0296s; samplesPerSecond = 8449.4
MPI Rank 0: 01/16/2018 19:05:29:  Epoch[ 3 of 4]-Minibatch[  31-  40, 10.00%]: CrossEntropyWithSoftmax = 0.15796897 * 250; EvalClassificationError = 0.06400000 * 250; time = 0.0245s; samplesPerSecond = 10210.7
MPI Rank 0: 01/16/2018 19:05:29:  Epoch[ 3 of 4]-Minibatch[  41-  50, 12.50%]: CrossEntropyWithSoftmax = 0.17002999 * 250; EvalClassificationError = 0.10000000 * 250; time = 0.0264s; samplesPerSecond = 9479.5
MPI Rank 0: 01/16/2018 19:05:29:  Epoch[ 3 of 4]-Minibatch[  51-  60, 15.00%]: CrossEntropyWithSoftmax = 0.18262114 * 250; EvalClassificationError = 0.07600000 * 250; time = 0.0290s; samplesPerSecond = 8622.6
MPI Rank 0: 01/16/2018 19:05:29:  Epoch[ 3 of 4]-Minibatch[  61-  70, 17.50%]: CrossEntropyWithSoftmax = 0.14643695 * 250; EvalClassificationError = 0.07200000 * 250; time = 0.0217s; samplesPerSecond = 11516.3
MPI Rank 0: 01/16/2018 19:05:29:  Epoch[ 3 of 4]-Minibatch[  71-  80, 20.00%]: CrossEntropyWithSoftmax = 0.18030529 * 250; EvalClassificationError = 0.09600000 * 250; time = 0.0261s; samplesPerSecond = 9593.7
MPI Rank 0: 01/16/2018 19:05:29:  Epoch[ 3 of 4]-Minibatch[  81-  90, 22.50%]: CrossEntropyWithSoftmax = 0.15846151 * 250; EvalClassificationError = 0.07200000 * 250; time = 0.0237s; samplesPerSecond = 10559.3
MPI Rank 0: 01/16/2018 19:05:29:  Epoch[ 3 of 4]-Minibatch[  91- 100, 25.00%]: CrossEntropyWithSoftmax = 0.14486534 * 250; EvalClassificationError = 0.07200000 * 250; time = 0.0283s; samplesPerSecond = 8821.4
MPI Rank 0: 01/16/2018 19:05:29:  Epoch[ 3 of 4]-Minibatch[ 101- 110, 27.50%]: CrossEntropyWithSoftmax = 0.13469094 * 250; EvalClassificationError = 0.05200000 * 250; time = 0.0236s; samplesPerSecond = 10579.1
MPI Rank 0: 01/16/2018 19:05:29:  Epoch[ 3 of 4]-Minibatch[ 111- 120, 30.00%]: CrossEntropyWithSoftmax = 0.13720020 * 250; EvalClassificationError = 0.06400000 * 250; time = 0.0420s; samplesPerSecond = 5947.0
MPI Rank 0: 01/16/2018 19:05:29:  Epoch[ 3 of 4]-Minibatch[ 121- 130, 32.50%]: CrossEntropyWithSoftmax = 0.11641296 * 250; EvalClassificationError = 0.05600000 * 250; time = 0.0318s; samplesPerSecond = 7873.2
MPI Rank 0: 01/16/2018 19:05:29:  Epoch[ 3 of 4]-Minibatch[ 131- 140, 35.00%]: CrossEntropyWithSoftmax = 0.16786646 * 250; EvalClassificationError = 0.08800000 * 250; time = 0.0240s; samplesPerSecond = 10420.6
MPI Rank 0: 01/16/2018 19:05:29:  Epoch[ 3 of 4]-Minibatch[ 141- 150, 37.50%]: CrossEntropyWithSoftmax = 0.12811514 * 250; EvalClassificationError = 0.04800000 * 250; time = 0.0218s; samplesPerSecond = 11491.7
MPI Rank 0: 01/16/2018 19:05:29:  Epoch[ 3 of 4]-Minibatch[ 151- 160, 40.00%]: CrossEntropyWithSoftmax = 0.17257851 * 250; EvalClassificationError = 0.08000000 * 250; time = 0.0283s; samplesPerSecond = 8849.3
MPI Rank 0: 01/16/2018 19:05:29:  Epoch[ 3 of 4]-Minibatch[ 161- 170, 42.50%]: CrossEntropyWithSoftmax = 0.17623655 * 250; EvalClassificationError = 0.09600000 * 250; time = 0.0217s; samplesPerSecond = 11533.8
MPI Rank 0: 01/16/2018 19:05:29:  Epoch[ 3 of 4]-Minibatch[ 171- 180, 45.00%]: CrossEntropyWithSoftmax = 0.14121117 * 250; EvalClassificationError = 0.06400000 * 250; time = 0.0245s; samplesPerSecond = 10206.3
MPI Rank 0: 01/16/2018 19:05:29:  Epoch[ 3 of 4]-Minibatch[ 181- 190, 47.50%]: CrossEntropyWithSoftmax = 0.19243443 * 250; EvalClassificationError = 0.10000000 * 250; time = 0.0297s; samplesPerSecond = 8428.0
MPI Rank 0: 01/16/2018 19:05:29:  Epoch[ 3 of 4]-Minibatch[ 191- 200, 50.00%]: CrossEntropyWithSoftmax = 0.20908162 * 250; EvalClassificationError = 0.10000000 * 250; time = 0.0256s; samplesPerSecond = 9768.4
MPI Rank 0: 01/16/2018 19:05:29:  Epoch[ 3 of 4]-Minibatch[ 201- 210, 52.50%]: CrossEntropyWithSoftmax = 0.18472067 * 250; EvalClassificationError = 0.08000000 * 250; time = 0.0257s; samplesPerSecond = 9726.6
MPI Rank 0: 01/16/2018 19:05:29:  Epoch[ 3 of 4]-Minibatch[ 211- 220, 55.00%]: CrossEntropyWithSoftmax = 0.18185536 * 250; EvalClassificationError = 0.07600000 * 250; time = 0.0303s; samplesPerSecond = 8261.4
MPI Rank 0: 01/16/2018 19:05:29:  Epoch[ 3 of 4]-Minibatch[ 221- 230, 57.50%]: CrossEntropyWithSoftmax = 0.14074205 * 250; EvalClassificationError = 0.06000000 * 250; time = 0.0280s; samplesPerSecond = 8914.5
MPI Rank 0: 01/16/2018 19:05:29:  Epoch[ 3 of 4]-Minibatch[ 231- 240, 60.00%]: CrossEntropyWithSoftmax = 0.14871620 * 250; EvalClassificationError = 0.07600000 * 250; time = 0.0280s; samplesPerSecond = 8942.6
MPI Rank 0: 01/16/2018 19:05:29:  Epoch[ 3 of 4]-Minibatch[ 241- 250, 62.50%]: CrossEntropyWithSoftmax = 0.20299705 * 250; EvalClassificationError = 0.11200000 * 250; time = 0.0288s; samplesPerSecond = 8669.1
MPI Rank 0: 01/16/2018 19:05:29:  Epoch[ 3 of 4]-Minibatch[ 251- 260, 65.00%]: CrossEntropyWithSoftmax = 0.12852038 * 250; EvalClassificationError = 0.07200000 * 250; time = 0.0232s; samplesPerSecond = 10768.9
MPI Rank 0: 01/16/2018 19:05:30:  Epoch[ 3 of 4]-Minibatch[ 261- 270, 67.50%]: CrossEntropyWithSoftmax = 0.18660439 * 250; EvalClassificationError = 0.11600000 * 250; time = 0.0315s; samplesPerSecond = 7924.3
MPI Rank 0: 01/16/2018 19:05:30:  Epoch[ 3 of 4]-Minibatch[ 271- 280, 70.00%]: CrossEntropyWithSoftmax = 0.19575997 * 250; EvalClassificationError = 0.08800000 * 250; time = 0.0245s; samplesPerSecond = 10213.1
MPI Rank 0: 01/16/2018 19:05:30:  Epoch[ 3 of 4]-Minibatch[ 281- 290, 72.50%]: CrossEntropyWithSoftmax = 0.16667676 * 250; EvalClassificationError = 0.06800000 * 250; time = 0.0278s; samplesPerSecond = 9002.4
MPI Rank 0: 01/16/2018 19:05:30:  Epoch[ 3 of 4]-Minibatch[ 291- 300, 75.00%]: CrossEntropyWithSoftmax = 0.12526168 * 250; EvalClassificationError = 0.04400000 * 250; time = 0.0219s; samplesPerSecond = 11406.3
MPI Rank 0: 01/16/2018 19:05:30:  Epoch[ 3 of 4]-Minibatch[ 301- 310, 77.50%]: CrossEntropyWithSoftmax = 0.17392132 * 250; EvalClassificationError = 0.08800000 * 250; time = 0.0283s; samplesPerSecond = 8830.2
MPI Rank 0: 01/16/2018 19:05:30:  Epoch[ 3 of 4]-Minibatch[ 311- 320, 80.00%]: CrossEntropyWithSoftmax = 0.12281615 * 250; EvalClassificationError = 0.05200000 * 250; time = 0.0364s; samplesPerSecond = 6876.5
MPI Rank 0: 01/16/2018 19:05:30:  Epoch[ 3 of 4]-Minibatch[ 321- 330, 82.50%]: CrossEntropyWithSoftmax = 0.14759390 * 250; EvalClassificationError = 0.06000000 * 250; time = 0.0246s; samplesPerSecond = 10168.3
MPI Rank 0: 01/16/2018 19:05:30:  Epoch[ 3 of 4]-Minibatch[ 331- 340, 85.00%]: CrossEntropyWithSoftmax = 0.19801301 * 250; EvalClassificationError = 0.09200000 * 250; time = 0.0227s; samplesPerSecond = 11023.5
MPI Rank 0: 01/16/2018 19:05:30:  Epoch[ 3 of 4]-Minibatch[ 341- 350, 87.50%]: CrossEntropyWithSoftmax = 0.12593395 * 250; EvalClassificationError = 0.05200000 * 250; time = 0.0236s; samplesPerSecond = 10572.7
MPI Rank 0: 01/16/2018 19:05:30:  Epoch[ 3 of 4]-Minibatch[ 351- 360, 90.00%]: CrossEntropyWithSoftmax = 0.13756617 * 250; EvalClassificationError = 0.06000000 * 250; time = 0.0256s; samplesPerSecond = 9760.1
MPI Rank 0: 01/16/2018 19:05:30:  Epoch[ 3 of 4]-Minibatch[ 361- 370, 92.50%]: CrossEntropyWithSoftmax = 0.12838525 * 250; EvalClassificationError = 0.06000000 * 250; time = 0.0261s; samplesPerSecond = 9570.7
MPI Rank 0: 01/16/2018 19:05:30:  Epoch[ 3 of 4]-Minibatch[ 371- 380, 95.00%]: CrossEntropyWithSoftmax = 0.16654369 * 250; EvalClassificationError = 0.09600000 * 250; time = 0.0287s; samplesPerSecond = 8722.7
MPI Rank 0: 01/16/2018 19:05:30:  Epoch[ 3 of 4]-Minibatch[ 381- 390, 97.50%]: CrossEntropyWithSoftmax = 0.20658951 * 250; EvalClassificationError = 0.11600000 * 250; time = 0.0273s; samplesPerSecond = 9156.4
MPI Rank 0: 01/16/2018 19:05:30:  Epoch[ 3 of 4]-Minibatch[ 391- 400, 100.00%]: CrossEntropyWithSoftmax = 0.14583322 * 250; EvalClassificationError = 0.06800000 * 250; time = 0.0240s; samplesPerSecond = 10437.9
MPI Rank 0: 01/16/2018 19:05:30: Finished Epoch[ 3 of 4]: [Training] CrossEntropyWithSoftmax = 0.15948618 * 10000; EvalClassificationError = 0.07660000 * 10000; totalSamplesSeen = 30000; learningRatePerSample = 0.0080000004; epochTime=1.07049s
MPI Rank 0: 01/16/2018 19:05:30: SGD: Saving checkpoint model '/tmp/cntk-test-20180116190516.17566/ParallelTraining/NoQuantization_SinglePrecision@release_gpu/models/Simple.dnn.3'
MPI Rank 0: 
MPI Rank 0: 01/16/2018 19:05:30: Starting Epoch 4: learning rate per sample = 0.008000  effective momentum = 0.900000  momentum as time constant = 237.3 samples
MPI Rank 0: 
MPI Rank 0: 01/16/2018 19:05:30: Starting minibatch loop, DataParallelSGD training (myRank = 0, numNodes = 4, numGradientBits = 32), distributed reading is ENABLED.
MPI Rank 0: 01/16/2018 19:05:30:  Epoch[ 4 of 4]-Minibatch[   1-  10, 2.50%]: CrossEntropyWithSoftmax = 0.12371232 * 250; EvalClassificationError = 0.06000000 * 250; time = 0.0275s; samplesPerSecond = 9096.9
MPI Rank 0: 01/16/2018 19:05:30:  Epoch[ 4 of 4]-Minibatch[  11-  20, 5.00%]: CrossEntropyWithSoftmax = 0.18070514 * 250; EvalClassificationError = 0.09600000 * 250; time = 0.0193s; samplesPerSecond = 12927.6
MPI Rank 0: 01/16/2018 19:05:30:  Epoch[ 4 of 4]-Minibatch[  21-  30, 7.50%]: CrossEntropyWithSoftmax = 0.14239731 * 250; EvalClassificationError = 0.07600000 * 250; time = 0.0272s; samplesPerSecond = 9192.6
MPI Rank 0: 01/16/2018 19:05:30:  Epoch[ 4 of 4]-Minibatch[  31-  40, 10.00%]: CrossEntropyWithSoftmax = 0.15630155 * 250; EvalClassificationError = 0.06400000 * 250; time = 0.0274s; samplesPerSecond = 9130.6
MPI Rank 0: 01/16/2018 19:05:30:  Epoch[ 4 of 4]-Minibatch[  41-  50, 12.50%]: CrossEntropyWithSoftmax = 0.16935525 * 250; EvalClassificationError = 0.09600000 * 250; time = 0.0245s; samplesPerSecond = 10223.6
MPI Rank 0: 01/16/2018 19:05:30:  Epoch[ 4 of 4]-Minibatch[  51-  60, 15.00%]: CrossEntropyWithSoftmax = 0.18198833 * 250; EvalClassificationError = 0.08000000 * 250; time = 0.0295s; samplesPerSecond = 8467.3
MPI Rank 0: 01/16/2018 19:05:30:  Epoch[ 4 of 4]-Minibatch[  61-  70, 17.50%]: CrossEntropyWithSoftmax = 0.14475946 * 250; EvalClassificationError = 0.07200000 * 250; time = 0.0234s; samplesPerSecond = 10682.9
MPI Rank 0: 01/16/2018 19:05:30:  Epoch[ 4 of 4]-Minibatch[  71-  80, 20.00%]: CrossEntropyWithSoftmax = 0.18021601 * 250; EvalClassificationError = 0.09600000 * 250; time = 0.0250s; samplesPerSecond = 9994.7
MPI Rank 0: 01/16/2018 19:05:30:  Epoch[ 4 of 4]-Minibatch[  81-  90, 22.50%]: CrossEntropyWithSoftmax = 0.15849308 * 250; EvalClassificationError = 0.07600000 * 250; time = 0.0296s; samplesPerSecond = 8436.1
MPI Rank 0: 01/16/2018 19:05:30:  Epoch[ 4 of 4]-Minibatch[  91- 100, 25.00%]: CrossEntropyWithSoftmax = 0.14474426 * 250; EvalClassificationError = 0.07200000 * 250; time = 0.0314s; samplesPerSecond = 7957.6
MPI Rank 0: 01/16/2018 19:05:30:  Epoch[ 4 of 4]-Minibatch[ 101- 110, 27.50%]: CrossEntropyWithSoftmax = 0.13362926 * 250; EvalClassificationError = 0.05200000 * 250; time = 0.0259s; samplesPerSecond = 9659.3
MPI Rank 0: 01/16/2018 19:05:30:  Epoch[ 4 of 4]-Minibatch[ 111- 120, 30.00%]: CrossEntropyWithSoftmax = 0.13708299 * 250; EvalClassificationError = 0.06400000 * 250; time = 0.0242s; samplesPerSecond = 10313.9
MPI Rank 0: 01/16/2018 19:05:30:  Epoch[ 4 of 4]-Minibatch[ 121- 130, 32.50%]: CrossEntropyWithSoftmax = 0.11569776 * 250; EvalClassificationError = 0.05600000 * 250; time = 0.0258s; samplesPerSecond = 9707.7
MPI Rank 0: 01/16/2018 19:05:30:  Epoch[ 4 of 4]-Minibatch[ 131- 140, 35.00%]: CrossEntropyWithSoftmax = 0.16892331 * 250; EvalClassificationError = 0.09200000 * 250; time = 0.0219s; samplesPerSecond = 11391.4
MPI Rank 0: 01/16/2018 19:05:30:  Epoch[ 4 of 4]-Minibatch[ 141- 150, 37.50%]: CrossEntropyWithSoftmax = 0.12752163 * 250; EvalClassificationError = 0.04800000 * 250; time = 0.0259s; samplesPerSecond = 9664.6
MPI Rank 0: 01/16/2018 19:05:30:  Epoch[ 4 of 4]-Minibatch[ 151- 160, 40.00%]: CrossEntropyWithSoftmax = 0.17100866 * 250; EvalClassificationError = 0.08400000 * 250; time = 0.0256s; samplesPerSecond = 9747.7
MPI Rank 0: 01/16/2018 19:05:30:  Epoch[ 4 of 4]-Minibatch[ 161- 170, 42.50%]: CrossEntropyWithSoftmax = 0.17660425 * 250; EvalClassificationError = 0.10000000 * 250; time = 0.0244s; samplesPerSecond = 10256.7
MPI Rank 0: 01/16/2018 19:05:30:  Epoch[ 4 of 4]-Minibatch[ 171- 180, 45.00%]: CrossEntropyWithSoftmax = 0.14105803 * 250; EvalClassificationError = 0.06400000 * 250; time = 0.0218s; samplesPerSecond = 11461.8
MPI Rank 0: 01/16/2018 19:05:30:  Epoch[ 4 of 4]-Minibatch[ 181- 190, 47.50%]: CrossEntropyWithSoftmax = 0.19333553 * 250; EvalClassificationError = 0.10000000 * 250; time = 0.0284s; samplesPerSecond = 8787.5
MPI Rank 0: 01/16/2018 19:05:30:  Epoch[ 4 of 4]-Minibatch[ 191- 200, 50.00%]: CrossEntropyWithSoftmax = 0.20859524 * 250; EvalClassificationError = 0.10000000 * 250; time = 0.0317s; samplesPerSecond = 7894.9
MPI Rank 0: 01/16/2018 19:05:30:  Epoch[ 4 of 4]-Minibatch[ 201- 210, 52.50%]: CrossEntropyWithSoftmax = 0.18499676 * 250; EvalClassificationError = 0.08000000 * 250; time = 0.0257s; samplesPerSecond = 9723.9
MPI Rank 0: 01/16/2018 19:05:30:  Epoch[ 4 of 4]-Minibatch[ 211- 220, 55.00%]: CrossEntropyWithSoftmax = 0.18152439 * 250; EvalClassificationError = 0.07600000 * 250; time = 0.0259s; samplesPerSecond = 9667.8
MPI Rank 0: 01/16/2018 19:05:30:  Epoch[ 4 of 4]-Minibatch[ 221- 230, 57.50%]: CrossEntropyWithSoftmax = 0.14037158 * 250; EvalClassificationError = 0.05600000 * 250; time = 0.0263s; samplesPerSecond = 9490.1
MPI Rank 0: 01/16/2018 19:05:30:  Epoch[ 4 of 4]-Minibatch[ 231- 240, 60.00%]: CrossEntropyWithSoftmax = 0.14866863 * 250; EvalClassificationError = 0.07600000 * 250; time = 0.0236s; samplesPerSecond = 10585.6
MPI Rank 0: 01/16/2018 19:05:31:  Epoch[ 4 of 4]-Minibatch[ 241- 250, 62.50%]: CrossEntropyWithSoftmax = 0.20347746 * 250; EvalClassificationError = 0.11200000 * 250; time = 0.0283s; samplesPerSecond = 8834.2
MPI Rank 0: 01/16/2018 19:05:31:  Epoch[ 4 of 4]-Minibatch[ 251- 260, 65.00%]: CrossEntropyWithSoftmax = 0.12815013 * 250; EvalClassificationError = 0.07200000 * 250; time = 0.0238s; samplesPerSecond = 10486.1
MPI Rank 0: 01/16/2018 19:05:31:  Epoch[ 4 of 4]-Minibatch[ 261- 270, 67.50%]: CrossEntropyWithSoftmax = 0.18672809 * 250; EvalClassificationError = 0.11600000 * 250; time = 0.0273s; samplesPerSecond = 9144.6
MPI Rank 0: 01/16/2018 19:05:31:  Epoch[ 4 of 4]-Minibatch[ 271- 280, 70.00%]: CrossEntropyWithSoftmax = 0.19552989 * 250; EvalClassificationError = 0.08400000 * 250; time = 0.0305s; samplesPerSecond = 8203.7
MPI Rank 0: 01/16/2018 19:05:31:  Epoch[ 4 of 4]-Minibatch[ 281- 290, 72.50%]: CrossEntropyWithSoftmax = 0.16452642 * 250; EvalClassificationError = 0.06800000 * 250; time = 0.0439s; samplesPerSecond = 5690.1
MPI Rank 0: 01/16/2018 19:05:31:  Epoch[ 4 of 4]-Minibatch[ 291- 300, 75.00%]: CrossEntropyWithSoftmax = 0.12461825 * 250; EvalClassificationError = 0.04400000 * 250; time = 0.0246s; samplesPerSecond = 10142.1
MPI Rank 0: 01/16/2018 19:05:31:  Epoch[ 4 of 4]-Minibatch[ 301- 310, 77.50%]: CrossEntropyWithSoftmax = 0.17285251 * 250; EvalClassificationError = 0.08400000 * 250; time = 0.0273s; samplesPerSecond = 9163.4
MPI Rank 0: 01/16/2018 19:05:31:  Epoch[ 4 of 4]-Minibatch[ 311- 320, 80.00%]: CrossEntropyWithSoftmax = 0.12253619 * 250; EvalClassificationError = 0.05200000 * 250; time = 0.0238s; samplesPerSecond = 10501.2
MPI Rank 0: 01/16/2018 19:05:31:  Epoch[ 4 of 4]-Minibatch[ 321- 330, 82.50%]: CrossEntropyWithSoftmax = 0.14723334 * 250; EvalClassificationError = 0.06000000 * 250; time = 0.0379s; samplesPerSecond = 6589.9
MPI Rank 0: 01/16/2018 19:05:31:  Epoch[ 4 of 4]-Minibatch[ 331- 340, 85.00%]: CrossEntropyWithSoftmax = 0.19789538 * 250; EvalClassificationError = 0.09200000 * 250; time = 0.0216s; samplesPerSecond = 11567.0
MPI Rank 0: 01/16/2018 19:05:31:  Epoch[ 4 of 4]-Minibatch[ 341- 350, 87.50%]: CrossEntropyWithSoftmax = 0.12575877 * 250; EvalClassificationError = 0.05200000 * 250; time = 0.0262s; samplesPerSecond = 9531.9
MPI Rank 0: 01/16/2018 19:05:31:  Epoch[ 4 of 4]-Minibatch[ 351- 360, 90.00%]: CrossEntropyWithSoftmax = 0.13745928 * 250; EvalClassificationError = 0.06000000 * 250; time = 0.0380s; samplesPerSecond = 6575.6
MPI Rank 0: 01/16/2018 19:05:31:  Epoch[ 4 of 4]-Minibatch[ 361- 370, 92.50%]: CrossEntropyWithSoftmax = 0.12839652 * 250; EvalClassificationError = 0.06000000 * 250; time = 0.0338s; samplesPerSecond = 7402.1
MPI Rank 0: 01/16/2018 19:05:31:  Epoch[ 4 of 4]-Minibatch[ 371- 380, 95.00%]: CrossEntropyWithSoftmax = 0.16647280 * 250; EvalClassificationError = 0.09600000 * 250; time = 0.0264s; samplesPerSecond = 9477.3
MPI Rank 0: 01/16/2018 19:05:31:  Epoch[ 4 of 4]-Minibatch[ 381- 390, 97.50%]: CrossEntropyWithSoftmax = 0.20679434 * 250; EvalClassificationError = 0.11600000 * 250; time = 0.0325s; samplesPerSecond = 7699.8
MPI Rank 0: 01/16/2018 19:05:31:  Epoch[ 4 of 4]-Minibatch[ 391- 400, 100.00%]: CrossEntropyWithSoftmax = 0.14585245 * 250; EvalClassificationError = 0.06400000 * 250; time = 0.0264s; samplesPerSecond = 9456.7
MPI Rank 0: 01/16/2018 19:05:31: Finished Epoch[ 4 of 4]: [Training] CrossEntropyWithSoftmax = 0.15914931 * 10000; EvalClassificationError = 0.07670000 * 10000; totalSamplesSeen = 40000; learningRatePerSample = 0.0080000004; epochTime=1.09866s
MPI Rank 0: 01/16/2018 19:05:31: SGD: Saving checkpoint model '/tmp/cntk-test-20180116190516.17566/ParallelTraining/NoQuantization_SinglePrecision@release_gpu/models/Simple.dnn'
MPI Rank 0: 
MPI Rank 0: 01/16/2018 19:05:31: Action "train" complete.
MPI Rank 0: 
MPI Rank 0: 01/16/2018 19:05:31: __COMPLETED__
MPI Rank 1: CNTK 2.3.1+ (HEAD c4c2ce, Jan 16 2018 16:21:59) at 2018/01/16 19:05:24
MPI Rank 1: 
MPI Rank 1: /home/ubuntu/workspace/build/gpu/release/bin/cntk  configFile=/home/ubuntu/workspace/Tests/EndToEndTests/ParallelTraining/NoQuantization/SinglePrecision/../../SimpleMultiGPU.cntk  currentDirectory=/home/ubuntu/workspace/Tests/EndToEndTests/ParallelTraining/Data  RunDir=/tmp/cntk-test-20180116190516.17566/ParallelTraining/NoQuantization_SinglePrecision@release_gpu  DataDir=/home/ubuntu/workspace/Tests/EndToEndTests/ParallelTraining/Data  ConfigDir=/home/ubuntu/workspace/Tests/EndToEndTests/ParallelTraining/NoQuantization/SinglePrecision/../..  OutputDir=/tmp/cntk-test-20180116190516.17566/ParallelTraining/NoQuantization_SinglePrecision@release_gpu  DeviceId=0  timestamping=true  numCPUThreads=3  precision=float  SimpleMultiGPU=[SGD=[ParallelTrain=[DataParallelSGD=[gradientBits=32]]]]  stderr=/tmp/cntk-test-20180116190516.17566/ParallelTraining/NoQuantization_SinglePrecision@release_gpu/stderr
MPI Rank 1: 01/16/2018 19:05:25: -------------------------------------------------------------------
MPI Rank 1: 01/16/2018 19:05:25: Build info: 
MPI Rank 1: 
MPI Rank 1: 01/16/2018 19:05:25: 		Built time: Jan 16 2018 16:15:42
MPI Rank 1: 01/16/2018 19:05:25: 		Last modified date: Tue Jan 16 16:13:51 2018
MPI Rank 1: 01/16/2018 19:05:25: 		Build type: release
MPI Rank 1: 01/16/2018 19:05:25: 		Build target: GPU
MPI Rank 1: 01/16/2018 19:05:25: 		With ASGD: yes
MPI Rank 1: 01/16/2018 19:05:25: 		Math lib: mkl
MPI Rank 1: 01/16/2018 19:05:25: 		CUDA version: 9.0.0
MPI Rank 1: 01/16/2018 19:05:25: 		CUDNN version: 7.0.4
MPI Rank 1: 01/16/2018 19:05:25: 		Build Branch: HEAD
MPI Rank 1: 01/16/2018 19:05:25: 		Build SHA1: c4c2ce8c6e89b5c32e4d07523081283417bcfc6d
MPI Rank 1: 01/16/2018 19:05:25: 		MPI distribution: Open MPI
MPI Rank 1: 01/16/2018 19:05:25: 		MPI version: 1.10.7
MPI Rank 1: 01/16/2018 19:05:25: -------------------------------------------------------------------
MPI Rank 1: 01/16/2018 19:05:25: -------------------------------------------------------------------
MPI Rank 1: 01/16/2018 19:05:25: GPU info:
MPI Rank 1: 
MPI Rank 1: 01/16/2018 19:05:25: 		Device[0]: cores = 3072; computeCapability = 5.2; type = "Tesla M60"; total memory = 8123 MB; free memory = 8025 MB
MPI Rank 1: 01/16/2018 19:05:25: -------------------------------------------------------------------
MPI Rank 1: 01/16/2018 19:05:25: Using 3 CPU threads.
MPI Rank 1: 
MPI Rank 1: 01/16/2018 19:05:25: ##############################################################################
MPI Rank 1: 01/16/2018 19:05:25: #                                                                            #
MPI Rank 1: 01/16/2018 19:05:25: # SimpleMultiGPU command (train action)                                      #
MPI Rank 1: 01/16/2018 19:05:25: #                                                                            #
MPI Rank 1: 01/16/2018 19:05:25: ##############################################################################
MPI Rank 1: 
MPI Rank 1: 01/16/2018 19:05:25: 
MPI Rank 1: Creating virgin network.
MPI Rank 1: SimpleNetworkBuilder Using GPU 0
MPI Rank 1: 01/16/2018 19:05:25: 
MPI Rank 1: Model has 25 nodes. Using GPU 0.
MPI Rank 1: 
MPI Rank 1: 01/16/2018 19:05:25: Training criterion:   CrossEntropyWithSoftmax = CrossEntropyWithSoftmax
MPI Rank 1: 01/16/2018 19:05:25: Evaluation criterion: EvalClassificationError = ClassificationError
MPI Rank 1: 
MPI Rank 1: 
MPI Rank 1: Allocating matrices for forward and/or backward propagation.
MPI Rank 1: 
MPI Rank 1: Gradient Memory Aliasing: 4 are aliased.
MPI Rank 1: 	W2*H1 (gradient) reuses HLast (gradient)
MPI Rank 1: 	W1*H1 (gradient) reuses W1*H1+B1 (gradient)
MPI Rank 1: 
MPI Rank 1: Memory Sharing: Out of 40 matrices, 21 are shared as 5, and 19 are not shared.
MPI Rank 1: 
MPI Rank 1: Here are the ones that share memory:
MPI Rank 1: 	{ PosteriorProb : [2 x 1 x *]
MPI Rank 1: 	  ScaledLogLikelihood : [2 x 1 x *] }
MPI Rank 1: 	{ H2 : [50 x 1 x *]
MPI Rank 1: 	  W0*features+B0 : [50 x 1 x *]
MPI Rank 1: 	  W1 : [50 x 50] (gradient)
MPI Rank 1: 	  W1*H1 : [50 x 1 x *] }
MPI Rank 1: 	{ B0 : [50 x 1] (gradient)
MPI Rank 1: 	  H1 : [50 x 1 x *] }
MPI Rank 1: 	{ HLast : [2 x 1 x *] (gradient)
MPI Rank 1: 	  W0 : [50 x 2] (gradient)
MPI Rank 1: 	  W0*features+B0 : [50 x 1 x *] (gradient)
MPI Rank 1: 	  W1*H1 : [50 x 1 x *] (gradient)
MPI Rank 1: 	  W1*H1+B1 : [50 x 1 x *]
MPI Rank 1: 	  W1*H1+B1 : [50 x 1 x *] (gradient)
MPI Rank 1: 	  W2*H1 : [2 x 1 x *]
MPI Rank 1: 	  W2*H1 : [2 x 1 x *] (gradient) }
MPI Rank 1: 	{ H1 : [50 x 1 x *] (gradient)
MPI Rank 1: 	  H2 : [50 x 1 x *] (gradient)
MPI Rank 1: 	  HLast : [2 x 1 x *]
MPI Rank 1: 	  W0*features : [50 x *]
MPI Rank 1: 	  W0*features : [50 x *] (gradient) }
MPI Rank 1: 
MPI Rank 1: Here are the ones that don't share memory:
MPI Rank 1: 	{CrossEntropyWithSoftmax : [1]}
MPI Rank 1: 	{LogOfPrior : [2]}
MPI Rank 1: 	{W2 : [2 x 50] (gradient)}
MPI Rank 1: 	{MVNormalizedFeatures : [2 x *]}
MPI Rank 1: 	{B2 : [2 x 1] (gradient)}
MPI Rank 1: 	{B1 : [50 x 1] (gradient)}
MPI Rank 1: 	{CrossEntropyWithSoftmax : [1] (gradient)}
MPI Rank 1: 	{W2 : [2 x 50]}
MPI Rank 1: 	{B2 : [2 x 1]}
MPI Rank 1: 	{labels : [2 x *]}
MPI Rank 1: 	{Prior : [2]}
MPI Rank 1: 	{EvalClassificationError : [1]}
MPI Rank 1: 	{B0 : [50 x 1]}
MPI Rank 1: 	{W1 : [50 x 50]}
MPI Rank 1: 	{B1 : [50 x 1]}
MPI Rank 1: 	{MeanOfFeatures : [2]}
MPI Rank 1: 	{InvStdOfFeatures : [2]}
MPI Rank 1: 	{W0 : [50 x 2]}
MPI Rank 1: 	{features : [2 x *]}
MPI Rank 1: 
MPI Rank 1: 
MPI Rank 1: 01/16/2018 19:05:25: Training 2802 parameters in 6 out of 6 parameter tensors and 15 nodes with gradient:
MPI Rank 1: 
MPI Rank 1: 01/16/2018 19:05:25: 	Node 'B0' (LearnableParameter operation) : [50 x 1]
MPI Rank 1: 01/16/2018 19:05:25: 	Node 'B1' (LearnableParameter operation) : [50 x 1]
MPI Rank 1: 01/16/2018 19:05:25: 	Node 'B2' (LearnableParameter operation) : [2 x 1]
MPI Rank 1: 01/16/2018 19:05:25: 	Node 'W0' (LearnableParameter operation) : [50 x 2]
MPI Rank 1: 01/16/2018 19:05:25: 	Node 'W1' (LearnableParameter operation) : [50 x 50]
MPI Rank 1: 01/16/2018 19:05:25: 	Node 'W2' (LearnableParameter operation) : [2 x 50]
MPI Rank 1: 
MPI Rank 1: Initializing dataParallelSGD with FP32 aggregation.
MPI Rank 1: NcclComm: disabled, same device used by more than one rank
MPI Rank 1: 
MPI Rank 1: 01/16/2018 19:05:26: Precomputing --> 3 PreCompute nodes found.
MPI Rank 1: 
MPI Rank 1: 01/16/2018 19:05:26: 	MeanOfFeatures = Mean()
MPI Rank 1: 01/16/2018 19:05:26: 	InvStdOfFeatures = InvStdDev()
MPI Rank 1: 01/16/2018 19:05:26: 	Prior = Mean()
MPI Rank 1: 
MPI Rank 1: 01/16/2018 19:05:27: Precomputing --> Completed.
MPI Rank 1: 
MPI Rank 1: 
MPI Rank 1: 01/16/2018 19:05:27: Starting Epoch 1: learning rate per sample = 0.020000  effective momentum = 0.900000  momentum as time constant = 237.3 samples
MPI Rank 1: 
MPI Rank 1: 01/16/2018 19:05:27: Starting minibatch loop, DataParallelSGD training (myRank = 1, numNodes = 4, numGradientBits = 32), distributed reading is ENABLED.
MPI Rank 1: 01/16/2018 19:05:27:  Epoch[ 1 of 4]-Minibatch[   1-  10]: CrossEntropyWithSoftmax = 0.70007977 * 250; EvalClassificationError = 0.52400000 * 250; time = 0.0279s; samplesPerSecond = 8961.0
MPI Rank 1: 01/16/2018 19:05:27:  Epoch[ 1 of 4]-Minibatch[  11-  20]: CrossEntropyWithSoftmax = 0.71514543 * 250; EvalClassificationError = 0.52000000 * 250; time = 0.0427s; samplesPerSecond = 5857.4
MPI Rank 1: 01/16/2018 19:05:27:  Epoch[ 1 of 4]-Minibatch[  21-  30]: CrossEntropyWithSoftmax = 0.72945593 * 250; EvalClassificationError = 0.47600000 * 250; time = 0.0312s; samplesPerSecond = 8008.6
MPI Rank 1: 01/16/2018 19:05:27:  Epoch[ 1 of 4]-Minibatch[  31-  40]: CrossEntropyWithSoftmax = 0.70079058 * 250; EvalClassificationError = 0.52400000 * 250; time = 0.0282s; samplesPerSecond = 8860.1
MPI Rank 1: 01/16/2018 19:05:27:  Epoch[ 1 of 4]-Minibatch[  41-  50]: CrossEntropyWithSoftmax = 0.70605617 * 250; EvalClassificationError = 0.54000000 * 250; time = 0.0220s; samplesPerSecond = 11366.5
MPI Rank 1: 01/16/2018 19:05:27:  Epoch[ 1 of 4]-Minibatch[  51-  60]: CrossEntropyWithSoftmax = 0.71572398 * 250; EvalClassificationError = 0.47600000 * 250; time = 0.0281s; samplesPerSecond = 8899.6
MPI Rank 1: 01/16/2018 19:05:27:  Epoch[ 1 of 4]-Minibatch[  61-  70]: CrossEntropyWithSoftmax = 0.72149850 * 250; EvalClassificationError = 0.48000000 * 250; time = 0.0278s; samplesPerSecond = 8979.3
MPI Rank 1: 01/16/2018 19:05:27:  Epoch[ 1 of 4]-Minibatch[  71-  80]: CrossEntropyWithSoftmax = 0.79845604 * 250; EvalClassificationError = 0.47600000 * 250; time = 0.0300s; samplesPerSecond = 8329.8
MPI Rank 1: 01/16/2018 19:05:27:  Epoch[ 1 of 4]-Minibatch[  81-  90]: CrossEntropyWithSoftmax = 0.69665185 * 250; EvalClassificationError = 0.46800000 * 250; time = 0.0256s; samplesPerSecond = 9775.1
MPI Rank 1: 01/16/2018 19:05:27:  Epoch[ 1 of 4]-Minibatch[  91- 100]: CrossEntropyWithSoftmax = 0.70723325 * 250; EvalClassificationError = 0.49200000 * 250; time = 0.0278s; samplesPerSecond = 9004.1
MPI Rank 1: 01/16/2018 19:05:27:  Epoch[ 1 of 4]-Minibatch[ 101- 110]: CrossEntropyWithSoftmax = 0.71420345 * 250; EvalClassificationError = 0.55200000 * 250; time = 0.0316s; samplesPerSecond = 7923.8
MPI Rank 1: 01/16/2018 19:05:27:  Epoch[ 1 of 4]-Minibatch[ 111- 120]: CrossEntropyWithSoftmax = 0.69535258 * 250; EvalClassificationError = 0.43600000 * 250; time = 0.0302s; samplesPerSecond = 8286.3
MPI Rank 1: 01/16/2018 19:05:27:  Epoch[ 1 of 4]-Minibatch[ 121- 130]: CrossEntropyWithSoftmax = 0.70078532 * 250; EvalClassificationError = 0.44000000 * 250; time = 0.0283s; samplesPerSecond = 8838.7
MPI Rank 1: 01/16/2018 19:05:27:  Epoch[ 1 of 4]-Minibatch[ 131- 140]: CrossEntropyWithSoftmax = 0.71857914 * 250; EvalClassificationError = 0.54800000 * 250; time = 0.0298s; samplesPerSecond = 8385.7
MPI Rank 1: 01/16/2018 19:05:27:  Epoch[ 1 of 4]-Minibatch[ 141- 150]: CrossEntropyWithSoftmax = 0.72088357 * 250; EvalClassificationError = 0.48800000 * 250; time = 0.0223s; samplesPerSecond = 11213.1
MPI Rank 1: 01/16/2018 19:05:27:  Epoch[ 1 of 4]-Minibatch[ 151- 160]: CrossEntropyWithSoftmax = 0.71798840 * 250; EvalClassificationError = 0.55200000 * 250; time = 0.0238s; samplesPerSecond = 10483.1
MPI Rank 1: 01/16/2018 19:05:27:  Epoch[ 1 of 4]-Minibatch[ 161- 170]: CrossEntropyWithSoftmax = 0.74162164 * 250; EvalClassificationError = 0.50000000 * 250; time = 0.0258s; samplesPerSecond = 9704.3
MPI Rank 1: 01/16/2018 19:05:27:  Epoch[ 1 of 4]-Minibatch[ 171- 180]: CrossEntropyWithSoftmax = 0.71835127 * 250; EvalClassificationError = 0.51600000 * 250; time = 0.0226s; samplesPerSecond = 11040.7
MPI Rank 1: 01/16/2018 19:05:27:  Epoch[ 1 of 4]-Minibatch[ 181- 190]: CrossEntropyWithSoftmax = 0.71529461 * 250; EvalClassificationError = 0.48400000 * 250; time = 0.0253s; samplesPerSecond = 9879.9
MPI Rank 1: 01/16/2018 19:05:27:  Epoch[ 1 of 4]-Minibatch[ 191- 200]: CrossEntropyWithSoftmax = 0.71727656 * 250; EvalClassificationError = 0.53200000 * 250; time = 0.0281s; samplesPerSecond = 8908.6
MPI Rank 1: 01/16/2018 19:05:27:  Epoch[ 1 of 4]-Minibatch[ 201- 210]: CrossEntropyWithSoftmax = 0.71745516 * 250; EvalClassificationError = 0.50400000 * 250; time = 0.0218s; samplesPerSecond = 11461.3
MPI Rank 1: 01/16/2018 19:05:27:  Epoch[ 1 of 4]-Minibatch[ 211- 220]: CrossEntropyWithSoftmax = 0.72088398 * 250; EvalClassificationError = 0.50000000 * 250; time = 0.0242s; samplesPerSecond = 10325.2
MPI Rank 1: 01/16/2018 19:05:27:  Epoch[ 1 of 4]-Minibatch[ 221- 230]: CrossEntropyWithSoftmax = 0.72006809 * 250; EvalClassificationError = 0.50800000 * 250; time = 0.0293s; samplesPerSecond = 8536.4
MPI Rank 1: 01/16/2018 19:05:27:  Epoch[ 1 of 4]-Minibatch[ 231- 240]: CrossEntropyWithSoftmax = 0.71275468 * 250; EvalClassificationError = 0.51200000 * 250; time = 0.0286s; samplesPerSecond = 8744.3
MPI Rank 1: 01/16/2018 19:05:27:  Epoch[ 1 of 4]-Minibatch[ 241- 250]: CrossEntropyWithSoftmax = 0.69644781 * 250; EvalClassificationError = 0.50400000 * 250; time = 0.0277s; samplesPerSecond = 9037.1
MPI Rank 1: 01/16/2018 19:05:27:  Epoch[ 1 of 4]-Minibatch[ 251- 260]: CrossEntropyWithSoftmax = 0.70129698 * 250; EvalClassificationError = 0.51200000 * 250; time = 0.0275s; samplesPerSecond = 9105.8
MPI Rank 1: 01/16/2018 19:05:27:  Epoch[ 1 of 4]-Minibatch[ 261- 270]: CrossEntropyWithSoftmax = 0.70768095 * 250; EvalClassificationError = 0.54400000 * 250; time = 0.0281s; samplesPerSecond = 8899.6
MPI Rank 1: 01/16/2018 19:05:27:  Epoch[ 1 of 4]-Minibatch[ 271- 280]: CrossEntropyWithSoftmax = 0.69744379 * 250; EvalClassificationError = 0.52800000 * 250; time = 0.0227s; samplesPerSecond = 11007.5
MPI Rank 1: 01/16/2018 19:05:27:  Epoch[ 1 of 4]-Minibatch[ 281- 290]: CrossEntropyWithSoftmax = 0.69266187 * 250; EvalClassificationError = 0.44800000 * 250; time = 0.0329s; samplesPerSecond = 7601.9
MPI Rank 1: 01/16/2018 19:05:27:  Epoch[ 1 of 4]-Minibatch[ 291- 300]: CrossEntropyWithSoftmax = 0.69347266 * 250; EvalClassificationError = 0.49600000 * 250; time = 0.0286s; samplesPerSecond = 8734.4
MPI Rank 1: 01/16/2018 19:05:27:  Epoch[ 1 of 4]-Minibatch[ 301- 310]: CrossEntropyWithSoftmax = 0.69257409 * 250; EvalClassificationError = 0.54000000 * 250; time = 0.0276s; samplesPerSecond = 9065.4
MPI Rank 1: 01/16/2018 19:05:27:  Epoch[ 1 of 4]-Minibatch[ 311- 320]: CrossEntropyWithSoftmax = 0.68625741 * 250; EvalClassificationError = 0.38000000 * 250; time = 0.0237s; samplesPerSecond = 10551.8
MPI Rank 1: 01/16/2018 19:05:28:  Epoch[ 1 of 4]-Minibatch[ 321- 330]: CrossEntropyWithSoftmax = 0.69064011 * 250; EvalClassificationError = 0.46800000 * 250; time = 0.0260s; samplesPerSecond = 9602.5
MPI Rank 1: 01/16/2018 19:05:28:  Epoch[ 1 of 4]-Minibatch[ 331- 340]: CrossEntropyWithSoftmax = 0.70192153 * 250; EvalClassificationError = 0.46000000 * 250; time = 0.0243s; samplesPerSecond = 10269.2
MPI Rank 1: 01/16/2018 19:05:28:  Epoch[ 1 of 4]-Minibatch[ 341- 350]: CrossEntropyWithSoftmax = 0.69058912 * 250; EvalClassificationError = 0.52000000 * 250; time = 0.0286s; samplesPerSecond = 8726.2
MPI Rank 1: 01/16/2018 19:05:28:  Epoch[ 1 of 4]-Minibatch[ 351- 360]: CrossEntropyWithSoftmax = 0.67041492 * 250; EvalClassificationError = 0.39200000 * 250; time = 0.0369s; samplesPerSecond = 6766.8
MPI Rank 1: 01/16/2018 19:05:28:  Epoch[ 1 of 4]-Minibatch[ 361- 370]: CrossEntropyWithSoftmax = 0.65913973 * 250; EvalClassificationError = 0.35600000 * 250; time = 0.0222s; samplesPerSecond = 11240.0
MPI Rank 1: 01/16/2018 19:05:28:  Epoch[ 1 of 4]-Minibatch[ 371- 380]: CrossEntropyWithSoftmax = 0.63919877 * 250; EvalClassificationError = 0.36400000 * 250; time = 0.0215s; samplesPerSecond = 11614.2
MPI Rank 1: 01/16/2018 19:05:28:  Epoch[ 1 of 4]-Minibatch[ 381- 390]: CrossEntropyWithSoftmax = 0.61293885 * 250; EvalClassificationError = 0.19200000 * 250; time = 0.0301s; samplesPerSecond = 8311.6
MPI Rank 1: 01/16/2018 19:05:28:  Epoch[ 1 of 4]-Minibatch[ 391- 400]: CrossEntropyWithSoftmax = 0.55255355 * 250; EvalClassificationError = 0.18800000 * 250; time = 0.0300s; samplesPerSecond = 8345.6
MPI Rank 1: 01/16/2018 19:05:28: Finished Epoch[ 1 of 4]: [Training] CrossEntropyWithSoftmax = 0.70019555 * 10000; EvalClassificationError = 0.47350000 * 10000; totalSamplesSeen = 10000; learningRatePerSample = 0.02; epochTime=1.10542s
MPI Rank 1: 
MPI Rank 1: 01/16/2018 19:05:28: Starting Epoch 2: learning rate per sample = 0.008000  effective momentum = 0.900000  momentum as time constant = 237.3 samples
MPI Rank 1: 
MPI Rank 1: 01/16/2018 19:05:28: Starting minibatch loop, DataParallelSGD training (myRank = 1, numNodes = 4, numGradientBits = 32), distributed reading is ENABLED.
MPI Rank 1: 01/16/2018 19:05:28:  Epoch[ 2 of 4]-Minibatch[   1-  10, 2.50%]: CrossEntropyWithSoftmax = 0.50774630 * 250; EvalClassificationError = 0.24000000 * 250; time = 0.0251s; samplesPerSecond = 9952.6
MPI Rank 1: 01/16/2018 19:05:28:  Epoch[ 2 of 4]-Minibatch[  11-  20, 5.00%]: CrossEntropyWithSoftmax = 0.43388933 * 250; EvalClassificationError = 0.11200000 * 250; time = 0.0265s; samplesPerSecond = 9441.1
MPI Rank 1: 01/16/2018 19:05:28:  Epoch[ 2 of 4]-Minibatch[  21-  30, 7.50%]: CrossEntropyWithSoftmax = 0.36674877 * 250; EvalClassificationError = 0.08800000 * 250; time = 0.0244s; samplesPerSecond = 10247.5
MPI Rank 1: 01/16/2018 19:05:28:  Epoch[ 2 of 4]-Minibatch[  31-  40, 10.00%]: CrossEntropyWithSoftmax = 0.33768770 * 250; EvalClassificationError = 0.06800000 * 250; time = 0.0315s; samplesPerSecond = 7930.4
MPI Rank 1: 01/16/2018 19:05:28:  Epoch[ 2 of 4]-Minibatch[  41-  50, 12.50%]: CrossEntropyWithSoftmax = 0.30320952 * 250; EvalClassificationError = 0.08000000 * 250; time = 0.0250s; samplesPerSecond = 9992.6
MPI Rank 1: 01/16/2018 19:05:28:  Epoch[ 2 of 4]-Minibatch[  51-  60, 15.00%]: CrossEntropyWithSoftmax = 0.29576047 * 250; EvalClassificationError = 0.09200000 * 250; time = 0.0312s; samplesPerSecond = 8015.7
MPI Rank 1: 01/16/2018 19:05:28:  Epoch[ 2 of 4]-Minibatch[  61-  70, 17.50%]: CrossEntropyWithSoftmax = 0.24924496 * 250; EvalClassificationError = 0.07600000 * 250; time = 0.0275s; samplesPerSecond = 9103.4
MPI Rank 1: 01/16/2018 19:05:28:  Epoch[ 2 of 4]-Minibatch[  71-  80, 20.00%]: CrossEntropyWithSoftmax = 0.24632418 * 250; EvalClassificationError = 0.10400000 * 250; time = 0.0198s; samplesPerSecond = 12596.1
MPI Rank 1: 01/16/2018 19:05:28:  Epoch[ 2 of 4]-Minibatch[  81-  90, 22.50%]: CrossEntropyWithSoftmax = 0.20943161 * 250; EvalClassificationError = 0.08400000 * 250; time = 0.0242s; samplesPerSecond = 10349.1
MPI Rank 1: 01/16/2018 19:05:28:  Epoch[ 2 of 4]-Minibatch[  91- 100, 25.00%]: CrossEntropyWithSoftmax = 0.19115999 * 250; EvalClassificationError = 0.07200000 * 250; time = 0.0257s; samplesPerSecond = 9746.3
MPI Rank 1: 01/16/2018 19:05:28:  Epoch[ 2 of 4]-Minibatch[ 101- 110, 27.50%]: CrossEntropyWithSoftmax = 0.17923233 * 250; EvalClassificationError = 0.05200000 * 250; time = 0.0234s; samplesPerSecond = 10661.5
MPI Rank 1: 01/16/2018 19:05:28:  Epoch[ 2 of 4]-Minibatch[ 111- 120, 30.00%]: CrossEntropyWithSoftmax = 0.17075425 * 250; EvalClassificationError = 0.06400000 * 250; time = 0.0285s; samplesPerSecond = 8761.3
MPI Rank 1: 01/16/2018 19:05:28:  Epoch[ 2 of 4]-Minibatch[ 121- 130, 32.50%]: CrossEntropyWithSoftmax = 0.14442373 * 250; EvalClassificationError = 0.05600000 * 250; time = 0.0240s; samplesPerSecond = 10412.8
MPI Rank 1: 01/16/2018 19:05:28:  Epoch[ 2 of 4]-Minibatch[ 131- 140, 35.00%]: CrossEntropyWithSoftmax = 0.17753820 * 250; EvalClassificationError = 0.08800000 * 250; time = 0.0238s; samplesPerSecond = 10494.3
MPI Rank 1: 01/16/2018 19:05:28:  Epoch[ 2 of 4]-Minibatch[ 141- 150, 37.50%]: CrossEntropyWithSoftmax = 0.15087857 * 250; EvalClassificationError = 0.05200000 * 250; time = 0.0300s; samplesPerSecond = 8344.2
MPI Rank 1: 01/16/2018 19:05:28:  Epoch[ 2 of 4]-Minibatch[ 151- 160, 40.00%]: CrossEntropyWithSoftmax = 0.19253022 * 250; EvalClassificationError = 0.08000000 * 250; time = 0.0321s; samplesPerSecond = 7784.9
MPI Rank 1: 01/16/2018 19:05:28:  Epoch[ 2 of 4]-Minibatch[ 161- 170, 42.50%]: CrossEntropyWithSoftmax = 0.17830684 * 250; EvalClassificationError = 0.08800000 * 250; time = 0.0351s; samplesPerSecond = 7123.7
MPI Rank 1: 01/16/2018 19:05:28:  Epoch[ 2 of 4]-Minibatch[ 171- 180, 45.00%]: CrossEntropyWithSoftmax = 0.15115429 * 250; EvalClassificationError = 0.06400000 * 250; time = 0.0223s; samplesPerSecond = 11229.1
MPI Rank 1: 01/16/2018 19:05:28:  Epoch[ 2 of 4]-Minibatch[ 181- 190, 47.50%]: CrossEntropyWithSoftmax = 0.19135969 * 250; EvalClassificationError = 0.10000000 * 250; time = 0.0201s; samplesPerSecond = 12418.8
MPI Rank 1: 01/16/2018 19:05:28:  Epoch[ 2 of 4]-Minibatch[ 191- 200, 50.00%]: CrossEntropyWithSoftmax = 0.21491485 * 250; EvalClassificationError = 0.10400000 * 250; time = 0.0284s; samplesPerSecond = 8794.3
MPI Rank 1: 01/16/2018 19:05:28:  Epoch[ 2 of 4]-Minibatch[ 201- 210, 52.50%]: CrossEntropyWithSoftmax = 0.18682346 * 250; EvalClassificationError = 0.08400000 * 250; time = 0.0235s; samplesPerSecond = 10617.6
MPI Rank 1: 01/16/2018 19:05:28:  Epoch[ 2 of 4]-Minibatch[ 211- 220, 55.00%]: CrossEntropyWithSoftmax = 0.18483206 * 250; EvalClassificationError = 0.07600000 * 250; time = 0.0218s; samplesPerSecond = 11484.8
MPI Rank 1: 01/16/2018 19:05:28:  Epoch[ 2 of 4]-Minibatch[ 221- 230, 57.50%]: CrossEntropyWithSoftmax = 0.14684504 * 250; EvalClassificationError = 0.06000000 * 250; time = 0.0300s; samplesPerSecond = 8343.0
MPI Rank 1: 01/16/2018 19:05:28:  Epoch[ 2 of 4]-Minibatch[ 231- 240, 60.00%]: CrossEntropyWithSoftmax = 0.15322117 * 250; EvalClassificationError = 0.07600000 * 250; time = 0.0299s; samplesPerSecond = 8372.7
MPI Rank 1: 01/16/2018 19:05:28:  Epoch[ 2 of 4]-Minibatch[ 241- 250, 62.50%]: CrossEntropyWithSoftmax = 0.19882571 * 250; EvalClassificationError = 0.11600000 * 250; time = 0.0317s; samplesPerSecond = 7881.3
MPI Rank 1: 01/16/2018 19:05:28:  Epoch[ 2 of 4]-Minibatch[ 251- 260, 65.00%]: CrossEntropyWithSoftmax = 0.13683833 * 250; EvalClassificationError = 0.07200000 * 250; time = 0.0259s; samplesPerSecond = 9660.6
MPI Rank 1: 01/16/2018 19:05:28:  Epoch[ 2 of 4]-Minibatch[ 261- 270, 67.50%]: CrossEntropyWithSoftmax = 0.18621189 * 250; EvalClassificationError = 0.11600000 * 250; time = 0.0235s; samplesPerSecond = 10652.2
MPI Rank 1: 01/16/2018 19:05:28:  Epoch[ 2 of 4]-Minibatch[ 271- 280, 70.00%]: CrossEntropyWithSoftmax = 0.19408050 * 250; EvalClassificationError = 0.08000000 * 250; time = 0.0326s; samplesPerSecond = 7677.0
MPI Rank 1: 01/16/2018 19:05:28:  Epoch[ 2 of 4]-Minibatch[ 281- 290, 72.50%]: CrossEntropyWithSoftmax = 0.17298137 * 250; EvalClassificationError = 0.07200000 * 250; time = 0.0264s; samplesPerSecond = 9452.7
MPI Rank 1: 01/16/2018 19:05:29:  Epoch[ 2 of 4]-Minibatch[ 291- 300, 75.00%]: CrossEntropyWithSoftmax = 0.13265130 * 250; EvalClassificationError = 0.04800000 * 250; time = 0.0287s; samplesPerSecond = 8716.6
MPI Rank 1: 01/16/2018 19:05:29:  Epoch[ 2 of 4]-Minibatch[ 301- 310, 77.50%]: CrossEntropyWithSoftmax = 0.17627179 * 250; EvalClassificationError = 0.09200000 * 250; time = 0.0313s; samplesPerSecond = 7992.4
MPI Rank 1: 01/16/2018 19:05:29:  Epoch[ 2 of 4]-Minibatch[ 311- 320, 80.00%]: CrossEntropyWithSoftmax = 0.12734628 * 250; EvalClassificationError = 0.05600000 * 250; time = 0.0263s; samplesPerSecond = 9510.2
MPI Rank 1: 01/16/2018 19:05:29:  Epoch[ 2 of 4]-Minibatch[ 321- 330, 82.50%]: CrossEntropyWithSoftmax = 0.15108452 * 250; EvalClassificationError = 0.05600000 * 250; time = 0.0251s; samplesPerSecond = 9979.8
MPI Rank 1: 01/16/2018 19:05:29:  Epoch[ 2 of 4]-Minibatch[ 331- 340, 85.00%]: CrossEntropyWithSoftmax = 0.19729184 * 250; EvalClassificationError = 0.09200000 * 250; time = 0.0264s; samplesPerSecond = 9458.7
MPI Rank 1: 01/16/2018 19:05:29:  Epoch[ 2 of 4]-Minibatch[ 341- 350, 87.50%]: CrossEntropyWithSoftmax = 0.12857333 * 250; EvalClassificationError = 0.05200000 * 250; time = 0.0412s; samplesPerSecond = 6061.2
MPI Rank 1: 01/16/2018 19:05:29:  Epoch[ 2 of 4]-Minibatch[ 351- 360, 90.00%]: CrossEntropyWithSoftmax = 0.13867804 * 250; EvalClassificationError = 0.06000000 * 250; time = 0.0261s; samplesPerSecond = 9572.6
MPI Rank 1: 01/16/2018 19:05:29:  Epoch[ 2 of 4]-Minibatch[ 361- 370, 92.50%]: CrossEntropyWithSoftmax = 0.12786050 * 250; EvalClassificationError = 0.06000000 * 250; time = 0.0243s; samplesPerSecond = 10309.2
MPI Rank 1: 01/16/2018 19:05:29:  Epoch[ 2 of 4]-Minibatch[ 371- 380, 95.00%]: CrossEntropyWithSoftmax = 0.16643303 * 250; EvalClassificationError = 0.09600000 * 250; time = 0.0317s; samplesPerSecond = 7875.7
MPI Rank 1: 01/16/2018 19:05:29:  Epoch[ 2 of 4]-Minibatch[ 381- 390, 97.50%]: CrossEntropyWithSoftmax = 0.20440407 * 250; EvalClassificationError = 0.11200000 * 250; time = 0.0259s; samplesPerSecond = 9652.7
MPI Rank 1: 01/16/2018 19:05:29:  Epoch[ 2 of 4]-Minibatch[ 391- 400, 100.00%]: CrossEntropyWithSoftmax = 0.14566238 * 250; EvalClassificationError = 0.06800000 * 250; time = 0.0206s; samplesPerSecond = 12121.0
MPI Rank 1: 01/16/2018 19:05:29: Finished Epoch[ 2 of 4]: [Training] CrossEntropyWithSoftmax = 0.20373031 * 10000; EvalClassificationError = 0.08270000 * 10000; totalSamplesSeen = 20000; learningRatePerSample = 0.0080000004; epochTime=1.08554s
MPI Rank 1: 
MPI Rank 1: 01/16/2018 19:05:29: Starting Epoch 3: learning rate per sample = 0.008000  effective momentum = 0.900000  momentum as time constant = 237.3 samples
MPI Rank 1: 
MPI Rank 1: 01/16/2018 19:05:29: Starting minibatch loop, DataParallelSGD training (myRank = 1, numNodes = 4, numGradientBits = 32), distributed reading is ENABLED.
MPI Rank 1: 01/16/2018 19:05:29:  Epoch[ 3 of 4]-Minibatch[   1-  10, 2.50%]: CrossEntropyWithSoftmax = 0.12590085 * 250; EvalClassificationError = 0.05600000 * 250; time = 0.0250s; samplesPerSecond = 10018.1
MPI Rank 1: 01/16/2018 19:05:29:  Epoch[ 3 of 4]-Minibatch[  11-  20, 5.00%]: CrossEntropyWithSoftmax = 0.17780229 * 250; EvalClassificationError = 0.09600000 * 250; time = 0.0225s; samplesPerSecond = 11096.4
MPI Rank 1: 01/16/2018 19:05:29:  Epoch[ 3 of 4]-Minibatch[  21-  30, 7.50%]: CrossEntropyWithSoftmax = 0.14417637 * 250; EvalClassificationError = 0.07600000 * 250; time = 0.0295s; samplesPerSecond = 8461.9
MPI Rank 1: 01/16/2018 19:05:29:  Epoch[ 3 of 4]-Minibatch[  31-  40, 10.00%]: CrossEntropyWithSoftmax = 0.15796897 * 250; EvalClassificationError = 0.06400000 * 250; time = 0.0243s; samplesPerSecond = 10267.8
MPI Rank 1: 01/16/2018 19:05:29:  Epoch[ 3 of 4]-Minibatch[  41-  50, 12.50%]: CrossEntropyWithSoftmax = 0.17002999 * 250; EvalClassificationError = 0.10000000 * 250; time = 0.0264s; samplesPerSecond = 9461.7
MPI Rank 1: 01/16/2018 19:05:29:  Epoch[ 3 of 4]-Minibatch[  51-  60, 15.00%]: CrossEntropyWithSoftmax = 0.18262114 * 250; EvalClassificationError = 0.07600000 * 250; time = 0.0290s; samplesPerSecond = 8628.2
MPI Rank 1: 01/16/2018 19:05:29:  Epoch[ 3 of 4]-Minibatch[  61-  70, 17.50%]: CrossEntropyWithSoftmax = 0.14643695 * 250; EvalClassificationError = 0.07200000 * 250; time = 0.0217s; samplesPerSecond = 11537.5
MPI Rank 1: 01/16/2018 19:05:29:  Epoch[ 3 of 4]-Minibatch[  71-  80, 20.00%]: CrossEntropyWithSoftmax = 0.18030529 * 250; EvalClassificationError = 0.09600000 * 250; time = 0.0262s; samplesPerSecond = 9529.4
MPI Rank 1: 01/16/2018 19:05:29:  Epoch[ 3 of 4]-Minibatch[  81-  90, 22.50%]: CrossEntropyWithSoftmax = 0.15846151 * 250; EvalClassificationError = 0.07200000 * 250; time = 0.0235s; samplesPerSecond = 10637.6
MPI Rank 1: 01/16/2018 19:05:29:  Epoch[ 3 of 4]-Minibatch[  91- 100, 25.00%]: CrossEntropyWithSoftmax = 0.14486534 * 250; EvalClassificationError = 0.07200000 * 250; time = 0.0285s; samplesPerSecond = 8784.8
MPI Rank 1: 01/16/2018 19:05:29:  Epoch[ 3 of 4]-Minibatch[ 101- 110, 27.50%]: CrossEntropyWithSoftmax = 0.13469094 * 250; EvalClassificationError = 0.05200000 * 250; time = 0.0236s; samplesPerSecond = 10580.2
MPI Rank 1: 01/16/2018 19:05:29:  Epoch[ 3 of 4]-Minibatch[ 111- 120, 30.00%]: CrossEntropyWithSoftmax = 0.13720020 * 250; EvalClassificationError = 0.06400000 * 250; time = 0.0419s; samplesPerSecond = 5967.5
MPI Rank 1: 01/16/2018 19:05:29:  Epoch[ 3 of 4]-Minibatch[ 121- 130, 32.50%]: CrossEntropyWithSoftmax = 0.11641296 * 250; EvalClassificationError = 0.05600000 * 250; time = 0.0319s; samplesPerSecond = 7835.7
MPI Rank 1: 01/16/2018 19:05:29:  Epoch[ 3 of 4]-Minibatch[ 131- 140, 35.00%]: CrossEntropyWithSoftmax = 0.16786646 * 250; EvalClassificationError = 0.08800000 * 250; time = 0.0240s; samplesPerSecond = 10419.7
MPI Rank 1: 01/16/2018 19:05:29:  Epoch[ 3 of 4]-Minibatch[ 141- 150, 37.50%]: CrossEntropyWithSoftmax = 0.12811514 * 250; EvalClassificationError = 0.04800000 * 250; time = 0.0218s; samplesPerSecond = 11490.5
MPI Rank 1: 01/16/2018 19:05:29:  Epoch[ 3 of 4]-Minibatch[ 151- 160, 40.00%]: CrossEntropyWithSoftmax = 0.17257851 * 250; EvalClassificationError = 0.08000000 * 250; time = 0.0283s; samplesPerSecond = 8848.1
MPI Rank 1: 01/16/2018 19:05:29:  Epoch[ 3 of 4]-Minibatch[ 161- 170, 42.50%]: CrossEntropyWithSoftmax = 0.17623655 * 250; EvalClassificationError = 0.09600000 * 250; time = 0.0217s; samplesPerSecond = 11534.8
MPI Rank 1: 01/16/2018 19:05:29:  Epoch[ 3 of 4]-Minibatch[ 171- 180, 45.00%]: CrossEntropyWithSoftmax = 0.14121117 * 250; EvalClassificationError = 0.06400000 * 250; time = 0.0245s; samplesPerSecond = 10204.8
MPI Rank 1: 01/16/2018 19:05:29:  Epoch[ 3 of 4]-Minibatch[ 181- 190, 47.50%]: CrossEntropyWithSoftmax = 0.19243443 * 250; EvalClassificationError = 0.10000000 * 250; time = 0.0297s; samplesPerSecond = 8427.5
MPI Rank 1: 01/16/2018 19:05:29:  Epoch[ 3 of 4]-Minibatch[ 191- 200, 50.00%]: CrossEntropyWithSoftmax = 0.20908162 * 250; EvalClassificationError = 0.10000000 * 250; time = 0.0256s; samplesPerSecond = 9767.9
MPI Rank 1: 01/16/2018 19:05:29:  Epoch[ 3 of 4]-Minibatch[ 201- 210, 52.50%]: CrossEntropyWithSoftmax = 0.18472067 * 250; EvalClassificationError = 0.08000000 * 250; time = 0.0257s; samplesPerSecond = 9744.8
MPI Rank 1: 01/16/2018 19:05:29:  Epoch[ 3 of 4]-Minibatch[ 211- 220, 55.00%]: CrossEntropyWithSoftmax = 0.18185536 * 250; EvalClassificationError = 0.07600000 * 250; time = 0.0303s; samplesPerSecond = 8248.3
MPI Rank 1: 01/16/2018 19:05:29:  Epoch[ 3 of 4]-Minibatch[ 221- 230, 57.50%]: CrossEntropyWithSoftmax = 0.14074205 * 250; EvalClassificationError = 0.06000000 * 250; time = 0.0279s; samplesPerSecond = 8970.4
MPI Rank 1: 01/16/2018 19:05:29:  Epoch[ 3 of 4]-Minibatch[ 231- 240, 60.00%]: CrossEntropyWithSoftmax = 0.14871620 * 250; EvalClassificationError = 0.07600000 * 250; time = 0.0281s; samplesPerSecond = 8898.3
MPI Rank 1: 01/16/2018 19:05:29:  Epoch[ 3 of 4]-Minibatch[ 241- 250, 62.50%]: CrossEntropyWithSoftmax = 0.20299705 * 250; EvalClassificationError = 0.11200000 * 250; time = 0.0287s; samplesPerSecond = 8711.7
MPI Rank 1: 01/16/2018 19:05:29:  Epoch[ 3 of 4]-Minibatch[ 251- 260, 65.00%]: CrossEntropyWithSoftmax = 0.12852038 * 250; EvalClassificationError = 0.07200000 * 250; time = 0.0234s; samplesPerSecond = 10690.6
MPI Rank 1: 01/16/2018 19:05:30:  Epoch[ 3 of 4]-Minibatch[ 261- 270, 67.50%]: CrossEntropyWithSoftmax = 0.18660439 * 250; EvalClassificationError = 0.11600000 * 250; time = 0.0316s; samplesPerSecond = 7921.7
MPI Rank 1: 01/16/2018 19:05:30:  Epoch[ 3 of 4]-Minibatch[ 271- 280, 70.00%]: CrossEntropyWithSoftmax = 0.19575997 * 250; EvalClassificationError = 0.08800000 * 250; time = 0.0243s; samplesPerSecond = 10284.7
MPI Rank 1: 01/16/2018 19:05:30:  Epoch[ 3 of 4]-Minibatch[ 281- 290, 72.50%]: CrossEntropyWithSoftmax = 0.16667676 * 250; EvalClassificationError = 0.06800000 * 250; time = 0.0277s; samplesPerSecond = 9027.1
MPI Rank 1: 01/16/2018 19:05:30:  Epoch[ 3 of 4]-Minibatch[ 291- 300, 75.00%]: CrossEntropyWithSoftmax = 0.12526168 * 250; EvalClassificationError = 0.04400000 * 250; time = 0.0222s; samplesPerSecond = 11276.7
MPI Rank 1: 01/16/2018 19:05:30:  Epoch[ 3 of 4]-Minibatch[ 301- 310, 77.50%]: CrossEntropyWithSoftmax = 0.17392132 * 250; EvalClassificationError = 0.08800000 * 250; time = 0.0282s; samplesPerSecond = 8878.2
MPI Rank 1: 01/16/2018 19:05:30:  Epoch[ 3 of 4]-Minibatch[ 311- 320, 80.00%]: CrossEntropyWithSoftmax = 0.12281615 * 250; EvalClassificationError = 0.05200000 * 250; time = 0.0364s; samplesPerSecond = 6865.9
MPI Rank 1: 01/16/2018 19:05:30:  Epoch[ 3 of 4]-Minibatch[ 321- 330, 82.50%]: CrossEntropyWithSoftmax = 0.14759390 * 250; EvalClassificationError = 0.06000000 * 250; time = 0.0245s; samplesPerSecond = 10192.2
MPI Rank 1: 01/16/2018 19:05:30:  Epoch[ 3 of 4]-Minibatch[ 331- 340, 85.00%]: CrossEntropyWithSoftmax = 0.19801301 * 250; EvalClassificationError = 0.09200000 * 250; time = 0.0228s; samplesPerSecond = 10941.1
MPI Rank 1: 01/16/2018 19:05:30:  Epoch[ 3 of 4]-Minibatch[ 341- 350, 87.50%]: CrossEntropyWithSoftmax = 0.12593395 * 250; EvalClassificationError = 0.05200000 * 250; time = 0.0236s; samplesPerSecond = 10593.0
MPI Rank 1: 01/16/2018 19:05:30:  Epoch[ 3 of 4]-Minibatch[ 351- 360, 90.00%]: CrossEntropyWithSoftmax = 0.13756617 * 250; EvalClassificationError = 0.06000000 * 250; time = 0.0257s; samplesPerSecond = 9740.8
MPI Rank 1: 01/16/2018 19:05:30:  Epoch[ 3 of 4]-Minibatch[ 361- 370, 92.50%]: CrossEntropyWithSoftmax = 0.12838525 * 250; EvalClassificationError = 0.06000000 * 250; time = 0.0259s; samplesPerSecond = 9636.6
MPI Rank 1: 01/16/2018 19:05:30:  Epoch[ 3 of 4]-Minibatch[ 371- 380, 95.00%]: CrossEntropyWithSoftmax = 0.16654369 * 250; EvalClassificationError = 0.09600000 * 250; time = 0.0288s; samplesPerSecond = 8687.2
MPI Rank 1: 01/16/2018 19:05:30:  Epoch[ 3 of 4]-Minibatch[ 381- 390, 97.50%]: CrossEntropyWithSoftmax = 0.20658951 * 250; EvalClassificationError = 0.11600000 * 250; time = 0.0273s; samplesPerSecond = 9156.1
MPI Rank 1: 01/16/2018 19:05:30:  Epoch[ 3 of 4]-Minibatch[ 391- 400, 100.00%]: CrossEntropyWithSoftmax = 0.14583322 * 250; EvalClassificationError = 0.06800000 * 250; time = 0.0238s; samplesPerSecond = 10515.2
MPI Rank 1: 01/16/2018 19:05:30: Finished Epoch[ 3 of 4]: [Training] CrossEntropyWithSoftmax = 0.15948618 * 10000; EvalClassificationError = 0.07660000 * 10000; totalSamplesSeen = 30000; learningRatePerSample = 0.0080000004; epochTime=1.07032s
MPI Rank 1: 
MPI Rank 1: 01/16/2018 19:05:30: Starting Epoch 4: learning rate per sample = 0.008000  effective momentum = 0.900000  momentum as time constant = 237.3 samples
MPI Rank 1: 
MPI Rank 1: 01/16/2018 19:05:30: Starting minibatch loop, DataParallelSGD training (myRank = 1, numNodes = 4, numGradientBits = 32), distributed reading is ENABLED.
MPI Rank 1: 01/16/2018 19:05:30:  Epoch[ 4 of 4]-Minibatch[   1-  10, 2.50%]: CrossEntropyWithSoftmax = 0.12371232 * 250; EvalClassificationError = 0.06000000 * 250; time = 0.0275s; samplesPerSecond = 9077.9
MPI Rank 1: 01/16/2018 19:05:30:  Epoch[ 4 of 4]-Minibatch[  11-  20, 5.00%]: CrossEntropyWithSoftmax = 0.18070514 * 250; EvalClassificationError = 0.09600000 * 250; time = 0.0192s; samplesPerSecond = 13033.1
MPI Rank 1: 01/16/2018 19:05:30:  Epoch[ 4 of 4]-Minibatch[  21-  30, 7.50%]: CrossEntropyWithSoftmax = 0.14239731 * 250; EvalClassificationError = 0.07600000 * 250; time = 0.0274s; samplesPerSecond = 9137.4
MPI Rank 1: 01/16/2018 19:05:30:  Epoch[ 4 of 4]-Minibatch[  31-  40, 10.00%]: CrossEntropyWithSoftmax = 0.15630155 * 250; EvalClassificationError = 0.06400000 * 250; time = 0.0272s; samplesPerSecond = 9190.3
MPI Rank 1: 01/16/2018 19:05:30:  Epoch[ 4 of 4]-Minibatch[  41-  50, 12.50%]: CrossEntropyWithSoftmax = 0.16935525 * 250; EvalClassificationError = 0.09600000 * 250; time = 0.0246s; samplesPerSecond = 10148.5
MPI Rank 1: 01/16/2018 19:05:30:  Epoch[ 4 of 4]-Minibatch[  51-  60, 15.00%]: CrossEntropyWithSoftmax = 0.18198833 * 250; EvalClassificationError = 0.08000000 * 250; time = 0.0294s; samplesPerSecond = 8516.7
MPI Rank 1: 01/16/2018 19:05:30:  Epoch[ 4 of 4]-Minibatch[  61-  70, 17.50%]: CrossEntropyWithSoftmax = 0.14475946 * 250; EvalClassificationError = 0.07200000 * 250; time = 0.0236s; samplesPerSecond = 10602.7
MPI Rank 1: 01/16/2018 19:05:30:  Epoch[ 4 of 4]-Minibatch[  71-  80, 20.00%]: CrossEntropyWithSoftmax = 0.18021601 * 250; EvalClassificationError = 0.09600000 * 250; time = 0.0248s; samplesPerSecond = 10064.5
MPI Rank 1: 01/16/2018 19:05:30:  Epoch[ 4 of 4]-Minibatch[  81-  90, 22.50%]: CrossEntropyWithSoftmax = 0.15849308 * 250; EvalClassificationError = 0.07600000 * 250; time = 0.0298s; samplesPerSecond = 8386.7
MPI Rank 1: 01/16/2018 19:05:30:  Epoch[ 4 of 4]-Minibatch[  91- 100, 25.00%]: CrossEntropyWithSoftmax = 0.14474426 * 250; EvalClassificationError = 0.07200000 * 250; time = 0.0314s; samplesPerSecond = 7954.8
MPI Rank 1: 01/16/2018 19:05:30:  Epoch[ 4 of 4]-Minibatch[ 101- 110, 27.50%]: CrossEntropyWithSoftmax = 0.13362926 * 250; EvalClassificationError = 0.05200000 * 250; time = 0.0259s; samplesPerSecond = 9658.5
MPI Rank 1: 01/16/2018 19:05:30:  Epoch[ 4 of 4]-Minibatch[ 111- 120, 30.00%]: CrossEntropyWithSoftmax = 0.13708299 * 250; EvalClassificationError = 0.06400000 * 250; time = 0.0242s; samplesPerSecond = 10309.9
MPI Rank 1: 01/16/2018 19:05:30:  Epoch[ 4 of 4]-Minibatch[ 121- 130, 32.50%]: CrossEntropyWithSoftmax = 0.11569776 * 250; EvalClassificationError = 0.05600000 * 250; time = 0.0257s; samplesPerSecond = 9718.9
MPI Rank 1: 01/16/2018 19:05:30:  Epoch[ 4 of 4]-Minibatch[ 131- 140, 35.00%]: CrossEntropyWithSoftmax = 0.16892331 * 250; EvalClassificationError = 0.09200000 * 250; time = 0.0220s; samplesPerSecond = 11377.1
MPI Rank 1: 01/16/2018 19:05:30:  Epoch[ 4 of 4]-Minibatch[ 141- 150, 37.50%]: CrossEntropyWithSoftmax = 0.12752163 * 250; EvalClassificationError = 0.04800000 * 250; time = 0.0258s; samplesPerSecond = 9679.1
MPI Rank 1: 01/16/2018 19:05:30:  Epoch[ 4 of 4]-Minibatch[ 151- 160, 40.00%]: CrossEntropyWithSoftmax = 0.17100866 * 250; EvalClassificationError = 0.08400000 * 250; time = 0.0255s; samplesPerSecond = 9799.6
MPI Rank 1: 01/16/2018 19:05:30:  Epoch[ 4 of 4]-Minibatch[ 161- 170, 42.50%]: CrossEntropyWithSoftmax = 0.17660425 * 250; EvalClassificationError = 0.10000000 * 250; time = 0.0246s; samplesPerSecond = 10182.3
MPI Rank 1: 01/16/2018 19:05:30:  Epoch[ 4 of 4]-Minibatch[ 171- 180, 45.00%]: CrossEntropyWithSoftmax = 0.14105803 * 250; EvalClassificationError = 0.06400000 * 250; time = 0.0216s; samplesPerSecond = 11552.9
MPI Rank 1: 01/16/2018 19:05:30:  Epoch[ 4 of 4]-Minibatch[ 181- 190, 47.50%]: CrossEntropyWithSoftmax = 0.19333553 * 250; EvalClassificationError = 0.10000000 * 250; time = 0.0286s; samplesPerSecond = 8736.8
MPI Rank 1: 01/16/2018 19:05:30:  Epoch[ 4 of 4]-Minibatch[ 191- 200, 50.00%]: CrossEntropyWithSoftmax = 0.20859524 * 250; EvalClassificationError = 0.10000000 * 250; time = 0.0315s; samplesPerSecond = 7939.2
MPI Rank 1: 01/16/2018 19:05:30:  Epoch[ 4 of 4]-Minibatch[ 201- 210, 52.50%]: CrossEntropyWithSoftmax = 0.18499676 * 250; EvalClassificationError = 0.08000000 * 250; time = 0.0259s; samplesPerSecond = 9655.5
MPI Rank 1: 01/16/2018 19:05:30:  Epoch[ 4 of 4]-Minibatch[ 211- 220, 55.00%]: CrossEntropyWithSoftmax = 0.18152439 * 250; EvalClassificationError = 0.07600000 * 250; time = 0.0259s; samplesPerSecond = 9669.3
MPI Rank 1: 01/16/2018 19:05:30:  Epoch[ 4 of 4]-Minibatch[ 221- 230, 57.50%]: CrossEntropyWithSoftmax = 0.14037158 * 250; EvalClassificationError = 0.05600000 * 250; time = 0.0263s; samplesPerSecond = 9488.1
MPI Rank 1: 01/16/2018 19:05:30:  Epoch[ 4 of 4]-Minibatch[ 231- 240, 60.00%]: CrossEntropyWithSoftmax = 0.14866863 * 250; EvalClassificationError = 0.07600000 * 250; time = 0.0236s; samplesPerSecond = 10586.5
MPI Rank 1: 01/16/2018 19:05:31:  Epoch[ 4 of 4]-Minibatch[ 241- 250, 62.50%]: CrossEntropyWithSoftmax = 0.20347746 * 250; EvalClassificationError = 0.11200000 * 250; time = 0.0283s; samplesPerSecond = 8834.5
MPI Rank 1: 01/16/2018 19:05:31:  Epoch[ 4 of 4]-Minibatch[ 251- 260, 65.00%]: CrossEntropyWithSoftmax = 0.12815013 * 250; EvalClassificationError = 0.07200000 * 250; time = 0.0238s; samplesPerSecond = 10483.1
MPI Rank 1: 01/16/2018 19:05:31:  Epoch[ 4 of 4]-Minibatch[ 261- 270, 67.50%]: CrossEntropyWithSoftmax = 0.18672809 * 250; EvalClassificationError = 0.11600000 * 250; time = 0.0273s; samplesPerSecond = 9144.8
MPI Rank 1: 01/16/2018 19:05:31:  Epoch[ 4 of 4]-Minibatch[ 271- 280, 70.00%]: CrossEntropyWithSoftmax = 0.19552989 * 250; EvalClassificationError = 0.08400000 * 250; time = 0.0305s; samplesPerSecond = 8206.7
MPI Rank 1: 01/16/2018 19:05:31:  Epoch[ 4 of 4]-Minibatch[ 281- 290, 72.50%]: CrossEntropyWithSoftmax = 0.16452642 * 250; EvalClassificationError = 0.06800000 * 250; time = 0.0439s; samplesPerSecond = 5689.2
MPI Rank 1: 01/16/2018 19:05:31:  Epoch[ 4 of 4]-Minibatch[ 291- 300, 75.00%]: CrossEntropyWithSoftmax = 0.12461825 * 250; EvalClassificationError = 0.04400000 * 250; time = 0.0245s; samplesPerSecond = 10185.9
MPI Rank 1: 01/16/2018 19:05:31:  Epoch[ 4 of 4]-Minibatch[ 301- 310, 77.50%]: CrossEntropyWithSoftmax = 0.17285251 * 250; EvalClassificationError = 0.08400000 * 250; time = 0.0272s; samplesPerSecond = 9178.2
MPI Rank 1: 01/16/2018 19:05:31:  Epoch[ 4 of 4]-Minibatch[ 311- 320, 80.00%]: CrossEntropyWithSoftmax = 0.12253619 * 250; EvalClassificationError = 0.05200000 * 250; time = 0.0240s; samplesPerSecond = 10432.5
MPI Rank 1: 01/16/2018 19:05:31:  Epoch[ 4 of 4]-Minibatch[ 321- 330, 82.50%]: CrossEntropyWithSoftmax = 0.14723334 * 250; EvalClassificationError = 0.06000000 * 250; time = 0.0378s; samplesPerSecond = 6618.8
MPI Rank 1: 01/16/2018 19:05:31:  Epoch[ 4 of 4]-Minibatch[ 331- 340, 85.00%]: CrossEntropyWithSoftmax = 0.19789538 * 250; EvalClassificationError = 0.09200000 * 250; time = 0.0217s; samplesPerSecond = 11499.2
MPI Rank 1: 01/16/2018 19:05:31:  Epoch[ 4 of 4]-Minibatch[ 341- 350, 87.50%]: CrossEntropyWithSoftmax = 0.12575877 * 250; EvalClassificationError = 0.05200000 * 250; time = 0.0262s; samplesPerSecond = 9546.5
MPI Rank 1: 01/16/2018 19:05:31:  Epoch[ 4 of 4]-Minibatch[ 351- 360, 90.00%]: CrossEntropyWithSoftmax = 0.13745928 * 250; EvalClassificationError = 0.06000000 * 250; time = 0.0381s; samplesPerSecond = 6568.5
MPI Rank 1: 01/16/2018 19:05:31:  Epoch[ 4 of 4]-Minibatch[ 361- 370, 92.50%]: CrossEntropyWithSoftmax = 0.12839652 * 250; EvalClassificationError = 0.06000000 * 250; time = 0.0337s; samplesPerSecond = 7426.4
MPI Rank 1: 01/16/2018 19:05:31:  Epoch[ 4 of 4]-Minibatch[ 371- 380, 95.00%]: CrossEntropyWithSoftmax = 0.16647280 * 250; EvalClassificationError = 0.09600000 * 250; time = 0.0265s; samplesPerSecond = 9429.8
MPI Rank 1: 01/16/2018 19:05:31:  Epoch[ 4 of 4]-Minibatch[ 381- 390, 97.50%]: CrossEntropyWithSoftmax = 0.20679434 * 250; EvalClassificationError = 0.11600000 * 250; time = 0.0325s; samplesPerSecond = 7700.8
MPI Rank 1: 01/16/2018 19:05:31:  Epoch[ 4 of 4]-Minibatch[ 391- 400, 100.00%]: CrossEntropyWithSoftmax = 0.14585245 * 250; EvalClassificationError = 0.06400000 * 250; time = 0.0264s; samplesPerSecond = 9454.0
MPI Rank 1: 01/16/2018 19:05:31: Finished Epoch[ 4 of 4]: [Training] CrossEntropyWithSoftmax = 0.15914931 * 10000; EvalClassificationError = 0.07670000 * 10000; totalSamplesSeen = 40000; learningRatePerSample = 0.0080000004; epochTime=1.09849s
MPI Rank 1: 
MPI Rank 1: 01/16/2018 19:05:31: Action "train" complete.
MPI Rank 1: 
MPI Rank 1: 01/16/2018 19:05:31: __COMPLETED__
MPI Rank 2: CNTK 2.3.1+ (HEAD c4c2ce, Jan 16 2018 16:21:59) at 2018/01/16 19:05:24
MPI Rank 2: 
MPI Rank 2: /home/ubuntu/workspace/build/gpu/release/bin/cntk  configFile=/home/ubuntu/workspace/Tests/EndToEndTests/ParallelTraining/NoQuantization/SinglePrecision/../../SimpleMultiGPU.cntk  currentDirectory=/home/ubuntu/workspace/Tests/EndToEndTests/ParallelTraining/Data  RunDir=/tmp/cntk-test-20180116190516.17566/ParallelTraining/NoQuantization_SinglePrecision@release_gpu  DataDir=/home/ubuntu/workspace/Tests/EndToEndTests/ParallelTraining/Data  ConfigDir=/home/ubuntu/workspace/Tests/EndToEndTests/ParallelTraining/NoQuantization/SinglePrecision/../..  OutputDir=/tmp/cntk-test-20180116190516.17566/ParallelTraining/NoQuantization_SinglePrecision@release_gpu  DeviceId=0  timestamping=true  numCPUThreads=3  precision=float  SimpleMultiGPU=[SGD=[ParallelTrain=[DataParallelSGD=[gradientBits=32]]]]  stderr=/tmp/cntk-test-20180116190516.17566/ParallelTraining/NoQuantization_SinglePrecision@release_gpu/stderr
MPI Rank 2: 01/16/2018 19:05:25: -------------------------------------------------------------------
MPI Rank 2: 01/16/2018 19:05:25: Build info: 
MPI Rank 2: 
MPI Rank 2: 01/16/2018 19:05:25: 		Built time: Jan 16 2018 16:15:42
MPI Rank 2: 01/16/2018 19:05:25: 		Last modified date: Tue Jan 16 16:13:51 2018
MPI Rank 2: 01/16/2018 19:05:25: 		Build type: release
MPI Rank 2: 01/16/2018 19:05:25: 		Build target: GPU
MPI Rank 2: 01/16/2018 19:05:25: 		With ASGD: yes
MPI Rank 2: 01/16/2018 19:05:25: 		Math lib: mkl
MPI Rank 2: 01/16/2018 19:05:25: 		CUDA version: 9.0.0
MPI Rank 2: 01/16/2018 19:05:25: 		CUDNN version: 7.0.4
MPI Rank 2: 01/16/2018 19:05:25: 		Build Branch: HEAD
MPI Rank 2: 01/16/2018 19:05:25: 		Build SHA1: c4c2ce8c6e89b5c32e4d07523081283417bcfc6d
MPI Rank 2: 01/16/2018 19:05:25: 		MPI distribution: Open MPI
MPI Rank 2: 01/16/2018 19:05:25: 		MPI version: 1.10.7
MPI Rank 2: 01/16/2018 19:05:25: -------------------------------------------------------------------
MPI Rank 2: 01/16/2018 19:05:25: -------------------------------------------------------------------
MPI Rank 2: 01/16/2018 19:05:25: GPU info:
MPI Rank 2: 
MPI Rank 2: 01/16/2018 19:05:25: 		Device[0]: cores = 3072; computeCapability = 5.2; type = "Tesla M60"; total memory = 8123 MB; free memory = 7939 MB
MPI Rank 2: 01/16/2018 19:05:25: -------------------------------------------------------------------
MPI Rank 2: 01/16/2018 19:05:25: Using 3 CPU threads.
MPI Rank 2: 
MPI Rank 2: 01/16/2018 19:05:25: ##############################################################################
MPI Rank 2: 01/16/2018 19:05:25: #                                                                            #
MPI Rank 2: 01/16/2018 19:05:25: # SimpleMultiGPU command (train action)                                      #
MPI Rank 2: 01/16/2018 19:05:25: #                                                                            #
MPI Rank 2: 01/16/2018 19:05:25: ##############################################################################
MPI Rank 2: 
MPI Rank 2: 01/16/2018 19:05:25: 
MPI Rank 2: Creating virgin network.
MPI Rank 2: SimpleNetworkBuilder Using GPU 0
MPI Rank 2: 01/16/2018 19:05:26: 
MPI Rank 2: Model has 25 nodes. Using GPU 0.
MPI Rank 2: 
MPI Rank 2: 01/16/2018 19:05:26: Training criterion:   CrossEntropyWithSoftmax = CrossEntropyWithSoftmax
MPI Rank 2: 01/16/2018 19:05:26: Evaluation criterion: EvalClassificationError = ClassificationError
MPI Rank 2: 
MPI Rank 2: 
MPI Rank 2: Allocating matrices for forward and/or backward propagation.
MPI Rank 2: 
MPI Rank 2: Gradient Memory Aliasing: 4 are aliased.
MPI Rank 2: 	W1*H1 (gradient) reuses W1*H1+B1 (gradient)
MPI Rank 2: 	W2*H1 (gradient) reuses HLast (gradient)
MPI Rank 2: 
MPI Rank 2: Memory Sharing: Out of 40 matrices, 21 are shared as 5, and 19 are not shared.
MPI Rank 2: 
MPI Rank 2: Here are the ones that share memory:
MPI Rank 2: 	{ PosteriorProb : [2 x 1 x *]
MPI Rank 2: 	  ScaledLogLikelihood : [2 x 1 x *] }
MPI Rank 2: 	{ HLast : [2 x 1 x *] (gradient)
MPI Rank 2: 	  W0 : [50 x 2] (gradient)
MPI Rank 2: 	  W0*features+B0 : [50 x 1 x *] (gradient)
MPI Rank 2: 	  W1*H1 : [50 x 1 x *] (gradient)
MPI Rank 2: 	  W1*H1+B1 : [50 x 1 x *]
MPI Rank 2: 	  W1*H1+B1 : [50 x 1 x *] (gradient)
MPI Rank 2: 	  W2*H1 : [2 x 1 x *]
MPI Rank 2: 	  W2*H1 : [2 x 1 x *] (gradient) }
MPI Rank 2: 	{ B0 : [50 x 1] (gradient)
MPI Rank 2: 	  H1 : [50 x 1 x *] }
MPI Rank 2: 	{ H2 : [50 x 1 x *]
MPI Rank 2: 	  W0*features+B0 : [50 x 1 x *]
MPI Rank 2: 	  W1 : [50 x 50] (gradient)
MPI Rank 2: 	  W1*H1 : [50 x 1 x *] }
MPI Rank 2: 	{ H1 : [50 x 1 x *] (gradient)
MPI Rank 2: 	  H2 : [50 x 1 x *] (gradient)
MPI Rank 2: 	  HLast : [2 x 1 x *]
MPI Rank 2: 	  W0*features : [50 x *]
MPI Rank 2: 	  W0*features : [50 x *] (gradient) }
MPI Rank 2: 
MPI Rank 2: Here are the ones that don't share memory:
MPI Rank 2: 	{features : [2 x *]}
MPI Rank 2: 	{MeanOfFeatures : [2]}
MPI Rank 2: 	{InvStdOfFeatures : [2]}
MPI Rank 2: 	{W0 : [50 x 2]}
MPI Rank 2: 	{B2 : [2 x 1]}
MPI Rank 2: 	{labels : [2 x *]}
MPI Rank 2: 	{Prior : [2]}
MPI Rank 2: 	{B0 : [50 x 1]}
MPI Rank 2: 	{W1 : [50 x 50]}
MPI Rank 2: 	{B1 : [50 x 1]}
MPI Rank 2: 	{W2 : [2 x 50]}
MPI Rank 2: 	{EvalClassificationError : [1]}
MPI Rank 2: 	{CrossEntropyWithSoftmax : [1]}
MPI Rank 2: 	{W2 : [2 x 50] (gradient)}
MPI Rank 2: 	{LogOfPrior : [2]}
MPI Rank 2: 	{MVNormalizedFeatures : [2 x *]}
MPI Rank 2: 	{B1 : [50 x 1] (gradient)}
MPI Rank 2: 	{B2 : [2 x 1] (gradient)}
MPI Rank 2: 	{CrossEntropyWithSoftmax : [1] (gradient)}
MPI Rank 2: 
MPI Rank 2: 
MPI Rank 2: 01/16/2018 19:05:26: Training 2802 parameters in 6 out of 6 parameter tensors and 15 nodes with gradient:
MPI Rank 2: 
MPI Rank 2: 01/16/2018 19:05:26: 	Node 'B0' (LearnableParameter operation) : [50 x 1]
MPI Rank 2: 01/16/2018 19:05:26: 	Node 'B1' (LearnableParameter operation) : [50 x 1]
MPI Rank 2: 01/16/2018 19:05:26: 	Node 'B2' (LearnableParameter operation) : [2 x 1]
MPI Rank 2: 01/16/2018 19:05:26: 	Node 'W0' (LearnableParameter operation) : [50 x 2]
MPI Rank 2: 01/16/2018 19:05:26: 	Node 'W1' (LearnableParameter operation) : [50 x 50]
MPI Rank 2: 01/16/2018 19:05:26: 	Node 'W2' (LearnableParameter operation) : [2 x 50]
MPI Rank 2: 
MPI Rank 2: Initializing dataParallelSGD with FP32 aggregation.
MPI Rank 2: NcclComm: disabled, same device used by more than one rank
MPI Rank 2: 
MPI Rank 2: 01/16/2018 19:05:26: Precomputing --> 3 PreCompute nodes found.
MPI Rank 2: 
MPI Rank 2: 01/16/2018 19:05:26: 	MeanOfFeatures = Mean()
MPI Rank 2: 01/16/2018 19:05:26: 	InvStdOfFeatures = InvStdDev()
MPI Rank 2: 01/16/2018 19:05:26: 	Prior = Mean()
MPI Rank 2: 
MPI Rank 2: 01/16/2018 19:05:26: Precomputing --> Completed.
MPI Rank 2: 
MPI Rank 2: 
MPI Rank 2: 01/16/2018 19:05:27: Starting Epoch 1: learning rate per sample = 0.020000  effective momentum = 0.900000  momentum as time constant = 237.3 samples
MPI Rank 2: 
MPI Rank 2: 01/16/2018 19:05:27: Starting minibatch loop, DataParallelSGD training (myRank = 2, numNodes = 4, numGradientBits = 32), distributed reading is ENABLED.
MPI Rank 2: 01/16/2018 19:05:27:  Epoch[ 1 of 4]-Minibatch[   1-  10]: CrossEntropyWithSoftmax = 0.70007977 * 250; EvalClassificationError = 0.52400000 * 250; time = 0.0279s; samplesPerSecond = 8962.5
MPI Rank 2: 01/16/2018 19:05:27:  Epoch[ 1 of 4]-Minibatch[  11-  20]: CrossEntropyWithSoftmax = 0.71514543 * 250; EvalClassificationError = 0.52000000 * 250; time = 0.0427s; samplesPerSecond = 5859.0
MPI Rank 2: 01/16/2018 19:05:27:  Epoch[ 1 of 4]-Minibatch[  21-  30]: CrossEntropyWithSoftmax = 0.72945593 * 250; EvalClassificationError = 0.47600000 * 250; time = 0.0312s; samplesPerSecond = 8002.3
MPI Rank 2: 01/16/2018 19:05:27:  Epoch[ 1 of 4]-Minibatch[  31-  40]: CrossEntropyWithSoftmax = 0.70079058 * 250; EvalClassificationError = 0.52400000 * 250; time = 0.0282s; samplesPerSecond = 8861.4
MPI Rank 2: 01/16/2018 19:05:27:  Epoch[ 1 of 4]-Minibatch[  41-  50]: CrossEntropyWithSoftmax = 0.70605617 * 250; EvalClassificationError = 0.54000000 * 250; time = 0.0220s; samplesPerSecond = 11356.4
MPI Rank 2: 01/16/2018 19:05:27:  Epoch[ 1 of 4]-Minibatch[  51-  60]: CrossEntropyWithSoftmax = 0.71572398 * 250; EvalClassificationError = 0.47600000 * 250; time = 0.0279s; samplesPerSecond = 8960.4
MPI Rank 2: 01/16/2018 19:05:27:  Epoch[ 1 of 4]-Minibatch[  61-  70]: CrossEntropyWithSoftmax = 0.72149850 * 250; EvalClassificationError = 0.48000000 * 250; time = 0.0278s; samplesPerSecond = 8980.3
MPI Rank 2: 01/16/2018 19:05:27:  Epoch[ 1 of 4]-Minibatch[  71-  80]: CrossEntropyWithSoftmax = 0.79845604 * 250; EvalClassificationError = 0.47600000 * 250; time = 0.0299s; samplesPerSecond = 8356.7
MPI Rank 2: 01/16/2018 19:05:27:  Epoch[ 1 of 4]-Minibatch[  81-  90]: CrossEntropyWithSoftmax = 0.69665185 * 250; EvalClassificationError = 0.46800000 * 250; time = 0.0258s; samplesPerSecond = 9672.5
MPI Rank 2: 01/16/2018 19:05:27:  Epoch[ 1 of 4]-Minibatch[  91- 100]: CrossEntropyWithSoftmax = 0.70723325 * 250; EvalClassificationError = 0.49200000 * 250; time = 0.0276s; samplesPerSecond = 9062.8
MPI Rank 2: 01/16/2018 19:05:27:  Epoch[ 1 of 4]-Minibatch[ 101- 110]: CrossEntropyWithSoftmax = 0.71420345 * 250; EvalClassificationError = 0.55200000 * 250; time = 0.0316s; samplesPerSecond = 7909.2
MPI Rank 2: 01/16/2018 19:05:27:  Epoch[ 1 of 4]-Minibatch[ 111- 120]: CrossEntropyWithSoftmax = 0.69535258 * 250; EvalClassificationError = 0.43600000 * 250; time = 0.0304s; samplesPerSecond = 8213.3
MPI Rank 2: 01/16/2018 19:05:27:  Epoch[ 1 of 4]-Minibatch[ 121- 130]: CrossEntropyWithSoftmax = 0.70078532 * 250; EvalClassificationError = 0.44000000 * 250; time = 0.0281s; samplesPerSecond = 8881.5
MPI Rank 2: 01/16/2018 19:05:27:  Epoch[ 1 of 4]-Minibatch[ 131- 140]: CrossEntropyWithSoftmax = 0.71857914 * 250; EvalClassificationError = 0.54800000 * 250; time = 0.0298s; samplesPerSecond = 8386.2
MPI Rank 2: 01/16/2018 19:05:27:  Epoch[ 1 of 4]-Minibatch[ 141- 150]: CrossEntropyWithSoftmax = 0.72088357 * 250; EvalClassificationError = 0.48800000 * 250; time = 0.0223s; samplesPerSecond = 11193.0
MPI Rank 2: 01/16/2018 19:05:27:  Epoch[ 1 of 4]-Minibatch[ 151- 160]: CrossEntropyWithSoftmax = 0.71798840 * 250; EvalClassificationError = 0.55200000 * 250; time = 0.0237s; samplesPerSecond = 10543.0
MPI Rank 2: 01/16/2018 19:05:27:  Epoch[ 1 of 4]-Minibatch[ 161- 170]: CrossEntropyWithSoftmax = 0.74162164 * 250; EvalClassificationError = 0.50000000 * 250; time = 0.0256s; samplesPerSecond = 9767.5
MPI Rank 2: 01/16/2018 19:05:27:  Epoch[ 1 of 4]-Minibatch[ 171- 180]: CrossEntropyWithSoftmax = 0.71835127 * 250; EvalClassificationError = 0.51600000 * 250; time = 0.0228s; samplesPerSecond = 10987.1
MPI Rank 2: 01/16/2018 19:05:27:  Epoch[ 1 of 4]-Minibatch[ 181- 190]: CrossEntropyWithSoftmax = 0.71529461 * 250; EvalClassificationError = 0.48400000 * 250; time = 0.0253s; samplesPerSecond = 9881.5
MPI Rank 2: 01/16/2018 19:05:27:  Epoch[ 1 of 4]-Minibatch[ 191- 200]: CrossEntropyWithSoftmax = 0.71727656 * 250; EvalClassificationError = 0.53200000 * 250; time = 0.0279s; samplesPerSecond = 8964.6
MPI Rank 2: 01/16/2018 19:05:27:  Epoch[ 1 of 4]-Minibatch[ 201- 210]: CrossEntropyWithSoftmax = 0.71745516 * 250; EvalClassificationError = 0.50400000 * 250; time = 0.0218s; samplesPerSecond = 11464.0
MPI Rank 2: 01/16/2018 19:05:27:  Epoch[ 1 of 4]-Minibatch[ 211- 220]: CrossEntropyWithSoftmax = 0.72088398 * 250; EvalClassificationError = 0.50000000 * 250; time = 0.0245s; samplesPerSecond = 10222.1
MPI Rank 2: 01/16/2018 19:05:27:  Epoch[ 1 of 4]-Minibatch[ 221- 230]: CrossEntropyWithSoftmax = 0.72006809 * 250; EvalClassificationError = 0.50800000 * 250; time = 0.0293s; samplesPerSecond = 8544.4
MPI Rank 2: 01/16/2018 19:05:27:  Epoch[ 1 of 4]-Minibatch[ 231- 240]: CrossEntropyWithSoftmax = 0.71275468 * 250; EvalClassificationError = 0.51200000 * 250; time = 0.0286s; samplesPerSecond = 8743.0
MPI Rank 2: 01/16/2018 19:05:27:  Epoch[ 1 of 4]-Minibatch[ 241- 250]: CrossEntropyWithSoftmax = 0.69644781 * 250; EvalClassificationError = 0.50400000 * 250; time = 0.0277s; samplesPerSecond = 9031.4
MPI Rank 2: 01/16/2018 19:05:27:  Epoch[ 1 of 4]-Minibatch[ 251- 260]: CrossEntropyWithSoftmax = 0.70129698 * 250; EvalClassificationError = 0.51200000 * 250; time = 0.0275s; samplesPerSecond = 9105.3
MPI Rank 2: 01/16/2018 19:05:27:  Epoch[ 1 of 4]-Minibatch[ 261- 270]: CrossEntropyWithSoftmax = 0.70768095 * 250; EvalClassificationError = 0.54400000 * 250; time = 0.0281s; samplesPerSecond = 8901.3
MPI Rank 2: 01/16/2018 19:05:27:  Epoch[ 1 of 4]-Minibatch[ 271- 280]: CrossEntropyWithSoftmax = 0.69744379 * 250; EvalClassificationError = 0.52800000 * 250; time = 0.0227s; samplesPerSecond = 11012.5
MPI Rank 2: 01/16/2018 19:05:27:  Epoch[ 1 of 4]-Minibatch[ 281- 290]: CrossEntropyWithSoftmax = 0.69266187 * 250; EvalClassificationError = 0.44800000 * 250; time = 0.0329s; samplesPerSecond = 7600.3
MPI Rank 2: 01/16/2018 19:05:27:  Epoch[ 1 of 4]-Minibatch[ 291- 300]: CrossEntropyWithSoftmax = 0.69347266 * 250; EvalClassificationError = 0.49600000 * 250; time = 0.0286s; samplesPerSecond = 8735.2
MPI Rank 2: 01/16/2018 19:05:27:  Epoch[ 1 of 4]-Minibatch[ 301- 310]: CrossEntropyWithSoftmax = 0.69257409 * 250; EvalClassificationError = 0.54000000 * 250; time = 0.0276s; samplesPerSecond = 9065.9
MPI Rank 2: 01/16/2018 19:05:27:  Epoch[ 1 of 4]-Minibatch[ 311- 320]: CrossEntropyWithSoftmax = 0.68625741 * 250; EvalClassificationError = 0.38000000 * 250; time = 0.0237s; samplesPerSecond = 10553.1
MPI Rank 2: 01/16/2018 19:05:28:  Epoch[ 1 of 4]-Minibatch[ 321- 330]: CrossEntropyWithSoftmax = 0.69064011 * 250; EvalClassificationError = 0.46800000 * 250; time = 0.0260s; samplesPerSecond = 9603.5
MPI Rank 2: 01/16/2018 19:05:28:  Epoch[ 1 of 4]-Minibatch[ 331- 340]: CrossEntropyWithSoftmax = 0.70192153 * 250; EvalClassificationError = 0.46000000 * 250; time = 0.0243s; samplesPerSecond = 10270.1
MPI Rank 2: 01/16/2018 19:05:28:  Epoch[ 1 of 4]-Minibatch[ 341- 350]: CrossEntropyWithSoftmax = 0.69058912 * 250; EvalClassificationError = 0.52000000 * 250; time = 0.0286s; samplesPerSecond = 8735.7
MPI Rank 2: 01/16/2018 19:05:28:  Epoch[ 1 of 4]-Minibatch[ 351- 360]: CrossEntropyWithSoftmax = 0.67041492 * 250; EvalClassificationError = 0.39200000 * 250; time = 0.0370s; samplesPerSecond = 6749.4
MPI Rank 2: 01/16/2018 19:05:28:  Epoch[ 1 of 4]-Minibatch[ 361- 370]: CrossEntropyWithSoftmax = 0.65913973 * 250; EvalClassificationError = 0.35600000 * 250; time = 0.0222s; samplesPerSecond = 11258.3
MPI Rank 2: 01/16/2018 19:05:28:  Epoch[ 1 of 4]-Minibatch[ 371- 380]: CrossEntropyWithSoftmax = 0.63919877 * 250; EvalClassificationError = 0.36400000 * 250; time = 0.0216s; samplesPerSecond = 11600.7
MPI Rank 2: 01/16/2018 19:05:28:  Epoch[ 1 of 4]-Minibatch[ 381- 390]: CrossEntropyWithSoftmax = 0.61293885 * 250; EvalClassificationError = 0.19200000 * 250; time = 0.0301s; samplesPerSecond = 8316.8
MPI Rank 2: 01/16/2018 19:05:28:  Epoch[ 1 of 4]-Minibatch[ 391- 400]: CrossEntropyWithSoftmax = 0.55255355 * 250; EvalClassificationError = 0.18800000 * 250; time = 0.0300s; samplesPerSecond = 8338.0
MPI Rank 2: 01/16/2018 19:05:28: Finished Epoch[ 1 of 4]: [Training] CrossEntropyWithSoftmax = 0.70019555 * 10000; EvalClassificationError = 0.47350000 * 10000; totalSamplesSeen = 10000; learningRatePerSample = 0.02; epochTime=1.10545s
MPI Rank 2: 
MPI Rank 2: 01/16/2018 19:05:28: Starting Epoch 2: learning rate per sample = 0.008000  effective momentum = 0.900000  momentum as time constant = 237.3 samples
MPI Rank 2: 
MPI Rank 2: 01/16/2018 19:05:28: Starting minibatch loop, DataParallelSGD training (myRank = 2, numNodes = 4, numGradientBits = 32), distributed reading is ENABLED.
MPI Rank 2: 01/16/2018 19:05:28:  Epoch[ 2 of 4]-Minibatch[   1-  10, 2.50%]: CrossEntropyWithSoftmax = 0.50774630 * 250; EvalClassificationError = 0.24000000 * 250; time = 0.0252s; samplesPerSecond = 9926.3
MPI Rank 2: 01/16/2018 19:05:28:  Epoch[ 2 of 4]-Minibatch[  11-  20, 5.00%]: CrossEntropyWithSoftmax = 0.43388933 * 250; EvalClassificationError = 0.11200000 * 250; time = 0.0266s; samplesPerSecond = 9396.3
MPI Rank 2: 01/16/2018 19:05:28:  Epoch[ 2 of 4]-Minibatch[  21-  30, 7.50%]: CrossEntropyWithSoftmax = 0.36674877 * 250; EvalClassificationError = 0.08800000 * 250; time = 0.0244s; samplesPerSecond = 10232.4
MPI Rank 2: 01/16/2018 19:05:28:  Epoch[ 2 of 4]-Minibatch[  31-  40, 10.00%]: CrossEntropyWithSoftmax = 0.33768770 * 250; EvalClassificationError = 0.06800000 * 250; time = 0.0315s; samplesPerSecond = 7932.2
MPI Rank 2: 01/16/2018 19:05:28:  Epoch[ 2 of 4]-Minibatch[  41-  50, 12.50%]: CrossEntropyWithSoftmax = 0.30320952 * 250; EvalClassificationError = 0.08000000 * 250; time = 0.0249s; samplesPerSecond = 10022.1
MPI Rank 2: 01/16/2018 19:05:28:  Epoch[ 2 of 4]-Minibatch[  51-  60, 15.00%]: CrossEntropyWithSoftmax = 0.29576047 * 250; EvalClassificationError = 0.09200000 * 250; time = 0.0312s; samplesPerSecond = 8017.2
MPI Rank 2: 01/16/2018 19:05:28:  Epoch[ 2 of 4]-Minibatch[  61-  70, 17.50%]: CrossEntropyWithSoftmax = 0.24924496 * 250; EvalClassificationError = 0.07600000 * 250; time = 0.0275s; samplesPerSecond = 9104.4
MPI Rank 2: 01/16/2018 19:05:28:  Epoch[ 2 of 4]-Minibatch[  71-  80, 20.00%]: CrossEntropyWithSoftmax = 0.24632418 * 250; EvalClassificationError = 0.10400000 * 250; time = 0.0198s; samplesPerSecond = 12596.6
MPI Rank 2: 01/16/2018 19:05:28:  Epoch[ 2 of 4]-Minibatch[  81-  90, 22.50%]: CrossEntropyWithSoftmax = 0.20943161 * 250; EvalClassificationError = 0.08400000 * 250; time = 0.0240s; samplesPerSecond = 10427.5
MPI Rank 2: 01/16/2018 19:05:28:  Epoch[ 2 of 4]-Minibatch[  91- 100, 25.00%]: CrossEntropyWithSoftmax = 0.19115999 * 250; EvalClassificationError = 0.07200000 * 250; time = 0.0258s; samplesPerSecond = 9697.0
MPI Rank 2: 01/16/2018 19:05:28:  Epoch[ 2 of 4]-Minibatch[ 101- 110, 27.50%]: CrossEntropyWithSoftmax = 0.17923233 * 250; EvalClassificationError = 0.05200000 * 250; time = 0.0235s; samplesPerSecond = 10644.6
MPI Rank 2: 01/16/2018 19:05:28:  Epoch[ 2 of 4]-Minibatch[ 111- 120, 30.00%]: CrossEntropyWithSoftmax = 0.17075425 * 250; EvalClassificationError = 0.06400000 * 250; time = 0.0285s; samplesPerSecond = 8763.2
MPI Rank 2: 01/16/2018 19:05:28:  Epoch[ 2 of 4]-Minibatch[ 121- 130, 32.50%]: CrossEntropyWithSoftmax = 0.14442373 * 250; EvalClassificationError = 0.05600000 * 250; time = 0.0240s; samplesPerSecond = 10414.6
MPI Rank 2: 01/16/2018 19:05:28:  Epoch[ 2 of 4]-Minibatch[ 131- 140, 35.00%]: CrossEntropyWithSoftmax = 0.17753820 * 250; EvalClassificationError = 0.08800000 * 250; time = 0.0238s; samplesPerSecond = 10496.4
MPI Rank 2: 01/16/2018 19:05:28:  Epoch[ 2 of 4]-Minibatch[ 141- 150, 37.50%]: CrossEntropyWithSoftmax = 0.15087857 * 250; EvalClassificationError = 0.05200000 * 250; time = 0.0299s; samplesPerSecond = 8358.9
MPI Rank 2: 01/16/2018 19:05:28:  Epoch[ 2 of 4]-Minibatch[ 151- 160, 40.00%]: CrossEntropyWithSoftmax = 0.19253022 * 250; EvalClassificationError = 0.08000000 * 250; time = 0.0320s; samplesPerSecond = 7817.3
MPI Rank 2: 01/16/2018 19:05:28:  Epoch[ 2 of 4]-Minibatch[ 161- 170, 42.50%]: CrossEntropyWithSoftmax = 0.17830684 * 250; EvalClassificationError = 0.08800000 * 250; time = 0.0353s; samplesPerSecond = 7078.0
MPI Rank 2: 01/16/2018 19:05:28:  Epoch[ 2 of 4]-Minibatch[ 171- 180, 45.00%]: CrossEntropyWithSoftmax = 0.15115429 * 250; EvalClassificationError = 0.06400000 * 250; time = 0.0223s; samplesPerSecond = 11223.5
MPI Rank 2: 01/16/2018 19:05:28:  Epoch[ 2 of 4]-Minibatch[ 181- 190, 47.50%]: CrossEntropyWithSoftmax = 0.19135969 * 250; EvalClassificationError = 0.10000000 * 250; time = 0.0201s; samplesPerSecond = 12420.6
MPI Rank 2: 01/16/2018 19:05:28:  Epoch[ 2 of 4]-Minibatch[ 191- 200, 50.00%]: CrossEntropyWithSoftmax = 0.21491485 * 250; EvalClassificationError = 0.10400000 * 250; time = 0.0284s; samplesPerSecond = 8796.5
MPI Rank 2: 01/16/2018 19:05:28:  Epoch[ 2 of 4]-Minibatch[ 201- 210, 52.50%]: CrossEntropyWithSoftmax = 0.18682346 * 250; EvalClassificationError = 0.08400000 * 250; time = 0.0235s; samplesPerSecond = 10618.7
MPI Rank 2: 01/16/2018 19:05:28:  Epoch[ 2 of 4]-Minibatch[ 211- 220, 55.00%]: CrossEntropyWithSoftmax = 0.18483206 * 250; EvalClassificationError = 0.07600000 * 250; time = 0.0218s; samplesPerSecond = 11466.7
MPI Rank 2: 01/16/2018 19:05:28:  Epoch[ 2 of 4]-Minibatch[ 221- 230, 57.50%]: CrossEntropyWithSoftmax = 0.14684504 * 250; EvalClassificationError = 0.06000000 * 250; time = 0.0300s; samplesPerSecond = 8329.6
MPI Rank 2: 01/16/2018 19:05:28:  Epoch[ 2 of 4]-Minibatch[ 231- 240, 60.00%]: CrossEntropyWithSoftmax = 0.15322117 * 250; EvalClassificationError = 0.07600000 * 250; time = 0.0301s; samplesPerSecond = 8308.0
MPI Rank 2: 01/16/2018 19:05:28:  Epoch[ 2 of 4]-Minibatch[ 241- 250, 62.50%]: CrossEntropyWithSoftmax = 0.19882571 * 250; EvalClassificationError = 0.11600000 * 250; time = 0.0316s; samplesPerSecond = 7923.2
MPI Rank 2: 01/16/2018 19:05:28:  Epoch[ 2 of 4]-Minibatch[ 251- 260, 65.00%]: CrossEntropyWithSoftmax = 0.13683833 * 250; EvalClassificationError = 0.07200000 * 250; time = 0.0259s; samplesPerSecond = 9653.8
MPI Rank 2: 01/16/2018 19:05:28:  Epoch[ 2 of 4]-Minibatch[ 261- 270, 67.50%]: CrossEntropyWithSoftmax = 0.18621189 * 250; EvalClassificationError = 0.11600000 * 250; time = 0.0234s; samplesPerSecond = 10676.5
MPI Rank 2: 01/16/2018 19:05:28:  Epoch[ 2 of 4]-Minibatch[ 271- 280, 70.00%]: CrossEntropyWithSoftmax = 0.19408050 * 250; EvalClassificationError = 0.08000000 * 250; time = 0.0326s; samplesPerSecond = 7666.9
MPI Rank 2: 01/16/2018 19:05:28:  Epoch[ 2 of 4]-Minibatch[ 281- 290, 72.50%]: CrossEntropyWithSoftmax = 0.17298137 * 250; EvalClassificationError = 0.07200000 * 250; time = 0.0264s; samplesPerSecond = 9453.6
MPI Rank 2: 01/16/2018 19:05:29:  Epoch[ 2 of 4]-Minibatch[ 291- 300, 75.00%]: CrossEntropyWithSoftmax = 0.13265130 * 250; EvalClassificationError = 0.04800000 * 250; time = 0.0287s; samplesPerSecond = 8719.2
MPI Rank 2: 01/16/2018 19:05:29:  Epoch[ 2 of 4]-Minibatch[ 301- 310, 77.50%]: CrossEntropyWithSoftmax = 0.17627179 * 250; EvalClassificationError = 0.09200000 * 250; time = 0.0315s; samplesPerSecond = 7940.6
MPI Rank 2: 01/16/2018 19:05:29:  Epoch[ 2 of 4]-Minibatch[ 311- 320, 80.00%]: CrossEntropyWithSoftmax = 0.12734628 * 250; EvalClassificationError = 0.05600000 * 250; time = 0.0261s; samplesPerSecond = 9585.2
MPI Rank 2: 01/16/2018 19:05:29:  Epoch[ 2 of 4]-Minibatch[ 321- 330, 82.50%]: CrossEntropyWithSoftmax = 0.15108452 * 250; EvalClassificationError = 0.05600000 * 250; time = 0.0253s; samplesPerSecond = 9900.7
MPI Rank 2: 01/16/2018 19:05:29:  Epoch[ 2 of 4]-Minibatch[ 331- 340, 85.00%]: CrossEntropyWithSoftmax = 0.19729184 * 250; EvalClassificationError = 0.09200000 * 250; time = 0.0261s; samplesPerSecond = 9561.4
MPI Rank 2: 01/16/2018 19:05:29:  Epoch[ 2 of 4]-Minibatch[ 341- 350, 87.50%]: CrossEntropyWithSoftmax = 0.12857333 * 250; EvalClassificationError = 0.05200000 * 250; time = 0.0412s; samplesPerSecond = 6065.1
MPI Rank 2: 01/16/2018 19:05:29:  Epoch[ 2 of 4]-Minibatch[ 351- 360, 90.00%]: CrossEntropyWithSoftmax = 0.13867804 * 250; EvalClassificationError = 0.06000000 * 250; time = 0.0261s; samplesPerSecond = 9571.6
MPI Rank 2: 01/16/2018 19:05:29:  Epoch[ 2 of 4]-Minibatch[ 361- 370, 92.50%]: CrossEntropyWithSoftmax = 0.12786050 * 250; EvalClassificationError = 0.06000000 * 250; time = 0.0241s; samplesPerSecond = 10385.1
MPI Rank 2: 01/16/2018 19:05:29:  Epoch[ 2 of 4]-Minibatch[ 371- 380, 95.00%]: CrossEntropyWithSoftmax = 0.16643303 * 250; EvalClassificationError = 0.09600000 * 250; time = 0.0319s; samplesPerSecond = 7831.7
MPI Rank 2: 01/16/2018 19:05:29:  Epoch[ 2 of 4]-Minibatch[ 381- 390, 97.50%]: CrossEntropyWithSoftmax = 0.20440407 * 250; EvalClassificationError = 0.11200000 * 250; time = 0.0259s; samplesPerSecond = 9653.0
MPI Rank 2: 01/16/2018 19:05:29:  Epoch[ 2 of 4]-Minibatch[ 391- 400, 100.00%]: CrossEntropyWithSoftmax = 0.14566238 * 250; EvalClassificationError = 0.06800000 * 250; time = 0.0204s; samplesPerSecond = 12227.6
MPI Rank 2: 01/16/2018 19:05:29: Finished Epoch[ 2 of 4]: [Training] CrossEntropyWithSoftmax = 0.20373031 * 10000; EvalClassificationError = 0.08270000 * 10000; totalSamplesSeen = 20000; learningRatePerSample = 0.0080000004; epochTime=1.08559s
MPI Rank 2: 
MPI Rank 2: 01/16/2018 19:05:29: Starting Epoch 3: learning rate per sample = 0.008000  effective momentum = 0.900000  momentum as time constant = 237.3 samples
MPI Rank 2: 
MPI Rank 2: 01/16/2018 19:05:29: Starting minibatch loop, DataParallelSGD training (myRank = 2, numNodes = 4, numGradientBits = 32), distributed reading is ENABLED.
MPI Rank 2: 01/16/2018 19:05:29:  Epoch[ 3 of 4]-Minibatch[   1-  10, 2.50%]: CrossEntropyWithSoftmax = 0.12590085 * 250; EvalClassificationError = 0.05600000 * 250; time = 0.0250s; samplesPerSecond = 9996.8
MPI Rank 2: 01/16/2018 19:05:29:  Epoch[ 3 of 4]-Minibatch[  11-  20, 5.00%]: CrossEntropyWithSoftmax = 0.17780229 * 250; EvalClassificationError = 0.09600000 * 250; time = 0.0223s; samplesPerSecond = 11187.0
MPI Rank 2: 01/16/2018 19:05:29:  Epoch[ 3 of 4]-Minibatch[  21-  30, 7.50%]: CrossEntropyWithSoftmax = 0.14417637 * 250; EvalClassificationError = 0.07600000 * 250; time = 0.0305s; samplesPerSecond = 8190.3
MPI Rank 2: 01/16/2018 19:05:29:  Epoch[ 3 of 4]-Minibatch[  31-  40, 10.00%]: CrossEntropyWithSoftmax = 0.15796897 * 250; EvalClassificationError = 0.06400000 * 250; time = 0.0236s; samplesPerSecond = 10615.1
MPI Rank 2: 01/16/2018 19:05:29:  Epoch[ 3 of 4]-Minibatch[  41-  50, 12.50%]: CrossEntropyWithSoftmax = 0.17002999 * 250; EvalClassificationError = 0.10000000 * 250; time = 0.0265s; samplesPerSecond = 9446.0
MPI Rank 2: 01/16/2018 19:05:29:  Epoch[ 3 of 4]-Minibatch[  51-  60, 15.00%]: CrossEntropyWithSoftmax = 0.18262114 * 250; EvalClassificationError = 0.07600000 * 250; time = 0.0292s; samplesPerSecond = 8576.0
MPI Rank 2: 01/16/2018 19:05:29:  Epoch[ 3 of 4]-Minibatch[  61-  70, 17.50%]: CrossEntropyWithSoftmax = 0.14643695 * 250; EvalClassificationError = 0.07200000 * 250; time = 0.0214s; samplesPerSecond = 11688.3
MPI Rank 2: 01/16/2018 19:05:29:  Epoch[ 3 of 4]-Minibatch[  71-  80, 20.00%]: CrossEntropyWithSoftmax = 0.18030529 * 250; EvalClassificationError = 0.09600000 * 250; time = 0.0262s; samplesPerSecond = 9530.6
MPI Rank 2: 01/16/2018 19:05:29:  Epoch[ 3 of 4]-Minibatch[  81-  90, 22.50%]: CrossEntropyWithSoftmax = 0.15846151 * 250; EvalClassificationError = 0.07200000 * 250; time = 0.0235s; samplesPerSecond = 10639.2
MPI Rank 2: 01/16/2018 19:05:29:  Epoch[ 3 of 4]-Minibatch[  91- 100, 25.00%]: CrossEntropyWithSoftmax = 0.14486534 * 250; EvalClassificationError = 0.07200000 * 250; time = 0.0283s; samplesPerSecond = 8823.8
MPI Rank 2: 01/16/2018 19:05:29:  Epoch[ 3 of 4]-Minibatch[ 101- 110, 27.50%]: CrossEntropyWithSoftmax = 0.13469094 * 250; EvalClassificationError = 0.05200000 * 250; time = 0.0238s; samplesPerSecond = 10500.8
MPI Rank 2: 01/16/2018 19:05:29:  Epoch[ 3 of 4]-Minibatch[ 111- 120, 30.00%]: CrossEntropyWithSoftmax = 0.13720020 * 250; EvalClassificationError = 0.06400000 * 250; time = 0.0419s; samplesPerSecond = 5966.2
MPI Rank 2: 01/16/2018 19:05:29:  Epoch[ 3 of 4]-Minibatch[ 121- 130, 32.50%]: CrossEntropyWithSoftmax = 0.11641296 * 250; EvalClassificationError = 0.05600000 * 250; time = 0.0319s; samplesPerSecond = 7839.7
MPI Rank 2: 01/16/2018 19:05:29:  Epoch[ 3 of 4]-Minibatch[ 131- 140, 35.00%]: CrossEntropyWithSoftmax = 0.16786646 * 250; EvalClassificationError = 0.08800000 * 250; time = 0.0240s; samplesPerSecond = 10421.3
MPI Rank 2: 01/16/2018 19:05:29:  Epoch[ 3 of 4]-Minibatch[ 141- 150, 37.50%]: CrossEntropyWithSoftmax = 0.12811514 * 250; EvalClassificationError = 0.04800000 * 250; time = 0.0218s; samplesPerSecond = 11489.2
MPI Rank 2: 01/16/2018 19:05:29:  Epoch[ 3 of 4]-Minibatch[ 151- 160, 40.00%]: CrossEntropyWithSoftmax = 0.17257851 * 250; EvalClassificationError = 0.08000000 * 250; time = 0.0282s; samplesPerSecond = 8853.2
MPI Rank 2: 01/16/2018 19:05:29:  Epoch[ 3 of 4]-Minibatch[ 161- 170, 42.50%]: CrossEntropyWithSoftmax = 0.17623655 * 250; EvalClassificationError = 0.09600000 * 250; time = 0.0217s; samplesPerSecond = 11537.8
MPI Rank 2: 01/16/2018 19:05:29:  Epoch[ 3 of 4]-Minibatch[ 171- 180, 45.00%]: CrossEntropyWithSoftmax = 0.14121117 * 250; EvalClassificationError = 0.06400000 * 250; time = 0.0243s; samplesPerSecond = 10279.9
MPI Rank 2: 01/16/2018 19:05:29:  Epoch[ 3 of 4]-Minibatch[ 181- 190, 47.50%]: CrossEntropyWithSoftmax = 0.19243443 * 250; EvalClassificationError = 0.10000000 * 250; time = 0.0297s; samplesPerSecond = 8427.8
MPI Rank 2: 01/16/2018 19:05:29:  Epoch[ 3 of 4]-Minibatch[ 191- 200, 50.00%]: CrossEntropyWithSoftmax = 0.20908162 * 250; EvalClassificationError = 0.10000000 * 250; time = 0.0258s; samplesPerSecond = 9703.3
MPI Rank 2: 01/16/2018 19:05:29:  Epoch[ 3 of 4]-Minibatch[ 201- 210, 52.50%]: CrossEntropyWithSoftmax = 0.18472067 * 250; EvalClassificationError = 0.08000000 * 250; time = 0.0257s; samplesPerSecond = 9719.3
MPI Rank 2: 01/16/2018 19:05:29:  Epoch[ 3 of 4]-Minibatch[ 211- 220, 55.00%]: CrossEntropyWithSoftmax = 0.18185536 * 250; EvalClassificationError = 0.07600000 * 250; time = 0.0303s; samplesPerSecond = 8250.2
MPI Rank 2: 01/16/2018 19:05:29:  Epoch[ 3 of 4]-Minibatch[ 221- 230, 57.50%]: CrossEntropyWithSoftmax = 0.14074205 * 250; EvalClassificationError = 0.06000000 * 250; time = 0.0279s; samplesPerSecond = 8972.0
MPI Rank 2: 01/16/2018 19:05:29:  Epoch[ 3 of 4]-Minibatch[ 231- 240, 60.00%]: CrossEntropyWithSoftmax = 0.14871620 * 250; EvalClassificationError = 0.07600000 * 250; time = 0.0281s; samplesPerSecond = 8899.1
MPI Rank 2: 01/16/2018 19:05:29:  Epoch[ 3 of 4]-Minibatch[ 241- 250, 62.50%]: CrossEntropyWithSoftmax = 0.20299705 * 250; EvalClassificationError = 0.11200000 * 250; time = 0.0287s; samplesPerSecond = 8712.5
MPI Rank 2: 01/16/2018 19:05:29:  Epoch[ 3 of 4]-Minibatch[ 251- 260, 65.00%]: CrossEntropyWithSoftmax = 0.12852038 * 250; EvalClassificationError = 0.07200000 * 250; time = 0.0233s; samplesPerSecond = 10710.7
MPI Rank 2: 01/16/2018 19:05:30:  Epoch[ 3 of 4]-Minibatch[ 261- 270, 67.50%]: CrossEntropyWithSoftmax = 0.18660439 * 250; EvalClassificationError = 0.11600000 * 250; time = 0.0316s; samplesPerSecond = 7912.1
MPI Rank 2: 01/16/2018 19:05:30:  Epoch[ 3 of 4]-Minibatch[ 271- 280, 70.00%]: CrossEntropyWithSoftmax = 0.19575997 * 250; EvalClassificationError = 0.08800000 * 250; time = 0.0243s; samplesPerSecond = 10286.0
MPI Rank 2: 01/16/2018 19:05:30:  Epoch[ 3 of 4]-Minibatch[ 281- 290, 72.50%]: CrossEntropyWithSoftmax = 0.16667676 * 250; EvalClassificationError = 0.06800000 * 250; time = 0.0277s; samplesPerSecond = 9027.9
MPI Rank 2: 01/16/2018 19:05:30:  Epoch[ 3 of 4]-Minibatch[ 291- 300, 75.00%]: CrossEntropyWithSoftmax = 0.12526168 * 250; EvalClassificationError = 0.04400000 * 250; time = 0.0220s; samplesPerSecond = 11369.2
MPI Rank 2: 01/16/2018 19:05:30:  Epoch[ 3 of 4]-Minibatch[ 301- 310, 77.50%]: CrossEntropyWithSoftmax = 0.17392132 * 250; EvalClassificationError = 0.08800000 * 250; time = 0.0283s; samplesPerSecond = 8821.2
MPI Rank 2: 01/16/2018 19:05:30:  Epoch[ 3 of 4]-Minibatch[ 311- 320, 80.00%]: CrossEntropyWithSoftmax = 0.12281615 * 250; EvalClassificationError = 0.05200000 * 250; time = 0.0364s; samplesPerSecond = 6872.7
MPI Rank 2: 01/16/2018 19:05:30:  Epoch[ 3 of 4]-Minibatch[ 321- 330, 82.50%]: CrossEntropyWithSoftmax = 0.14759390 * 250; EvalClassificationError = 0.06000000 * 250; time = 0.0245s; samplesPerSecond = 10189.4
MPI Rank 2: 01/16/2018 19:05:30:  Epoch[ 3 of 4]-Minibatch[ 331- 340, 85.00%]: CrossEntropyWithSoftmax = 0.19801301 * 250; EvalClassificationError = 0.09200000 * 250; time = 0.0228s; samplesPerSecond = 10943.8
MPI Rank 2: 01/16/2018 19:05:30:  Epoch[ 3 of 4]-Minibatch[ 341- 350, 87.50%]: CrossEntropyWithSoftmax = 0.12593395 * 250; EvalClassificationError = 0.05200000 * 250; time = 0.0236s; samplesPerSecond = 10593.5
MPI Rank 2: 01/16/2018 19:05:30:  Epoch[ 3 of 4]-Minibatch[ 351- 360, 90.00%]: CrossEntropyWithSoftmax = 0.13756617 * 250; EvalClassificationError = 0.06000000 * 250; time = 0.0257s; samplesPerSecond = 9741.2
MPI Rank 2: 01/16/2018 19:05:30:  Epoch[ 3 of 4]-Minibatch[ 361- 370, 92.50%]: CrossEntropyWithSoftmax = 0.12838525 * 250; EvalClassificationError = 0.06000000 * 250; time = 0.0261s; samplesPerSecond = 9571.8
MPI Rank 2: 01/16/2018 19:05:30:  Epoch[ 3 of 4]-Minibatch[ 371- 380, 95.00%]: CrossEntropyWithSoftmax = 0.16654369 * 250; EvalClassificationError = 0.09600000 * 250; time = 0.0285s; samplesPerSecond = 8776.8
MPI Rank 2: 01/16/2018 19:05:30:  Epoch[ 3 of 4]-Minibatch[ 381- 390, 97.50%]: CrossEntropyWithSoftmax = 0.20658951 * 250; EvalClassificationError = 0.11600000 * 250; time = 0.0275s; samplesPerSecond = 9098.3
MPI Rank 2: 01/16/2018 19:05:30:  Epoch[ 3 of 4]-Minibatch[ 391- 400, 100.00%]: CrossEntropyWithSoftmax = 0.14583322 * 250; EvalClassificationError = 0.06800000 * 250; time = 0.0237s; samplesPerSecond = 10551.7
MPI Rank 2: 01/16/2018 19:05:30: Finished Epoch[ 3 of 4]: [Training] CrossEntropyWithSoftmax = 0.15948618 * 10000; EvalClassificationError = 0.07660000 * 10000; totalSamplesSeen = 30000; learningRatePerSample = 0.0080000004; epochTime=1.07037s
MPI Rank 2: 
MPI Rank 2: 01/16/2018 19:05:30: Starting Epoch 4: learning rate per sample = 0.008000  effective momentum = 0.900000  momentum as time constant = 237.3 samples
MPI Rank 2: 
MPI Rank 2: 01/16/2018 19:05:30: Starting minibatch loop, DataParallelSGD training (myRank = 2, numNodes = 4, numGradientBits = 32), distributed reading is ENABLED.
MPI Rank 2: 01/16/2018 19:05:30:  Epoch[ 4 of 4]-Minibatch[   1-  10, 2.50%]: CrossEntropyWithSoftmax = 0.12371232 * 250; EvalClassificationError = 0.06000000 * 250; time = 0.0276s; samplesPerSecond = 9056.4
MPI Rank 2: 01/16/2018 19:05:30:  Epoch[ 4 of 4]-Minibatch[  11-  20, 5.00%]: CrossEntropyWithSoftmax = 0.18070514 * 250; EvalClassificationError = 0.09600000 * 250; time = 0.0192s; samplesPerSecond = 13031.1
MPI Rank 2: 01/16/2018 19:05:30:  Epoch[ 4 of 4]-Minibatch[  21-  30, 7.50%]: CrossEntropyWithSoftmax = 0.14239731 * 250; EvalClassificationError = 0.07600000 * 250; time = 0.0273s; samplesPerSecond = 9141.8
MPI Rank 2: 01/16/2018 19:05:30:  Epoch[ 4 of 4]-Minibatch[  31-  40, 10.00%]: CrossEntropyWithSoftmax = 0.15630155 * 250; EvalClassificationError = 0.06400000 * 250; time = 0.0272s; samplesPerSecond = 9192.9
MPI Rank 2: 01/16/2018 19:05:30:  Epoch[ 4 of 4]-Minibatch[  41-  50, 12.50%]: CrossEntropyWithSoftmax = 0.16935525 * 250; EvalClassificationError = 0.09600000 * 250; time = 0.0246s; samplesPerSecond = 10149.5
MPI Rank 2: 01/16/2018 19:05:30:  Epoch[ 4 of 4]-Minibatch[  51-  60, 15.00%]: CrossEntropyWithSoftmax = 0.18198833 * 250; EvalClassificationError = 0.08000000 * 250; time = 0.0294s; samplesPerSecond = 8516.4
MPI Rank 2: 01/16/2018 19:05:30:  Epoch[ 4 of 4]-Minibatch[  61-  70, 17.50%]: CrossEntropyWithSoftmax = 0.14475946 * 250; EvalClassificationError = 0.07200000 * 250; time = 0.0236s; samplesPerSecond = 10613.8
MPI Rank 2: 01/16/2018 19:05:30:  Epoch[ 4 of 4]-Minibatch[  71-  80, 20.00%]: CrossEntropyWithSoftmax = 0.18021601 * 250; EvalClassificationError = 0.09600000 * 250; time = 0.0248s; samplesPerSecond = 10066.1
MPI Rank 2: 01/16/2018 19:05:30:  Epoch[ 4 of 4]-Minibatch[  81-  90, 22.50%]: CrossEntropyWithSoftmax = 0.15849308 * 250; EvalClassificationError = 0.07600000 * 250; time = 0.0298s; samplesPerSecond = 8386.4
MPI Rank 2: 01/16/2018 19:05:30:  Epoch[ 4 of 4]-Minibatch[  91- 100, 25.00%]: CrossEntropyWithSoftmax = 0.14474426 * 250; EvalClassificationError = 0.07200000 * 250; time = 0.0313s; samplesPerSecond = 7999.7
MPI Rank 2: 01/16/2018 19:05:30:  Epoch[ 4 of 4]-Minibatch[ 101- 110, 27.50%]: CrossEntropyWithSoftmax = 0.13362926 * 250; EvalClassificationError = 0.05200000 * 250; time = 0.0259s; samplesPerSecond = 9646.4
MPI Rank 2: 01/16/2018 19:05:30:  Epoch[ 4 of 4]-Minibatch[ 111- 120, 30.00%]: CrossEntropyWithSoftmax = 0.13708299 * 250; EvalClassificationError = 0.06400000 * 250; time = 0.0242s; samplesPerSecond = 10313.7
MPI Rank 2: 01/16/2018 19:05:30:  Epoch[ 4 of 4]-Minibatch[ 121- 130, 32.50%]: CrossEntropyWithSoftmax = 0.11569776 * 250; EvalClassificationError = 0.05600000 * 250; time = 0.0257s; samplesPerSecond = 9711.9
MPI Rank 2: 01/16/2018 19:05:30:  Epoch[ 4 of 4]-Minibatch[ 131- 140, 35.00%]: CrossEntropyWithSoftmax = 0.16892331 * 250; EvalClassificationError = 0.09200000 * 250; time = 0.0221s; samplesPerSecond = 11319.2
MPI Rank 2: 01/16/2018 19:05:30:  Epoch[ 4 of 4]-Minibatch[ 141- 150, 37.50%]: CrossEntropyWithSoftmax = 0.12752163 * 250; EvalClassificationError = 0.04800000 * 250; time = 0.0258s; samplesPerSecond = 9685.4
MPI Rank 2: 01/16/2018 19:05:30:  Epoch[ 4 of 4]-Minibatch[ 151- 160, 40.00%]: CrossEntropyWithSoftmax = 0.17100866 * 250; EvalClassificationError = 0.08400000 * 250; time = 0.0255s; samplesPerSecond = 9793.6
MPI Rank 2: 01/16/2018 19:05:30:  Epoch[ 4 of 4]-Minibatch[ 161- 170, 42.50%]: CrossEntropyWithSoftmax = 0.17660425 * 250; EvalClassificationError = 0.10000000 * 250; time = 0.0246s; samplesPerSecond = 10181.7
MPI Rank 2: 01/16/2018 19:05:30:  Epoch[ 4 of 4]-Minibatch[ 171- 180, 45.00%]: CrossEntropyWithSoftmax = 0.14105803 * 250; EvalClassificationError = 0.06400000 * 250; time = 0.0216s; samplesPerSecond = 11554.9
MPI Rank 2: 01/16/2018 19:05:30:  Epoch[ 4 of 4]-Minibatch[ 181- 190, 47.50%]: CrossEntropyWithSoftmax = 0.19333553 * 250; EvalClassificationError = 0.10000000 * 250; time = 0.0284s; samplesPerSecond = 8787.7
MPI Rank 2: 01/16/2018 19:05:30:  Epoch[ 4 of 4]-Minibatch[ 191- 200, 50.00%]: CrossEntropyWithSoftmax = 0.20859524 * 250; EvalClassificationError = 0.10000000 * 250; time = 0.0317s; samplesPerSecond = 7894.3
MPI Rank 2: 01/16/2018 19:05:30:  Epoch[ 4 of 4]-Minibatch[ 201- 210, 52.50%]: CrossEntropyWithSoftmax = 0.18499676 * 250; EvalClassificationError = 0.08000000 * 250; time = 0.0259s; samplesPerSecond = 9657.5
MPI Rank 2: 01/16/2018 19:05:30:  Epoch[ 4 of 4]-Minibatch[ 211- 220, 55.00%]: CrossEntropyWithSoftmax = 0.18152439 * 250; EvalClassificationError = 0.07600000 * 250; time = 0.0259s; samplesPerSecond = 9667.9
MPI Rank 2: 01/16/2018 19:05:30:  Epoch[ 4 of 4]-Minibatch[ 221- 230, 57.50%]: CrossEntropyWithSoftmax = 0.14037158 * 250; EvalClassificationError = 0.05600000 * 250; time = 0.0263s; samplesPerSecond = 9489.8
MPI Rank 2: 01/16/2018 19:05:30:  Epoch[ 4 of 4]-Minibatch[ 231- 240, 60.00%]: CrossEntropyWithSoftmax = 0.14866863 * 250; EvalClassificationError = 0.07600000 * 250; time = 0.0234s; samplesPerSecond = 10664.6
MPI Rank 2: 01/16/2018 19:05:31:  Epoch[ 4 of 4]-Minibatch[ 241- 250, 62.50%]: CrossEntropyWithSoftmax = 0.20347746 * 250; EvalClassificationError = 0.11200000 * 250; time = 0.0283s; samplesPerSecond = 8834.6
MPI Rank 2: 01/16/2018 19:05:31:  Epoch[ 4 of 4]-Minibatch[ 251- 260, 65.00%]: CrossEntropyWithSoftmax = 0.12815013 * 250; EvalClassificationError = 0.07200000 * 250; time = 0.0238s; samplesPerSecond = 10513.5
MPI Rank 2: 01/16/2018 19:05:31:  Epoch[ 4 of 4]-Minibatch[ 261- 270, 67.50%]: CrossEntropyWithSoftmax = 0.18672809 * 250; EvalClassificationError = 0.11600000 * 250; time = 0.0276s; samplesPerSecond = 9063.7
MPI Rank 2: 01/16/2018 19:05:31:  Epoch[ 4 of 4]-Minibatch[ 271- 280, 70.00%]: CrossEntropyWithSoftmax = 0.19552989 * 250; EvalClassificationError = 0.08400000 * 250; time = 0.0304s; samplesPerSecond = 8214.8
MPI Rank 2: 01/16/2018 19:05:31:  Epoch[ 4 of 4]-Minibatch[ 281- 290, 72.50%]: CrossEntropyWithSoftmax = 0.16452642 * 250; EvalClassificationError = 0.06800000 * 250; time = 0.0440s; samplesPerSecond = 5685.4
MPI Rank 2: 01/16/2018 19:05:31:  Epoch[ 4 of 4]-Minibatch[ 291- 300, 75.00%]: CrossEntropyWithSoftmax = 0.12461825 * 250; EvalClassificationError = 0.04400000 * 250; time = 0.0246s; samplesPerSecond = 10183.3
MPI Rank 2: 01/16/2018 19:05:31:  Epoch[ 4 of 4]-Minibatch[ 301- 310, 77.50%]: CrossEntropyWithSoftmax = 0.17285251 * 250; EvalClassificationError = 0.08400000 * 250; time = 0.0272s; samplesPerSecond = 9182.3
MPI Rank 2: 01/16/2018 19:05:31:  Epoch[ 4 of 4]-Minibatch[ 311- 320, 80.00%]: CrossEntropyWithSoftmax = 0.12253619 * 250; EvalClassificationError = 0.05200000 * 250; time = 0.0240s; samplesPerSecond = 10432.9
MPI Rank 2: 01/16/2018 19:05:31:  Epoch[ 4 of 4]-Minibatch[ 321- 330, 82.50%]: CrossEntropyWithSoftmax = 0.14723334 * 250; EvalClassificationError = 0.06000000 * 250; time = 0.0377s; samplesPerSecond = 6624.2
MPI Rank 2: 01/16/2018 19:05:31:  Epoch[ 4 of 4]-Minibatch[ 331- 340, 85.00%]: CrossEntropyWithSoftmax = 0.19789538 * 250; EvalClassificationError = 0.09200000 * 250; time = 0.0218s; samplesPerSecond = 11489.4
MPI Rank 2: 01/16/2018 19:05:31:  Epoch[ 4 of 4]-Minibatch[ 341- 350, 87.50%]: CrossEntropyWithSoftmax = 0.12575877 * 250; EvalClassificationError = 0.05200000 * 250; time = 0.0263s; samplesPerSecond = 9518.5
MPI Rank 2: 01/16/2018 19:05:31:  Epoch[ 4 of 4]-Minibatch[ 351- 360, 90.00%]: CrossEntropyWithSoftmax = 0.13745928 * 250; EvalClassificationError = 0.06000000 * 250; time = 0.0379s; samplesPerSecond = 6600.7
MPI Rank 2: 01/16/2018 19:05:31:  Epoch[ 4 of 4]-Minibatch[ 361- 370, 92.50%]: CrossEntropyWithSoftmax = 0.12839652 * 250; EvalClassificationError = 0.06000000 * 250; time = 0.0338s; samplesPerSecond = 7387.2
MPI Rank 2: 01/16/2018 19:05:31:  Epoch[ 4 of 4]-Minibatch[ 371- 380, 95.00%]: CrossEntropyWithSoftmax = 0.16647280 * 250; EvalClassificationError = 0.09600000 * 250; time = 0.0265s; samplesPerSecond = 9433.9
MPI Rank 2: 01/16/2018 19:05:31:  Epoch[ 4 of 4]-Minibatch[ 381- 390, 97.50%]: CrossEntropyWithSoftmax = 0.20679434 * 250; EvalClassificationError = 0.11600000 * 250; time = 0.0323s; samplesPerSecond = 7741.8
MPI Rank 2: 01/16/2018 19:05:31:  Epoch[ 4 of 4]-Minibatch[ 391- 400, 100.00%]: CrossEntropyWithSoftmax = 0.14585245 * 250; EvalClassificationError = 0.06400000 * 250; time = 0.0266s; samplesPerSecond = 9392.4
MPI Rank 2: 01/16/2018 19:05:31: Finished Epoch[ 4 of 4]: [Training] CrossEntropyWithSoftmax = 0.15914931 * 10000; EvalClassificationError = 0.07670000 * 10000; totalSamplesSeen = 40000; learningRatePerSample = 0.0080000004; epochTime=1.09854s
MPI Rank 2: 
MPI Rank 2: 01/16/2018 19:05:31: Action "train" complete.
MPI Rank 2: 
MPI Rank 2: 01/16/2018 19:05:31: __COMPLETED__
MPI Rank 3: CNTK 2.3.1+ (HEAD c4c2ce, Jan 16 2018 16:21:59) at 2018/01/16 19:05:24
MPI Rank 3: 
MPI Rank 3: /home/ubuntu/workspace/build/gpu/release/bin/cntk  configFile=/home/ubuntu/workspace/Tests/EndToEndTests/ParallelTraining/NoQuantization/SinglePrecision/../../SimpleMultiGPU.cntk  currentDirectory=/home/ubuntu/workspace/Tests/EndToEndTests/ParallelTraining/Data  RunDir=/tmp/cntk-test-20180116190516.17566/ParallelTraining/NoQuantization_SinglePrecision@release_gpu  DataDir=/home/ubuntu/workspace/Tests/EndToEndTests/ParallelTraining/Data  ConfigDir=/home/ubuntu/workspace/Tests/EndToEndTests/ParallelTraining/NoQuantization/SinglePrecision/../..  OutputDir=/tmp/cntk-test-20180116190516.17566/ParallelTraining/NoQuantization_SinglePrecision@release_gpu  DeviceId=0  timestamping=true  numCPUThreads=3  precision=float  SimpleMultiGPU=[SGD=[ParallelTrain=[DataParallelSGD=[gradientBits=32]]]]  stderr=/tmp/cntk-test-20180116190516.17566/ParallelTraining/NoQuantization_SinglePrecision@release_gpu/stderr
MPI Rank 3: 01/16/2018 19:05:26: -------------------------------------------------------------------
MPI Rank 3: 01/16/2018 19:05:26: Build info: 
MPI Rank 3: 
MPI Rank 3: 01/16/2018 19:05:26: 		Built time: Jan 16 2018 16:15:42
MPI Rank 3: 01/16/2018 19:05:26: 		Last modified date: Tue Jan 16 16:13:51 2018
MPI Rank 3: 01/16/2018 19:05:26: 		Build type: release
MPI Rank 3: 01/16/2018 19:05:26: 		Build target: GPU
MPI Rank 3: 01/16/2018 19:05:26: 		With ASGD: yes
MPI Rank 3: 01/16/2018 19:05:26: 		Math lib: mkl
MPI Rank 3: 01/16/2018 19:05:26: 		CUDA version: 9.0.0
MPI Rank 3: 01/16/2018 19:05:26: 		CUDNN version: 7.0.4
MPI Rank 3: 01/16/2018 19:05:26: 		Build Branch: HEAD
MPI Rank 3: 01/16/2018 19:05:26: 		Build SHA1: c4c2ce8c6e89b5c32e4d07523081283417bcfc6d
MPI Rank 3: 01/16/2018 19:05:26: 		MPI distribution: Open MPI
MPI Rank 3: 01/16/2018 19:05:26: 		MPI version: 1.10.7
MPI Rank 3: 01/16/2018 19:05:26: -------------------------------------------------------------------
MPI Rank 3: 01/16/2018 19:05:26: -------------------------------------------------------------------
MPI Rank 3: 01/16/2018 19:05:26: GPU info:
MPI Rank 3: 
MPI Rank 3: 01/16/2018 19:05:26: 		Device[0]: cores = 3072; computeCapability = 5.2; type = "Tesla M60"; total memory = 8123 MB; free memory = 7852 MB
MPI Rank 3: 01/16/2018 19:05:26: -------------------------------------------------------------------
MPI Rank 3: 01/16/2018 19:05:26: Using 3 CPU threads.
MPI Rank 3: 
MPI Rank 3: 01/16/2018 19:05:26: ##############################################################################
MPI Rank 3: 01/16/2018 19:05:26: #                                                                            #
MPI Rank 3: 01/16/2018 19:05:26: # SimpleMultiGPU command (train action)                                      #
MPI Rank 3: 01/16/2018 19:05:26: #                                                                            #
MPI Rank 3: 01/16/2018 19:05:26: ##############################################################################
MPI Rank 3: 
MPI Rank 3: 01/16/2018 19:05:26: 
MPI Rank 3: Creating virgin network.
MPI Rank 3: SimpleNetworkBuilder Using GPU 0
MPI Rank 3: 01/16/2018 19:05:26: 
MPI Rank 3: Model has 25 nodes. Using GPU 0.
MPI Rank 3: 
MPI Rank 3: 01/16/2018 19:05:26: Training criterion:   CrossEntropyWithSoftmax = CrossEntropyWithSoftmax
MPI Rank 3: 01/16/2018 19:05:26: Evaluation criterion: EvalClassificationError = ClassificationError
MPI Rank 3: 
MPI Rank 3: 
MPI Rank 3: Allocating matrices for forward and/or backward propagation.
MPI Rank 3: 
MPI Rank 3: Gradient Memory Aliasing: 4 are aliased.
MPI Rank 3: 	W2*H1 (gradient) reuses HLast (gradient)
MPI Rank 3: 	W1*H1 (gradient) reuses W1*H1+B1 (gradient)
MPI Rank 3: 
MPI Rank 3: Memory Sharing: Out of 40 matrices, 21 are shared as 5, and 19 are not shared.
MPI Rank 3: 
MPI Rank 3: Here are the ones that share memory:
MPI Rank 3: 	{ PosteriorProb : [2 x 1 x *]
MPI Rank 3: 	  ScaledLogLikelihood : [2 x 1 x *] }
MPI Rank 3: 	{ B0 : [50 x 1] (gradient)
MPI Rank 3: 	  H1 : [50 x 1 x *] }
MPI Rank 3: 	{ HLast : [2 x 1 x *] (gradient)
MPI Rank 3: 	  W0 : [50 x 2] (gradient)
MPI Rank 3: 	  W0*features+B0 : [50 x 1 x *] (gradient)
MPI Rank 3: 	  W1*H1 : [50 x 1 x *] (gradient)
MPI Rank 3: 	  W1*H1+B1 : [50 x 1 x *]
MPI Rank 3: 	  W1*H1+B1 : [50 x 1 x *] (gradient)
MPI Rank 3: 	  W2*H1 : [2 x 1 x *]
MPI Rank 3: 	  W2*H1 : [2 x 1 x *] (gradient) }
MPI Rank 3: 	{ H2 : [50 x 1 x *]
MPI Rank 3: 	  W0*features+B0 : [50 x 1 x *]
MPI Rank 3: 	  W1 : [50 x 50] (gradient)
MPI Rank 3: 	  W1*H1 : [50 x 1 x *] }
MPI Rank 3: 	{ H1 : [50 x 1 x *] (gradient)
MPI Rank 3: 	  H2 : [50 x 1 x *] (gradient)
MPI Rank 3: 	  HLast : [2 x 1 x *]
MPI Rank 3: 	  W0*features : [50 x *]
MPI Rank 3: 	  W0*features : [50 x *] (gradient) }
MPI Rank 3: 
MPI Rank 3: Here are the ones that don't share memory:
MPI Rank 3: 	{features : [2 x *]}
MPI Rank 3: 	{MeanOfFeatures : [2]}
MPI Rank 3: 	{InvStdOfFeatures : [2]}
MPI Rank 3: 	{W0 : [50 x 2]}
MPI Rank 3: 	{B0 : [50 x 1]}
MPI Rank 3: 	{W1 : [50 x 50]}
MPI Rank 3: 	{B1 : [50 x 1]}
MPI Rank 3: 	{W2 : [2 x 50]}
MPI Rank 3: 	{B2 : [2 x 1]}
MPI Rank 3: 	{labels : [2 x *]}
MPI Rank 3: 	{Prior : [2]}
MPI Rank 3: 	{EvalClassificationError : [1]}
MPI Rank 3: 	{CrossEntropyWithSoftmax : [1]}
MPI Rank 3: 	{W2 : [2 x 50] (gradient)}
MPI Rank 3: 	{LogOfPrior : [2]}
MPI Rank 3: 	{MVNormalizedFeatures : [2 x *]}
MPI Rank 3: 	{B1 : [50 x 1] (gradient)}
MPI Rank 3: 	{CrossEntropyWithSoftmax : [1] (gradient)}
MPI Rank 3: 	{B2 : [2 x 1] (gradient)}
MPI Rank 3: 
MPI Rank 3: 
MPI Rank 3: 01/16/2018 19:05:26: Training 2802 parameters in 6 out of 6 parameter tensors and 15 nodes with gradient:
MPI Rank 3: 
MPI Rank 3: 01/16/2018 19:05:26: 	Node 'B0' (LearnableParameter operation) : [50 x 1]
MPI Rank 3: 01/16/2018 19:05:26: 	Node 'B1' (LearnableParameter operation) : [50 x 1]
MPI Rank 3: 01/16/2018 19:05:26: 	Node 'B2' (LearnableParameter operation) : [2 x 1]
MPI Rank 3: 01/16/2018 19:05:26: 	Node 'W0' (LearnableParameter operation) : [50 x 2]
MPI Rank 3: 01/16/2018 19:05:26: 	Node 'W1' (LearnableParameter operation) : [50 x 50]
MPI Rank 3: 01/16/2018 19:05:26: 	Node 'W2' (LearnableParameter operation) : [2 x 50]
MPI Rank 3: 
MPI Rank 3: Initializing dataParallelSGD with FP32 aggregation.
MPI Rank 3: NcclComm: disabled, same device used by more than one rank
MPI Rank 3: 
MPI Rank 3: 01/16/2018 19:05:26: Precomputing --> 3 PreCompute nodes found.
MPI Rank 3: 
MPI Rank 3: 01/16/2018 19:05:26: 	MeanOfFeatures = Mean()
MPI Rank 3: 01/16/2018 19:05:26: 	InvStdOfFeatures = InvStdDev()
MPI Rank 3: 01/16/2018 19:05:26: 	Prior = Mean()
MPI Rank 3: 
MPI Rank 3: 01/16/2018 19:05:26: Precomputing --> Completed.
MPI Rank 3: 
MPI Rank 3: 
MPI Rank 3: 01/16/2018 19:05:27: Starting Epoch 1: learning rate per sample = 0.020000  effective momentum = 0.900000  momentum as time constant = 237.3 samples
MPI Rank 3: 
MPI Rank 3: 01/16/2018 19:05:27: Starting minibatch loop, DataParallelSGD training (myRank = 3, numNodes = 4, numGradientBits = 32), distributed reading is ENABLED.
MPI Rank 3: 01/16/2018 19:05:27:  Epoch[ 1 of 4]-Minibatch[   1-  10]: CrossEntropyWithSoftmax = 0.70007977 * 250; EvalClassificationError = 0.52400000 * 250; time = 0.0279s; samplesPerSecond = 8951.2
MPI Rank 3: 01/16/2018 19:05:27:  Epoch[ 1 of 4]-Minibatch[  11-  20]: CrossEntropyWithSoftmax = 0.71514543 * 250; EvalClassificationError = 0.52000000 * 250; time = 0.0426s; samplesPerSecond = 5863.2
MPI Rank 3: 01/16/2018 19:05:27:  Epoch[ 1 of 4]-Minibatch[  21-  30]: CrossEntropyWithSoftmax = 0.72945593 * 250; EvalClassificationError = 0.47600000 * 250; time = 0.0316s; samplesPerSecond = 7910.4
MPI Rank 3: 01/16/2018 19:05:27:  Epoch[ 1 of 4]-Minibatch[  31-  40]: CrossEntropyWithSoftmax = 0.70079058 * 250; EvalClassificationError = 0.52400000 * 250; time = 0.0276s; samplesPerSecond = 9046.5
MPI Rank 3: 01/16/2018 19:05:27:  Epoch[ 1 of 4]-Minibatch[  41-  50]: CrossEntropyWithSoftmax = 0.70605617 * 250; EvalClassificationError = 0.54000000 * 250; time = 0.0222s; samplesPerSecond = 11266.0
MPI Rank 3: 01/16/2018 19:05:27:  Epoch[ 1 of 4]-Minibatch[  51-  60]: CrossEntropyWithSoftmax = 0.71572398 * 250; EvalClassificationError = 0.47600000 * 250; time = 0.0278s; samplesPerSecond = 8977.4
MPI Rank 3: 01/16/2018 19:05:27:  Epoch[ 1 of 4]-Minibatch[  61-  70]: CrossEntropyWithSoftmax = 0.72149850 * 250; EvalClassificationError = 0.48000000 * 250; time = 0.0278s; samplesPerSecond = 8980.7
MPI Rank 3: 01/16/2018 19:05:27:  Epoch[ 1 of 4]-Minibatch[  71-  80]: CrossEntropyWithSoftmax = 0.79845604 * 250; EvalClassificationError = 0.47600000 * 250; time = 0.0299s; samplesPerSecond = 8356.5
MPI Rank 3: 01/16/2018 19:05:27:  Epoch[ 1 of 4]-Minibatch[  81-  90]: CrossEntropyWithSoftmax = 0.69665185 * 250; EvalClassificationError = 0.46800000 * 250; time = 0.0258s; samplesPerSecond = 9672.4
MPI Rank 3: 01/16/2018 19:05:27:  Epoch[ 1 of 4]-Minibatch[  91- 100]: CrossEntropyWithSoftmax = 0.70723325 * 250; EvalClassificationError = 0.49200000 * 250; time = 0.0276s; samplesPerSecond = 9063.6
MPI Rank 3: 01/16/2018 19:05:27:  Epoch[ 1 of 4]-Minibatch[ 101- 110]: CrossEntropyWithSoftmax = 0.71420345 * 250; EvalClassificationError = 0.55200000 * 250; time = 0.0316s; samplesPerSecond = 7908.5
MPI Rank 3: 01/16/2018 19:05:27:  Epoch[ 1 of 4]-Minibatch[ 111- 120]: CrossEntropyWithSoftmax = 0.69535258 * 250; EvalClassificationError = 0.43600000 * 250; time = 0.0303s; samplesPerSecond = 8258.8
MPI Rank 3: 01/16/2018 19:05:27:  Epoch[ 1 of 4]-Minibatch[ 121- 130]: CrossEntropyWithSoftmax = 0.70078532 * 250; EvalClassificationError = 0.44000000 * 250; time = 0.0281s; samplesPerSecond = 8892.2
MPI Rank 3: 01/16/2018 19:05:27:  Epoch[ 1 of 4]-Minibatch[ 131- 140]: CrossEntropyWithSoftmax = 0.71857914 * 250; EvalClassificationError = 0.54800000 * 250; time = 0.0298s; samplesPerSecond = 8386.4
MPI Rank 3: 01/16/2018 19:05:27:  Epoch[ 1 of 4]-Minibatch[ 141- 150]: CrossEntropyWithSoftmax = 0.72088357 * 250; EvalClassificationError = 0.48800000 * 250; time = 0.0225s; samplesPerSecond = 11125.0
MPI Rank 3: 01/16/2018 19:05:27:  Epoch[ 1 of 4]-Minibatch[ 151- 160]: CrossEntropyWithSoftmax = 0.71798840 * 250; EvalClassificationError = 0.55200000 * 250; time = 0.0237s; samplesPerSecond = 10562.3
MPI Rank 3: 01/16/2018 19:05:27:  Epoch[ 1 of 4]-Minibatch[ 161- 170]: CrossEntropyWithSoftmax = 0.74162164 * 250; EvalClassificationError = 0.50000000 * 250; time = 0.0258s; samplesPerSecond = 9706.8
MPI Rank 3: 01/16/2018 19:05:27:  Epoch[ 1 of 4]-Minibatch[ 171- 180]: CrossEntropyWithSoftmax = 0.71835127 * 250; EvalClassificationError = 0.51600000 * 250; time = 0.0228s; samplesPerSecond = 10974.6
MPI Rank 3: 01/16/2018 19:05:27:  Epoch[ 1 of 4]-Minibatch[ 181- 190]: CrossEntropyWithSoftmax = 0.71529461 * 250; EvalClassificationError = 0.48400000 * 250; time = 0.0253s; samplesPerSecond = 9881.5
MPI Rank 3: 01/16/2018 19:05:27:  Epoch[ 1 of 4]-Minibatch[ 191- 200]: CrossEntropyWithSoftmax = 0.71727656 * 250; EvalClassificationError = 0.53200000 * 250; time = 0.0279s; samplesPerSecond = 8965.2
MPI Rank 3: 01/16/2018 19:05:27:  Epoch[ 1 of 4]-Minibatch[ 201- 210]: CrossEntropyWithSoftmax = 0.71745516 * 250; EvalClassificationError = 0.50400000 * 250; time = 0.0218s; samplesPerSecond = 11461.3
MPI Rank 3: 01/16/2018 19:05:27:  Epoch[ 1 of 4]-Minibatch[ 211- 220]: CrossEntropyWithSoftmax = 0.72088398 * 250; EvalClassificationError = 0.50000000 * 250; time = 0.0243s; samplesPerSecond = 10303.2
MPI Rank 3: 01/16/2018 19:05:27:  Epoch[ 1 of 4]-Minibatch[ 221- 230]: CrossEntropyWithSoftmax = 0.72006809 * 250; EvalClassificationError = 0.50800000 * 250; time = 0.0294s; samplesPerSecond = 8489.5
MPI Rank 3: 01/16/2018 19:05:27:  Epoch[ 1 of 4]-Minibatch[ 231- 240]: CrossEntropyWithSoftmax = 0.71275468 * 250; EvalClassificationError = 0.51200000 * 250; time = 0.0286s; samplesPerSecond = 8742.6
MPI Rank 3: 01/16/2018 19:05:27:  Epoch[ 1 of 4]-Minibatch[ 241- 250]: CrossEntropyWithSoftmax = 0.69644781 * 250; EvalClassificationError = 0.50400000 * 250; time = 0.0277s; samplesPerSecond = 9025.3
MPI Rank 3: 01/16/2018 19:05:27:  Epoch[ 1 of 4]-Minibatch[ 251- 260]: CrossEntropyWithSoftmax = 0.70129698 * 250; EvalClassificationError = 0.51200000 * 250; time = 0.0275s; samplesPerSecond = 9105.6
MPI Rank 3: 01/16/2018 19:05:27:  Epoch[ 1 of 4]-Minibatch[ 261- 270]: CrossEntropyWithSoftmax = 0.70768095 * 250; EvalClassificationError = 0.54400000 * 250; time = 0.0281s; samplesPerSecond = 8900.4
MPI Rank 3: 01/16/2018 19:05:27:  Epoch[ 1 of 4]-Minibatch[ 271- 280]: CrossEntropyWithSoftmax = 0.69744379 * 250; EvalClassificationError = 0.52800000 * 250; time = 0.0227s; samplesPerSecond = 11026.3
MPI Rank 3: 01/16/2018 19:05:27:  Epoch[ 1 of 4]-Minibatch[ 281- 290]: CrossEntropyWithSoftmax = 0.69266187 * 250; EvalClassificationError = 0.44800000 * 250; time = 0.0329s; samplesPerSecond = 7593.1
MPI Rank 3: 01/16/2018 19:05:27:  Epoch[ 1 of 4]-Minibatch[ 291- 300]: CrossEntropyWithSoftmax = 0.69347266 * 250; EvalClassificationError = 0.49600000 * 250; time = 0.0284s; samplesPerSecond = 8789.4
MPI Rank 3: 01/16/2018 19:05:27:  Epoch[ 1 of 4]-Minibatch[ 301- 310]: CrossEntropyWithSoftmax = 0.69257409 * 250; EvalClassificationError = 0.54000000 * 250; time = 0.0278s; samplesPerSecond = 9006.4
MPI Rank 3: 01/16/2018 19:05:27:  Epoch[ 1 of 4]-Minibatch[ 311- 320]: CrossEntropyWithSoftmax = 0.68625741 * 250; EvalClassificationError = 0.38000000 * 250; time = 0.0237s; samplesPerSecond = 10553.2
MPI Rank 3: 01/16/2018 19:05:28:  Epoch[ 1 of 4]-Minibatch[ 321- 330]: CrossEntropyWithSoftmax = 0.69064011 * 250; EvalClassificationError = 0.46800000 * 250; time = 0.0260s; samplesPerSecond = 9603.5
MPI Rank 3: 01/16/2018 19:05:28:  Epoch[ 1 of 4]-Minibatch[ 331- 340]: CrossEntropyWithSoftmax = 0.70192153 * 250; EvalClassificationError = 0.46000000 * 250; time = 0.0243s; samplesPerSecond = 10270.4
MPI Rank 3: 01/16/2018 19:05:28:  Epoch[ 1 of 4]-Minibatch[ 341- 350]: CrossEntropyWithSoftmax = 0.69058912 * 250; EvalClassificationError = 0.52000000 * 250; time = 0.0286s; samplesPerSecond = 8741.5
MPI Rank 3: 01/16/2018 19:05:28:  Epoch[ 1 of 4]-Minibatch[ 351- 360]: CrossEntropyWithSoftmax = 0.67041492 * 250; EvalClassificationError = 0.39200000 * 250; time = 0.0369s; samplesPerSecond = 6779.9
MPI Rank 3: 01/16/2018 19:05:28:  Epoch[ 1 of 4]-Minibatch[ 361- 370]: CrossEntropyWithSoftmax = 0.65913973 * 250; EvalClassificationError = 0.35600000 * 250; time = 0.0224s; samplesPerSecond = 11159.2
MPI Rank 3: 01/16/2018 19:05:28:  Epoch[ 1 of 4]-Minibatch[ 371- 380]: CrossEntropyWithSoftmax = 0.63919877 * 250; EvalClassificationError = 0.36400000 * 250; time = 0.0216s; samplesPerSecond = 11597.6
MPI Rank 3: 01/16/2018 19:05:28:  Epoch[ 1 of 4]-Minibatch[ 381- 390]: CrossEntropyWithSoftmax = 0.61293885 * 250; EvalClassificationError = 0.19200000 * 250; time = 0.0299s; samplesPerSecond = 8348.5
MPI Rank 3: 01/16/2018 19:05:28:  Epoch[ 1 of 4]-Minibatch[ 391- 400]: CrossEntropyWithSoftmax = 0.55255355 * 250; EvalClassificationError = 0.18800000 * 250; time = 0.0301s; samplesPerSecond = 8304.6
MPI Rank 3: 01/16/2018 19:05:28: Finished Epoch[ 1 of 4]: [Training] CrossEntropyWithSoftmax = 0.70019555 * 10000; EvalClassificationError = 0.47350000 * 10000; totalSamplesSeen = 10000; learningRatePerSample = 0.02; epochTime=1.10551s
MPI Rank 3: 
MPI Rank 3: 01/16/2018 19:05:28: Starting Epoch 2: learning rate per sample = 0.008000  effective momentum = 0.900000  momentum as time constant = 237.3 samples
MPI Rank 3: 
MPI Rank 3: 01/16/2018 19:05:28: Starting minibatch loop, DataParallelSGD training (myRank = 3, numNodes = 4, numGradientBits = 32), distributed reading is ENABLED.
MPI Rank 3: 01/16/2018 19:05:28:  Epoch[ 2 of 4]-Minibatch[   1-  10, 2.50%]: CrossEntropyWithSoftmax = 0.50774630 * 250; EvalClassificationError = 0.24000000 * 250; time = 0.0252s; samplesPerSecond = 9907.0
MPI Rank 3: 01/16/2018 19:05:28:  Epoch[ 2 of 4]-Minibatch[  11-  20, 5.00%]: CrossEntropyWithSoftmax = 0.43388933 * 250; EvalClassificationError = 0.11200000 * 250; time = 0.0263s; samplesPerSecond = 9505.3
MPI Rank 3: 01/16/2018 19:05:28:  Epoch[ 2 of 4]-Minibatch[  21-  30, 7.50%]: CrossEntropyWithSoftmax = 0.36674877 * 250; EvalClassificationError = 0.08800000 * 250; time = 0.0246s; samplesPerSecond = 10174.8
MPI Rank 3: 01/16/2018 19:05:28:  Epoch[ 2 of 4]-Minibatch[  31-  40, 10.00%]: CrossEntropyWithSoftmax = 0.33768770 * 250; EvalClassificationError = 0.06800000 * 250; time = 0.0314s; samplesPerSecond = 7949.9
MPI Rank 3: 01/16/2018 19:05:28:  Epoch[ 2 of 4]-Minibatch[  41-  50, 12.50%]: CrossEntropyWithSoftmax = 0.30320952 * 250; EvalClassificationError = 0.08000000 * 250; time = 0.0253s; samplesPerSecond = 9891.6
MPI Rank 3: 01/16/2018 19:05:28:  Epoch[ 2 of 4]-Minibatch[  51-  60, 15.00%]: CrossEntropyWithSoftmax = 0.29576047 * 250; EvalClassificationError = 0.09200000 * 250; time = 0.0309s; samplesPerSecond = 8080.3
MPI Rank 3: 01/16/2018 19:05:28:  Epoch[ 2 of 4]-Minibatch[  61-  70, 17.50%]: CrossEntropyWithSoftmax = 0.24924496 * 250; EvalClassificationError = 0.07600000 * 250; time = 0.0275s; samplesPerSecond = 9078.3
MPI Rank 3: 01/16/2018 19:05:28:  Epoch[ 2 of 4]-Minibatch[  71-  80, 20.00%]: CrossEntropyWithSoftmax = 0.24632418 * 250; EvalClassificationError = 0.10400000 * 250; time = 0.0200s; samplesPerSecond = 12487.4
MPI Rank 3: 01/16/2018 19:05:28:  Epoch[ 2 of 4]-Minibatch[  81-  90, 22.50%]: CrossEntropyWithSoftmax = 0.20943161 * 250; EvalClassificationError = 0.08400000 * 250; time = 0.0240s; samplesPerSecond = 10427.1
MPI Rank 3: 01/16/2018 19:05:28:  Epoch[ 2 of 4]-Minibatch[  91- 100, 25.00%]: CrossEntropyWithSoftmax = 0.19115999 * 250; EvalClassificationError = 0.07200000 * 250; time = 0.0258s; samplesPerSecond = 9697.2
MPI Rank 3: 01/16/2018 19:05:28:  Epoch[ 2 of 4]-Minibatch[ 101- 110, 27.50%]: CrossEntropyWithSoftmax = 0.17923233 * 250; EvalClassificationError = 0.05200000 * 250; time = 0.0235s; samplesPerSecond = 10642.1
MPI Rank 3: 01/16/2018 19:05:28:  Epoch[ 2 of 4]-Minibatch[ 111- 120, 30.00%]: CrossEntropyWithSoftmax = 0.17075425 * 250; EvalClassificationError = 0.06400000 * 250; time = 0.0285s; samplesPerSecond = 8762.4
MPI Rank 3: 01/16/2018 19:05:28:  Epoch[ 2 of 4]-Minibatch[ 121- 130, 32.50%]: CrossEntropyWithSoftmax = 0.14442373 * 250; EvalClassificationError = 0.05600000 * 250; time = 0.0240s; samplesPerSecond = 10414.0
MPI Rank 3: 01/16/2018 19:05:28:  Epoch[ 2 of 4]-Minibatch[ 131- 140, 35.00%]: CrossEntropyWithSoftmax = 0.17753820 * 250; EvalClassificationError = 0.08800000 * 250; time = 0.0238s; samplesPerSecond = 10494.7
MPI Rank 3: 01/16/2018 19:05:28:  Epoch[ 2 of 4]-Minibatch[ 141- 150, 37.50%]: CrossEntropyWithSoftmax = 0.15087857 * 250; EvalClassificationError = 0.05200000 * 250; time = 0.0299s; samplesPerSecond = 8359.8
MPI Rank 3: 01/16/2018 19:05:28:  Epoch[ 2 of 4]-Minibatch[ 151- 160, 40.00%]: CrossEntropyWithSoftmax = 0.19253022 * 250; EvalClassificationError = 0.08000000 * 250; time = 0.0320s; samplesPerSecond = 7815.1
MPI Rank 3: 01/16/2018 19:05:28:  Epoch[ 2 of 4]-Minibatch[ 161- 170, 42.50%]: CrossEntropyWithSoftmax = 0.17830684 * 250; EvalClassificationError = 0.08800000 * 250; time = 0.0351s; samplesPerSecond = 7117.0
MPI Rank 3: 01/16/2018 19:05:28:  Epoch[ 2 of 4]-Minibatch[ 171- 180, 45.00%]: CrossEntropyWithSoftmax = 0.15115429 * 250; EvalClassificationError = 0.06400000 * 250; time = 0.0225s; samplesPerSecond = 11124.8
MPI Rank 3: 01/16/2018 19:05:28:  Epoch[ 2 of 4]-Minibatch[ 181- 190, 47.50%]: CrossEntropyWithSoftmax = 0.19135969 * 250; EvalClassificationError = 0.10000000 * 250; time = 0.0201s; samplesPerSecond = 12420.4
MPI Rank 3: 01/16/2018 19:05:28:  Epoch[ 2 of 4]-Minibatch[ 191- 200, 50.00%]: CrossEntropyWithSoftmax = 0.21491485 * 250; EvalClassificationError = 0.10400000 * 250; time = 0.0284s; samplesPerSecond = 8794.7
MPI Rank 3: 01/16/2018 19:05:28:  Epoch[ 2 of 4]-Minibatch[ 201- 210, 52.50%]: CrossEntropyWithSoftmax = 0.18682346 * 250; EvalClassificationError = 0.08400000 * 250; time = 0.0235s; samplesPerSecond = 10618.3
MPI Rank 3: 01/16/2018 19:05:28:  Epoch[ 2 of 4]-Minibatch[ 211- 220, 55.00%]: CrossEntropyWithSoftmax = 0.18483206 * 250; EvalClassificationError = 0.07600000 * 250; time = 0.0218s; samplesPerSecond = 11486.6
MPI Rank 3: 01/16/2018 19:05:28:  Epoch[ 2 of 4]-Minibatch[ 221- 230, 57.50%]: CrossEntropyWithSoftmax = 0.14684504 * 250; EvalClassificationError = 0.06000000 * 250; time = 0.0300s; samplesPerSecond = 8343.2
MPI Rank 3: 01/16/2018 19:05:28:  Epoch[ 2 of 4]-Minibatch[ 231- 240, 60.00%]: CrossEntropyWithSoftmax = 0.15322117 * 250; EvalClassificationError = 0.07600000 * 250; time = 0.0299s; samplesPerSecond = 8371.2
MPI Rank 3: 01/16/2018 19:05:28:  Epoch[ 2 of 4]-Minibatch[ 241- 250, 62.50%]: CrossEntropyWithSoftmax = 0.19882571 * 250; EvalClassificationError = 0.11600000 * 250; time = 0.0317s; samplesPerSecond = 7884.7
MPI Rank 3: 01/16/2018 19:05:28:  Epoch[ 2 of 4]-Minibatch[ 251- 260, 65.00%]: CrossEntropyWithSoftmax = 0.13683833 * 250; EvalClassificationError = 0.07200000 * 250; time = 0.0259s; samplesPerSecond = 9636.1
MPI Rank 3: 01/16/2018 19:05:28:  Epoch[ 2 of 4]-Minibatch[ 261- 270, 67.50%]: CrossEntropyWithSoftmax = 0.18621189 * 250; EvalClassificationError = 0.11600000 * 250; time = 0.0234s; samplesPerSecond = 10678.3
MPI Rank 3: 01/16/2018 19:05:28:  Epoch[ 2 of 4]-Minibatch[ 271- 280, 70.00%]: CrossEntropyWithSoftmax = 0.19408050 * 250; EvalClassificationError = 0.08000000 * 250; time = 0.0325s; samplesPerSecond = 7691.5
MPI Rank 3: 01/16/2018 19:05:28:  Epoch[ 2 of 4]-Minibatch[ 281- 290, 72.50%]: CrossEntropyWithSoftmax = 0.17298137 * 250; EvalClassificationError = 0.07200000 * 250; time = 0.0266s; samplesPerSecond = 9411.6
MPI Rank 3: 01/16/2018 19:05:29:  Epoch[ 2 of 4]-Minibatch[ 291- 300, 75.00%]: CrossEntropyWithSoftmax = 0.13265130 * 250; EvalClassificationError = 0.04800000 * 250; time = 0.0287s; samplesPerSecond = 8717.6
MPI Rank 3: 01/16/2018 19:05:29:  Epoch[ 2 of 4]-Minibatch[ 301- 310, 77.50%]: CrossEntropyWithSoftmax = 0.17627179 * 250; EvalClassificationError = 0.09200000 * 250; time = 0.0312s; samplesPerSecond = 8003.3
MPI Rank 3: 01/16/2018 19:05:29:  Epoch[ 2 of 4]-Minibatch[ 311- 320, 80.00%]: CrossEntropyWithSoftmax = 0.12734628 * 250; EvalClassificationError = 0.05600000 * 250; time = 0.0263s; samplesPerSecond = 9494.6
MPI Rank 3: 01/16/2018 19:05:29:  Epoch[ 2 of 4]-Minibatch[ 321- 330, 82.50%]: CrossEntropyWithSoftmax = 0.15108452 * 250; EvalClassificationError = 0.05600000 * 250; time = 0.0250s; samplesPerSecond = 9999.0
MPI Rank 3: 01/16/2018 19:05:29:  Epoch[ 2 of 4]-Minibatch[ 331- 340, 85.00%]: CrossEntropyWithSoftmax = 0.19729184 * 250; EvalClassificationError = 0.09200000 * 250; time = 0.0265s; samplesPerSecond = 9444.0
MPI Rank 3: 01/16/2018 19:05:29:  Epoch[ 2 of 4]-Minibatch[ 341- 350, 87.50%]: CrossEntropyWithSoftmax = 0.12857333 * 250; EvalClassificationError = 0.05200000 * 250; time = 0.0412s; samplesPerSecond = 6061.7
MPI Rank 3: 01/16/2018 19:05:29:  Epoch[ 2 of 4]-Minibatch[ 351- 360, 90.00%]: CrossEntropyWithSoftmax = 0.13867804 * 250; EvalClassificationError = 0.06000000 * 250; time = 0.0261s; samplesPerSecond = 9571.1
MPI Rank 3: 01/16/2018 19:05:29:  Epoch[ 2 of 4]-Minibatch[ 361- 370, 92.50%]: CrossEntropyWithSoftmax = 0.12786050 * 250; EvalClassificationError = 0.06000000 * 250; time = 0.0241s; samplesPerSecond = 10385.8
MPI Rank 3: 01/16/2018 19:05:29:  Epoch[ 2 of 4]-Minibatch[ 371- 380, 95.00%]: CrossEntropyWithSoftmax = 0.16643303 * 250; EvalClassificationError = 0.09600000 * 250; time = 0.0319s; samplesPerSecond = 7832.3
MPI Rank 3: 01/16/2018 19:05:29:  Epoch[ 2 of 4]-Minibatch[ 381- 390, 97.50%]: CrossEntropyWithSoftmax = 0.20440407 * 250; EvalClassificationError = 0.11200000 * 250; time = 0.0257s; samplesPerSecond = 9719.0
MPI Rank 3: 01/16/2018 19:05:29:  Epoch[ 2 of 4]-Minibatch[ 391- 400, 100.00%]: CrossEntropyWithSoftmax = 0.14566238 * 250; EvalClassificationError = 0.06800000 * 250; time = 0.0206s; samplesPerSecond = 12119.9
MPI Rank 3: 01/16/2018 19:05:29: Finished Epoch[ 2 of 4]: [Training] CrossEntropyWithSoftmax = 0.20373031 * 10000; EvalClassificationError = 0.08270000 * 10000; totalSamplesSeen = 20000; learningRatePerSample = 0.0080000004; epochTime=1.08565s
MPI Rank 3: 
MPI Rank 3: 01/16/2018 19:05:29: Starting Epoch 3: learning rate per sample = 0.008000  effective momentum = 0.900000  momentum as time constant = 237.3 samples
MPI Rank 3: 
MPI Rank 3: 01/16/2018 19:05:29: Starting minibatch loop, DataParallelSGD training (myRank = 3, numNodes = 4, numGradientBits = 32), distributed reading is ENABLED.
MPI Rank 3: 01/16/2018 19:05:29:  Epoch[ 3 of 4]-Minibatch[   1-  10, 2.50%]: CrossEntropyWithSoftmax = 0.12590085 * 250; EvalClassificationError = 0.05600000 * 250; time = 0.0251s; samplesPerSecond = 9957.0
MPI Rank 3: 01/16/2018 19:05:29:  Epoch[ 3 of 4]-Minibatch[  11-  20, 5.00%]: CrossEntropyWithSoftmax = 0.17780229 * 250; EvalClassificationError = 0.09600000 * 250; time = 0.0224s; samplesPerSecond = 11183.8
MPI Rank 3: 01/16/2018 19:05:29:  Epoch[ 3 of 4]-Minibatch[  21-  30, 7.50%]: CrossEntropyWithSoftmax = 0.14417637 * 250; EvalClassificationError = 0.07600000 * 250; time = 0.0296s; samplesPerSecond = 8445.7
MPI Rank 3: 01/16/2018 19:05:29:  Epoch[ 3 of 4]-Minibatch[  31-  40, 10.00%]: CrossEntropyWithSoftmax = 0.15796897 * 250; EvalClassificationError = 0.06400000 * 250; time = 0.0245s; samplesPerSecond = 10214.6
MPI Rank 3: 01/16/2018 19:05:29:  Epoch[ 3 of 4]-Minibatch[  41-  50, 12.50%]: CrossEntropyWithSoftmax = 0.17002999 * 250; EvalClassificationError = 0.10000000 * 250; time = 0.0264s; samplesPerSecond = 9460.9
MPI Rank 3: 01/16/2018 19:05:29:  Epoch[ 3 of 4]-Minibatch[  51-  60, 15.00%]: CrossEntropyWithSoftmax = 0.18262114 * 250; EvalClassificationError = 0.07600000 * 250; time = 0.0289s; samplesPerSecond = 8642.5
MPI Rank 3: 01/16/2018 19:05:29:  Epoch[ 3 of 4]-Minibatch[  61-  70, 17.50%]: CrossEntropyWithSoftmax = 0.14643695 * 250; EvalClassificationError = 0.07200000 * 250; time = 0.0217s; samplesPerSecond = 11516.0
MPI Rank 3: 01/16/2018 19:05:29:  Epoch[ 3 of 4]-Minibatch[  71-  80, 20.00%]: CrossEntropyWithSoftmax = 0.18030529 * 250; EvalClassificationError = 0.09600000 * 250; time = 0.0261s; samplesPerSecond = 9595.7
MPI Rank 3: 01/16/2018 19:05:29:  Epoch[ 3 of 4]-Minibatch[  81-  90, 22.50%]: CrossEntropyWithSoftmax = 0.15846151 * 250; EvalClassificationError = 0.07200000 * 250; time = 0.0237s; samplesPerSecond = 10557.9
MPI Rank 3: 01/16/2018 19:05:29:  Epoch[ 3 of 4]-Minibatch[  91- 100, 25.00%]: CrossEntropyWithSoftmax = 0.14486534 * 250; EvalClassificationError = 0.07200000 * 250; time = 0.0283s; samplesPerSecond = 8823.1
MPI Rank 3: 01/16/2018 19:05:29:  Epoch[ 3 of 4]-Minibatch[ 101- 110, 27.50%]: CrossEntropyWithSoftmax = 0.13469094 * 250; EvalClassificationError = 0.05200000 * 250; time = 0.0236s; samplesPerSecond = 10578.9
MPI Rank 3: 01/16/2018 19:05:29:  Epoch[ 3 of 4]-Minibatch[ 111- 120, 30.00%]: CrossEntropyWithSoftmax = 0.13720020 * 250; EvalClassificationError = 0.06400000 * 250; time = 0.0421s; samplesPerSecond = 5943.3
MPI Rank 3: 01/16/2018 19:05:29:  Epoch[ 3 of 4]-Minibatch[ 121- 130, 32.50%]: CrossEntropyWithSoftmax = 0.11641296 * 250; EvalClassificationError = 0.05600000 * 250; time = 0.0317s; samplesPerSecond = 7878.1
MPI Rank 3: 01/16/2018 19:05:29:  Epoch[ 3 of 4]-Minibatch[ 131- 140, 35.00%]: CrossEntropyWithSoftmax = 0.16786646 * 250; EvalClassificationError = 0.08800000 * 250; time = 0.0240s; samplesPerSecond = 10419.1
MPI Rank 3: 01/16/2018 19:05:29:  Epoch[ 3 of 4]-Minibatch[ 141- 150, 37.50%]: CrossEntropyWithSoftmax = 0.12811514 * 250; EvalClassificationError = 0.04800000 * 250; time = 0.0219s; samplesPerSecond = 11408.1
MPI Rank 3: 01/16/2018 19:05:29:  Epoch[ 3 of 4]-Minibatch[ 151- 160, 40.00%]: CrossEntropyWithSoftmax = 0.17257851 * 250; EvalClassificationError = 0.08000000 * 250; time = 0.0283s; samplesPerSecond = 8844.0
MPI Rank 3: 01/16/2018 19:05:29:  Epoch[ 3 of 4]-Minibatch[ 161- 170, 42.50%]: CrossEntropyWithSoftmax = 0.17623655 * 250; EvalClassificationError = 0.09600000 * 250; time = 0.0217s; samplesPerSecond = 11536.3
MPI Rank 3: 01/16/2018 19:05:29:  Epoch[ 3 of 4]-Minibatch[ 171- 180, 45.00%]: CrossEntropyWithSoftmax = 0.14121117 * 250; EvalClassificationError = 0.06400000 * 250; time = 0.0243s; samplesPerSecond = 10277.8
MPI Rank 3: 01/16/2018 19:05:29:  Epoch[ 3 of 4]-Minibatch[ 181- 190, 47.50%]: CrossEntropyWithSoftmax = 0.19243443 * 250; EvalClassificationError = 0.10000000 * 250; time = 0.0297s; samplesPerSecond = 8428.6
MPI Rank 3: 01/16/2018 19:05:29:  Epoch[ 3 of 4]-Minibatch[ 191- 200, 50.00%]: CrossEntropyWithSoftmax = 0.20908162 * 250; EvalClassificationError = 0.10000000 * 250; time = 0.0258s; samplesPerSecond = 9701.3
MPI Rank 3: 01/16/2018 19:05:29:  Epoch[ 3 of 4]-Minibatch[ 201- 210, 52.50%]: CrossEntropyWithSoftmax = 0.18472067 * 250; EvalClassificationError = 0.08000000 * 250; time = 0.0255s; samplesPerSecond = 9794.7
MPI Rank 3: 01/16/2018 19:05:29:  Epoch[ 3 of 4]-Minibatch[ 211- 220, 55.00%]: CrossEntropyWithSoftmax = 0.18185536 * 250; EvalClassificationError = 0.07600000 * 250; time = 0.0305s; samplesPerSecond = 8194.7
MPI Rank 3: 01/16/2018 19:05:29:  Epoch[ 3 of 4]-Minibatch[ 221- 230, 57.50%]: CrossEntropyWithSoftmax = 0.14074205 * 250; EvalClassificationError = 0.06000000 * 250; time = 0.0279s; samplesPerSecond = 8972.0
MPI Rank 3: 01/16/2018 19:05:29:  Epoch[ 3 of 4]-Minibatch[ 231- 240, 60.00%]: CrossEntropyWithSoftmax = 0.14871620 * 250; EvalClassificationError = 0.07600000 * 250; time = 0.0280s; samplesPerSecond = 8938.5
MPI Rank 3: 01/16/2018 19:05:29:  Epoch[ 3 of 4]-Minibatch[ 241- 250, 62.50%]: CrossEntropyWithSoftmax = 0.20299705 * 250; EvalClassificationError = 0.11200000 * 250; time = 0.0288s; samplesPerSecond = 8672.5
MPI Rank 3: 01/16/2018 19:05:29:  Epoch[ 3 of 4]-Minibatch[ 251- 260, 65.00%]: CrossEntropyWithSoftmax = 0.12852038 * 250; EvalClassificationError = 0.07200000 * 250; time = 0.0232s; samplesPerSecond = 10768.1
MPI Rank 3: 01/16/2018 19:05:30:  Epoch[ 3 of 4]-Minibatch[ 261- 270, 67.50%]: CrossEntropyWithSoftmax = 0.18660439 * 250; EvalClassificationError = 0.11600000 * 250; time = 0.0317s; samplesPerSecond = 7878.7
MPI Rank 3: 01/16/2018 19:05:30:  Epoch[ 3 of 4]-Minibatch[ 271- 280, 70.00%]: CrossEntropyWithSoftmax = 0.19575997 * 250; EvalClassificationError = 0.08800000 * 250; time = 0.0243s; samplesPerSecond = 10286.3
MPI Rank 3: 01/16/2018 19:05:30:  Epoch[ 3 of 4]-Minibatch[ 281- 290, 72.50%]: CrossEntropyWithSoftmax = 0.16667676 * 250; EvalClassificationError = 0.06800000 * 250; time = 0.0276s; samplesPerSecond = 9069.5
MPI Rank 3: 01/16/2018 19:05:30:  Epoch[ 3 of 4]-Minibatch[ 291- 300, 75.00%]: CrossEntropyWithSoftmax = 0.12526168 * 250; EvalClassificationError = 0.04400000 * 250; time = 0.0221s; samplesPerSecond = 11301.5
MPI Rank 3: 01/16/2018 19:05:30:  Epoch[ 3 of 4]-Minibatch[ 301- 310, 77.50%]: CrossEntropyWithSoftmax = 0.17392132 * 250; EvalClassificationError = 0.08800000 * 250; time = 0.0283s; samplesPerSecond = 8827.1
MPI Rank 3: 01/16/2018 19:05:30:  Epoch[ 3 of 4]-Minibatch[ 311- 320, 80.00%]: CrossEntropyWithSoftmax = 0.12281615 * 250; EvalClassificationError = 0.05200000 * 250; time = 0.0364s; samplesPerSecond = 6874.3
MPI Rank 3: 01/16/2018 19:05:30:  Epoch[ 3 of 4]-Minibatch[ 321- 330, 82.50%]: CrossEntropyWithSoftmax = 0.14759390 * 250; EvalClassificationError = 0.06000000 * 250; time = 0.0246s; samplesPerSecond = 10176.9
MPI Rank 3: 01/16/2018 19:05:30:  Epoch[ 3 of 4]-Minibatch[ 331- 340, 85.00%]: CrossEntropyWithSoftmax = 0.19801301 * 250; EvalClassificationError = 0.09200000 * 250; time = 0.0227s; samplesPerSecond = 11025.7
MPI Rank 3: 01/16/2018 19:05:30:  Epoch[ 3 of 4]-Minibatch[ 341- 350, 87.50%]: CrossEntropyWithSoftmax = 0.12593395 * 250; EvalClassificationError = 0.05200000 * 250; time = 0.0237s; samplesPerSecond = 10569.2
MPI Rank 3: 01/16/2018 19:05:30:  Epoch[ 3 of 4]-Minibatch[ 351- 360, 90.00%]: CrossEntropyWithSoftmax = 0.13756617 * 250; EvalClassificationError = 0.06000000 * 250; time = 0.0258s; samplesPerSecond = 9693.2
MPI Rank 3: 01/16/2018 19:05:30:  Epoch[ 3 of 4]-Minibatch[ 361- 370, 92.50%]: CrossEntropyWithSoftmax = 0.12838525 * 250; EvalClassificationError = 0.06000000 * 250; time = 0.0259s; samplesPerSecond = 9663.6
MPI Rank 3: 01/16/2018 19:05:30:  Epoch[ 3 of 4]-Minibatch[ 371- 380, 95.00%]: CrossEntropyWithSoftmax = 0.16654369 * 250; EvalClassificationError = 0.09600000 * 250; time = 0.0287s; samplesPerSecond = 8716.5
MPI Rank 3: 01/16/2018 19:05:30:  Epoch[ 3 of 4]-Minibatch[ 381- 390, 97.50%]: CrossEntropyWithSoftmax = 0.20658951 * 250; EvalClassificationError = 0.11600000 * 250; time = 0.0275s; samplesPerSecond = 9097.4
MPI Rank 3: 01/16/2018 19:05:30:  Epoch[ 3 of 4]-Minibatch[ 391- 400, 100.00%]: CrossEntropyWithSoftmax = 0.14583322 * 250; EvalClassificationError = 0.06800000 * 250; time = 0.0238s; samplesPerSecond = 10516.3
MPI Rank 3: 01/16/2018 19:05:30: Finished Epoch[ 3 of 4]: [Training] CrossEntropyWithSoftmax = 0.15948618 * 10000; EvalClassificationError = 0.07660000 * 10000; totalSamplesSeen = 30000; learningRatePerSample = 0.0080000004; epochTime=1.07043s
MPI Rank 3: 
MPI Rank 3: 01/16/2018 19:05:30: Starting Epoch 4: learning rate per sample = 0.008000  effective momentum = 0.900000  momentum as time constant = 237.3 samples
MPI Rank 3: 
MPI Rank 3: 01/16/2018 19:05:30: Starting minibatch loop, DataParallelSGD training (myRank = 3, numNodes = 4, numGradientBits = 32), distributed reading is ENABLED.
MPI Rank 3: 01/16/2018 19:05:30:  Epoch[ 4 of 4]-Minibatch[   1-  10, 2.50%]: CrossEntropyWithSoftmax = 0.12371232 * 250; EvalClassificationError = 0.06000000 * 250; time = 0.0277s; samplesPerSecond = 9038.8
MPI Rank 3: 01/16/2018 19:05:30:  Epoch[ 4 of 4]-Minibatch[  11-  20, 5.00%]: CrossEntropyWithSoftmax = 0.18070514 * 250; EvalClassificationError = 0.09600000 * 250; time = 0.0192s; samplesPerSecond = 13040.0
MPI Rank 3: 01/16/2018 19:05:30:  Epoch[ 4 of 4]-Minibatch[  21-  30, 7.50%]: CrossEntropyWithSoftmax = 0.14239731 * 250; EvalClassificationError = 0.07600000 * 250; time = 0.0274s; samplesPerSecond = 9136.3
MPI Rank 3: 01/16/2018 19:05:30:  Epoch[ 4 of 4]-Minibatch[  31-  40, 10.00%]: CrossEntropyWithSoftmax = 0.15630155 * 250; EvalClassificationError = 0.06400000 * 250; time = 0.0272s; samplesPerSecond = 9191.0
MPI Rank 3: 01/16/2018 19:05:30:  Epoch[ 4 of 4]-Minibatch[  41-  50, 12.50%]: CrossEntropyWithSoftmax = 0.16935525 * 250; EvalClassificationError = 0.09600000 * 250; time = 0.0245s; samplesPerSecond = 10222.2
MPI Rank 3: 01/16/2018 19:05:30:  Epoch[ 4 of 4]-Minibatch[  51-  60, 15.00%]: CrossEntropyWithSoftmax = 0.18198833 * 250; EvalClassificationError = 0.08000000 * 250; time = 0.0295s; samplesPerSecond = 8466.1
MPI Rank 3: 01/16/2018 19:05:30:  Epoch[ 4 of 4]-Minibatch[  61-  70, 17.50%]: CrossEntropyWithSoftmax = 0.14475946 * 250; EvalClassificationError = 0.07200000 * 250; time = 0.0236s; samplesPerSecond = 10604.2
MPI Rank 3: 01/16/2018 19:05:30:  Epoch[ 4 of 4]-Minibatch[  71-  80, 20.00%]: CrossEntropyWithSoftmax = 0.18021601 * 250; EvalClassificationError = 0.09600000 * 250; time = 0.0248s; samplesPerSecond = 10065.3
MPI Rank 3: 01/16/2018 19:05:30:  Epoch[ 4 of 4]-Minibatch[  81-  90, 22.50%]: CrossEntropyWithSoftmax = 0.15849308 * 250; EvalClassificationError = 0.07600000 * 250; time = 0.0296s; samplesPerSecond = 8436.8
MPI Rank 3: 01/16/2018 19:05:30:  Epoch[ 4 of 4]-Minibatch[  91- 100, 25.00%]: CrossEntropyWithSoftmax = 0.14474426 * 250; EvalClassificationError = 0.07200000 * 250; time = 0.0314s; samplesPerSecond = 7954.0
MPI Rank 3: 01/16/2018 19:05:30:  Epoch[ 4 of 4]-Minibatch[ 101- 110, 27.50%]: CrossEntropyWithSoftmax = 0.13362926 * 250; EvalClassificationError = 0.05200000 * 250; time = 0.0259s; samplesPerSecond = 9655.2
MPI Rank 3: 01/16/2018 19:05:30:  Epoch[ 4 of 4]-Minibatch[ 111- 120, 30.00%]: CrossEntropyWithSoftmax = 0.13708299 * 250; EvalClassificationError = 0.06400000 * 250; time = 0.0243s; samplesPerSecond = 10304.6
MPI Rank 3: 01/16/2018 19:05:30:  Epoch[ 4 of 4]-Minibatch[ 121- 130, 32.50%]: CrossEntropyWithSoftmax = 0.11569776 * 250; EvalClassificationError = 0.05600000 * 250; time = 0.0257s; samplesPerSecond = 9711.2
MPI Rank 3: 01/16/2018 19:05:30:  Epoch[ 4 of 4]-Minibatch[ 131- 140, 35.00%]: CrossEntropyWithSoftmax = 0.16892331 * 250; EvalClassificationError = 0.09200000 * 250; time = 0.0221s; samplesPerSecond = 11312.4
MPI Rank 3: 01/16/2018 19:05:30:  Epoch[ 4 of 4]-Minibatch[ 141- 150, 37.50%]: CrossEntropyWithSoftmax = 0.12752163 * 250; EvalClassificationError = 0.04800000 * 250; time = 0.0257s; samplesPerSecond = 9732.6
MPI Rank 3: 01/16/2018 19:05:30:  Epoch[ 4 of 4]-Minibatch[ 151- 160, 40.00%]: CrossEntropyWithSoftmax = 0.17100866 * 250; EvalClassificationError = 0.08400000 * 250; time = 0.0257s; samplesPerSecond = 9742.5
MPI Rank 3: 01/16/2018 19:05:30:  Epoch[ 4 of 4]-Minibatch[ 161- 170, 42.50%]: CrossEntropyWithSoftmax = 0.17660425 * 250; EvalClassificationError = 0.10000000 * 250; time = 0.0244s; samplesPerSecond = 10254.8
MPI Rank 3: 01/16/2018 19:05:30:  Epoch[ 4 of 4]-Minibatch[ 171- 180, 45.00%]: CrossEntropyWithSoftmax = 0.14105803 * 250; EvalClassificationError = 0.06400000 * 250; time = 0.0218s; samplesPerSecond = 11459.6
MPI Rank 3: 01/16/2018 19:05:30:  Epoch[ 4 of 4]-Minibatch[ 181- 190, 47.50%]: CrossEntropyWithSoftmax = 0.19333553 * 250; EvalClassificationError = 0.10000000 * 250; time = 0.0284s; samplesPerSecond = 8788.1
MPI Rank 3: 01/16/2018 19:05:30:  Epoch[ 4 of 4]-Minibatch[ 191- 200, 50.00%]: CrossEntropyWithSoftmax = 0.20859524 * 250; EvalClassificationError = 0.10000000 * 250; time = 0.0317s; samplesPerSecond = 7893.8
MPI Rank 3: 01/16/2018 19:05:30:  Epoch[ 4 of 4]-Minibatch[ 201- 210, 52.50%]: CrossEntropyWithSoftmax = 0.18499676 * 250; EvalClassificationError = 0.08000000 * 250; time = 0.0259s; samplesPerSecond = 9657.8
MPI Rank 3: 01/16/2018 19:05:30:  Epoch[ 4 of 4]-Minibatch[ 211- 220, 55.00%]: CrossEntropyWithSoftmax = 0.18152439 * 250; EvalClassificationError = 0.07600000 * 250; time = 0.0259s; samplesPerSecond = 9666.5
MPI Rank 3: 01/16/2018 19:05:30:  Epoch[ 4 of 4]-Minibatch[ 221- 230, 57.50%]: CrossEntropyWithSoftmax = 0.14037158 * 250; EvalClassificationError = 0.05600000 * 250; time = 0.0263s; samplesPerSecond = 9490.0
MPI Rank 3: 01/16/2018 19:05:30:  Epoch[ 4 of 4]-Minibatch[ 231- 240, 60.00%]: CrossEntropyWithSoftmax = 0.14866863 * 250; EvalClassificationError = 0.07600000 * 250; time = 0.0234s; samplesPerSecond = 10664.4
MPI Rank 3: 01/16/2018 19:05:31:  Epoch[ 4 of 4]-Minibatch[ 241- 250, 62.50%]: CrossEntropyWithSoftmax = 0.20347746 * 250; EvalClassificationError = 0.11200000 * 250; time = 0.0283s; samplesPerSecond = 8834.1
MPI Rank 3: 01/16/2018 19:05:31:  Epoch[ 4 of 4]-Minibatch[ 251- 260, 65.00%]: CrossEntropyWithSoftmax = 0.12815013 * 250; EvalClassificationError = 0.07200000 * 250; time = 0.0237s; samplesPerSecond = 10553.1
MPI Rank 3: 01/16/2018 19:05:31:  Epoch[ 4 of 4]-Minibatch[ 261- 270, 67.50%]: CrossEntropyWithSoftmax = 0.18672809 * 250; EvalClassificationError = 0.11600000 * 250; time = 0.0277s; samplesPerSecond = 9034.3
MPI Rank 3: 01/16/2018 19:05:31:  Epoch[ 4 of 4]-Minibatch[ 271- 280, 70.00%]: CrossEntropyWithSoftmax = 0.19552989 * 250; EvalClassificationError = 0.08400000 * 250; time = 0.0304s; samplesPerSecond = 8219.0
MPI Rank 3: 01/16/2018 19:05:31:  Epoch[ 4 of 4]-Minibatch[ 281- 290, 72.50%]: CrossEntropyWithSoftmax = 0.16452642 * 250; EvalClassificationError = 0.06800000 * 250; time = 0.0440s; samplesPerSecond = 5682.8
MPI Rank 3: 01/16/2018 19:05:31:  Epoch[ 4 of 4]-Minibatch[ 291- 300, 75.00%]: CrossEntropyWithSoftmax = 0.12461825 * 250; EvalClassificationError = 0.04400000 * 250; time = 0.0245s; samplesPerSecond = 10189.1
MPI Rank 3: 01/16/2018 19:05:31:  Epoch[ 4 of 4]-Minibatch[ 301- 310, 77.50%]: CrossEntropyWithSoftmax = 0.17285251 * 250; EvalClassificationError = 0.08400000 * 250; time = 0.0272s; samplesPerSecond = 9177.5
MPI Rank 3: 01/16/2018 19:05:31:  Epoch[ 4 of 4]-Minibatch[ 311- 320, 80.00%]: CrossEntropyWithSoftmax = 0.12253619 * 250; EvalClassificationError = 0.05200000 * 250; time = 0.0238s; samplesPerSecond = 10485.6
MPI Rank 3: 01/16/2018 19:05:31:  Epoch[ 4 of 4]-Minibatch[ 321- 330, 82.50%]: CrossEntropyWithSoftmax = 0.14723334 * 250; EvalClassificationError = 0.06000000 * 250; time = 0.0378s; samplesPerSecond = 6607.0
MPI Rank 3: 01/16/2018 19:05:31:  Epoch[ 4 of 4]-Minibatch[ 331- 340, 85.00%]: CrossEntropyWithSoftmax = 0.19789538 * 250; EvalClassificationError = 0.09200000 * 250; time = 0.0216s; samplesPerSecond = 11561.0
MPI Rank 3: 01/16/2018 19:05:31:  Epoch[ 4 of 4]-Minibatch[ 341- 350, 87.50%]: CrossEntropyWithSoftmax = 0.12575877 * 250; EvalClassificationError = 0.05200000 * 250; time = 0.0263s; samplesPerSecond = 9501.2
MPI Rank 3: 01/16/2018 19:05:31:  Epoch[ 4 of 4]-Minibatch[ 351- 360, 90.00%]: CrossEntropyWithSoftmax = 0.13745928 * 250; EvalClassificationError = 0.06000000 * 250; time = 0.0379s; samplesPerSecond = 6587.9
MPI Rank 3: 01/16/2018 19:05:31:  Epoch[ 4 of 4]-Minibatch[ 361- 370, 92.50%]: CrossEntropyWithSoftmax = 0.12839652 * 250; EvalClassificationError = 0.06000000 * 250; time = 0.0338s; samplesPerSecond = 7390.3
MPI Rank 3: 01/16/2018 19:05:31:  Epoch[ 4 of 4]-Minibatch[ 371- 380, 95.00%]: CrossEntropyWithSoftmax = 0.16647280 * 250; EvalClassificationError = 0.09600000 * 250; time = 0.0263s; samplesPerSecond = 9490.9
MPI Rank 3: 01/16/2018 19:05:31:  Epoch[ 4 of 4]-Minibatch[ 381- 390, 97.50%]: CrossEntropyWithSoftmax = 0.20679434 * 250; EvalClassificationError = 0.11600000 * 250; time = 0.0325s; samplesPerSecond = 7699.9
MPI Rank 3: 01/16/2018 19:05:31:  Epoch[ 4 of 4]-Minibatch[ 391- 400, 100.00%]: CrossEntropyWithSoftmax = 0.14585245 * 250; EvalClassificationError = 0.06400000 * 250; time = 0.0266s; samplesPerSecond = 9392.6
MPI Rank 3: 01/16/2018 19:05:31: Finished Epoch[ 4 of 4]: [Training] CrossEntropyWithSoftmax = 0.15914931 * 10000; EvalClassificationError = 0.07670000 * 10000; totalSamplesSeen = 40000; learningRatePerSample = 0.0080000004; epochTime=1.0986s
MPI Rank 3: 
MPI Rank 3: 01/16/2018 19:05:31: Action "train" complete.
MPI Rank 3: 
MPI Rank 3: 01/16/2018 19:05:31: __COMPLETED__