CPU info:
    CPU Model Name: Intel(R) Xeon(R) CPU E5-2690 v3 @ 2.60GHz
    Hardware threads: 12
    Total Memory: 57691188 kB
-------------------------------------------------------------------
=== Running mpiexec -n 3 /home/ubuntu/workspace/build/gpu/debug/bin/cntk configFile=/home/ubuntu/workspace/Tests/EndToEndTests/Speech/DNN/Parallel1BitQuantization/../cntk.cntk currentDirectory=/home/ubuntu/workspace/Tests/EndToEndTests/Speech/Data RunDir=/tmp/cntk-test-20180117180725.10946/Speech/DNN_Parallel1BitQuantization@debug_gpu DataDir=/home/ubuntu/workspace/Tests/EndToEndTests/Speech/Data ConfigDir=/home/ubuntu/workspace/Tests/EndToEndTests/Speech/DNN/Parallel1BitQuantization/.. OutputDir=/tmp/cntk-test-20180117180725.10946/Speech/DNN_Parallel1BitQuantization@debug_gpu DeviceId=0 timestamping=true numCPUThreads=4 precision=double speechTrain=[SGD=[ParallelTrain=[DataParallelSGD=[gradientBits=1]]]] speechTrain=[SGD=[ParallelTrain=[parallelizationStartEpoch=2]]] stderr=/tmp/cntk-test-20180117180725.10946/Speech/DNN_Parallel1BitQuantization@debug_gpu/stderr
CNTK 2.3.1+ (HEAD 8663d3, Jan 17 2018 06:43:13) at 2018/01/17 18:07:26

/home/ubuntu/workspace/build/gpu/debug/bin/cntk  configFile=/home/ubuntu/workspace/Tests/EndToEndTests/Speech/DNN/Parallel1BitQuantization/../cntk.cntk  currentDirectory=/home/ubuntu/workspace/Tests/EndToEndTests/Speech/Data  RunDir=/tmp/cntk-test-20180117180725.10946/Speech/DNN_Parallel1BitQuantization@debug_gpu  DataDir=/home/ubuntu/workspace/Tests/EndToEndTests/Speech/Data  ConfigDir=/home/ubuntu/workspace/Tests/EndToEndTests/Speech/DNN/Parallel1BitQuantization/..  OutputDir=/tmp/cntk-test-20180117180725.10946/Speech/DNN_Parallel1BitQuantization@debug_gpu  DeviceId=0  timestamping=true  numCPUThreads=4  precision=double  speechTrain=[SGD=[ParallelTrain=[DataParallelSGD=[gradientBits=1]]]]  speechTrain=[SGD=[ParallelTrain=[parallelizationStartEpoch=2]]]  stderr=/tmp/cntk-test-20180117180725.10946/Speech/DNN_Parallel1BitQuantization@debug_gpu/stderr
CNTK 2.3.1+ (HEAD 8663d3, Jan 17 2018 06:43:13) at 2018/01/17 18:07:26

/home/ubuntu/workspace/build/gpu/debug/bin/cntk  configFile=/home/ubuntu/workspace/Tests/EndToEndTests/Speech/DNN/Parallel1BitQuantization/../cntk.cntk  currentDirectory=/home/ubuntu/workspace/Tests/EndToEndTests/Speech/Data  RunDir=/tmp/cntk-test-20180117180725.10946/Speech/DNN_Parallel1BitQuantization@debug_gpu  DataDir=/home/ubuntu/workspace/Tests/EndToEndTests/Speech/Data  ConfigDir=/home/ubuntu/workspace/Tests/EndToEndTests/Speech/DNN/Parallel1BitQuantization/..  OutputDir=/tmp/cntk-test-20180117180725.10946/Speech/DNN_Parallel1BitQuantization@debug_gpu  DeviceId=0  timestamping=true  numCPUThreads=4  precision=double  speechTrain=[SGD=[ParallelTrain=[DataParallelSGD=[gradientBits=1]]]]  speechTrain=[SGD=[ParallelTrain=[parallelizationStartEpoch=2]]]  stderr=/tmp/cntk-test-20180117180725.10946/Speech/DNN_Parallel1BitQuantization@debug_gpu/stderr
Changed current directory to /home/ubuntu/workspace/Tests/EndToEndTests/Speech/Data
Changed current directory to /home/ubuntu/workspace/Tests/EndToEndTests/Speech/Data
CNTK 2.3.1+ (HEAD 8663d3, Jan 17 2018 06:43:13) at 2018/01/17 18:07:26

/home/ubuntu/workspace/build/gpu/debug/bin/cntk  configFile=/home/ubuntu/workspace/Tests/EndToEndTests/Speech/DNN/Parallel1BitQuantization/../cntk.cntk  currentDirectory=/home/ubuntu/workspace/Tests/EndToEndTests/Speech/Data  RunDir=/tmp/cntk-test-20180117180725.10946/Speech/DNN_Parallel1BitQuantization@debug_gpu  DataDir=/home/ubuntu/workspace/Tests/EndToEndTests/Speech/Data  ConfigDir=/home/ubuntu/workspace/Tests/EndToEndTests/Speech/DNN/Parallel1BitQuantization/..  OutputDir=/tmp/cntk-test-20180117180725.10946/Speech/DNN_Parallel1BitQuantization@debug_gpu  DeviceId=0  timestamping=true  numCPUThreads=4  precision=double  speechTrain=[SGD=[ParallelTrain=[DataParallelSGD=[gradientBits=1]]]]  speechTrain=[SGD=[ParallelTrain=[parallelizationStartEpoch=2]]]  stderr=/tmp/cntk-test-20180117180725.10946/Speech/DNN_Parallel1BitQuantization@debug_gpu/stderr
Changed current directory to /home/ubuntu/workspace/Tests/EndToEndTests/Speech/Data
--------------------------------------------------------------------------
[[1674,1],0]: A high-performance Open MPI point-to-point messaging module
was unable to find any relevant network interfaces:

Module: OpenFabrics (openib)
  Host: 50c7bce59d98

Another transport will be used instead, although this may result in
lower performance.
--------------------------------------------------------------------------
ping [requestnodes (before change)]: 3 nodes pinging each other
ping [requestnodes (before change)]: 3 nodes pinging each other
ping [requestnodes (before change)]: 3 nodes pinging each other
ping [requestnodes (after change)]: 3 nodes pinging each other
requestnodes [MPIWrapperMpi]: using 3 out of 3 MPI nodes on a single host (3 requested); we (2) are in (participating)
ping [mpihelper]: 3 nodes pinging each other
ping [requestnodes (after change)]: 3 nodes pinging each other
requestnodes [MPIWrapperMpi]: using 3 out of 3 MPI nodes on a single host (3 requested); we (0) are in (participating)
ping [mpihelper]: 3 nodes pinging each other
ping [requestnodes (after change)]: 3 nodes pinging each other
requestnodes [MPIWrapperMpi]: using 3 out of 3 MPI nodes on a single host (3 requested); we (1) are in (participating)
ping [mpihelper]: 3 nodes pinging each other
01/17/2018 18:07:26: Redirecting stderr to file /tmp/cntk-test-20180117180725.10946/Speech/DNN_Parallel1BitQuantization@debug_gpu/stderr_speechTrain.logrank0
01/17/2018 18:07:26: Redirecting stderr to file /tmp/cntk-test-20180117180725.10946/Speech/DNN_Parallel1BitQuantization@debug_gpu/stderr_speechTrain.logrank1
01/17/2018 18:07:27: Redirecting stderr to file /tmp/cntk-test-20180117180725.10946/Speech/DNN_Parallel1BitQuantization@debug_gpu/stderr_speechTrain.logrank2
[50c7bce59d98:00039] 2 more processes have sent help message help-mpi-btl-base.txt / btl:no-nics
[50c7bce59d98:00039] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
MPI Rank 0: CNTK 2.3.1+ (HEAD 8663d3, Jan 17 2018 06:43:13) at 2018/01/17 18:07:26
MPI Rank 0: 
MPI Rank 0: /home/ubuntu/workspace/build/gpu/debug/bin/cntk  configFile=/home/ubuntu/workspace/Tests/EndToEndTests/Speech/DNN/Parallel1BitQuantization/../cntk.cntk  currentDirectory=/home/ubuntu/workspace/Tests/EndToEndTests/Speech/Data  RunDir=/tmp/cntk-test-20180117180725.10946/Speech/DNN_Parallel1BitQuantization@debug_gpu  DataDir=/home/ubuntu/workspace/Tests/EndToEndTests/Speech/Data  ConfigDir=/home/ubuntu/workspace/Tests/EndToEndTests/Speech/DNN/Parallel1BitQuantization/..  OutputDir=/tmp/cntk-test-20180117180725.10946/Speech/DNN_Parallel1BitQuantization@debug_gpu  DeviceId=0  timestamping=true  numCPUThreads=4  precision=double  speechTrain=[SGD=[ParallelTrain=[DataParallelSGD=[gradientBits=1]]]]  speechTrain=[SGD=[ParallelTrain=[parallelizationStartEpoch=2]]]  stderr=/tmp/cntk-test-20180117180725.10946/Speech/DNN_Parallel1BitQuantization@debug_gpu/stderr
MPI Rank 0: 01/17/2018 18:07:26: -------------------------------------------------------------------
MPI Rank 0: 01/17/2018 18:07:26: Build info: 
MPI Rank 0: 
MPI Rank 0: 01/17/2018 18:07:26: 		Built time: Jan 17 2018 06:40:26
MPI Rank 0: 01/17/2018 18:07:26: 		Last modified date: Wed Jan 17 06:39:51 2018
MPI Rank 0: 01/17/2018 18:07:26: 		Build type: debug
MPI Rank 0: 01/17/2018 18:07:26: 		Build target: GPU
MPI Rank 0: 01/17/2018 18:07:26: 		With ASGD: yes
MPI Rank 0: 01/17/2018 18:07:26: 		Math lib: mkl
MPI Rank 0: 01/17/2018 18:07:26: 		CUDA version: 9.0.0
MPI Rank 0: 01/17/2018 18:07:26: 		CUDNN version: 7.0.4
MPI Rank 0: 01/17/2018 18:07:26: 		Build Branch: HEAD
MPI Rank 0: 01/17/2018 18:07:26: 		Build SHA1: 8663d3ffe597a4c2dc25de7a1ba1eabee3e96b2f
MPI Rank 0: 01/17/2018 18:07:26: 		MPI distribution: Open MPI
MPI Rank 0: 01/17/2018 18:07:26: 		MPI version: 1.10.7
MPI Rank 0: 01/17/2018 18:07:26: -------------------------------------------------------------------
MPI Rank 0: 01/17/2018 18:07:26: -------------------------------------------------------------------
MPI Rank 0: 01/17/2018 18:07:26: GPU info:
MPI Rank 0: 
MPI Rank 0: 01/17/2018 18:07:26: 		Device[0]: cores = 3072; computeCapability = 5.2; type = "Tesla M60"; total memory = 8123 MB; free memory = 8112 MB
MPI Rank 0: 01/17/2018 18:07:26: -------------------------------------------------------------------
MPI Rank 0: 01/17/2018 18:07:26: Using 4 CPU threads.
MPI Rank 0: 
MPI Rank 0: 01/17/2018 18:07:26: ##############################################################################
MPI Rank 0: 01/17/2018 18:07:26: #                                                                            #
MPI Rank 0: 01/17/2018 18:07:26: # speechTrain command (train action)                                         #
MPI Rank 0: 01/17/2018 18:07:26: #                                                                            #
MPI Rank 0: 01/17/2018 18:07:26: ##############################################################################
MPI Rank 0: 
MPI Rank 0: 01/17/2018 18:07:26: 
MPI Rank 0: Creating virgin network.
MPI Rank 0: SimpleNetworkBuilder Using GPU 0
MPI Rank 0: Reading script file glob_0000.scp ... 948 entries
MPI Rank 0: HTKDeserializer: selected '948' utterances grouped into '3' chunks, average chunk size: 316.0 utterances, 84244.7 frames (for I/O: 316.0 utterances, 84244.7 frames)
MPI Rank 0: HTKDeserializer: determined feature kind as '33'-dimensional 'USER' with frame shift 10.0 ms
MPI Rank 0: Total (133) state names in state list '/home/ubuntu/workspace/Tests/EndToEndTests/Speech/Data/state.list'
MPI Rank 0: MLFDeserializer: '948' utterances with '252734' frames
MPI Rank 0: 01/17/2018 18:07:47: 
MPI Rank 0: Model has 25 nodes. Using GPU 0.
MPI Rank 0: 
MPI Rank 0: 01/17/2018 18:07:47: Training criterion:   CrossEntropyWithSoftmax = CrossEntropyWithSoftmax
MPI Rank 0: 01/17/2018 18:07:47: Evaluation criterion: EvalClassificationError = ClassificationError
MPI Rank 0: 
MPI Rank 0: 
MPI Rank 0: Allocating matrices for forward and/or backward propagation.
MPI Rank 0: 
MPI Rank 0: Gradient Memory Aliasing: 4 are aliased.
MPI Rank 0: 	W2*H1 (gradient) reuses HLast (gradient)
MPI Rank 0: 	W1*H1 (gradient) reuses W1*H1+B1 (gradient)
MPI Rank 0: 
MPI Rank 0: Memory Sharing: Out of 40 matrices, 21 are shared as 5, and 19 are not shared.
MPI Rank 0: 
MPI Rank 0: Here are the ones that share memory:
MPI Rank 0: 	{ PosteriorProb : [132 x 1 x *]
MPI Rank 0: 	  ScaledLogLikelihood : [132 x 1 x *] }
MPI Rank 0: 	{ H1 : [512 x 1 x *] (gradient)
MPI Rank 0: 	  H2 : [512 x 1 x *] (gradient)
MPI Rank 0: 	  HLast : [132 x 1 x *]
MPI Rank 0: 	  W0*features : [512 x *]
MPI Rank 0: 	  W0*features : [512 x *] (gradient) }
MPI Rank 0: 	{ B0 : [512 x 1] (gradient)
MPI Rank 0: 	  H1 : [512 x 1 x *] }
MPI Rank 0: 	{ HLast : [132 x 1 x *] (gradient)
MPI Rank 0: 	  W0 : [512 x 363] (gradient)
MPI Rank 0: 	  W0*features+B0 : [512 x 1 x *] (gradient)
MPI Rank 0: 	  W1*H1 : [512 x 1 x *] (gradient)
MPI Rank 0: 	  W1*H1+B1 : [512 x 1 x *]
MPI Rank 0: 	  W1*H1+B1 : [512 x 1 x *] (gradient)
MPI Rank 0: 	  W2*H1 : [132 x 1 x *]
MPI Rank 0: 	  W2*H1 : [132 x 1 x *] (gradient) }
MPI Rank 0: 	{ H2 : [512 x 1 x *]
MPI Rank 0: 	  W0*features+B0 : [512 x 1 x *]
MPI Rank 0: 	  W1 : [512 x 512] (gradient)
MPI Rank 0: 	  W1*H1 : [512 x 1 x *] }
MPI Rank 0: 
MPI Rank 0: Here are the ones that don't share memory:
MPI Rank 0: 	{features : [363 x *]}
MPI Rank 0: 	{InvStdOfFeatures : [363]}
MPI Rank 0: 	{W0 : [512 x 363]}
MPI Rank 0: 	{MeanOfFeatures : [363]}
MPI Rank 0: 	{W2 : [132 x 512]}
MPI Rank 0: 	{B2 : [132 x 1]}
MPI Rank 0: 	{labels : [132 x *]}
MPI Rank 0: 	{Prior : [132]}
MPI Rank 0: 	{B0 : [512 x 1]}
MPI Rank 0: 	{W1 : [512 x 512]}
MPI Rank 0: 	{B1 : [512 x 1]}
MPI Rank 0: 	{LogOfPrior : [132]}
MPI Rank 0: 	{CrossEntropyWithSoftmax : [1]}
MPI Rank 0: 	{EvalClassificationError : [1]}
MPI Rank 0: 	{W2 : [132 x 512] (gradient)}
MPI Rank 0: 	{MVNormalizedFeatures : [363 x *]}
MPI Rank 0: 	{B1 : [512 x 1] (gradient)}
MPI Rank 0: 	{B2 : [132 x 1] (gradient)}
MPI Rank 0: 	{CrossEntropyWithSoftmax : [1] (gradient)}
MPI Rank 0: 
MPI Rank 0: 
MPI Rank 0: 01/17/2018 18:07:47: Training 516740 parameters in 6 out of 6 parameter tensors and 15 nodes with gradient:
MPI Rank 0: 
MPI Rank 0: 01/17/2018 18:07:47: 	Node 'B0' (LearnableParameter operation) : [512 x 1]
MPI Rank 0: 01/17/2018 18:07:47: 	Node 'B1' (LearnableParameter operation) : [512 x 1]
MPI Rank 0: 01/17/2018 18:07:47: 	Node 'B2' (LearnableParameter operation) : [132 x 1]
MPI Rank 0: 01/17/2018 18:07:47: 	Node 'W0' (LearnableParameter operation) : [512 x 363]
MPI Rank 0: 01/17/2018 18:07:47: 	Node 'W1' (LearnableParameter operation) : [512 x 512]
MPI Rank 0: 01/17/2018 18:07:47: 	Node 'W2' (LearnableParameter operation) : [132 x 512]
MPI Rank 0: 
MPI Rank 0: Initializing dataParallelSGD for 1-bit quantization.
MPI Rank 0: 
MPI Rank 0: 01/17/2018 18:07:47: Precomputing --> 3 PreCompute nodes found.
MPI Rank 0: 
MPI Rank 0: 01/17/2018 18:07:47: 	MeanOfFeatures = Mean()
MPI Rank 0: 01/17/2018 18:07:47: 	InvStdOfFeatures = InvStdDev()
MPI Rank 0: 01/17/2018 18:07:47: 	Prior = Mean()
MPI Rank 0: 
MPI Rank 0: 01/17/2018 18:07:57: Precomputing --> Completed.
MPI Rank 0: 
MPI Rank 0: 
MPI Rank 0: 01/17/2018 18:08:08: Starting Epoch 1: learning rate per sample = 0.015625  effective momentum = 0.900000  momentum as time constant = 607.4 samples
MPI Rank 0: 
MPI Rank 0: 01/17/2018 18:08:08: Starting minibatch loop.
MPI Rank 0: 01/17/2018 18:08:08:  Epoch[ 1 of 3]-Minibatch[   1-  10, 3.12%]: CrossEntropyWithSoftmax = 4.62512789 * 640; EvalClassificationError = 0.94062500 * 640; time = 0.2423s; samplesPerSecond = 2641.1
MPI Rank 0: 01/17/2018 18:08:09:  Epoch[ 1 of 3]-Minibatch[  11-  20, 6.25%]: CrossEntropyWithSoftmax = 4.35619366 * 640; EvalClassificationError = 0.92343750 * 640; time = 0.2353s; samplesPerSecond = 2719.4
MPI Rank 0: 01/17/2018 18:08:09:  Epoch[ 1 of 3]-Minibatch[  21-  30, 9.38%]: CrossEntropyWithSoftmax = 3.97911998 * 640; EvalClassificationError = 0.89531250 * 640; time = 0.1818s; samplesPerSecond = 3519.8
MPI Rank 0: 01/17/2018 18:08:09:  Epoch[ 1 of 3]-Minibatch[  31-  40, 12.50%]: CrossEntropyWithSoftmax = 3.73643568 * 640; EvalClassificationError = 0.84531250 * 640; time = 0.1704s; samplesPerSecond = 3756.2
MPI Rank 0: 01/17/2018 18:08:09:  Epoch[ 1 of 3]-Minibatch[  41-  50, 15.62%]: CrossEntropyWithSoftmax = 3.83079080 * 640; EvalClassificationError = 0.88281250 * 640; time = 0.1776s; samplesPerSecond = 3604.0
MPI Rank 0: 01/17/2018 18:08:09:  Epoch[ 1 of 3]-Minibatch[  51-  60, 18.75%]: CrossEntropyWithSoftmax = 3.71437689 * 640; EvalClassificationError = 0.86875000 * 640; time = 0.1727s; samplesPerSecond = 3705.3
MPI Rank 0: 01/17/2018 18:08:10:  Epoch[ 1 of 3]-Minibatch[  61-  70, 21.88%]: CrossEntropyWithSoftmax = 3.42186230 * 640; EvalClassificationError = 0.79062500 * 640; time = 0.1190s; samplesPerSecond = 5376.6
MPI Rank 0: 01/17/2018 18:08:10:  Epoch[ 1 of 3]-Minibatch[  71-  80, 25.00%]: CrossEntropyWithSoftmax = 3.53658052 * 640; EvalClassificationError = 0.82031250 * 640; time = 0.1251s; samplesPerSecond = 5117.6
MPI Rank 0: 01/17/2018 18:08:10:  Epoch[ 1 of 3]-Minibatch[  81-  90, 28.12%]: CrossEntropyWithSoftmax = 3.49758017 * 640; EvalClassificationError = 0.81718750 * 640; time = 0.1270s; samplesPerSecond = 5038.0
MPI Rank 0: 01/17/2018 18:08:10:  Epoch[ 1 of 3]-Minibatch[  91- 100, 31.25%]: CrossEntropyWithSoftmax = 3.39996308 * 640; EvalClassificationError = 0.80468750 * 640; time = 0.1357s; samplesPerSecond = 4716.9
MPI Rank 0: 01/17/2018 18:08:10:  Epoch[ 1 of 3]-Minibatch[ 101- 110, 34.38%]: CrossEntropyWithSoftmax = 3.49445772 * 640; EvalClassificationError = 0.82500000 * 640; time = 0.1277s; samplesPerSecond = 5013.5
MPI Rank 0: 01/17/2018 18:08:10:  Epoch[ 1 of 3]-Minibatch[ 111- 120, 37.50%]: CrossEntropyWithSoftmax = 3.26676998 * 640; EvalClassificationError = 0.79218750 * 640; time = 0.1376s; samplesPerSecond = 4651.8
MPI Rank 0: 01/17/2018 18:08:10:  Epoch[ 1 of 3]-Minibatch[ 121- 130, 40.62%]: CrossEntropyWithSoftmax = 3.18870173 * 640; EvalClassificationError = 0.78906250 * 640; time = 0.1267s; samplesPerSecond = 5052.4
MPI Rank 0: 01/17/2018 18:08:10:  Epoch[ 1 of 3]-Minibatch[ 131- 140, 43.75%]: CrossEntropyWithSoftmax = 3.05687263 * 640; EvalClassificationError = 0.74687500 * 640; time = 0.1314s; samplesPerSecond = 4871.2
MPI Rank 0: 01/17/2018 18:08:11:  Epoch[ 1 of 3]-Minibatch[ 141- 150, 46.88%]: CrossEntropyWithSoftmax = 2.95594568 * 640; EvalClassificationError = 0.71875000 * 640; time = 0.1419s; samplesPerSecond = 4511.7
MPI Rank 0: 01/17/2018 18:08:11:  Epoch[ 1 of 3]-Minibatch[ 151- 160, 50.00%]: CrossEntropyWithSoftmax = 3.10219603 * 640; EvalClassificationError = 0.74062500 * 640; time = 0.1288s; samplesPerSecond = 4968.6
MPI Rank 0: 01/17/2018 18:08:11:  Epoch[ 1 of 3]-Minibatch[ 161- 170, 53.12%]: CrossEntropyWithSoftmax = 2.80745014 * 640; EvalClassificationError = 0.70625000 * 640; time = 0.1257s; samplesPerSecond = 5091.3
MPI Rank 0: 01/17/2018 18:08:11:  Epoch[ 1 of 3]-Minibatch[ 171- 180, 56.25%]: CrossEntropyWithSoftmax = 2.72061841 * 640; EvalClassificationError = 0.65468750 * 640; time = 0.1308s; samplesPerSecond = 4892.0
MPI Rank 0: 01/17/2018 18:08:11:  Epoch[ 1 of 3]-Minibatch[ 181- 190, 59.38%]: CrossEntropyWithSoftmax = 2.80425747 * 640; EvalClassificationError = 0.71718750 * 640; time = 0.1533s; samplesPerSecond = 4175.7
MPI Rank 0: 01/17/2018 18:08:11:  Epoch[ 1 of 3]-Minibatch[ 191- 200, 62.50%]: CrossEntropyWithSoftmax = 2.71253068 * 640; EvalClassificationError = 0.67812500 * 640; time = 0.1235s; samplesPerSecond = 5182.1
MPI Rank 0: 01/17/2018 18:08:11:  Epoch[ 1 of 3]-Minibatch[ 201- 210, 65.62%]: CrossEntropyWithSoftmax = 2.59360398 * 640; EvalClassificationError = 0.66093750 * 640; time = 0.1079s; samplesPerSecond = 5929.2
MPI Rank 0: 01/17/2018 18:08:12:  Epoch[ 1 of 3]-Minibatch[ 211- 220, 68.75%]: CrossEntropyWithSoftmax = 2.60386648 * 640; EvalClassificationError = 0.65625000 * 640; time = 0.1367s; samplesPerSecond = 4681.9
MPI Rank 0: 01/17/2018 18:08:12:  Epoch[ 1 of 3]-Minibatch[ 221- 230, 71.88%]: CrossEntropyWithSoftmax = 2.53706677 * 640; EvalClassificationError = 0.65625000 * 640; time = 0.1514s; samplesPerSecond = 4226.1
MPI Rank 0: 01/17/2018 18:08:12:  Epoch[ 1 of 3]-Minibatch[ 231- 240, 75.00%]: CrossEntropyWithSoftmax = 2.56177342 * 640; EvalClassificationError = 0.65625000 * 640; time = 0.1292s; samplesPerSecond = 4953.6
MPI Rank 0: 01/17/2018 18:08:12:  Epoch[ 1 of 3]-Minibatch[ 241- 250, 78.12%]: CrossEntropyWithSoftmax = 2.50118790 * 640; EvalClassificationError = 0.64218750 * 640; time = 0.1264s; samplesPerSecond = 5065.1
MPI Rank 0: 01/17/2018 18:08:12:  Epoch[ 1 of 3]-Minibatch[ 251- 260, 81.25%]: CrossEntropyWithSoftmax = 2.40119787 * 640; EvalClassificationError = 0.62500000 * 640; time = 0.1286s; samplesPerSecond = 4977.0
MPI Rank 0: 01/17/2018 18:08:12:  Epoch[ 1 of 3]-Minibatch[ 261- 270, 84.38%]: CrossEntropyWithSoftmax = 2.27491502 * 640; EvalClassificationError = 0.58906250 * 640; time = 0.1265s; samplesPerSecond = 5060.9
MPI Rank 0: 01/17/2018 18:08:12:  Epoch[ 1 of 3]-Minibatch[ 271- 280, 87.50%]: CrossEntropyWithSoftmax = 2.51724207 * 640; EvalClassificationError = 0.65781250 * 640; time = 0.1524s; samplesPerSecond = 4199.5
MPI Rank 0: 01/17/2018 18:08:12:  Epoch[ 1 of 3]-Minibatch[ 281- 290, 90.62%]: CrossEntropyWithSoftmax = 2.27797542 * 640; EvalClassificationError = 0.59687500 * 640; time = 0.1377s; samplesPerSecond = 4649.3
MPI Rank 0: 01/17/2018 18:08:13:  Epoch[ 1 of 3]-Minibatch[ 291- 300, 93.75%]: CrossEntropyWithSoftmax = 2.26017739 * 640; EvalClassificationError = 0.60937500 * 640; time = 0.1458s; samplesPerSecond = 4389.9
MPI Rank 0: 01/17/2018 18:08:13:  Epoch[ 1 of 3]-Minibatch[ 301- 310, 96.88%]: CrossEntropyWithSoftmax = 2.24735342 * 640; EvalClassificationError = 0.58437500 * 640; time = 0.1261s; samplesPerSecond = 5073.3
MPI Rank 0: 01/17/2018 18:08:13:  Epoch[ 1 of 3]-Minibatch[ 311- 320, 100.00%]: CrossEntropyWithSoftmax = 2.23665381 * 640; EvalClassificationError = 0.60625000 * 640; time = 0.1179s; samplesPerSecond = 5426.6
MPI Rank 0: 01/17/2018 18:08:13: Finished Epoch[ 1 of 3]: [Training] CrossEntropyWithSoftmax = 3.03815141 * 20480; EvalClassificationError = 0.73432617 * 20480; totalSamplesSeen = 20480; learningRatePerSample = 0.015625; epochTime=4.60764s
MPI Rank 0: 01/17/2018 18:08:13: SGD: Saving checkpoint model '/tmp/cntk-test-20180117180725.10946/Speech/DNN_Parallel1BitQuantization@debug_gpu/models/cntkSpeech.dnn.1'
MPI Rank 0: 
MPI Rank 0: 01/17/2018 18:08:13: Starting Epoch 2: learning rate per sample = 0.001953  effective momentum = 0.656119  momentum as time constant = 607.5 samples
MPI Rank 0: 
MPI Rank 0: 01/17/2018 18:08:13: Starting minibatch loop, DataParallelSGD training (myRank = 0, numNodes = 3, numGradientBits = 1), distributed reading is ENABLED.
MPI Rank 0: 01/17/2018 18:08:13:  Epoch[ 2 of 3]-Minibatch[   1-  10, 12.50%]: CrossEntropyWithSoftmax = 2.19429671 * 2560; EvalClassificationError = 0.60039062 * 2560; time = 0.2162s; samplesPerSecond = 11842.7
MPI Rank 0: 01/17/2018 18:08:13:  Epoch[ 2 of 3]-Minibatch[  11-  20, 25.00%]: CrossEntropyWithSoftmax = 2.15577543 * 2560; EvalClassificationError = 0.57070312 * 2560; time = 0.1938s; samplesPerSecond = 13212.0
MPI Rank 0: 01/17/2018 18:08:14:  Epoch[ 2 of 3]-Minibatch[  21-  30, 37.50%]: CrossEntropyWithSoftmax = 2.09655269 * 2560; EvalClassificationError = 0.56289062 * 2560; time = 0.1934s; samplesPerSecond = 13237.8
MPI Rank 0: 01/17/2018 18:08:14:  Epoch[ 2 of 3]-Minibatch[  31-  40, 50.00%]: CrossEntropyWithSoftmax = 2.06745040 * 2560; EvalClassificationError = 0.56171875 * 2560; time = 0.2172s; samplesPerSecond = 11784.1
MPI Rank 0: 01/17/2018 18:08:14:  Epoch[ 2 of 3]-Minibatch[  41-  50, 62.50%]: CrossEntropyWithSoftmax = 2.06704837 * 2560; EvalClassificationError = 0.55976563 * 2560; time = 0.1841s; samplesPerSecond = 13904.9
MPI Rank 0: 01/17/2018 18:08:14:  Epoch[ 2 of 3]-Minibatch[  51-  60, 75.00%]: CrossEntropyWithSoftmax = 2.00128953 * 2560; EvalClassificationError = 0.54492188 * 2560; time = 0.1950s; samplesPerSecond = 13125.5
MPI Rank 0: 01/17/2018 18:08:14:  Epoch[ 2 of 3]-Minibatch[  61-  70, 87.50%]: CrossEntropyWithSoftmax = 1.99512965 * 2560; EvalClassificationError = 0.54726562 * 2560; time = 0.1989s; samplesPerSecond = 12867.8
MPI Rank 0: 01/17/2018 18:08:15:  Epoch[ 2 of 3]-Minibatch[  71-  80, 100.00%]: CrossEntropyWithSoftmax = 1.99976057 * 2560; EvalClassificationError = 0.55468750 * 2560; time = 0.1887s; samplesPerSecond = 13564.3
MPI Rank 0: 01/17/2018 18:08:15: Finished Epoch[ 2 of 3]: [Training] CrossEntropyWithSoftmax = 2.07216292 * 20480; EvalClassificationError = 0.56279297 * 20480; totalSamplesSeen = 40960; learningRatePerSample = 0.001953125; epochTime=1.5946s
MPI Rank 0: 01/17/2018 18:08:15: SGD: Saving checkpoint model '/tmp/cntk-test-20180117180725.10946/Speech/DNN_Parallel1BitQuantization@debug_gpu/models/cntkSpeech.dnn.2'
MPI Rank 0: 
MPI Rank 0: 01/17/2018 18:08:15: Starting Epoch 3: learning rate per sample = 0.000098  effective momentum = 0.656119  momentum as time constant = 2429.9 samples
MPI Rank 0: 
MPI Rank 0: 01/17/2018 18:08:15: Starting minibatch loop, DataParallelSGD training (myRank = 0, numNodes = 3, numGradientBits = 1), distributed reading is ENABLED.
MPI Rank 0: 01/17/2018 18:08:15:  Epoch[ 3 of 3]-Minibatch[   1-  10, 50.00%]: CrossEntropyWithSoftmax = 1.95863860 * 10240; EvalClassificationError = 0.53154297 * 10240; time = 0.4267s; samplesPerSecond = 23995.7
MPI Rank 0: 01/17/2018 18:08:15:  Epoch[ 3 of 3]-Minibatch[  11-  20, 100.00%]: CrossEntropyWithSoftmax = 1.97873024 * 10240; EvalClassificationError = 0.54990234 * 10240; time = 0.3926s; samplesPerSecond = 26081.5
MPI Rank 0: 01/17/2018 18:08:15: Finished Epoch[ 3 of 3]: [Training] CrossEntropyWithSoftmax = 1.96868442 * 20480; EvalClassificationError = 0.54072266 * 20480; totalSamplesSeen = 61440; learningRatePerSample = 9.7656251e-05; epochTime=0.828253s
MPI Rank 0: 01/17/2018 18:08:15: SGD: Saving checkpoint model '/tmp/cntk-test-20180117180725.10946/Speech/DNN_Parallel1BitQuantization@debug_gpu/models/cntkSpeech.dnn'
MPI Rank 0: 
MPI Rank 0: 01/17/2018 18:08:16: Action "train" complete.
MPI Rank 0: 
MPI Rank 0: 01/17/2018 18:08:16: __COMPLETED__
MPI Rank 1: CNTK 2.3.1+ (HEAD 8663d3, Jan 17 2018 06:43:13) at 2018/01/17 18:07:26
MPI Rank 1: 
MPI Rank 1: /home/ubuntu/workspace/build/gpu/debug/bin/cntk  configFile=/home/ubuntu/workspace/Tests/EndToEndTests/Speech/DNN/Parallel1BitQuantization/../cntk.cntk  currentDirectory=/home/ubuntu/workspace/Tests/EndToEndTests/Speech/Data  RunDir=/tmp/cntk-test-20180117180725.10946/Speech/DNN_Parallel1BitQuantization@debug_gpu  DataDir=/home/ubuntu/workspace/Tests/EndToEndTests/Speech/Data  ConfigDir=/home/ubuntu/workspace/Tests/EndToEndTests/Speech/DNN/Parallel1BitQuantization/..  OutputDir=/tmp/cntk-test-20180117180725.10946/Speech/DNN_Parallel1BitQuantization@debug_gpu  DeviceId=0  timestamping=true  numCPUThreads=4  precision=double  speechTrain=[SGD=[ParallelTrain=[DataParallelSGD=[gradientBits=1]]]]  speechTrain=[SGD=[ParallelTrain=[parallelizationStartEpoch=2]]]  stderr=/tmp/cntk-test-20180117180725.10946/Speech/DNN_Parallel1BitQuantization@debug_gpu/stderr
MPI Rank 1: 01/17/2018 18:07:26: -------------------------------------------------------------------
MPI Rank 1: 01/17/2018 18:07:26: Build info: 
MPI Rank 1: 
MPI Rank 1: 01/17/2018 18:07:26: 		Built time: Jan 17 2018 06:40:26
MPI Rank 1: 01/17/2018 18:07:26: 		Last modified date: Wed Jan 17 06:39:51 2018
MPI Rank 1: 01/17/2018 18:07:26: 		Build type: debug
MPI Rank 1: 01/17/2018 18:07:26: 		Build target: GPU
MPI Rank 1: 01/17/2018 18:07:26: 		With ASGD: yes
MPI Rank 1: 01/17/2018 18:07:26: 		Math lib: mkl
MPI Rank 1: 01/17/2018 18:07:26: 		CUDA version: 9.0.0
MPI Rank 1: 01/17/2018 18:07:26: 		CUDNN version: 7.0.4
MPI Rank 1: 01/17/2018 18:07:26: 		Build Branch: HEAD
MPI Rank 1: 01/17/2018 18:07:26: 		Build SHA1: 8663d3ffe597a4c2dc25de7a1ba1eabee3e96b2f
MPI Rank 1: 01/17/2018 18:07:26: 		MPI distribution: Open MPI
MPI Rank 1: 01/17/2018 18:07:26: 		MPI version: 1.10.7
MPI Rank 1: 01/17/2018 18:07:26: -------------------------------------------------------------------
MPI Rank 1: 01/17/2018 18:07:26: -------------------------------------------------------------------
MPI Rank 1: 01/17/2018 18:07:26: GPU info:
MPI Rank 1: 
MPI Rank 1: 01/17/2018 18:07:26: 		Device[0]: cores = 3072; computeCapability = 5.2; type = "Tesla M60"; total memory = 8123 MB; free memory = 8039 MB
MPI Rank 1: 01/17/2018 18:07:26: -------------------------------------------------------------------
MPI Rank 1: 01/17/2018 18:07:26: Using 4 CPU threads.
MPI Rank 1: 
MPI Rank 1: 01/17/2018 18:07:26: ##############################################################################
MPI Rank 1: 01/17/2018 18:07:26: #                                                                            #
MPI Rank 1: 01/17/2018 18:07:26: # speechTrain command (train action)                                         #
MPI Rank 1: 01/17/2018 18:07:26: #                                                                            #
MPI Rank 1: 01/17/2018 18:07:26: ##############################################################################
MPI Rank 1: 
MPI Rank 1: 01/17/2018 18:07:26: 
MPI Rank 1: Creating virgin network.
MPI Rank 1: SimpleNetworkBuilder Using GPU 0
MPI Rank 1: Reading script file glob_0000.scp ... 948 entries
MPI Rank 1: HTKDeserializer: selected '948' utterances grouped into '3' chunks, average chunk size: 316.0 utterances, 84244.7 frames (for I/O: 316.0 utterances, 84244.7 frames)
MPI Rank 1: HTKDeserializer: determined feature kind as '33'-dimensional 'USER' with frame shift 10.0 ms
MPI Rank 1: Total (133) state names in state list '/home/ubuntu/workspace/Tests/EndToEndTests/Speech/Data/state.list'
MPI Rank 1: MLFDeserializer: '948' utterances with '252734' frames
MPI Rank 1: 01/17/2018 18:07:58: 
MPI Rank 1: Model has 25 nodes. Using GPU 0.
MPI Rank 1: 
MPI Rank 1: 01/17/2018 18:07:58: Training criterion:   CrossEntropyWithSoftmax = CrossEntropyWithSoftmax
MPI Rank 1: 01/17/2018 18:07:58: Evaluation criterion: EvalClassificationError = ClassificationError
MPI Rank 1: 
MPI Rank 1: 
MPI Rank 1: Allocating matrices for forward and/or backward propagation.
MPI Rank 1: 
MPI Rank 1: Gradient Memory Aliasing: 4 are aliased.
MPI Rank 1: 	W1*H1 (gradient) reuses W1*H1+B1 (gradient)
MPI Rank 1: 	W2*H1 (gradient) reuses HLast (gradient)
MPI Rank 1: 
MPI Rank 1: Memory Sharing: Out of 40 matrices, 21 are shared as 5, and 19 are not shared.
MPI Rank 1: 
MPI Rank 1: Here are the ones that share memory:
MPI Rank 1: 	{ PosteriorProb : [132 x 1 x *]
MPI Rank 1: 	  ScaledLogLikelihood : [132 x 1 x *] }
MPI Rank 1: 	{ H2 : [512 x 1 x *]
MPI Rank 1: 	  W0*features+B0 : [512 x 1 x *]
MPI Rank 1: 	  W1 : [512 x 512] (gradient)
MPI Rank 1: 	  W1*H1 : [512 x 1 x *] }
MPI Rank 1: 	{ H1 : [512 x 1 x *] (gradient)
MPI Rank 1: 	  H2 : [512 x 1 x *] (gradient)
MPI Rank 1: 	  HLast : [132 x 1 x *]
MPI Rank 1: 	  W0*features : [512 x *]
MPI Rank 1: 	  W0*features : [512 x *] (gradient) }
MPI Rank 1: 	{ HLast : [132 x 1 x *] (gradient)
MPI Rank 1: 	  W0 : [512 x 363] (gradient)
MPI Rank 1: 	  W0*features+B0 : [512 x 1 x *] (gradient)
MPI Rank 1: 	  W1*H1 : [512 x 1 x *] (gradient)
MPI Rank 1: 	  W1*H1+B1 : [512 x 1 x *]
MPI Rank 1: 	  W1*H1+B1 : [512 x 1 x *] (gradient)
MPI Rank 1: 	  W2*H1 : [132 x 1 x *]
MPI Rank 1: 	  W2*H1 : [132 x 1 x *] (gradient) }
MPI Rank 1: 	{ B0 : [512 x 1] (gradient)
MPI Rank 1: 	  H1 : [512 x 1 x *] }
MPI Rank 1: 
MPI Rank 1: Here are the ones that don't share memory:
MPI Rank 1: 	{EvalClassificationError : [1]}
MPI Rank 1: 	{CrossEntropyWithSoftmax : [1]}
MPI Rank 1: 	{B1 : [512 x 1] (gradient)}
MPI Rank 1: 	{W2 : [132 x 512] (gradient)}
MPI Rank 1: 	{LogOfPrior : [132]}
MPI Rank 1: 	{MVNormalizedFeatures : [363 x *]}
MPI Rank 1: 	{CrossEntropyWithSoftmax : [1] (gradient)}
MPI Rank 1: 	{B2 : [132 x 1] (gradient)}
MPI Rank 1: 	{B2 : [132 x 1]}
MPI Rank 1: 	{labels : [132 x *]}
MPI Rank 1: 	{Prior : [132]}
MPI Rank 1: 	{B0 : [512 x 1]}
MPI Rank 1: 	{W1 : [512 x 512]}
MPI Rank 1: 	{B1 : [512 x 1]}
MPI Rank 1: 	{W2 : [132 x 512]}
MPI Rank 1: 	{MeanOfFeatures : [363]}
MPI Rank 1: 	{InvStdOfFeatures : [363]}
MPI Rank 1: 	{W0 : [512 x 363]}
MPI Rank 1: 	{features : [363 x *]}
MPI Rank 1: 
MPI Rank 1: 
MPI Rank 1: 01/17/2018 18:07:58: Training 516740 parameters in 6 out of 6 parameter tensors and 15 nodes with gradient:
MPI Rank 1: 
MPI Rank 1: 01/17/2018 18:07:58: 	Node 'B0' (LearnableParameter operation) : [512 x 1]
MPI Rank 1: 01/17/2018 18:07:58: 	Node 'B1' (LearnableParameter operation) : [512 x 1]
MPI Rank 1: 01/17/2018 18:07:58: 	Node 'B2' (LearnableParameter operation) : [132 x 1]
MPI Rank 1: 01/17/2018 18:07:58: 	Node 'W0' (LearnableParameter operation) : [512 x 363]
MPI Rank 1: 01/17/2018 18:07:58: 	Node 'W1' (LearnableParameter operation) : [512 x 512]
MPI Rank 1: 01/17/2018 18:07:58: 	Node 'W2' (LearnableParameter operation) : [132 x 512]
MPI Rank 1: 
MPI Rank 1: Initializing dataParallelSGD for 1-bit quantization.
MPI Rank 1: 
MPI Rank 1: 01/17/2018 18:07:58: Precomputing --> 3 PreCompute nodes found.
MPI Rank 1: 
MPI Rank 1: 01/17/2018 18:07:58: 	MeanOfFeatures = Mean()
MPI Rank 1: 01/17/2018 18:07:58: 	InvStdOfFeatures = InvStdDev()
MPI Rank 1: 01/17/2018 18:07:58: 	Prior = Mean()
MPI Rank 1: 
MPI Rank 1: 01/17/2018 18:08:08: Precomputing --> Completed.
MPI Rank 1: 
MPI Rank 1: 
MPI Rank 1: 01/17/2018 18:08:08: Starting Epoch 1: learning rate per sample = 0.015625  effective momentum = 0.900000  momentum as time constant = 607.4 samples
MPI Rank 1: 
MPI Rank 1: 01/17/2018 18:08:08: Starting minibatch loop.
MPI Rank 1: 01/17/2018 18:08:09:  Epoch[ 1 of 3]-Minibatch[   1-  10, 3.12%]: CrossEntropyWithSoftmax = 4.62512789 * 640; EvalClassificationError = 0.94062500 * 640; time = 0.2826s; samplesPerSecond = 2264.7
MPI Rank 1: 01/17/2018 18:08:09:  Epoch[ 1 of 3]-Minibatch[  11-  20, 6.25%]: CrossEntropyWithSoftmax = 4.35619366 * 640; EvalClassificationError = 0.92343750 * 640; time = 0.1769s; samplesPerSecond = 3618.0
MPI Rank 1: 01/17/2018 18:08:09:  Epoch[ 1 of 3]-Minibatch[  21-  30, 9.38%]: CrossEntropyWithSoftmax = 3.97911998 * 640; EvalClassificationError = 0.89531250 * 640; time = 0.1782s; samplesPerSecond = 3591.2
MPI Rank 1: 01/17/2018 18:08:09:  Epoch[ 1 of 3]-Minibatch[  31-  40, 12.50%]: CrossEntropyWithSoftmax = 3.73643568 * 640; EvalClassificationError = 0.84531250 * 640; time = 0.1869s; samplesPerSecond = 3424.5
MPI Rank 1: 01/17/2018 18:08:09:  Epoch[ 1 of 3]-Minibatch[  41-  50, 15.62%]: CrossEntropyWithSoftmax = 3.83079080 * 640; EvalClassificationError = 0.88281250 * 640; time = 0.2028s; samplesPerSecond = 3156.0
MPI Rank 1: 01/17/2018 18:08:09:  Epoch[ 1 of 3]-Minibatch[  51-  60, 18.75%]: CrossEntropyWithSoftmax = 3.71437689 * 640; EvalClassificationError = 0.86875000 * 640; time = 0.1929s; samplesPerSecond = 3317.6
MPI Rank 1: 01/17/2018 18:08:10:  Epoch[ 1 of 3]-Minibatch[  61-  70, 21.88%]: CrossEntropyWithSoftmax = 3.42186230 * 640; EvalClassificationError = 0.79062500 * 640; time = 0.1339s; samplesPerSecond = 4779.3
MPI Rank 1: 01/17/2018 18:08:10:  Epoch[ 1 of 3]-Minibatch[  71-  80, 25.00%]: CrossEntropyWithSoftmax = 3.53658052 * 640; EvalClassificationError = 0.82031250 * 640; time = 0.1586s; samplesPerSecond = 4035.7
MPI Rank 1: 01/17/2018 18:08:10:  Epoch[ 1 of 3]-Minibatch[  81-  90, 28.12%]: CrossEntropyWithSoftmax = 3.49758017 * 640; EvalClassificationError = 0.81718750 * 640; time = 0.1358s; samplesPerSecond = 4713.9
MPI Rank 1: 01/17/2018 18:08:10:  Epoch[ 1 of 3]-Minibatch[  91- 100, 31.25%]: CrossEntropyWithSoftmax = 3.39996308 * 640; EvalClassificationError = 0.80468750 * 640; time = 0.1573s; samplesPerSecond = 4068.6
MPI Rank 1: 01/17/2018 18:08:10:  Epoch[ 1 of 3]-Minibatch[ 101- 110, 34.38%]: CrossEntropyWithSoftmax = 3.49445772 * 640; EvalClassificationError = 0.82500000 * 640; time = 0.1268s; samplesPerSecond = 5048.7
MPI Rank 1: 01/17/2018 18:08:10:  Epoch[ 1 of 3]-Minibatch[ 111- 120, 37.50%]: CrossEntropyWithSoftmax = 3.26676998 * 640; EvalClassificationError = 0.79218750 * 640; time = 0.1380s; samplesPerSecond = 4638.7
MPI Rank 1: 01/17/2018 18:08:10:  Epoch[ 1 of 3]-Minibatch[ 121- 130, 40.62%]: CrossEntropyWithSoftmax = 3.18870173 * 640; EvalClassificationError = 0.78906250 * 640; time = 0.1257s; samplesPerSecond = 5093.0
MPI Rank 1: 01/17/2018 18:08:11:  Epoch[ 1 of 3]-Minibatch[ 131- 140, 43.75%]: CrossEntropyWithSoftmax = 3.05687263 * 640; EvalClassificationError = 0.74687500 * 640; time = 0.1327s; samplesPerSecond = 4821.3
MPI Rank 1: 01/17/2018 18:08:11:  Epoch[ 1 of 3]-Minibatch[ 141- 150, 46.88%]: CrossEntropyWithSoftmax = 2.95594568 * 640; EvalClassificationError = 0.71875000 * 640; time = 0.1558s; samplesPerSecond = 4107.9
MPI Rank 1: 01/17/2018 18:08:11:  Epoch[ 1 of 3]-Minibatch[ 151- 160, 50.00%]: CrossEntropyWithSoftmax = 3.10219603 * 640; EvalClassificationError = 0.74062500 * 640; time = 0.1159s; samplesPerSecond = 5522.2
MPI Rank 1: 01/17/2018 18:08:11:  Epoch[ 1 of 3]-Minibatch[ 161- 170, 53.12%]: CrossEntropyWithSoftmax = 2.80745014 * 640; EvalClassificationError = 0.70625000 * 640; time = 0.1406s; samplesPerSecond = 4553.0
MPI Rank 1: 01/17/2018 18:08:11:  Epoch[ 1 of 3]-Minibatch[ 171- 180, 56.25%]: CrossEntropyWithSoftmax = 2.72061841 * 640; EvalClassificationError = 0.65468750 * 640; time = 0.1374s; samplesPerSecond = 4657.0
MPI Rank 1: 01/17/2018 18:08:11:  Epoch[ 1 of 3]-Minibatch[ 181- 190, 59.38%]: CrossEntropyWithSoftmax = 2.80425747 * 640; EvalClassificationError = 0.71718750 * 640; time = 0.1272s; samplesPerSecond = 5033.1
MPI Rank 1: 01/17/2018 18:08:11:  Epoch[ 1 of 3]-Minibatch[ 191- 200, 62.50%]: CrossEntropyWithSoftmax = 2.71253068 * 640; EvalClassificationError = 0.67812500 * 640; time = 0.1529s; samplesPerSecond = 4185.7
MPI Rank 1: 01/17/2018 18:08:12:  Epoch[ 1 of 3]-Minibatch[ 201- 210, 65.62%]: CrossEntropyWithSoftmax = 2.59360398 * 640; EvalClassificationError = 0.66093750 * 640; time = 0.1131s; samplesPerSecond = 5656.7
MPI Rank 1: 01/17/2018 18:08:12:  Epoch[ 1 of 3]-Minibatch[ 211- 220, 68.75%]: CrossEntropyWithSoftmax = 2.60386648 * 640; EvalClassificationError = 0.65625000 * 640; time = 0.1115s; samplesPerSecond = 5738.0
MPI Rank 1: 01/17/2018 18:08:12:  Epoch[ 1 of 3]-Minibatch[ 221- 230, 71.88%]: CrossEntropyWithSoftmax = 2.53706677 * 640; EvalClassificationError = 0.65625000 * 640; time = 0.1336s; samplesPerSecond = 4789.5
MPI Rank 1: 01/17/2018 18:08:12:  Epoch[ 1 of 3]-Minibatch[ 231- 240, 75.00%]: CrossEntropyWithSoftmax = 2.56177342 * 640; EvalClassificationError = 0.65625000 * 640; time = 0.1351s; samplesPerSecond = 4736.6
MPI Rank 1: 01/17/2018 18:08:12:  Epoch[ 1 of 3]-Minibatch[ 241- 250, 78.12%]: CrossEntropyWithSoftmax = 2.50118790 * 640; EvalClassificationError = 0.64218750 * 640; time = 0.1410s; samplesPerSecond = 4540.2
MPI Rank 1: 01/17/2018 18:08:12:  Epoch[ 1 of 3]-Minibatch[ 251- 260, 81.25%]: CrossEntropyWithSoftmax = 2.40119787 * 640; EvalClassificationError = 0.62500000 * 640; time = 0.1311s; samplesPerSecond = 4882.2
MPI Rank 1: 01/17/2018 18:08:12:  Epoch[ 1 of 3]-Minibatch[ 261- 270, 84.38%]: CrossEntropyWithSoftmax = 2.27491502 * 640; EvalClassificationError = 0.58906250 * 640; time = 0.1248s; samplesPerSecond = 5129.0
MPI Rank 1: 01/17/2018 18:08:12:  Epoch[ 1 of 3]-Minibatch[ 271- 280, 87.50%]: CrossEntropyWithSoftmax = 2.51724207 * 640; EvalClassificationError = 0.65781250 * 640; time = 0.1369s; samplesPerSecond = 4675.3
MPI Rank 1: 01/17/2018 18:08:13:  Epoch[ 1 of 3]-Minibatch[ 281- 290, 90.62%]: CrossEntropyWithSoftmax = 2.27797542 * 640; EvalClassificationError = 0.59687500 * 640; time = 0.1150s; samplesPerSecond = 5563.7
MPI Rank 1: 01/17/2018 18:08:13:  Epoch[ 1 of 3]-Minibatch[ 291- 300, 93.75%]: CrossEntropyWithSoftmax = 2.26017739 * 640; EvalClassificationError = 0.60937500 * 640; time = 0.1281s; samplesPerSecond = 4994.8
MPI Rank 1: 01/17/2018 18:08:13:  Epoch[ 1 of 3]-Minibatch[ 301- 310, 96.88%]: CrossEntropyWithSoftmax = 2.24735342 * 640; EvalClassificationError = 0.58437500 * 640; time = 0.1235s; samplesPerSecond = 5183.3
MPI Rank 1: 01/17/2018 18:08:13:  Epoch[ 1 of 3]-Minibatch[ 311- 320, 100.00%]: CrossEntropyWithSoftmax = 2.23665381 * 640; EvalClassificationError = 0.60625000 * 640; time = 0.0724s; samplesPerSecond = 8841.5
MPI Rank 1: 01/17/2018 18:08:13: Finished Epoch[ 1 of 3]: [Training] CrossEntropyWithSoftmax = 3.03815141 * 20480; EvalClassificationError = 0.73432617 * 20480; totalSamplesSeen = 20480; learningRatePerSample = 0.015625; epochTime=4.63168s
MPI Rank 1: 
MPI Rank 1: 01/17/2018 18:08:13: Starting Epoch 2: learning rate per sample = 0.001953  effective momentum = 0.656119  momentum as time constant = 607.5 samples
MPI Rank 1: 
MPI Rank 1: 01/17/2018 18:08:13: Starting minibatch loop, DataParallelSGD training (myRank = 1, numNodes = 3, numGradientBits = 1), distributed reading is ENABLED.
MPI Rank 1: 01/17/2018 18:08:13:  Epoch[ 2 of 3]-Minibatch[   1-  10, 12.50%]: CrossEntropyWithSoftmax = 2.19429671 * 2560; EvalClassificationError = 0.60039062 * 2560; time = 0.2149s; samplesPerSecond = 11912.2
MPI Rank 1: 01/17/2018 18:08:13:  Epoch[ 2 of 3]-Minibatch[  11-  20, 25.00%]: CrossEntropyWithSoftmax = 2.15577543 * 2560; EvalClassificationError = 0.57070312 * 2560; time = 0.1944s; samplesPerSecond = 13167.4
MPI Rank 1: 01/17/2018 18:08:14:  Epoch[ 2 of 3]-Minibatch[  21-  30, 37.50%]: CrossEntropyWithSoftmax = 2.09655269 * 2560; EvalClassificationError = 0.56289062 * 2560; time = 0.1941s; samplesPerSecond = 13191.0
MPI Rank 1: 01/17/2018 18:08:14:  Epoch[ 2 of 3]-Minibatch[  31-  40, 50.00%]: CrossEntropyWithSoftmax = 2.06745040 * 2560; EvalClassificationError = 0.56171875 * 2560; time = 0.2174s; samplesPerSecond = 11775.7
MPI Rank 1: 01/17/2018 18:08:14:  Epoch[ 2 of 3]-Minibatch[  41-  50, 62.50%]: CrossEntropyWithSoftmax = 2.06704837 * 2560; EvalClassificationError = 0.55976563 * 2560; time = 0.1890s; samplesPerSecond = 13546.9
MPI Rank 1: 01/17/2018 18:08:14:  Epoch[ 2 of 3]-Minibatch[  51-  60, 75.00%]: CrossEntropyWithSoftmax = 2.00128953 * 2560; EvalClassificationError = 0.54492188 * 2560; time = 0.1907s; samplesPerSecond = 13425.6
MPI Rank 1: 01/17/2018 18:08:14:  Epoch[ 2 of 3]-Minibatch[  61-  70, 87.50%]: CrossEntropyWithSoftmax = 1.99512965 * 2560; EvalClassificationError = 0.54726562 * 2560; time = 0.1984s; samplesPerSecond = 12904.6
MPI Rank 1: 01/17/2018 18:08:15:  Epoch[ 2 of 3]-Minibatch[  71-  80, 100.00%]: CrossEntropyWithSoftmax = 1.99976057 * 2560; EvalClassificationError = 0.55468750 * 2560; time = 0.1882s; samplesPerSecond = 13600.6
MPI Rank 1: 01/17/2018 18:08:15: Finished Epoch[ 2 of 3]: [Training] CrossEntropyWithSoftmax = 2.07216292 * 20480; EvalClassificationError = 0.56279297 * 20480; totalSamplesSeen = 40960; learningRatePerSample = 0.001953125; epochTime=1.59465s
MPI Rank 1: 
MPI Rank 1: 01/17/2018 18:08:15: Starting Epoch 3: learning rate per sample = 0.000098  effective momentum = 0.656119  momentum as time constant = 2429.9 samples
MPI Rank 1: 
MPI Rank 1: 01/17/2018 18:08:15: Starting minibatch loop, DataParallelSGD training (myRank = 1, numNodes = 3, numGradientBits = 1), distributed reading is ENABLED.
MPI Rank 1: 01/17/2018 18:08:15:  Epoch[ 3 of 3]-Minibatch[   1-  10, 50.00%]: CrossEntropyWithSoftmax = 1.95863860 * 10240; EvalClassificationError = 0.53154297 * 10240; time = 0.4265s; samplesPerSecond = 24007.0
MPI Rank 1: 01/17/2018 18:08:15:  Epoch[ 3 of 3]-Minibatch[  11-  20, 100.00%]: CrossEntropyWithSoftmax = 1.97873024 * 10240; EvalClassificationError = 0.54990234 * 10240; time = 0.3942s; samplesPerSecond = 25977.3
MPI Rank 1: 01/17/2018 18:08:15: Finished Epoch[ 3 of 3]: [Training] CrossEntropyWithSoftmax = 1.96868442 * 20480; EvalClassificationError = 0.54072266 * 20480; totalSamplesSeen = 61440; learningRatePerSample = 9.7656251e-05; epochTime=0.828353s
MPI Rank 1: 
MPI Rank 1: 01/17/2018 18:08:16: Action "train" complete.
MPI Rank 1: 
MPI Rank 1: 01/17/2018 18:08:16: __COMPLETED__
MPI Rank 2: CNTK 2.3.1+ (HEAD 8663d3, Jan 17 2018 06:43:13) at 2018/01/17 18:07:26
MPI Rank 2: 
MPI Rank 2: /home/ubuntu/workspace/build/gpu/debug/bin/cntk  configFile=/home/ubuntu/workspace/Tests/EndToEndTests/Speech/DNN/Parallel1BitQuantization/../cntk.cntk  currentDirectory=/home/ubuntu/workspace/Tests/EndToEndTests/Speech/Data  RunDir=/tmp/cntk-test-20180117180725.10946/Speech/DNN_Parallel1BitQuantization@debug_gpu  DataDir=/home/ubuntu/workspace/Tests/EndToEndTests/Speech/Data  ConfigDir=/home/ubuntu/workspace/Tests/EndToEndTests/Speech/DNN/Parallel1BitQuantization/..  OutputDir=/tmp/cntk-test-20180117180725.10946/Speech/DNN_Parallel1BitQuantization@debug_gpu  DeviceId=0  timestamping=true  numCPUThreads=4  precision=double  speechTrain=[SGD=[ParallelTrain=[DataParallelSGD=[gradientBits=1]]]]  speechTrain=[SGD=[ParallelTrain=[parallelizationStartEpoch=2]]]  stderr=/tmp/cntk-test-20180117180725.10946/Speech/DNN_Parallel1BitQuantization@debug_gpu/stderr
MPI Rank 2: 01/17/2018 18:07:27: -------------------------------------------------------------------
MPI Rank 2: 01/17/2018 18:07:27: Build info: 
MPI Rank 2: 
MPI Rank 2: 01/17/2018 18:07:27: 		Built time: Jan 17 2018 06:40:26
MPI Rank 2: 01/17/2018 18:07:27: 		Last modified date: Wed Jan 17 06:39:51 2018
MPI Rank 2: 01/17/2018 18:07:27: 		Build type: debug
MPI Rank 2: 01/17/2018 18:07:27: 		Build target: GPU
MPI Rank 2: 01/17/2018 18:07:27: 		With ASGD: yes
MPI Rank 2: 01/17/2018 18:07:27: 		Math lib: mkl
MPI Rank 2: 01/17/2018 18:07:27: 		CUDA version: 9.0.0
MPI Rank 2: 01/17/2018 18:07:27: 		CUDNN version: 7.0.4
MPI Rank 2: 01/17/2018 18:07:27: 		Build Branch: HEAD
MPI Rank 2: 01/17/2018 18:07:27: 		Build SHA1: 8663d3ffe597a4c2dc25de7a1ba1eabee3e96b2f
MPI Rank 2: 01/17/2018 18:07:27: 		MPI distribution: Open MPI
MPI Rank 2: 01/17/2018 18:07:27: 		MPI version: 1.10.7
MPI Rank 2: 01/17/2018 18:07:27: -------------------------------------------------------------------
MPI Rank 2: 01/17/2018 18:07:27: -------------------------------------------------------------------
MPI Rank 2: 01/17/2018 18:07:27: GPU info:
MPI Rank 2: 
MPI Rank 2: 01/17/2018 18:07:27: 		Device[0]: cores = 3072; computeCapability = 5.2; type = "Tesla M60"; total memory = 8123 MB; free memory = 7967 MB
MPI Rank 2: 01/17/2018 18:07:27: -------------------------------------------------------------------
MPI Rank 2: 01/17/2018 18:07:27: Using 4 CPU threads.
MPI Rank 2: 
MPI Rank 2: 01/17/2018 18:07:27: ##############################################################################
MPI Rank 2: 01/17/2018 18:07:27: #                                                                            #
MPI Rank 2: 01/17/2018 18:07:27: # speechTrain command (train action)                                         #
MPI Rank 2: 01/17/2018 18:07:27: #                                                                            #
MPI Rank 2: 01/17/2018 18:07:27: ##############################################################################
MPI Rank 2: 
MPI Rank 2: 01/17/2018 18:07:27: 
MPI Rank 2: Creating virgin network.
MPI Rank 2: SimpleNetworkBuilder Using GPU 0
MPI Rank 2: Reading script file glob_0000.scp ... 948 entries
MPI Rank 2: HTKDeserializer: selected '948' utterances grouped into '3' chunks, average chunk size: 316.0 utterances, 84244.7 frames (for I/O: 316.0 utterances, 84244.7 frames)
MPI Rank 2: HTKDeserializer: determined feature kind as '33'-dimensional 'USER' with frame shift 10.0 ms
MPI Rank 2: Total (133) state names in state list '/home/ubuntu/workspace/Tests/EndToEndTests/Speech/Data/state.list'
MPI Rank 2: MLFDeserializer: '948' utterances with '252734' frames
MPI Rank 2: 01/17/2018 18:07:58: 
MPI Rank 2: Model has 25 nodes. Using GPU 0.
MPI Rank 2: 
MPI Rank 2: 01/17/2018 18:07:58: Training criterion:   CrossEntropyWithSoftmax = CrossEntropyWithSoftmax
MPI Rank 2: 01/17/2018 18:07:58: Evaluation criterion: EvalClassificationError = ClassificationError
MPI Rank 2: 
MPI Rank 2: 
MPI Rank 2: Allocating matrices for forward and/or backward propagation.
MPI Rank 2: 
MPI Rank 2: Gradient Memory Aliasing: 4 are aliased.
MPI Rank 2: 	W2*H1 (gradient) reuses HLast (gradient)
MPI Rank 2: 	W1*H1 (gradient) reuses W1*H1+B1 (gradient)
MPI Rank 2: 
MPI Rank 2: Memory Sharing: Out of 40 matrices, 21 are shared as 5, and 19 are not shared.
MPI Rank 2: 
MPI Rank 2: Here are the ones that share memory:
MPI Rank 2: 	{ PosteriorProb : [132 x 1 x *]
MPI Rank 2: 	  ScaledLogLikelihood : [132 x 1 x *] }
MPI Rank 2: 	{ HLast : [132 x 1 x *] (gradient)
MPI Rank 2: 	  W0 : [512 x 363] (gradient)
MPI Rank 2: 	  W0*features+B0 : [512 x 1 x *] (gradient)
MPI Rank 2: 	  W1*H1 : [512 x 1 x *] (gradient)
MPI Rank 2: 	  W1*H1+B1 : [512 x 1 x *]
MPI Rank 2: 	  W1*H1+B1 : [512 x 1 x *] (gradient)
MPI Rank 2: 	  W2*H1 : [132 x 1 x *]
MPI Rank 2: 	  W2*H1 : [132 x 1 x *] (gradient) }
MPI Rank 2: 	{ H1 : [512 x 1 x *] (gradient)
MPI Rank 2: 	  H2 : [512 x 1 x *] (gradient)
MPI Rank 2: 	  HLast : [132 x 1 x *]
MPI Rank 2: 	  W0*features : [512 x *]
MPI Rank 2: 	  W0*features : [512 x *] (gradient) }
MPI Rank 2: 	{ B0 : [512 x 1] (gradient)
MPI Rank 2: 	  H1 : [512 x 1 x *] }
MPI Rank 2: 	{ H2 : [512 x 1 x *]
MPI Rank 2: 	  W0*features+B0 : [512 x 1 x *]
MPI Rank 2: 	  W1 : [512 x 512] (gradient)
MPI Rank 2: 	  W1*H1 : [512 x 1 x *] }
MPI Rank 2: 
MPI Rank 2: Here are the ones that don't share memory:
MPI Rank 2: 	{features : [363 x *]}
MPI Rank 2: 	{W2 : [132 x 512]}
MPI Rank 2: 	{B2 : [132 x 1]}
MPI Rank 2: 	{labels : [132 x *]}
MPI Rank 2: 	{Prior : [132]}
MPI Rank 2: 	{CrossEntropyWithSoftmax : [1]}
MPI Rank 2: 	{W2 : [132 x 512] (gradient)}
MPI Rank 2: 	{MVNormalizedFeatures : [363 x *]}
MPI Rank 2: 	{EvalClassificationError : [1]}
MPI Rank 2: 	{LogOfPrior : [132]}
MPI Rank 2: 	{B1 : [512 x 1] (gradient)}
MPI Rank 2: 	{B2 : [132 x 1] (gradient)}
MPI Rank 2: 	{CrossEntropyWithSoftmax : [1] (gradient)}
MPI Rank 2: 	{B0 : [512 x 1]}
MPI Rank 2: 	{W1 : [512 x 512]}
MPI Rank 2: 	{B1 : [512 x 1]}
MPI Rank 2: 	{MeanOfFeatures : [363]}
MPI Rank 2: 	{W0 : [512 x 363]}
MPI Rank 2: 	{InvStdOfFeatures : [363]}
MPI Rank 2: 
MPI Rank 2: 
MPI Rank 2: 01/17/2018 18:07:58: Training 516740 parameters in 6 out of 6 parameter tensors and 15 nodes with gradient:
MPI Rank 2: 
MPI Rank 2: 01/17/2018 18:07:58: 	Node 'B0' (LearnableParameter operation) : [512 x 1]
MPI Rank 2: 01/17/2018 18:07:58: 	Node 'B1' (LearnableParameter operation) : [512 x 1]
MPI Rank 2: 01/17/2018 18:07:58: 	Node 'B2' (LearnableParameter operation) : [132 x 1]
MPI Rank 2: 01/17/2018 18:07:58: 	Node 'W0' (LearnableParameter operation) : [512 x 363]
MPI Rank 2: 01/17/2018 18:07:58: 	Node 'W1' (LearnableParameter operation) : [512 x 512]
MPI Rank 2: 01/17/2018 18:07:58: 	Node 'W2' (LearnableParameter operation) : [132 x 512]
MPI Rank 2: 
MPI Rank 2: Initializing dataParallelSGD for 1-bit quantization.
MPI Rank 2: 
MPI Rank 2: 01/17/2018 18:07:58: Precomputing --> 3 PreCompute nodes found.
MPI Rank 2: 
MPI Rank 2: 01/17/2018 18:07:58: 	MeanOfFeatures = Mean()
MPI Rank 2: 01/17/2018 18:07:58: 	InvStdOfFeatures = InvStdDev()
MPI Rank 2: 01/17/2018 18:07:58: 	Prior = Mean()
MPI Rank 2: 
MPI Rank 2: 01/17/2018 18:08:08: Precomputing --> Completed.
MPI Rank 2: 
MPI Rank 2: 
MPI Rank 2: 01/17/2018 18:08:08: Starting Epoch 1: learning rate per sample = 0.015625  effective momentum = 0.900000  momentum as time constant = 607.4 samples
MPI Rank 2: 
MPI Rank 2: 01/17/2018 18:08:08: Starting minibatch loop.
MPI Rank 2: 01/17/2018 18:08:08:  Epoch[ 1 of 3]-Minibatch[   1-  10, 3.12%]: CrossEntropyWithSoftmax = 4.62512789 * 640; EvalClassificationError = 0.94062500 * 640; time = 0.2257s; samplesPerSecond = 2835.4
MPI Rank 2: 01/17/2018 18:08:09:  Epoch[ 1 of 3]-Minibatch[  11-  20, 6.25%]: CrossEntropyWithSoftmax = 4.35619366 * 640; EvalClassificationError = 0.92343750 * 640; time = 0.2147s; samplesPerSecond = 2981.5
MPI Rank 2: 01/17/2018 18:08:09:  Epoch[ 1 of 3]-Minibatch[  21-  30, 9.38%]: CrossEntropyWithSoftmax = 3.97911998 * 640; EvalClassificationError = 0.89531250 * 640; time = 0.2126s; samplesPerSecond = 3010.8
MPI Rank 2: 01/17/2018 18:08:09:  Epoch[ 1 of 3]-Minibatch[  31-  40, 12.50%]: CrossEntropyWithSoftmax = 3.73643568 * 640; EvalClassificationError = 0.84531250 * 640; time = 0.1980s; samplesPerSecond = 3232.7
MPI Rank 2: 01/17/2018 18:08:09:  Epoch[ 1 of 3]-Minibatch[  41-  50, 15.62%]: CrossEntropyWithSoftmax = 3.83079080 * 640; EvalClassificationError = 0.88281250 * 640; time = 0.1763s; samplesPerSecond = 3630.8
MPI Rank 2: 01/17/2018 18:08:09:  Epoch[ 1 of 3]-Minibatch[  51-  60, 18.75%]: CrossEntropyWithSoftmax = 3.71437689 * 640; EvalClassificationError = 0.86875000 * 640; time = 0.1728s; samplesPerSecond = 3704.6
MPI Rank 2: 01/17/2018 18:08:10:  Epoch[ 1 of 3]-Minibatch[  61-  70, 21.88%]: CrossEntropyWithSoftmax = 3.42186230 * 640; EvalClassificationError = 0.79062500 * 640; time = 0.1315s; samplesPerSecond = 4865.1
MPI Rank 2: 01/17/2018 18:08:10:  Epoch[ 1 of 3]-Minibatch[  71-  80, 25.00%]: CrossEntropyWithSoftmax = 3.53658052 * 640; EvalClassificationError = 0.82031250 * 640; time = 0.1269s; samplesPerSecond = 5041.9
MPI Rank 2: 01/17/2018 18:08:10:  Epoch[ 1 of 3]-Minibatch[  81-  90, 28.12%]: CrossEntropyWithSoftmax = 3.49758017 * 640; EvalClassificationError = 0.81718750 * 640; time = 0.1204s; samplesPerSecond = 5315.6
MPI Rank 2: 01/17/2018 18:08:10:  Epoch[ 1 of 3]-Minibatch[  91- 100, 31.25%]: CrossEntropyWithSoftmax = 3.39996308 * 640; EvalClassificationError = 0.80468750 * 640; time = 0.1234s; samplesPerSecond = 5186.4
MPI Rank 2: 01/17/2018 18:08:10:  Epoch[ 1 of 3]-Minibatch[ 101- 110, 34.38%]: CrossEntropyWithSoftmax = 3.49445772 * 640; EvalClassificationError = 0.82500000 * 640; time = 0.1296s; samplesPerSecond = 4937.3
MPI Rank 2: 01/17/2018 18:08:10:  Epoch[ 1 of 3]-Minibatch[ 111- 120, 37.50%]: CrossEntropyWithSoftmax = 3.26676998 * 640; EvalClassificationError = 0.79218750 * 640; time = 0.1255s; samplesPerSecond = 5099.7
MPI Rank 2: 01/17/2018 18:08:10:  Epoch[ 1 of 3]-Minibatch[ 121- 130, 40.62%]: CrossEntropyWithSoftmax = 3.18870173 * 640; EvalClassificationError = 0.78906250 * 640; time = 0.1262s; samplesPerSecond = 5071.7
MPI Rank 2: 01/17/2018 18:08:10:  Epoch[ 1 of 3]-Minibatch[ 131- 140, 43.75%]: CrossEntropyWithSoftmax = 3.05687263 * 640; EvalClassificationError = 0.74687500 * 640; time = 0.1266s; samplesPerSecond = 5053.6
MPI Rank 2: 01/17/2018 18:08:11:  Epoch[ 1 of 3]-Minibatch[ 141- 150, 46.88%]: CrossEntropyWithSoftmax = 2.95594568 * 640; EvalClassificationError = 0.71875000 * 640; time = 0.1244s; samplesPerSecond = 5142.7
MPI Rank 2: 01/17/2018 18:08:11:  Epoch[ 1 of 3]-Minibatch[ 151- 160, 50.00%]: CrossEntropyWithSoftmax = 3.10219603 * 640; EvalClassificationError = 0.74062500 * 640; time = 0.1290s; samplesPerSecond = 4961.9
MPI Rank 2: 01/17/2018 18:08:11:  Epoch[ 1 of 3]-Minibatch[ 161- 170, 53.12%]: CrossEntropyWithSoftmax = 2.80745014 * 640; EvalClassificationError = 0.70625000 * 640; time = 0.1425s; samplesPerSecond = 4490.9
MPI Rank 2: 01/17/2018 18:08:11:  Epoch[ 1 of 3]-Minibatch[ 171- 180, 56.25%]: CrossEntropyWithSoftmax = 2.72061841 * 640; EvalClassificationError = 0.65468750 * 640; time = 0.1194s; samplesPerSecond = 5360.9
MPI Rank 2: 01/17/2018 18:08:11:  Epoch[ 1 of 3]-Minibatch[ 181- 190, 59.38%]: CrossEntropyWithSoftmax = 2.80425747 * 640; EvalClassificationError = 0.71718750 * 640; time = 0.1419s; samplesPerSecond = 4510.7
MPI Rank 2: 01/17/2018 18:08:11:  Epoch[ 1 of 3]-Minibatch[ 191- 200, 62.50%]: CrossEntropyWithSoftmax = 2.71253068 * 640; EvalClassificationError = 0.67812500 * 640; time = 0.1351s; samplesPerSecond = 4738.9
MPI Rank 2: 01/17/2018 18:08:11:  Epoch[ 1 of 3]-Minibatch[ 201- 210, 65.62%]: CrossEntropyWithSoftmax = 2.59360398 * 640; EvalClassificationError = 0.66093750 * 640; time = 0.1206s; samplesPerSecond = 5306.0
MPI Rank 2: 01/17/2018 18:08:12:  Epoch[ 1 of 3]-Minibatch[ 211- 220, 68.75%]: CrossEntropyWithSoftmax = 2.60386648 * 640; EvalClassificationError = 0.65625000 * 640; time = 0.1436s; samplesPerSecond = 4457.8
MPI Rank 2: 01/17/2018 18:08:12:  Epoch[ 1 of 3]-Minibatch[ 221- 230, 71.88%]: CrossEntropyWithSoftmax = 2.53706677 * 640; EvalClassificationError = 0.65625000 * 640; time = 0.1165s; samplesPerSecond = 5494.8
MPI Rank 2: 01/17/2018 18:08:12:  Epoch[ 1 of 3]-Minibatch[ 231- 240, 75.00%]: CrossEntropyWithSoftmax = 2.56177342 * 640; EvalClassificationError = 0.65625000 * 640; time = 0.1271s; samplesPerSecond = 5035.5
MPI Rank 2: 01/17/2018 18:08:12:  Epoch[ 1 of 3]-Minibatch[ 241- 250, 78.12%]: CrossEntropyWithSoftmax = 2.50118790 * 640; EvalClassificationError = 0.64218750 * 640; time = 0.1216s; samplesPerSecond = 5261.0
MPI Rank 2: 01/17/2018 18:08:12:  Epoch[ 1 of 3]-Minibatch[ 251- 260, 81.25%]: CrossEntropyWithSoftmax = 2.40119787 * 640; EvalClassificationError = 0.62500000 * 640; time = 0.1211s; samplesPerSecond = 5285.2
MPI Rank 2: 01/17/2018 18:08:12:  Epoch[ 1 of 3]-Minibatch[ 261- 270, 84.38%]: CrossEntropyWithSoftmax = 2.27491502 * 640; EvalClassificationError = 0.58906250 * 640; time = 0.1404s; samplesPerSecond = 4557.3
MPI Rank 2: 01/17/2018 18:08:12:  Epoch[ 1 of 3]-Minibatch[ 271- 280, 87.50%]: CrossEntropyWithSoftmax = 2.51724207 * 640; EvalClassificationError = 0.65781250 * 640; time = 0.1343s; samplesPerSecond = 4765.7
MPI Rank 2: 01/17/2018 18:08:12:  Epoch[ 1 of 3]-Minibatch[ 281- 290, 90.62%]: CrossEntropyWithSoftmax = 2.27797542 * 640; EvalClassificationError = 0.59687500 * 640; time = 0.1179s; samplesPerSecond = 5426.4
MPI Rank 2: 01/17/2018 18:08:13:  Epoch[ 1 of 3]-Minibatch[ 291- 300, 93.75%]: CrossEntropyWithSoftmax = 2.26017739 * 640; EvalClassificationError = 0.60937500 * 640; time = 0.1362s; samplesPerSecond = 4697.8
MPI Rank 2: 01/17/2018 18:08:13:  Epoch[ 1 of 3]-Minibatch[ 301- 310, 96.88%]: CrossEntropyWithSoftmax = 2.24735342 * 640; EvalClassificationError = 0.58437500 * 640; time = 0.1226s; samplesPerSecond = 5218.6
MPI Rank 2: 01/17/2018 18:08:13:  Epoch[ 1 of 3]-Minibatch[ 311- 320, 100.00%]: CrossEntropyWithSoftmax = 2.23665381 * 640; EvalClassificationError = 0.60625000 * 640; time = 0.1363s; samplesPerSecond = 4696.6
MPI Rank 2: 01/17/2018 18:08:13: Finished Epoch[ 1 of 3]: [Training] CrossEntropyWithSoftmax = 3.03815141 * 20480; EvalClassificationError = 0.73432617 * 20480; totalSamplesSeen = 20480; learningRatePerSample = 0.015625; epochTime=4.54739s
MPI Rank 2: 
MPI Rank 2: 01/17/2018 18:08:13: Starting Epoch 2: learning rate per sample = 0.001953  effective momentum = 0.656119  momentum as time constant = 607.5 samples
MPI Rank 2: 
MPI Rank 2: 01/17/2018 18:08:13: Starting minibatch loop, DataParallelSGD training (myRank = 2, numNodes = 3, numGradientBits = 1), distributed reading is ENABLED.
MPI Rank 2: 01/17/2018 18:08:13:  Epoch[ 2 of 3]-Minibatch[   1-  10, 12.50%]: CrossEntropyWithSoftmax = 2.19429671 * 2560; EvalClassificationError = 0.60039062 * 2560; time = 0.2146s; samplesPerSecond = 11928.0
MPI Rank 2: 01/17/2018 18:08:13:  Epoch[ 2 of 3]-Minibatch[  11-  20, 25.00%]: CrossEntropyWithSoftmax = 2.15577543 * 2560; EvalClassificationError = 0.57070312 * 2560; time = 0.1946s; samplesPerSecond = 13152.0
MPI Rank 2: 01/17/2018 18:08:14:  Epoch[ 2 of 3]-Minibatch[  21-  30, 37.50%]: CrossEntropyWithSoftmax = 2.09655269 * 2560; EvalClassificationError = 0.56289062 * 2560; time = 0.1941s; samplesPerSecond = 13189.2
MPI Rank 2: 01/17/2018 18:08:14:  Epoch[ 2 of 3]-Minibatch[  31-  40, 50.00%]: CrossEntropyWithSoftmax = 2.06745040 * 2560; EvalClassificationError = 0.56171875 * 2560; time = 0.2112s; samplesPerSecond = 12123.4
MPI Rank 2: 01/17/2018 18:08:14:  Epoch[ 2 of 3]-Minibatch[  41-  50, 62.50%]: CrossEntropyWithSoftmax = 2.06704837 * 2560; EvalClassificationError = 0.55976563 * 2560; time = 0.1899s; samplesPerSecond = 13480.7
MPI Rank 2: 01/17/2018 18:08:14:  Epoch[ 2 of 3]-Minibatch[  51-  60, 75.00%]: CrossEntropyWithSoftmax = 2.00128953 * 2560; EvalClassificationError = 0.54492188 * 2560; time = 0.1891s; samplesPerSecond = 13539.8
MPI Rank 2: 01/17/2018 18:08:14:  Epoch[ 2 of 3]-Minibatch[  61-  70, 87.50%]: CrossEntropyWithSoftmax = 1.99512965 * 2560; EvalClassificationError = 0.54726562 * 2560; time = 0.2049s; samplesPerSecond = 12494.1
MPI Rank 2: 01/17/2018 18:08:15:  Epoch[ 2 of 3]-Minibatch[  71-  80, 100.00%]: CrossEntropyWithSoftmax = 1.99976057 * 2560; EvalClassificationError = 0.55468750 * 2560; time = 0.1886s; samplesPerSecond = 13571.1
MPI Rank 2: 01/17/2018 18:08:15: Finished Epoch[ 2 of 3]: [Training] CrossEntropyWithSoftmax = 2.07216292 * 20480; EvalClassificationError = 0.56279297 * 20480; totalSamplesSeen = 40960; learningRatePerSample = 0.001953125; epochTime=1.59471s
MPI Rank 2: 
MPI Rank 2: 01/17/2018 18:08:15: Starting Epoch 3: learning rate per sample = 0.000098  effective momentum = 0.656119  momentum as time constant = 2429.9 samples
MPI Rank 2: 
MPI Rank 2: 01/17/2018 18:08:15: Starting minibatch loop, DataParallelSGD training (myRank = 2, numNodes = 3, numGradientBits = 1), distributed reading is ENABLED.
MPI Rank 2: 01/17/2018 18:08:15:  Epoch[ 3 of 3]-Minibatch[   1-  10, 50.00%]: CrossEntropyWithSoftmax = 1.95863860 * 10240; EvalClassificationError = 0.53154297 * 10240; time = 0.4257s; samplesPerSecond = 24054.4
MPI Rank 2: 01/17/2018 18:08:15:  Epoch[ 3 of 3]-Minibatch[  11-  20, 100.00%]: CrossEntropyWithSoftmax = 1.97873024 * 10240; EvalClassificationError = 0.54990234 * 10240; time = 0.3925s; samplesPerSecond = 26086.3
MPI Rank 2: 01/17/2018 18:08:15: Finished Epoch[ 3 of 3]: [Training] CrossEntropyWithSoftmax = 1.96868442 * 20480; EvalClassificationError = 0.54072266 * 20480; totalSamplesSeen = 61440; learningRatePerSample = 9.7656251e-05; epochTime=0.827886s
MPI Rank 2: 
MPI Rank 2: 01/17/2018 18:08:15: Action "train" complete.
MPI Rank 2: 
MPI Rank 2: 01/17/2018 18:08:16: __COMPLETED__