CPU info:
    CPU Model Name: Intel(R) Xeon(R) CPU E5-2690 v3 @ 2.60GHz
    Hardware threads: 6
    Total Memory: 58719796 kB
-------------------------------------------------------------------
=== Running c:\local\msmpi-7.0.12437.6\Bin/mpiexec.exe -n 3 C:\jenkins\workspace\CNTK-Test-Windows-W1\x64\release\cntk.exe configFile=C:\jenkins\workspace\CNTK-Test-Windows-W1\Tests\EndToEndTests\Speech\DNN/cntk.cntk currentDirectory=C:\jenkins\workspace\CNTK-Test-Windows-W1\Tests\EndToEndTests\Speech\Data RunDir=C:\local\cygwin-2.8.2-x64\tmp\cntk-test-20180113075813.113888\Speech\DNN_ParallelBufferedAsyncGradientAggregation@release_cpu DataDir=C:\jenkins\workspace\CNTK-Test-Windows-W1\Tests\EndToEndTests\Speech\Data ConfigDir=C:\jenkins\workspace\CNTK-Test-Windows-W1\Tests\EndToEndTests\Speech\DNN OutputDir=C:\local\cygwin-2.8.2-x64\tmp\cntk-test-20180113075813.113888\Speech\DNN_ParallelBufferedAsyncGradientAggregation@release_cpu DeviceId=-1 timestamping=true numCPUThreads=2 precision=double speechTrain=[SGD=[ParallelTrain=[DataParallelSGD=[gradientBits=1]]]] speechTrain=[SGD=[ParallelTrain=[DataParallelSGD=[useBufferedAsyncGradientAggregation=true]]]] speechTrain=[SGD=[ParallelTrain=[parallelizationStartEpoch=2]]] speechTrain=[SGD=[maxEpochs=4]] speechTrain=[SGD=[ParallelTrain=[syncPerfStats=5]]] stderr=C:\local\cygwin-2.8.2-x64\tmp\cntk-test-20180113075813.113888\Speech\DNN_ParallelBufferedAsyncGradientAggregation@release_cpu/stderr
CNTK 2.3.1+ (HEAD 9e527f, Jan 12 2018 07:29:42) at 2018/01/13 07:58:14

C:\jenkins\workspace\CNTK-Test-Windows-W1\x64\release\cntk.exe  configFile=C:\jenkins\workspace\CNTK-Test-Windows-W1\Tests\EndToEndTests\Speech\DNN/cntk.cntk  currentDirectory=C:\jenkins\workspace\CNTK-Test-Windows-W1\Tests\EndToEndTests\Speech\Data  RunDir=C:\local\cygwin-2.8.2-x64\tmp\cntk-test-20180113075813.113888\Speech\DNN_ParallelBufferedAsyncGradientAggregation@release_cpu  DataDir=C:\jenkins\workspace\CNTK-Test-Windows-W1\Tests\EndToEndTests\Speech\Data  ConfigDir=C:\jenkins\workspace\CNTK-Test-Windows-W1\Tests\EndToEndTests\Speech\DNN  OutputDir=C:\local\cygwin-2.8.2-x64\tmp\cntk-test-20180113075813.113888\Speech\DNN_ParallelBufferedAsyncGradientAggregation@release_cpu  DeviceId=-1  timestamping=true  numCPUThreads=2  precision=double  speechTrain=[SGD=[ParallelTrain=[DataParallelSGD=[gradientBits=1]]]]  speechTrain=[SGD=[ParallelTrain=[DataParallelSGD=[useBufferedAsyncGradientAggregation=true]]]]  speechTrain=[SGD=[ParallelTrain=[parallelizationStartEpoch=2]]]  speechTrain=[SGD=[maxEpochs=4]]  speechTrain=[SGD=[ParallelTrain=[syncPerfStats=5]]]  stderr=C:\local\cygwin-2.8.2-x64\tmp\cntk-test-20180113075813.113888\Speech\DNN_ParallelBufferedAsyncGradientAggregation@release_cpu/stderr
Changed current directory to C:\jenkins\workspace\CNTK-Test-Windows-W1\Tests\EndToEndTests\Speech\Data
ping [requestnodes (before change)]: 3 nodes pinging each other
CNTK 2.3.1+ (HEAD 9e527f, Jan 12 2018 07:29:42) at 2018/01/13 07:58:14

C:\jenkins\workspace\CNTK-Test-Windows-W1\x64\release\cntk.exe  configFile=C:\jenkins\workspace\CNTK-Test-Windows-W1\Tests\EndToEndTests\Speech\DNN/cntk.cntk  currentDirectory=C:\jenkins\workspace\CNTK-Test-Windows-W1\Tests\EndToEndTests\Speech\Data  RunDir=C:\local\cygwin-2.8.2-x64\tmp\cntk-test-20180113075813.113888\Speech\DNN_ParallelBufferedAsyncGradientAggregation@release_cpu  DataDir=C:\jenkins\workspace\CNTK-Test-Windows-W1\Tests\EndToEndTests\Speech\Data  ConfigDir=C:\jenkins\workspace\CNTK-Test-Windows-W1\Tests\EndToEndTests\Speech\DNN  OutputDir=C:\local\cygwin-2.8.2-x64\tmp\cntk-test-20180113075813.113888\Speech\DNN_ParallelBufferedAsyncGradientAggregation@release_cpu  DeviceId=-1  timestamping=true  numCPUThreads=2  precision=double  speechTrain=[SGD=[ParallelTrain=[DataParallelSGD=[gradientBits=1]]]]  speechTrain=[SGD=[ParallelTrain=[DataParallelSGD=[useBufferedAsyncGradientAggregation=true]]]]  speechTrain=[SGD=[ParallelTrain=[parallelizationStartEpoch=2]]]  speechTrain=[SGD=[maxEpochs=4]]  speechTrain=[SGD=[ParallelTrain=[syncPerfStats=5]]]  stderr=C:\local\cygwin-2.8.2-x64\tmp\cntk-test-20180113075813.113888\Speech\DNN_ParallelBufferedAsyncGradientAggregation@release_cpu/stderr
Changed current directory to C:\jenkins\workspace\CNTK-Test-Windows-W1\Tests\EndToEndTests\Speech\Data
ping [requestnodes (before change)]: 3 nodes pinging each other
CNTK 2.3.1+ (HEAD 9e527f, Jan 12 2018 07:29:42) at 2018/01/13 07:58:14

C:\jenkins\workspace\CNTK-Test-Windows-W1\x64\release\cntk.exe  configFile=C:\jenkins\workspace\CNTK-Test-Windows-W1\Tests\EndToEndTests\Speech\DNN/cntk.cntk  currentDirectory=C:\jenkins\workspace\CNTK-Test-Windows-W1\Tests\EndToEndTests\Speech\Data  RunDir=C:\local\cygwin-2.8.2-x64\tmp\cntk-test-20180113075813.113888\Speech\DNN_ParallelBufferedAsyncGradientAggregation@release_cpu  DataDir=C:\jenkins\workspace\CNTK-Test-Windows-W1\Tests\EndToEndTests\Speech\Data  ConfigDir=C:\jenkins\workspace\CNTK-Test-Windows-W1\Tests\EndToEndTests\Speech\DNN  OutputDir=C:\local\cygwin-2.8.2-x64\tmp\cntk-test-20180113075813.113888\Speech\DNN_ParallelBufferedAsyncGradientAggregation@release_cpu  DeviceId=-1  timestamping=true  numCPUThreads=2  precision=double  speechTrain=[SGD=[ParallelTrain=[DataParallelSGD=[gradientBits=1]]]]  speechTrain=[SGD=[ParallelTrain=[DataParallelSGD=[useBufferedAsyncGradientAggregation=true]]]]  speechTrain=[SGD=[ParallelTrain=[parallelizationStartEpoch=2]]]  speechTrain=[SGD=[maxEpochs=4]]  speechTrain=[SGD=[ParallelTrain=[syncPerfStats=5]]]  stderr=C:\local\cygwin-2.8.2-x64\tmp\cntk-test-20180113075813.113888\Speech\DNN_ParallelBufferedAsyncGradientAggregation@release_cpu/stderr
Changed current directory to C:\jenkins\workspace\CNTK-Test-Windows-W1\Tests\EndToEndTests\Speech\Data
ping [requestnodes (before change)]: 3 nodes pinging each other
ping [requestnodes (after change)]: 3 nodes pinging each other
ping [requestnodes (after change)]: 3 nodes pinging each other
ping [requestnodes (after change)]: 3 nodes pinging each other
requestnodes [MPIWrapperMpi]: using 3 out of 3 MPI nodes on a single host (3 requested); we (1) are in (participating)
ping [mpihelper]: 3 nodes pinging each other
requestnodes [MPIWrapperMpi]: using 3 out of 3 MPI nodes on a single host (3 requested); we (2) are in (participating)
requestnodes [MPIWrapperMpi]: using 3 out of 3 MPI nodes on a single host (3 requested); we (0) are in (participating)
ping [mpihelper]: 3 nodes pinging each other
ping [mpihelper]: 3 nodes pinging each other
MPI Rank 0: 01/13/2018 07:58:14: Redirecting stderr to file C:\local\cygwin-2.8.2-x64\tmp\cntk-test-20180113075813.113888\Speech\DNN_ParallelBufferedAsyncGradientAggregation@release_cpu/stderr_speechTrain.logrank0
MPI Rank 0: CNTK 2.3.1+ (HEAD 9e527f, Jan 12 2018 07:29:42) at 2018/01/13 07:58:14
MPI Rank 0: 
MPI Rank 0: C:\jenkins\workspace\CNTK-Test-Windows-W1\x64\release\cntk.exe  configFile=C:\jenkins\workspace\CNTK-Test-Windows-W1\Tests\EndToEndTests\Speech\DNN/cntk.cntk  currentDirectory=C:\jenkins\workspace\CNTK-Test-Windows-W1\Tests\EndToEndTests\Speech\Data  RunDir=C:\local\cygwin-2.8.2-x64\tmp\cntk-test-20180113075813.113888\Speech\DNN_ParallelBufferedAsyncGradientAggregation@release_cpu  DataDir=C:\jenkins\workspace\CNTK-Test-Windows-W1\Tests\EndToEndTests\Speech\Data  ConfigDir=C:\jenkins\workspace\CNTK-Test-Windows-W1\Tests\EndToEndTests\Speech\DNN  OutputDir=C:\local\cygwin-2.8.2-x64\tmp\cntk-test-20180113075813.113888\Speech\DNN_ParallelBufferedAsyncGradientAggregation@release_cpu  DeviceId=-1  timestamping=true  numCPUThreads=2  precision=double  speechTrain=[SGD=[ParallelTrain=[DataParallelSGD=[gradientBits=1]]]]  speechTrain=[SGD=[ParallelTrain=[DataParallelSGD=[useBufferedAsyncGradientAggregation=true]]]]  speechTrain=[SGD=[ParallelTrain=[parallelizationStartEpoch=2]]]  speechTrain=[SGD=[maxEpochs=4]]  speechTrain=[SGD=[ParallelTrain=[syncPerfStats=5]]]  stderr=C:\local\cygwin-2.8.2-x64\tmp\cntk-test-20180113075813.113888\Speech\DNN_ParallelBufferedAsyncGradientAggregation@release_cpu/stderr
MPI Rank 0: -------------------------------------------------------------------
MPI Rank 0: Build info: 
MPI Rank 0: 
MPI Rank 0: 		Built time: Jan 12 2018 07:19:19
MPI Rank 0: 		Last modified date: Fri Jan 12 06:53:42 2018
MPI Rank 0: 		Build type: Release
MPI Rank 0: 		Build target: GPU
MPI Rank 0: 		With ASGD: yes
MPI Rank 0: 		Math lib: mkl
MPI Rank 0: 		CUDA version: 9.0.0
MPI Rank 0: 		CUDNN version: 7.0.5
MPI Rank 0: 		Build Branch: HEAD
MPI Rank 0: 		Build SHA1: 9e527facf044613b4f2735fe3f894df60e93fc6f
MPI Rank 0: 		MPI distribution: Microsoft MPI
MPI Rank 0: 		MPI version: 7.0.12437.6
MPI Rank 0: -------------------------------------------------------------------
MPI Rank 0: -------------------------------------------------------------------
MPI Rank 0: GPU info:
MPI Rank 0: 
MPI Rank 0: 		Device[0]: cores = 3072; computeCapability = 5.2; type = "Tesla M60"; total memory = 8124 MB; free memory = 8001 MB
MPI Rank 0: -------------------------------------------------------------------
MPI Rank 0: 01/13/2018 07:58:14: Using 2 CPU threads.
MPI Rank 0: 
MPI Rank 0: 01/13/2018 07:58:14: ##############################################################################
MPI Rank 0: 01/13/2018 07:58:14: #                                                                            #
MPI Rank 0: 01/13/2018 07:58:14: # speechTrain command (train action)                                         #
MPI Rank 0: 01/13/2018 07:58:14: #                                                                            #
MPI Rank 0: 01/13/2018 07:58:14: ##############################################################################
MPI Rank 0: 
MPI Rank 0: 01/13/2018 07:58:14: 
MPI Rank 0: Creating virgin network.
MPI Rank 0: SimpleNetworkBuilder Using CPU
MPI Rank 0: Reading script file glob_0000.scp ... 948 entries
MPI Rank 0: HTKDeserializer: selected '948' utterances grouped into '3' chunks, average chunk size: 316.0 utterances, 84244.7 frames (for I/O: 316.0 utterances, 84244.7 frames)
MPI Rank 0: HTKDeserializer: determined feature kind as '33'-dimensional 'USER' with frame shift 10.0 ms
MPI Rank 0: Total (133) state names in state list 'C:\jenkins\workspace\CNTK-Test-Windows-W1\Tests\EndToEndTests\Speech\Data/state.list'
MPI Rank 0: MLFDeserializer: '948' utterances with '252734' frames
MPI Rank 0: 01/13/2018 07:58:14: 
MPI Rank 0: Model has 25 nodes. Using CPU.
MPI Rank 0: 
MPI Rank 0: 01/13/2018 07:58:14: Training criterion:   CrossEntropyWithSoftmax = CrossEntropyWithSoftmax
MPI Rank 0: 01/13/2018 07:58:14: Evaluation criterion: EvalClassificationError = ClassificationError
MPI Rank 0: 
MPI Rank 0: 
MPI Rank 0: Allocating matrices for forward and/or backward propagation.
MPI Rank 0: 
MPI Rank 0: Gradient Memory Aliasing: 4 are aliased.
MPI Rank 0: 	W1*H1 (gradient) reuses W1*H1+B1 (gradient)
MPI Rank 0: 	W2*H1 (gradient) reuses HLast (gradient)
MPI Rank 0: 
MPI Rank 0: Memory Sharing: Out of 40 matrices, 20 are shared as 5, and 20 are not shared.
MPI Rank 0: 
MPI Rank 0: Here are the ones that share memory:
MPI Rank 0: 	{ PosteriorProb : [132 x 1 x *]
MPI Rank 0: 	  ScaledLogLikelihood : [132 x 1 x *] }
MPI Rank 0: 	{ H2 : [512 x 1 x *]
MPI Rank 0: 	  W0*features+B0 : [512 x 1 x *]
MPI Rank 0: 	  W1 : [512 x 512] (gradient)
MPI Rank 0: 	  W1*H1 : [512 x 1 x *] }
MPI Rank 0: 	{ H1 : [512 x 1 x *] (gradient)
MPI Rank 0: 	  H2 : [512 x 1 x *] (gradient)
MPI Rank 0: 	  HLast : [132 x 1 x *]
MPI Rank 0: 	  W0*features : [512 x *] (gradient)
MPI Rank 0: 	  W1*H1+B1 : [512 x 1 x *] }
MPI Rank 0: 	{ HLast : [132 x 1 x *] (gradient)
MPI Rank 0: 	  W0*features+B0 : [512 x 1 x *] (gradient)
MPI Rank 0: 	  W1*H1 : [512 x 1 x *] (gradient)
MPI Rank 0: 	  W1*H1+B1 : [512 x 1 x *] (gradient)
MPI Rank 0: 	  W2*H1 : [132 x 1 x *]
MPI Rank 0: 	  W2*H1 : [132 x 1 x *] (gradient) }
MPI Rank 0: 	{ H1 : [512 x 1 x *]
MPI Rank 0: 	  W0 : [512 x 363] (gradient)
MPI Rank 0: 	  W0*features : [512 x *] }
MPI Rank 0: 
MPI Rank 0: Here are the ones that don't share memory:
MPI Rank 0: 	{W2 : [132 x 512]}
MPI Rank 0: 	{B2 : [132 x 1]}
MPI Rank 0: 	{MeanOfFeatures : [363]}
MPI Rank 0: 	{W0 : [512 x 363]}
MPI Rank 0: 	{B0 : [512 x 1]}
MPI Rank 0: 	{InvStdOfFeatures : [363]}
MPI Rank 0: 	{W1 : [512 x 512]}
MPI Rank 0: 	{features : [363 x *]}
MPI Rank 0: 	{B1 : [512 x 1]}
MPI Rank 0: 	{LogOfPrior : [132]}
MPI Rank 0: 	{W2 : [132 x 512] (gradient)}
MPI Rank 0: 	{B1 : [512 x 1] (gradient)}
MPI Rank 0: 	{EvalClassificationError : [1]}
MPI Rank 0: 	{B0 : [512 x 1] (gradient)}
MPI Rank 0: 	{CrossEntropyWithSoftmax : [1] (gradient)}
MPI Rank 0: 	{CrossEntropyWithSoftmax : [1]}
MPI Rank 0: 	{Prior : [132]}
MPI Rank 0: 	{labels : [132 x *]}
MPI Rank 0: 	{B2 : [132 x 1] (gradient)}
MPI Rank 0: 	{MVNormalizedFeatures : [363 x *]}
MPI Rank 0: 
MPI Rank 0: 
MPI Rank 0: 01/13/2018 07:58:14: Training 516740 parameters in 6 out of 6 parameter tensors and 15 nodes with gradient:
MPI Rank 0: 
MPI Rank 0: 01/13/2018 07:58:14: 	Node 'B0' (LearnableParameter operation) : [512 x 1]
MPI Rank 0: 01/13/2018 07:58:14: 	Node 'B1' (LearnableParameter operation) : [512 x 1]
MPI Rank 0: 01/13/2018 07:58:14: 	Node 'B2' (LearnableParameter operation) : [132 x 1]
MPI Rank 0: 01/13/2018 07:58:14: 	Node 'W0' (LearnableParameter operation) : [512 x 363]
MPI Rank 0: 01/13/2018 07:58:14: 	Node 'W1' (LearnableParameter operation) : [512 x 512]
MPI Rank 0: 01/13/2018 07:58:14: 	Node 'W2' (LearnableParameter operation) : [132 x 512]
MPI Rank 0: 
MPI Rank 0: Initializing dataParallelSGD for 1-bit quantization.
MPI Rank 0: 
MPI Rank 0: 01/13/2018 07:58:14: Precomputing --> 3 PreCompute nodes found.
MPI Rank 0: 
MPI Rank 0: 01/13/2018 07:58:14: 	MeanOfFeatures = Mean()
MPI Rank 0: 01/13/2018 07:58:14: 	InvStdOfFeatures = InvStdDev()
MPI Rank 0: 01/13/2018 07:58:14: 	Prior = Mean()
MPI Rank 0: 
MPI Rank 0: 01/13/2018 07:58:16: Precomputing --> Completed.
MPI Rank 0: 
MPI Rank 0: 
MPI Rank 0: 01/13/2018 07:58:18: Starting Epoch 1: learning rate per sample = 0.015625  effective momentum = 0.900000  momentum as time constant = 607.4 samples
MPI Rank 0: 
MPI Rank 0: 01/13/2018 07:58:18: Starting minibatch loop.
MPI Rank 0: 01/13/2018 07:58:18:  Epoch[ 1 of 4]-Minibatch[   1-  10, 3.13%]: CrossEntropyWithSoftmax = 4.59755198 * 640; EvalClassificationError = 0.93125000 * 640; time = 0.1810s; samplesPerSecond = 3536.2
MPI Rank 0: 01/13/2018 07:58:18:  Epoch[ 1 of 4]-Minibatch[  11-  20, 6.25%]: CrossEntropyWithSoftmax = 4.34610349 * 640; EvalClassificationError = 0.92031250 * 640; time = 0.1738s; samplesPerSecond = 3681.4
MPI Rank 0: 01/13/2018 07:58:18:  Epoch[ 1 of 4]-Minibatch[  21-  30, 9.38%]: CrossEntropyWithSoftmax = 3.98222516 * 640; EvalClassificationError = 0.89062500 * 640; time = 0.1685s; samplesPerSecond = 3797.1
MPI Rank 0: 01/13/2018 07:58:18:  Epoch[ 1 of 4]-Minibatch[  31-  40, 12.50%]: CrossEntropyWithSoftmax = 3.74152814 * 640; EvalClassificationError = 0.84531250 * 640; time = 0.1659s; samplesPerSecond = 3858.2
MPI Rank 0: 01/13/2018 07:58:18:  Epoch[ 1 of 4]-Minibatch[  41-  50, 15.63%]: CrossEntropyWithSoftmax = 3.83818572 * 640; EvalClassificationError = 0.86718750 * 640; time = 0.1649s; samplesPerSecond = 3882.1
MPI Rank 0: 01/13/2018 07:58:19:  Epoch[ 1 of 4]-Minibatch[  51-  60, 18.75%]: CrossEntropyWithSoftmax = 3.71641238 * 640; EvalClassificationError = 0.87500000 * 640; time = 0.1653s; samplesPerSecond = 3872.2
MPI Rank 0: 01/13/2018 07:58:19:  Epoch[ 1 of 4]-Minibatch[  61-  70, 21.88%]: CrossEntropyWithSoftmax = 3.41802791 * 640; EvalClassificationError = 0.79687500 * 640; time = 0.1676s; samplesPerSecond = 3817.7
MPI Rank 0: 01/13/2018 07:58:19:  Epoch[ 1 of 4]-Minibatch[  71-  80, 25.00%]: CrossEntropyWithSoftmax = 3.53832947 * 640; EvalClassificationError = 0.82812500 * 640; time = 0.1654s; samplesPerSecond = 3869.2
MPI Rank 0: 01/13/2018 07:58:19:  Epoch[ 1 of 4]-Minibatch[  81-  90, 28.13%]: CrossEntropyWithSoftmax = 3.50628076 * 640; EvalClassificationError = 0.81718750 * 640; time = 0.1692s; samplesPerSecond = 3782.5
MPI Rank 0: 01/13/2018 07:58:19:  Epoch[ 1 of 4]-Minibatch[  91- 100, 31.25%]: CrossEntropyWithSoftmax = 3.41478252 * 640; EvalClassificationError = 0.80781250 * 640; time = 0.1661s; samplesPerSecond = 3852.3
MPI Rank 0: 01/13/2018 07:58:19:  Epoch[ 1 of 4]-Minibatch[ 101- 110, 34.38%]: CrossEntropyWithSoftmax = 3.51031210 * 640; EvalClassificationError = 0.82812500 * 640; time = 0.1703s; samplesPerSecond = 3758.9
MPI Rank 0: 01/13/2018 07:58:20:  Epoch[ 1 of 4]-Minibatch[ 111- 120, 37.50%]: CrossEntropyWithSoftmax = 3.28365485 * 640; EvalClassificationError = 0.79375000 * 640; time = 0.1679s; samplesPerSecond = 3812.0
MPI Rank 0: 01/13/2018 07:58:20:  Epoch[ 1 of 4]-Minibatch[ 121- 130, 40.63%]: CrossEntropyWithSoftmax = 3.20932117 * 640; EvalClassificationError = 0.79531250 * 640; time = 0.1662s; samplesPerSecond = 3850.1
MPI Rank 0: 01/13/2018 07:58:20:  Epoch[ 1 of 4]-Minibatch[ 131- 140, 43.75%]: CrossEntropyWithSoftmax = 3.07460535 * 640; EvalClassificationError = 0.75468750 * 640; time = 0.1654s; samplesPerSecond = 3868.9
MPI Rank 0: 01/13/2018 07:58:20:  Epoch[ 1 of 4]-Minibatch[ 141- 150, 46.88%]: CrossEntropyWithSoftmax = 2.97529104 * 640; EvalClassificationError = 0.72031250 * 640; time = 0.1679s; samplesPerSecond = 3811.3
MPI Rank 0: 01/13/2018 07:58:20:  Epoch[ 1 of 4]-Minibatch[ 151- 160, 50.00%]: CrossEntropyWithSoftmax = 3.11968883 * 640; EvalClassificationError = 0.74531250 * 640; time = 0.1648s; samplesPerSecond = 3883.0
MPI Rank 0: 01/13/2018 07:58:20:  Epoch[ 1 of 4]-Minibatch[ 161- 170, 53.13%]: CrossEntropyWithSoftmax = 2.84172140 * 640; EvalClassificationError = 0.71093750 * 640; time = 0.1669s; samplesPerSecond = 3835.4
MPI Rank 0: 01/13/2018 07:58:21:  Epoch[ 1 of 4]-Minibatch[ 171- 180, 56.25%]: CrossEntropyWithSoftmax = 2.74031745 * 640; EvalClassificationError = 0.66093750 * 640; time = 0.1681s; samplesPerSecond = 3807.8
MPI Rank 0: 01/13/2018 07:58:21:  Epoch[ 1 of 4]-Minibatch[ 181- 190, 59.38%]: CrossEntropyWithSoftmax = 2.83858085 * 640; EvalClassificationError = 0.72656250 * 640; time = 0.1723s; samplesPerSecond = 3714.7
MPI Rank 0: 01/13/2018 07:58:21:  Epoch[ 1 of 4]-Minibatch[ 191- 200, 62.50%]: CrossEntropyWithSoftmax = 2.74632253 * 640; EvalClassificationError = 0.69218750 * 640; time = 0.1645s; samplesPerSecond = 3891.0
MPI Rank 0: 01/13/2018 07:58:21:  Epoch[ 1 of 4]-Minibatch[ 201- 210, 65.63%]: CrossEntropyWithSoftmax = 2.61033254 * 640; EvalClassificationError = 0.66250000 * 640; time = 0.1708s; samplesPerSecond = 3746.2
MPI Rank 0: 01/13/2018 07:58:21:  Epoch[ 1 of 4]-Minibatch[ 211- 220, 68.75%]: CrossEntropyWithSoftmax = 2.61330754 * 640; EvalClassificationError = 0.65000000 * 640; time = 0.1653s; samplesPerSecond = 3871.7
MPI Rank 0: 01/13/2018 07:58:21:  Epoch[ 1 of 4]-Minibatch[ 221- 230, 71.88%]: CrossEntropyWithSoftmax = 2.54591810 * 640; EvalClassificationError = 0.66406250 * 640; time = 0.1677s; samplesPerSecond = 3817.5
MPI Rank 0: 01/13/2018 07:58:22:  Epoch[ 1 of 4]-Minibatch[ 231- 240, 75.00%]: CrossEntropyWithSoftmax = 2.57566512 * 640; EvalClassificationError = 0.66093750 * 640; time = 0.1749s; samplesPerSecond = 3660.2
MPI Rank 0: 01/13/2018 07:58:22:  Epoch[ 1 of 4]-Minibatch[ 241- 250, 78.13%]: CrossEntropyWithSoftmax = 2.49164945 * 640; EvalClassificationError = 0.63281250 * 640; time = 0.1618s; samplesPerSecond = 3954.6
MPI Rank 0: 01/13/2018 07:58:22:  Epoch[ 1 of 4]-Minibatch[ 251- 260, 81.25%]: CrossEntropyWithSoftmax = 2.39954797 * 640; EvalClassificationError = 0.62812500 * 640; time = 0.1648s; samplesPerSecond = 3884.5
MPI Rank 0: 01/13/2018 07:58:22:  Epoch[ 1 of 4]-Minibatch[ 261- 270, 84.38%]: CrossEntropyWithSoftmax = 2.27034227 * 640; EvalClassificationError = 0.59375000 * 640; time = 0.1631s; samplesPerSecond = 3923.2
MPI Rank 0: 01/13/2018 07:58:22:  Epoch[ 1 of 4]-Minibatch[ 271- 280, 87.50%]: CrossEntropyWithSoftmax = 2.52112387 * 640; EvalClassificationError = 0.66093750 * 640; time = 0.1639s; samplesPerSecond = 3905.3
MPI Rank 0: 01/13/2018 07:58:22:  Epoch[ 1 of 4]-Minibatch[ 281- 290, 90.63%]: CrossEntropyWithSoftmax = 2.27800991 * 640; EvalClassificationError = 0.59062500 * 640; time = 0.1696s; samplesPerSecond = 3774.3
MPI Rank 0: 01/13/2018 07:58:23:  Epoch[ 1 of 4]-Minibatch[ 291- 300, 93.75%]: CrossEntropyWithSoftmax = 2.26783634 * 640; EvalClassificationError = 0.61093750 * 640; time = 0.1718s; samplesPerSecond = 3725.0
MPI Rank 0: 01/13/2018 07:58:23:  Epoch[ 1 of 4]-Minibatch[ 301- 310, 96.88%]: CrossEntropyWithSoftmax = 2.24590355 * 640; EvalClassificationError = 0.58593750 * 640; time = 0.1557s; samplesPerSecond = 4110.5
MPI Rank 0: 01/13/2018 07:58:23:  Epoch[ 1 of 4]-Minibatch[ 311- 320, 100.00%]: CrossEntropyWithSoftmax = 2.24415615 * 640; EvalClassificationError = 0.59843750 * 640; time = 0.1673s; samplesPerSecond = 3826.5
MPI Rank 0: 01/13/2018 07:58:23: Finished Epoch[ 1 of 4]: [Training] CrossEntropyWithSoftmax = 3.04696987 * 20480; EvalClassificationError = 0.73583984 * 20480; totalSamplesSeen = 20480; learningRatePerSample = 0.015625; epochTime=5.37027s
MPI Rank 0: 01/13/2018 07:58:23: SGD: Saving checkpoint model 'C:\local\cygwin-2.8.2-x64\tmp\cntk-test-20180113075813.113888\Speech\DNN_ParallelBufferedAsyncGradientAggregation@release_cpu/models/cntkSpeech.dnn.1'
MPI Rank 0: 
MPI Rank 0: 01/13/2018 07:58:23: Starting Epoch 2: learning rate per sample = 0.001953  effective momentum = 0.656119  momentum as time constant = 607.5 samples
MPI Rank 0: 
MPI Rank 0: 01/13/2018 07:58:23: Starting minibatch loop, DataParallelSGD training (myRank = 0, numNodes = 3, numGradientBits = 1), BufferedAsyncGradientAggregation is ENABLED, distributed reading is ENABLED.
MPI Rank 0: Actual gradient aggregation time: 0.018506
MPI Rank 0: Async gradient aggregation wait time: 0.0383716
MPI Rank 0: Actual gradient aggregation time: 0.0473457
MPI Rank 0: 01/13/2018 07:58:24:  Epoch[ 2 of 4]-Minibatch[   1-  10, 12.50%]: CrossEntropyWithSoftmax = 2.23258828 * 2304; EvalClassificationError = 0.61414931 * 2304; time = 0.4392s; samplesPerSecond = 5246.2
MPI Rank 0: Async gradient aggregation wait time: 0.0367385
MPI Rank 0: Actual gradient aggregation time: 0.0485138
MPI Rank 0: Async gradient aggregation wait time: 0.0386083
MPI Rank 0: Actual gradient aggregation time: 0.0465482
MPI Rank 0: 01/13/2018 07:58:24:  Epoch[ 2 of 4]-Minibatch[  11-  20, 25.00%]: CrossEntropyWithSoftmax = 2.23900729 * 2560; EvalClassificationError = 0.58320313 * 2560; time = 0.4739s; samplesPerSecond = 5401.7
MPI Rank 0: Async gradient aggregation wait time: 0.0263024
MPI Rank 0: Actual gradient aggregation time: 0.0395968
MPI Rank 0: Async gradient aggregation wait time: 0.0273655
MPI Rank 0: Actual gradient aggregation time: 0.0378789
MPI Rank 0: 01/13/2018 07:58:24:  Epoch[ 2 of 4]-Minibatch[  21-  30, 37.50%]: CrossEntropyWithSoftmax = 2.16821561 * 2560; EvalClassificationError = 0.57773438 * 2560; time = 0.3928s; samplesPerSecond = 6517.5
MPI Rank 0: Async gradient aggregation wait time: 0.0257338
MPI Rank 0: Actual gradient aggregation time: 0.0398502
MPI Rank 0: Async gradient aggregation wait time: 0.0313585
MPI Rank 0: Actual gradient aggregation time: 0.038939
MPI Rank 0: 01/13/2018 07:58:25:  Epoch[ 2 of 4]-Minibatch[  31-  40, 50.00%]: CrossEntropyWithSoftmax = 2.19929007 * 2560; EvalClassificationError = 0.62148437 * 2560; time = 0.3954s; samplesPerSecond = 6474.1
MPI Rank 0: Async gradient aggregation wait time: 0.0266807
MPI Rank 0: Actual gradient aggregation time: 0.0385712
MPI Rank 0: Async gradient aggregation wait time: 0.0280673
MPI Rank 0: Actual gradient aggregation time: 0.039158
MPI Rank 0: 01/13/2018 07:58:25:  Epoch[ 2 of 4]-Minibatch[  41-  50, 62.50%]: CrossEntropyWithSoftmax = 2.22078510 * 2560; EvalClassificationError = 0.59648437 * 2560; time = 0.3899s; samplesPerSecond = 6566.2
MPI Rank 0: Async gradient aggregation wait time: 0.0276887
MPI Rank 0: Actual gradient aggregation time: 0.0392831
MPI Rank 0: Async gradient aggregation wait time: 0.0298269
MPI Rank 0: Actual gradient aggregation time: 0.0376491
MPI Rank 0: 01/13/2018 07:58:26:  Epoch[ 2 of 4]-Minibatch[  51-  60, 75.00%]: CrossEntropyWithSoftmax = 2.11215778 * 2560; EvalClassificationError = 0.57500000 * 2560; time = 0.3940s; samplesPerSecond = 6498.2
MPI Rank 0: Async gradient aggregation wait time: 0.028471
MPI Rank 0: Actual gradient aggregation time: 0.042872
MPI Rank 0: Async gradient aggregation wait time: 0.0318992
MPI Rank 0: Actual gradient aggregation time: 0.0367055
MPI Rank 0: 01/13/2018 07:58:26:  Epoch[ 2 of 4]-Minibatch[  61-  70, 87.50%]: CrossEntropyWithSoftmax = 2.17278295 * 2560; EvalClassificationError = 0.61875000 * 2560; time = 0.3925s; samplesPerSecond = 6522.7
MPI Rank 0: Async gradient aggregation wait time: 0.0265334
MPI Rank 0: Actual gradient aggregation time: 0.0380921
MPI Rank 0: Async gradient aggregation wait time: 0.0260404
MPI Rank 0: Actual gradient aggregation time: 0.0385981
MPI Rank 0: 01/13/2018 07:58:26:  Epoch[ 2 of 4]-Minibatch[  71-  80, 100.00%]: CrossEntropyWithSoftmax = 2.13143218 * 2560; EvalClassificationError = 0.61015625 * 2560; time = 0.3895s; samplesPerSecond = 6572.3
MPI Rank 0: Async gradient aggregation wait time: 0.0297624
MPI Rank 0: Actual gradient aggregation time: 0.0284831
MPI Rank 0: 01/13/2018 07:58:26: Finished Epoch[ 2 of 4]: [Training] CrossEntropyWithSoftmax = 2.18331391 * 20480; EvalClassificationError = 0.59926758 * 20480; totalSamplesSeen = 40960; learningRatePerSample = 0.001953125; epochTime=3.32958s
MPI Rank 0: 01/13/2018 07:58:26: SGD: Saving checkpoint model 'C:\local\cygwin-2.8.2-x64\tmp\cntk-test-20180113075813.113888\Speech\DNN_ParallelBufferedAsyncGradientAggregation@release_cpu/models/cntkSpeech.dnn.2'
MPI Rank 0: 
MPI Rank 0: 01/13/2018 07:58:27: Starting Epoch 3: learning rate per sample = 0.000098  effective momentum = 0.656119  momentum as time constant = 2429.9 samples
MPI Rank 0: 
MPI Rank 0: 01/13/2018 07:58:27: Starting minibatch loop, DataParallelSGD training (myRank = 0, numNodes = 3, numGradientBits = 1), BufferedAsyncGradientAggregation is ENABLED, distributed reading is ENABLED.
MPI Rank 0: Async gradient aggregation wait time: 0.0376975
MPI Rank 0: Actual gradient aggregation time: 0.0896461
MPI Rank 0: Async gradient aggregation wait time: 0.0419601
MPI Rank 0: Actual gradient aggregation time: 0.0864808
MPI Rank 0: 01/13/2018 07:58:27:  Epoch[ 3 of 4]-Minibatch[   1-  10, 50.00%]: CrossEntropyWithSoftmax = 2.20416772 * 9216; EvalClassificationError = 0.58626302 * 9216; time = 0.8854s; samplesPerSecond = 10409.2
MPI Rank 0: Async gradient aggregation wait time: 0.0462147
MPI Rank 0: Actual gradient aggregation time: 0.0913183
MPI Rank 0: Async gradient aggregation wait time: 0.0476004
MPI Rank 0: Actual gradient aggregation time: 0.0870584
MPI Rank 0: 01/13/2018 07:58:28:  Epoch[ 3 of 4]-Minibatch[  11-  20, 100.00%]: CrossEntropyWithSoftmax = 2.14455206 * 10240; EvalClassificationError = 0.58935547 * 10240; time = 0.9054s; samplesPerSecond = 11309.4
MPI Rank 0: 01/13/2018 07:58:28: Finished Epoch[ 3 of 4]: [Training] CrossEntropyWithSoftmax = 2.16743561 * 20480; EvalClassificationError = 0.58686523 * 20480; totalSamplesSeen = 61440; learningRatePerSample = 9.7656251e-05; epochTime=1.87636s
MPI Rank 0: 01/13/2018 07:58:28: SGD: Saving checkpoint model 'C:\local\cygwin-2.8.2-x64\tmp\cntk-test-20180113075813.113888\Speech\DNN_ParallelBufferedAsyncGradientAggregation@release_cpu/models/cntkSpeech.dnn.3'
MPI Rank 0: 
MPI Rank 0: 01/13/2018 07:58:28: Starting Epoch 4: learning rate per sample = 0.000098  effective momentum = 0.656119  momentum as time constant = 2429.9 samples
MPI Rank 0: 
MPI Rank 0: 01/13/2018 07:58:28: Starting minibatch loop, DataParallelSGD training (myRank = 0, numNodes = 3, numGradientBits = 1), BufferedAsyncGradientAggregation is ENABLED, distributed reading is ENABLED.
MPI Rank 0: Async gradient aggregation wait time: 3.6e-06
MPI Rank 0: Actual gradient aggregation time: 0.0551877
MPI Rank 0: Async gradient aggregation wait time: 0.0499484
MPI Rank 0: Actual gradient aggregation time: 0.0877477
MPI Rank 0: 01/13/2018 07:58:29:  Epoch[ 4 of 4]-Minibatch[   1-  10, 50.00%]: CrossEntropyWithSoftmax = 1.99101995 * 9216; EvalClassificationError = 0.54448785 * 9216; time = 0.7708s; samplesPerSecond = 11956.9
MPI Rank 0: Async gradient aggregation wait time: 0.0318686
MPI Rank 0: Actual gradient aggregation time: 0.0814691
MPI Rank 0: Async gradient aggregation wait time: 0.044481
MPI Rank 0: Actual gradient aggregation time: 0.0834269
MPI Rank 0: 01/13/2018 07:58:30:  Epoch[ 4 of 4]-Minibatch[  11-  20, 100.00%]: CrossEntropyWithSoftmax = 1.97439774 * 10240; EvalClassificationError = 0.54384766 * 10240; time = 0.8392s; samplesPerSecond = 12202.7
MPI Rank 0: Async gradient aggregation wait time: 0.0248052
MPI Rank 0: 01/13/2018 07:58:30: Finished Epoch[ 4 of 4]: [Training] CrossEntropyWithSoftmax = 1.98345326 * 20480; EvalClassificationError = 0.54462891 * 20480; totalSamplesSeen = 81920; learningRatePerSample = 9.7656251e-05; epochTime=1.68779s
MPI Rank 0: 01/13/2018 07:58:30: SGD: Saving checkpoint model 'C:\local\cygwin-2.8.2-x64\tmp\cntk-test-20180113075813.113888\Speech\DNN_ParallelBufferedAsyncGradientAggregation@release_cpu/models/cntkSpeech.dnn'
MPI Rank 0: 
MPI Rank 0: 01/13/2018 07:58:30: Action "train" complete.
MPI Rank 0: 
MPI Rank 0: 01/13/2018 07:58:30: __COMPLETED__
MPI Rank 1: 01/13/2018 07:58:14: Redirecting stderr to file C:\local\cygwin-2.8.2-x64\tmp\cntk-test-20180113075813.113888\Speech\DNN_ParallelBufferedAsyncGradientAggregation@release_cpu/stderr_speechTrain.logrank1
MPI Rank 1: CNTK 2.3.1+ (HEAD 9e527f, Jan 12 2018 07:29:42) at 2018/01/13 07:58:14
MPI Rank 1: 
MPI Rank 1: C:\jenkins\workspace\CNTK-Test-Windows-W1\x64\release\cntk.exe  configFile=C:\jenkins\workspace\CNTK-Test-Windows-W1\Tests\EndToEndTests\Speech\DNN/cntk.cntk  currentDirectory=C:\jenkins\workspace\CNTK-Test-Windows-W1\Tests\EndToEndTests\Speech\Data  RunDir=C:\local\cygwin-2.8.2-x64\tmp\cntk-test-20180113075813.113888\Speech\DNN_ParallelBufferedAsyncGradientAggregation@release_cpu  DataDir=C:\jenkins\workspace\CNTK-Test-Windows-W1\Tests\EndToEndTests\Speech\Data  ConfigDir=C:\jenkins\workspace\CNTK-Test-Windows-W1\Tests\EndToEndTests\Speech\DNN  OutputDir=C:\local\cygwin-2.8.2-x64\tmp\cntk-test-20180113075813.113888\Speech\DNN_ParallelBufferedAsyncGradientAggregation@release_cpu  DeviceId=-1  timestamping=true  numCPUThreads=2  precision=double  speechTrain=[SGD=[ParallelTrain=[DataParallelSGD=[gradientBits=1]]]]  speechTrain=[SGD=[ParallelTrain=[DataParallelSGD=[useBufferedAsyncGradientAggregation=true]]]]  speechTrain=[SGD=[ParallelTrain=[parallelizationStartEpoch=2]]]  speechTrain=[SGD=[maxEpochs=4]]  speechTrain=[SGD=[ParallelTrain=[syncPerfStats=5]]]  stderr=C:\local\cygwin-2.8.2-x64\tmp\cntk-test-20180113075813.113888\Speech\DNN_ParallelBufferedAsyncGradientAggregation@release_cpu/stderr
MPI Rank 1: -------------------------------------------------------------------
MPI Rank 1: Build info: 
MPI Rank 1: 
MPI Rank 1: 		Built time: Jan 12 2018 07:19:19
MPI Rank 1: 		Last modified date: Fri Jan 12 06:53:42 2018
MPI Rank 1: 		Build type: Release
MPI Rank 1: 		Build target: GPU
MPI Rank 1: 		With ASGD: yes
MPI Rank 1: 		Math lib: mkl
MPI Rank 1: 		CUDA version: 9.0.0
MPI Rank 1: 		CUDNN version: 7.0.5
MPI Rank 1: 		Build Branch: HEAD
MPI Rank 1: 		Build SHA1: 9e527facf044613b4f2735fe3f894df60e93fc6f
MPI Rank 1: 		MPI distribution: Microsoft MPI
MPI Rank 1: 		MPI version: 7.0.12437.6
MPI Rank 1: -------------------------------------------------------------------
MPI Rank 1: -------------------------------------------------------------------
MPI Rank 1: GPU info:
MPI Rank 1: 
MPI Rank 1: 		Device[0]: cores = 3072; computeCapability = 5.2; type = "Tesla M60"; total memory = 8124 MB; free memory = 8001 MB
MPI Rank 1: -------------------------------------------------------------------
MPI Rank 1: 01/13/2018 07:58:15: Using 2 CPU threads.
MPI Rank 1: 
MPI Rank 1: 01/13/2018 07:58:15: ##############################################################################
MPI Rank 1: 01/13/2018 07:58:15: #                                                                            #
MPI Rank 1: 01/13/2018 07:58:15: # speechTrain command (train action)                                         #
MPI Rank 1: 01/13/2018 07:58:15: #                                                                            #
MPI Rank 1: 01/13/2018 07:58:15: ##############################################################################
MPI Rank 1: 
MPI Rank 1: 01/13/2018 07:58:15: 
MPI Rank 1: Creating virgin network.
MPI Rank 1: SimpleNetworkBuilder Using CPU
MPI Rank 1: Reading script file glob_0000.scp ... 948 entries
MPI Rank 1: HTKDeserializer: selected '948' utterances grouped into '3' chunks, average chunk size: 316.0 utterances, 84244.7 frames (for I/O: 316.0 utterances, 84244.7 frames)
MPI Rank 1: HTKDeserializer: determined feature kind as '33'-dimensional 'USER' with frame shift 10.0 ms
MPI Rank 1: Total (133) state names in state list 'C:\jenkins\workspace\CNTK-Test-Windows-W1\Tests\EndToEndTests\Speech\Data/state.list'
MPI Rank 1: MLFDeserializer: '948' utterances with '252734' frames
MPI Rank 1: 01/13/2018 07:58:15: 
MPI Rank 1: Model has 25 nodes. Using CPU.
MPI Rank 1: 
MPI Rank 1: 01/13/2018 07:58:15: Training criterion:   CrossEntropyWithSoftmax = CrossEntropyWithSoftmax
MPI Rank 1: 01/13/2018 07:58:15: Evaluation criterion: EvalClassificationError = ClassificationError
MPI Rank 1: 
MPI Rank 1: 
MPI Rank 1: Allocating matrices for forward and/or backward propagation.
MPI Rank 1: 
MPI Rank 1: Gradient Memory Aliasing: 4 are aliased.
MPI Rank 1: 	W2*H1 (gradient) reuses HLast (gradient)
MPI Rank 1: 	W1*H1 (gradient) reuses W1*H1+B1 (gradient)
MPI Rank 1: 
MPI Rank 1: Memory Sharing: Out of 40 matrices, 20 are shared as 5, and 20 are not shared.
MPI Rank 1: 
MPI Rank 1: Here are the ones that share memory:
MPI Rank 1: 	{ PosteriorProb : [132 x 1 x *]
MPI Rank 1: 	  ScaledLogLikelihood : [132 x 1 x *] }
MPI Rank 1: 	{ H2 : [512 x 1 x *]
MPI Rank 1: 	  W0*features+B0 : [512 x 1 x *]
MPI Rank 1: 	  W1 : [512 x 512] (gradient)
MPI Rank 1: 	  W1*H1 : [512 x 1 x *] }
MPI Rank 1: 	{ H1 : [512 x 1 x *] (gradient)
MPI Rank 1: 	  H2 : [512 x 1 x *] (gradient)
MPI Rank 1: 	  HLast : [132 x 1 x *]
MPI Rank 1: 	  W0*features : [512 x *] (gradient)
MPI Rank 1: 	  W1*H1+B1 : [512 x 1 x *] }
MPI Rank 1: 	{ HLast : [132 x 1 x *] (gradient)
MPI Rank 1: 	  W0*features+B0 : [512 x 1 x *] (gradient)
MPI Rank 1: 	  W1*H1 : [512 x 1 x *] (gradient)
MPI Rank 1: 	  W1*H1+B1 : [512 x 1 x *] (gradient)
MPI Rank 1: 	  W2*H1 : [132 x 1 x *]
MPI Rank 1: 	  W2*H1 : [132 x 1 x *] (gradient) }
MPI Rank 1: 	{ H1 : [512 x 1 x *]
MPI Rank 1: 	  W0 : [512 x 363] (gradient)
MPI Rank 1: 	  W0*features : [512 x *] }
MPI Rank 1: 
MPI Rank 1: Here are the ones that don't share memory:
MPI Rank 1: 	{W2 : [132 x 512]}
MPI Rank 1: 	{W1 : [512 x 512]}
MPI Rank 1: 	{B1 : [512 x 1] (gradient)}
MPI Rank 1: 	{B1 : [512 x 1]}
MPI Rank 1: 	{W2 : [132 x 512] (gradient)}
MPI Rank 1: 	{B2 : [132 x 1] (gradient)}
MPI Rank 1: 	{W0 : [512 x 363]}
MPI Rank 1: 	{B2 : [132 x 1]}
MPI Rank 1: 	{MVNormalizedFeatures : [363 x *]}
MPI Rank 1: 	{B0 : [512 x 1]}
MPI Rank 1: 	{labels : [132 x *]}
MPI Rank 1: 	{InvStdOfFeatures : [363]}
MPI Rank 1: 	{CrossEntropyWithSoftmax : [1]}
MPI Rank 1: 	{features : [363 x *]}
MPI Rank 1: 	{Prior : [132]}
MPI Rank 1: 	{LogOfPrior : [132]}
MPI Rank 1: 	{MeanOfFeatures : [363]}
MPI Rank 1: 	{CrossEntropyWithSoftmax : [1] (gradient)}
MPI Rank 1: 	{EvalClassificationError : [1]}
MPI Rank 1: 	{B0 : [512 x 1] (gradient)}
MPI Rank 1: 
MPI Rank 1: 
MPI Rank 1: 01/13/2018 07:58:15: Training 516740 parameters in 6 out of 6 parameter tensors and 15 nodes with gradient:
MPI Rank 1: 
MPI Rank 1: 01/13/2018 07:58:15: 	Node 'B0' (LearnableParameter operation) : [512 x 1]
MPI Rank 1: 01/13/2018 07:58:15: 	Node 'B1' (LearnableParameter operation) : [512 x 1]
MPI Rank 1: 01/13/2018 07:58:15: 	Node 'B2' (LearnableParameter operation) : [132 x 1]
MPI Rank 1: 01/13/2018 07:58:15: 	Node 'W0' (LearnableParameter operation) : [512 x 363]
MPI Rank 1: 01/13/2018 07:58:15: 	Node 'W1' (LearnableParameter operation) : [512 x 512]
MPI Rank 1: 01/13/2018 07:58:15: 	Node 'W2' (LearnableParameter operation) : [132 x 512]
MPI Rank 1: 
MPI Rank 1: Initializing dataParallelSGD for 1-bit quantization.
MPI Rank 1: 
MPI Rank 1: 01/13/2018 07:58:15: Precomputing --> 3 PreCompute nodes found.
MPI Rank 1: 
MPI Rank 1: 01/13/2018 07:58:15: 	MeanOfFeatures = Mean()
MPI Rank 1: 01/13/2018 07:58:15: 	InvStdOfFeatures = InvStdDev()
MPI Rank 1: 01/13/2018 07:58:15: 	Prior = Mean()
MPI Rank 1: 
MPI Rank 1: 01/13/2018 07:58:18: Precomputing --> Completed.
MPI Rank 1: 
MPI Rank 1: 
MPI Rank 1: 01/13/2018 07:58:18: Starting Epoch 1: learning rate per sample = 0.015625  effective momentum = 0.900000  momentum as time constant = 607.4 samples
MPI Rank 1: 
MPI Rank 1: 01/13/2018 07:58:18: Starting minibatch loop.
MPI Rank 1: 01/13/2018 07:58:18:  Epoch[ 1 of 4]-Minibatch[   1-  10, 3.13%]: CrossEntropyWithSoftmax = 4.59755198 * 640; EvalClassificationError = 0.93125000 * 640; time = 0.1851s; samplesPerSecond = 3458.4
MPI Rank 1: 01/13/2018 07:58:18:  Epoch[ 1 of 4]-Minibatch[  11-  20, 6.25%]: CrossEntropyWithSoftmax = 4.34610349 * 640; EvalClassificationError = 0.92031250 * 640; time = 0.1724s; samplesPerSecond = 3712.3
MPI Rank 1: 01/13/2018 07:58:18:  Epoch[ 1 of 4]-Minibatch[  21-  30, 9.38%]: CrossEntropyWithSoftmax = 3.98222516 * 640; EvalClassificationError = 0.89062500 * 640; time = 0.1712s; samplesPerSecond = 3738.8
MPI Rank 1: 01/13/2018 07:58:18:  Epoch[ 1 of 4]-Minibatch[  31-  40, 12.50%]: CrossEntropyWithSoftmax = 3.74152814 * 640; EvalClassificationError = 0.84531250 * 640; time = 0.1688s; samplesPerSecond = 3791.5
MPI Rank 1: 01/13/2018 07:58:18:  Epoch[ 1 of 4]-Minibatch[  41-  50, 15.63%]: CrossEntropyWithSoftmax = 3.83818572 * 640; EvalClassificationError = 0.86718750 * 640; time = 0.1739s; samplesPerSecond = 3679.7
MPI Rank 1: 01/13/2018 07:58:19:  Epoch[ 1 of 4]-Minibatch[  51-  60, 18.75%]: CrossEntropyWithSoftmax = 3.71641238 * 640; EvalClassificationError = 0.87500000 * 640; time = 0.1728s; samplesPerSecond = 3704.6
MPI Rank 1: 01/13/2018 07:58:19:  Epoch[ 1 of 4]-Minibatch[  61-  70, 21.88%]: CrossEntropyWithSoftmax = 3.41802791 * 640; EvalClassificationError = 0.79687500 * 640; time = 0.1766s; samplesPerSecond = 3624.3
MPI Rank 1: 01/13/2018 07:58:19:  Epoch[ 1 of 4]-Minibatch[  71-  80, 25.00%]: CrossEntropyWithSoftmax = 3.53832947 * 640; EvalClassificationError = 0.82812500 * 640; time = 0.1699s; samplesPerSecond = 3766.0
MPI Rank 1: 01/13/2018 07:58:19:  Epoch[ 1 of 4]-Minibatch[  81-  90, 28.13%]: CrossEntropyWithSoftmax = 3.50628076 * 640; EvalClassificationError = 0.81718750 * 640; time = 0.1692s; samplesPerSecond = 3783.2
MPI Rank 1: 01/13/2018 07:58:19:  Epoch[ 1 of 4]-Minibatch[  91- 100, 31.25%]: CrossEntropyWithSoftmax = 3.41478252 * 640; EvalClassificationError = 0.80781250 * 640; time = 0.1749s; samplesPerSecond = 3659.9
MPI Rank 1: 01/13/2018 07:58:20:  Epoch[ 1 of 4]-Minibatch[ 101- 110, 34.38%]: CrossEntropyWithSoftmax = 3.51031210 * 640; EvalClassificationError = 0.82812500 * 640; time = 0.1701s; samplesPerSecond = 3763.4
MPI Rank 1: 01/13/2018 07:58:20:  Epoch[ 1 of 4]-Minibatch[ 111- 120, 37.50%]: CrossEntropyWithSoftmax = 3.28365485 * 640; EvalClassificationError = 0.79375000 * 640; time = 0.1762s; samplesPerSecond = 3632.2
MPI Rank 1: 01/13/2018 07:58:20:  Epoch[ 1 of 4]-Minibatch[ 121- 130, 40.63%]: CrossEntropyWithSoftmax = 3.20932117 * 640; EvalClassificationError = 0.79531250 * 640; time = 0.1680s; samplesPerSecond = 3808.9
MPI Rank 1: 01/13/2018 07:58:20:  Epoch[ 1 of 4]-Minibatch[ 131- 140, 43.75%]: CrossEntropyWithSoftmax = 3.07460535 * 640; EvalClassificationError = 0.75468750 * 640; time = 0.1718s; samplesPerSecond = 3726.0
MPI Rank 1: 01/13/2018 07:58:20:  Epoch[ 1 of 4]-Minibatch[ 141- 150, 46.88%]: CrossEntropyWithSoftmax = 2.97529104 * 640; EvalClassificationError = 0.72031250 * 640; time = 0.1696s; samplesPerSecond = 3773.4
MPI Rank 1: 01/13/2018 07:58:20:  Epoch[ 1 of 4]-Minibatch[ 151- 160, 50.00%]: CrossEntropyWithSoftmax = 3.11968883 * 640; EvalClassificationError = 0.74531250 * 640; time = 0.1687s; samplesPerSecond = 3794.2
MPI Rank 1: 01/13/2018 07:58:21:  Epoch[ 1 of 4]-Minibatch[ 161- 170, 53.13%]: CrossEntropyWithSoftmax = 2.84172140 * 640; EvalClassificationError = 0.71093750 * 640; time = 0.1693s; samplesPerSecond = 3780.4
MPI Rank 1: 01/13/2018 07:58:21:  Epoch[ 1 of 4]-Minibatch[ 171- 180, 56.25%]: CrossEntropyWithSoftmax = 2.74031745 * 640; EvalClassificationError = 0.66093750 * 640; time = 0.1741s; samplesPerSecond = 3675.3
MPI Rank 1: 01/13/2018 07:58:21:  Epoch[ 1 of 4]-Minibatch[ 181- 190, 59.38%]: CrossEntropyWithSoftmax = 2.83858085 * 640; EvalClassificationError = 0.72656250 * 640; time = 0.1676s; samplesPerSecond = 3817.7
MPI Rank 1: 01/13/2018 07:58:21:  Epoch[ 1 of 4]-Minibatch[ 191- 200, 62.50%]: CrossEntropyWithSoftmax = 2.74632253 * 640; EvalClassificationError = 0.69218750 * 640; time = 0.1730s; samplesPerSecond = 3698.8
MPI Rank 1: 01/13/2018 07:58:21:  Epoch[ 1 of 4]-Minibatch[ 201- 210, 65.63%]: CrossEntropyWithSoftmax = 2.61033254 * 640; EvalClassificationError = 0.66250000 * 640; time = 0.1684s; samplesPerSecond = 3799.9
MPI Rank 1: 01/13/2018 07:58:21:  Epoch[ 1 of 4]-Minibatch[ 211- 220, 68.75%]: CrossEntropyWithSoftmax = 2.61330754 * 640; EvalClassificationError = 0.65000000 * 640; time = 0.1713s; samplesPerSecond = 3736.4
MPI Rank 1: 01/13/2018 07:58:22:  Epoch[ 1 of 4]-Minibatch[ 221- 230, 71.88%]: CrossEntropyWithSoftmax = 2.54591810 * 640; EvalClassificationError = 0.66406250 * 640; time = 0.1746s; samplesPerSecond = 3665.1
MPI Rank 1: 01/13/2018 07:58:22:  Epoch[ 1 of 4]-Minibatch[ 231- 240, 75.00%]: CrossEntropyWithSoftmax = 2.57566512 * 640; EvalClassificationError = 0.66093750 * 640; time = 0.1730s; samplesPerSecond = 3700.5
MPI Rank 1: 01/13/2018 07:58:22:  Epoch[ 1 of 4]-Minibatch[ 241- 250, 78.13%]: CrossEntropyWithSoftmax = 2.49164945 * 640; EvalClassificationError = 0.63281250 * 640; time = 0.1720s; samplesPerSecond = 3720.9
MPI Rank 1: 01/13/2018 07:58:22:  Epoch[ 1 of 4]-Minibatch[ 251- 260, 81.25%]: CrossEntropyWithSoftmax = 2.39954797 * 640; EvalClassificationError = 0.62812500 * 640; time = 0.1656s; samplesPerSecond = 3865.1
MPI Rank 1: 01/13/2018 07:58:22:  Epoch[ 1 of 4]-Minibatch[ 261- 270, 84.38%]: CrossEntropyWithSoftmax = 2.27034227 * 640; EvalClassificationError = 0.59375000 * 640; time = 0.1697s; samplesPerSecond = 3771.5
MPI Rank 1: 01/13/2018 07:58:22:  Epoch[ 1 of 4]-Minibatch[ 271- 280, 87.50%]: CrossEntropyWithSoftmax = 2.52112387 * 640; EvalClassificationError = 0.66093750 * 640; time = 0.1711s; samplesPerSecond = 3740.3
MPI Rank 1: 01/13/2018 07:58:23:  Epoch[ 1 of 4]-Minibatch[ 281- 290, 90.63%]: CrossEntropyWithSoftmax = 2.27800991 * 640; EvalClassificationError = 0.59062500 * 640; time = 0.1682s; samplesPerSecond = 3804.4
MPI Rank 1: 01/13/2018 07:58:23:  Epoch[ 1 of 4]-Minibatch[ 291- 300, 93.75%]: CrossEntropyWithSoftmax = 2.26783634 * 640; EvalClassificationError = 0.61093750 * 640; time = 0.1706s; samplesPerSecond = 3751.2
MPI Rank 1: 01/13/2018 07:58:23:  Epoch[ 1 of 4]-Minibatch[ 301- 310, 96.88%]: CrossEntropyWithSoftmax = 2.24590355 * 640; EvalClassificationError = 0.58593750 * 640; time = 0.1783s; samplesPerSecond = 3589.1
MPI Rank 1: 01/13/2018 07:58:23:  Epoch[ 1 of 4]-Minibatch[ 311- 320, 100.00%]: CrossEntropyWithSoftmax = 2.24415615 * 640; EvalClassificationError = 0.59843750 * 640; time = 0.1068s; samplesPerSecond = 5990.1
MPI Rank 1: 01/13/2018 07:58:23: Finished Epoch[ 1 of 4]: [Training] CrossEntropyWithSoftmax = 3.04696987 * 20480; EvalClassificationError = 0.73583984 * 20480; totalSamplesSeen = 20480; learningRatePerSample = 0.015625; epochTime=5.44465s
MPI Rank 1: 
MPI Rank 1: 01/13/2018 07:58:23: Starting Epoch 2: learning rate per sample = 0.001953  effective momentum = 0.656119  momentum as time constant = 607.5 samples
MPI Rank 1: 
MPI Rank 1: 01/13/2018 07:58:23: Starting minibatch loop, DataParallelSGD training (myRank = 1, numNodes = 3, numGradientBits = 1), BufferedAsyncGradientAggregation is ENABLED, distributed reading is ENABLED.
MPI Rank 1: Actual gradient aggregation time: 0.0425335
MPI Rank 1: Async gradient aggregation wait time: 4e-06
MPI Rank 1: Actual gradient aggregation time: 0.0324656
MPI Rank 1: 01/13/2018 07:58:24:  Epoch[ 2 of 4]-Minibatch[   1-  10, 12.50%]: CrossEntropyWithSoftmax = 2.23258828 * 2304; EvalClassificationError = 0.61414931 * 2304; time = 0.4905s; samplesPerSecond = 4696.8
MPI Rank 1: Async gradient aggregation wait time: 3.8e-06
MPI Rank 1: Actual gradient aggregation time: 0.035058
MPI Rank 1: Async gradient aggregation wait time: 4.1e-06
MPI Rank 1: Actual gradient aggregation time: 0.033148
MPI Rank 1: 01/13/2018 07:58:24:  Epoch[ 2 of 4]-Minibatch[  11-  20, 25.00%]: CrossEntropyWithSoftmax = 2.23900729 * 2560; EvalClassificationError = 0.58320313 * 2560; time = 0.4731s; samplesPerSecond = 5411.5
MPI Rank 1: Async gradient aggregation wait time: 3.5e-06
MPI Rank 1: Actual gradient aggregation time: 0.0162057
MPI Rank 1: Async gradient aggregation wait time: 3.7e-06
MPI Rank 1: Actual gradient aggregation time: 0.015974
MPI Rank 1: 01/13/2018 07:58:24:  Epoch[ 2 of 4]-Minibatch[  21-  30, 37.50%]: CrossEntropyWithSoftmax = 2.16821561 * 2560; EvalClassificationError = 0.57773438 * 2560; time = 0.3918s; samplesPerSecond = 6533.7
MPI Rank 1: Async gradient aggregation wait time: 3.8e-06
MPI Rank 1: Actual gradient aggregation time: 0.015891
MPI Rank 1: Async gradient aggregation wait time: 3.2e-06
MPI Rank 1: Actual gradient aggregation time: 0.0159907
MPI Rank 1: 01/13/2018 07:58:25:  Epoch[ 2 of 4]-Minibatch[  31-  40, 50.00%]: CrossEntropyWithSoftmax = 2.19929007 * 2560; EvalClassificationError = 0.62148437 * 2560; time = 0.3958s; samplesPerSecond = 6468.4
MPI Rank 1: Async gradient aggregation wait time: 3.9e-06
MPI Rank 1: Actual gradient aggregation time: 0.0156905
MPI Rank 1: Async gradient aggregation wait time: 3.2e-06
MPI Rank 1: Actual gradient aggregation time: 0.0165034
MPI Rank 1: 01/13/2018 07:58:25:  Epoch[ 2 of 4]-Minibatch[  41-  50, 62.50%]: CrossEntropyWithSoftmax = 2.22078510 * 2560; EvalClassificationError = 0.59648437 * 2560; time = 0.3899s; samplesPerSecond = 6565.6
MPI Rank 1: Async gradient aggregation wait time: 3.5e-06
MPI Rank 1: Actual gradient aggregation time: 0.0165691
MPI Rank 1: Async gradient aggregation wait time: 3.3e-06
MPI Rank 1: Actual gradient aggregation time: 0.0165801
MPI Rank 1: 01/13/2018 07:58:26:  Epoch[ 2 of 4]-Minibatch[  51-  60, 75.00%]: CrossEntropyWithSoftmax = 2.11215778 * 2560; EvalClassificationError = 0.57500000 * 2560; time = 0.3943s; samplesPerSecond = 6492.4
MPI Rank 1: Async gradient aggregation wait time: 3.3e-06
MPI Rank 1: Actual gradient aggregation time: 0.016589
MPI Rank 1: Async gradient aggregation wait time: 3.4e-06
MPI Rank 1: Actual gradient aggregation time: 0.0155794
MPI Rank 1: 01/13/2018 07:58:26:  Epoch[ 2 of 4]-Minibatch[  61-  70, 87.50%]: CrossEntropyWithSoftmax = 2.17278295 * 2560; EvalClassificationError = 0.61875000 * 2560; time = 0.3919s; samplesPerSecond = 6531.7
MPI Rank 1: Async gradient aggregation wait time: 3.6e-06
MPI Rank 1: Actual gradient aggregation time: 0.0160863
MPI Rank 1: Async gradient aggregation wait time: 3.4e-06
MPI Rank 1: Actual gradient aggregation time: 0.0154426
MPI Rank 1: 01/13/2018 07:58:26:  Epoch[ 2 of 4]-Minibatch[  71-  80, 100.00%]: CrossEntropyWithSoftmax = 2.13143218 * 2560; EvalClassificationError = 0.61015625 * 2560; time = 0.3817s; samplesPerSecond = 6706.1
MPI Rank 1: Async gradient aggregation wait time: 3.1e-06
MPI Rank 1: Actual gradient aggregation time: 0.0183133
MPI Rank 1: 01/13/2018 07:58:26: Finished Epoch[ 2 of 4]: [Training] CrossEntropyWithSoftmax = 2.18331391 * 20480; EvalClassificationError = 0.59926758 * 20480; totalSamplesSeen = 40960; learningRatePerSample = 0.001953125; epochTime=3.33569s
MPI Rank 1: 
MPI Rank 1: 01/13/2018 07:58:27: Starting Epoch 3: learning rate per sample = 0.000098  effective momentum = 0.656119  momentum as time constant = 2429.9 samples
MPI Rank 1: 
MPI Rank 1: 01/13/2018 07:58:27: Starting minibatch loop, DataParallelSGD training (myRank = 1, numNodes = 3, numGradientBits = 1), BufferedAsyncGradientAggregation is ENABLED, distributed reading is ENABLED.
MPI Rank 1: Async gradient aggregation wait time: 4.9e-06
MPI Rank 1: Actual gradient aggregation time: 0.0153587
MPI Rank 1: Async gradient aggregation wait time: 5.4e-06
MPI Rank 1: Actual gradient aggregation time: 0.0162046
MPI Rank 1: 01/13/2018 07:58:28:  Epoch[ 3 of 4]-Minibatch[   1-  10, 50.00%]: CrossEntropyWithSoftmax = 2.20416772 * 9216; EvalClassificationError = 0.58626302 * 9216; time = 0.9882s; samplesPerSecond = 9326.1
MPI Rank 1: Async gradient aggregation wait time: 5e-06
MPI Rank 1: Actual gradient aggregation time: 0.0164137
MPI Rank 1: Async gradient aggregation wait time: 4.8e-06
MPI Rank 1: Actual gradient aggregation time: 0.0160289
MPI Rank 1: 01/13/2018 07:58:28:  Epoch[ 3 of 4]-Minibatch[  11-  20, 100.00%]: CrossEntropyWithSoftmax = 2.14455206 * 10240; EvalClassificationError = 0.58935547 * 10240; time = 0.8639s; samplesPerSecond = 11853.9
MPI Rank 1: 01/13/2018 07:58:28: Finished Epoch[ 3 of 4]: [Training] CrossEntropyWithSoftmax = 2.16743561 * 20480; EvalClassificationError = 0.58686523 * 20480; totalSamplesSeen = 61440; learningRatePerSample = 9.7656251e-05; epochTime=1.8761s
MPI Rank 1: 
MPI Rank 1: 01/13/2018 07:58:28: Starting Epoch 4: learning rate per sample = 0.000098  effective momentum = 0.656119  momentum as time constant = 2429.9 samples
MPI Rank 1: 
MPI Rank 1: 01/13/2018 07:58:28: Starting minibatch loop, DataParallelSGD training (myRank = 1, numNodes = 3, numGradientBits = 1), BufferedAsyncGradientAggregation is ENABLED, distributed reading is ENABLED.
MPI Rank 1: Async gradient aggregation wait time: 6.5e-06
MPI Rank 1: Actual gradient aggregation time: 0.0356454
MPI Rank 1: Async gradient aggregation wait time: 4.3e-06
MPI Rank 1: Actual gradient aggregation time: 0.0154757
MPI Rank 1: 01/13/2018 07:58:29:  Epoch[ 4 of 4]-Minibatch[   1-  10, 50.00%]: CrossEntropyWithSoftmax = 1.99101995 * 9216; EvalClassificationError = 0.54448785 * 9216; time = 0.8639s; samplesPerSecond = 10667.9
MPI Rank 1: Async gradient aggregation wait time: 4.6e-06
MPI Rank 1: Actual gradient aggregation time: 0.0161914
MPI Rank 1: Async gradient aggregation wait time: 4.6e-06
MPI Rank 1: Actual gradient aggregation time: 0.0168329
MPI Rank 1: 01/13/2018 07:58:30:  Epoch[ 4 of 4]-Minibatch[  11-  20, 100.00%]: CrossEntropyWithSoftmax = 1.97439774 * 10240; EvalClassificationError = 0.54384766 * 10240; time = 0.8054s; samplesPerSecond = 12714.9
MPI Rank 1: Async gradient aggregation wait time: 1.9e-06
MPI Rank 1: 01/13/2018 07:58:30: Finished Epoch[ 4 of 4]: [Training] CrossEntropyWithSoftmax = 1.98345326 * 20480; EvalClassificationError = 0.54462891 * 20480; totalSamplesSeen = 81920; learningRatePerSample = 9.7656251e-05; epochTime=1.69373s
MPI Rank 1: 
MPI Rank 1: 01/13/2018 07:58:30: Action "train" complete.
MPI Rank 1: 
MPI Rank 1: 01/13/2018 07:58:30: __COMPLETED__
MPI Rank 2: 01/13/2018 07:58:15: Redirecting stderr to file C:\local\cygwin-2.8.2-x64\tmp\cntk-test-20180113075813.113888\Speech\DNN_ParallelBufferedAsyncGradientAggregation@release_cpu/stderr_speechTrain.logrank2
MPI Rank 2: CNTK 2.3.1+ (HEAD 9e527f, Jan 12 2018 07:29:42) at 2018/01/13 07:58:14
MPI Rank 2: 
MPI Rank 2: C:\jenkins\workspace\CNTK-Test-Windows-W1\x64\release\cntk.exe  configFile=C:\jenkins\workspace\CNTK-Test-Windows-W1\Tests\EndToEndTests\Speech\DNN/cntk.cntk  currentDirectory=C:\jenkins\workspace\CNTK-Test-Windows-W1\Tests\EndToEndTests\Speech\Data  RunDir=C:\local\cygwin-2.8.2-x64\tmp\cntk-test-20180113075813.113888\Speech\DNN_ParallelBufferedAsyncGradientAggregation@release_cpu  DataDir=C:\jenkins\workspace\CNTK-Test-Windows-W1\Tests\EndToEndTests\Speech\Data  ConfigDir=C:\jenkins\workspace\CNTK-Test-Windows-W1\Tests\EndToEndTests\Speech\DNN  OutputDir=C:\local\cygwin-2.8.2-x64\tmp\cntk-test-20180113075813.113888\Speech\DNN_ParallelBufferedAsyncGradientAggregation@release_cpu  DeviceId=-1  timestamping=true  numCPUThreads=2  precision=double  speechTrain=[SGD=[ParallelTrain=[DataParallelSGD=[gradientBits=1]]]]  speechTrain=[SGD=[ParallelTrain=[DataParallelSGD=[useBufferedAsyncGradientAggregation=true]]]]  speechTrain=[SGD=[ParallelTrain=[parallelizationStartEpoch=2]]]  speechTrain=[SGD=[maxEpochs=4]]  speechTrain=[SGD=[ParallelTrain=[syncPerfStats=5]]]  stderr=C:\local\cygwin-2.8.2-x64\tmp\cntk-test-20180113075813.113888\Speech\DNN_ParallelBufferedAsyncGradientAggregation@release_cpu/stderr
MPI Rank 2: -------------------------------------------------------------------
MPI Rank 2: Build info: 
MPI Rank 2: 
MPI Rank 2: 		Built time: Jan 12 2018 07:19:19
MPI Rank 2: 		Last modified date: Fri Jan 12 06:53:42 2018
MPI Rank 2: 		Build type: Release
MPI Rank 2: 		Build target: GPU
MPI Rank 2: 		With ASGD: yes
MPI Rank 2: 		Math lib: mkl
MPI Rank 2: 		CUDA version: 9.0.0
MPI Rank 2: 		CUDNN version: 7.0.5
MPI Rank 2: 		Build Branch: HEAD
MPI Rank 2: 		Build SHA1: 9e527facf044613b4f2735fe3f894df60e93fc6f
MPI Rank 2: 		MPI distribution: Microsoft MPI
MPI Rank 2: 		MPI version: 7.0.12437.6
MPI Rank 2: -------------------------------------------------------------------
MPI Rank 2: -------------------------------------------------------------------
MPI Rank 2: GPU info:
MPI Rank 2: 
MPI Rank 2: 		Device[0]: cores = 3072; computeCapability = 5.2; type = "Tesla M60"; total memory = 8124 MB; free memory = 8001 MB
MPI Rank 2: -------------------------------------------------------------------
MPI Rank 2: 01/13/2018 07:58:15: Using 2 CPU threads.
MPI Rank 2: 
MPI Rank 2: 01/13/2018 07:58:15: ##############################################################################
MPI Rank 2: 01/13/2018 07:58:15: #                                                                            #
MPI Rank 2: 01/13/2018 07:58:15: # speechTrain command (train action)                                         #
MPI Rank 2: 01/13/2018 07:58:15: #                                                                            #
MPI Rank 2: 01/13/2018 07:58:15: ##############################################################################
MPI Rank 2: 
MPI Rank 2: 01/13/2018 07:58:15: 
MPI Rank 2: Creating virgin network.
MPI Rank 2: SimpleNetworkBuilder Using CPU
MPI Rank 2: Reading script file glob_0000.scp ... 948 entries
MPI Rank 2: HTKDeserializer: selected '948' utterances grouped into '3' chunks, average chunk size: 316.0 utterances, 84244.7 frames (for I/O: 316.0 utterances, 84244.7 frames)
MPI Rank 2: HTKDeserializer: determined feature kind as '33'-dimensional 'USER' with frame shift 10.0 ms
MPI Rank 2: Total (133) state names in state list 'C:\jenkins\workspace\CNTK-Test-Windows-W1\Tests\EndToEndTests\Speech\Data/state.list'
MPI Rank 2: MLFDeserializer: '948' utterances with '252734' frames
MPI Rank 2: 01/13/2018 07:58:15: 
MPI Rank 2: Model has 25 nodes. Using CPU.
MPI Rank 2: 
MPI Rank 2: 01/13/2018 07:58:15: Training criterion:   CrossEntropyWithSoftmax = CrossEntropyWithSoftmax
MPI Rank 2: 01/13/2018 07:58:15: Evaluation criterion: EvalClassificationError = ClassificationError
MPI Rank 2: 
MPI Rank 2: 
MPI Rank 2: Allocating matrices for forward and/or backward propagation.
MPI Rank 2: 
MPI Rank 2: Gradient Memory Aliasing: 4 are aliased.
MPI Rank 2: 	W1*H1 (gradient) reuses W1*H1+B1 (gradient)
MPI Rank 2: 	W2*H1 (gradient) reuses HLast (gradient)
MPI Rank 2: 
MPI Rank 2: Memory Sharing: Out of 40 matrices, 20 are shared as 5, and 20 are not shared.
MPI Rank 2: 
MPI Rank 2: Here are the ones that share memory:
MPI Rank 2: 	{ PosteriorProb : [132 x 1 x *]
MPI Rank 2: 	  ScaledLogLikelihood : [132 x 1 x *] }
MPI Rank 2: 	{ H1 : [512 x 1 x *]
MPI Rank 2: 	  W0 : [512 x 363] (gradient)
MPI Rank 2: 	  W0*features : [512 x *] }
MPI Rank 2: 	{ H2 : [512 x 1 x *]
MPI Rank 2: 	  W0*features+B0 : [512 x 1 x *]
MPI Rank 2: 	  W1 : [512 x 512] (gradient)
MPI Rank 2: 	  W1*H1 : [512 x 1 x *] }
MPI Rank 2: 	{ HLast : [132 x 1 x *] (gradient)
MPI Rank 2: 	  W0*features+B0 : [512 x 1 x *] (gradient)
MPI Rank 2: 	  W1*H1 : [512 x 1 x *] (gradient)
MPI Rank 2: 	  W1*H1+B1 : [512 x 1 x *] (gradient)
MPI Rank 2: 	  W2*H1 : [132 x 1 x *]
MPI Rank 2: 	  W2*H1 : [132 x 1 x *] (gradient) }
MPI Rank 2: 	{ H1 : [512 x 1 x *] (gradient)
MPI Rank 2: 	  H2 : [512 x 1 x *] (gradient)
MPI Rank 2: 	  HLast : [132 x 1 x *]
MPI Rank 2: 	  W0*features : [512 x *] (gradient)
MPI Rank 2: 	  W1*H1+B1 : [512 x 1 x *] }
MPI Rank 2: 
MPI Rank 2: Here are the ones that don't share memory:
MPI Rank 2: 	{B0 : [512 x 1] (gradient)}
MPI Rank 2: 	{CrossEntropyWithSoftmax : [1]}
MPI Rank 2: 	{features : [363 x *]}
MPI Rank 2: 	{W1 : [512 x 512]}
MPI Rank 2: 	{B1 : [512 x 1] (gradient)}
MPI Rank 2: 	{B2 : [132 x 1] (gradient)}
MPI Rank 2: 	{W0 : [512 x 363]}
MPI Rank 2: 	{MVNormalizedFeatures : [363 x *]}
MPI Rank 2: 	{Prior : [132]}
MPI Rank 2: 	{LogOfPrior : [132]}
MPI Rank 2: 	{labels : [132 x *]}
MPI Rank 2: 	{MeanOfFeatures : [363]}
MPI Rank 2: 	{B2 : [132 x 1]}
MPI Rank 2: 	{W2 : [132 x 512]}
MPI Rank 2: 	{W2 : [132 x 512] (gradient)}
MPI Rank 2: 	{B0 : [512 x 1]}
MPI Rank 2: 	{EvalClassificationError : [1]}
MPI Rank 2: 	{InvStdOfFeatures : [363]}
MPI Rank 2: 	{B1 : [512 x 1]}
MPI Rank 2: 	{CrossEntropyWithSoftmax : [1] (gradient)}
MPI Rank 2: 
MPI Rank 2: 
MPI Rank 2: 01/13/2018 07:58:15: Training 516740 parameters in 6 out of 6 parameter tensors and 15 nodes with gradient:
MPI Rank 2: 
MPI Rank 2: 01/13/2018 07:58:15: 	Node 'B0' (LearnableParameter operation) : [512 x 1]
MPI Rank 2: 01/13/2018 07:58:15: 	Node 'B1' (LearnableParameter operation) : [512 x 1]
MPI Rank 2: 01/13/2018 07:58:15: 	Node 'B2' (LearnableParameter operation) : [132 x 1]
MPI Rank 2: 01/13/2018 07:58:15: 	Node 'W0' (LearnableParameter operation) : [512 x 363]
MPI Rank 2: 01/13/2018 07:58:15: 	Node 'W1' (LearnableParameter operation) : [512 x 512]
MPI Rank 2: 01/13/2018 07:58:15: 	Node 'W2' (LearnableParameter operation) : [132 x 512]
MPI Rank 2: 
MPI Rank 2: Initializing dataParallelSGD for 1-bit quantization.
MPI Rank 2: 
MPI Rank 2: 01/13/2018 07:58:15: Precomputing --> 3 PreCompute nodes found.
MPI Rank 2: 
MPI Rank 2: 01/13/2018 07:58:15: 	MeanOfFeatures = Mean()
MPI Rank 2: 01/13/2018 07:58:15: 	InvStdOfFeatures = InvStdDev()
MPI Rank 2: 01/13/2018 07:58:15: 	Prior = Mean()
MPI Rank 2: 
MPI Rank 2: 01/13/2018 07:58:17: Precomputing --> Completed.
MPI Rank 2: 
MPI Rank 2: 
MPI Rank 2: 01/13/2018 07:58:18: Starting Epoch 1: learning rate per sample = 0.015625  effective momentum = 0.900000  momentum as time constant = 607.4 samples
MPI Rank 2: 
MPI Rank 2: 01/13/2018 07:58:18: Starting minibatch loop.
MPI Rank 2: 01/13/2018 07:58:18:  Epoch[ 1 of 4]-Minibatch[   1-  10, 3.13%]: CrossEntropyWithSoftmax = 4.59755198 * 640; EvalClassificationError = 0.93125000 * 640; time = 0.1719s; samplesPerSecond = 3723.1
MPI Rank 2: 01/13/2018 07:58:18:  Epoch[ 1 of 4]-Minibatch[  11-  20, 6.25%]: CrossEntropyWithSoftmax = 4.34610349 * 640; EvalClassificationError = 0.92031250 * 640; time = 0.1674s; samplesPerSecond = 3823.6
MPI Rank 2: 01/13/2018 07:58:18:  Epoch[ 1 of 4]-Minibatch[  21-  30, 9.38%]: CrossEntropyWithSoftmax = 3.98222516 * 640; EvalClassificationError = 0.89062500 * 640; time = 0.1582s; samplesPerSecond = 4045.1
MPI Rank 2: 01/13/2018 07:58:18:  Epoch[ 1 of 4]-Minibatch[  31-  40, 12.50%]: CrossEntropyWithSoftmax = 3.74152814 * 640; EvalClassificationError = 0.84531250 * 640; time = 0.1694s; samplesPerSecond = 3777.1
MPI Rank 2: 01/13/2018 07:58:18:  Epoch[ 1 of 4]-Minibatch[  41-  50, 15.63%]: CrossEntropyWithSoftmax = 3.83818572 * 640; EvalClassificationError = 0.86718750 * 640; time = 0.1724s; samplesPerSecond = 3711.3
MPI Rank 2: 01/13/2018 07:58:19:  Epoch[ 1 of 4]-Minibatch[  51-  60, 18.75%]: CrossEntropyWithSoftmax = 3.71641238 * 640; EvalClassificationError = 0.87500000 * 640; time = 0.1749s; samplesPerSecond = 3658.7
MPI Rank 2: 01/13/2018 07:58:19:  Epoch[ 1 of 4]-Minibatch[  61-  70, 21.88%]: CrossEntropyWithSoftmax = 3.41802791 * 640; EvalClassificationError = 0.79687500 * 640; time = 0.1679s; samplesPerSecond = 3811.9
MPI Rank 2: 01/13/2018 07:58:19:  Epoch[ 1 of 4]-Minibatch[  71-  80, 25.00%]: CrossEntropyWithSoftmax = 3.53832947 * 640; EvalClassificationError = 0.82812500 * 640; time = 0.1691s; samplesPerSecond = 3785.0
MPI Rank 2: 01/13/2018 07:58:19:  Epoch[ 1 of 4]-Minibatch[  81-  90, 28.13%]: CrossEntropyWithSoftmax = 3.50628076 * 640; EvalClassificationError = 0.81718750 * 640; time = 0.1639s; samplesPerSecond = 3904.6
MPI Rank 2: 01/13/2018 07:58:19:  Epoch[ 1 of 4]-Minibatch[  91- 100, 31.25%]: CrossEntropyWithSoftmax = 3.41478252 * 640; EvalClassificationError = 0.80781250 * 640; time = 0.1680s; samplesPerSecond = 3808.4
MPI Rank 2: 01/13/2018 07:58:19:  Epoch[ 1 of 4]-Minibatch[ 101- 110, 34.38%]: CrossEntropyWithSoftmax = 3.51031210 * 640; EvalClassificationError = 0.82812500 * 640; time = 0.1835s; samplesPerSecond = 3487.7
MPI Rank 2: 01/13/2018 07:58:20:  Epoch[ 1 of 4]-Minibatch[ 111- 120, 37.50%]: CrossEntropyWithSoftmax = 3.28365485 * 640; EvalClassificationError = 0.79375000 * 640; time = 0.1685s; samplesPerSecond = 3797.8
MPI Rank 2: 01/13/2018 07:58:20:  Epoch[ 1 of 4]-Minibatch[ 121- 130, 40.63%]: CrossEntropyWithSoftmax = 3.20932117 * 640; EvalClassificationError = 0.79531250 * 640; time = 0.1706s; samplesPerSecond = 3752.2
MPI Rank 2: 01/13/2018 07:58:20:  Epoch[ 1 of 4]-Minibatch[ 131- 140, 43.75%]: CrossEntropyWithSoftmax = 3.07460535 * 640; EvalClassificationError = 0.75468750 * 640; time = 0.1723s; samplesPerSecond = 3715.4
MPI Rank 2: 01/13/2018 07:58:20:  Epoch[ 1 of 4]-Minibatch[ 141- 150, 46.88%]: CrossEntropyWithSoftmax = 2.97529104 * 640; EvalClassificationError = 0.72031250 * 640; time = 0.1687s; samplesPerSecond = 3794.5
MPI Rank 2: 01/13/2018 07:58:20:  Epoch[ 1 of 4]-Minibatch[ 151- 160, 50.00%]: CrossEntropyWithSoftmax = 3.11968883 * 640; EvalClassificationError = 0.74531250 * 640; time = 0.1717s; samplesPerSecond = 3727.2
MPI Rank 2: 01/13/2018 07:58:21:  Epoch[ 1 of 4]-Minibatch[ 161- 170, 53.13%]: CrossEntropyWithSoftmax = 2.84172140 * 640; EvalClassificationError = 0.71093750 * 640; time = 0.1701s; samplesPerSecond = 3761.6
MPI Rank 2: 01/13/2018 07:58:21:  Epoch[ 1 of 4]-Minibatch[ 171- 180, 56.25%]: CrossEntropyWithSoftmax = 2.74031745 * 640; EvalClassificationError = 0.66093750 * 640; time = 0.1735s; samplesPerSecond = 3688.1
MPI Rank 2: 01/13/2018 07:58:21:  Epoch[ 1 of 4]-Minibatch[ 181- 190, 59.38%]: CrossEntropyWithSoftmax = 2.83858085 * 640; EvalClassificationError = 0.72656250 * 640; time = 0.1675s; samplesPerSecond = 3820.1
MPI Rank 2: 01/13/2018 07:58:21:  Epoch[ 1 of 4]-Minibatch[ 191- 200, 62.50%]: CrossEntropyWithSoftmax = 2.74632253 * 640; EvalClassificationError = 0.69218750 * 640; time = 0.1720s; samplesPerSecond = 3720.9
MPI Rank 2: 01/13/2018 07:58:21:  Epoch[ 1 of 4]-Minibatch[ 201- 210, 65.63%]: CrossEntropyWithSoftmax = 2.61033254 * 640; EvalClassificationError = 0.66250000 * 640; time = 0.1684s; samplesPerSecond = 3800.3
MPI Rank 2: 01/13/2018 07:58:21:  Epoch[ 1 of 4]-Minibatch[ 211- 220, 68.75%]: CrossEntropyWithSoftmax = 2.61330754 * 640; EvalClassificationError = 0.65000000 * 640; time = 0.1717s; samplesPerSecond = 3726.9
MPI Rank 2: 01/13/2018 07:58:22:  Epoch[ 1 of 4]-Minibatch[ 221- 230, 71.88%]: CrossEntropyWithSoftmax = 2.54591810 * 640; EvalClassificationError = 0.66406250 * 640; time = 0.1656s; samplesPerSecond = 3865.7
MPI Rank 2: 01/13/2018 07:58:22:  Epoch[ 1 of 4]-Minibatch[ 231- 240, 75.00%]: CrossEntropyWithSoftmax = 2.57566512 * 640; EvalClassificationError = 0.66093750 * 640; time = 0.1669s; samplesPerSecond = 3834.6
MPI Rank 2: 01/13/2018 07:58:22:  Epoch[ 1 of 4]-Minibatch[ 241- 250, 78.13%]: CrossEntropyWithSoftmax = 2.49164945 * 640; EvalClassificationError = 0.63281250 * 640; time = 0.1709s; samplesPerSecond = 3745.8
MPI Rank 2: 01/13/2018 07:58:22:  Epoch[ 1 of 4]-Minibatch[ 251- 260, 81.25%]: CrossEntropyWithSoftmax = 2.39954797 * 640; EvalClassificationError = 0.62812500 * 640; time = 0.1775s; samplesPerSecond = 3605.9
MPI Rank 2: 01/13/2018 07:58:22:  Epoch[ 1 of 4]-Minibatch[ 261- 270, 84.38%]: CrossEntropyWithSoftmax = 2.27034227 * 640; EvalClassificationError = 0.59375000 * 640; time = 0.1740s; samplesPerSecond = 3677.7
MPI Rank 2: 01/13/2018 07:58:22:  Epoch[ 1 of 4]-Minibatch[ 271- 280, 87.50%]: CrossEntropyWithSoftmax = 2.52112387 * 640; EvalClassificationError = 0.66093750 * 640; time = 0.1685s; samplesPerSecond = 3797.1
MPI Rank 2: 01/13/2018 07:58:23:  Epoch[ 1 of 4]-Minibatch[ 281- 290, 90.63%]: CrossEntropyWithSoftmax = 2.27800991 * 640; EvalClassificationError = 0.59062500 * 640; time = 0.1703s; samplesPerSecond = 3757.5
MPI Rank 2: 01/13/2018 07:58:23:  Epoch[ 1 of 4]-Minibatch[ 291- 300, 93.75%]: CrossEntropyWithSoftmax = 2.26783634 * 640; EvalClassificationError = 0.61093750 * 640; time = 0.1749s; samplesPerSecond = 3659.0
MPI Rank 2: 01/13/2018 07:58:23:  Epoch[ 1 of 4]-Minibatch[ 301- 310, 96.88%]: CrossEntropyWithSoftmax = 2.24590355 * 640; EvalClassificationError = 0.58593750 * 640; time = 0.1747s; samplesPerSecond = 3664.1
MPI Rank 2: 01/13/2018 07:58:23:  Epoch[ 1 of 4]-Minibatch[ 311- 320, 100.00%]: CrossEntropyWithSoftmax = 2.24415615 * 640; EvalClassificationError = 0.59843750 * 640; time = 0.1288s; samplesPerSecond = 4968.4
MPI Rank 2: 01/13/2018 07:58:23: Finished Epoch[ 1 of 4]: [Training] CrossEntropyWithSoftmax = 3.04696987 * 20480; EvalClassificationError = 0.73583984 * 20480; totalSamplesSeen = 20480; learningRatePerSample = 0.015625; epochTime=5.42584s
MPI Rank 2: 
MPI Rank 2: 01/13/2018 07:58:23: Starting Epoch 2: learning rate per sample = 0.001953  effective momentum = 0.656119  momentum as time constant = 607.5 samples
MPI Rank 2: 
MPI Rank 2: 01/13/2018 07:58:23: Starting minibatch loop, DataParallelSGD training (myRank = 2, numNodes = 3, numGradientBits = 1), BufferedAsyncGradientAggregation is ENABLED, distributed reading is ENABLED.
MPI Rank 2: Actual gradient aggregation time: 0.0567442
MPI Rank 2: Async gradient aggregation wait time: 0.0406804
MPI Rank 2: Actual gradient aggregation time: 0.0468348
MPI Rank 2: 01/13/2018 07:58:24:  Epoch[ 2 of 4]-Minibatch[   1-  10, 12.50%]: CrossEntropyWithSoftmax = 2.23258828 * 2304; EvalClassificationError = 0.61414931 * 2304; time = 0.4479s; samplesPerSecond = 5144.5
MPI Rank 2: Async gradient aggregation wait time: 0.0393159
MPI Rank 2: Actual gradient aggregation time: 0.0497662
MPI Rank 2: Async gradient aggregation wait time: 0.0395294
MPI Rank 2: Actual gradient aggregation time: 0.0481627
MPI Rank 2: 01/13/2018 07:58:24:  Epoch[ 2 of 4]-Minibatch[  11-  20, 25.00%]: CrossEntropyWithSoftmax = 2.23900729 * 2560; EvalClassificationError = 0.58320313 * 2560; time = 0.4747s; samplesPerSecond = 5392.9
MPI Rank 2: Async gradient aggregation wait time: 0.0197754
MPI Rank 2: Actual gradient aggregation time: 0.0394368
MPI Rank 2: Async gradient aggregation wait time: 0.0172038
MPI Rank 2: Actual gradient aggregation time: 0.0403552
MPI Rank 2: 01/13/2018 07:58:24:  Epoch[ 2 of 4]-Minibatch[  21-  30, 37.50%]: CrossEntropyWithSoftmax = 2.16821561 * 2560; EvalClassificationError = 0.57773438 * 2560; time = 0.3930s; samplesPerSecond = 6513.9
MPI Rank 2: Async gradient aggregation wait time: 0.0192733
MPI Rank 2: Actual gradient aggregation time: 0.0395468
MPI Rank 2: Async gradient aggregation wait time: 0.0237143
MPI Rank 2: Actual gradient aggregation time: 0.0389574
MPI Rank 2: 01/13/2018 07:58:25:  Epoch[ 2 of 4]-Minibatch[  31-  40, 50.00%]: CrossEntropyWithSoftmax = 2.19929007 * 2560; EvalClassificationError = 0.62148437 * 2560; time = 0.3953s; samplesPerSecond = 6476.6
MPI Rank 2: Async gradient aggregation wait time: 0.0215438
MPI Rank 2: Actual gradient aggregation time: 0.0384019
MPI Rank 2: Async gradient aggregation wait time: 0.0204583
MPI Rank 2: Actual gradient aggregation time: 0.0392419
MPI Rank 2: 01/13/2018 07:58:25:  Epoch[ 2 of 4]-Minibatch[  41-  50, 62.50%]: CrossEntropyWithSoftmax = 2.22078510 * 2560; EvalClassificationError = 0.59648437 * 2560; time = 0.3903s; samplesPerSecond = 6559.4
MPI Rank 2: Async gradient aggregation wait time: 0.0220661
MPI Rank 2: Actual gradient aggregation time: 0.0392639
MPI Rank 2: Async gradient aggregation wait time: 0.0204449
MPI Rank 2: Actual gradient aggregation time: 0.0377404
MPI Rank 2: 01/13/2018 07:58:26:  Epoch[ 2 of 4]-Minibatch[  51-  60, 75.00%]: CrossEntropyWithSoftmax = 2.11215778 * 2560; EvalClassificationError = 0.57500000 * 2560; time = 0.3931s; samplesPerSecond = 6511.8
MPI Rank 2: Async gradient aggregation wait time: 0.020381
MPI Rank 2: Actual gradient aggregation time: 0.0428779
MPI Rank 2: Async gradient aggregation wait time: 0.022894
MPI Rank 2: Actual gradient aggregation time: 0.0392671
MPI Rank 2: 01/13/2018 07:58:26:  Epoch[ 2 of 4]-Minibatch[  61-  70, 87.50%]: CrossEntropyWithSoftmax = 2.17278295 * 2560; EvalClassificationError = 0.61875000 * 2560; time = 0.3938s; samplesPerSecond = 6501.6
MPI Rank 2: Async gradient aggregation wait time: 0.0198748
MPI Rank 2: Actual gradient aggregation time: 0.0381007
MPI Rank 2: Async gradient aggregation wait time: 0.0194479
MPI Rank 2: Actual gradient aggregation time: 0.0385052
MPI Rank 2: 01/13/2018 07:58:26:  Epoch[ 2 of 4]-Minibatch[  71-  80, 100.00%]: CrossEntropyWithSoftmax = 2.13143218 * 2560; EvalClassificationError = 0.61015625 * 2560; time = 0.3868s; samplesPerSecond = 6618.8
MPI Rank 2: Async gradient aggregation wait time: 0.0210546
MPI Rank 2: Actual gradient aggregation time: 0.0300217
MPI Rank 2: 01/13/2018 07:58:26: Finished Epoch[ 2 of 4]: [Training] CrossEntropyWithSoftmax = 2.18331391 * 20480; EvalClassificationError = 0.59926758 * 20480; totalSamplesSeen = 40960; learningRatePerSample = 0.001953125; epochTime=3.33108s
MPI Rank 2: 
MPI Rank 2: 01/13/2018 07:58:27: Starting Epoch 3: learning rate per sample = 0.000098  effective momentum = 0.656119  momentum as time constant = 2429.9 samples
MPI Rank 2: 
MPI Rank 2: 01/13/2018 07:58:27: Starting minibatch loop, DataParallelSGD training (myRank = 2, numNodes = 3, numGradientBits = 1), BufferedAsyncGradientAggregation is ENABLED, distributed reading is ENABLED.
MPI Rank 2: Async gradient aggregation wait time: 0.0296124
MPI Rank 2: Actual gradient aggregation time: 0.0894143
MPI Rank 2: Async gradient aggregation wait time: 0.0295791
MPI Rank 2: Actual gradient aggregation time: 0.0862009
MPI Rank 2: 01/13/2018 07:58:27:  Epoch[ 3 of 4]-Minibatch[   1-  10, 50.00%]: CrossEntropyWithSoftmax = 2.20416772 * 9216; EvalClassificationError = 0.58626302 * 9216; time = 0.8957s; samplesPerSecond = 10289.3
MPI Rank 2: Async gradient aggregation wait time: 0.0275376
MPI Rank 2: Actual gradient aggregation time: 0.0939057
MPI Rank 2: Async gradient aggregation wait time: 0.0351082
MPI Rank 2: Actual gradient aggregation time: 0.0868625
MPI Rank 2: 01/13/2018 07:58:28:  Epoch[ 3 of 4]-Minibatch[  11-  20, 100.00%]: CrossEntropyWithSoftmax = 2.14455206 * 10240; EvalClassificationError = 0.58935547 * 10240; time = 0.9018s; samplesPerSecond = 11354.6
MPI Rank 2: 01/13/2018 07:58:28: Finished Epoch[ 3 of 4]: [Training] CrossEntropyWithSoftmax = 2.16743561 * 20480; EvalClassificationError = 0.58686523 * 20480; totalSamplesSeen = 61440; learningRatePerSample = 9.7656251e-05; epochTime=1.87216s
MPI Rank 2: 
MPI Rank 2: 01/13/2018 07:58:28: Starting Epoch 4: learning rate per sample = 0.000098  effective momentum = 0.656119  momentum as time constant = 2429.9 samples
MPI Rank 2: 
MPI Rank 2: 01/13/2018 07:58:28: Starting minibatch loop, DataParallelSGD training (myRank = 2, numNodes = 3, numGradientBits = 1), BufferedAsyncGradientAggregation is ENABLED, distributed reading is ENABLED.
MPI Rank 2: Async gradient aggregation wait time: 0.0181707
MPI Rank 2: Actual gradient aggregation time: 0.0715623
MPI Rank 2: Async gradient aggregation wait time: 0.0314283
MPI Rank 2: Actual gradient aggregation time: 0.087681
MPI Rank 2: 01/13/2018 07:58:29:  Epoch[ 4 of 4]-Minibatch[   1-  10, 50.00%]: CrossEntropyWithSoftmax = 1.99101995 * 9216; EvalClassificationError = 0.54448785 * 9216; time = 0.7788s; samplesPerSecond = 11834.0
MPI Rank 2: Async gradient aggregation wait time: 0.0444036
MPI Rank 2: Actual gradient aggregation time: 0.0814134
MPI Rank 2: Async gradient aggregation wait time: 0.0323686
MPI Rank 2: Actual gradient aggregation time: 0.083242
MPI Rank 2: 01/13/2018 07:58:30:  Epoch[ 4 of 4]-Minibatch[  11-  20, 100.00%]: CrossEntropyWithSoftmax = 1.97439774 * 10240; EvalClassificationError = 0.54384766 * 10240; time = 0.8366s; samplesPerSecond = 12240.4
MPI Rank 2: Async gradient aggregation wait time: 0.0180619
MPI Rank 2: 01/13/2018 07:58:30: Finished Epoch[ 4 of 4]: [Training] CrossEntropyWithSoftmax = 1.98345326 * 20480; EvalClassificationError = 0.54462891 * 20480; totalSamplesSeen = 81920; learningRatePerSample = 9.7656251e-05; epochTime=1.68703s
MPI Rank 2: 
MPI Rank 2: 01/13/2018 07:58:30: Action "train" complete.
MPI Rank 2: 
MPI Rank 2: 01/13/2018 07:58:30: __COMPLETED__
