CPU info:
    CPU Model Name: Intel(R) Xeon(R) CPU E5-2690 v3 @ 2.60GHz
    Hardware threads: 6
    Total Memory: 58719796 kB
-------------------------------------------------------------------
=== Running c:\local\msmpi-7.0.12437.6\Bin/mpiexec.exe -n 2 C:\jenkins\workspace\CNTK-Test-Windows-W1\x64\release\cntk.exe configFile=C:\jenkins\workspace\CNTK-Test-Windows-W1\Tests\EndToEndTests\Speech\DNN\SaveBestModelPerCriterion/cntkcv.cntk currentDirectory=C:\jenkins\workspace\CNTK-Test-Windows-W1\Tests\EndToEndTests\Speech\Data RunDir=C:\local\cygwin-2.8.2-x64\tmp\cntk-test-20180111085400.505371\Speech\DNN_SaveBestModelPerCriterion@release_gpu DataDir=C:\jenkins\workspace\CNTK-Test-Windows-W1\Tests\EndToEndTests\Speech\Data ConfigDir=C:\jenkins\workspace\CNTK-Test-Windows-W1\Tests\EndToEndTests\Speech\DNN\SaveBestModelPerCriterion OutputDir=C:\local\cygwin-2.8.2-x64\tmp\cntk-test-20180111085400.505371\Speech\DNN_SaveBestModelPerCriterion@release_gpu DeviceId=0 timestamping=true numCPUThreads=3 shareNodeValueMatrices=true saveBestModelPerCriterion=true stderr=C:\local\cygwin-2.8.2-x64\tmp\cntk-test-20180111085400.505371\Speech\DNN_SaveBestModelPerCriterion@release_gpu/stderr
CNTK 2.3.1+ (HEAD db192c, Jan 10 2018 22:59:43) at 2018/01/11 08:56:30

C:\jenkins\workspace\CNTK-Test-Windows-W1\x64\release\cntk.exe  configFile=C:\jenkins\workspace\CNTK-Test-Windows-W1\Tests\EndToEndTests\Speech\DNN\SaveBestModelPerCriterion/cntkcv.cntk  currentDirectory=C:\jenkins\workspace\CNTK-Test-Windows-W1\Tests\EndToEndTests\Speech\Data  RunDir=C:\local\cygwin-2.8.2-x64\tmp\cntk-test-20180111085400.505371\Speech\DNN_SaveBestModelPerCriterion@release_gpu  DataDir=C:\jenkins\workspace\CNTK-Test-Windows-W1\Tests\EndToEndTests\Speech\Data  ConfigDir=C:\jenkins\workspace\CNTK-Test-Windows-W1\Tests\EndToEndTests\Speech\DNN\SaveBestModelPerCriterion  OutputDir=C:\local\cygwin-2.8.2-x64\tmp\cntk-test-20180111085400.505371\Speech\DNN_SaveBestModelPerCriterion@release_gpu  DeviceId=0  timestamping=true  numCPUThreads=3  shareNodeValueMatrices=true  saveBestModelPerCriterion=true  stderr=C:\local\cygwin-2.8.2-x64\tmp\cntk-test-20180111085400.505371\Speech\DNN_SaveBestModelPerCriterion@release_gpu/stderr
Changed current directory to C:\jenkins\workspace\CNTK-Test-Windows-W1\Tests\EndToEndTests\Speech\Data
ping [requestnodes (before change)]: 2 nodes pinging each other
ping [requestnodes (after change)]: 2 nodes pinging each other
requestnodes [MPIWrapperMpi]: using 2 out of 2 MPI nodes on a single host (2 requested); we (0) are in (participating)
requestnodes [MPIWrapperMpi]: using 2 out of 2 MPI nodes on a single host (2 requested); we (1) are in (participating)
ping [mpihelper]: 2 nodes pinging each other
MPI Rank 0: 01/11/2018 08:56:30: Redirecting stderr to file C:\local\cygwin-2.8.2-x64\tmp\cntk-test-20180111085400.505371\Speech\DNN_SaveBestModelPerCriterion@release_gpu/stderr_speechTrain.logrank0
MPI Rank 0: CNTK 2.3.1+ (HEAD db192c, Jan 10 2018 22:59:43) at 2018/01/11 08:56:30
MPI Rank 0: 
MPI Rank 0: C:\jenkins\workspace\CNTK-Test-Windows-W1\x64\release\cntk.exe  configFile=C:\jenkins\workspace\CNTK-Test-Windows-W1\Tests\EndToEndTests\Speech\DNN\SaveBestModelPerCriterion/cntkcv.cntk  currentDirectory=C:\jenkins\workspace\CNTK-Test-Windows-W1\Tests\EndToEndTests\Speech\Data  RunDir=C:\local\cygwin-2.8.2-x64\tmp\cntk-test-20180111085400.505371\Speech\DNN_SaveBestModelPerCriterion@release_gpu  DataDir=C:\jenkins\workspace\CNTK-Test-Windows-W1\Tests\EndToEndTests\Speech\Data  ConfigDir=C:\jenkins\workspace\CNTK-Test-Windows-W1\Tests\EndToEndTests\Speech\DNN\SaveBestModelPerCriterion  OutputDir=C:\local\cygwin-2.8.2-x64\tmp\cntk-test-20180111085400.505371\Speech\DNN_SaveBestModelPerCriterion@release_gpu  DeviceId=0  timestamping=true  numCPUThreads=3  shareNodeValueMatrices=true  saveBestModelPerCriterion=true  stderr=C:\local\cygwin-2.8.2-x64\tmp\cntk-test-20180111085400.505371\Speech\DNN_SaveBestModelPerCriterion@release_gpu/stderr
MPI Rank 0: -------------------------------------------------------------------
MPI Rank 0: Build info: 
MPI Rank 0: 
MPI Rank 0: 		Built time: Jan 10 2018 22:47:38
MPI Rank 0: 		Last modified date: Wed Jan 10 22:18:32 2018
MPI Rank 0: 		Build type: Release
MPI Rank 0: 		Build target: GPU
MPI Rank 0: 		With ASGD: yes
MPI Rank 0: 		Math lib: mkl
MPI Rank 0: 		CUDA version: 9.0.0
MPI Rank 0: 		CUDNN version: 7.0.5
MPI Rank 0: 		Build Branch: HEAD
MPI Rank 0: 		Build SHA1: db192cd3cb9ac688cae719c41e5930a4e3f628ea
MPI Rank 0: 		MPI distribution: Microsoft MPI
MPI Rank 0: 		MPI version: 7.0.12437.6
MPI Rank 0: -------------------------------------------------------------------
MPI Rank 0: -------------------------------------------------------------------
MPI Rank 0: GPU info:
MPI Rank 0: 
MPI Rank 0: 		Device[0]: cores = 3072; computeCapability = 5.2; type = "Tesla M60"; total memory = 8124 MB; free memory = 8001 MB
MPI Rank 0: -------------------------------------------------------------------
MPI Rank 0: 01/11/2018 08:56:30: Using 3 CPU threads.
MPI Rank 0: 
MPI Rank 0: 01/11/2018 08:56:30: ##############################################################################
MPI Rank 0: 01/11/2018 08:56:30: #                                                                            #
MPI Rank 0: 01/11/2018 08:56:30: # speechTrain command (train action)                                         #
MPI Rank 0: 01/11/2018 08:56:30: #                                                                            #
MPI Rank 0: 01/11/2018 08:56:30: ##############################################################################
MPI Rank 0: 
MPI Rank 0: 01/11/2018 08:56:30: 
MPI Rank 0: Creating virgin network.
MPI Rank 0: SimpleNetworkBuilder Using GPU 0
MPI Rank 0: Reading script file glob_0000.scp ... 948 entries
MPI Rank 0: HTKDeserializer: selected '948' utterances grouped into '3' chunks, average chunk size: 316.0 utterances, 84244.7 frames (for I/O: 316.0 utterances, 84244.7 frames)
MPI Rank 0: HTKDeserializer: determined feature kind as '33'-dimensional 'USER' with frame shift 10.0 ms
MPI Rank 0: Total (133) state names in state list 'C:\jenkins\workspace\CNTK-Test-Windows-W1\Tests\EndToEndTests\Speech\Data/state.list'
MPI Rank 0: MLFDeserializer: '948' utterances with '252734' frames
MPI Rank 0: Reading script file glob_0000.cv.scp ... 300 entries
MPI Rank 0: HTKDeserializer: selected '300' utterances grouped into '1' chunks, average chunk size: 300.0 utterances, 83050.0 frames (for I/O: 300.0 utterances, 83050.0 frames)
MPI Rank 0: HTKDeserializer: determined feature kind as '33'-dimensional 'USER' with frame shift 10.0 ms
MPI Rank 0: Total (133) state names in state list 'C:\jenkins\workspace\CNTK-Test-Windows-W1\Tests\EndToEndTests\Speech\Data/state.list'
MPI Rank 0: MLFDeserializer: '948' utterances with '252734' frames
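The deserializer statistics above can be cross-checked against each other: 3 chunks averaging 84244.7 frames account for the 252734 training frames, and at a 10 ms frame shift that implies roughly 42 minutes of audio. A minimal sketch of the arithmetic (plain Python, numbers taken from the log, no CNTK APIs):

```python
# Cross-check of the HTK/MLF deserializer statistics reported in the log.
num_chunks = 3
avg_frames_per_chunk = 84244.7      # average chunk size in frames
total_frames = 252734               # frames reported by MLFDeserializer
frame_shift_ms = 10.0               # frame shift from the log

# 3 chunks * 84244.7 frames/chunk ~= 252734 frames (rounding aside).
assert abs(num_chunks * avg_frames_per_chunk - total_frames) < 1.0

# Total audio duration implied by the frame count.
duration_min = total_frames * frame_shift_ms / 1000.0 / 60.0
print(f"~{duration_min:.1f} minutes of training audio")  # ~42.1 minutes
```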
MPI Rank 0: 01/11/2018 08:56:31: 
MPI Rank 0: Model has 25 nodes. Using GPU 0.
MPI Rank 0: 
MPI Rank 0: 01/11/2018 08:56:31: Training criterion:   CrossEntropyWithSoftmax = CrossEntropyWithSoftmax
MPI Rank 0: 01/11/2018 08:56:31: Evaluation criterion: EvalClassificationError = ClassificationError
MPI Rank 0: 
MPI Rank 0: 
MPI Rank 0: Allocating matrices for forward and/or backward propagation.
MPI Rank 0: 
MPI Rank 0: Gradient Memory Aliasing: 4 are aliased.
MPI Rank 0: 	W1*H1 (gradient) reuses W1*H1+B1 (gradient)
MPI Rank 0: 	W2*H1 (gradient) reuses HLast (gradient)
MPI Rank 0: 
MPI Rank 0: Memory Sharing: Out of 40 matrices, 20 are shared as 5, and 20 are not shared.
MPI Rank 0: 
MPI Rank 0: Here are the ones that share memory:
MPI Rank 0: 	{ PosteriorProb : [132 x 1 x *]
MPI Rank 0: 	  ScaledLogLikelihood : [132 x 1 x *] }
MPI Rank 0: 	{ H1 : [512 x 1 x *]
MPI Rank 0: 	  W0 : [512 x 363] (gradient)
MPI Rank 0: 	  W0*features : [512 x *] }
MPI Rank 0: 	{ H1 : [512 x 1 x *] (gradient)
MPI Rank 0: 	  H2 : [512 x 1 x *] (gradient)
MPI Rank 0: 	  HLast : [132 x 1 x *]
MPI Rank 0: 	  W0*features : [512 x *] (gradient)
MPI Rank 0: 	  W1*H1+B1 : [512 x 1 x *] }
MPI Rank 0: 	{ H2 : [512 x 1 x *]
MPI Rank 0: 	  W0*features+B0 : [512 x 1 x *]
MPI Rank 0: 	  W1 : [512 x 512] (gradient)
MPI Rank 0: 	  W1*H1 : [512 x 1 x *] }
MPI Rank 0: 	{ HLast : [132 x 1 x *] (gradient)
MPI Rank 0: 	  W0*features+B0 : [512 x 1 x *] (gradient)
MPI Rank 0: 	  W1*H1 : [512 x 1 x *] (gradient)
MPI Rank 0: 	  W1*H1+B1 : [512 x 1 x *] (gradient)
MPI Rank 0: 	  W2*H1 : [132 x 1 x *]
MPI Rank 0: 	  W2*H1 : [132 x 1 x *] (gradient) }
MPI Rank 0: 
MPI Rank 0: Here are the ones that don't share memory:
MPI Rank 0: 	{features : [363 x *]}
MPI Rank 0: 	{MVNormalizedFeatures : [363 x *]}
MPI Rank 0: 	{B1 : [512 x 1] (gradient)}
MPI Rank 0: 	{W2 : [132 x 512] (gradient)}
MPI Rank 0: 	{CrossEntropyWithSoftmax : [1] (gradient)}
MPI Rank 0: 	{LogOfPrior : [132]}
MPI Rank 0: 	{CrossEntropyWithSoftmax : [1]}
MPI Rank 0: 	{B0 : [512 x 1] (gradient)}
MPI Rank 0: 	{B2 : [132 x 1] (gradient)}
MPI Rank 0: 	{W0 : [512 x 363]}
MPI Rank 0: 	{B1 : [512 x 1]}
MPI Rank 0: 	{EvalClassificationError : [1]}
MPI Rank 0: 	{InvStdOfFeatures : [363]}
MPI Rank 0: 	{MeanOfFeatures : [363]}
MPI Rank 0: 	{B0 : [512 x 1]}
MPI Rank 0: 	{W2 : [132 x 512]}
MPI Rank 0: 	{W1 : [512 x 512]}
MPI Rank 0: 	{B2 : [132 x 1]}
MPI Rank 0: 	{labels : [132 x *]}
MPI Rank 0: 	{Prior : [132]}
MPI Rank 0: 
MPI Rank 0: 
MPI Rank 0: 01/11/2018 08:56:31: Training 516740 parameters in 6 out of 6 parameter tensors and 15 nodes with gradient:
MPI Rank 0: 
MPI Rank 0: 01/11/2018 08:56:31: 	Node 'B0' (LearnableParameter operation) : [512 x 1]
MPI Rank 0: 01/11/2018 08:56:31: 	Node 'B1' (LearnableParameter operation) : [512 x 1]
MPI Rank 0: 01/11/2018 08:56:31: 	Node 'B2' (LearnableParameter operation) : [132 x 1]
MPI Rank 0: 01/11/2018 08:56:31: 	Node 'W0' (LearnableParameter operation) : [512 x 363]
MPI Rank 0: 01/11/2018 08:56:31: 	Node 'W1' (LearnableParameter operation) : [512 x 512]
MPI Rank 0: 01/11/2018 08:56:31: 	Node 'W2' (LearnableParameter operation) : [132 x 512]
MPI Rank 0: 
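The "516740 parameters" figure above follows directly from the six tensor shapes listed in the node dump. A quick reproduction (plain Python, shapes copied from the log):

```python
# Reproduce the "Training 516740 parameters" count from the tensor shapes.
shapes = {
    "W0": (512, 363), "W1": (512, 512), "W2": (132, 512),  # weight matrices
    "B0": (512, 1),   "B1": (512, 1),   "B2": (132, 1),    # bias vectors
}
total = sum(rows * cols for rows, cols in shapes.values())
print(total)  # 516740
```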
MPI Rank 0: Initializing dataParallelSGD with FP64 aggregation.
MPI Rank 0: 
MPI Rank 0: 01/11/2018 08:56:31: Precomputing --> 3 PreCompute nodes found.
MPI Rank 0: 
MPI Rank 0: 01/11/2018 08:56:31: 	MeanOfFeatures = Mean()
MPI Rank 0: 01/11/2018 08:56:31: 	InvStdOfFeatures = InvStdDev()
MPI Rank 0: 01/11/2018 08:56:31: 	Prior = Mean()
MPI Rank 0: 
MPI Rank 0: 01/11/2018 08:56:34: Precomputing --> Completed.
MPI Rank 0: 
MPI Rank 0: 
MPI Rank 0: 01/11/2018 08:56:35: Starting Epoch 1: learning rate per sample = 0.015625  effective momentum = 0.900000  momentum as time constant = 607.4 samples
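The log reports momentum both as an effective per-minibatch value and as a time constant in samples; the two are related by momentum = exp(-minibatchSize / timeConstant). A sketch of that relation, where the minibatch sizes (64 for epoch 1, 256 and 1024 later) are inferred from the per-report sample counts in this log rather than stated explicitly:

```python
import math

# Relation between effective momentum and "momentum as time constant":
#   momentum = exp(-minibatch_size / time_constant_samples)
# Minibatch sizes are inferred from the log's sample counts
# (640 samples per 10 minibatches in epoch 1 => mb size 64, etc.).
def effective_momentum(mb_size, time_constant_samples):
    return math.exp(-mb_size / time_constant_samples)

print(round(effective_momentum(64, 607.4), 4))   # ~0.9    (epoch 1)
print(round(effective_momentum(256, 607.5), 4))  # ~0.6561 (epoch 2)
```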
MPI Rank 0: 
MPI Rank 0: 01/11/2018 08:56:35: Starting minibatch loop, DataParallelSGD training (myRank = 0, numNodes = 2, numGradientBits = 64), distributed reading is ENABLED.
MPI Rank 0: 01/11/2018 08:56:36:  Epoch[ 1 of 15]-Minibatch[   1-  10, 3.13%]: CrossEntropyWithSoftmax = 4.62512789 * 640; EvalClassificationError = 0.94062500 * 640; time = 0.0857s; samplesPerSecond = 7465.7
MPI Rank 0: 01/11/2018 08:56:36:  Epoch[ 1 of 15]-Minibatch[  11-  20, 6.25%]: CrossEntropyWithSoftmax = 4.35619366 * 640; EvalClassificationError = 0.92343750 * 640; time = 0.0770s; samplesPerSecond = 8309.7
MPI Rank 0: 01/11/2018 08:56:36:  Epoch[ 1 of 15]-Minibatch[  21-  30, 9.38%]: CrossEntropyWithSoftmax = 3.97911998 * 640; EvalClassificationError = 0.89531250 * 640; time = 0.0667s; samplesPerSecond = 9599.3
MPI Rank 0: 01/11/2018 08:56:36:  Epoch[ 1 of 15]-Minibatch[  31-  40, 12.50%]: CrossEntropyWithSoftmax = 3.73643568 * 640; EvalClassificationError = 0.84531250 * 640; time = 0.0680s; samplesPerSecond = 9406.2
MPI Rank 0: 01/11/2018 08:56:36:  Epoch[ 1 of 15]-Minibatch[  41-  50, 15.63%]: CrossEntropyWithSoftmax = 3.83079081 * 640; EvalClassificationError = 0.88281250 * 640; time = 0.0653s; samplesPerSecond = 9807.6
MPI Rank 0: 01/11/2018 08:56:36:  Epoch[ 1 of 15]-Minibatch[  51-  60, 18.75%]: CrossEntropyWithSoftmax = 3.71437690 * 640; EvalClassificationError = 0.86875000 * 640; time = 0.0671s; samplesPerSecond = 9532.1
MPI Rank 0: 01/11/2018 08:56:36:  Epoch[ 1 of 15]-Minibatch[  61-  70, 21.88%]: CrossEntropyWithSoftmax = 3.42186231 * 640; EvalClassificationError = 0.79062500 * 640; time = 0.0650s; samplesPerSecond = 9849.0
MPI Rank 0: 01/11/2018 08:56:36:  Epoch[ 1 of 15]-Minibatch[  71-  80, 25.00%]: CrossEntropyWithSoftmax = 3.53658053 * 640; EvalClassificationError = 0.82031250 * 640; time = 0.0664s; samplesPerSecond = 9637.4
MPI Rank 0: 01/11/2018 08:56:36:  Epoch[ 1 of 15]-Minibatch[  81-  90, 28.13%]: CrossEntropyWithSoftmax = 3.49758018 * 640; EvalClassificationError = 0.81718750 * 640; time = 0.0694s; samplesPerSecond = 9219.7
MPI Rank 0: 01/11/2018 08:56:36:  Epoch[ 1 of 15]-Minibatch[  91- 100, 31.25%]: CrossEntropyWithSoftmax = 3.39996308 * 640; EvalClassificationError = 0.80468750 * 640; time = 0.0682s; samplesPerSecond = 9387.4
MPI Rank 0: 01/11/2018 08:56:36:  Epoch[ 1 of 15]-Minibatch[ 101- 110, 34.38%]: CrossEntropyWithSoftmax = 3.49445773 * 640; EvalClassificationError = 0.82500000 * 640; time = 0.0670s; samplesPerSecond = 9555.4
MPI Rank 0: 01/11/2018 08:56:36:  Epoch[ 1 of 15]-Minibatch[ 111- 120, 37.50%]: CrossEntropyWithSoftmax = 3.26676999 * 640; EvalClassificationError = 0.79218750 * 640; time = 0.0670s; samplesPerSecond = 9557.8
MPI Rank 0: 01/11/2018 08:56:36:  Epoch[ 1 of 15]-Minibatch[ 121- 130, 40.63%]: CrossEntropyWithSoftmax = 3.18870174 * 640; EvalClassificationError = 0.78906250 * 640; time = 0.0665s; samplesPerSecond = 9619.9
MPI Rank 0: 01/11/2018 08:56:36:  Epoch[ 1 of 15]-Minibatch[ 131- 140, 43.75%]: CrossEntropyWithSoftmax = 3.05687264 * 640; EvalClassificationError = 0.74687500 * 640; time = 0.0642s; samplesPerSecond = 9972.1
MPI Rank 0: 01/11/2018 08:56:36:  Epoch[ 1 of 15]-Minibatch[ 141- 150, 46.88%]: CrossEntropyWithSoftmax = 2.95594570 * 640; EvalClassificationError = 0.71875000 * 640; time = 0.0653s; samplesPerSecond = 9795.2
MPI Rank 0: 01/11/2018 08:56:37:  Epoch[ 1 of 15]-Minibatch[ 151- 160, 50.00%]: CrossEntropyWithSoftmax = 3.10219605 * 640; EvalClassificationError = 0.74062500 * 640; time = 0.0649s; samplesPerSecond = 9855.9
MPI Rank 0: 01/11/2018 08:56:37:  Epoch[ 1 of 15]-Minibatch[ 161- 170, 53.13%]: CrossEntropyWithSoftmax = 2.80745016 * 640; EvalClassificationError = 0.70625000 * 640; time = 0.0655s; samplesPerSecond = 9766.5
MPI Rank 0: 01/11/2018 08:56:37:  Epoch[ 1 of 15]-Minibatch[ 171- 180, 56.25%]: CrossEntropyWithSoftmax = 2.72061843 * 640; EvalClassificationError = 0.65468750 * 640; time = 0.0802s; samplesPerSecond = 7980.9
MPI Rank 0: 01/11/2018 08:56:37:  Epoch[ 1 of 15]-Minibatch[ 181- 190, 59.38%]: CrossEntropyWithSoftmax = 2.80425748 * 640; EvalClassificationError = 0.71718750 * 640; time = 0.0669s; samplesPerSecond = 9572.1
MPI Rank 0: 01/11/2018 08:56:37:  Epoch[ 1 of 15]-Minibatch[ 191- 200, 62.50%]: CrossEntropyWithSoftmax = 2.71253069 * 640; EvalClassificationError = 0.67812500 * 640; time = 0.0803s; samplesPerSecond = 7974.4
MPI Rank 0: 01/11/2018 08:56:37:  Epoch[ 1 of 15]-Minibatch[ 201- 210, 65.63%]: CrossEntropyWithSoftmax = 2.59360400 * 640; EvalClassificationError = 0.66093750 * 640; time = 0.0667s; samplesPerSecond = 9596.9
MPI Rank 0: 01/11/2018 08:56:37:  Epoch[ 1 of 15]-Minibatch[ 211- 220, 68.75%]: CrossEntropyWithSoftmax = 2.60386650 * 640; EvalClassificationError = 0.65625000 * 640; time = 0.0661s; samplesPerSecond = 9684.8
MPI Rank 0: 01/11/2018 08:56:37:  Epoch[ 1 of 15]-Minibatch[ 221- 230, 71.88%]: CrossEntropyWithSoftmax = 2.53706679 * 640; EvalClassificationError = 0.65625000 * 640; time = 0.0673s; samplesPerSecond = 9516.6
MPI Rank 0: 01/11/2018 08:56:37:  Epoch[ 1 of 15]-Minibatch[ 231- 240, 75.00%]: CrossEntropyWithSoftmax = 2.56177344 * 640; EvalClassificationError = 0.65625000 * 640; time = 0.0649s; samplesPerSecond = 9858.3
MPI Rank 0: 01/11/2018 08:56:37:  Epoch[ 1 of 15]-Minibatch[ 241- 250, 78.13%]: CrossEntropyWithSoftmax = 2.50118792 * 640; EvalClassificationError = 0.64218750 * 640; time = 0.0651s; samplesPerSecond = 9837.1
MPI Rank 0: 01/11/2018 08:56:37:  Epoch[ 1 of 15]-Minibatch[ 251- 260, 81.25%]: CrossEntropyWithSoftmax = 2.40119789 * 640; EvalClassificationError = 0.62500000 * 640; time = 0.0683s; samplesPerSecond = 9363.7
MPI Rank 0: 01/11/2018 08:56:37:  Epoch[ 1 of 15]-Minibatch[ 261- 270, 84.38%]: CrossEntropyWithSoftmax = 2.27491504 * 640; EvalClassificationError = 0.58906250 * 640; time = 0.0666s; samplesPerSecond = 9605.3
MPI Rank 0: 01/11/2018 08:56:37:  Epoch[ 1 of 15]-Minibatch[ 271- 280, 87.50%]: CrossEntropyWithSoftmax = 2.51724208 * 640; EvalClassificationError = 0.65781250 * 640; time = 0.0685s; samplesPerSecond = 9348.8
MPI Rank 0: 01/11/2018 08:56:37:  Epoch[ 1 of 15]-Minibatch[ 281- 290, 90.63%]: CrossEntropyWithSoftmax = 2.27797543 * 640; EvalClassificationError = 0.59687500 * 640; time = 0.0668s; samplesPerSecond = 9577.6
MPI Rank 0: 01/11/2018 08:56:38:  Epoch[ 1 of 15]-Minibatch[ 291- 300, 93.75%]: CrossEntropyWithSoftmax = 2.26017741 * 640; EvalClassificationError = 0.60937500 * 640; time = 0.0675s; samplesPerSecond = 9478.3
MPI Rank 0: 01/11/2018 08:56:38:  Epoch[ 1 of 15]-Minibatch[ 301- 310, 96.88%]: CrossEntropyWithSoftmax = 2.24735343 * 640; EvalClassificationError = 0.58437500 * 640; time = 0.0669s; samplesPerSecond = 9561.6
MPI Rank 0: 01/11/2018 08:56:38:  Epoch[ 1 of 15]-Minibatch[ 311- 320, 100.00%]: CrossEntropyWithSoftmax = 2.23665382 * 640; EvalClassificationError = 0.60625000 * 640; time = 0.0662s; samplesPerSecond = 9660.8
MPI Rank 0: 01/11/2018 08:56:38: Finished Epoch[ 1 of 15]: [Training] CrossEntropyWithSoftmax = 3.03815142 * 20480; EvalClassificationError = 0.73432617 * 20480; totalSamplesSeen = 20480; learningRatePerSample = 0.015625; epochTime=2.20165s
MPI Rank 0: 01/11/2018 08:56:39: Final Results: Minibatch[1-1299]: CrossEntropyWithSoftmax = 2.24821048 * 83050; perplexity = 9.47077252; EvalClassificationError = 0.61623119 * 83050
MPI Rank 0: 01/11/2018 08:56:39: Finished Epoch[ 1 of 15]: [Validate] CrossEntropyWithSoftmax = 2.24821048 * 83050; EvalClassificationError = 0.61623119 * 83050
MPI Rank 0: 01/11/2018 08:56:39: Best epoch per criterion so far: [Validate] CrossEntropyWithSoftmax = 2.248210 (Epoch 1); EvalClassificationError = 0.616231 (Epoch 1)
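Two things in the validation lines above are worth unpacking: the reported perplexity is exp(cross entropy per sample), and saveBestModelPerCriterion tracks the best epoch independently for each criterion. A minimal sketch of both ideas (plain Python; the tracking helper is hypothetical, not CNTK's internal implementation):

```python
import math

# perplexity = exp(cross entropy per sample), as in the "Final Results" line.
ce = 2.24821048
print(round(math.exp(ce), 4))  # ~9.4708, matching the reported perplexity

# saveBestModelPerCriterion keeps the best epoch separately per criterion.
# (Hypothetical sketch of the bookkeeping, not CNTK's actual code.)
best = {}  # criterion name -> (best value, epoch)

def update_best(epoch, results):
    for crit, value in results.items():
        if crit not in best or value < best[crit][0]:
            best[crit] = (value, epoch)

update_best(1, {"CrossEntropyWithSoftmax": 2.24821048,
                "EvalClassificationError": 0.61623119})
update_best(2, {"CrossEntropyWithSoftmax": 1.92733488,
                "EvalClassificationError": 0.53122216})
print(best)  # epoch 2 is best so far for both criteria
```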
MPI Rank 0: 01/11/2018 08:56:39: SGD: Saving checkpoint model 'C:\local\cygwin-2.8.2-x64\tmp\cntk-test-20180111085400.505371\Speech\DNN_SaveBestModelPerCriterion@release_gpu/models/cntkSpeech.dnn.1'
MPI Rank 0: 
MPI Rank 0: 01/11/2018 08:56:39: Starting Epoch 2: learning rate per sample = 0.001953  effective momentum = 0.656119  momentum as time constant = 607.5 samples
MPI Rank 0: 
MPI Rank 0: 01/11/2018 08:56:39: Starting minibatch loop, DataParallelSGD training (myRank = 0, numNodes = 2, numGradientBits = 64), distributed reading is ENABLED.
MPI Rank 0: 01/11/2018 08:56:39:  Epoch[ 2 of 15]-Minibatch[   1-  10, 12.50%]: CrossEntropyWithSoftmax = 2.13894071 * 2560; EvalClassificationError = 0.56992188 * 2560; time = 0.1379s; samplesPerSecond = 18569.8
MPI Rank 0: 01/11/2018 08:56:40:  Epoch[ 2 of 15]-Minibatch[  11-  20, 25.00%]: CrossEntropyWithSoftmax = 2.06106261 * 2560; EvalClassificationError = 0.55664063 * 2560; time = 0.1249s; samplesPerSecond = 20500.8
MPI Rank 0: 01/11/2018 08:56:40:  Epoch[ 2 of 15]-Minibatch[  21-  30, 37.50%]: CrossEntropyWithSoftmax = 2.04459475 * 2560; EvalClassificationError = 0.55039063 * 2560; time = 0.1233s; samplesPerSecond = 20765.4
MPI Rank 0: 01/11/2018 08:56:40:  Epoch[ 2 of 15]-Minibatch[  31-  40, 50.00%]: CrossEntropyWithSoftmax = 2.03347291 * 2560; EvalClassificationError = 0.55742187 * 2560; time = 0.1259s; samplesPerSecond = 20331.2
MPI Rank 0: 01/11/2018 08:56:40:  Epoch[ 2 of 15]-Minibatch[  41-  50, 62.50%]: CrossEntropyWithSoftmax = 2.02079287 * 2560; EvalClassificationError = 0.54414063 * 2560; time = 0.1218s; samplesPerSecond = 21023.0
MPI Rank 0: 01/11/2018 08:56:40:  Epoch[ 2 of 15]-Minibatch[  51-  60, 75.00%]: CrossEntropyWithSoftmax = 1.96950012 * 2560; EvalClassificationError = 0.53085938 * 2560; time = 0.1253s; samplesPerSecond = 20431.8
MPI Rank 0: 01/11/2018 08:56:40:  Epoch[ 2 of 15]-Minibatch[  61-  70, 87.50%]: CrossEntropyWithSoftmax = 1.95934863 * 2560; EvalClassificationError = 0.52812500 * 2560; time = 0.1212s; samplesPerSecond = 21120.0
MPI Rank 0: 01/11/2018 08:56:40:  Epoch[ 2 of 15]-Minibatch[  71-  80, 100.00%]: CrossEntropyWithSoftmax = 1.94070839 * 2560; EvalClassificationError = 0.53125000 * 2560; time = 0.1212s; samplesPerSecond = 21127.6
MPI Rank 0: 01/11/2018 08:56:40: Finished Epoch[ 2 of 15]: [Training] CrossEntropyWithSoftmax = 2.02105263 * 20480; EvalClassificationError = 0.54609375 * 20480; totalSamplesSeen = 40960; learningRatePerSample = 0.001953125; epochTime=1.00871s
MPI Rank 0: 01/11/2018 08:56:41: Final Results: Minibatch[1-326]: CrossEntropyWithSoftmax = 1.92733488 * 83050; perplexity = 6.87117334; EvalClassificationError = 0.53122216 * 83050
MPI Rank 0: 01/11/2018 08:56:41: Finished Epoch[ 2 of 15]: [Validate] CrossEntropyWithSoftmax = 1.92733488 * 83050; EvalClassificationError = 0.53122216 * 83050
MPI Rank 0: 01/11/2018 08:56:41: Best epoch per criterion so far: [Validate] CrossEntropyWithSoftmax = 1.927335 (Epoch 2); EvalClassificationError = 0.531222 (Epoch 2)
MPI Rank 0: 01/11/2018 08:56:41: SGD: Saving checkpoint model 'C:\local\cygwin-2.8.2-x64\tmp\cntk-test-20180111085400.505371\Speech\DNN_SaveBestModelPerCriterion@release_gpu/models/cntkSpeech.dnn.2'
MPI Rank 0: 
MPI Rank 0: 01/11/2018 08:56:41: Starting Epoch 3: learning rate per sample = 0.000098  effective momentum = 0.656119  momentum as time constant = 2429.9 samples
MPI Rank 0: 
MPI Rank 0: 01/11/2018 08:56:41: Starting minibatch loop, DataParallelSGD training (myRank = 0, numNodes = 2, numGradientBits = 64), distributed reading is ENABLED.
MPI Rank 0: 01/11/2018 08:56:42:  Epoch[ 3 of 15]-Minibatch[   1-  10, 50.00%]: CrossEntropyWithSoftmax = 1.94336420 * 10240; EvalClassificationError = 0.53056641 * 10240; time = 0.3987s; samplesPerSecond = 25683.8
MPI Rank 0: 01/11/2018 08:56:42:  Epoch[ 3 of 15]-Minibatch[  11-  20, 100.00%]: CrossEntropyWithSoftmax = 1.96525554 * 10240; EvalClassificationError = 0.54873047 * 10240; time = 0.3563s; samplesPerSecond = 28738.4
MPI Rank 0: 01/11/2018 08:56:42: Finished Epoch[ 3 of 15]: [Training] CrossEntropyWithSoftmax = 1.95430987 * 20480; EvalClassificationError = 0.53964844 * 20480; totalSamplesSeen = 61440; learningRatePerSample = 9.7656251e-05; epochTime=0.762259s
MPI Rank 0: 01/11/2018 08:56:43: Final Results: Minibatch[1-83]: CrossEntropyWithSoftmax = 1.90639119 * 83050; perplexity = 6.72876211; EvalClassificationError = 0.52304636 * 83050
MPI Rank 0: 01/11/2018 08:56:43: Finished Epoch[ 3 of 15]: [Validate] CrossEntropyWithSoftmax = 1.90639119 * 83050; EvalClassificationError = 0.52304636 * 83050
MPI Rank 0: 01/11/2018 08:56:43: Best epoch per criterion so far: [Validate] CrossEntropyWithSoftmax = 1.906391 (Epoch 3); EvalClassificationError = 0.523046 (Epoch 3)
MPI Rank 0: 01/11/2018 08:56:43: SGD: Saving checkpoint model 'C:\local\cygwin-2.8.2-x64\tmp\cntk-test-20180111085400.505371\Speech\DNN_SaveBestModelPerCriterion@release_gpu/models/cntkSpeech.dnn.3'
MPI Rank 0: 
MPI Rank 0: 01/11/2018 08:56:43: Starting Epoch 4: learning rate per sample = 0.000098  effective momentum = 0.656119  momentum as time constant = 2429.9 samples
MPI Rank 0: 
MPI Rank 0: 01/11/2018 08:56:43: Starting minibatch loop, DataParallelSGD training (myRank = 0, numNodes = 2, numGradientBits = 64), distributed reading is ENABLED.
MPI Rank 0: 01/11/2018 08:56:43:  Epoch[ 4 of 15]-Minibatch[   1-  10, 50.00%]: CrossEntropyWithSoftmax = 1.92960398 * 10240; EvalClassificationError = 0.52734375 * 10240; time = 0.3610s; samplesPerSecond = 28363.8
MPI Rank 0: 01/11/2018 08:56:44:  Epoch[ 4 of 15]-Minibatch[  11-  20, 100.00%]: CrossEntropyWithSoftmax = 1.91791093 * 10240; EvalClassificationError = 0.52138672 * 10240; time = 0.3528s; samplesPerSecond = 29024.9
MPI Rank 0: 01/11/2018 08:56:44: Finished Epoch[ 4 of 15]: [Training] CrossEntropyWithSoftmax = 1.92375746 * 20480; EvalClassificationError = 0.52436523 * 20480; totalSamplesSeen = 81920; learningRatePerSample = 9.7656251e-05; epochTime=0.721823s
MPI Rank 0: 01/11/2018 08:56:44: Final Results: Minibatch[1-83]: CrossEntropyWithSoftmax = 1.89723688 * 83050; perplexity = 6.66744604; EvalClassificationError = 0.52192655 * 83050
MPI Rank 0: 01/11/2018 08:56:44: Finished Epoch[ 4 of 15]: [Validate] CrossEntropyWithSoftmax = 1.89723688 * 83050; EvalClassificationError = 0.52192655 * 83050
MPI Rank 0: 01/11/2018 08:56:44: Best epoch per criterion so far: [Validate] CrossEntropyWithSoftmax = 1.897237 (Epoch 4); EvalClassificationError = 0.521927 (Epoch 4)
MPI Rank 0: 01/11/2018 08:56:44: SGD: Saving checkpoint model 'C:\local\cygwin-2.8.2-x64\tmp\cntk-test-20180111085400.505371\Speech\DNN_SaveBestModelPerCriterion@release_gpu/models/cntkSpeech.dnn.4'
MPI Rank 0: 
MPI Rank 0: 01/11/2018 08:56:45: Starting Epoch 5: learning rate per sample = 0.000098  effective momentum = 0.656119  momentum as time constant = 2429.9 samples
MPI Rank 0: 
MPI Rank 0: 01/11/2018 08:56:45: Starting minibatch loop, DataParallelSGD training (myRank = 0, numNodes = 2, numGradientBits = 64), distributed reading is ENABLED.
MPI Rank 0: 01/11/2018 08:56:45:  Epoch[ 5 of 15]-Minibatch[   1-  10, 50.00%]: CrossEntropyWithSoftmax = 1.93213905 * 10240; EvalClassificationError = 0.52744141 * 10240; time = 0.3573s; samplesPerSecond = 28659.2
MPI Rank 0: 01/11/2018 08:56:45:  Epoch[ 5 of 15]-Minibatch[  11-  20, 100.00%]: CrossEntropyWithSoftmax = 1.91008045 * 10240; EvalClassificationError = 0.52197266 * 10240; time = 0.3480s; samplesPerSecond = 29422.0
MPI Rank 0: 01/11/2018 08:56:45: Finished Epoch[ 5 of 15]: [Training] CrossEntropyWithSoftmax = 1.92110975 * 20480; EvalClassificationError = 0.52470703 * 20480; totalSamplesSeen = 102400; learningRatePerSample = 9.7656251e-05; epochTime=0.713388s
MPI Rank 0: 01/11/2018 08:56:46: Final Results: Minibatch[1-83]: CrossEntropyWithSoftmax = 1.88941575 * 83050; perplexity = 6.61550243; EvalClassificationError = 0.52039735 * 83050
MPI Rank 0: 01/11/2018 08:56:46: Finished Epoch[ 5 of 15]: [Validate] CrossEntropyWithSoftmax = 1.88941575 * 83050; EvalClassificationError = 0.52039735 * 83050
MPI Rank 0: 01/11/2018 08:56:46: Best epoch per criterion so far: [Validate] CrossEntropyWithSoftmax = 1.889416 (Epoch 5); EvalClassificationError = 0.520397 (Epoch 5)
MPI Rank 0: 01/11/2018 08:56:46: SGD: Saving checkpoint model 'C:\local\cygwin-2.8.2-x64\tmp\cntk-test-20180111085400.505371\Speech\DNN_SaveBestModelPerCriterion@release_gpu/models/cntkSpeech.dnn.5'
MPI Rank 0: 
MPI Rank 0: 01/11/2018 08:56:46: Starting Epoch 6: learning rate per sample = 0.000098  effective momentum = 0.656119  momentum as time constant = 2429.9 samples
MPI Rank 0: 
MPI Rank 0: 01/11/2018 08:56:46: Starting minibatch loop, DataParallelSGD training (myRank = 0, numNodes = 2, numGradientBits = 64), distributed reading is ENABLED.
MPI Rank 0: 01/11/2018 08:56:46:  Epoch[ 6 of 15]-Minibatch[   1-  10, 50.00%]: CrossEntropyWithSoftmax = 1.92107601 * 10240; EvalClassificationError = 0.52783203 * 10240; time = 0.3537s; samplesPerSecond = 28954.3
MPI Rank 0: 01/11/2018 08:56:47:  Epoch[ 6 of 15]-Minibatch[  11-  20, 100.00%]: CrossEntropyWithSoftmax = 1.90118051 * 10240; EvalClassificationError = 0.52031250 * 10240; time = 0.3442s; samplesPerSecond = 29753.7
MPI Rank 0: 01/11/2018 08:56:47: Finished Epoch[ 6 of 15]: [Training] CrossEntropyWithSoftmax = 1.91112826 * 20480; EvalClassificationError = 0.52407227 * 20480; totalSamplesSeen = 122880; learningRatePerSample = 9.7656251e-05; epochTime=0.704781s
MPI Rank 0: 01/11/2018 08:56:48: Final Results: Minibatch[1-83]: CrossEntropyWithSoftmax = 1.88230716 * 83050; perplexity = 6.56864231; EvalClassificationError = 0.51898856 * 83050
MPI Rank 0: 01/11/2018 08:56:48: Finished Epoch[ 6 of 15]: [Validate] CrossEntropyWithSoftmax = 1.88230716 * 83050; EvalClassificationError = 0.51898856 * 83050
MPI Rank 0: 01/11/2018 08:56:48: Best epoch per criterion so far: [Validate] CrossEntropyWithSoftmax = 1.882307 (Epoch 6); EvalClassificationError = 0.518989 (Epoch 6)
MPI Rank 0: 01/11/2018 08:56:48: SGD: Saving checkpoint model 'C:\local\cygwin-2.8.2-x64\tmp\cntk-test-20180111085400.505371\Speech\DNN_SaveBestModelPerCriterion@release_gpu/models/cntkSpeech.dnn.6'
MPI Rank 0: 
MPI Rank 0: 01/11/2018 08:56:48: Starting Epoch 7: learning rate per sample = 0.000098  effective momentum = 0.656119  momentum as time constant = 2429.9 samples
MPI Rank 0: 
MPI Rank 0: 01/11/2018 08:56:48: Starting minibatch loop, DataParallelSGD training (myRank = 0, numNodes = 2, numGradientBits = 64), distributed reading is ENABLED.
MPI Rank 0: 01/11/2018 08:56:48:  Epoch[ 7 of 15]-Minibatch[   1-  10, 50.00%]: CrossEntropyWithSoftmax = 1.87751809 * 10240; EvalClassificationError = 0.51201172 * 10240; time = 0.3535s; samplesPerSecond = 28965.2
MPI Rank 0: 01/11/2018 08:56:48:  Epoch[ 7 of 15]-Minibatch[  11-  20, 100.00%]: CrossEntropyWithSoftmax = 1.90589643 * 10240; EvalClassificationError = 0.53007812 * 10240; time = 0.3507s; samplesPerSecond = 29197.7
MPI Rank 0: 01/11/2018 08:56:48: Finished Epoch[ 7 of 15]: [Training] CrossEntropyWithSoftmax = 1.89170726 * 20480; EvalClassificationError = 0.52104492 * 20480; totalSamplesSeen = 143360; learningRatePerSample = 9.7656251e-05; epochTime=0.711525s
MPI Rank 0: 01/11/2018 08:56:49: Final Results: Minibatch[1-83]: CrossEntropyWithSoftmax = 1.87533201 * 83050; perplexity = 6.52298444; EvalClassificationError = 0.51865141 * 83050
MPI Rank 0: 01/11/2018 08:56:49: Finished Epoch[ 7 of 15]: [Validate] CrossEntropyWithSoftmax = 1.87533201 * 83050; EvalClassificationError = 0.51865141 * 83050
MPI Rank 0: 01/11/2018 08:56:49: Best epoch per criterion so far: [Validate] CrossEntropyWithSoftmax = 1.875332 (Epoch 7); EvalClassificationError = 0.518651 (Epoch 7)
MPI Rank 0: 01/11/2018 08:56:49: SGD: Saving checkpoint model 'C:\local\cygwin-2.8.2-x64\tmp\cntk-test-20180111085400.505371\Speech\DNN_SaveBestModelPerCriterion@release_gpu/models/cntkSpeech.dnn.7'
MPI Rank 0: 
MPI Rank 0: 01/11/2018 08:56:49: Starting Epoch 8: learning rate per sample = 0.000098  effective momentum = 0.656119  momentum as time constant = 2429.9 samples
MPI Rank 0: 
MPI Rank 0: 01/11/2018 08:56:49: Starting minibatch loop, DataParallelSGD training (myRank = 0, numNodes = 2, numGradientBits = 64), distributed reading is ENABLED.
MPI Rank 0: 01/11/2018 08:56:50:  Epoch[ 8 of 15]-Minibatch[   1-  10, 50.00%]: CrossEntropyWithSoftmax = 1.88190523 * 10240; EvalClassificationError = 0.51777344 * 10240; time = 0.3479s; samplesPerSecond = 29432.7
MPI Rank 0: 01/11/2018 08:56:50:  Epoch[ 8 of 15]-Minibatch[  11-  20, 100.00%]: CrossEntropyWithSoftmax = 1.86655063 * 10240; EvalClassificationError = 0.51562500 * 10240; time = 0.3591s; samplesPerSecond = 28517.4
MPI Rank 0: 01/11/2018 08:56:50: Finished Epoch[ 8 of 15]: [Training] CrossEntropyWithSoftmax = 1.87422793 * 20480; EvalClassificationError = 0.51669922 * 20480; totalSamplesSeen = 163840; learningRatePerSample = 9.7656251e-05; epochTime=0.714474s
MPI Rank 0: 01/11/2018 08:56:51: Final Results: Minibatch[1-83]: CrossEntropyWithSoftmax = 1.86996773 * 83050; perplexity = 6.48808705; EvalClassificationError = 0.51725467 * 83050
MPI Rank 0: 01/11/2018 08:56:51: Finished Epoch[ 8 of 15]: [Validate] CrossEntropyWithSoftmax = 1.86996773 * 83050; EvalClassificationError = 0.51725467 * 83050
MPI Rank 0: 01/11/2018 08:56:51: Best epoch per criterion so far: [Validate] CrossEntropyWithSoftmax = 1.869968 (Epoch 8); EvalClassificationError = 0.517255 (Epoch 8)
MPI Rank 0: 01/11/2018 08:56:51: SGD: Saving checkpoint model 'C:\local\cygwin-2.8.2-x64\tmp\cntk-test-20180111085400.505371\Speech\DNN_SaveBestModelPerCriterion@release_gpu/models/cntkSpeech.dnn.8'
MPI Rank 0: 
MPI Rank 0: 01/11/2018 08:56:51: Starting Epoch 9: learning rate per sample = 0.000098  effective momentum = 0.656119  momentum as time constant = 2429.9 samples
MPI Rank 0: 
MPI Rank 0: 01/11/2018 08:56:51: Starting minibatch loop, DataParallelSGD training (myRank = 0, numNodes = 2, numGradientBits = 64), distributed reading is ENABLED.
MPI Rank 0: 01/11/2018 08:56:51:  Epoch[ 9 of 15]-Minibatch[   1-  10, 50.00%]: CrossEntropyWithSoftmax = 1.85947921 * 10240; EvalClassificationError = 0.50673828 * 10240; time = 0.3620s; samplesPerSecond = 28289.3
MPI Rank 0: 01/11/2018 08:56:52:  Epoch[ 9 of 15]-Minibatch[  11-  20, 100.00%]: CrossEntropyWithSoftmax = 1.85700426 * 10240; EvalClassificationError = 0.51582031 * 10240; time = 0.3440s; samplesPerSecond = 29765.2
MPI Rank 0: 01/11/2018 08:56:52: Finished Epoch[ 9 of 15]: [Training] CrossEntropyWithSoftmax = 1.85824174 * 20480; EvalClassificationError = 0.51127930 * 20480; totalSamplesSeen = 184320; learningRatePerSample = 9.7656251e-05; epochTime=0.714642s
MPI Rank 0: 01/11/2018 08:56:52: Final Results: Minibatch[1-83]: CrossEntropyWithSoftmax = 1.86323873 * 83050; perplexity = 6.44457525; EvalClassificationError = 0.51674895 * 83050
MPI Rank 0: 01/11/2018 08:56:52: Finished Epoch[ 9 of 15]: [Validate] CrossEntropyWithSoftmax = 1.86323873 * 83050; EvalClassificationError = 0.51674895 * 83050
MPI Rank 0: 01/11/2018 08:56:52: Best epoch per criterion so far: [Validate] CrossEntropyWithSoftmax = 1.863239 (Epoch 9); EvalClassificationError = 0.516749 (Epoch 9)
MPI Rank 0: 01/11/2018 08:56:52: SGD: Saving checkpoint model 'C:\local\cygwin-2.8.2-x64\tmp\cntk-test-20180111085400.505371\Speech\DNN_SaveBestModelPerCriterion@release_gpu/models/cntkSpeech.dnn.9'
MPI Rank 0: 
MPI Rank 0: 01/11/2018 08:56:52: Starting Epoch 10: learning rate per sample = 0.000098  effective momentum = 0.656119  momentum as time constant = 2429.9 samples
MPI Rank 0: 
MPI Rank 0: 01/11/2018 08:56:52: Starting minibatch loop, DataParallelSGD training (myRank = 0, numNodes = 2, numGradientBits = 64), distributed reading is ENABLED.
MPI Rank 0: 01/11/2018 08:56:53:  Epoch[10 of 15]-Minibatch[   1-  10, 50.00%]: CrossEntropyWithSoftmax = 1.89317989 * 10240; EvalClassificationError = 0.52548828 * 10240; time = 0.3452s; samplesPerSecond = 29662.9
MPI Rank 0: 01/11/2018 08:56:53:  Epoch[10 of 15]-Minibatch[  11-  20, 100.00%]: CrossEntropyWithSoftmax = 1.84631301 * 10240; EvalClassificationError = 0.50986328 * 10240; time = 0.3453s; samplesPerSecond = 29651.9
MPI Rank 0: 01/11/2018 08:56:53: Finished Epoch[10 of 15]: [Training] CrossEntropyWithSoftmax = 1.86974645 * 20480; EvalClassificationError = 0.51767578 * 20480; totalSamplesSeen = 204800; learningRatePerSample = 9.7656251e-05; epochTime=0.697611s
MPI Rank 0: 01/11/2018 08:56:54: Final Results: Minibatch[1-83]: CrossEntropyWithSoftmax = 1.85695611 * 83050; perplexity = 6.40421333; EvalClassificationError = 0.51576159 * 83050
MPI Rank 0: 01/11/2018 08:56:54: Finished Epoch[10 of 15]: [Validate] CrossEntropyWithSoftmax = 1.85695611 * 83050; EvalClassificationError = 0.51576159 * 83050
MPI Rank 0: 01/11/2018 08:56:54: Best epoch per criterion so far: [Validate] CrossEntropyWithSoftmax = 1.856956 (Epoch 10); EvalClassificationError = 0.515762 (Epoch 10)
MPI Rank 0: 01/11/2018 08:56:54: SGD: Saving checkpoint model 'C:\local\cygwin-2.8.2-x64\tmp\cntk-test-20180111085400.505371\Speech\DNN_SaveBestModelPerCriterion@release_gpu/models/cntkSpeech.dnn.10'
MPI Rank 0: 
MPI Rank 0: 01/11/2018 08:56:54: Starting Epoch 11: learning rate per sample = 0.000098  effective momentum = 0.656119  momentum as time constant = 2429.9 samples
MPI Rank 0: 
MPI Rank 0: 01/11/2018 08:56:54: Starting minibatch loop, DataParallelSGD training (myRank = 0, numNodes = 2, numGradientBits = 64), distributed reading is ENABLED.
MPI Rank 0: 01/11/2018 08:56:54:  Epoch[11 of 15]-Minibatch[   1-  10, 50.00%]: CrossEntropyWithSoftmax = 1.86460008 * 10240; EvalClassificationError = 0.50751953 * 10240; time = 0.3689s; samplesPerSecond = 27760.8
MPI Rank 0: 01/11/2018 08:56:55:  Epoch[11 of 15]-Minibatch[  11-  20, 100.00%]: CrossEntropyWithSoftmax = 1.86031159 * 10240; EvalClassificationError = 0.51816406 * 10240; time = 0.3434s; samplesPerSecond = 29823.3
MPI Rank 0: 01/11/2018 08:56:55: Finished Epoch[11 of 15]: [Training] CrossEntropyWithSoftmax = 1.86245583 * 20480; EvalClassificationError = 0.51284180 * 20480; totalSamplesSeen = 225280; learningRatePerSample = 9.7656251e-05; epochTime=0.719375s
MPI Rank 0: 01/11/2018 08:56:56: Final Results: Minibatch[1-83]: CrossEntropyWithSoftmax = 1.85008405 * 83050; perplexity = 6.36035408; EvalClassificationError = 0.51326911 * 83050
MPI Rank 0: 01/11/2018 08:56:56: Finished Epoch[11 of 15]: [Validate] CrossEntropyWithSoftmax = 1.85008405 * 83050; EvalClassificationError = 0.51326911 * 83050
MPI Rank 0: 01/11/2018 08:56:56: Best epoch per criterion so far: [Validate] CrossEntropyWithSoftmax = 1.850084 (Epoch 11); EvalClassificationError = 0.513269 (Epoch 11)
MPI Rank 0: 01/11/2018 08:56:56: SGD: Saving checkpoint model 'C:\local\cygwin-2.8.2-x64\tmp\cntk-test-20180111085400.505371\Speech\DNN_SaveBestModelPerCriterion@release_gpu/models/cntkSpeech.dnn.11'
MPI Rank 0: 
MPI Rank 0: 01/11/2018 08:56:56: Starting Epoch 12: learning rate per sample = 0.000098  effective momentum = 0.656119  momentum as time constant = 2429.9 samples
MPI Rank 0: 
MPI Rank 0: 01/11/2018 08:56:56: Starting minibatch loop, DataParallelSGD training (myRank = 0, numNodes = 2, numGradientBits = 64), distributed reading is ENABLED.
MPI Rank 0: 01/11/2018 08:56:56:  Epoch[12 of 15]-Minibatch[   1-  10, 50.00%]: CrossEntropyWithSoftmax = 1.86700752 * 10240; EvalClassificationError = 0.51181641 * 10240; time = 0.3593s; samplesPerSecond = 28500.5
MPI Rank 0: 01/11/2018 08:56:56:  Epoch[12 of 15]-Minibatch[  11-  20, 100.00%]: CrossEntropyWithSoftmax = 1.83390766 * 10240; EvalClassificationError = 0.50585938 * 10240; time = 0.3516s; samplesPerSecond = 29120.5
MPI Rank 0: 01/11/2018 08:56:56: Finished Epoch[12 of 15]: [Training] CrossEntropyWithSoftmax = 1.85045759 * 20480; EvalClassificationError = 0.50883789 * 20480; totalSamplesSeen = 245760; learningRatePerSample = 9.7656251e-05; epochTime=0.718445s
MPI Rank 0: 01/11/2018 08:56:57: Final Results: Minibatch[1-83]: CrossEntropyWithSoftmax = 1.84352145 * 83050; perplexity = 6.31875031; EvalClassificationError = 0.51169175 * 83050
MPI Rank 0: 01/11/2018 08:56:57: Finished Epoch[12 of 15]: [Validate] CrossEntropyWithSoftmax = 1.84352145 * 83050; EvalClassificationError = 0.51169175 * 83050
MPI Rank 0: 01/11/2018 08:56:57: Best epoch per criterion so far: [Validate] CrossEntropyWithSoftmax = 1.843521 (Epoch 12); EvalClassificationError = 0.511692 (Epoch 12)
MPI Rank 0: 01/11/2018 08:56:57: SGD: Saving checkpoint model 'C:\local\cygwin-2.8.2-x64\tmp\cntk-test-20180111085400.505371\Speech\DNN_SaveBestModelPerCriterion@release_gpu/models/cntkSpeech.dnn.12'
MPI Rank 0: 
MPI Rank 0: 01/11/2018 08:56:57: Starting Epoch 13: learning rate per sample = 0.000098  effective momentum = 0.656119  momentum as time constant = 2429.9 samples
MPI Rank 0: 
MPI Rank 0: 01/11/2018 08:56:57: Starting minibatch loop, DataParallelSGD training (myRank = 0, numNodes = 2, numGradientBits = 64), distributed reading is ENABLED.
MPI Rank 0: 01/11/2018 08:56:58:  Epoch[13 of 15]-Minibatch[   1-  10, 50.00%]: CrossEntropyWithSoftmax = 1.84005490 * 10046; EvalClassificationError = 0.51542903 * 10046; time = 0.4121s; samplesPerSecond = 24377.6
MPI Rank 0: 01/11/2018 08:56:58:  Epoch[13 of 15]-Minibatch[  11-  20, 100.00%]: CrossEntropyWithSoftmax = 1.87225994 * 10240; EvalClassificationError = 0.51484375 * 10240; time = 0.3417s; samplesPerSecond = 29963.7
MPI Rank 0: 01/11/2018 08:56:58: Finished Epoch[13 of 15]: [Training] CrossEntropyWithSoftmax = 1.85713955 * 20480; EvalClassificationError = 0.51479492 * 20480; totalSamplesSeen = 266240; learningRatePerSample = 9.7656251e-05; epochTime=0.771022s
MPI Rank 0: 01/11/2018 08:56:59: Final Results: Minibatch[1-83]: CrossEntropyWithSoftmax = 1.83713385 * 83050; perplexity = 6.27851730; EvalClassificationError = 0.50862131 * 83050
MPI Rank 0: 01/11/2018 08:56:59: Finished Epoch[13 of 15]: [Validate] CrossEntropyWithSoftmax = 1.83713385 * 83050; EvalClassificationError = 0.50862131 * 83050
MPI Rank 0: 01/11/2018 08:56:59: Best epoch per criterion so far: [Validate] CrossEntropyWithSoftmax = 1.837134 (Epoch 13); EvalClassificationError = 0.508621 (Epoch 13)
MPI Rank 0: 01/11/2018 08:56:59: SGD: Saving checkpoint model 'C:\local\cygwin-2.8.2-x64\tmp\cntk-test-20180111085400.505371\Speech\DNN_SaveBestModelPerCriterion@release_gpu/models/cntkSpeech.dnn.13'
MPI Rank 0: 
MPI Rank 0: 01/11/2018 08:56:59: Starting Epoch 14: learning rate per sample = 0.000098  effective momentum = 0.656119  momentum as time constant = 2429.9 samples
MPI Rank 0: 
MPI Rank 0: 01/11/2018 08:56:59: Starting minibatch loop, DataParallelSGD training (myRank = 0, numNodes = 2, numGradientBits = 64), distributed reading is ENABLED.
MPI Rank 0: 01/11/2018 08:56:59:  Epoch[14 of 15]-Minibatch[   1-  10, 50.00%]: CrossEntropyWithSoftmax = 1.85347546 * 10240; EvalClassificationError = 0.50312500 * 10240; time = 0.3345s; samplesPerSecond = 30608.3
MPI Rank 0: 01/11/2018 08:56:59:  Epoch[14 of 15]-Minibatch[  11-  20, 100.00%]: CrossEntropyWithSoftmax = 1.84170081 * 10240; EvalClassificationError = 0.50791016 * 10240; time = 0.3311s; samplesPerSecond = 30929.8
MPI Rank 0: 01/11/2018 08:56:59: Finished Epoch[14 of 15]: [Training] CrossEntropyWithSoftmax = 1.84758814 * 20480; EvalClassificationError = 0.50551758 * 20480; totalSamplesSeen = 286720; learningRatePerSample = 9.7656251e-05; epochTime=0.672587s
MPI Rank 0: 01/11/2018 08:57:00: Final Results: Minibatch[1-83]: CrossEntropyWithSoftmax = 1.83143597 * 83050; perplexity = 6.24284478; EvalClassificationError = 0.50930765 * 83050
MPI Rank 0: 01/11/2018 08:57:00: Finished Epoch[14 of 15]: [Validate] CrossEntropyWithSoftmax = 1.83143597 * 83050; EvalClassificationError = 0.50930765 * 83050
MPI Rank 0: 01/11/2018 08:57:00: Best epoch per criterion so far: [Validate] CrossEntropyWithSoftmax = 1.831436 (Epoch 14); EvalClassificationError = 0.508621 (Epoch 13)
MPI Rank 0: 01/11/2018 08:57:00: SGD: Saving checkpoint model 'C:\local\cygwin-2.8.2-x64\tmp\cntk-test-20180111085400.505371\Speech\DNN_SaveBestModelPerCriterion@release_gpu/models/cntkSpeech.dnn.14'
MPI Rank 0: 
MPI Rank 0: 01/11/2018 08:57:00: Starting Epoch 15: learning rate per sample = 0.000098  effective momentum = 0.656119  momentum as time constant = 2429.9 samples
MPI Rank 0: 
MPI Rank 0: 01/11/2018 08:57:00: Starting minibatch loop, DataParallelSGD training (myRank = 0, numNodes = 2, numGradientBits = 64), distributed reading is ENABLED.
MPI Rank 0: 01/11/2018 08:57:01:  Epoch[15 of 15]-Minibatch[   1-  10, 50.00%]: CrossEntropyWithSoftmax = 1.81729821 * 10240; EvalClassificationError = 0.50380859 * 10240; time = 0.3489s; samplesPerSecond = 29350.5
MPI Rank 0: 01/11/2018 08:57:01:  Epoch[15 of 15]-Minibatch[  11-  20, 100.00%]: CrossEntropyWithSoftmax = 1.84154546 * 10240; EvalClassificationError = 0.51152344 * 10240; time = 0.3460s; samplesPerSecond = 29592.2
MPI Rank 0: 01/11/2018 08:57:01: Finished Epoch[15 of 15]: [Training] CrossEntropyWithSoftmax = 1.82942183 * 20480; EvalClassificationError = 0.50766602 * 20480; totalSamplesSeen = 307200; learningRatePerSample = 9.7656251e-05; epochTime=0.702487s
MPI Rank 0: 01/11/2018 08:57:02: Final Results: Minibatch[1-83]: CrossEntropyWithSoftmax = 1.82545027 * 83050; perplexity = 6.20558856; EvalClassificationError = 0.50745334 * 83050
MPI Rank 0: 01/11/2018 08:57:02: Finished Epoch[15 of 15]: [Validate] CrossEntropyWithSoftmax = 1.82545027 * 83050; EvalClassificationError = 0.50745334 * 83050
MPI Rank 0: 01/11/2018 08:57:02: Best epoch per criterion so far: [Validate] CrossEntropyWithSoftmax = 1.825450 (Epoch 15); EvalClassificationError = 0.507453 (Epoch 15)
MPI Rank 0: 01/11/2018 08:57:02: SGD: Saving checkpoint model 'C:\local\cygwin-2.8.2-x64\tmp\cntk-test-20180111085400.505371\Speech\DNN_SaveBestModelPerCriterion@release_gpu/models/cntkSpeech.dnn'
MPI Rank 0: 01/11/2018 08:57:02: Best epoch for criterion 'CrossEntropyWithSoftmax' is 15 and model C:\local\cygwin-2.8.2-x64\tmp\cntk-test-20180111085400.505371\Speech\DNN_SaveBestModelPerCriterion@release_gpu/models/cntkSpeech.dnn copied to C:\local\cygwin-2.8.2-x64\tmp\cntk-test-20180111085400.505371\Speech\DNN_SaveBestModelPerCriterion@release_gpu/models/cntkSpeech.dnn_CrossEntropyWithSoftmax
MPI Rank 0: 01/11/2018 08:57:02: Best epoch for criterion 'EvalClassificationError' is 15 and model C:\local\cygwin-2.8.2-x64\tmp\cntk-test-20180111085400.505371\Speech\DNN_SaveBestModelPerCriterion@release_gpu/models/cntkSpeech.dnn copied to C:\local\cygwin-2.8.2-x64\tmp\cntk-test-20180111085400.505371\Speech\DNN_SaveBestModelPerCriterion@release_gpu/models/cntkSpeech.dnn_EvalClassificationError
MPI Rank 0: 
MPI Rank 0: 01/11/2018 08:57:02: Action "train" complete.
MPI Rank 0: 
MPI Rank 0: 01/11/2018 08:57:02: __COMPLETED__
MPI Rank 1: 01/11/2018 08:56:31: Redirecting stderr to file C:\local\cygwin-2.8.2-x64\tmp\cntk-test-20180111085400.505371\Speech\DNN_SaveBestModelPerCriterion@release_gpu/stderr_speechTrain.logrank1
MPI Rank 1: CNTK 2.3.1+ (HEAD db192c, Jan 10 2018 22:59:43) at 2018/01/11 08:56:30
MPI Rank 1: 
MPI Rank 1: C:\jenkins\workspace\CNTK-Test-Windows-W1\x64\release\cntk.exe  configFile=C:\jenkins\workspace\CNTK-Test-Windows-W1\Tests\EndToEndTests\Speech\DNN\SaveBestModelPerCriterion/cntkcv.cntk  currentDirectory=C:\jenkins\workspace\CNTK-Test-Windows-W1\Tests\EndToEndTests\Speech\Data  RunDir=C:\local\cygwin-2.8.2-x64\tmp\cntk-test-20180111085400.505371\Speech\DNN_SaveBestModelPerCriterion@release_gpu  DataDir=C:\jenkins\workspace\CNTK-Test-Windows-W1\Tests\EndToEndTests\Speech\Data  ConfigDir=C:\jenkins\workspace\CNTK-Test-Windows-W1\Tests\EndToEndTests\Speech\DNN\SaveBestModelPerCriterion  OutputDir=C:\local\cygwin-2.8.2-x64\tmp\cntk-test-20180111085400.505371\Speech\DNN_SaveBestModelPerCriterion@release_gpu  DeviceId=0  timestamping=true  numCPUThreads=3  shareNodeValueMatrices=true  saveBestModelPerCriterion=true  stderr=C:\local\cygwin-2.8.2-x64\tmp\cntk-test-20180111085400.505371\Speech\DNN_SaveBestModelPerCriterion@release_gpu/stderr
MPI Rank 1: -------------------------------------------------------------------
MPI Rank 1: Build info: 
MPI Rank 1: 
MPI Rank 1: 		Built time: Jan 10 2018 22:47:38
MPI Rank 1: 		Last modified date: Wed Jan 10 22:18:32 2018
MPI Rank 1: 		Build type: Release
MPI Rank 1: 		Build target: GPU
MPI Rank 1: 		With ASGD: yes
MPI Rank 1: 		Math lib: mkl
MPI Rank 1: 		CUDA version: 9.0.0
MPI Rank 1: 		CUDNN version: 7.0.5
MPI Rank 1: 		Build Branch: HEAD
MPI Rank 1: 		Build SHA1: db192cd3cb9ac688cae719c41e5930a4e3f628ea
MPI Rank 1: 		MPI distribution: Microsoft MPI
MPI Rank 1: 		MPI version: 7.0.12437.6
MPI Rank 1: -------------------------------------------------------------------
MPI Rank 1: -------------------------------------------------------------------
MPI Rank 1: GPU info:
MPI Rank 1: 
MPI Rank 1: 		Device[0]: cores = 3072; computeCapability = 5.2; type = "Tesla M60"; total memory = 8124 MB; free memory = 7914 MB
MPI Rank 1: -------------------------------------------------------------------
MPI Rank 1: 01/11/2018 08:56:31: Using 3 CPU threads.
MPI Rank 1: 
MPI Rank 1: 01/11/2018 08:56:31: ##############################################################################
MPI Rank 1: 01/11/2018 08:56:31: #                                                                            #
MPI Rank 1: 01/11/2018 08:56:31: # speechTrain command (train action)                                         #
MPI Rank 1: 01/11/2018 08:56:31: #                                                                            #
MPI Rank 1: 01/11/2018 08:56:31: ##############################################################################
MPI Rank 1: 
MPI Rank 1: 01/11/2018 08:56:31: 
MPI Rank 1: Creating virgin network.
MPI Rank 1: SimpleNetworkBuilder Using GPU 0
MPI Rank 1: Reading script file glob_0000.scp ... 948 entries
MPI Rank 1: HTKDeserializer: selected '948' utterances grouped into '3' chunks, average chunk size: 316.0 utterances, 84244.7 frames (for I/O: 316.0 utterances, 84244.7 frames)
MPI Rank 1: HTKDeserializer: determined feature kind as '33'-dimensional 'USER' with frame shift 10.0 ms
MPI Rank 1: Total (133) state names in state list 'C:\jenkins\workspace\CNTK-Test-Windows-W1\Tests\EndToEndTests\Speech\Data/state.list'
MPI Rank 1: MLFDeserializer: '948' utterances with '252734' frames
MPI Rank 1: Reading script file glob_0000.cv.scp ... 300 entries
MPI Rank 1: HTKDeserializer: selected '300' utterances grouped into '1' chunks, average chunk size: 300.0 utterances, 83050.0 frames (for I/O: 300.0 utterances, 83050.0 frames)
MPI Rank 1: HTKDeserializer: determined feature kind as '33'-dimensional 'USER' with frame shift 10.0 ms
MPI Rank 1: Total (133) state names in state list 'C:\jenkins\workspace\CNTK-Test-Windows-W1\Tests\EndToEndTests\Speech\Data/state.list'
MPI Rank 1: MLFDeserializer: '948' utterances with '252734' frames
MPI Rank 1: 01/11/2018 08:56:31: 
MPI Rank 1: Model has 25 nodes. Using GPU 0.
MPI Rank 1: 
MPI Rank 1: 01/11/2018 08:56:31: Training criterion:   CrossEntropyWithSoftmax = CrossEntropyWithSoftmax
MPI Rank 1: 01/11/2018 08:56:31: Evaluation criterion: EvalClassificationError = ClassificationError
MPI Rank 1: 
MPI Rank 1: 
MPI Rank 1: Allocating matrices for forward and/or backward propagation.
MPI Rank 1: 
MPI Rank 1: Gradient Memory Aliasing: 4 are aliased.
MPI Rank 1: 	W2*H1 (gradient) reuses HLast (gradient)
MPI Rank 1: 	W1*H1 (gradient) reuses W1*H1+B1 (gradient)
MPI Rank 1: 
MPI Rank 1: Memory Sharing: Out of 40 matrices, 20 are shared as 5, and 20 are not shared.
MPI Rank 1: 
MPI Rank 1: Here are the ones that share memory:
MPI Rank 1: 	{ PosteriorProb : [132 x 1 x *]
MPI Rank 1: 	  ScaledLogLikelihood : [132 x 1 x *] }
MPI Rank 1: 	{ HLast : [132 x 1 x *] (gradient)
MPI Rank 1: 	  W0*features+B0 : [512 x 1 x *] (gradient)
MPI Rank 1: 	  W1*H1 : [512 x 1 x *] (gradient)
MPI Rank 1: 	  W1*H1+B1 : [512 x 1 x *] (gradient)
MPI Rank 1: 	  W2*H1 : [132 x 1 x *]
MPI Rank 1: 	  W2*H1 : [132 x 1 x *] (gradient) }
MPI Rank 1: 	{ H1 : [512 x 1 x *] (gradient)
MPI Rank 1: 	  H2 : [512 x 1 x *] (gradient)
MPI Rank 1: 	  HLast : [132 x 1 x *]
MPI Rank 1: 	  W0*features : [512 x *] (gradient)
MPI Rank 1: 	  W1*H1+B1 : [512 x 1 x *] }
MPI Rank 1: 	{ H2 : [512 x 1 x *]
MPI Rank 1: 	  W0*features+B0 : [512 x 1 x *]
MPI Rank 1: 	  W1 : [512 x 512] (gradient)
MPI Rank 1: 	  W1*H1 : [512 x 1 x *] }
MPI Rank 1: 	{ H1 : [512 x 1 x *]
MPI Rank 1: 	  W0 : [512 x 363] (gradient)
MPI Rank 1: 	  W0*features : [512 x *] }
MPI Rank 1: 
MPI Rank 1: Here are the ones that don't share memory:
MPI Rank 1: 	{MVNormalizedFeatures : [363 x *]}
MPI Rank 1: 	{B1 : [512 x 1] (gradient)}
MPI Rank 1: 	{B2 : [132 x 1] (gradient)}
MPI Rank 1: 	{EvalClassificationError : [1]}
MPI Rank 1: 	{W2 : [132 x 512] (gradient)}
MPI Rank 1: 	{LogOfPrior : [132]}
MPI Rank 1: 	{CrossEntropyWithSoftmax : [1]}
MPI Rank 1: 	{B0 : [512 x 1] (gradient)}
MPI Rank 1: 	{CrossEntropyWithSoftmax : [1] (gradient)}
MPI Rank 1: 	{features : [363 x *]}
MPI Rank 1: 	{labels : [132 x *]}
MPI Rank 1: 	{Prior : [132]}
MPI Rank 1: 	{W0 : [512 x 363]}
MPI Rank 1: 	{MeanOfFeatures : [363]}
MPI Rank 1: 	{B0 : [512 x 1]}
MPI Rank 1: 	{W1 : [512 x 512]}
MPI Rank 1: 	{B1 : [512 x 1]}
MPI Rank 1: 	{InvStdOfFeatures : [363]}
MPI Rank 1: 	{W2 : [132 x 512]}
MPI Rank 1: 	{B2 : [132 x 1]}
MPI Rank 1: 
MPI Rank 1: 
MPI Rank 1: 01/11/2018 08:56:31: Training 516740 parameters in 6 out of 6 parameter tensors and 15 nodes with gradient:
MPI Rank 1: 
MPI Rank 1: 01/11/2018 08:56:31: 	Node 'B0' (LearnableParameter operation) : [512 x 1]
MPI Rank 1: 01/11/2018 08:56:31: 	Node 'B1' (LearnableParameter operation) : [512 x 1]
MPI Rank 1: 01/11/2018 08:56:31: 	Node 'B2' (LearnableParameter operation) : [132 x 1]
MPI Rank 1: 01/11/2018 08:56:31: 	Node 'W0' (LearnableParameter operation) : [512 x 363]
MPI Rank 1: 01/11/2018 08:56:31: 	Node 'W1' (LearnableParameter operation) : [512 x 512]
MPI Rank 1: 01/11/2018 08:56:31: 	Node 'W2' (LearnableParameter operation) : [132 x 512]
MPI Rank 1: 
MPI Rank 1: Initializing dataParallelSGD with FP64 aggregation.
MPI Rank 1: 
MPI Rank 1: 01/11/2018 08:56:31: Precomputing --> 3 PreCompute nodes found.
MPI Rank 1: 
MPI Rank 1: 01/11/2018 08:56:31: 	MeanOfFeatures = Mean()
MPI Rank 1: 01/11/2018 08:56:31: 	InvStdOfFeatures = InvStdDev()
MPI Rank 1: 01/11/2018 08:56:31: 	Prior = Mean()
MPI Rank 1: 
MPI Rank 1: 01/11/2018 08:56:35: Precomputing --> Completed.
MPI Rank 1: 
MPI Rank 1: 
MPI Rank 1: 01/11/2018 08:56:35: Starting Epoch 1: learning rate per sample = 0.015625  effective momentum = 0.900000  momentum as time constant = 607.4 samples
MPI Rank 1: 
MPI Rank 1: 01/11/2018 08:56:35: Starting minibatch loop, DataParallelSGD training (myRank = 1, numNodes = 2, numGradientBits = 64), distributed reading is ENABLED.
MPI Rank 1: 01/11/2018 08:56:36:  Epoch[ 1 of 15]-Minibatch[   1-  10, 3.13%]: CrossEntropyWithSoftmax = 4.62512789 * 640; EvalClassificationError = 0.94062500 * 640; time = 0.0861s; samplesPerSecond = 7430.2
MPI Rank 1: 01/11/2018 08:56:36:  Epoch[ 1 of 15]-Minibatch[  11-  20, 6.25%]: CrossEntropyWithSoftmax = 4.35619366 * 640; EvalClassificationError = 0.92343750 * 640; time = 0.0771s; samplesPerSecond = 8305.1
MPI Rank 1: 01/11/2018 08:56:36:  Epoch[ 1 of 15]-Minibatch[  21-  30, 9.38%]: CrossEntropyWithSoftmax = 3.97911998 * 640; EvalClassificationError = 0.89531250 * 640; time = 0.0667s; samplesPerSecond = 9602.2
MPI Rank 1: 01/11/2018 08:56:36:  Epoch[ 1 of 15]-Minibatch[  31-  40, 12.50%]: CrossEntropyWithSoftmax = 3.73643568 * 640; EvalClassificationError = 0.84531250 * 640; time = 0.0672s; samplesPerSecond = 9526.6
MPI Rank 1: 01/11/2018 08:56:36:  Epoch[ 1 of 15]-Minibatch[  41-  50, 15.63%]: CrossEntropyWithSoftmax = 3.83079081 * 640; EvalClassificationError = 0.88281250 * 640; time = 0.0653s; samplesPerSecond = 9807.9
MPI Rank 1: 01/11/2018 08:56:36:  Epoch[ 1 of 15]-Minibatch[  51-  60, 18.75%]: CrossEntropyWithSoftmax = 3.71437690 * 640; EvalClassificationError = 0.86875000 * 640; time = 0.0671s; samplesPerSecond = 9534.0
MPI Rank 1: 01/11/2018 08:56:36:  Epoch[ 1 of 15]-Minibatch[  61-  70, 21.88%]: CrossEntropyWithSoftmax = 3.42186231 * 640; EvalClassificationError = 0.79062500 * 640; time = 0.0650s; samplesPerSecond = 9848.5
MPI Rank 1: 01/11/2018 08:56:36:  Epoch[ 1 of 15]-Minibatch[  71-  80, 25.00%]: CrossEntropyWithSoftmax = 3.53658053 * 640; EvalClassificationError = 0.82031250 * 640; time = 0.0664s; samplesPerSecond = 9636.8
MPI Rank 1: 01/11/2018 08:56:36:  Epoch[ 1 of 15]-Minibatch[  81-  90, 28.13%]: CrossEntropyWithSoftmax = 3.49758018 * 640; EvalClassificationError = 0.81718750 * 640; time = 0.0694s; samplesPerSecond = 9220.7
MPI Rank 1: 01/11/2018 08:56:36:  Epoch[ 1 of 15]-Minibatch[  91- 100, 31.25%]: CrossEntropyWithSoftmax = 3.39996308 * 640; EvalClassificationError = 0.80468750 * 640; time = 0.0690s; samplesPerSecond = 9269.0
MPI Rank 1: 01/11/2018 08:56:36:  Epoch[ 1 of 15]-Minibatch[ 101- 110, 34.38%]: CrossEntropyWithSoftmax = 3.49445773 * 640; EvalClassificationError = 0.82500000 * 640; time = 0.0670s; samplesPerSecond = 9553.3
MPI Rank 1: 01/11/2018 08:56:36:  Epoch[ 1 of 15]-Minibatch[ 111- 120, 37.50%]: CrossEntropyWithSoftmax = 3.26676999 * 640; EvalClassificationError = 0.79218750 * 640; time = 0.0670s; samplesPerSecond = 9557.4
MPI Rank 1: 01/11/2018 08:56:36:  Epoch[ 1 of 15]-Minibatch[ 121- 130, 40.63%]: CrossEntropyWithSoftmax = 3.18870174 * 640; EvalClassificationError = 0.78906250 * 640; time = 0.0656s; samplesPerSecond = 9748.8
MPI Rank 1: 01/11/2018 08:56:36:  Epoch[ 1 of 15]-Minibatch[ 131- 140, 43.75%]: CrossEntropyWithSoftmax = 3.05687264 * 640; EvalClassificationError = 0.74687500 * 640; time = 0.0642s; samplesPerSecond = 9973.2
MPI Rank 1: 01/11/2018 08:56:36:  Epoch[ 1 of 15]-Minibatch[ 141- 150, 46.88%]: CrossEntropyWithSoftmax = 2.95594570 * 640; EvalClassificationError = 0.71875000 * 640; time = 0.0653s; samplesPerSecond = 9793.9
MPI Rank 1: 01/11/2018 08:56:37:  Epoch[ 1 of 15]-Minibatch[ 151- 160, 50.00%]: CrossEntropyWithSoftmax = 3.10219605 * 640; EvalClassificationError = 0.74062500 * 640; time = 0.0651s; samplesPerSecond = 9835.7
MPI Rank 1: 01/11/2018 08:56:37:  Epoch[ 1 of 15]-Minibatch[ 161- 170, 53.13%]: CrossEntropyWithSoftmax = 2.80745016 * 640; EvalClassificationError = 0.70625000 * 640; time = 0.0664s; samplesPerSecond = 9636.9
MPI Rank 1: 01/11/2018 08:56:37:  Epoch[ 1 of 15]-Minibatch[ 171- 180, 56.25%]: CrossEntropyWithSoftmax = 2.72061843 * 640; EvalClassificationError = 0.65468750 * 640; time = 0.0802s; samplesPerSecond = 7980.1
MPI Rank 1: 01/11/2018 08:56:37:  Epoch[ 1 of 15]-Minibatch[ 181- 190, 59.38%]: CrossEntropyWithSoftmax = 2.80425748 * 640; EvalClassificationError = 0.71718750 * 640; time = 0.0669s; samplesPerSecond = 9571.2
MPI Rank 1: 01/11/2018 08:56:37:  Epoch[ 1 of 15]-Minibatch[ 191- 200, 62.50%]: CrossEntropyWithSoftmax = 2.71253069 * 640; EvalClassificationError = 0.67812500 * 640; time = 0.0803s; samplesPerSecond = 7968.7
MPI Rank 1: 01/11/2018 08:56:37:  Epoch[ 1 of 15]-Minibatch[ 201- 210, 65.63%]: CrossEntropyWithSoftmax = 2.59360400 * 640; EvalClassificationError = 0.66093750 * 640; time = 0.0668s; samplesPerSecond = 9576.9
MPI Rank 1: 01/11/2018 08:56:37:  Epoch[ 1 of 15]-Minibatch[ 211- 220, 68.75%]: CrossEntropyWithSoftmax = 2.60386650 * 640; EvalClassificationError = 0.65625000 * 640; time = 0.0652s; samplesPerSecond = 9814.5
MPI Rank 1: 01/11/2018 08:56:37:  Epoch[ 1 of 15]-Minibatch[ 221- 230, 71.88%]: CrossEntropyWithSoftmax = 2.53706679 * 640; EvalClassificationError = 0.65625000 * 640; time = 0.0672s; samplesPerSecond = 9518.8
MPI Rank 1: 01/11/2018 08:56:37:  Epoch[ 1 of 15]-Minibatch[ 231- 240, 75.00%]: CrossEntropyWithSoftmax = 2.56177344 * 640; EvalClassificationError = 0.65625000 * 640; time = 0.0649s; samplesPerSecond = 9857.0
MPI Rank 1: 01/11/2018 08:56:37:  Epoch[ 1 of 15]-Minibatch[ 241- 250, 78.13%]: CrossEntropyWithSoftmax = 2.50118792 * 640; EvalClassificationError = 0.64218750 * 640; time = 0.0659s; samplesPerSecond = 9706.1
MPI Rank 1: 01/11/2018 08:56:37:  Epoch[ 1 of 15]-Minibatch[ 251- 260, 81.25%]: CrossEntropyWithSoftmax = 2.40119789 * 640; EvalClassificationError = 0.62500000 * 640; time = 0.0674s; samplesPerSecond = 9490.2
MPI Rank 1: 01/11/2018 08:56:37:  Epoch[ 1 of 15]-Minibatch[ 261- 270, 84.38%]: CrossEntropyWithSoftmax = 2.27491504 * 640; EvalClassificationError = 0.58906250 * 640; time = 0.0675s; samplesPerSecond = 9480.7
MPI Rank 1: 01/11/2018 08:56:37:  Epoch[ 1 of 15]-Minibatch[ 271- 280, 87.50%]: CrossEntropyWithSoftmax = 2.51724208 * 640; EvalClassificationError = 0.65781250 * 640; time = 0.0685s; samplesPerSecond = 9346.1
MPI Rank 1: 01/11/2018 08:56:37:  Epoch[ 1 of 15]-Minibatch[ 281- 290, 90.63%]: CrossEntropyWithSoftmax = 2.27797543 * 640; EvalClassificationError = 0.59687500 * 640; time = 0.0668s; samplesPerSecond = 9576.0
MPI Rank 1: 01/11/2018 08:56:38:  Epoch[ 1 of 15]-Minibatch[ 291- 300, 93.75%]: CrossEntropyWithSoftmax = 2.26017741 * 640; EvalClassificationError = 0.60937500 * 640; time = 0.0666s; samplesPerSecond = 9603.2
MPI Rank 1: 01/11/2018 08:56:38:  Epoch[ 1 of 15]-Minibatch[ 301- 310, 96.88%]: CrossEntropyWithSoftmax = 2.24735343 * 640; EvalClassificationError = 0.58437500 * 640; time = 0.0678s; samplesPerSecond = 9439.0
MPI Rank 1: 01/11/2018 08:56:38:  Epoch[ 1 of 15]-Minibatch[ 311- 320, 100.00%]: CrossEntropyWithSoftmax = 2.23665382 * 640; EvalClassificationError = 0.60625000 * 640; time = 0.0655s; samplesPerSecond = 9777.5
MPI Rank 1: 01/11/2018 08:56:38: Finished Epoch[ 1 of 15]: [Training] CrossEntropyWithSoftmax = 3.03815142 * 20480; EvalClassificationError = 0.73432617 * 20480; totalSamplesSeen = 20480; learningRatePerSample = 0.015625; epochTime=2.20122s
MPI Rank 1: 01/11/2018 08:56:39: Final Results: Minibatch[1-1299]: CrossEntropyWithSoftmax = 2.24821048 * 83050; perplexity = 9.47077252; EvalClassificationError = 0.61623119 * 83050
MPI Rank 1: 01/11/2018 08:56:39: Finished Epoch[ 1 of 15]: [Validate] CrossEntropyWithSoftmax = 2.24821048 * 83050; EvalClassificationError = 0.61623119 * 83050
MPI Rank 1: 01/11/2018 08:56:39: Best epoch per criterion so far: [Validate] CrossEntropyWithSoftmax = 2.248210 (Epoch 1); EvalClassificationError = 0.616231 (Epoch 1)
MPI Rank 1: 
MPI Rank 1: 01/11/2018 08:56:39: Starting Epoch 2: learning rate per sample = 0.001953  effective momentum = 0.656119  momentum as time constant = 607.5 samples
MPI Rank 1: 
MPI Rank 1: 01/11/2018 08:56:39: Starting minibatch loop, DataParallelSGD training (myRank = 1, numNodes = 2, numGradientBits = 64), distributed reading is ENABLED.
MPI Rank 1: 01/11/2018 08:56:39:  Epoch[ 2 of 15]-Minibatch[   1-  10, 12.50%]: CrossEntropyWithSoftmax = 2.13894071 * 2560; EvalClassificationError = 0.56992188 * 2560; time = 0.1377s; samplesPerSecond = 18590.0
MPI Rank 1: 01/11/2018 08:56:40:  Epoch[ 2 of 15]-Minibatch[  11-  20, 25.00%]: CrossEntropyWithSoftmax = 2.06106261 * 2560; EvalClassificationError = 0.55664063 * 2560; time = 0.1248s; samplesPerSecond = 20504.7
MPI Rank 1: 01/11/2018 08:56:40:  Epoch[ 2 of 15]-Minibatch[  21-  30, 37.50%]: CrossEntropyWithSoftmax = 2.04459475 * 2560; EvalClassificationError = 0.55039063 * 2560; time = 0.1233s; samplesPerSecond = 20767.1
MPI Rank 1: 01/11/2018 08:56:40:  Epoch[ 2 of 15]-Minibatch[  31-  40, 50.00%]: CrossEntropyWithSoftmax = 2.03347291 * 2560; EvalClassificationError = 0.55742187 * 2560; time = 0.1259s; samplesPerSecond = 20333.9
MPI Rank 1: 01/11/2018 08:56:40:  Epoch[ 2 of 15]-Minibatch[  41-  50, 62.50%]: CrossEntropyWithSoftmax = 2.02079287 * 2560; EvalClassificationError = 0.54414063 * 2560; time = 0.1218s; samplesPerSecond = 21025.5
MPI Rank 1: 01/11/2018 08:56:40:  Epoch[ 2 of 15]-Minibatch[  51-  60, 75.00%]: CrossEntropyWithSoftmax = 1.96950012 * 2560; EvalClassificationError = 0.53085938 * 2560; time = 0.1253s; samplesPerSecond = 20433.2
MPI Rank 1: 01/11/2018 08:56:40:  Epoch[ 2 of 15]-Minibatch[  61-  70, 87.50%]: CrossEntropyWithSoftmax = 1.95934863 * 2560; EvalClassificationError = 0.52812500 * 2560; time = 0.1212s; samplesPerSecond = 21121.6
MPI Rank 1: 01/11/2018 08:56:40:  Epoch[ 2 of 15]-Minibatch[  71-  80, 100.00%]: CrossEntropyWithSoftmax = 1.94070839 * 2560; EvalClassificationError = 0.53125000 * 2560; time = 0.1212s; samplesPerSecond = 21130.2
MPI Rank 1: 01/11/2018 08:56:40: Finished Epoch[ 2 of 15]: [Training] CrossEntropyWithSoftmax = 2.02105263 * 20480; EvalClassificationError = 0.54609375 * 20480; totalSamplesSeen = 40960; learningRatePerSample = 0.001953125; epochTime=1.00861s
MPI Rank 1: 01/11/2018 08:56:41: Final Results: Minibatch[1-326]: CrossEntropyWithSoftmax = 1.92733488 * 83050; perplexity = 6.87117334; EvalClassificationError = 0.53122216 * 83050
MPI Rank 1: 01/11/2018 08:56:41: Finished Epoch[ 2 of 15]: [Validate] CrossEntropyWithSoftmax = 1.92733488 * 83050; EvalClassificationError = 0.53122216 * 83050
MPI Rank 1: 01/11/2018 08:56:41: Best epoch per criterion so far: [Validate] CrossEntropyWithSoftmax = 1.927335 (Epoch 2); EvalClassificationError = 0.531222 (Epoch 2)
MPI Rank 1: 
MPI Rank 1: 01/11/2018 08:56:41: Starting Epoch 3: learning rate per sample = 0.000098  effective momentum = 0.656119  momentum as time constant = 2429.9 samples
MPI Rank 1: 
MPI Rank 1: 01/11/2018 08:56:41: Starting minibatch loop, DataParallelSGD training (myRank = 1, numNodes = 2, numGradientBits = 64), distributed reading is ENABLED.
MPI Rank 1: 01/11/2018 08:56:42:  Epoch[ 3 of 15]-Minibatch[   1-  10, 50.00%]: CrossEntropyWithSoftmax = 1.94336420 * 10240; EvalClassificationError = 0.53056641 * 10240; time = 0.3991s; samplesPerSecond = 25654.8
MPI Rank 1: 01/11/2018 08:56:42:  Epoch[ 3 of 15]-Minibatch[  11-  20, 100.00%]: CrossEntropyWithSoftmax = 1.96525554 * 10240; EvalClassificationError = 0.54873047 * 10240; time = 0.3563s; samplesPerSecond = 28737.2
MPI Rank 1: 01/11/2018 08:56:42: Finished Epoch[ 3 of 15]: [Training] CrossEntropyWithSoftmax = 1.95430987 * 20480; EvalClassificationError = 0.53964844 * 20480; totalSamplesSeen = 61440; learningRatePerSample = 9.7656251e-05; epochTime=0.762165s
MPI Rank 1: 01/11/2018 08:56:43: Final Results: Minibatch[1-83]: CrossEntropyWithSoftmax = 1.90639119 * 83050; perplexity = 6.72876211; EvalClassificationError = 0.52304636 * 83050
MPI Rank 1: 01/11/2018 08:56:43: Finished Epoch[ 3 of 15]: [Validate] CrossEntropyWithSoftmax = 1.90639119 * 83050; EvalClassificationError = 0.52304636 * 83050
MPI Rank 1: 01/11/2018 08:56:43: Best epoch per criterion so far: [Validate] CrossEntropyWithSoftmax = 1.906391 (Epoch 3); EvalClassificationError = 0.523046 (Epoch 3)
MPI Rank 1: 
MPI Rank 1: 01/11/2018 08:56:43: Starting Epoch 4: learning rate per sample = 0.000098  effective momentum = 0.656119  momentum as time constant = 2429.9 samples
MPI Rank 1: 
MPI Rank 1: 01/11/2018 08:56:43: Starting minibatch loop, DataParallelSGD training (myRank = 1, numNodes = 2, numGradientBits = 64), distributed reading is ENABLED.
MPI Rank 1: 01/11/2018 08:56:43:  Epoch[ 4 of 15]-Minibatch[   1-  10, 50.00%]: CrossEntropyWithSoftmax = 1.92960398 * 10240; EvalClassificationError = 0.52734375 * 10240; time = 0.3616s; samplesPerSecond = 28321.4
MPI Rank 1: 01/11/2018 08:56:44:  Epoch[ 4 of 15]-Minibatch[  11-  20, 100.00%]: CrossEntropyWithSoftmax = 1.91791093 * 10240; EvalClassificationError = 0.52138672 * 10240; time = 0.3527s; samplesPerSecond = 29032.3
MPI Rank 1: 01/11/2018 08:56:44: Finished Epoch[ 4 of 15]: [Training] CrossEntropyWithSoftmax = 1.92375746 * 20480; EvalClassificationError = 0.52436523 * 20480; totalSamplesSeen = 81920; learningRatePerSample = 9.7656251e-05; epochTime=0.72171s
MPI Rank 1: 01/11/2018 08:56:44: Final Results: Minibatch[1-83]: CrossEntropyWithSoftmax = 1.89723688 * 83050; perplexity = 6.66744604; EvalClassificationError = 0.52192655 * 83050
MPI Rank 1: 01/11/2018 08:56:44: Finished Epoch[ 4 of 15]: [Validate] CrossEntropyWithSoftmax = 1.89723688 * 83050; EvalClassificationError = 0.52192655 * 83050
MPI Rank 1: 01/11/2018 08:56:44: Best epoch per criterion so far: [Validate] CrossEntropyWithSoftmax = 1.897237 (Epoch 4); EvalClassificationError = 0.521927 (Epoch 4)
MPI Rank 1: 
MPI Rank 1: 01/11/2018 08:56:45: Starting Epoch 5: learning rate per sample = 0.000098  effective momentum = 0.656119  momentum as time constant = 2429.9 samples
MPI Rank 1: 
MPI Rank 1: 01/11/2018 08:56:45: Starting minibatch loop, DataParallelSGD training (myRank = 1, numNodes = 2, numGradientBits = 64), distributed reading is ENABLED.
MPI Rank 1: 01/11/2018 08:56:45:  Epoch[ 5 of 15]-Minibatch[   1-  10, 50.00%]: CrossEntropyWithSoftmax = 1.93213905 * 10240; EvalClassificationError = 0.52744141 * 10240; time = 0.3574s; samplesPerSecond = 28651.7
MPI Rank 1: 01/11/2018 08:56:45:  Epoch[ 5 of 15]-Minibatch[  11-  20, 100.00%]: CrossEntropyWithSoftmax = 1.91008045 * 10240; EvalClassificationError = 0.52197266 * 10240; time = 0.3489s; samplesPerSecond = 29347.9
MPI Rank 1: 01/11/2018 08:56:45: Finished Epoch[ 5 of 15]: [Training] CrossEntropyWithSoftmax = 1.92110975 * 20480; EvalClassificationError = 0.52470703 * 20480; totalSamplesSeen = 102400; learningRatePerSample = 9.7656251e-05; epochTime=0.71382s
MPI Rank 1: 01/11/2018 08:56:46: Final Results: Minibatch[1-83]: CrossEntropyWithSoftmax = 1.88941575 * 83050; perplexity = 6.61550243; EvalClassificationError = 0.52039735 * 83050
MPI Rank 1: 01/11/2018 08:56:46: Finished Epoch[ 5 of 15]: [Validate] CrossEntropyWithSoftmax = 1.88941575 * 83050; EvalClassificationError = 0.52039735 * 83050
MPI Rank 1: 01/11/2018 08:56:46: Best epoch per criterion so far: [Validate] CrossEntropyWithSoftmax = 1.889416 (Epoch 5); EvalClassificationError = 0.520397 (Epoch 5)
MPI Rank 1: 
MPI Rank 1: 01/11/2018 08:56:46: Starting Epoch 6: learning rate per sample = 0.000098  effective momentum = 0.656119  momentum as time constant = 2429.9 samples
MPI Rank 1: 
MPI Rank 1: 01/11/2018 08:56:46: Starting minibatch loop, DataParallelSGD training (myRank = 1, numNodes = 2, numGradientBits = 64), distributed reading is ENABLED.
MPI Rank 1: 01/11/2018 08:56:46:  Epoch[ 6 of 15]-Minibatch[   1-  10, 50.00%]: CrossEntropyWithSoftmax = 1.92107601 * 10240; EvalClassificationError = 0.52783203 * 10240; time = 0.3546s; samplesPerSecond = 28878.3
MPI Rank 1: 01/11/2018 08:56:47:  Epoch[ 6 of 15]-Minibatch[  11-  20, 100.00%]: CrossEntropyWithSoftmax = 1.90118051 * 10240; EvalClassificationError = 0.52031250 * 10240; time = 0.3442s; samplesPerSecond = 29751.0
MPI Rank 1: 01/11/2018 08:56:47: Finished Epoch[ 6 of 15]: [Training] CrossEntropyWithSoftmax = 1.91112826 * 20480; EvalClassificationError = 0.52407227 * 20480; totalSamplesSeen = 122880; learningRatePerSample = 9.7656251e-05; epochTime=0.705214s
MPI Rank 1: 01/11/2018 08:56:48: Final Results: Minibatch[1-83]: CrossEntropyWithSoftmax = 1.88230716 * 83050; perplexity = 6.56864231; EvalClassificationError = 0.51898856 * 83050
MPI Rank 1: 01/11/2018 08:56:48: Finished Epoch[ 6 of 15]: [Validate] CrossEntropyWithSoftmax = 1.88230716 * 83050; EvalClassificationError = 0.51898856 * 83050
MPI Rank 1: 01/11/2018 08:56:48: Best epoch per criterion so far: [Validate] CrossEntropyWithSoftmax = 1.882307 (Epoch 6); EvalClassificationError = 0.518989 (Epoch 6)
MPI Rank 1: 
MPI Rank 1: 01/11/2018 08:56:48: Starting Epoch 7: learning rate per sample = 0.000098  effective momentum = 0.656119  momentum as time constant = 2429.9 samples
MPI Rank 1: 
MPI Rank 1: 01/11/2018 08:56:48: Starting minibatch loop, DataParallelSGD training (myRank = 1, numNodes = 2, numGradientBits = 64), distributed reading is ENABLED.
MPI Rank 1: 01/11/2018 08:56:48:  Epoch[ 7 of 15]-Minibatch[   1-  10, 50.00%]: CrossEntropyWithSoftmax = 1.87751809 * 10240; EvalClassificationError = 0.51201172 * 10240; time = 0.3541s; samplesPerSecond = 28915.1
MPI Rank 1: 01/11/2018 08:56:48:  Epoch[ 7 of 15]-Minibatch[  11-  20, 100.00%]: CrossEntropyWithSoftmax = 1.90589643 * 10240; EvalClassificationError = 0.53007812 * 10240; time = 0.3509s; samplesPerSecond = 29184.5
MPI Rank 1: 01/11/2018 08:56:48: Finished Epoch[ 7 of 15]: [Training] CrossEntropyWithSoftmax = 1.89170726 * 20480; EvalClassificationError = 0.52104492 * 20480; totalSamplesSeen = 143360; learningRatePerSample = 9.7656251e-05; epochTime=0.711954s
MPI Rank 1: 01/11/2018 08:56:49: Final Results: Minibatch[1-83]: CrossEntropyWithSoftmax = 1.87533201 * 83050; perplexity = 6.52298444; EvalClassificationError = 0.51865141 * 83050
MPI Rank 1: 01/11/2018 08:56:49: Finished Epoch[ 7 of 15]: [Validate] CrossEntropyWithSoftmax = 1.87533201 * 83050; EvalClassificationError = 0.51865141 * 83050
MPI Rank 1: 01/11/2018 08:56:49: Best epoch per criterion so far: [Validate] CrossEntropyWithSoftmax = 1.875332 (Epoch 7); EvalClassificationError = 0.518651 (Epoch 7)
MPI Rank 1: 
MPI Rank 1: 01/11/2018 08:56:49: Starting Epoch 8: learning rate per sample = 0.000098  effective momentum = 0.656119  momentum as time constant = 2429.9 samples
MPI Rank 1: 
MPI Rank 1: 01/11/2018 08:56:49: Starting minibatch loop, DataParallelSGD training (myRank = 1, numNodes = 2, numGradientBits = 64), distributed reading is ENABLED.
MPI Rank 1: 01/11/2018 08:56:50:  Epoch[ 8 of 15]-Minibatch[   1-  10, 50.00%]: CrossEntropyWithSoftmax = 1.88190523 * 10240; EvalClassificationError = 0.51777344 * 10240; time = 0.3488s; samplesPerSecond = 29356.4
MPI Rank 1: 01/11/2018 08:56:50:  Epoch[ 8 of 15]-Minibatch[  11-  20, 100.00%]: CrossEntropyWithSoftmax = 1.86655063 * 10240; EvalClassificationError = 0.51562500 * 10240; time = 0.3591s; samplesPerSecond = 28515.0
MPI Rank 1: 01/11/2018 08:56:50: Finished Epoch[ 8 of 15]: [Training] CrossEntropyWithSoftmax = 1.87422793 * 20480; EvalClassificationError = 0.51669922 * 20480; totalSamplesSeen = 163840; learningRatePerSample = 9.7656251e-05; epochTime=0.715646s
MPI Rank 1: 01/11/2018 08:56:51: Final Results: Minibatch[1-83]: CrossEntropyWithSoftmax = 1.86996773 * 83050; perplexity = 6.48808705; EvalClassificationError = 0.51725467 * 83050
MPI Rank 1: 01/11/2018 08:56:51: Finished Epoch[ 8 of 15]: [Validate] CrossEntropyWithSoftmax = 1.86996773 * 83050; EvalClassificationError = 0.51725467 * 83050
MPI Rank 1: 01/11/2018 08:56:51: Best epoch per criterion so far: [Validate] CrossEntropyWithSoftmax = 1.869968 (Epoch 8); EvalClassificationError = 0.517255 (Epoch 8)
MPI Rank 1: 
MPI Rank 1: 01/11/2018 08:56:51: Starting Epoch 9: learning rate per sample = 0.000098  effective momentum = 0.656119  momentum as time constant = 2429.9 samples
MPI Rank 1: 
MPI Rank 1: 01/11/2018 08:56:51: Starting minibatch loop, DataParallelSGD training (myRank = 1, numNodes = 2, numGradientBits = 64), distributed reading is ENABLED.
MPI Rank 1: 01/11/2018 08:56:51:  Epoch[ 9 of 15]-Minibatch[   1-  10, 50.00%]: CrossEntropyWithSoftmax = 1.85947921 * 10240; EvalClassificationError = 0.50673828 * 10240; time = 0.3625s; samplesPerSecond = 28249.5
MPI Rank 1: 01/11/2018 08:56:52:  Epoch[ 9 of 15]-Minibatch[  11-  20, 100.00%]: CrossEntropyWithSoftmax = 1.85700426 * 10240; EvalClassificationError = 0.51582031 * 10240; time = 0.3440s; samplesPerSecond = 29766.7
MPI Rank 1: 01/11/2018 08:56:52: Finished Epoch[ 9 of 15]: [Training] CrossEntropyWithSoftmax = 1.85824174 * 20480; EvalClassificationError = 0.51127930 * 20480; totalSamplesSeen = 184320; learningRatePerSample = 9.7656251e-05; epochTime=0.714196s
MPI Rank 1: 01/11/2018 08:56:52: Final Results: Minibatch[1-83]: CrossEntropyWithSoftmax = 1.86323873 * 83050; perplexity = 6.44457525; EvalClassificationError = 0.51674895 * 83050
MPI Rank 1: 01/11/2018 08:56:52: Finished Epoch[ 9 of 15]: [Validate] CrossEntropyWithSoftmax = 1.86323873 * 83050; EvalClassificationError = 0.51674895 * 83050
MPI Rank 1: 01/11/2018 08:56:52: Best epoch per criterion so far: [Validate] CrossEntropyWithSoftmax = 1.863239 (Epoch 9); EvalClassificationError = 0.516749 (Epoch 9)
MPI Rank 1: 
MPI Rank 1: 01/11/2018 08:56:52: Starting Epoch 10: learning rate per sample = 0.000098  effective momentum = 0.656119  momentum as time constant = 2429.9 samples
MPI Rank 1: 
MPI Rank 1: 01/11/2018 08:56:52: Starting minibatch loop, DataParallelSGD training (myRank = 1, numNodes = 2, numGradientBits = 64), distributed reading is ENABLED.
MPI Rank 1: 01/11/2018 08:56:53:  Epoch[10 of 15]-Minibatch[   1-  10, 50.00%]: CrossEntropyWithSoftmax = 1.89317989 * 10240; EvalClassificationError = 0.52548828 * 10240; time = 0.3461s; samplesPerSecond = 29589.6
MPI Rank 1: 01/11/2018 08:56:53:  Epoch[10 of 15]-Minibatch[  11-  20, 100.00%]: CrossEntropyWithSoftmax = 1.84631301 * 10240; EvalClassificationError = 0.50986328 * 10240; time = 0.3454s; samplesPerSecond = 29649.3
MPI Rank 1: 01/11/2018 08:56:53: Finished Epoch[10 of 15]: [Training] CrossEntropyWithSoftmax = 1.86974645 * 20480; EvalClassificationError = 0.51767578 * 20480; totalSamplesSeen = 204800; learningRatePerSample = 9.7656251e-05; epochTime=0.698029s
MPI Rank 1: 01/11/2018 08:56:54: Final Results: Minibatch[1-83]: CrossEntropyWithSoftmax = 1.85695611 * 83050; perplexity = 6.40421333; EvalClassificationError = 0.51576159 * 83050
MPI Rank 1: 01/11/2018 08:56:54: Finished Epoch[10 of 15]: [Validate] CrossEntropyWithSoftmax = 1.85695611 * 83050; EvalClassificationError = 0.51576159 * 83050
MPI Rank 1: 01/11/2018 08:56:54: Best epoch per criterion so far: [Validate] CrossEntropyWithSoftmax = 1.856956 (Epoch 10); EvalClassificationError = 0.515762 (Epoch 10)
MPI Rank 1: 
MPI Rank 1: 01/11/2018 08:56:54: Starting Epoch 11: learning rate per sample = 0.000098  effective momentum = 0.656119  momentum as time constant = 2429.9 samples
MPI Rank 1: 
MPI Rank 1: 01/11/2018 08:56:54: Starting minibatch loop, DataParallelSGD training (myRank = 1, numNodes = 2, numGradientBits = 64), distributed reading is ENABLED.
MPI Rank 1: 01/11/2018 08:56:54:  Epoch[11 of 15]-Minibatch[   1-  10, 50.00%]: CrossEntropyWithSoftmax = 1.86460008 * 10240; EvalClassificationError = 0.50751953 * 10240; time = 0.3698s; samplesPerSecond = 27693.0
MPI Rank 1: 01/11/2018 08:56:55:  Epoch[11 of 15]-Minibatch[  11-  20, 100.00%]: CrossEntropyWithSoftmax = 1.86031159 * 10240; EvalClassificationError = 0.51816406 * 10240; time = 0.3434s; samplesPerSecond = 29820.5
MPI Rank 1: 01/11/2018 08:56:55: Finished Epoch[11 of 15]: [Training] CrossEntropyWithSoftmax = 1.86245583 * 20480; EvalClassificationError = 0.51284180 * 20480; totalSamplesSeen = 225280; learningRatePerSample = 9.7656251e-05; epochTime=0.719804s
MPI Rank 1: 01/11/2018 08:56:56: Final Results: Minibatch[1-83]: CrossEntropyWithSoftmax = 1.85008405 * 83050; perplexity = 6.36035408; EvalClassificationError = 0.51326911 * 83050
MPI Rank 1: 01/11/2018 08:56:56: Finished Epoch[11 of 15]: [Validate] CrossEntropyWithSoftmax = 1.85008405 * 83050; EvalClassificationError = 0.51326911 * 83050
MPI Rank 1: 01/11/2018 08:56:56: Best epoch per criterion so far: [Validate] CrossEntropyWithSoftmax = 1.850084 (Epoch 11); EvalClassificationError = 0.513269 (Epoch 11)
MPI Rank 1: 
MPI Rank 1: 01/11/2018 08:56:56: Starting Epoch 12: learning rate per sample = 0.000098  effective momentum = 0.656119  momentum as time constant = 2429.9 samples
MPI Rank 1: 
MPI Rank 1: 01/11/2018 08:56:56: Starting minibatch loop, DataParallelSGD training (myRank = 1, numNodes = 2, numGradientBits = 64), distributed reading is ENABLED.
MPI Rank 1: 01/11/2018 08:56:56:  Epoch[12 of 15]-Minibatch[   1-  10, 50.00%]: CrossEntropyWithSoftmax = 1.86700752 * 10240; EvalClassificationError = 0.51181641 * 10240; time = 0.3602s; samplesPerSecond = 28425.3
MPI Rank 1: 01/11/2018 08:56:56:  Epoch[12 of 15]-Minibatch[  11-  20, 100.00%]: CrossEntropyWithSoftmax = 1.83390766 * 10240; EvalClassificationError = 0.50585938 * 10240; time = 0.3517s; samplesPerSecond = 29117.8
MPI Rank 1: 01/11/2018 08:56:56: Finished Epoch[12 of 15]: [Training] CrossEntropyWithSoftmax = 1.85045759 * 20480; EvalClassificationError = 0.50883789 * 20480; totalSamplesSeen = 245760; learningRatePerSample = 9.7656251e-05; epochTime=0.718878s
MPI Rank 1: 01/11/2018 08:56:57: Final Results: Minibatch[1-83]: CrossEntropyWithSoftmax = 1.84352145 * 83050; perplexity = 6.31875031; EvalClassificationError = 0.51169175 * 83050
MPI Rank 1: 01/11/2018 08:56:57: Finished Epoch[12 of 15]: [Validate] CrossEntropyWithSoftmax = 1.84352145 * 83050; EvalClassificationError = 0.51169175 * 83050
MPI Rank 1: 01/11/2018 08:56:57: Best epoch per criterion so far: [Validate] CrossEntropyWithSoftmax = 1.843521 (Epoch 12); EvalClassificationError = 0.511692 (Epoch 12)
MPI Rank 1: 
MPI Rank 1: 01/11/2018 08:56:57: Starting Epoch 13: learning rate per sample = 0.000098  effective momentum = 0.656119  momentum as time constant = 2429.9 samples
MPI Rank 1: 
MPI Rank 1: 01/11/2018 08:56:57: Starting minibatch loop, DataParallelSGD training (myRank = 1, numNodes = 2, numGradientBits = 64), distributed reading is ENABLED.
MPI Rank 1: 01/11/2018 08:56:58:  Epoch[13 of 15]-Minibatch[   1-  10, 50.00%]: CrossEntropyWithSoftmax = 1.84005490 * 10046; EvalClassificationError = 0.51542903 * 10046; time = 0.4131s; samplesPerSecond = 24321.0
MPI Rank 1: 01/11/2018 08:56:58:  Epoch[13 of 15]-Minibatch[  11-  20, 100.00%]: CrossEntropyWithSoftmax = 1.87225994 * 10240; EvalClassificationError = 0.51484375 * 10240; time = 0.3418s; samplesPerSecond = 29962.4
MPI Rank 1: 01/11/2018 08:56:58: Finished Epoch[13 of 15]: [Training] CrossEntropyWithSoftmax = 1.85713955 * 20480; EvalClassificationError = 0.51479492 * 20480; totalSamplesSeen = 266240; learningRatePerSample = 9.7656251e-05; epochTime=0.771454s
MPI Rank 1: 01/11/2018 08:56:59: Final Results: Minibatch[1-83]: CrossEntropyWithSoftmax = 1.83713385 * 83050; perplexity = 6.27851730; EvalClassificationError = 0.50862131 * 83050
MPI Rank 1: 01/11/2018 08:56:59: Finished Epoch[13 of 15]: [Validate] CrossEntropyWithSoftmax = 1.83713385 * 83050; EvalClassificationError = 0.50862131 * 83050
MPI Rank 1: 01/11/2018 08:56:59: Best epoch per criterion so far: [Validate] CrossEntropyWithSoftmax = 1.837134 (Epoch 13); EvalClassificationError = 0.508621 (Epoch 13)
MPI Rank 1: 
MPI Rank 1: 01/11/2018 08:56:59: Starting Epoch 14: learning rate per sample = 0.000098  effective momentum = 0.656119  momentum as time constant = 2429.9 samples
MPI Rank 1: 
MPI Rank 1: 01/11/2018 08:56:59: Starting minibatch loop, DataParallelSGD training (myRank = 1, numNodes = 2, numGradientBits = 64), distributed reading is ENABLED.
MPI Rank 1: 01/11/2018 08:56:59:  Epoch[14 of 15]-Minibatch[   1-  10, 50.00%]: CrossEntropyWithSoftmax = 1.85347546 * 10240; EvalClassificationError = 0.50312500 * 10240; time = 0.3355s; samplesPerSecond = 30518.9
MPI Rank 1: 01/11/2018 08:56:59:  Epoch[14 of 15]-Minibatch[  11-  20, 100.00%]: CrossEntropyWithSoftmax = 1.84170081 * 10240; EvalClassificationError = 0.50791016 * 10240; time = 0.3311s; samplesPerSecond = 30929.0
MPI Rank 1: 01/11/2018 08:56:59: Finished Epoch[14 of 15]: [Training] CrossEntropyWithSoftmax = 1.84758814 * 20480; EvalClassificationError = 0.50551758 * 20480; totalSamplesSeen = 286720; learningRatePerSample = 9.7656251e-05; epochTime=0.673016s
MPI Rank 1: 01/11/2018 08:57:00: Final Results: Minibatch[1-83]: CrossEntropyWithSoftmax = 1.83143597 * 83050; perplexity = 6.24284478; EvalClassificationError = 0.50930765 * 83050
MPI Rank 1: 01/11/2018 08:57:00: Finished Epoch[14 of 15]: [Validate] CrossEntropyWithSoftmax = 1.83143597 * 83050; EvalClassificationError = 0.50930765 * 83050
MPI Rank 1: 01/11/2018 08:57:00: Best epoch per criterion so far: [Validate] CrossEntropyWithSoftmax = 1.831436 (Epoch 14); EvalClassificationError = 0.508621 (Epoch 13)
MPI Rank 1: 
MPI Rank 1: 01/11/2018 08:57:00: Starting Epoch 15: learning rate per sample = 0.000098  effective momentum = 0.656119  momentum as time constant = 2429.9 samples
MPI Rank 1: 
MPI Rank 1: 01/11/2018 08:57:00: Starting minibatch loop, DataParallelSGD training (myRank = 1, numNodes = 2, numGradientBits = 64), distributed reading is ENABLED.
MPI Rank 1: 01/11/2018 08:57:01:  Epoch[15 of 15]-Minibatch[   1-  10, 50.00%]: CrossEntropyWithSoftmax = 1.81729821 * 10240; EvalClassificationError = 0.50380859 * 10240; time = 0.3493s; samplesPerSecond = 29313.9
MPI Rank 1: 01/11/2018 08:57:01:  Epoch[15 of 15]-Minibatch[  11-  20, 100.00%]: CrossEntropyWithSoftmax = 1.84154546 * 10240; EvalClassificationError = 0.51152344 * 10240; time = 0.3460s; samplesPerSecond = 29594.0
MPI Rank 1: 01/11/2018 08:57:01: Finished Epoch[15 of 15]: [Training] CrossEntropyWithSoftmax = 1.82942183 * 20480; EvalClassificationError = 0.50766602 * 20480; totalSamplesSeen = 307200; learningRatePerSample = 9.7656251e-05; epochTime=0.702385s
MPI Rank 1: 01/11/2018 08:57:02: Final Results: Minibatch[1-83]: CrossEntropyWithSoftmax = 1.82545027 * 83050; perplexity = 6.20558856; EvalClassificationError = 0.50745334 * 83050
MPI Rank 1: 01/11/2018 08:57:02: Finished Epoch[15 of 15]: [Validate] CrossEntropyWithSoftmax = 1.82545027 * 83050; EvalClassificationError = 0.50745334 * 83050
MPI Rank 1: 01/11/2018 08:57:02: Best epoch per criterion so far: [Validate] CrossEntropyWithSoftmax = 1.825450 (Epoch 15); EvalClassificationError = 0.507453 (Epoch 15)
MPI Rank 1: 
MPI Rank 1: 01/11/2018 08:57:02: Action "train" complete.
MPI Rank 1: 
MPI Rank 1: 01/11/2018 08:57:02: __COMPLETED__
=== Deleting last epoch data
==== Re-running from checkpoint
=== Running c:\local\msmpi-7.0.12437.6\Bin/mpiexec.exe -n 2 C:\jenkins\workspace\CNTK-Test-Windows-W1\x64\release\cntk.exe configFile=C:\jenkins\workspace\CNTK-Test-Windows-W1\Tests\EndToEndTests\Speech\DNN\SaveBestModelPerCriterion/cntkcv.cntk currentDirectory=C:\jenkins\workspace\CNTK-Test-Windows-W1\Tests\EndToEndTests\Speech\Data RunDir=C:\local\cygwin-2.8.2-x64\tmp\cntk-test-20180111085400.505371\Speech\DNN_SaveBestModelPerCriterion@release_gpu DataDir=C:\jenkins\workspace\CNTK-Test-Windows-W1\Tests\EndToEndTests\Speech\Data ConfigDir=C:\jenkins\workspace\CNTK-Test-Windows-W1\Tests\EndToEndTests\Speech\DNN\SaveBestModelPerCriterion OutputDir=C:\local\cygwin-2.8.2-x64\tmp\cntk-test-20180111085400.505371\Speech\DNN_SaveBestModelPerCriterion@release_gpu DeviceId=0 timestamping=true makeMode=true numCPUThreads=3 shareNodeValueMatrices=true saveBestModelPerCriterion=true stderr=C:\local\cygwin-2.8.2-x64\tmp\cntk-test-20180111085400.505371\Speech\DNN_SaveBestModelPerCriterion@release_gpu/stderr
CNTK 2.3.1+ (HEAD db192c, Jan 10 2018 22:59:43) at 2018/01/11 08:57:03

C:\jenkins\workspace\CNTK-Test-Windows-W1\x64\release\cntk.exe  configFile=C:\jenkins\workspace\CNTK-Test-Windows-W1\Tests\EndToEndTests\Speech\DNN\SaveBestModelPerCriterion/cntkcv.cntk  currentDirectory=C:\jenkins\workspace\CNTK-Test-Windows-W1\Tests\EndToEndTests\Speech\Data  RunDir=C:\local\cygwin-2.8.2-x64\tmp\cntk-test-20180111085400.505371\Speech\DNN_SaveBestModelPerCriterion@release_gpu  DataDir=C:\jenkins\workspace\CNTK-Test-Windows-W1\Tests\EndToEndTests\Speech\Data  ConfigDir=C:\jenkins\workspace\CNTK-Test-Windows-W1\Tests\EndToEndTests\Speech\DNN\SaveBestModelPerCriterion  OutputDir=C:\local\cygwin-2.8.2-x64\tmp\cntk-test-20180111085400.505371\Speech\DNN_SaveBestModelPerCriterion@release_gpu  DeviceId=0  timestamping=true  makeMode=true  numCPUThreads=3  shareNodeValueMatrices=true  saveBestModelPerCriterion=true  stderr=C:\local\cygwin-2.8.2-x64\tmp\cntk-test-20180111085400.505371\Speech\DNN_SaveBestModelPerCriterion@release_gpu/stderr
Changed current directory to C:\jenkins\workspace\CNTK-Test-Windows-W1\Tests\EndToEndTests\Speech\Data
ping [requestnodes (before change)]: 2 nodes pinging each other
CNTK 2.3.1+ (HEAD db192c, Jan 10 2018 22:59:43) at 2018/01/11 08:57:03

C:\jenkins\workspace\CNTK-Test-Windows-W1\x64\release\cntk.exe  configFile=C:\jenkins\workspace\CNTK-Test-Windows-W1\Tests\EndToEndTests\Speech\DNN\SaveBestModelPerCriterion/cntkcv.cntk  currentDirectory=C:\jenkins\workspace\CNTK-Test-Windows-W1\Tests\EndToEndTests\Speech\Data  RunDir=C:\local\cygwin-2.8.2-x64\tmp\cntk-test-20180111085400.505371\Speech\DNN_SaveBestModelPerCriterion@release_gpu  DataDir=C:\jenkins\workspace\CNTK-Test-Windows-W1\Tests\EndToEndTests\Speech\Data  ConfigDir=C:\jenkins\workspace\CNTK-Test-Windows-W1\Tests\EndToEndTests\Speech\DNN\SaveBestModelPerCriterion  OutputDir=C:\local\cygwin-2.8.2-x64\tmp\cntk-test-20180111085400.505371\Speech\DNN_SaveBestModelPerCriterion@release_gpu  DeviceId=0  timestamping=true  makeMode=true  numCPUThreads=3  shareNodeValueMatrices=true  saveBestModelPerCriterion=true  stderr=C:\local\cygwin-2.8.2-x64\tmp\cntk-test-20180111085400.505371\Speech\DNN_SaveBestModelPerCriterion@release_gpu/stderr
Changed current directory to C:\jenkins\workspace\CNTK-Test-Windows-W1\Tests\EndToEndTests\Speech\Data
ping [requestnodes (before change)]: 2 nodes pinging each other
ping [requestnodes (after change)]: 2 nodes pinging each other
ping [requestnodes (after change)]: 2 nodes pinging each other
requestnodes [MPIWrapperMpi]: using 2 out of 2 MPI nodes on a single host (2 requested); we (1) are in (participating)
requestnodes [MPIWrapperMpi]: using 2 out of 2 MPI nodes on a single host (2 requested); we (0) are in (participating)
ping [mpihelper]: 2 nodes pinging each other
ping [mpihelper]: 2 nodes pinging each other
MPI Rank 0: 01/11/2018 08:57:03: Redirecting stderr to file C:\local\cygwin-2.8.2-x64\tmp\cntk-test-20180111085400.505371\Speech\DNN_SaveBestModelPerCriterion@release_gpu/stderr_speechTrain.logrank0
MPI Rank 0: CNTK 2.3.1+ (HEAD db192c, Jan 10 2018 22:59:43) at 2018/01/11 08:57:03
MPI Rank 0: 
MPI Rank 0: C:\jenkins\workspace\CNTK-Test-Windows-W1\x64\release\cntk.exe  configFile=C:\jenkins\workspace\CNTK-Test-Windows-W1\Tests\EndToEndTests\Speech\DNN\SaveBestModelPerCriterion/cntkcv.cntk  currentDirectory=C:\jenkins\workspace\CNTK-Test-Windows-W1\Tests\EndToEndTests\Speech\Data  RunDir=C:\local\cygwin-2.8.2-x64\tmp\cntk-test-20180111085400.505371\Speech\DNN_SaveBestModelPerCriterion@release_gpu  DataDir=C:\jenkins\workspace\CNTK-Test-Windows-W1\Tests\EndToEndTests\Speech\Data  ConfigDir=C:\jenkins\workspace\CNTK-Test-Windows-W1\Tests\EndToEndTests\Speech\DNN\SaveBestModelPerCriterion  OutputDir=C:\local\cygwin-2.8.2-x64\tmp\cntk-test-20180111085400.505371\Speech\DNN_SaveBestModelPerCriterion@release_gpu  DeviceId=0  timestamping=true  makeMode=true  numCPUThreads=3  shareNodeValueMatrices=true  saveBestModelPerCriterion=true  stderr=C:\local\cygwin-2.8.2-x64\tmp\cntk-test-20180111085400.505371\Speech\DNN_SaveBestModelPerCriterion@release_gpu/stderr
MPI Rank 0: -------------------------------------------------------------------
MPI Rank 0: Build info: 
MPI Rank 0: 
MPI Rank 0: 		Built time: Jan 10 2018 22:47:38
MPI Rank 0: 		Last modified date: Wed Jan 10 22:18:32 2018
MPI Rank 0: 		Build type: Release
MPI Rank 0: 		Build target: GPU
MPI Rank 0: 		With ASGD: yes
MPI Rank 0: 		Math lib: mkl
MPI Rank 0: 		CUDA version: 9.0.0
MPI Rank 0: 		CUDNN version: 7.0.5
MPI Rank 0: 		Build Branch: HEAD
MPI Rank 0: 		Build SHA1: db192cd3cb9ac688cae719c41e5930a4e3f628ea
MPI Rank 0: 		MPI distribution: Microsoft MPI
MPI Rank 0: 		MPI version: 7.0.12437.6
MPI Rank 0: -------------------------------------------------------------------
MPI Rank 0: -------------------------------------------------------------------
MPI Rank 0: GPU info:
MPI Rank 0: 
MPI Rank 0: 		Device[0]: cores = 3072; computeCapability = 5.2; type = "Tesla M60"; total memory = 8124 MB; free memory = 8001 MB
MPI Rank 0: -------------------------------------------------------------------
MPI Rank 0: 01/11/2018 08:57:03: Using 3 CPU threads.
MPI Rank 0: 
MPI Rank 0: 01/11/2018 08:57:03: ##############################################################################
MPI Rank 0: 01/11/2018 08:57:03: #                                                                            #
MPI Rank 0: 01/11/2018 08:57:03: # speechTrain command (train action)                                         #
MPI Rank 0: 01/11/2018 08:57:03: #                                                                            #
MPI Rank 0: 01/11/2018 08:57:03: ##############################################################################
MPI Rank 0: 
MPI Rank 0: 01/11/2018 08:57:03: 
MPI Rank 0: Starting from checkpoint. Loading network from 'C:\local\cygwin-2.8.2-x64\tmp\cntk-test-20180111085400.505371\Speech\DNN_SaveBestModelPerCriterion@release_gpu/models/cntkSpeech.dnn.14'.
MPI Rank 0: SimpleNetworkBuilder Using GPU 0
MPI Rank 0: Reading script file glob_0000.scp ... 948 entries
MPI Rank 0: HTKDeserializer: selected '948' utterances grouped into '3' chunks, average chunk size: 316.0 utterances, 84244.7 frames (for I/O: 316.0 utterances, 84244.7 frames)
MPI Rank 0: HTKDeserializer: determined feature kind as '33'-dimensional 'USER' with frame shift 10.0 ms
MPI Rank 0: Total (133) state names in state list 'C:\jenkins\workspace\CNTK-Test-Windows-W1\Tests\EndToEndTests\Speech\Data/state.list'
MPI Rank 0: MLFDeserializer: '948' utterances with '252734' frames
MPI Rank 0: Reading script file glob_0000.cv.scp ... 300 entries
MPI Rank 0: HTKDeserializer: selected '300' utterances grouped into '1' chunks, average chunk size: 300.0 utterances, 83050.0 frames (for I/O: 300.0 utterances, 83050.0 frames)
MPI Rank 0: HTKDeserializer: determined feature kind as '33'-dimensional 'USER' with frame shift 10.0 ms
MPI Rank 0: Total (133) state names in state list 'C:\jenkins\workspace\CNTK-Test-Windows-W1\Tests\EndToEndTests\Speech\Data/state.list'
MPI Rank 0: MLFDeserializer: '948' utterances with '252734' frames
MPI Rank 0: 01/11/2018 08:57:04: 
MPI Rank 0: Model has 25 nodes. Using GPU 0.
MPI Rank 0: 
MPI Rank 0: 01/11/2018 08:57:04: Training criterion:   CrossEntropyWithSoftmax = CrossEntropyWithSoftmax
MPI Rank 0: 01/11/2018 08:57:04: Evaluation criterion: EvalClassificationError = ClassificationError
MPI Rank 0: 
MPI Rank 0: 01/11/2018 08:57:04: Training 516740 parameters in 6 out of 6 parameter tensors and 15 nodes with gradient:
MPI Rank 0: 
MPI Rank 0: 01/11/2018 08:57:04: 	Node 'B0' (LearnableParameter operation) : [512 x 1]
MPI Rank 0: 01/11/2018 08:57:04: 	Node 'B1' (LearnableParameter operation) : [512 x 1]
MPI Rank 0: 01/11/2018 08:57:04: 	Node 'B2' (LearnableParameter operation) : [132 x 1]
MPI Rank 0: 01/11/2018 08:57:04: 	Node 'W0' (LearnableParameter operation) : [512 x 363]
MPI Rank 0: 01/11/2018 08:57:04: 	Node 'W1' (LearnableParameter operation) : [512 x 512]
MPI Rank 0: 01/11/2018 08:57:04: 	Node 'W2' (LearnableParameter operation) : [132 x 512]
MPI Rank 0: 
MPI Rank 0: Initializing dataParallelSGD with FP64 aggregation.
MPI Rank 0: 01/11/2018 08:57:04: No PreCompute nodes found, or all already computed. Skipping pre-computation step.
MPI Rank 0: 
MPI Rank 0: 01/11/2018 08:57:04: Starting Epoch 15: learning rate per sample = 0.000098  effective momentum = 0.656119  momentum as time constant = 2429.9 samples
MPI Rank 0: 
MPI Rank 0: 01/11/2018 08:57:04: Starting minibatch loop, DataParallelSGD training (myRank = 0, numNodes = 2, numGradientBits = 64), distributed reading is ENABLED.
MPI Rank 0: 01/11/2018 08:57:06:  Epoch[15 of 15]-Minibatch[   1-  10, 50.00%]: CrossEntropyWithSoftmax = 1.81729821 * 10240; EvalClassificationError = 0.50380859 * 10240; time = 1.2648s; samplesPerSecond = 8095.9
MPI Rank 0: 01/11/2018 08:57:06:  Epoch[15 of 15]-Minibatch[  11-  20, 100.00%]: CrossEntropyWithSoftmax = 1.84154546 * 10240; EvalClassificationError = 0.51152344 * 10240; time = 0.5361s; samplesPerSecond = 19102.5
MPI Rank 0: 01/11/2018 08:57:06: Finished Epoch[15 of 15]: [Training] CrossEntropyWithSoftmax = 1.82942183 * 20480; EvalClassificationError = 0.50766602 * 20480; totalSamplesSeen = 307200; learningRatePerSample = 9.7656251e-05; epochTime=1.84357s
MPI Rank 0: 01/11/2018 08:57:07: Final Results: Minibatch[1-83]: CrossEntropyWithSoftmax = 1.82545027 * 83050; perplexity = 6.20558856; EvalClassificationError = 0.50745334 * 83050
MPI Rank 0: 01/11/2018 08:57:07: Finished Epoch[15 of 15]: [Validate] CrossEntropyWithSoftmax = 1.82545027 * 83050; EvalClassificationError = 0.50745334 * 83050
MPI Rank 0: 01/11/2018 08:57:07: Best epoch per criterion so far: [Validate] CrossEntropyWithSoftmax = 1.825450 (Epoch 15); EvalClassificationError = 0.507453 (Epoch 15)
MPI Rank 0: 01/11/2018 08:57:07: SGD: Saving checkpoint model 'C:\local\cygwin-2.8.2-x64\tmp\cntk-test-20180111085400.505371\Speech\DNN_SaveBestModelPerCriterion@release_gpu/models/cntkSpeech.dnn'
MPI Rank 0: 01/11/2018 08:57:07: Best epoch for criterion 'CrossEntropyWithSoftmax' is 15 and model C:\local\cygwin-2.8.2-x64\tmp\cntk-test-20180111085400.505371\Speech\DNN_SaveBestModelPerCriterion@release_gpu/models/cntkSpeech.dnn copied to C:\local\cygwin-2.8.2-x64\tmp\cntk-test-20180111085400.505371\Speech\DNN_SaveBestModelPerCriterion@release_gpu/models/cntkSpeech.dnn_CrossEntropyWithSoftmax
MPI Rank 0: 01/11/2018 08:57:07: Best epoch for criterion 'EvalClassificationError' is 15 and model C:\local\cygwin-2.8.2-x64\tmp\cntk-test-20180111085400.505371\Speech\DNN_SaveBestModelPerCriterion@release_gpu/models/cntkSpeech.dnn copied to C:\local\cygwin-2.8.2-x64\tmp\cntk-test-20180111085400.505371\Speech\DNN_SaveBestModelPerCriterion@release_gpu/models/cntkSpeech.dnn_EvalClassificationError
MPI Rank 0: 
MPI Rank 0: 01/11/2018 08:57:07: Action "train" complete.
MPI Rank 0: 
MPI Rank 0: 01/11/2018 08:57:07: __COMPLETED__
MPI Rank 1: 01/11/2018 08:57:04: Redirecting stderr to file C:\local\cygwin-2.8.2-x64\tmp\cntk-test-20180111085400.505371\Speech\DNN_SaveBestModelPerCriterion@release_gpu/stderr_speechTrain.logrank1
MPI Rank 1: CNTK 2.3.1+ (HEAD db192c, Jan 10 2018 22:59:43) at 2018/01/11 08:57:03
MPI Rank 1: 
MPI Rank 1: C:\jenkins\workspace\CNTK-Test-Windows-W1\x64\release\cntk.exe  configFile=C:\jenkins\workspace\CNTK-Test-Windows-W1\Tests\EndToEndTests\Speech\DNN\SaveBestModelPerCriterion/cntkcv.cntk  currentDirectory=C:\jenkins\workspace\CNTK-Test-Windows-W1\Tests\EndToEndTests\Speech\Data  RunDir=C:\local\cygwin-2.8.2-x64\tmp\cntk-test-20180111085400.505371\Speech\DNN_SaveBestModelPerCriterion@release_gpu  DataDir=C:\jenkins\workspace\CNTK-Test-Windows-W1\Tests\EndToEndTests\Speech\Data  ConfigDir=C:\jenkins\workspace\CNTK-Test-Windows-W1\Tests\EndToEndTests\Speech\DNN\SaveBestModelPerCriterion  OutputDir=C:\local\cygwin-2.8.2-x64\tmp\cntk-test-20180111085400.505371\Speech\DNN_SaveBestModelPerCriterion@release_gpu  DeviceId=0  timestamping=true  makeMode=true  numCPUThreads=3  shareNodeValueMatrices=true  saveBestModelPerCriterion=true  stderr=C:\local\cygwin-2.8.2-x64\tmp\cntk-test-20180111085400.505371\Speech\DNN_SaveBestModelPerCriterion@release_gpu/stderr
MPI Rank 1: -------------------------------------------------------------------
MPI Rank 1: Build info: 
MPI Rank 1: 
MPI Rank 1: 		Built time: Jan 10 2018 22:47:38
MPI Rank 1: 		Last modified date: Wed Jan 10 22:18:32 2018
MPI Rank 1: 		Build type: Release
MPI Rank 1: 		Build target: GPU
MPI Rank 1: 		With ASGD: yes
MPI Rank 1: 		Math lib: mkl
MPI Rank 1: 		CUDA version: 9.0.0
MPI Rank 1: 		CUDNN version: 7.0.5
MPI Rank 1: 		Build Branch: HEAD
MPI Rank 1: 		Build SHA1: db192cd3cb9ac688cae719c41e5930a4e3f628ea
MPI Rank 1: 		MPI distribution: Microsoft MPI
MPI Rank 1: 		MPI version: 7.0.12437.6
MPI Rank 1: -------------------------------------------------------------------
MPI Rank 1: -------------------------------------------------------------------
MPI Rank 1: GPU info:
MPI Rank 1: 
MPI Rank 1: 		Device[0]: cores = 3072; computeCapability = 5.2; type = "Tesla M60"; total memory = 8124 MB; free memory = 7908 MB
MPI Rank 1: -------------------------------------------------------------------
MPI Rank 1: 01/11/2018 08:57:04: Using 3 CPU threads.
MPI Rank 1: 
MPI Rank 1: 01/11/2018 08:57:04: ##############################################################################
MPI Rank 1: 01/11/2018 08:57:04: #                                                                            #
MPI Rank 1: 01/11/2018 08:57:04: # speechTrain command (train action)                                         #
MPI Rank 1: 01/11/2018 08:57:04: #                                                                            #
MPI Rank 1: 01/11/2018 08:57:04: ##############################################################################
MPI Rank 1: 
MPI Rank 1: 01/11/2018 08:57:04: 
MPI Rank 1: Starting from checkpoint. Loading network from 'C:\local\cygwin-2.8.2-x64\tmp\cntk-test-20180111085400.505371\Speech\DNN_SaveBestModelPerCriterion@release_gpu/models/cntkSpeech.dnn.14'.
MPI Rank 1: SimpleNetworkBuilder Using GPU 0
MPI Rank 1: Reading script file glob_0000.scp ... 948 entries
MPI Rank 1: HTKDeserializer: selected '948' utterances grouped into '3' chunks, average chunk size: 316.0 utterances, 84244.7 frames (for I/O: 316.0 utterances, 84244.7 frames)
MPI Rank 1: HTKDeserializer: determined feature kind as '33'-dimensional 'USER' with frame shift 10.0 ms
MPI Rank 1: Total (133) state names in state list 'C:\jenkins\workspace\CNTK-Test-Windows-W1\Tests\EndToEndTests\Speech\Data/state.list'
MPI Rank 1: MLFDeserializer: '948' utterances with '252734' frames
MPI Rank 1: Reading script file glob_0000.cv.scp ... 300 entries
MPI Rank 1: HTKDeserializer: selected '300' utterances grouped into '1' chunks, average chunk size: 300.0 utterances, 83050.0 frames (for I/O: 300.0 utterances, 83050.0 frames)
MPI Rank 1: HTKDeserializer: determined feature kind as '33'-dimensional 'USER' with frame shift 10.0 ms
MPI Rank 1: Total (133) state names in state list 'C:\jenkins\workspace\CNTK-Test-Windows-W1\Tests\EndToEndTests\Speech\Data/state.list'
MPI Rank 1: MLFDeserializer: '948' utterances with '252734' frames
MPI Rank 1: 01/11/2018 08:57:04: 
MPI Rank 1: Model has 25 nodes. Using GPU 0.
MPI Rank 1: 
MPI Rank 1: 01/11/2018 08:57:04: Training criterion:   CrossEntropyWithSoftmax = CrossEntropyWithSoftmax
MPI Rank 1: 01/11/2018 08:57:04: Evaluation criterion: EvalClassificationError = ClassificationError
MPI Rank 1: 
MPI Rank 1: 01/11/2018 08:57:04: Training 516740 parameters in 6 out of 6 parameter tensors and 15 nodes with gradient:
MPI Rank 1: 
MPI Rank 1: 01/11/2018 08:57:04: 	Node 'B0' (LearnableParameter operation) : [512 x 1]
MPI Rank 1: 01/11/2018 08:57:04: 	Node 'B1' (LearnableParameter operation) : [512 x 1]
MPI Rank 1: 01/11/2018 08:57:04: 	Node 'B2' (LearnableParameter operation) : [132 x 1]
MPI Rank 1: 01/11/2018 08:57:04: 	Node 'W0' (LearnableParameter operation) : [512 x 363]
MPI Rank 1: 01/11/2018 08:57:04: 	Node 'W1' (LearnableParameter operation) : [512 x 512]
MPI Rank 1: 01/11/2018 08:57:04: 	Node 'W2' (LearnableParameter operation) : [132 x 512]
MPI Rank 1: 
MPI Rank 1: Initializing dataParallelSGD with FP64 aggregation.
MPI Rank 1: 01/11/2018 08:57:04: No PreCompute nodes found, or all already computed. Skipping pre-computation step.
MPI Rank 1: 
MPI Rank 1: 01/11/2018 08:57:04: Starting Epoch 15: learning rate per sample = 0.000098  effective momentum = 0.656119  momentum as time constant = 2429.9 samples
MPI Rank 1: 
MPI Rank 1: 01/11/2018 08:57:04: Starting minibatch loop, DataParallelSGD training (myRank = 1, numNodes = 2, numGradientBits = 64), distributed reading is ENABLED.
MPI Rank 1: 01/11/2018 08:57:06:  Epoch[15 of 15]-Minibatch[   1-  10, 50.00%]: CrossEntropyWithSoftmax = 1.81729821 * 10240; EvalClassificationError = 0.50380859 * 10240; time = 1.2652s; samplesPerSecond = 8093.4
MPI Rank 1: 01/11/2018 08:57:06:  Epoch[15 of 15]-Minibatch[  11-  20, 100.00%]: CrossEntropyWithSoftmax = 1.84154546 * 10240; EvalClassificationError = 0.51152344 * 10240; time = 0.5369s; samplesPerSecond = 19071.7
MPI Rank 1: 01/11/2018 08:57:06: Finished Epoch[15 of 15]: [Training] CrossEntropyWithSoftmax = 1.82942183 * 20480; EvalClassificationError = 0.50766602 * 20480; totalSamplesSeen = 307200; learningRatePerSample = 9.7656251e-05; epochTime=1.84402s
MPI Rank 1: 01/11/2018 08:57:07: Final Results: Minibatch[1-83]: CrossEntropyWithSoftmax = 1.82545027 * 83050; perplexity = 6.20558856; EvalClassificationError = 0.50745334 * 83050
MPI Rank 1: 01/11/2018 08:57:07: Finished Epoch[15 of 15]: [Validate] CrossEntropyWithSoftmax = 1.82545027 * 83050; EvalClassificationError = 0.50745334 * 83050
MPI Rank 1: 01/11/2018 08:57:07: Best epoch per criterion so far: [Validate] CrossEntropyWithSoftmax = 1.825450 (Epoch 15); EvalClassificationError = 0.507453 (Epoch 15)
MPI Rank 1: 
MPI Rank 1: 01/11/2018 08:57:07: Action "train" complete.
MPI Rank 1: 
MPI Rank 1: 01/11/2018 08:57:07: __COMPLETED__
