CNTK 2.0rc1+ (master e1f48f, Apr 10 2017 00:25:48) on msraml-tesla03 at 2017/04/10 01:24:09

/home/qiwye/git/cntk/build/release/bin/cntk  configfile=03_ResNet-parallel.cntk  minibatch=256  parallelTrain=true  epochSize=12  asyncBuffer=false  parallelizationMethod=DataParallelASGD  DataDir= /home/qiwye/git/cntk/Examples/Image/DataSets/CIFAR-10
CNTK 2.0rc1+ (master e1f48f, Apr 10 2017 00:25:48) on msraml-tesla03 at 2017/04/10 01:24:09

/home/qiwye/git/cntk/build/release/bin/cntk  configfile=03_ResNet-parallel.cntk  minibatch=256  parallelTrain=true  epochSize=12  asyncBuffer=false  parallelizationMethod=DataParallelASGD  DataDir= /home/qiwye/git/cntk/Examples/Image/DataSets/CIFAR-10
ping [requestnodes (before change)]: 2 nodes pinging each other
ping [requestnodes (before change)]: 2 nodes pinging each other
ping [requestnodes (after change)]: 2 nodes pinging each other
ping [requestnodes (after change)]: 2 nodes pinging each other
requestnodes [MPIWrapperMpi]: using 2 out of 2 MPI nodes on a single host (2 requested); we (0) are in (participating)
ping [mpihelper]: 2 nodes pinging each other
requestnodes [MPIWrapperMpi]: using 2 out of 2 MPI nodes on a single host (2 requested); we (1) are in (participating)
ping [mpihelper]: 2 nodes pinging each other
Redirecting stderr to file ./Output-ssgd/03_ResNet
Redirecting stderr to file ./Output-ssgd/03_ResNet.rank1
-------------------------------------------------------------------
Build info: 

		Built time: Apr 10 2017 00:24:51
		Last modified date: Sun Apr  9 23:12:54 2017
		Build type: release
		Build target: GPU
		With ASGD: yes
		Math lib: mkl
		CUDA_PATH: /usr/local/cuda-8.0
		CUB_PATH: /usr/local/cub-1.4.1
		CUDNN_PATH: /usr/local/cudnn-5.1
		Build Branch: master
		Build SHA1: e1f48f4c145ed8b7e34d064758d449a8e7cc46b5
		Built by Source/CNTK/buildinfo.h$$0 on msraml-tesla03
		Build Path: /home/qiwye/git/cntk
		MPI distribution: Open MPI
		MPI version: 1.10.3
-------------------------------------------------------------------
-------------------------------------------------------------------
GPU info:

		Device[0]: cores = 2880; computeCapability = 3.5; type = "Tesla K40m"; total memory = 11439 MB; free memory = 11437 MB
		Device[1]: cores = 2880; computeCapability = 3.5; type = "Tesla K40m"; total memory = 11439 MB; free memory = 11206 MB
		Device[2]: cores = 2880; computeCapability = 3.5; type = "Tesla K40m"; total memory = 11439 MB; free memory = 520 MB
		Device[3]: cores = 2880; computeCapability = 3.5; type = "Tesla K40m"; total memory = 11439 MB; free memory = 11210 MB
		Device[4]: cores = 2880; computeCapability = 3.5; type = "Tesla K40m"; total memory = 11439 MB; free memory = 11437 MB
		Device[5]: cores = 2880; computeCapability = 3.5; type = "Tesla K40m"; total memory = 11439 MB; free memory = 9414 MB
		Device[6]: cores = 2880; computeCapability = 3.5; type = "Tesla K40m"; total memory = 11439 MB; free memory = 525 MB
		Device[7]: cores = 2880; computeCapability = 3.5; type = "Tesla K40m"; total memory = 11439 MB; free memory = 11207 MB
-------------------------------------------------------------------
-------------------------------------------------------------------
Build info: 

		Built time: Apr 10 2017 00:24:51
		Last modified date: Sun Apr  9 23:12:54 2017
		Build type: release
		Build target: GPU
		With ASGD: yes
		Math lib: mkl
		CUDA_PATH: /usr/local/cuda-8.0
		CUB_PATH: /usr/local/cub-1.4.1
		CUDNN_PATH: /usr/local/cudnn-5.1
		Build Branch: master
		Build SHA1: e1f48f4c145ed8b7e34d064758d449a8e7cc46b5
		Built by Source/CNTK/buildinfo.h$$0 on msraml-tesla03
		Build Path: /home/qiwye/git/cntk
		MPI distribution: Open MPI
		MPI version: 1.10.3
-------------------------------------------------------------------
-------------------------------------------------------------------
GPU info:

		Device[0]: cores = 2880; computeCapability = 3.5; type = "Tesla K40m"; total memory = 11439 MB; free memory = 11433 MB
		Device[1]: cores = 2880; computeCapability = 3.5; type = "Tesla K40m"; total memory = 11439 MB; free memory = 11165 MB
		Device[2]: cores = 2880; computeCapability = 3.5; type = "Tesla K40m"; total memory = 11439 MB; free memory = 520 MB
		Device[3]: cores = 2880; computeCapability = 3.5; type = "Tesla K40m"; total memory = 11439 MB; free memory = 11210 MB
		Device[4]: cores = 2880; computeCapability = 3.5; type = "Tesla K40m"; total memory = 11439 MB; free memory = 11437 MB
		Device[5]: cores = 2880; computeCapability = 3.5; type = "Tesla K40m"; total memory = 11439 MB; free memory = 9589 MB
		Device[6]: cores = 2880; computeCapability = 3.5; type = "Tesla K40m"; total memory = 11439 MB; free memory = 525 MB
		Device[7]: cores = 2880; computeCapability = 3.5; type = "Tesla K40m"; total memory = 11439 MB; free memory = 11208 MB
-------------------------------------------------------------------
MPI Rank 0: Configuration After Processing and Variable Resolution:
MPI Rank 0: 
MPI Rank 0: configparameters: 03_ResNet-parallel.cntk:asyncBuffer=false
MPI Rank 0: configparameters: 03_ResNet-parallel.cntk:command=Train
MPI Rank 0: configparameters: 03_ResNet-parallel.cntk:ConfigDir=.
MPI Rank 0: configparameters: 03_ResNet-parallel.cntk:configName=ssgd
MPI Rank 0: configparameters: 03_ResNet-parallel.cntk:DataDir=/home/qiwye/git/cntk/Examples/Image/DataSets/CIFAR-10
MPI Rank 0: configparameters: 03_ResNet-parallel.cntk:DeviceId=0
MPI Rank 0: configparameters: 03_ResNet-parallel.cntk:epochSize=12
MPI Rank 0: configparameters: 03_ResNet-parallel.cntk:imageLayout=cudnn
MPI Rank 0: configparameters: 03_ResNet-parallel.cntk:initOnCPUOnly=true
MPI Rank 0: configparameters: 03_ResNet-parallel.cntk:makeMode=true
MPI Rank 0: configparameters: 03_ResNet-parallel.cntk:minibatch=256
MPI Rank 0: configparameters: 03_ResNet-parallel.cntk:ModelDir=./Output-ssgd/Models
MPI Rank 0: configparameters: 03_ResNet-parallel.cntk:ndlMacros=./Macros.ndl
MPI Rank 0: configparameters: 03_ResNet-parallel.cntk:OutputDir=./Output-ssgd
MPI Rank 0: configparameters: 03_ResNet-parallel.cntk:parallelizationMethod=DataParallelASGD
MPI Rank 0: configparameters: 03_ResNet-parallel.cntk:parallelTrain=true
MPI Rank 0: configparameters: 03_ResNet-parallel.cntk:precision=float
MPI Rank 0: configparameters: 03_ResNet-parallel.cntk:prefetch=true
MPI Rank 0: configparameters: 03_ResNet-parallel.cntk:Proj16to32Filename=./16to32.txt
MPI Rank 0: configparameters: 03_ResNet-parallel.cntk:Proj32to64Filename=./32to64.txt
MPI Rank 0: configparameters: 03_ResNet-parallel.cntk:RootDir=.
MPI Rank 0: configparameters: 03_ResNet-parallel.cntk:stderr=./Output-ssgd/03_ResNet
MPI Rank 0: configparameters: 03_ResNet-parallel.cntk:Test=[
MPI Rank 0:     action = "test"
MPI Rank 0:     modelPath = "./Output-ssgd/Models/03_ResNet"
MPI Rank 0:     minibatchSize = 256
MPI Rank 0:     reader = [
MPI Rank 0:         readerType = "ImageReader"
MPI Rank 0:         file = "/home/qiwye/git/cntk/Examples/Image/DataSets/CIFAR-10/cifar-10-batches-py/test_map.txt"
MPI Rank 0:         randomize = "none"
MPI Rank 0:         features = [
MPI Rank 0:             width = 32
MPI Rank 0:             height = 32
MPI Rank 0:             channels = 3
MPI Rank 0:             cropType = "Center"
MPI Rank 0:             sideRatio = 1
MPI Rank 0:             jitterType = "UniRatio"
MPI Rank 0:             interpolations = "linear"
MPI Rank 0:             meanFile = "/home/qiwye/git/cntk/Examples/Image/DataSets/CIFAR-10/cifar-10-batches-py/CIFAR-10_mean.xml"
MPI Rank 0:         ]
MPI Rank 0:         labels = [
MPI Rank 0:             labelDim = 10
MPI Rank 0:         ]
MPI Rank 0:     ]    
MPI Rank 0: ]
MPI Rank 0: 
MPI Rank 0: configparameters: 03_ResNet-parallel.cntk:traceLevel=1
MPI Rank 0: configparameters: 03_ResNet-parallel.cntk:Train=[
MPI Rank 0:     action = "train"
MPI Rank 0:     modelPath = "./Output-ssgd/Models/03_ResNet"
MPI Rank 0:      NDLNetworkBuilder = [
MPI Rank 0:         networkDescription = "./03_ResNet.ndl"
MPI Rank 0:     ]
MPI Rank 0:     SGD = [
MPI Rank 0:         epochSize = 0
MPI Rank 0:         minibatchSize = 256
MPI Rank 0:         learningRatesPerSample = 0.004*80:0.0004*40:0.00004
MPI Rank 0:         momentumPerMB = 0
MPI Rank 0:         maxEpochs = 12
MPI Rank 0:         L2RegWeight = 0.0001
MPI Rank 0:         dropoutRate = 0
MPI Rank 0:         perfTraceLevel = 0
MPI Rank 0:         firstMBsToShowResult = 1
MPI Rank 0:         numMBsToShowResult = 10
MPI Rank 0:         ParallelTrain = [
MPI Rank 0:             parallelizationMethod = DataParallelASGD
MPI Rank 0:             distributedMBReading = "true"
MPI Rank 0:             parallelizationStartEpoch = 1
MPI Rank 0:             DataParallelSGD = [
MPI Rank 0:                 gradientBits = 32
MPI Rank 0:                 useBufferedAsyncGradientAggregation = false
MPI Rank 0:             ]
MPI Rank 0:             ModelAveragingSGD = [
MPI Rank 0:                 blockSizePerWorker = 128
MPI Rank 0:             ]
MPI Rank 0:             DataParallelASGD = [
MPI Rank 0:                 syncPeriod = 128
MPI Rank 0:                 usePipeline = false
MPI Rank 0:             ]
MPI Rank 0:         ]
MPI Rank 0:     ]
MPI Rank 0:     reader = [
MPI Rank 0:         readerType = "ImageReader"
MPI Rank 0:         file = "/home/qiwye/git/cntk/Examples/Image/DataSets/CIFAR-10/train_map.txt"
MPI Rank 0:         randomize = "auto"
MPI Rank 0:         features = [
MPI Rank 0:             width = 32
MPI Rank 0:             height = 32
MPI Rank 0:             channels = 3
MPI Rank 0:             cropType = "RandomSide"
MPI Rank 0:             sideRatio = 0.8
MPI Rank 0:             jitterType = "UniRatio"
MPI Rank 0:             interpolations = "linear"
MPI Rank 0:             meanFile = "/home/qiwye/git/cntk/Examples/Image/DataSets/CIFAR-10/CIFAR-10_mean.xml"
MPI Rank 0:         ]
MPI Rank 0:         labels = [
MPI Rank 0:             labelDim = 10
MPI Rank 0:         ]
MPI Rank 0:     ]
MPI Rank 0:     cvReader = [
MPI Rank 0:         readerType = "ImageReader"
MPI Rank 0:         file = "/home/qiwye/git/cntk/Examples/Image/DataSets/CIFAR-10/test_map.txt"
MPI Rank 0:         randomize = "none"
MPI Rank 0:         features = [
MPI Rank 0:             width = 32
MPI Rank 0:             height = 32
MPI Rank 0:             channels = 3
MPI Rank 0:             cropType = "Center"
MPI Rank 0:             sideRatio = 1
MPI Rank 0:             jitterType = "UniRatio"
MPI Rank 0:             interpolations = "linear"
MPI Rank 0:             meanFile = "/home/qiwye/git/cntk/Examples/Image/DataSets/CIFAR-10/CIFAR-10_mean.xml"
MPI Rank 0:         ]
MPI Rank 0:         labels = [
MPI Rank 0:             labelDim = 10
MPI Rank 0:         ]
MPI Rank 0:     ]    
MPI Rank 0: ]
MPI Rank 0: 
MPI Rank 0: Commands: Train
MPI Rank 0: precision = "float"
MPI Rank 0: 
MPI Rank 0: ##############################################################################
MPI Rank 0: #                                                                            #
MPI Rank 0: # Train command (train action)                                               #
MPI Rank 0: #                                                                            #
MPI Rank 0: ##############################################################################
MPI Rank 0: 
MPI Rank 0: 
MPI Rank 0: Starting from checkpoint. Loading network from './Output-ssgd/Models/03_ResNet.1'.
MPI Rank 0: NDLBuilder Using GPU 0
MPI Rank 0: conv1.c.c.c: using cuDNN convolution engine for geometry: Input: 32 x 32 x 3, Output: 32 x 32 x 16, Kernel: 3 x 3 x 3, Map: 1 x 1 x 16, Stride: 1 x 1 x 3, Sharing: (1), AutoPad: (1), LowerPad: 0, UpperPad: 0.
MPI Rank 0: Using CNTK batch normalization engine.
MPI Rank 0: rn1_1.c1.c.c.c: using cuDNN convolution engine for geometry: Input: 32 x 32 x 16, Output: 32 x 32 x 16, Kernel: 3 x 3 x 16, Map: 1 x 1 x 16, Stride: 1 x 1 x 16, Sharing: (1), AutoPad: (1), LowerPad: 0, UpperPad: 0.
MPI Rank 0: Using CNTK batch normalization engine.
MPI Rank 0: rn1_1.c2.c.c: using cuDNN convolution engine for geometry: Input: 32 x 32 x 16, Output: 32 x 32 x 16, Kernel: 3 x 3 x 16, Map: 1 x 1 x 16, Stride: 1 x 1 x 16, Sharing: (1), AutoPad: (1), LowerPad: 0, UpperPad: 0.
MPI Rank 0: Using CNTK batch normalization engine.
MPI Rank 0: rn1_2.c1.c.c.c: using cuDNN convolution engine for geometry: Input: 32 x 32 x 16, Output: 32 x 32 x 16, Kernel: 3 x 3 x 16, Map: 1 x 1 x 16, Stride: 1 x 1 x 16, Sharing: (1), AutoPad: (1), LowerPad: 0, UpperPad: 0.
MPI Rank 0: Using CNTK batch normalization engine.
MPI Rank 0: rn1_2.c2.c.c: using cuDNN convolution engine for geometry: Input: 32 x 32 x 16, Output: 32 x 32 x 16, Kernel: 3 x 3 x 16, Map: 1 x 1 x 16, Stride: 1 x 1 x 16, Sharing: (1), AutoPad: (1), LowerPad: 0, UpperPad: 0.
MPI Rank 0: Using CNTK batch normalization engine.
MPI Rank 0: rn1_3.c1.c.c.c: using cuDNN convolution engine for geometry: Input: 32 x 32 x 16, Output: 32 x 32 x 16, Kernel: 3 x 3 x 16, Map: 1 x 1 x 16, Stride: 1 x 1 x 16, Sharing: (1), AutoPad: (1), LowerPad: 0, UpperPad: 0.
MPI Rank 0: Using CNTK batch normalization engine.
MPI Rank 0: rn1_3.c2.c.c: using cuDNN convolution engine for geometry: Input: 32 x 32 x 16, Output: 32 x 32 x 16, Kernel: 3 x 3 x 16, Map: 1 x 1 x 16, Stride: 1 x 1 x 16, Sharing: (1), AutoPad: (1), LowerPad: 0, UpperPad: 0.
MPI Rank 0: Using CNTK batch normalization engine.
MPI Rank 0: rn2_1.c1.c.c.c: using cuDNN convolution engine for geometry: Input: 32 x 32 x 16, Output: 16 x 16 x 32, Kernel: 3 x 3 x 16, Map: 1 x 1 x 32, Stride: 2 x 2 x 16, Sharing: (1), AutoPad: (1), LowerPad: 0, UpperPad: 0.
MPI Rank 0: Using CNTK batch normalization engine.
MPI Rank 0: rn2_1.c2.c.c: using cuDNN convolution engine for geometry: Input: 16 x 16 x 32, Output: 16 x 16 x 32, Kernel: 3 x 3 x 32, Map: 1 x 1 x 32, Stride: 1 x 1 x 32, Sharing: (1), AutoPad: (1), LowerPad: 0, UpperPad: 0.
MPI Rank 0: Using CNTK batch normalization engine.
MPI Rank 0: rn2_1.c_proj.c: using cuDNN convolution engine for geometry: Input: 32 x 32 x 16, Output: 16 x 16 x 32, Kernel: 1 x 1 x 16, Map: 1 x 1 x 32, Stride: 2 x 2 x 16, Sharing: (1), AutoPad: (0), LowerPad: 0, UpperPad: 0.
MPI Rank 0: Using CNTK batch normalization engine.
MPI Rank 0: rn2_2.c1.c.c.c: using cuDNN convolution engine for geometry: Input: 16 x 16 x 32, Output: 16 x 16 x 32, Kernel: 3 x 3 x 32, Map: 1 x 1 x 32, Stride: 1 x 1 x 32, Sharing: (1), AutoPad: (1), LowerPad: 0, UpperPad: 0.
MPI Rank 0: Using CNTK batch normalization engine.
MPI Rank 0: rn2_2.c2.c.c: using cuDNN convolution engine for geometry: Input: 16 x 16 x 32, Output: 16 x 16 x 32, Kernel: 3 x 3 x 32, Map: 1 x 1 x 32, Stride: 1 x 1 x 32, Sharing: (1), AutoPad: (1), LowerPad: 0, UpperPad: 0.
MPI Rank 0: Using CNTK batch normalization engine.
MPI Rank 0: rn2_3.c1.c.c.c: using cuDNN convolution engine for geometry: Input: 16 x 16 x 32, Output: 16 x 16 x 32, Kernel: 3 x 3 x 32, Map: 1 x 1 x 32, Stride: 1 x 1 x 32, Sharing: (1), AutoPad: (1), LowerPad: 0, UpperPad: 0.
MPI Rank 0: Using CNTK batch normalization engine.
MPI Rank 0: rn2_3.c2.c.c: using cuDNN convolution engine for geometry: Input: 16 x 16 x 32, Output: 16 x 16 x 32, Kernel: 3 x 3 x 32, Map: 1 x 1 x 32, Stride: 1 x 1 x 32, Sharing: (1), AutoPad: (1), LowerPad: 0, UpperPad: 0.
MPI Rank 0: Using CNTK batch normalization engine.
MPI Rank 0: rn3_1.c1.c.c.c: using cuDNN convolution engine for geometry: Input: 16 x 16 x 32, Output: 8 x 8 x 64, Kernel: 3 x 3 x 32, Map: 1 x 1 x 64, Stride: 2 x 2 x 32, Sharing: (1), AutoPad: (1), LowerPad: 0, UpperPad: 0.
MPI Rank 0: Using CNTK batch normalization engine.
MPI Rank 0: rn3_1.c2.c.c: using cuDNN convolution engine for geometry: Input: 8 x 8 x 64, Output: 8 x 8 x 64, Kernel: 3 x 3 x 64, Map: 1 x 1 x 64, Stride: 1 x 1 x 64, Sharing: (1), AutoPad: (1), LowerPad: 0, UpperPad: 0.
MPI Rank 0: Using CNTK batch normalization engine.
MPI Rank 0: rn3_1.c_proj.c: using cuDNN convolution engine for geometry: Input: 16 x 16 x 32, Output: 8 x 8 x 64, Kernel: 1 x 1 x 32, Map: 1 x 1 x 64, Stride: 2 x 2 x 32, Sharing: (1), AutoPad: (0), LowerPad: 0, UpperPad: 0.
MPI Rank 0: Using CNTK batch normalization engine.
MPI Rank 0: rn3_2.c1.c.c.c: using cuDNN convolution engine for geometry: Input: 8 x 8 x 64, Output: 8 x 8 x 64, Kernel: 3 x 3 x 64, Map: 1 x 1 x 64, Stride: 1 x 1 x 64, Sharing: (1), AutoPad: (1), LowerPad: 0, UpperPad: 0.
MPI Rank 0: Using CNTK batch normalization engine.
MPI Rank 0: rn3_2.c2.c.c: using cuDNN convolution engine for geometry: Input: 8 x 8 x 64, Output: 8 x 8 x 64, Kernel: 3 x 3 x 64, Map: 1 x 1 x 64, Stride: 1 x 1 x 64, Sharing: (1), AutoPad: (1), LowerPad: 0, UpperPad: 0.
MPI Rank 0: Using CNTK batch normalization engine.
MPI Rank 0: rn3_3.c1.c.c.c: using cuDNN convolution engine for geometry: Input: 8 x 8 x 64, Output: 8 x 8 x 64, Kernel: 3 x 3 x 64, Map: 1 x 1 x 64, Stride: 1 x 1 x 64, Sharing: (1), AutoPad: (1), LowerPad: 0, UpperPad: 0.
MPI Rank 0: Using CNTK batch normalization engine.
MPI Rank 0: rn3_3.c2.c.c: using cuDNN convolution engine for geometry: Input: 8 x 8 x 64, Output: 8 x 8 x 64, Kernel: 3 x 3 x 64, Map: 1 x 1 x 64, Stride: 1 x 1 x 64, Sharing: (1), AutoPad: (1), LowerPad: 0, UpperPad: 0.
MPI Rank 0: Using CNTK batch normalization engine.
MPI Rank 0: pool: using cuDNN convolution engine for geometry: Input: 8 x 8 x 64, Output: 1 x 1 x 64, Kernel: 8 x 8 x 1, Map: 1, Stride: 1 x 1 x 1, Sharing: (1), AutoPad: (0), LowerPad: 0, UpperPad: 0.
MPI Rank 0: 
MPI Rank 0: Model has 205 nodes. Using GPU 0.
MPI Rank 0: 
MPI Rank 0: Training criterion:   CE = CrossEntropyWithSoftmax
MPI Rank 0: Evaluation criterion: Err = ClassificationError
MPI Rank 0: 
MPI Rank 0: Training 269914 parameters in 63 out of 63 parameter tensors and 137 nodes with gradient:
MPI Rank 0: 
MPI Rank 0:     Node 'OutputNodes.W' (LearnableParameter operation) : [10 x 1 x 1 x 64]
MPI Rank 0:     Node 'OutputNodes.b' (LearnableParameter operation) : [10]
MPI Rank 0:     Node 'conv1.c.W' (LearnableParameter operation) : [16 x 27]
MPI Rank 0:     Node 'conv1.c.c.b' (LearnableParameter operation) : [16 x 1]
MPI Rank 0:     Node 'conv1.c.c.sc' (LearnableParameter operation) : [16 x 1]
MPI Rank 0:     Node 'rn1_1.c1.c.W' (LearnableParameter operation) : [16 x 144]
MPI Rank 0:     Node 'rn1_1.c1.c.c.b' (LearnableParameter operation) : [16 x 1]
MPI Rank 0:     Node 'rn1_1.c1.c.c.sc' (LearnableParameter operation) : [16 x 1]
MPI Rank 0:     Node 'rn1_1.c2.W' (LearnableParameter operation) : [16 x 144]
MPI Rank 0:     Node 'rn1_1.c2.c.b' (LearnableParameter operation) : [16 x 1]
MPI Rank 0:     Node 'rn1_1.c2.c.sc' (LearnableParameter operation) : [16 x 1]
MPI Rank 0:     Node 'rn1_2.c1.c.W' (LearnableParameter operation) : [16 x 144]
MPI Rank 0:     Node 'rn1_2.c1.c.c.b' (LearnableParameter operation) : [16 x 1]
MPI Rank 0:     Node 'rn1_2.c1.c.c.sc' (LearnableParameter operation) : [16 x 1]
MPI Rank 0:     Node 'rn1_2.c2.W' (LearnableParameter operation) : [16 x 144]
MPI Rank 0:     Node 'rn1_2.c2.c.b' (LearnableParameter operation) : [16 x 1]
MPI Rank 0:     Node 'rn1_2.c2.c.sc' (LearnableParameter operation) : [16 x 1]
MPI Rank 0:     Node 'rn1_3.c1.c.W' (LearnableParameter operation) : [16 x 144]
MPI Rank 0:     Node 'rn1_3.c1.c.c.b' (LearnableParameter operation) : [16 x 1]
MPI Rank 0:     Node 'rn1_3.c1.c.c.sc' (LearnableParameter operation) : [16 x 1]
MPI Rank 0:     Node 'rn1_3.c2.W' (LearnableParameter operation) : [16 x 144]
MPI Rank 0:     Node 'rn1_3.c2.c.b' (LearnableParameter operation) : [16 x 1]
MPI Rank 0:     Node 'rn1_3.c2.c.sc' (LearnableParameter operation) : [16 x 1]
MPI Rank 0:     Node 'rn2_1.c1.c.W' (LearnableParameter operation) : [32 x 144]
MPI Rank 0:     Node 'rn2_1.c1.c.c.b' (LearnableParameter operation) : [32 x 1]
MPI Rank 0:     Node 'rn2_1.c1.c.c.sc' (LearnableParameter operation) : [32 x 1]
MPI Rank 0:     Node 'rn2_1.c2.W' (LearnableParameter operation) : [32 x 288]
MPI Rank 0:     Node 'rn2_1.c2.c.b' (LearnableParameter operation) : [32 x 1]
MPI Rank 0:     Node 'rn2_1.c2.c.sc' (LearnableParameter operation) : [32 x 1]
MPI Rank 0:     Node 'rn2_1.c_proj.b' (LearnableParameter operation) : [32 x 1]
MPI Rank 0:     Node 'rn2_1.c_proj.sc' (LearnableParameter operation) : [32 x 1]
MPI Rank 0:     Node 'rn2_2.c1.c.W' (LearnableParameter operation) : [32 x 288]
MPI Rank 0:     Node 'rn2_2.c1.c.c.b' (LearnableParameter operation) : [32 x 1]
MPI Rank 0:     Node 'rn2_2.c1.c.c.sc' (LearnableParameter operation) : [32 x 1]
MPI Rank 0:     Node 'rn2_2.c2.W' (LearnableParameter operation) : [32 x 288]
MPI Rank 0:     Node 'rn2_2.c2.c.b' (LearnableParameter operation) : [32 x 1]
MPI Rank 0:     Node 'rn2_2.c2.c.sc' (LearnableParameter operation) : [32 x 1]
MPI Rank 0:     Node 'rn2_3.c1.c.W' (LearnableParameter operation) : [32 x 288]
MPI Rank 0:     Node 'rn2_3.c1.c.c.b' (LearnableParameter operation) : [32 x 1]
MPI Rank 0:     Node 'rn2_3.c1.c.c.sc' (LearnableParameter operation) : [32 x 1]
MPI Rank 0:     Node 'rn2_3.c2.W' (LearnableParameter operation) : [32 x 288]
MPI Rank 0:     Node 'rn2_3.c2.c.b' (LearnableParameter operation) : [32 x 1]
MPI Rank 0:     Node 'rn2_3.c2.c.sc' (LearnableParameter operation) : [32 x 1]
MPI Rank 0:     Node 'rn3_1.c1.c.W' (LearnableParameter operation) : [64 x 288]
MPI Rank 0:     Node 'rn3_1.c1.c.c.b' (LearnableParameter operation) : [64 x 1]
MPI Rank 0:     Node 'rn3_1.c1.c.c.sc' (LearnableParameter operation) : [64 x 1]
MPI Rank 0:     Node 'rn3_1.c2.W' (LearnableParameter operation) : [64 x 576]
MPI Rank 0:     Node 'rn3_1.c2.c.b' (LearnableParameter operation) : [64 x 1]
MPI Rank 0:     Node 'rn3_1.c2.c.sc' (LearnableParameter operation) : [64 x 1]
MPI Rank 0:     Node 'rn3_1.c_proj.b' (LearnableParameter operation) : [64 x 1]
MPI Rank 0:     Node 'rn3_1.c_proj.sc' (LearnableParameter operation) : [64 x 1]
MPI Rank 0:     Node 'rn3_2.c1.c.W' (LearnableParameter operation) : [64 x 576]
MPI Rank 0:     Node 'rn3_2.c1.c.c.b' (LearnableParameter operation) : [64 x 1]
MPI Rank 0:     Node 'rn3_2.c1.c.c.sc' (LearnableParameter operation) : [64 x 1]
MPI Rank 0:     Node 'rn3_2.c2.W' (LearnableParameter operation) : [64 x 576]
MPI Rank 0:     Node 'rn3_2.c2.c.b' (LearnableParameter operation) : [64 x 1]
MPI Rank 0:     Node 'rn3_2.c2.c.sc' (LearnableParameter operation) : [64 x 1]
MPI Rank 0:     Node 'rn3_3.c1.c.W' (LearnableParameter operation) : [64 x 576]
MPI Rank 0:     Node 'rn3_3.c1.c.c.b' (LearnableParameter operation) : [64 x 1]
MPI Rank 0:     Node 'rn3_3.c1.c.c.sc' (LearnableParameter operation) : [64 x 1]
MPI Rank 0:     Node 'rn3_3.c2.W' (LearnableParameter operation) : [64 x 576]
MPI Rank 0:     Node 'rn3_3.c2.c.b' (LearnableParameter operation) : [64 x 1]
MPI Rank 0:     Node 'rn3_3.c2.c.sc' (LearnableParameter operation) : [64 x 1]
MPI Rank 0: 
MPI Rank 0: No PreCompute nodes found, or all already computed. Skipping pre-computation step.
MPI Rank 0: Warning: Checkpoint file is missing. Parameter-learning state (such as momentum) will be reset.
MPI Rank 0: [INFO] [2017-04-10 01:24:15] multiverso MPI-Net is initialized under MPI_THREAD_SERIALIZED mode.
MPI Rank 0: [INFO] [2017-04-10 01:24:15] All nodes registered. System contains 2 nodes. num_worker = 2, num_server = 2
MPI Rank 0: [INFO] [2017-04-10 01:24:15] Create a async server
MPI Rank 0: [INFO] [2017-04-10 01:24:15] Rank 0: Multiverso start successfully
MPI Rank 0: multiverso initial model loaded.
MPI Rank 0: 
MPI Rank 0: Starting Epoch 2: learning rate per sample = 0.004000  effective momentum = 0.000000  momentum as time constant = 0.0 samples
MPI Rank 0: 
MPI Rank 0: Starting minibatch loop, DataParallelASGD training (myRank = 0, numNodes = 2, SamplesSyncToServer = 128), Distributed Evaluation is DISABLED, distributed reading is ENABLED.
MPI Rank 0:  Epoch[ 2 of 12]-Minibatch[   1-   1, 0.000000000000006%]: CE = 1.75266480 * 128; Err = 0.73437500 * 128; time = 3.6164s; samplesPerSecond = 35.4
MPI Rank 0:  Epoch[ 2 of 12]-Minibatch[   2-  10, 0.00000000000006%]: CE = 1.81491913 * 1152; Err = 0.70746528 * 1152; time = 2.9668s; samplesPerSecond = 388.3
MPI Rank 0:  Epoch[ 2 of 12]-Minibatch[  11-  20, 0.00000000000011%]: CE = 1.76509781 * 1280; Err = 0.66640625 * 1280; time = 1.6275s; samplesPerSecond = 786.5
MPI Rank 0:  Epoch[ 2 of 12]-Minibatch[  21-  30, 0.00000000000017%]: CE = 1.79817123 * 1280; Err = 0.67656250 * 1280; time = 1.6591s; samplesPerSecond = 771.5
MPI Rank 0:  Epoch[ 2 of 12]-Minibatch[  31-  40, 0.00000000000022%]: CE = 1.74831848 * 1280; Err = 0.66484375 * 1280; time = 1.8909s; samplesPerSecond = 676.9
MPI Rank 0:  Epoch[ 2 of 12]-Minibatch[  41-  50, 0.00000000000028%]: CE = 1.74608231 * 1280; Err = 0.63359375 * 1280; time = 1.8115s; samplesPerSecond = 706.6
MPI Rank 0:  Epoch[ 2 of 12]-Minibatch[  51-  60, 0.00000000000033%]: CE = 1.68409576 * 1280; Err = 0.64218750 * 1280; time = 1.7294s; samplesPerSecond = 740.1
MPI Rank 0:  Epoch[ 2 of 12]-Minibatch[  61-  70, 0.00000000000039%]: CE = 1.70501556 * 1280; Err = 0.64375000 * 1280; time = 1.6690s; samplesPerSecond = 766.9
MPI Rank 0:  Epoch[ 2 of 12]-Minibatch[  71-  80, 0.00000000000044%]: CE = 1.65417938 * 1280; Err = 0.61953125 * 1280; time = 1.8719s; samplesPerSecond = 683.8
MPI Rank 0:  Epoch[ 2 of 12]-Minibatch[  81-  90, 0.00000000000050%]: CE = 1.68069611 * 1280; Err = 0.63125000 * 1280; time = 1.9880s; samplesPerSecond = 643.9
MPI Rank 0:  Epoch[ 2 of 12]-Minibatch[  91- 100, 0.00000000000056%]: CE = 1.65381165 * 1280; Err = 0.62031250 * 1280; time = 1.7742s; samplesPerSecond = 721.5
MPI Rank 0:  Epoch[ 2 of 12]-Minibatch[ 101- 110, 0.00000000000061%]: CE = 1.62686157 * 1280; Err = 0.59453125 * 1280; time = 1.7167s; samplesPerSecond = 745.6
MPI Rank 0:  Epoch[ 2 of 12]-Minibatch[ 111- 120, 0.00000000000067%]: CE = 1.67001495 * 1280; Err = 0.62812500 * 1280; time = 1.7631s; samplesPerSecond = 726.0
MPI Rank 0:  Epoch[ 2 of 12]-Minibatch[ 121- 130, 0.00000000000072%]: CE = 1.63753052 * 1280; Err = 0.61171875 * 1280; time = 1.7039s; samplesPerSecond = 751.2
MPI Rank 0:  Epoch[ 2 of 12]-Minibatch[ 131- 140, 0.00000000000078%]: CE = 1.61827850 * 1280; Err = 0.61171875 * 1280; time = 1.7159s; samplesPerSecond = 746.0
MPI Rank 0:  Epoch[ 2 of 12]-Minibatch[ 141- 150, 0.00000000000083%]: CE = 1.61999359 * 1280; Err = 0.60312500 * 1280; time = 1.8264s; samplesPerSecond = 700.8
MPI Rank 0:  Epoch[ 2 of 12]-Minibatch[ 151- 160, 0.00000000000089%]: CE = 1.59391937 * 1280; Err = 0.59062500 * 1280; time = 1.6314s; samplesPerSecond = 784.6
MPI Rank 0:  Epoch[ 2 of 12]-Minibatch[ 161- 170, 0.00000000000094%]: CE = 1.57502136 * 1280; Err = 0.58984375 * 1280; time = 1.8800s; samplesPerSecond = 680.9
MPI Rank 0:  Epoch[ 2 of 12]-Minibatch[ 171- 180, 0.00000000000100%]: CE = 1.62439270 * 1280; Err = 0.59218750 * 1280; time = 1.7102s; samplesPerSecond = 748.4
MPI Rank 0:  Epoch[ 2 of 12]-Minibatch[ 181- 190, 0.00000000000105%]: CE = 1.56337891 * 1280; Err = 0.57656250 * 1280; time = 1.7185s; samplesPerSecond = 744.8
MPI Rank 0: Finished Epoch[ 2 of 12]: [Training] CE = 1.66838453 * 25000; Err = 0.62560000 * 25000; totalSamplesSeen = 25000; learningRatePerSample = 0.0040000002; epochTime=39.2448s
MPI Rank 0: Final Results: Minibatch[1-79]: CE = 1.79508092 * 10000; perplexity = 6.01996181; Err = 0.65150000 * 10000
MPI Rank 0: Finished Epoch[ 2 of 12]: [Validate] CE = 1.79508092 * 10000; Err = 0.65150000 * 10000
MPI Rank 0: SGD: Saving checkpoint model './Output-ssgd/Models/03_ResNet.2'
MPI Rank 0: 
MPI Rank 0: Starting Epoch 3: learning rate per sample = 0.004000  effective momentum = 0.000000  momentum as time constant = 0.0 samples
MPI Rank 0: 
MPI Rank 0: Starting minibatch loop, DataParallelASGD training (myRank = 0, numNodes = 2, SamplesSyncToServer = 128), Distributed Evaluation is DISABLED, distributed reading is ENABLED.
MPI Rank 0:  Epoch[ 3 of 12]-Minibatch[   1-   1, 0.000000000000006%]: CE = 1.48495650 * 128; Err = 0.54687500 * 128; time = 0.2127s; samplesPerSecond = 601.9
MPI Rank 0:  Epoch[ 3 of 12]-Minibatch[   2-  10, 0.00000000000006%]: CE = 1.53969767 * 1152; Err = 0.55555556 * 1152; time = 1.6209s; samplesPerSecond = 710.7
MPI Rank 0:  Epoch[ 3 of 12]-Minibatch[  11-  20, 0.00000000000011%]: CE = 1.51011219 * 1280; Err = 0.56484375 * 1280; time = 1.7145s; samplesPerSecond = 746.6
MPI Rank 0:  Epoch[ 3 of 12]-Minibatch[  21-  30, 0.00000000000017%]: CE = 1.52098866 * 1280; Err = 0.56796875 * 1280; time = 1.9144s; samplesPerSecond = 668.6
MPI Rank 0:  Epoch[ 3 of 12]-Minibatch[  31-  40, 0.00000000000022%]: CE = 1.43163567 * 1280; Err = 0.53906250 * 1280; time = 1.8007s; samplesPerSecond = 710.9
MPI Rank 0:  Epoch[ 3 of 12]-Minibatch[  41-  50, 0.00000000000028%]: CE = 1.53417320 * 1280; Err = 0.56406250 * 1280; time = 1.7340s; samplesPerSecond = 738.2
MPI Rank 0:  Epoch[ 3 of 12]-Minibatch[  51-  60, 0.00000000000033%]: CE = 1.43913193 * 1280; Err = 0.53984375 * 1280; time = 1.7895s; samplesPerSecond = 715.3
MPI Rank 0:  Epoch[ 3 of 12]-Minibatch[  61-  70, 0.00000000000039%]: CE = 1.45313797 * 1280; Err = 0.52812500 * 1280; time = 1.8061s; samplesPerSecond = 708.7
MPI Rank 0:  Epoch[ 3 of 12]-Minibatch[  71-  80, 0.00000000000044%]: CE = 1.44498825 * 1280; Err = 0.53984375 * 1280; time = 1.8088s; samplesPerSecond = 707.6
MPI Rank 0:  Epoch[ 3 of 12]-Minibatch[  81-  90, 0.00000000000050%]: CE = 1.48993073 * 1280; Err = 0.54140625 * 1280; time = 1.8827s; samplesPerSecond = 679.9
MPI Rank 0:  Epoch[ 3 of 12]-Minibatch[  91- 100, 0.00000000000056%]: CE = 1.43004608 * 1280; Err = 0.53515625 * 1280; time = 1.7655s; samplesPerSecond = 725.0
MPI Rank 0:  Epoch[ 3 of 12]-Minibatch[ 101- 110, 0.00000000000061%]: CE = 1.38471832 * 1280; Err = 0.51250000 * 1280; time = 1.7875s; samplesPerSecond = 716.1
MPI Rank 0:  Epoch[ 3 of 12]-Minibatch[ 111- 120, 0.00000000000067%]: CE = 1.38783417 * 1280; Err = 0.50546875 * 1280; time = 1.8237s; samplesPerSecond = 701.9
MPI Rank 0:  Epoch[ 3 of 12]-Minibatch[ 121- 130, 0.00000000000072%]: CE = 1.42435760 * 1280; Err = 0.52812500 * 1280; time = 1.8147s; samplesPerSecond = 705.3
MPI Rank 0:  Epoch[ 3 of 12]-Minibatch[ 131- 140, 0.00000000000078%]: CE = 1.39031982 * 1280; Err = 0.50468750 * 1280; time = 1.8037s; samplesPerSecond = 709.6
MPI Rank 0:  Epoch[ 3 of 12]-Minibatch[ 141- 150, 0.00000000000083%]: CE = 1.36525726 * 1280; Err = 0.51484375 * 1280; time = 1.7419s; samplesPerSecond = 734.8
MPI Rank 0:  Epoch[ 3 of 12]-Minibatch[ 151- 160, 0.00000000000089%]: CE = 1.27109680 * 1280; Err = 0.46953125 * 1280; time = 1.8025s; samplesPerSecond = 710.1
MPI Rank 0:  Epoch[ 3 of 12]-Minibatch[ 161- 170, 0.00000000000094%]: CE = 1.38014374 * 1280; Err = 0.50468750 * 1280; time = 1.8642s; samplesPerSecond = 686.6
MPI Rank 0:  Epoch[ 3 of 12]-Minibatch[ 171- 180, 0.00000000000100%]: CE = 1.29729004 * 1280; Err = 0.47109375 * 1280; time = 1.7898s; samplesPerSecond = 715.2
MPI Rank 0:  Epoch[ 3 of 12]-Minibatch[ 181- 190, 0.00000000000105%]: CE = 1.28159485 * 1280; Err = 0.46796875 * 1280; time = 1.8438s; samplesPerSecond = 694.2
MPI Rank 0: Finished Epoch[ 3 of 12]: [Training] CE = 1.41563953 * 25000; Err = 0.52236000 * 25000; totalSamplesSeen = 50000; learningRatePerSample = 0.0040000002; epochTime=35.374s
MPI Rank 0: Final Results: Minibatch[1-79]: CE = 2.01017602 * 10000; perplexity = 7.46463115; Err = 0.60600000 * 10000
MPI Rank 0: Finished Epoch[ 3 of 12]: [Validate] CE = 2.01017602 * 10000; Err = 0.60600000 * 10000
MPI Rank 0: SGD: Saving checkpoint model './Output-ssgd/Models/03_ResNet.3'
MPI Rank 0: 
MPI Rank 0: Starting Epoch 4: learning rate per sample = 0.004000  effective momentum = 0.000000  momentum as time constant = 0.0 samples
MPI Rank 0: 
MPI Rank 0: Starting minibatch loop, DataParallelASGD training (myRank = 0, numNodes = 2, SamplesSyncToServer = 128), Distributed Evaluation is DISABLED, distributed reading is ENABLED.
MPI Rank 0:  Epoch[ 4 of 12]-Minibatch[   1-   1, 0.000000000000006%]: CE = 1.27632427 * 128; Err = 0.46093750 * 128; time = 0.3062s; samplesPerSecond = 418.1
MPI Rank 0:  Epoch[ 4 of 12]-Minibatch[   2-  10, 0.00000000000006%]: CE = 1.26607588 * 1152; Err = 0.45920139 * 1152; time = 1.5294s; samplesPerSecond = 753.2
MPI Rank 0:  Epoch[ 4 of 12]-Minibatch[  11-  20, 0.00000000000011%]: CE = 1.24981461 * 1280; Err = 0.46093750 * 1280; time = 1.7557s; samplesPerSecond = 729.1
MPI Rank 0:  Epoch[ 4 of 12]-Minibatch[  21-  30, 0.00000000000017%]: CE = 1.26835079 * 1280; Err = 0.46406250 * 1280; time = 1.9175s; samplesPerSecond = 667.6
MPI Rank 0:  Epoch[ 4 of 12]-Minibatch[  31-  40, 0.00000000000022%]: CE = 1.25952301 * 1280; Err = 0.45546875 * 1280; time = 1.7350s; samplesPerSecond = 737.7
MPI Rank 0:  Epoch[ 4 of 12]-Minibatch[  41-  50, 0.00000000000028%]: CE = 1.21506920 * 1280; Err = 0.44140625 * 1280; time = 1.5425s; samplesPerSecond = 829.8
MPI Rank 0:  Epoch[ 4 of 12]-Minibatch[  51-  60, 0.00000000000033%]: CE = 1.26541710 * 1280; Err = 0.45156250 * 1280; time = 1.7940s; samplesPerSecond = 713.5
MPI Rank 0:  Epoch[ 4 of 12]-Minibatch[  61-  70, 0.00000000000039%]: CE = 1.16921616 * 1280; Err = 0.41718750 * 1280; time = 1.7456s; samplesPerSecond = 733.3
MPI Rank 0:  Epoch[ 4 of 12]-Minibatch[  71-  80, 0.00000000000044%]: CE = 1.16603928 * 1280; Err = 0.43437500 * 1280; time = 1.7090s; samplesPerSecond = 749.0
MPI Rank 0:  Epoch[ 4 of 12]-Minibatch[  81-  90, 0.00000000000050%]: CE = 1.24926376 * 1280; Err = 0.47031250 * 1280; time = 1.7457s; samplesPerSecond = 733.2
MPI Rank 0:  Epoch[ 4 of 12]-Minibatch[  91- 100, 0.00000000000056%]: CE = 1.17576981 * 1280; Err = 0.43046875 * 1280; time = 1.8915s; samplesPerSecond = 676.7
MPI Rank 0:  Epoch[ 4 of 12]-Minibatch[ 101- 110, 0.00000000000061%]: CE = 1.14468155 * 1280; Err = 0.40859375 * 1280; time = 1.8480s; samplesPerSecond = 692.6
MPI Rank 0:  Epoch[ 4 of 12]-Minibatch[ 111- 120, 0.00000000000067%]: CE = 1.13331909 * 1280; Err = 0.40390625 * 1280; time = 1.7356s; samplesPerSecond = 737.5
MPI Rank 0:  Epoch[ 4 of 12]-Minibatch[ 121- 130, 0.00000000000072%]: CE = 1.07030182 * 1280; Err = 0.37109375 * 1280; time = 1.8282s; samplesPerSecond = 700.2
MPI Rank 0:  Epoch[ 4 of 12]-Minibatch[ 131- 140, 0.00000000000078%]: CE = 1.12988281 * 1280; Err = 0.40078125 * 1280; time = 1.9016s; samplesPerSecond = 673.1
MPI Rank 0:  Epoch[ 4 of 12]-Minibatch[ 141- 150, 0.00000000000083%]: CE = 1.18912506 * 1280; Err = 0.44296875 * 1280; time = 1.6406s; samplesPerSecond = 780.2
MPI Rank 0:  Epoch[ 4 of 12]-Minibatch[ 151- 160, 0.00000000000089%]: CE = 1.13141937 * 1280; Err = 0.40937500 * 1280; time = 2.0050s; samplesPerSecond = 638.4
MPI Rank 0:  Epoch[ 4 of 12]-Minibatch[ 161- 170, 0.00000000000094%]: CE = 1.15046082 * 1280; Err = 0.41484375 * 1280; time = 1.7280s; samplesPerSecond = 740.7
MPI Rank 0:  Epoch[ 4 of 12]-Minibatch[ 171- 180, 0.00000000000100%]: CE = 1.06772308 * 1280; Err = 0.41484375 * 1280; time = 1.8265s; samplesPerSecond = 700.8
MPI Rank 0:  Epoch[ 4 of 12]-Minibatch[ 181- 190, 0.00000000000105%]: CE = 1.09528809 * 1280; Err = 0.38906250 * 1280; time = 1.8914s; samplesPerSecond = 676.7
MPI Rank 0: Finished Epoch[ 4 of 12]: [Training] CE = 1.17657727 * 25000; Err = 0.42720000 * 25000; totalSamplesSeen = 75000; learningRatePerSample = 0.0040000002; epochTime=35.1181s
MPI Rank 0: Final Results: Minibatch[1-79]: CE = 1.12203568 * 10000; perplexity = 3.07109962; Err = 0.40270000 * 10000
MPI Rank 0: Finished Epoch[ 4 of 12]: [Validate] CE = 1.12203568 * 10000; Err = 0.40270000 * 10000
MPI Rank 0: SGD: Saving checkpoint model './Output-ssgd/Models/03_ResNet.4'
MPI Rank 0: 
MPI Rank 0: Starting Epoch 5: learning rate per sample = 0.004000  effective momentum = 0.000000  momentum as time constant = 0.0 samples
MPI Rank 0: 
MPI Rank 0: Starting minibatch loop, DataParallelASGD training (myRank = 0, numNodes = 2, SamplesSyncToServer = 128), Distributed Evaluation is DISABLED, distributed reading is ENABLED.
MPI Rank 0:  Epoch[ 5 of 12]-Minibatch[   1-   1, 0.000000000000006%]: CE = 1.02243984 * 128; Err = 0.35937500 * 128; time = 0.3081s; samplesPerSecond = 415.4
MPI Rank 0:  Epoch[ 5 of 12]-Minibatch[   2-  10, 0.00000000000006%]: CE = 1.02105651 * 1152; Err = 0.37326389 * 1152; time = 1.7381s; samplesPerSecond = 662.8
MPI Rank 0:  Epoch[ 5 of 12]-Minibatch[  11-  20, 0.00000000000011%]: CE = 1.06995220 * 1280; Err = 0.37578125 * 1280; time = 1.9870s; samplesPerSecond = 644.2
MPI Rank 0:  Epoch[ 5 of 12]-Minibatch[  21-  30, 0.00000000000017%]: CE = 1.06020527 * 1280; Err = 0.37187500 * 1280; time = 1.6422s; samplesPerSecond = 779.4
MPI Rank 0:  Epoch[ 5 of 12]-Minibatch[  31-  40, 0.00000000000022%]: CE = 0.95883789 * 1280; Err = 0.34687500 * 1280; time = 1.8984s; samplesPerSecond = 674.3
MPI Rank 0:  Epoch[ 5 of 12]-Minibatch[  41-  50, 0.00000000000028%]: CE = 1.03424339 * 1280; Err = 0.38125000 * 1280; time = 1.8898s; samplesPerSecond = 677.3
MPI Rank 0:  Epoch[ 5 of 12]-Minibatch[  51-  60, 0.00000000000033%]: CE = 1.04579353 * 1280; Err = 0.38671875 * 1280; time = 1.8995s; samplesPerSecond = 673.9
MPI Rank 0:  Epoch[ 5 of 12]-Minibatch[  61-  70, 0.00000000000039%]: CE = 1.04445076 * 1280; Err = 0.37578125 * 1280; time = 1.7976s; samplesPerSecond = 712.1
MPI Rank 0:  Epoch[ 5 of 12]-Minibatch[  71-  80, 0.00000000000044%]: CE = 0.91115112 * 1280; Err = 0.32421875 * 1280; time = 1.9477s; samplesPerSecond = 657.2
MPI Rank 0:  Epoch[ 5 of 12]-Minibatch[  81-  90, 0.00000000000050%]: CE = 0.98631210 * 1280; Err = 0.35312500 * 1280; time = 2.0708s; samplesPerSecond = 618.1
MPI Rank 0:  Epoch[ 5 of 12]-Minibatch[  91- 100, 0.00000000000056%]: CE = 0.98559647 * 1280; Err = 0.34687500 * 1280; time = 1.9054s; samplesPerSecond = 671.8
MPI Rank 0:  Epoch[ 5 of 12]-Minibatch[ 101- 110, 0.00000000000061%]: CE = 1.02687683 * 1280; Err = 0.37968750 * 1280; time = 1.8432s; samplesPerSecond = 694.5
MPI Rank 0:  Epoch[ 5 of 12]-Minibatch[ 111- 120, 0.00000000000067%]: CE = 1.00755386 * 1280; Err = 0.34843750 * 1280; time = 1.9061s; samplesPerSecond = 671.5
MPI Rank 0:  Epoch[ 5 of 12]-Minibatch[ 121- 130, 0.00000000000072%]: CE = 0.94496307 * 1280; Err = 0.34296875 * 1280; time = 1.9919s; samplesPerSecond = 642.6
MPI Rank 0:  Epoch[ 5 of 12]-Minibatch[ 131- 140, 0.00000000000078%]: CE = 0.93193512 * 1280; Err = 0.34062500 * 1280; time = 1.8202s; samplesPerSecond = 703.2
MPI Rank 0:  Epoch[ 5 of 12]-Minibatch[ 141- 150, 0.00000000000083%]: CE = 0.98439178 * 1280; Err = 0.35000000 * 1280; time = 1.8039s; samplesPerSecond = 709.6
MPI Rank 0:  Epoch[ 5 of 12]-Minibatch[ 151- 160, 0.00000000000089%]: CE = 0.90554352 * 1280; Err = 0.32734375 * 1280; time = 1.8487s; samplesPerSecond = 692.4
MPI Rank 0:  Epoch[ 5 of 12]-Minibatch[ 161- 170, 0.00000000000094%]: CE = 0.87322540 * 1280; Err = 0.31406250 * 1280; time = 1.8432s; samplesPerSecond = 694.5
MPI Rank 0:  Epoch[ 5 of 12]-Minibatch[ 171- 180, 0.00000000000100%]: CE = 0.97645264 * 1280; Err = 0.34843750 * 1280; time = 1.9057s; samplesPerSecond = 671.7
MPI Rank 0:  Epoch[ 5 of 12]-Minibatch[ 181- 190, 0.00000000000105%]: CE = 0.88249054 * 1280; Err = 0.30625000 * 1280; time = 1.5189s; samplesPerSecond = 842.7
MPI Rank 0: Finished Epoch[ 5 of 12]: [Training] CE = 0.97886297 * 25000; Err = 0.35088000 * 25000; totalSamplesSeen = 100000; learningRatePerSample = 0.0040000002; epochTime=36.4638s
MPI Rank 0: Final Results: Minibatch[1-79]: CE = 1.10240869 * 10000; perplexity = 3.01141084; Err = 0.37190000 * 10000
MPI Rank 0: Finished Epoch[ 5 of 12]: [Validate] CE = 1.10240869 * 10000; Err = 0.37190000 * 10000
MPI Rank 0: SGD: Saving checkpoint model './Output-ssgd/Models/03_ResNet.5'
MPI Rank 0: 
MPI Rank 0: Starting Epoch 6: learning rate per sample = 0.004000  effective momentum = 0.000000  momentum as time constant = 0.0 samples
MPI Rank 0: 
MPI Rank 0: Starting minibatch loop, DataParallelASGD training (myRank = 0, numNodes = 2, SamplesSyncToServer = 128), Distributed Evaluation is DISABLED, distributed reading is ENABLED.
MPI Rank 0:  Epoch[ 6 of 12]-Minibatch[   1-   1, 0.000000000000006%]: CE = 0.99850869 * 128; Err = 0.35937500 * 128; time = 0.2974s; samplesPerSecond = 430.4
MPI Rank 0:  Epoch[ 6 of 12]-Minibatch[   2-  10, 0.00000000000006%]: CE = 0.95278504 * 1152; Err = 0.34201389 * 1152; time = 1.5403s; samplesPerSecond = 747.9
MPI Rank 0:  Epoch[ 6 of 12]-Minibatch[  11-  20, 0.00000000000011%]: CE = 0.87053070 * 1280; Err = 0.30859375 * 1280; time = 1.7121s; samplesPerSecond = 747.6
MPI Rank 0:  Epoch[ 6 of 12]-Minibatch[  21-  30, 0.00000000000017%]: CE = 0.92212448 * 1280; Err = 0.31953125 * 1280; time = 1.7200s; samplesPerSecond = 744.2
MPI Rank 0:  Epoch[ 6 of 12]-Minibatch[  31-  40, 0.00000000000022%]: CE = 0.90970421 * 1280; Err = 0.31328125 * 1280; time = 1.7247s; samplesPerSecond = 742.2
MPI Rank 0:  Epoch[ 6 of 12]-Minibatch[  41-  50, 0.00000000000028%]: CE = 0.91234131 * 1280; Err = 0.31484375 * 1280; time = 1.7109s; samplesPerSecond = 748.1
MPI Rank 0:  Epoch[ 6 of 12]-Minibatch[  51-  60, 0.00000000000033%]: CE = 0.92099571 * 1280; Err = 0.32031250 * 1280; time = 1.7983s; samplesPerSecond = 711.8
MPI Rank 0:  Epoch[ 6 of 12]-Minibatch[  61-  70, 0.00000000000039%]: CE = 0.87335472 * 1280; Err = 0.30859375 * 1280; time = 1.9858s; samplesPerSecond = 644.6
MPI Rank 0:  Epoch[ 6 of 12]-Minibatch[  71-  80, 0.00000000000044%]: CE = 0.84789276 * 1280; Err = 0.30468750 * 1280; time = 1.9244s; samplesPerSecond = 665.1
MPI Rank 0:  Epoch[ 6 of 12]-Minibatch[  81-  90, 0.00000000000050%]: CE = 0.87059479 * 1280; Err = 0.29609375 * 1280; time = 1.8166s; samplesPerSecond = 704.6
MPI Rank 0:  Epoch[ 6 of 12]-Minibatch[  91- 100, 0.00000000000056%]: CE = 0.83420486 * 1280; Err = 0.28906250 * 1280; time = 1.8096s; samplesPerSecond = 707.3
MPI Rank 0:  Epoch[ 6 of 12]-Minibatch[ 101- 110, 0.00000000000061%]: CE = 0.87643051 * 1280; Err = 0.30703125 * 1280; time = 1.8383s; samplesPerSecond = 696.3
MPI Rank 0:  Epoch[ 6 of 12]-Minibatch[ 111- 120, 0.00000000000067%]: CE = 0.85173264 * 1280; Err = 0.30234375 * 1280; time = 1.7184s; samplesPerSecond = 744.9
MPI Rank 0:  Epoch[ 6 of 12]-Minibatch[ 121- 130, 0.00000000000072%]: CE = 0.84832687 * 1280; Err = 0.29062500 * 1280; time = 1.8223s; samplesPerSecond = 702.4
MPI Rank 0:  Epoch[ 6 of 12]-Minibatch[ 131- 140, 0.00000000000078%]: CE = 0.78573074 * 1280; Err = 0.28359375 * 1280; time = 1.8148s; samplesPerSecond = 705.3
MPI Rank 0:  Epoch[ 6 of 12]-Minibatch[ 141- 150, 0.00000000000083%]: CE = 0.79827423 * 1280; Err = 0.26328125 * 1280; time = 1.8218s; samplesPerSecond = 702.6
MPI Rank 0:  Epoch[ 6 of 12]-Minibatch[ 151- 160, 0.00000000000089%]: CE = 0.88019409 * 1280; Err = 0.31484375 * 1280; time = 1.7311s; samplesPerSecond = 739.4
MPI Rank 0:  Epoch[ 6 of 12]-Minibatch[ 161- 170, 0.00000000000094%]: CE = 0.78453674 * 1280; Err = 0.28750000 * 1280; time = 1.9734s; samplesPerSecond = 648.6
MPI Rank 0:  Epoch[ 6 of 12]-Minibatch[ 171- 180, 0.00000000000100%]: CE = 0.84921265 * 1280; Err = 0.29218750 * 1280; time = 1.8112s; samplesPerSecond = 706.7
MPI Rank 0:  Epoch[ 6 of 12]-Minibatch[ 181- 190, 0.00000000000105%]: CE = 0.80948792 * 1280; Err = 0.28828125 * 1280; time = 1.9159s; samplesPerSecond = 668.1
MPI Rank 0: Finished Epoch[ 6 of 12]: [Training] CE = 0.86264094 * 25000; Err = 0.30212000 * 25000; totalSamplesSeen = 125000; learningRatePerSample = 0.0040000002; epochTime=35.5526s
MPI Rank 0: Final Results: Minibatch[1-79]: CE = 0.94340823 * 10000; perplexity = 2.56872132; Err = 0.33490000 * 10000
MPI Rank 0: Finished Epoch[ 6 of 12]: [Validate] CE = 0.94340823 * 10000; Err = 0.33490000 * 10000
MPI Rank 0: SGD: Saving checkpoint model './Output-ssgd/Models/03_ResNet.6'
MPI Rank 0: 
MPI Rank 0: Starting Epoch 7: learning rate per sample = 0.004000  effective momentum = 0.000000  momentum as time constant = 0.0 samples
MPI Rank 0: 
MPI Rank 0: Starting minibatch loop, DataParallelASGD training (myRank = 0, numNodes = 2, SamplesSyncToServer = 128), Distributed Evaluation is DISABLED, distributed reading is ENABLED.
MPI Rank 0:  Epoch[ 7 of 12]-Minibatch[   1-   1, 0.000000000000006%]: CE = 0.83054650 * 128; Err = 0.26562500 * 128; time = 0.2195s; samplesPerSecond = 583.2
MPI Rank 0:  Epoch[ 7 of 12]-Minibatch[   2-  10, 0.00000000000006%]: CE = 0.82931697 * 1152; Err = 0.29774306 * 1152; time = 1.6299s; samplesPerSecond = 706.8
MPI Rank 0:  Epoch[ 7 of 12]-Minibatch[  11-  20, 0.00000000000011%]: CE = 0.78532372 * 1280; Err = 0.26562500 * 1280; time = 2.0101s; samplesPerSecond = 636.8
MPI Rank 0:  Epoch[ 7 of 12]-Minibatch[  21-  30, 0.00000000000017%]: CE = 0.84593296 * 1280; Err = 0.29140625 * 1280; time = 1.7052s; samplesPerSecond = 750.6
MPI Rank 0:  Epoch[ 7 of 12]-Minibatch[  31-  40, 0.00000000000022%]: CE = 0.75033283 * 1280; Err = 0.26484375 * 1280; time = 1.8870s; samplesPerSecond = 678.3
MPI Rank 0:  Epoch[ 7 of 12]-Minibatch[  41-  50, 0.00000000000028%]: CE = 0.78420601 * 1280; Err = 0.26640625 * 1280; time = 1.9405s; samplesPerSecond = 659.6
MPI Rank 0:  Epoch[ 7 of 12]-Minibatch[  51-  60, 0.00000000000033%]: CE = 0.78109283 * 1280; Err = 0.27812500 * 1280; time = 1.8538s; samplesPerSecond = 690.5
MPI Rank 0:  Epoch[ 7 of 12]-Minibatch[  61-  70, 0.00000000000039%]: CE = 0.73740616 * 1280; Err = 0.26093750 * 1280; time = 1.7397s; samplesPerSecond = 735.8
MPI Rank 0:  Epoch[ 7 of 12]-Minibatch[  71-  80, 0.00000000000044%]: CE = 0.84222603 * 1280; Err = 0.29375000 * 1280; time = 1.7962s; samplesPerSecond = 712.6
MPI Rank 0:  Epoch[ 7 of 12]-Minibatch[  81-  90, 0.00000000000050%]: CE = 0.78126259 * 1280; Err = 0.27656250 * 1280; time = 1.7286s; samplesPerSecond = 740.5
MPI Rank 0:  Epoch[ 7 of 12]-Minibatch[  91- 100, 0.00000000000056%]: CE = 0.76586151 * 1280; Err = 0.26796875 * 1280; time = 1.8010s; samplesPerSecond = 710.7
MPI Rank 0:  Epoch[ 7 of 12]-Minibatch[ 101- 110, 0.00000000000061%]: CE = 0.74219131 * 1280; Err = 0.26093750 * 1280; time = 1.8098s; samplesPerSecond = 707.3
MPI Rank 0:  Epoch[ 7 of 12]-Minibatch[ 111- 120, 0.00000000000067%]: CE = 0.73395691 * 1280; Err = 0.24453125 * 1280; time = 1.8206s; samplesPerSecond = 703.1
MPI Rank 0:  Epoch[ 7 of 12]-Minibatch[ 121- 130, 0.00000000000072%]: CE = 0.78399811 * 1280; Err = 0.27343750 * 1280; time = 1.7475s; samplesPerSecond = 732.5
MPI Rank 0:  Epoch[ 7 of 12]-Minibatch[ 131- 140, 0.00000000000078%]: CE = 0.79863281 * 1280; Err = 0.27734375 * 1280; time = 1.8344s; samplesPerSecond = 697.8
MPI Rank 0:  Epoch[ 7 of 12]-Minibatch[ 141- 150, 0.00000000000083%]: CE = 0.75530167 * 1280; Err = 0.26718750 * 1280; time = 1.4581s; samplesPerSecond = 877.9
MPI Rank 0:  Epoch[ 7 of 12]-Minibatch[ 151- 160, 0.00000000000089%]: CE = 0.76989212 * 1280; Err = 0.27109375 * 1280; time = 1.8428s; samplesPerSecond = 694.6
MPI Rank 0:  Epoch[ 7 of 12]-Minibatch[ 161- 170, 0.00000000000094%]: CE = 0.78043060 * 1280; Err = 0.26640625 * 1280; time = 1.6618s; samplesPerSecond = 770.3
MPI Rank 0:  Epoch[ 7 of 12]-Minibatch[ 171- 180, 0.00000000000100%]: CE = 0.73198090 * 1280; Err = 0.25625000 * 1280; time = 1.6031s; samplesPerSecond = 798.5
MPI Rank 0:  Epoch[ 7 of 12]-Minibatch[ 181- 190, 0.00000000000105%]: CE = 0.77142639 * 1280; Err = 0.26171875 * 1280; time = 1.8084s; samplesPerSecond = 707.8
MPI Rank 0: Finished Epoch[ 7 of 12]: [Training] CE = 0.77780109 * 25000; Err = 0.27076000 * 25000; totalSamplesSeen = 150000; learningRatePerSample = 0.0040000002; epochTime=34.9748s
MPI Rank 0: Final Results: Minibatch[1-79]: CE = 1.07266990 * 10000; perplexity = 2.92317367; Err = 0.33950000 * 10000
MPI Rank 0: Finished Epoch[ 7 of 12]: [Validate] CE = 1.07266990 * 10000; Err = 0.33950000 * 10000
MPI Rank 0: SGD: Saving checkpoint model './Output-ssgd/Models/03_ResNet.7'
MPI Rank 0: 
MPI Rank 0: Starting Epoch 8: learning rate per sample = 0.004000  effective momentum = 0.000000  momentum as time constant = 0.0 samples
MPI Rank 0: 
MPI Rank 0: Starting minibatch loop, DataParallelASGD training (myRank = 0, numNodes = 2, SamplesSyncToServer = 128), Distributed Evaluation is DISABLED, distributed reading is ENABLED.
MPI Rank 0:  Epoch[ 8 of 12]-Minibatch[   1-   1, 0.000000000000006%]: CE = 0.60670477 * 128; Err = 0.25781250 * 128; time = 0.2222s; samplesPerSecond = 576.0
MPI Rank 0:  Epoch[ 8 of 12]-Minibatch[   2-  10, 0.00000000000006%]: CE = 0.71904129 * 1152; Err = 0.25607639 * 1152; time = 1.5250s; samplesPerSecond = 755.4
MPI Rank 0:  Epoch[ 8 of 12]-Minibatch[  11-  20, 0.00000000000011%]: CE = 0.72518921 * 1280; Err = 0.26015625 * 1280; time = 1.7312s; samplesPerSecond = 739.4
MPI Rank 0:  Epoch[ 8 of 12]-Minibatch[  21-  30, 0.00000000000017%]: CE = 0.76588621 * 1280; Err = 0.25937500 * 1280; time = 1.7255s; samplesPerSecond = 741.8
MPI Rank 0:  Epoch[ 8 of 12]-Minibatch[  31-  40, 0.00000000000022%]: CE = 0.74321499 * 1280; Err = 0.26328125 * 1280; time = 1.7299s; samplesPerSecond = 739.9
MPI Rank 0:  Epoch[ 8 of 12]-Minibatch[  41-  50, 0.00000000000028%]: CE = 0.79642162 * 1280; Err = 0.26562500 * 1280; time = 1.8232s; samplesPerSecond = 702.1
MPI Rank 0:  Epoch[ 8 of 12]-Minibatch[  51-  60, 0.00000000000033%]: CE = 0.73362236 * 1280; Err = 0.25390625 * 1280; time = 1.8372s; samplesPerSecond = 696.7
MPI Rank 0:  Epoch[ 8 of 12]-Minibatch[  61-  70, 0.00000000000039%]: CE = 0.77696304 * 1280; Err = 0.26875000 * 1280; time = 1.8333s; samplesPerSecond = 698.2
MPI Rank 0:  Epoch[ 8 of 12]-Minibatch[  71-  80, 0.00000000000044%]: CE = 0.69842186 * 1280; Err = 0.24218750 * 1280; time = 1.7386s; samplesPerSecond = 736.2
MPI Rank 0:  Epoch[ 8 of 12]-Minibatch[  81-  90, 0.00000000000050%]: CE = 0.63282204 * 1280; Err = 0.22734375 * 1280; time = 1.6280s; samplesPerSecond = 786.3
MPI Rank 0:  Epoch[ 8 of 12]-Minibatch[  91- 100, 0.00000000000056%]: CE = 0.66842041 * 1280; Err = 0.23750000 * 1280; time = 1.8405s; samplesPerSecond = 695.5
MPI Rank 0:  Epoch[ 8 of 12]-Minibatch[ 101- 110, 0.00000000000061%]: CE = 0.71357803 * 1280; Err = 0.25390625 * 1280; time = 1.7887s; samplesPerSecond = 715.6
MPI Rank 0:  Epoch[ 8 of 12]-Minibatch[ 111- 120, 0.00000000000067%]: CE = 0.68300781 * 1280; Err = 0.22421875 * 1280; time = 1.9056s; samplesPerSecond = 671.7
MPI Rank 0:  Epoch[ 8 of 12]-Minibatch[ 121- 130, 0.00000000000072%]: CE = 0.68068771 * 1280; Err = 0.23125000 * 1280; time = 1.7329s; samplesPerSecond = 738.6
MPI Rank 0:  Epoch[ 8 of 12]-Minibatch[ 131- 140, 0.00000000000078%]: CE = 0.67280731 * 1280; Err = 0.24843750 * 1280; time = 1.8117s; samplesPerSecond = 706.5
MPI Rank 0:  Epoch[ 8 of 12]-Minibatch[ 141- 150, 0.00000000000083%]: CE = 0.74227905 * 1280; Err = 0.25156250 * 1280; time = 1.8355s; samplesPerSecond = 697.4
MPI Rank 0:  Epoch[ 8 of 12]-Minibatch[ 151- 160, 0.00000000000089%]: CE = 0.74218140 * 1280; Err = 0.26406250 * 1280; time = 1.8073s; samplesPerSecond = 708.2
MPI Rank 0:  Epoch[ 8 of 12]-Minibatch[ 161- 170, 0.00000000000094%]: CE = 0.67347565 * 1280; Err = 0.23046875 * 1280; time = 1.8158s; samplesPerSecond = 704.9
MPI Rank 0:  Epoch[ 8 of 12]-Minibatch[ 171- 180, 0.00000000000100%]: CE = 0.71262589 * 1280; Err = 0.25312500 * 1280; time = 1.8011s; samplesPerSecond = 710.7
MPI Rank 0:  Epoch[ 8 of 12]-Minibatch[ 181- 190, 0.00000000000105%]: CE = 0.68168640 * 1280; Err = 0.23828125 * 1280; time = 1.6981s; samplesPerSecond = 753.8
MPI Rank 0: Finished Epoch[ 8 of 12]: [Training] CE = 0.71403070 * 25000; Err = 0.24876000 * 25000; totalSamplesSeen = 175000; learningRatePerSample = 0.0040000002; epochTime=34.8298s
MPI Rank 0: Final Results: Minibatch[1-79]: CE = 0.93113912 * 10000; perplexity = 2.53739794; Err = 0.31040000 * 10000
MPI Rank 0: Finished Epoch[ 8 of 12]: [Validate] CE = 0.93113912 * 10000; Err = 0.31040000 * 10000
MPI Rank 0: SGD: Saving checkpoint model './Output-ssgd/Models/03_ResNet.8'
MPI Rank 0: 
MPI Rank 0: Starting Epoch 9: learning rate per sample = 0.004000  effective momentum = 0.000000  momentum as time constant = 0.0 samples
MPI Rank 0: 
MPI Rank 0: Starting minibatch loop, DataParallelASGD training (myRank = 0, numNodes = 2, SamplesSyncToServer = 128), Distributed Evaluation is DISABLED, distributed reading is ENABLED.
MPI Rank 0:  Epoch[ 9 of 12]-Minibatch[   1-   1, 0.000000000000006%]: CE = 0.75184584 * 128; Err = 0.28125000 * 128; time = 0.2080s; samplesPerSecond = 615.4
MPI Rank 0:  Epoch[ 9 of 12]-Minibatch[   2-  10, 0.00000000000006%]: CE = 0.64343940 * 1152; Err = 0.22916667 * 1152; time = 1.6086s; samplesPerSecond = 716.1
MPI Rank 0:  Epoch[ 9 of 12]-Minibatch[  11-  20, 0.00000000000011%]: CE = 0.62986178 * 1280; Err = 0.20859375 * 1280; time = 1.7279s; samplesPerSecond = 740.8
MPI Rank 0:  Epoch[ 9 of 12]-Minibatch[  21-  30, 0.00000000000017%]: CE = 0.64027834 * 1280; Err = 0.21328125 * 1280; time = 1.9280s; samplesPerSecond = 663.9
MPI Rank 0:  Epoch[ 9 of 12]-Minibatch[  31-  40, 0.00000000000022%]: CE = 0.69680042 * 1280; Err = 0.24218750 * 1280; time = 1.6299s; samplesPerSecond = 785.3
MPI Rank 0:  Epoch[ 9 of 12]-Minibatch[  41-  50, 0.00000000000028%]: CE = 0.69152355 * 1280; Err = 0.24765625 * 1280; time = 1.6331s; samplesPerSecond = 783.8
MPI Rank 0:  Epoch[ 9 of 12]-Minibatch[  51-  60, 0.00000000000033%]: CE = 0.62510529 * 1280; Err = 0.22343750 * 1280; time = 1.5751s; samplesPerSecond = 812.6
MPI Rank 0:  Epoch[ 9 of 12]-Minibatch[  61-  70, 0.00000000000039%]: CE = 0.68971672 * 1280; Err = 0.23828125 * 1280; time = 1.6423s; samplesPerSecond = 779.4
MPI Rank 0:  Epoch[ 9 of 12]-Minibatch[  71-  80, 0.00000000000044%]: CE = 0.68932304 * 1280; Err = 0.23593750 * 1280; time = 1.8013s; samplesPerSecond = 710.6
MPI Rank 0:  Epoch[ 9 of 12]-Minibatch[  81-  90, 0.00000000000050%]: CE = 0.63705902 * 1280; Err = 0.22500000 * 1280; time = 1.6997s; samplesPerSecond = 753.1
MPI Rank 0:  Epoch[ 9 of 12]-Minibatch[  91- 100, 0.00000000000056%]: CE = 0.67337570 * 1280; Err = 0.22890625 * 1280; time = 2.0038s; samplesPerSecond = 638.8
MPI Rank 0:  Epoch[ 9 of 12]-Minibatch[ 101- 110, 0.00000000000061%]: CE = 0.63598175 * 1280; Err = 0.20937500 * 1280; time = 1.7191s; samplesPerSecond = 744.6
MPI Rank 0:  Epoch[ 9 of 12]-Minibatch[ 111- 120, 0.00000000000067%]: CE = 0.68425598 * 1280; Err = 0.24765625 * 1280; time = 1.7783s; samplesPerSecond = 719.8
MPI Rank 0:  Epoch[ 9 of 12]-Minibatch[ 121- 130, 0.00000000000072%]: CE = 0.62809830 * 1280; Err = 0.22656250 * 1280; time = 1.8671s; samplesPerSecond = 685.6
MPI Rank 0:  Epoch[ 9 of 12]-Minibatch[ 131- 140, 0.00000000000078%]: CE = 0.60770721 * 1280; Err = 0.21953125 * 1280; time = 1.9152s; samplesPerSecond = 668.3
MPI Rank 0:  Epoch[ 9 of 12]-Minibatch[ 141- 150, 0.00000000000083%]: CE = 0.61740646 * 1280; Err = 0.22031250 * 1280; time = 1.7508s; samplesPerSecond = 731.1
MPI Rank 0:  Epoch[ 9 of 12]-Minibatch[ 151- 160, 0.00000000000089%]: CE = 0.60150681 * 1280; Err = 0.21484375 * 1280; time = 1.8924s; samplesPerSecond = 676.4
MPI Rank 0:  Epoch[ 9 of 12]-Minibatch[ 161- 170, 0.00000000000094%]: CE = 0.61738205 * 1280; Err = 0.21875000 * 1280; time = 1.8045s; samplesPerSecond = 709.3
MPI Rank 0:  Epoch[ 9 of 12]-Minibatch[ 171- 180, 0.00000000000100%]: CE = 0.63640060 * 1280; Err = 0.21640625 * 1280; time = 1.7620s; samplesPerSecond = 726.5
MPI Rank 0:  Epoch[ 9 of 12]-Minibatch[ 181- 190, 0.00000000000105%]: CE = 0.64825974 * 1280; Err = 0.23359375 * 1280; time = 1.8577s; samplesPerSecond = 689.0
MPI Rank 0: Finished Epoch[ 9 of 12]: [Training] CE = 0.64865246 * 25000; Err = 0.22676000 * 25000; totalSamplesSeen = 200000; learningRatePerSample = 0.0040000002; epochTime=34.7791s
MPI Rank 0: Final Results: Minibatch[1-79]: CE = 1.04686862 * 10000; perplexity = 2.84871672; Err = 0.33820000 * 10000
MPI Rank 0: Finished Epoch[ 9 of 12]: [Validate] CE = 1.04686862 * 10000; Err = 0.33820000 * 10000
MPI Rank 0: SGD: Saving checkpoint model './Output-ssgd/Models/03_ResNet.9'
MPI Rank 0: 
MPI Rank 0: Starting Epoch 10: learning rate per sample = 0.004000  effective momentum = 0.000000  momentum as time constant = 0.0 samples
MPI Rank 0: 
MPI Rank 0: Starting minibatch loop, DataParallelASGD training (myRank = 0, numNodes = 2, SamplesSyncToServer = 128), Distributed Evaluation is DISABLED, distributed reading is ENABLED.
MPI Rank 0:  Epoch[10 of 12]-Minibatch[   1-   1, 0.000000000000006%]: CE = 0.54060709 * 128; Err = 0.17968750 * 128; time = 0.2188s; samplesPerSecond = 585.1
MPI Rank 0:  Epoch[10 of 12]-Minibatch[   2-  10, 0.00000000000006%]: CE = 0.61029300 * 1152; Err = 0.22222222 * 1152; time = 1.5528s; samplesPerSecond = 741.9
MPI Rank 0:  Epoch[10 of 12]-Minibatch[  11-  20, 0.00000000000011%]: CE = 0.63416338 * 1280; Err = 0.21953125 * 1280; time = 1.7407s; samplesPerSecond = 735.3
MPI Rank 0:  Epoch[10 of 12]-Minibatch[  21-  30, 0.00000000000017%]: CE = 0.59266586 * 1280; Err = 0.19921875 * 1280; time = 1.8829s; samplesPerSecond = 679.8
MPI Rank 0:  Epoch[10 of 12]-Minibatch[  31-  40, 0.00000000000022%]: CE = 0.62723541 * 1280; Err = 0.22578125 * 1280; time = 1.7282s; samplesPerSecond = 740.7
MPI Rank 0:  Epoch[10 of 12]-Minibatch[  41-  50, 0.00000000000028%]: CE = 0.60379925 * 1280; Err = 0.20625000 * 1280; time = 1.7496s; samplesPerSecond = 731.6
MPI Rank 0:  Epoch[10 of 12]-Minibatch[  51-  60, 0.00000000000033%]: CE = 0.62765255 * 1280; Err = 0.22812500 * 1280; time = 1.8148s; samplesPerSecond = 705.3
MPI Rank 0:  Epoch[10 of 12]-Minibatch[  61-  70, 0.00000000000039%]: CE = 0.60488014 * 1280; Err = 0.21562500 * 1280; time = 1.7092s; samplesPerSecond = 748.9
MPI Rank 0:  Epoch[10 of 12]-Minibatch[  71-  80, 0.00000000000044%]: CE = 0.61295395 * 1280; Err = 0.20390625 * 1280; time = 1.7626s; samplesPerSecond = 726.2
MPI Rank 0:  Epoch[10 of 12]-Minibatch[  81-  90, 0.00000000000050%]: CE = 0.60691299 * 1280; Err = 0.21718750 * 1280; time = 1.9288s; samplesPerSecond = 663.6
MPI Rank 0:  Epoch[10 of 12]-Minibatch[  91- 100, 0.00000000000056%]: CE = 0.64869270 * 1280; Err = 0.22031250 * 1280; time = 1.8479s; samplesPerSecond = 692.7
MPI Rank 0:  Epoch[10 of 12]-Minibatch[ 101- 110, 0.00000000000061%]: CE = 0.57487450 * 1280; Err = 0.19296875 * 1280; time = 1.8130s; samplesPerSecond = 706.0
MPI Rank 0:  Epoch[10 of 12]-Minibatch[ 111- 120, 0.00000000000067%]: CE = 0.63811188 * 1280; Err = 0.22265625 * 1280; time = 1.8975s; samplesPerSecond = 674.6
MPI Rank 0:  Epoch[10 of 12]-Minibatch[ 121- 130, 0.00000000000072%]: CE = 0.59670639 * 1280; Err = 0.21093750 * 1280; time = 1.8972s; samplesPerSecond = 674.7
MPI Rank 0:  Epoch[10 of 12]-Minibatch[ 131- 140, 0.00000000000078%]: CE = 0.58384323 * 1280; Err = 0.20468750 * 1280; time = 1.8979s; samplesPerSecond = 674.4
MPI Rank 0:  Epoch[10 of 12]-Minibatch[ 141- 150, 0.00000000000083%]: CE = 0.62350845 * 1280; Err = 0.21640625 * 1280; time = 1.7758s; samplesPerSecond = 720.8
MPI Rank 0:  Epoch[10 of 12]-Minibatch[ 151- 160, 0.00000000000089%]: CE = 0.57972794 * 1280; Err = 0.20234375 * 1280; time = 2.0567s; samplesPerSecond = 622.4
MPI Rank 0:  Epoch[10 of 12]-Minibatch[ 161- 170, 0.00000000000094%]: CE = 0.65487213 * 1280; Err = 0.21562500 * 1280; time = 1.7079s; samplesPerSecond = 749.5
MPI Rank 0:  Epoch[10 of 12]-Minibatch[ 171- 180, 0.00000000000100%]: CE = 0.62748108 * 1280; Err = 0.22343750 * 1280; time = 1.7923s; samplesPerSecond = 714.2
MPI Rank 0:  Epoch[10 of 12]-Minibatch[ 181- 190, 0.00000000000105%]: CE = 0.60598526 * 1280; Err = 0.20000000 * 1280; time = 1.8409s; samplesPerSecond = 695.3
MPI Rank 0: Finished Epoch[10 of 12]: [Training] CE = 0.61193480 * 25000; Err = 0.21252000 * 25000; totalSamplesSeen = 225000; learningRatePerSample = 0.0040000002; epochTime=35.6219s
MPI Rank 0: Final Results: Minibatch[1-79]: CE = 1.14263183 * 10000; perplexity = 3.13500833; Err = 0.35980000 * 10000
MPI Rank 0: Finished Epoch[10 of 12]: [Validate] CE = 1.14263183 * 10000; Err = 0.35980000 * 10000
MPI Rank 0: SGD: Saving checkpoint model './Output-ssgd/Models/03_ResNet.10'
MPI Rank 0: 
MPI Rank 0: Starting Epoch 11: learning rate per sample = 0.004000  effective momentum = 0.000000  momentum as time constant = 0.0 samples
MPI Rank 0: 
MPI Rank 0: Starting minibatch loop, DataParallelASGD training (myRank = 0, numNodes = 2, SamplesSyncToServer = 128), Distributed Evaluation is DISABLED, distributed reading is ENABLED.
MPI Rank 0:  Epoch[11 of 12]-Minibatch[   1-   1, 0.000000000000006%]: CE = 0.64261544 * 128; Err = 0.21093750 * 128; time = 0.2048s; samplesPerSecond = 624.9
MPI Rank 0:  Epoch[11 of 12]-Minibatch[   2-  10, 0.00000000000006%]: CE = 0.61469490 * 1152; Err = 0.21527778 * 1152; time = 1.5999s; samplesPerSecond = 720.1
MPI Rank 0:  Epoch[11 of 12]-Minibatch[  11-  20, 0.00000000000011%]: CE = 0.57980871 * 1280; Err = 0.20000000 * 1280; time = 1.8613s; samplesPerSecond = 687.7
MPI Rank 0:  Epoch[11 of 12]-Minibatch[  21-  30, 0.00000000000017%]: CE = 0.60074663 * 1280; Err = 0.20781250 * 1280; time = 1.8076s; samplesPerSecond = 708.1
MPI Rank 0:  Epoch[11 of 12]-Minibatch[  31-  40, 0.00000000000022%]: CE = 0.62818985 * 1280; Err = 0.21953125 * 1280; time = 1.7008s; samplesPerSecond = 752.6
MPI Rank 0:  Epoch[11 of 12]-Minibatch[  41-  50, 0.00000000000028%]: CE = 0.57138958 * 1280; Err = 0.19453125 * 1280; time = 1.8020s; samplesPerSecond = 710.3
MPI Rank 0:  Epoch[11 of 12]-Minibatch[  51-  60, 0.00000000000033%]: CE = 0.57930202 * 1280; Err = 0.19687500 * 1280; time = 1.7749s; samplesPerSecond = 721.2
MPI Rank 0:  Epoch[11 of 12]-Minibatch[  61-  70, 0.00000000000039%]: CE = 0.61204147 * 1280; Err = 0.21875000 * 1280; time = 1.8545s; samplesPerSecond = 690.2
MPI Rank 0:  Epoch[11 of 12]-Minibatch[  71-  80, 0.00000000000044%]: CE = 0.58628922 * 1280; Err = 0.20312500 * 1280; time = 1.8016s; samplesPerSecond = 710.5
MPI Rank 0:  Epoch[11 of 12]-Minibatch[  81-  90, 0.00000000000050%]: CE = 0.61974716 * 1280; Err = 0.22031250 * 1280; time = 1.6370s; samplesPerSecond = 781.9
MPI Rank 0:  Epoch[11 of 12]-Minibatch[  91- 100, 0.00000000000056%]: CE = 0.55952873 * 1280; Err = 0.18828125 * 1280; time = 1.8838s; samplesPerSecond = 679.5
MPI Rank 0:  Epoch[11 of 12]-Minibatch[ 101- 110, 0.00000000000061%]: CE = 0.57139816 * 1280; Err = 0.19531250 * 1280; time = 1.8173s; samplesPerSecond = 704.3
MPI Rank 0:  Epoch[11 of 12]-Minibatch[ 111- 120, 0.00000000000067%]: CE = 0.56446228 * 1280; Err = 0.19218750 * 1280; time = 1.7395s; samplesPerSecond = 735.9
MPI Rank 0:  Epoch[11 of 12]-Minibatch[ 121- 130, 0.00000000000072%]: CE = 0.61259384 * 1280; Err = 0.20390625 * 1280; time = 1.7535s; samplesPerSecond = 730.0
MPI Rank 0:  Epoch[11 of 12]-Minibatch[ 131- 140, 0.00000000000078%]: CE = 0.54565811 * 1280; Err = 0.19453125 * 1280; time = 1.7259s; samplesPerSecond = 741.6
MPI Rank 0:  Epoch[11 of 12]-Minibatch[ 141- 150, 0.00000000000083%]: CE = 0.54923401 * 1280; Err = 0.19609375 * 1280; time = 1.9184s; samplesPerSecond = 667.2
MPI Rank 0:  Epoch[11 of 12]-Minibatch[ 151- 160, 0.00000000000089%]: CE = 0.59994583 * 1280; Err = 0.20390625 * 1280; time = 1.7244s; samplesPerSecond = 742.3
MPI Rank 0:  Epoch[11 of 12]-Minibatch[ 161- 170, 0.00000000000094%]: CE = 0.52817230 * 1280; Err = 0.19062500 * 1280; time = 1.8009s; samplesPerSecond = 710.8
MPI Rank 0:  Epoch[11 of 12]-Minibatch[ 171- 180, 0.00000000000100%]: CE = 0.62245407 * 1280; Err = 0.21171875 * 1280; time = 1.7419s; samplesPerSecond = 734.8
MPI Rank 0:  Epoch[11 of 12]-Minibatch[ 181- 190, 0.00000000000105%]: CE = 0.56471176 * 1280; Err = 0.19453125 * 1280; time = 1.7323s; samplesPerSecond = 738.9
MPI Rank 0: Finished Epoch[11 of 12]: [Training] CE = 0.58433871 * 25000; Err = 0.20224000 * 25000; totalSamplesSeen = 250000; learningRatePerSample = 0.0040000002; epochTime=34.9257s
MPI Rank 0: Final Results: Minibatch[1-79]: CE = 0.79050067 * 10000; perplexity = 2.20449987; Err = 0.27050000 * 10000
MPI Rank 0: Finished Epoch[11 of 12]: [Validate] CE = 0.79050067 * 10000; Err = 0.27050000 * 10000
MPI Rank 0: SGD: Saving checkpoint model './Output-ssgd/Models/03_ResNet.11'
MPI Rank 0: 
MPI Rank 0: Starting Epoch 12: learning rate per sample = 0.004000  effective momentum = 0.000000  momentum as time constant = 0.0 samples
MPI Rank 0: 
MPI Rank 0: Starting minibatch loop, DataParallelASGD training (myRank = 0, numNodes = 2, SamplesSyncToServer = 128), Distributed Evaluation is DISABLED, distributed reading is ENABLED.
MPI Rank 0:  Epoch[12 of 12]-Minibatch[   1-   1, 0.000000000000006%]: CE = 0.65181273 * 128; Err = 0.25000000 * 128; time = 0.2138s; samplesPerSecond = 598.8
MPI Rank 0:  Epoch[12 of 12]-Minibatch[   2-  10, 0.00000000000006%]: CE = 0.55534043 * 1152; Err = 0.19791667 * 1152; time = 1.6279s; samplesPerSecond = 707.7
MPI Rank 0:  Epoch[12 of 12]-Minibatch[  11-  20, 0.00000000000011%]: CE = 0.52375193 * 1280; Err = 0.18515625 * 1280; time = 1.8949s; samplesPerSecond = 675.5
MPI Rank 0:  Epoch[12 of 12]-Minibatch[  21-  30, 0.00000000000017%]: CE = 0.57513618 * 1280; Err = 0.19609375 * 1280; time = 1.6965s; samplesPerSecond = 754.5
MPI Rank 0:  Epoch[12 of 12]-Minibatch[  31-  40, 0.00000000000022%]: CE = 0.59431992 * 1280; Err = 0.21328125 * 1280; time = 1.6547s; samplesPerSecond = 773.5
MPI Rank 0:  Epoch[12 of 12]-Minibatch[  41-  50, 0.00000000000028%]: CE = 0.56935158 * 1280; Err = 0.20000000 * 1280; time = 1.8771s; samplesPerSecond = 681.9
MPI Rank 0:  Epoch[12 of 12]-Minibatch[  51-  60, 0.00000000000033%]: CE = 0.56304646 * 1280; Err = 0.21093750 * 1280; time = 1.6236s; samplesPerSecond = 788.4
MPI Rank 0:  Epoch[12 of 12]-Minibatch[  61-  70, 0.00000000000039%]: CE = 0.48123589 * 1280; Err = 0.17734375 * 1280; time = 1.6409s; samplesPerSecond = 780.0
MPI Rank 0:  Epoch[12 of 12]-Minibatch[  71-  80, 0.00000000000044%]: CE = 0.53849792 * 1280; Err = 0.19218750 * 1280; time = 1.8082s; samplesPerSecond = 707.9
MPI Rank 0:  Epoch[12 of 12]-Minibatch[  81-  90, 0.00000000000050%]: CE = 0.52385139 * 1280; Err = 0.18515625 * 1280; time = 1.9135s; samplesPerSecond = 668.9
MPI Rank 0:  Epoch[12 of 12]-Minibatch[  91- 100, 0.00000000000056%]: CE = 0.54889832 * 1280; Err = 0.19218750 * 1280; time = 1.8249s; samplesPerSecond = 701.4
MPI Rank 0:  Epoch[12 of 12]-Minibatch[ 101- 110, 0.00000000000061%]: CE = 0.56033821 * 1280; Err = 0.19296875 * 1280; time = 1.9990s; samplesPerSecond = 640.3
MPI Rank 0:  Epoch[12 of 12]-Minibatch[ 111- 120, 0.00000000000067%]: CE = 0.58805046 * 1280; Err = 0.19921875 * 1280; time = 1.7219s; samplesPerSecond = 743.4
MPI Rank 0:  Epoch[12 of 12]-Minibatch[ 121- 130, 0.00000000000072%]: CE = 0.52831116 * 1280; Err = 0.17578125 * 1280; time = 1.9990s; samplesPerSecond = 640.3
MPI Rank 0:  Epoch[12 of 12]-Minibatch[ 131- 140, 0.00000000000078%]: CE = 0.56004105 * 1280; Err = 0.19921875 * 1280; time = 1.7295s; samplesPerSecond = 740.1
MPI Rank 0:  Epoch[12 of 12]-Minibatch[ 141- 150, 0.00000000000083%]: CE = 0.55382004 * 1280; Err = 0.20312500 * 1280; time = 1.6170s; samplesPerSecond = 791.6
MPI Rank 0:  Epoch[12 of 12]-Minibatch[ 151- 160, 0.00000000000089%]: CE = 0.58086472 * 1280; Err = 0.20234375 * 1280; time = 1.9271s; samplesPerSecond = 664.2
MPI Rank 0:  Epoch[12 of 12]-Minibatch[ 161- 170, 0.00000000000094%]: CE = 0.53163300 * 1280; Err = 0.19062500 * 1280; time = 1.7460s; samplesPerSecond = 733.1
MPI Rank 0:  Epoch[12 of 12]-Minibatch[ 171- 180, 0.00000000000100%]: CE = 0.53606033 * 1280; Err = 0.18593750 * 1280; time = 1.7473s; samplesPerSecond = 732.6
MPI Rank 0:  Epoch[12 of 12]-Minibatch[ 181- 190, 0.00000000000105%]: CE = 0.52596588 * 1280; Err = 0.16406250 * 1280; time = 1.9044s; samplesPerSecond = 672.1
MPI Rank 0: Finished Epoch[12 of 12]: [Training] CE = 0.54974551 * 25000; Err = 0.19304000 * 25000; totalSamplesSeen = 275000; learningRatePerSample = 0.0040000002; epochTime=35.2147s
MPI Rank 0: Final Results: Minibatch[1-79]: CE = 0.77124689 * 10000; perplexity = 2.16246091; Err = 0.25340000 * 10000
MPI Rank 0: Finished Epoch[12 of 12]: [Validate] CE = 0.77124689 * 10000; Err = 0.25340000 * 10000
MPI Rank 0: SGD: Saving checkpoint model './Output-ssgd/Models/03_ResNet'
MPI Rank 0: ~MultiversoHelper
MPI Rank 0: [INFO] [2017-04-10 01:31:42] Multiverso Shutdown successfully
MPI Rank 0: 
MPI Rank 0: Action "train" complete.
MPI Rank 0: 
MPI Rank 0: COMPLETED.
MPI Rank 0: ~MPIWrapperMpi
MPI Rank 1: Configuration After Processing and Variable Resolution:
MPI Rank 1: 
MPI Rank 1: configparameters: 03_ResNet-parallel.cntk:asyncBuffer=false
MPI Rank 1: configparameters: 03_ResNet-parallel.cntk:command=Train
MPI Rank 1: configparameters: 03_ResNet-parallel.cntk:ConfigDir=.
MPI Rank 1: configparameters: 03_ResNet-parallel.cntk:configName=ssgd
MPI Rank 1: configparameters: 03_ResNet-parallel.cntk:DataDir=/home/qiwye/git/cntk/Examples/Image/DataSets/CIFAR-10
MPI Rank 1: configparameters: 03_ResNet-parallel.cntk:DeviceId=0
MPI Rank 1: configparameters: 03_ResNet-parallel.cntk:epochSize=12
MPI Rank 1: configparameters: 03_ResNet-parallel.cntk:imageLayout=cudnn
MPI Rank 1: configparameters: 03_ResNet-parallel.cntk:initOnCPUOnly=true
MPI Rank 1: configparameters: 03_ResNet-parallel.cntk:makeMode=true
MPI Rank 1: configparameters: 03_ResNet-parallel.cntk:minibatch=256
MPI Rank 1: configparameters: 03_ResNet-parallel.cntk:ModelDir=./Output-ssgd/Models
MPI Rank 1: configparameters: 03_ResNet-parallel.cntk:ndlMacros=./Macros.ndl
MPI Rank 1: configparameters: 03_ResNet-parallel.cntk:OutputDir=./Output-ssgd
MPI Rank 1: configparameters: 03_ResNet-parallel.cntk:parallelizationMethod=DataParallelASGD
MPI Rank 1: configparameters: 03_ResNet-parallel.cntk:parallelTrain=true
MPI Rank 1: configparameters: 03_ResNet-parallel.cntk:precision=float
MPI Rank 1: configparameters: 03_ResNet-parallel.cntk:prefetch=true
MPI Rank 1: configparameters: 03_ResNet-parallel.cntk:Proj16to32Filename=./16to32.txt
MPI Rank 1: configparameters: 03_ResNet-parallel.cntk:Proj32to64Filename=./32to64.txt
MPI Rank 1: configparameters: 03_ResNet-parallel.cntk:RootDir=.
MPI Rank 1: configparameters: 03_ResNet-parallel.cntk:stderr=./Output-ssgd/03_ResNet
MPI Rank 1: configparameters: 03_ResNet-parallel.cntk:Test=[
MPI Rank 1:     action = "test"
MPI Rank 1:     modelPath = "./Output-ssgd/Models/03_ResNet"
MPI Rank 1:     minibatchSize = 256
MPI Rank 1:     reader = [
MPI Rank 1:         readerType = "ImageReader"
MPI Rank 1:         file = "/home/qiwye/git/cntk/Examples/Image/DataSets/CIFAR-10/cifar-10-batches-py/test_map.txt"
MPI Rank 1:         randomize = "none"
MPI Rank 1:         features = [
MPI Rank 1:             width = 32
MPI Rank 1:             height = 32
MPI Rank 1:             channels = 3
MPI Rank 1:             cropType = "Center"
MPI Rank 1:             sideRatio = 1
MPI Rank 1:             jitterType = "UniRatio"
MPI Rank 1:             interpolations = "linear"
MPI Rank 1:             meanFile = "/home/qiwye/git/cntk/Examples/Image/DataSets/CIFAR-10/cifar-10-batches-py/CIFAR-10_mean.xml"
MPI Rank 1:         ]
MPI Rank 1:         labels = [
MPI Rank 1:             labelDim = 10
MPI Rank 1:         ]
MPI Rank 1:     ]    
MPI Rank 1: ]
MPI Rank 1: 
MPI Rank 1: configparameters: 03_ResNet-parallel.cntk:traceLevel=1
MPI Rank 1: configparameters: 03_ResNet-parallel.cntk:Train=[
MPI Rank 1:     action = "train"
MPI Rank 1:     modelPath = "./Output-ssgd/Models/03_ResNet"
MPI Rank 1:      NDLNetworkBuilder = [
MPI Rank 1:         networkDescription = "./03_ResNet.ndl"
MPI Rank 1:     ]
MPI Rank 1:     SGD = [
MPI Rank 1:         epochSize = 0
MPI Rank 1:         minibatchSize = 256
MPI Rank 1:         learningRatesPerSample = 0.004*80:0.0004*40:0.00004
MPI Rank 1:         momentumPerMB = 0
MPI Rank 1:         maxEpochs = 12
MPI Rank 1:         L2RegWeight = 0.0001
MPI Rank 1:         dropoutRate = 0
MPI Rank 1:         perfTraceLevel = 0
MPI Rank 1:         firstMBsToShowResult = 1
MPI Rank 1:         numMBsToShowResult = 10
MPI Rank 1:         ParallelTrain = [
MPI Rank 1:             parallelizationMethod = DataParallelASGD
MPI Rank 1:             distributedMBReading = "true"
MPI Rank 1:             parallelizationStartEpoch = 1
MPI Rank 1:             DataParallelSGD = [
MPI Rank 1:                 gradientBits = 32
MPI Rank 1:                 useBufferedAsyncGradientAggregation = false
MPI Rank 1:             ]
MPI Rank 1:             ModelAveragingSGD = [
MPI Rank 1:                 blockSizePerWorker = 128
MPI Rank 1:             ]
MPI Rank 1:             DataParallelASGD = [
MPI Rank 1:                 syncPeriod = 128
MPI Rank 1:                 usePipeline = false
MPI Rank 1:             ]
MPI Rank 1:         ]
MPI Rank 1:     ]
MPI Rank 1:     reader = [
MPI Rank 1:         readerType = "ImageReader"
MPI Rank 1:         file = "/home/qiwye/git/cntk/Examples/Image/DataSets/CIFAR-10/train_map.txt"
MPI Rank 1:         randomize = "auto"
MPI Rank 1:         features = [
MPI Rank 1:             width = 32
MPI Rank 1:             height = 32
MPI Rank 1:             channels = 3
MPI Rank 1:             cropType = "RandomSide"
MPI Rank 1:             sideRatio = 0.8
MPI Rank 1:             jitterType = "UniRatio"
MPI Rank 1:             interpolations = "linear"
MPI Rank 1:             meanFile = "/home/qiwye/git/cntk/Examples/Image/DataSets/CIFAR-10/CIFAR-10_mean.xml"
MPI Rank 1:         ]
MPI Rank 1:         labels = [
MPI Rank 1:             labelDim = 10
MPI Rank 1:         ]
MPI Rank 1:     ]
MPI Rank 1:     cvReader = [
MPI Rank 1:         readerType = "ImageReader"
MPI Rank 1:         file = "/home/qiwye/git/cntk/Examples/Image/DataSets/CIFAR-10/test_map.txt"
MPI Rank 1:         randomize = "none"
MPI Rank 1:         features = [
MPI Rank 1:             width = 32
MPI Rank 1:             height = 32
MPI Rank 1:             channels = 3
MPI Rank 1:             cropType = "Center"
MPI Rank 1:             sideRatio = 1
MPI Rank 1:             jitterType = "UniRatio"
MPI Rank 1:             interpolations = "linear"
MPI Rank 1:             meanFile = "/home/qiwye/git/cntk/Examples/Image/DataSets/CIFAR-10/CIFAR-10_mean.xml"
MPI Rank 1:         ]
MPI Rank 1:         labels = [
MPI Rank 1:             labelDim = 10
MPI Rank 1:         ]
MPI Rank 1:     ]    
MPI Rank 1: ]
MPI Rank 1: 
MPI Rank 1: Commands: Train
MPI Rank 1: precision = "float"
MPI Rank 1: 
MPI Rank 1: ##############################################################################
MPI Rank 1: #                                                                            #
MPI Rank 1: # Train command (train action)                                               #
MPI Rank 1: #                                                                            #
MPI Rank 1: ##############################################################################
MPI Rank 1: 
MPI Rank 1: 
MPI Rank 1: Starting from checkpoint. Loading network from './Output-ssgd/Models/03_ResNet.1'.
MPI Rank 1: NDLBuilder Using GPU 0
MPI Rank 1: conv1.c.c.c: using cuDNN convolution engine for geometry: Input: 32 x 32 x 3, Output: 32 x 32 x 16, Kernel: 3 x 3 x 3, Map: 1 x 1 x 16, Stride: 1 x 1 x 3, Sharing: (1), AutoPad: (1), LowerPad: 0, UpperPad: 0.
MPI Rank 1: Using CNTK batch normalization engine.
MPI Rank 1: rn1_1.c1.c.c.c: using cuDNN convolution engine for geometry: Input: 32 x 32 x 16, Output: 32 x 32 x 16, Kernel: 3 x 3 x 16, Map: 1 x 1 x 16, Stride: 1 x 1 x 16, Sharing: (1), AutoPad: (1), LowerPad: 0, UpperPad: 0.
MPI Rank 1: Using CNTK batch normalization engine.
MPI Rank 1: rn1_1.c2.c.c: using cuDNN convolution engine for geometry: Input: 32 x 32 x 16, Output: 32 x 32 x 16, Kernel: 3 x 3 x 16, Map: 1 x 1 x 16, Stride: 1 x 1 x 16, Sharing: (1), AutoPad: (1), LowerPad: 0, UpperPad: 0.
MPI Rank 1: Using CNTK batch normalization engine.
MPI Rank 1: rn1_2.c1.c.c.c: using cuDNN convolution engine for geometry: Input: 32 x 32 x 16, Output: 32 x 32 x 16, Kernel: 3 x 3 x 16, Map: 1 x 1 x 16, Stride: 1 x 1 x 16, Sharing: (1), AutoPad: (1), LowerPad: 0, UpperPad: 0.
MPI Rank 1: Using CNTK batch normalization engine.
MPI Rank 1: rn1_2.c2.c.c: using cuDNN convolution engine for geometry: Input: 32 x 32 x 16, Output: 32 x 32 x 16, Kernel: 3 x 3 x 16, Map: 1 x 1 x 16, Stride: 1 x 1 x 16, Sharing: (1), AutoPad: (1), LowerPad: 0, UpperPad: 0.
MPI Rank 1: Using CNTK batch normalization engine.
MPI Rank 1: rn1_3.c1.c.c.c: using cuDNN convolution engine for geometry: Input: 32 x 32 x 16, Output: 32 x 32 x 16, Kernel: 3 x 3 x 16, Map: 1 x 1 x 16, Stride: 1 x 1 x 16, Sharing: (1), AutoPad: (1), LowerPad: 0, UpperPad: 0.
MPI Rank 1: Using CNTK batch normalization engine.
MPI Rank 1: rn1_3.c2.c.c: using cuDNN convolution engine for geometry: Input: 32 x 32 x 16, Output: 32 x 32 x 16, Kernel: 3 x 3 x 16, Map: 1 x 1 x 16, Stride: 1 x 1 x 16, Sharing: (1), AutoPad: (1), LowerPad: 0, UpperPad: 0.
MPI Rank 1: Using CNTK batch normalization engine.
MPI Rank 1: rn2_1.c1.c.c.c: using cuDNN convolution engine for geometry: Input: 32 x 32 x 16, Output: 16 x 16 x 32, Kernel: 3 x 3 x 16, Map: 1 x 1 x 32, Stride: 2 x 2 x 16, Sharing: (1), AutoPad: (1), LowerPad: 0, UpperPad: 0.
MPI Rank 1: Using CNTK batch normalization engine.
MPI Rank 1: rn2_1.c2.c.c: using cuDNN convolution engine for geometry: Input: 16 x 16 x 32, Output: 16 x 16 x 32, Kernel: 3 x 3 x 32, Map: 1 x 1 x 32, Stride: 1 x 1 x 32, Sharing: (1), AutoPad: (1), LowerPad: 0, UpperPad: 0.
MPI Rank 1: Using CNTK batch normalization engine.
MPI Rank 1: rn2_1.c_proj.c: using cuDNN convolution engine for geometry: Input: 32 x 32 x 16, Output: 16 x 16 x 32, Kernel: 1 x 1 x 16, Map: 1 x 1 x 32, Stride: 2 x 2 x 16, Sharing: (1), AutoPad: (0), LowerPad: 0, UpperPad: 0.
MPI Rank 1: Using CNTK batch normalization engine.
MPI Rank 1: rn2_2.c1.c.c.c: using cuDNN convolution engine for geometry: Input: 16 x 16 x 32, Output: 16 x 16 x 32, Kernel: 3 x 3 x 32, Map: 1 x 1 x 32, Stride: 1 x 1 x 32, Sharing: (1), AutoPad: (1), LowerPad: 0, UpperPad: 0.
MPI Rank 1: Using CNTK batch normalization engine.
MPI Rank 1: rn2_2.c2.c.c: using cuDNN convolution engine for geometry: Input: 16 x 16 x 32, Output: 16 x 16 x 32, Kernel: 3 x 3 x 32, Map: 1 x 1 x 32, Stride: 1 x 1 x 32, Sharing: (1), AutoPad: (1), LowerPad: 0, UpperPad: 0.
MPI Rank 1: Using CNTK batch normalization engine.
MPI Rank 1: rn2_3.c1.c.c.c: using cuDNN convolution engine for geometry: Input: 16 x 16 x 32, Output: 16 x 16 x 32, Kernel: 3 x 3 x 32, Map: 1 x 1 x 32, Stride: 1 x 1 x 32, Sharing: (1), AutoPad: (1), LowerPad: 0, UpperPad: 0.
MPI Rank 1: Using CNTK batch normalization engine.
MPI Rank 1: rn2_3.c2.c.c: using cuDNN convolution engine for geometry: Input: 16 x 16 x 32, Output: 16 x 16 x 32, Kernel: 3 x 3 x 32, Map: 1 x 1 x 32, Stride: 1 x 1 x 32, Sharing: (1), AutoPad: (1), LowerPad: 0, UpperPad: 0.
MPI Rank 1: Using CNTK batch normalization engine.
MPI Rank 1: rn3_1.c1.c.c.c: using cuDNN convolution engine for geometry: Input: 16 x 16 x 32, Output: 8 x 8 x 64, Kernel: 3 x 3 x 32, Map: 1 x 1 x 64, Stride: 2 x 2 x 32, Sharing: (1), AutoPad: (1), LowerPad: 0, UpperPad: 0.
MPI Rank 1: Using CNTK batch normalization engine.
MPI Rank 1: rn3_1.c2.c.c: using cuDNN convolution engine for geometry: Input: 8 x 8 x 64, Output: 8 x 8 x 64, Kernel: 3 x 3 x 64, Map: 1 x 1 x 64, Stride: 1 x 1 x 64, Sharing: (1), AutoPad: (1), LowerPad: 0, UpperPad: 0.
MPI Rank 1: Using CNTK batch normalization engine.
MPI Rank 1: rn3_1.c_proj.c: using cuDNN convolution engine for geometry: Input: 16 x 16 x 32, Output: 8 x 8 x 64, Kernel: 1 x 1 x 32, Map: 1 x 1 x 64, Stride: 2 x 2 x 32, Sharing: (1), AutoPad: (0), LowerPad: 0, UpperPad: 0.
MPI Rank 1: Using CNTK batch normalization engine.
MPI Rank 1: rn3_2.c1.c.c.c: using cuDNN convolution engine for geometry: Input: 8 x 8 x 64, Output: 8 x 8 x 64, Kernel: 3 x 3 x 64, Map: 1 x 1 x 64, Stride: 1 x 1 x 64, Sharing: (1), AutoPad: (1), LowerPad: 0, UpperPad: 0.
MPI Rank 1: Using CNTK batch normalization engine.
MPI Rank 1: rn3_2.c2.c.c: using cuDNN convolution engine for geometry: Input: 8 x 8 x 64, Output: 8 x 8 x 64, Kernel: 3 x 3 x 64, Map: 1 x 1 x 64, Stride: 1 x 1 x 64, Sharing: (1), AutoPad: (1), LowerPad: 0, UpperPad: 0.
MPI Rank 1: Using CNTK batch normalization engine.
MPI Rank 1: rn3_3.c1.c.c.c: using cuDNN convolution engine for geometry: Input: 8 x 8 x 64, Output: 8 x 8 x 64, Kernel: 3 x 3 x 64, Map: 1 x 1 x 64, Stride: 1 x 1 x 64, Sharing: (1), AutoPad: (1), LowerPad: 0, UpperPad: 0.
MPI Rank 1: Using CNTK batch normalization engine.
MPI Rank 1: rn3_3.c2.c.c: using cuDNN convolution engine for geometry: Input: 8 x 8 x 64, Output: 8 x 8 x 64, Kernel: 3 x 3 x 64, Map: 1 x 1 x 64, Stride: 1 x 1 x 64, Sharing: (1), AutoPad: (1), LowerPad: 0, UpperPad: 0.
MPI Rank 1: Using CNTK batch normalization engine.
MPI Rank 1: pool: using cuDNN convolution engine for geometry: Input: 8 x 8 x 64, Output: 1 x 1 x 64, Kernel: 8 x 8 x 1, Map: 1, Stride: 1 x 1 x 1, Sharing: (1), AutoPad: (0), LowerPad: 0, UpperPad: 0.
MPI Rank 1: 
MPI Rank 1: Model has 205 nodes. Using GPU 0.
MPI Rank 1: 
MPI Rank 1: Training criterion:   CE = CrossEntropyWithSoftmax
MPI Rank 1: Evaluation criterion: Err = ClassificationError
MPI Rank 1: 
MPI Rank 1: Training 269914 parameters in 63 out of 63 parameter tensors and 137 nodes with gradient:
MPI Rank 1: 
MPI Rank 1:     Node 'OutputNodes.W' (LearnableParameter operation) : [10 x 1 x 1 x 64]
MPI Rank 1:     Node 'OutputNodes.b' (LearnableParameter operation) : [10]
MPI Rank 1:     Node 'conv1.c.W' (LearnableParameter operation) : [16 x 27]
MPI Rank 1:     Node 'conv1.c.c.b' (LearnableParameter operation) : [16 x 1]
MPI Rank 1:     Node 'conv1.c.c.sc' (LearnableParameter operation) : [16 x 1]
MPI Rank 1:     Node 'rn1_1.c1.c.W' (LearnableParameter operation) : [16 x 144]
MPI Rank 1:     Node 'rn1_1.c1.c.c.b' (LearnableParameter operation) : [16 x 1]
MPI Rank 1:     Node 'rn1_1.c1.c.c.sc' (LearnableParameter operation) : [16 x 1]
MPI Rank 1:     Node 'rn1_1.c2.W' (LearnableParameter operation) : [16 x 144]
MPI Rank 1:     Node 'rn1_1.c2.c.b' (LearnableParameter operation) : [16 x 1]
MPI Rank 1:     Node 'rn1_1.c2.c.sc' (LearnableParameter operation) : [16 x 1]
MPI Rank 1:     Node 'rn1_2.c1.c.W' (LearnableParameter operation) : [16 x 144]
MPI Rank 1:     Node 'rn1_2.c1.c.c.b' (LearnableParameter operation) : [16 x 1]
MPI Rank 1:     Node 'rn1_2.c1.c.c.sc' (LearnableParameter operation) : [16 x 1]
MPI Rank 1:     Node 'rn1_2.c2.W' (LearnableParameter operation) : [16 x 144]
MPI Rank 1:     Node 'rn1_2.c2.c.b' (LearnableParameter operation) : [16 x 1]
MPI Rank 1:     Node 'rn1_2.c2.c.sc' (LearnableParameter operation) : [16 x 1]
MPI Rank 1:     Node 'rn1_3.c1.c.W' (LearnableParameter operation) : [16 x 144]
MPI Rank 1:     Node 'rn1_3.c1.c.c.b' (LearnableParameter operation) : [16 x 1]
MPI Rank 1:     Node 'rn1_3.c1.c.c.sc' (LearnableParameter operation) : [16 x 1]
MPI Rank 1:     Node 'rn1_3.c2.W' (LearnableParameter operation) : [16 x 144]
MPI Rank 1:     Node 'rn1_3.c2.c.b' (LearnableParameter operation) : [16 x 1]
MPI Rank 1:     Node 'rn1_3.c2.c.sc' (LearnableParameter operation) : [16 x 1]
MPI Rank 1:     Node 'rn2_1.c1.c.W' (LearnableParameter operation) : [32 x 144]
MPI Rank 1:     Node 'rn2_1.c1.c.c.b' (LearnableParameter operation) : [32 x 1]
MPI Rank 1:     Node 'rn2_1.c1.c.c.sc' (LearnableParameter operation) : [32 x 1]
MPI Rank 1:     Node 'rn2_1.c2.W' (LearnableParameter operation) : [32 x 288]
MPI Rank 1:     Node 'rn2_1.c2.c.b' (LearnableParameter operation) : [32 x 1]
MPI Rank 1:     Node 'rn2_1.c2.c.sc' (LearnableParameter operation) : [32 x 1]
MPI Rank 1:     Node 'rn2_1.c_proj.b' (LearnableParameter operation) : [32 x 1]
MPI Rank 1:     Node 'rn2_1.c_proj.sc' (LearnableParameter operation) : [32 x 1]
MPI Rank 1:     Node 'rn2_2.c1.c.W' (LearnableParameter operation) : [32 x 288]
MPI Rank 1:     Node 'rn2_2.c1.c.c.b' (LearnableParameter operation) : [32 x 1]
MPI Rank 1:     Node 'rn2_2.c1.c.c.sc' (LearnableParameter operation) : [32 x 1]
MPI Rank 1:     Node 'rn2_2.c2.W' (LearnableParameter operation) : [32 x 288]
MPI Rank 1:     Node 'rn2_2.c2.c.b' (LearnableParameter operation) : [32 x 1]
MPI Rank 1:     Node 'rn2_2.c2.c.sc' (LearnableParameter operation) : [32 x 1]
MPI Rank 1:     Node 'rn2_3.c1.c.W' (LearnableParameter operation) : [32 x 288]
MPI Rank 1:     Node 'rn2_3.c1.c.c.b' (LearnableParameter operation) : [32 x 1]
MPI Rank 1:     Node 'rn2_3.c1.c.c.sc' (LearnableParameter operation) : [32 x 1]
MPI Rank 1:     Node 'rn2_3.c2.W' (LearnableParameter operation) : [32 x 288]
MPI Rank 1:     Node 'rn2_3.c2.c.b' (LearnableParameter operation) : [32 x 1]
MPI Rank 1:     Node 'rn2_3.c2.c.sc' (LearnableParameter operation) : [32 x 1]
MPI Rank 1:     Node 'rn3_1.c1.c.W' (LearnableParameter operation) : [64 x 288]
MPI Rank 1:     Node 'rn3_1.c1.c.c.b' (LearnableParameter operation) : [64 x 1]
MPI Rank 1:     Node 'rn3_1.c1.c.c.sc' (LearnableParameter operation) : [64 x 1]
MPI Rank 1:     Node 'rn3_1.c2.W' (LearnableParameter operation) : [64 x 576]
MPI Rank 1:     Node 'rn3_1.c2.c.b' (LearnableParameter operation) : [64 x 1]
MPI Rank 1:     Node 'rn3_1.c2.c.sc' (LearnableParameter operation) : [64 x 1]
MPI Rank 1:     Node 'rn3_1.c_proj.b' (LearnableParameter operation) : [64 x 1]
MPI Rank 1:     Node 'rn3_1.c_proj.sc' (LearnableParameter operation) : [64 x 1]
MPI Rank 1:     Node 'rn3_2.c1.c.W' (LearnableParameter operation) : [64 x 576]
MPI Rank 1:     Node 'rn3_2.c1.c.c.b' (LearnableParameter operation) : [64 x 1]
MPI Rank 1:     Node 'rn3_2.c1.c.c.sc' (LearnableParameter operation) : [64 x 1]
MPI Rank 1:     Node 'rn3_2.c2.W' (LearnableParameter operation) : [64 x 576]
MPI Rank 1:     Node 'rn3_2.c2.c.b' (LearnableParameter operation) : [64 x 1]
MPI Rank 1:     Node 'rn3_2.c2.c.sc' (LearnableParameter operation) : [64 x 1]
MPI Rank 1:     Node 'rn3_3.c1.c.W' (LearnableParameter operation) : [64 x 576]
MPI Rank 1:     Node 'rn3_3.c1.c.c.b' (LearnableParameter operation) : [64 x 1]
MPI Rank 1:     Node 'rn3_3.c1.c.c.sc' (LearnableParameter operation) : [64 x 1]
MPI Rank 1:     Node 'rn3_3.c2.W' (LearnableParameter operation) : [64 x 576]
MPI Rank 1:     Node 'rn3_3.c2.c.b' (LearnableParameter operation) : [64 x 1]
MPI Rank 1:     Node 'rn3_3.c2.c.sc' (LearnableParameter operation) : [64 x 1]
MPI Rank 1: 
MPI Rank 1: No PreCompute nodes found, or all already computed. Skipping pre-computation step.
MPI Rank 1: Warning: Checkpoint file is missing. Parameter-learning state (such as momentum) will be reset.
MPI Rank 1: [INFO] [2017-04-10 01:24:15] multiverso MPI-Net is initialized under MPI_THREAD_SERIALIZED mode.
MPI Rank 1: [INFO] [2017-04-10 01:24:15] Create a async server
MPI Rank 1: [INFO] [2017-04-10 01:24:15] Rank 1: Multiverso start successfully
MPI Rank 1: multiverso initial model loaded.
MPI Rank 1: 
MPI Rank 1: Starting Epoch 2: learning rate per sample = 0.004000  effective momentum = 0.000000  momentum as time constant = 0.0 samples
MPI Rank 1: 
MPI Rank 1: Starting minibatch loop, DataParallelASGD training (myRank = 1, numNodes = 2, SamplesSyncToServer = 128), Distributed Evaluation is DISABLED, distributed reading is ENABLED.
MPI Rank 1:  Epoch[ 2 of 12]-Minibatch[   1-   1, 0.000000000000006%]: CE = 1.84901845 * 128; Err = 0.67968750 * 128; time = 3.9359s; samplesPerSecond = 32.5
MPI Rank 1:  Epoch[ 2 of 12]-Minibatch[   2-  10, 0.00000000000006%]: CE = 1.87261726 * 1152; Err = 0.71440972 * 1152; time = 3.4636s; samplesPerSecond = 332.6
MPI Rank 1:  Epoch[ 2 of 12]-Minibatch[  11-  20, 0.00000000000011%]: CE = 1.74392185 * 1280; Err = 0.65468750 * 1280; time = 1.9161s; samplesPerSecond = 668.0
MPI Rank 1:  Epoch[ 2 of 12]-Minibatch[  21-  30, 0.00000000000017%]: CE = 1.83036194 * 1280; Err = 0.68203125 * 1280; time = 1.9115s; samplesPerSecond = 669.6
MPI Rank 1:  Epoch[ 2 of 12]-Minibatch[  31-  40, 0.00000000000022%]: CE = 1.77370987 * 1280; Err = 0.65468750 * 1280; time = 1.7240s; samplesPerSecond = 742.5
MPI Rank 1:  Epoch[ 2 of 12]-Minibatch[  41-  50, 0.00000000000028%]: CE = 1.73785858 * 1280; Err = 0.65937500 * 1280; time = 1.8035s; samplesPerSecond = 709.7
MPI Rank 1:  Epoch[ 2 of 12]-Minibatch[  51-  60, 0.00000000000033%]: CE = 1.70149384 * 1280; Err = 0.62812500 * 1280; time = 1.9996s; samplesPerSecond = 640.1
MPI Rank 1:  Epoch[ 2 of 12]-Minibatch[  61-  70, 0.00000000000039%]: CE = 1.68090057 * 1280; Err = 0.64062500 * 1280; time = 1.8197s; samplesPerSecond = 703.4
MPI Rank 1:  Epoch[ 2 of 12]-Minibatch[  71-  80, 0.00000000000044%]: CE = 1.73368683 * 1280; Err = 0.64140625 * 1280; time = 1.7147s; samplesPerSecond = 746.5
MPI Rank 1:  Epoch[ 2 of 12]-Minibatch[  81-  90, 0.00000000000050%]: CE = 1.62542419 * 1280; Err = 0.60390625 * 1280; time = 1.7263s; samplesPerSecond = 741.5
MPI Rank 1:  Epoch[ 2 of 12]-Minibatch[  91- 100, 0.00000000000056%]: CE = 1.59853516 * 1280; Err = 0.58984375 * 1280; time = 2.0162s; samplesPerSecond = 634.9
MPI Rank 1:  Epoch[ 2 of 12]-Minibatch[ 101- 110, 0.00000000000061%]: CE = 1.57299805 * 1280; Err = 0.59453125 * 1280; time = 1.9583s; samplesPerSecond = 653.6
MPI Rank 1:  Epoch[ 2 of 12]-Minibatch[ 111- 120, 0.00000000000067%]: CE = 1.68873138 * 1280; Err = 0.63437500 * 1280; time = 1.8918s; samplesPerSecond = 676.6
MPI Rank 1:  Epoch[ 2 of 12]-Minibatch[ 121- 130, 0.00000000000072%]: CE = 1.59287415 * 1280; Err = 0.59687500 * 1280; time = 1.8923s; samplesPerSecond = 676.4
MPI Rank 1:  Epoch[ 2 of 12]-Minibatch[ 131- 140, 0.00000000000078%]: CE = 1.55739594 * 1280; Err = 0.59062500 * 1280; time = 1.8330s; samplesPerSecond = 698.3
MPI Rank 1:  Epoch[ 2 of 12]-Minibatch[ 141- 150, 0.00000000000083%]: CE = 1.56488190 * 1280; Err = 0.57812500 * 1280; time = 2.0229s; samplesPerSecond = 632.7
MPI Rank 1:  Epoch[ 2 of 12]-Minibatch[ 151- 160, 0.00000000000089%]: CE = 1.65792236 * 1280; Err = 0.61718750 * 1280; time = 1.8766s; samplesPerSecond = 682.1
MPI Rank 1:  Epoch[ 2 of 12]-Minibatch[ 161- 170, 0.00000000000094%]: CE = 1.56865845 * 1280; Err = 0.59375000 * 1280; time = 1.8488s; samplesPerSecond = 692.3
MPI Rank 1:  Epoch[ 2 of 12]-Minibatch[ 171- 180, 0.00000000000100%]: CE = 1.56352844 * 1280; Err = 0.58906250 * 1280; time = 1.8947s; samplesPerSecond = 675.6
MPI Rank 1:  Epoch[ 2 of 12]-Minibatch[ 181- 190, 0.00000000000105%]: CE = 1.47549133 * 1280; Err = 0.54765625 * 1280; time = 1.4049s; samplesPerSecond = 911.1
MPI Rank 1: Finished Epoch[ 2 of 12]: [Training] CE = 1.65236516 * 25000; Err = 0.61868000 * 25000; totalSamplesSeen = 25000; learningRatePerSample = 0.0040000002; epochTime=41.4788s
MPI Rank 1: Final Results: Minibatch[1-79]: CE = 1.52545216 * 10000; perplexity = 4.59722179; Err = 0.58250000 * 10000
MPI Rank 1: Finished Epoch[ 2 of 12]: [Validate] CE = 1.52545216 * 10000; Err = 0.58250000 * 10000
MPI Rank 1: 
MPI Rank 1: Starting Epoch 3: learning rate per sample = 0.004000  effective momentum = 0.000000  momentum as time constant = 0.0 samples
MPI Rank 1: 
MPI Rank 1: Starting minibatch loop, DataParallelASGD training (myRank = 1, numNodes = 2, SamplesSyncToServer = 128), Distributed Evaluation is DISABLED, distributed reading is ENABLED.
MPI Rank 1:  Epoch[ 3 of 12]-Minibatch[   1-   1, 0.000000000000006%]: CE = 1.63857090 * 128; Err = 0.60156250 * 128; time = 0.3044s; samplesPerSecond = 420.5
MPI Rank 1:  Epoch[ 3 of 12]-Minibatch[   2-  10, 0.00000000000006%]: CE = 1.55756313 * 1152; Err = 0.57378472 * 1152; time = 1.6162s; samplesPerSecond = 712.8
MPI Rank 1:  Epoch[ 3 of 12]-Minibatch[  11-  20, 0.00000000000011%]: CE = 1.48348885 * 1280; Err = 0.54531250 * 1280; time = 1.8487s; samplesPerSecond = 692.4
MPI Rank 1:  Epoch[ 3 of 12]-Minibatch[  21-  30, 0.00000000000017%]: CE = 1.53348732 * 1280; Err = 0.56093750 * 1280; time = 1.7862s; samplesPerSecond = 716.6
MPI Rank 1:  Epoch[ 3 of 12]-Minibatch[  31-  40, 0.00000000000022%]: CE = 1.52516861 * 1280; Err = 0.57265625 * 1280; time = 1.7687s; samplesPerSecond = 723.7
MPI Rank 1:  Epoch[ 3 of 12]-Minibatch[  41-  50, 0.00000000000028%]: CE = 1.52053986 * 1280; Err = 0.55859375 * 1280; time = 1.9428s; samplesPerSecond = 658.8
MPI Rank 1:  Epoch[ 3 of 12]-Minibatch[  51-  60, 0.00000000000033%]: CE = 1.47622299 * 1280; Err = 0.54765625 * 1280; time = 1.7914s; samplesPerSecond = 714.5
MPI Rank 1:  Epoch[ 3 of 12]-Minibatch[  61-  70, 0.00000000000039%]: CE = 1.44300232 * 1280; Err = 0.53125000 * 1280; time = 1.8046s; samplesPerSecond = 709.3
MPI Rank 1:  Epoch[ 3 of 12]-Minibatch[  71-  80, 0.00000000000044%]: CE = 1.41773148 * 1280; Err = 0.53203125 * 1280; time = 1.7211s; samplesPerSecond = 743.7
MPI Rank 1:  Epoch[ 3 of 12]-Minibatch[  81-  90, 0.00000000000050%]: CE = 1.47147751 * 1280; Err = 0.56015625 * 1280; time = 1.7879s; samplesPerSecond = 715.9
MPI Rank 1:  Epoch[ 3 of 12]-Minibatch[  91- 100, 0.00000000000056%]: CE = 1.38301849 * 1280; Err = 0.52031250 * 1280; time = 1.8318s; samplesPerSecond = 698.8
MPI Rank 1:  Epoch[ 3 of 12]-Minibatch[ 101- 110, 0.00000000000061%]: CE = 1.40034943 * 1280; Err = 0.51484375 * 1280; time = 1.9065s; samplesPerSecond = 671.4
MPI Rank 1:  Epoch[ 3 of 12]-Minibatch[ 111- 120, 0.00000000000067%]: CE = 1.35770569 * 1280; Err = 0.51484375 * 1280; time = 1.8270s; samplesPerSecond = 700.6
MPI Rank 1:  Epoch[ 3 of 12]-Minibatch[ 121- 130, 0.00000000000072%]: CE = 1.36131134 * 1280; Err = 0.49453125 * 1280; time = 1.8119s; samplesPerSecond = 706.4
MPI Rank 1:  Epoch[ 3 of 12]-Minibatch[ 131- 140, 0.00000000000078%]: CE = 1.34538879 * 1280; Err = 0.49062500 * 1280; time = 1.7956s; samplesPerSecond = 712.9
MPI Rank 1:  Epoch[ 3 of 12]-Minibatch[ 141- 150, 0.00000000000083%]: CE = 1.43016510 * 1280; Err = 0.53828125 * 1280; time = 1.9363s; samplesPerSecond = 661.0
MPI Rank 1:  Epoch[ 3 of 12]-Minibatch[ 151- 160, 0.00000000000089%]: CE = 1.28576813 * 1280; Err = 0.46484375 * 1280; time = 1.8028s; samplesPerSecond = 710.0
MPI Rank 1:  Epoch[ 3 of 12]-Minibatch[ 161- 170, 0.00000000000094%]: CE = 1.34371033 * 1280; Err = 0.48984375 * 1280; time = 1.6955s; samplesPerSecond = 754.9
MPI Rank 1:  Epoch[ 3 of 12]-Minibatch[ 171- 180, 0.00000000000100%]: CE = 1.38341827 * 1280; Err = 0.49531250 * 1280; time = 1.8197s; samplesPerSecond = 703.4
MPI Rank 1:  Epoch[ 3 of 12]-Minibatch[ 181- 190, 0.00000000000105%]: CE = 1.34341736 * 1280; Err = 0.50156250 * 1280; time = 1.8006s; samplesPerSecond = 710.9
MPI Rank 1: Finished Epoch[ 3 of 12]: [Training] CE = 1.41974656 * 25000; Err = 0.52516000 * 25000; totalSamplesSeen = 50000; learningRatePerSample = 0.0040000002; epochTime=35.4316s
MPI Rank 1: Final Results: Minibatch[1-79]: CE = 1.73691697 * 10000; perplexity = 5.67980540; Err = 0.54430000 * 10000
MPI Rank 1: Finished Epoch[ 3 of 12]: [Validate] CE = 1.73691697 * 10000; Err = 0.54430000 * 10000
MPI Rank 1: 
MPI Rank 1: Starting Epoch 4: learning rate per sample = 0.004000  effective momentum = 0.000000  momentum as time constant = 0.0 samples
MPI Rank 1: 
MPI Rank 1: Starting minibatch loop, DataParallelASGD training (myRank = 1, numNodes = 2, SamplesSyncToServer = 128), Distributed Evaluation is DISABLED, distributed reading is ENABLED.
MPI Rank 1:  Epoch[ 4 of 12]-Minibatch[   1-   1, 0.000000000000006%]: CE = 1.22537661 * 128; Err = 0.50000000 * 128; time = 0.2309s; samplesPerSecond = 554.3
MPI Rank 1:  Epoch[ 4 of 12]-Minibatch[   2-  10, 0.00000000000006%]: CE = 1.31144476 * 1152; Err = 0.46875000 * 1152; time = 1.7156s; samplesPerSecond = 671.5
MPI Rank 1:  Epoch[ 4 of 12]-Minibatch[  11-  20, 0.00000000000011%]: CE = 1.24825573 * 1280; Err = 0.46328125 * 1280; time = 1.9395s; samplesPerSecond = 660.0
MPI Rank 1:  Epoch[ 4 of 12]-Minibatch[  21-  30, 0.00000000000017%]: CE = 1.30184898 * 1280; Err = 0.47421875 * 1280; time = 1.7254s; samplesPerSecond = 741.8
MPI Rank 1:  Epoch[ 4 of 12]-Minibatch[  31-  40, 0.00000000000022%]: CE = 1.21274414 * 1280; Err = 0.43515625 * 1280; time = 1.9192s; samplesPerSecond = 666.9
MPI Rank 1:  Epoch[ 4 of 12]-Minibatch[  41-  50, 0.00000000000028%]: CE = 1.16527786 * 1280; Err = 0.43984375 * 1280; time = 2.0764s; samplesPerSecond = 616.5
MPI Rank 1:  Epoch[ 4 of 12]-Minibatch[  51-  60, 0.00000000000033%]: CE = 1.22623978 * 1280; Err = 0.44921875 * 1280; time = 1.7358s; samplesPerSecond = 737.4
MPI Rank 1:  Epoch[ 4 of 12]-Minibatch[  61-  70, 0.00000000000039%]: CE = 1.23687744 * 1280; Err = 0.46093750 * 1280; time = 2.0685s; samplesPerSecond = 618.8
MPI Rank 1:  Epoch[ 4 of 12]-Minibatch[  71-  80, 0.00000000000044%]: CE = 1.19844742 * 1280; Err = 0.41796875 * 1280; time = 1.8255s; samplesPerSecond = 701.2
MPI Rank 1:  Epoch[ 4 of 12]-Minibatch[  81-  90, 0.00000000000050%]: CE = 1.13706818 * 1280; Err = 0.42578125 * 1280; time = 1.8149s; samplesPerSecond = 705.3
MPI Rank 1:  Epoch[ 4 of 12]-Minibatch[  91- 100, 0.00000000000056%]: CE = 1.19731293 * 1280; Err = 0.43593750 * 1280; time = 1.7423s; samplesPerSecond = 734.7
MPI Rank 1:  Epoch[ 4 of 12]-Minibatch[ 101- 110, 0.00000000000061%]: CE = 1.15672302 * 1280; Err = 0.42500000 * 1280; time = 2.0199s; samplesPerSecond = 633.7
MPI Rank 1:  Epoch[ 4 of 12]-Minibatch[ 111- 120, 0.00000000000067%]: CE = 1.12729034 * 1280; Err = 0.39765625 * 1280; time = 1.8398s; samplesPerSecond = 695.7
MPI Rank 1:  Epoch[ 4 of 12]-Minibatch[ 121- 130, 0.00000000000072%]: CE = 1.14964905 * 1280; Err = 0.40625000 * 1280; time = 1.6855s; samplesPerSecond = 759.4
MPI Rank 1:  Epoch[ 4 of 12]-Minibatch[ 131- 140, 0.00000000000078%]: CE = 1.16299744 * 1280; Err = 0.41562500 * 1280; time = 1.9342s; samplesPerSecond = 661.8
MPI Rank 1:  Epoch[ 4 of 12]-Minibatch[ 141- 150, 0.00000000000083%]: CE = 1.09338837 * 1280; Err = 0.37734375 * 1280; time = 1.7454s; samplesPerSecond = 733.4
MPI Rank 1:  Epoch[ 4 of 12]-Minibatch[ 151- 160, 0.00000000000089%]: CE = 1.10862427 * 1280; Err = 0.41718750 * 1280; time = 1.9472s; samplesPerSecond = 657.3
MPI Rank 1:  Epoch[ 4 of 12]-Minibatch[ 161- 170, 0.00000000000094%]: CE = 1.10857086 * 1280; Err = 0.39531250 * 1280; time = 1.7849s; samplesPerSecond = 717.1
MPI Rank 1:  Epoch[ 4 of 12]-Minibatch[ 171- 180, 0.00000000000100%]: CE = 1.10548248 * 1280; Err = 0.39921875 * 1280; time = 1.8015s; samplesPerSecond = 710.5
MPI Rank 1:  Epoch[ 4 of 12]-Minibatch[ 181- 190, 0.00000000000105%]: CE = 1.12614288 * 1280; Err = 0.40546875 * 1280; time = 1.5621s; samplesPerSecond = 819.4
MPI Rank 1: Finished Epoch[ 4 of 12]: [Training] CE = 1.17353203 * 25000; Err = 0.42584000 * 25000; totalSamplesSeen = 75000; learningRatePerSample = 0.0040000002; epochTime=35.9541s
MPI Rank 1: Final Results: Minibatch[1-79]: CE = 1.11999686 * 10000; perplexity = 3.06484456; Err = 0.40750000 * 10000
MPI Rank 1: Finished Epoch[ 4 of 12]: [Validate] CE = 1.11999686 * 10000; Err = 0.40750000 * 10000
MPI Rank 1: 
MPI Rank 1: Starting Epoch 5: learning rate per sample = 0.004000  effective momentum = 0.000000  momentum as time constant = 0.0 samples
MPI Rank 1: 
MPI Rank 1: Starting minibatch loop, DataParallelASGD training (myRank = 1, numNodes = 2, SamplesSyncToServer = 128), Distributed Evaluation is DISABLED, distributed reading is ENABLED.
MPI Rank 1:  Epoch[ 5 of 12]-Minibatch[   1-   1, 0.000000000000006%]: CE = 1.02669287 * 128; Err = 0.37500000 * 128; time = 0.2196s; samplesPerSecond = 582.7
MPI Rank 1:  Epoch[ 5 of 12]-Minibatch[   2-  10, 0.00000000000006%]: CE = 1.09283008 * 1152; Err = 0.38802083 * 1152; time = 1.5558s; samplesPerSecond = 740.4
MPI Rank 1:  Epoch[ 5 of 12]-Minibatch[  11-  20, 0.00000000000011%]: CE = 1.03717003 * 1280; Err = 0.37656250 * 1280; time = 1.6252s; samplesPerSecond = 787.6
MPI Rank 1:  Epoch[ 5 of 12]-Minibatch[  21-  30, 0.00000000000017%]: CE = 1.07710266 * 1280; Err = 0.40625000 * 1280; time = 2.0165s; samplesPerSecond = 634.8
MPI Rank 1:  Epoch[ 5 of 12]-Minibatch[  31-  40, 0.00000000000022%]: CE = 1.01594162 * 1280; Err = 0.35234375 * 1280; time = 1.7093s; samplesPerSecond = 748.9
MPI Rank 1:  Epoch[ 5 of 12]-Minibatch[  41-  50, 0.00000000000028%]: CE = 1.05669289 * 1280; Err = 0.38906250 * 1280; time = 1.7230s; samplesPerSecond = 742.9
MPI Rank 1:  Epoch[ 5 of 12]-Minibatch[  51-  60, 0.00000000000033%]: CE = 1.03651352 * 1280; Err = 0.36562500 * 1280; time = 1.7957s; samplesPerSecond = 712.8
MPI Rank 1:  Epoch[ 5 of 12]-Minibatch[  61-  70, 0.00000000000039%]: CE = 1.02820930 * 1280; Err = 0.37031250 * 1280; time = 1.6636s; samplesPerSecond = 769.4
MPI Rank 1:  Epoch[ 5 of 12]-Minibatch[  71-  80, 0.00000000000044%]: CE = 0.99804153 * 1280; Err = 0.35937500 * 1280; time = 1.8720s; samplesPerSecond = 683.8
MPI Rank 1:  Epoch[ 5 of 12]-Minibatch[  81-  90, 0.00000000000050%]: CE = 1.00480423 * 1280; Err = 0.35703125 * 1280; time = 1.6311s; samplesPerSecond = 784.7
MPI Rank 1:  Epoch[ 5 of 12]-Minibatch[  91- 100, 0.00000000000056%]: CE = 1.03187180 * 1280; Err = 0.37031250 * 1280; time = 1.6468s; samplesPerSecond = 777.2
MPI Rank 1:  Epoch[ 5 of 12]-Minibatch[ 101- 110, 0.00000000000061%]: CE = 1.05509872 * 1280; Err = 0.38593750 * 1280; time = 1.7344s; samplesPerSecond = 738.0
MPI Rank 1:  Epoch[ 5 of 12]-Minibatch[ 111- 120, 0.00000000000067%]: CE = 0.99173965 * 1280; Err = 0.35078125 * 1280; time = 1.7446s; samplesPerSecond = 733.7
MPI Rank 1:  Epoch[ 5 of 12]-Minibatch[ 121- 130, 0.00000000000072%]: CE = 0.95438614 * 1280; Err = 0.34218750 * 1280; time = 1.8137s; samplesPerSecond = 705.7
MPI Rank 1:  Epoch[ 5 of 12]-Minibatch[ 131- 140, 0.00000000000078%]: CE = 0.94732971 * 1280; Err = 0.34375000 * 1280; time = 1.6308s; samplesPerSecond = 784.9
MPI Rank 1:  Epoch[ 5 of 12]-Minibatch[ 141- 150, 0.00000000000083%]: CE = 0.96456604 * 1280; Err = 0.35546875 * 1280; time = 1.8192s; samplesPerSecond = 703.6
MPI Rank 1:  Epoch[ 5 of 12]-Minibatch[ 151- 160, 0.00000000000089%]: CE = 1.03544006 * 1280; Err = 0.37031250 * 1280; time = 1.8115s; samplesPerSecond = 706.6
MPI Rank 1:  Epoch[ 5 of 12]-Minibatch[ 161- 170, 0.00000000000094%]: CE = 0.94266815 * 1280; Err = 0.32890625 * 1280; time = 1.8981s; samplesPerSecond = 674.4
MPI Rank 1:  Epoch[ 5 of 12]-Minibatch[ 171- 180, 0.00000000000100%]: CE = 0.86698151 * 1280; Err = 0.30078125 * 1280; time = 1.6157s; samplesPerSecond = 792.3
MPI Rank 1:  Epoch[ 5 of 12]-Minibatch[ 181- 190, 0.00000000000105%]: CE = 0.94117126 * 1280; Err = 0.32968750 * 1280; time = 1.7231s; samplesPerSecond = 742.8
MPI Rank 1: Finished Epoch[ 5 of 12]: [Training] CE = 1.00156000 * 25000; Err = 0.35956000 * 25000; totalSamplesSeen = 100000; learningRatePerSample = 0.0040000002; epochTime=34.2878s
MPI Rank 1: Final Results: Minibatch[1-79]: CE = 1.02531530 * 10000; perplexity = 2.78797438; Err = 0.35420000 * 10000
MPI Rank 1: Finished Epoch[ 5 of 12]: [Validate] CE = 1.02531530 * 10000; Err = 0.35420000 * 10000
MPI Rank 1: 
MPI Rank 1: Starting Epoch 6: learning rate per sample = 0.004000  effective momentum = 0.000000  momentum as time constant = 0.0 samples
MPI Rank 1: 
MPI Rank 1: Starting minibatch loop, DataParallelASGD training (myRank = 1, numNodes = 2, SamplesSyncToServer = 128), Distributed Evaluation is DISABLED, distributed reading is ENABLED.
MPI Rank 1:  Epoch[ 6 of 12]-Minibatch[   1-   1, 0.000000000000006%]: CE = 0.88546169 * 128; Err = 0.34375000 * 128; time = 0.2090s; samplesPerSecond = 612.3
MPI Rank 1:  Epoch[ 6 of 12]-Minibatch[   2-  10, 0.00000000000006%]: CE = 0.95239577 * 1152; Err = 0.34288194 * 1152; time = 1.7317s; samplesPerSecond = 665.3
MPI Rank 1:  Epoch[ 6 of 12]-Minibatch[  11-  20, 0.00000000000011%]: CE = 0.92279720 * 1280; Err = 0.33828125 * 1280; time = 1.8810s; samplesPerSecond = 680.5
MPI Rank 1:  Epoch[ 6 of 12]-Minibatch[  21-  30, 0.00000000000017%]: CE = 0.87004700 * 1280; Err = 0.31328125 * 1280; time = 1.9047s; samplesPerSecond = 672.0
MPI Rank 1:  Epoch[ 6 of 12]-Minibatch[  31-  40, 0.00000000000022%]: CE = 0.87928104 * 1280; Err = 0.31406250 * 1280; time = 1.9080s; samplesPerSecond = 670.9
MPI Rank 1:  Epoch[ 6 of 12]-Minibatch[  41-  50, 0.00000000000028%]: CE = 0.91237717 * 1280; Err = 0.33515625 * 1280; time = 1.8714s; samplesPerSecond = 684.0
MPI Rank 1:  Epoch[ 6 of 12]-Minibatch[  51-  60, 0.00000000000033%]: CE = 0.88594666 * 1280; Err = 0.32343750 * 1280; time = 1.8108s; samplesPerSecond = 706.9
MPI Rank 1:  Epoch[ 6 of 12]-Minibatch[  61-  70, 0.00000000000039%]: CE = 0.86997528 * 1280; Err = 0.31640625 * 1280; time = 1.5296s; samplesPerSecond = 836.8
MPI Rank 1:  Epoch[ 6 of 12]-Minibatch[  71-  80, 0.00000000000044%]: CE = 0.84205017 * 1280; Err = 0.30078125 * 1280; time = 1.8363s; samplesPerSecond = 697.0
MPI Rank 1:  Epoch[ 6 of 12]-Minibatch[  81-  90, 0.00000000000050%]: CE = 0.93592606 * 1280; Err = 0.33437500 * 1280; time = 1.8191s; samplesPerSecond = 703.6
MPI Rank 1:  Epoch[ 6 of 12]-Minibatch[  91- 100, 0.00000000000056%]: CE = 0.91535950 * 1280; Err = 0.30625000 * 1280; time = 1.8160s; samplesPerSecond = 704.9
MPI Rank 1:  Epoch[ 6 of 12]-Minibatch[ 101- 110, 0.00000000000061%]: CE = 0.86422348 * 1280; Err = 0.30000000 * 1280; time = 1.8363s; samplesPerSecond = 697.0
MPI Rank 1:  Epoch[ 6 of 12]-Minibatch[ 111- 120, 0.00000000000067%]: CE = 0.89147797 * 1280; Err = 0.31171875 * 1280; time = 1.8237s; samplesPerSecond = 701.9
MPI Rank 1:  Epoch[ 6 of 12]-Minibatch[ 121- 130, 0.00000000000072%]: CE = 0.84267578 * 1280; Err = 0.30078125 * 1280; time = 1.8927s; samplesPerSecond = 676.3
MPI Rank 1:  Epoch[ 6 of 12]-Minibatch[ 131- 140, 0.00000000000078%]: CE = 0.86245956 * 1280; Err = 0.30312500 * 1280; time = 1.8208s; samplesPerSecond = 703.0
MPI Rank 1:  Epoch[ 6 of 12]-Minibatch[ 141- 150, 0.00000000000083%]: CE = 0.86605301 * 1280; Err = 0.30312500 * 1280; time = 1.9106s; samplesPerSecond = 670.0
MPI Rank 1:  Epoch[ 6 of 12]-Minibatch[ 151- 160, 0.00000000000089%]: CE = 0.85631256 * 1280; Err = 0.30234375 * 1280; time = 1.6603s; samplesPerSecond = 771.0
MPI Rank 1:  Epoch[ 6 of 12]-Minibatch[ 161- 170, 0.00000000000094%]: CE = 0.76534729 * 1280; Err = 0.26328125 * 1280; time = 1.7659s; samplesPerSecond = 724.8
MPI Rank 1:  Epoch[ 6 of 12]-Minibatch[ 171- 180, 0.00000000000100%]: CE = 0.80992889 * 1280; Err = 0.29375000 * 1280; time = 1.7655s; samplesPerSecond = 725.0
MPI Rank 1:  Epoch[ 6 of 12]-Minibatch[ 181- 190, 0.00000000000105%]: CE = 0.85066681 * 1280; Err = 0.31171875 * 1280; time = 1.7870s; samplesPerSecond = 716.3
MPI Rank 1: Finished Epoch[ 6 of 12]: [Training] CE = 0.87194555 * 25000; Err = 0.31096000 * 25000; totalSamplesSeen = 125000; learningRatePerSample = 0.0040000002; epochTime=35.4096s
MPI Rank 1: Final Results: Minibatch[1-79]: CE = 1.06034188 * 10000; perplexity = 2.88735796; Err = 0.35100000 * 10000
MPI Rank 1: Finished Epoch[ 6 of 12]: [Validate] CE = 1.06034188 * 10000; Err = 0.35100000 * 10000
MPI Rank 1: 
MPI Rank 1: Starting Epoch 7: learning rate per sample = 0.004000  effective momentum = 0.000000  momentum as time constant = 0.0 samples
MPI Rank 1: 
MPI Rank 1: Starting minibatch loop, DataParallelASGD training (myRank = 1, numNodes = 2, SamplesSyncToServer = 128), Distributed Evaluation is DISABLED, distributed reading is ENABLED.
MPI Rank 1:  Epoch[ 7 of 12]-Minibatch[   1-   1, 0.000000000000006%]: CE = 0.78797102 * 128; Err = 0.28125000 * 128; time = 0.3081s; samplesPerSecond = 415.4
MPI Rank 1:  Epoch[ 7 of 12]-Minibatch[   2-  10, 0.00000000000006%]: CE = 0.78527721 * 1152; Err = 0.26909722 * 1152; time = 1.5362s; samplesPerSecond = 749.9
MPI Rank 1:  Epoch[ 7 of 12]-Minibatch[  11-  20, 0.00000000000011%]: CE = 0.78044796 * 1280; Err = 0.27500000 * 1280; time = 1.7405s; samplesPerSecond = 735.4
MPI Rank 1:  Epoch[ 7 of 12]-Minibatch[  21-  30, 0.00000000000017%]: CE = 0.81336508 * 1280; Err = 0.29921875 * 1280; time = 1.9098s; samplesPerSecond = 670.2
MPI Rank 1:  Epoch[ 7 of 12]-Minibatch[  31-  40, 0.00000000000022%]: CE = 0.76695766 * 1280; Err = 0.26406250 * 1280; time = 1.8305s; samplesPerSecond = 699.3
MPI Rank 1:  Epoch[ 7 of 12]-Minibatch[  41-  50, 0.00000000000028%]: CE = 0.79526711 * 1280; Err = 0.28671875 * 1280; time = 1.6256s; samplesPerSecond = 787.4
MPI Rank 1:  Epoch[ 7 of 12]-Minibatch[  51-  60, 0.00000000000033%]: CE = 0.75314293 * 1280; Err = 0.26875000 * 1280; time = 1.8270s; samplesPerSecond = 700.6
MPI Rank 1:  Epoch[ 7 of 12]-Minibatch[  61-  70, 0.00000000000039%]: CE = 0.79434586 * 1280; Err = 0.26875000 * 1280; time = 1.9340s; samplesPerSecond = 661.9
MPI Rank 1:  Epoch[ 7 of 12]-Minibatch[  71-  80, 0.00000000000044%]: CE = 0.79795265 * 1280; Err = 0.28750000 * 1280; time = 1.7193s; samplesPerSecond = 744.5
MPI Rank 1:  Epoch[ 7 of 12]-Minibatch[  81-  90, 0.00000000000050%]: CE = 0.78198204 * 1280; Err = 0.27812500 * 1280; time = 2.0106s; samplesPerSecond = 636.6
MPI Rank 1:  Epoch[ 7 of 12]-Minibatch[  91- 100, 0.00000000000056%]: CE = 0.83682861 * 1280; Err = 0.30468750 * 1280; time = 1.7873s; samplesPerSecond = 716.2
MPI Rank 1:  Epoch[ 7 of 12]-Minibatch[ 101- 110, 0.00000000000061%]: CE = 0.72318649 * 1280; Err = 0.25156250 * 1280; time = 1.8109s; samplesPerSecond = 706.8
MPI Rank 1:  Epoch[ 7 of 12]-Minibatch[ 111- 120, 0.00000000000067%]: CE = 0.77111053 * 1280; Err = 0.27187500 * 1280; time = 1.8199s; samplesPerSecond = 703.3
MPI Rank 1:  Epoch[ 7 of 12]-Minibatch[ 121- 130, 0.00000000000072%]: CE = 0.76260300 * 1280; Err = 0.26953125 * 1280; time = 1.9329s; samplesPerSecond = 662.2
MPI Rank 1:  Epoch[ 7 of 12]-Minibatch[ 131- 140, 0.00000000000078%]: CE = 0.73107300 * 1280; Err = 0.24687500 * 1280; time = 1.8325s; samplesPerSecond = 698.5
MPI Rank 1:  Epoch[ 7 of 12]-Minibatch[ 141- 150, 0.00000000000083%]: CE = 0.80951385 * 1280; Err = 0.28750000 * 1280; time = 2.1859s; samplesPerSecond = 585.6
MPI Rank 1:  Epoch[ 7 of 12]-Minibatch[ 151- 160, 0.00000000000089%]: CE = 0.77648239 * 1280; Err = 0.26640625 * 1280; time = 1.9266s; samplesPerSecond = 664.4
MPI Rank 1:  Epoch[ 7 of 12]-Minibatch[ 161- 170, 0.00000000000094%]: CE = 0.76481552 * 1280; Err = 0.27343750 * 1280; time = 2.0690s; samplesPerSecond = 618.7
MPI Rank 1:  Epoch[ 7 of 12]-Minibatch[ 171- 180, 0.00000000000100%]: CE = 0.76691132 * 1280; Err = 0.26718750 * 1280; time = 1.8266s; samplesPerSecond = 700.7
MPI Rank 1:  Epoch[ 7 of 12]-Minibatch[ 181- 190, 0.00000000000105%]: CE = 0.69785309 * 1280; Err = 0.23984375 * 1280; time = 1.6508s; samplesPerSecond = 775.4
MPI Rank 1: Finished Epoch[ 7 of 12]: [Training] CE = 0.77286484 * 25000; Err = 0.27188000 * 25000; totalSamplesSeen = 150000; learningRatePerSample = 0.0040000002; epochTime=36.1563s
MPI Rank 1: Final Results: Minibatch[1-79]: CE = 0.85163169 * 10000; perplexity = 2.34346754; Err = 0.28760000 * 10000
MPI Rank 1: Finished Epoch[ 7 of 12]: [Validate] CE = 0.85163169 * 10000; Err = 0.28760000 * 10000
MPI Rank 1: 
MPI Rank 1: Starting Epoch 8: learning rate per sample = 0.004000  effective momentum = 0.000000  momentum as time constant = 0.0 samples
MPI Rank 1: 
MPI Rank 1: Starting minibatch loop, DataParallelASGD training (myRank = 1, numNodes = 2, SamplesSyncToServer = 128), Distributed Evaluation is DISABLED, distributed reading is ENABLED.
MPI Rank 1:  Epoch[ 8 of 12]-Minibatch[   1-   1, 0.000000000000006%]: CE = 0.78503180 * 128; Err = 0.26562500 * 128; time = 0.3094s; samplesPerSecond = 413.8
MPI Rank 1:  Epoch[ 8 of 12]-Minibatch[   2-  10, 0.00000000000006%]: CE = 0.76882325 * 1152; Err = 0.27430556 * 1152; time = 1.7655s; samplesPerSecond = 652.5
MPI Rank 1:  Epoch[ 8 of 12]-Minibatch[  11-  20, 0.00000000000011%]: CE = 0.69405298 * 1280; Err = 0.24531250 * 1280; time = 1.8562s; samplesPerSecond = 689.6
MPI Rank 1:  Epoch[ 8 of 12]-Minibatch[  21-  30, 0.00000000000017%]: CE = 0.73400774 * 1280; Err = 0.25781250 * 1280; time = 1.9113s; samplesPerSecond = 669.7
MPI Rank 1:  Epoch[ 8 of 12]-Minibatch[  31-  40, 0.00000000000022%]: CE = 0.70796337 * 1280; Err = 0.24609375 * 1280; time = 1.9246s; samplesPerSecond = 665.1
MPI Rank 1:  Epoch[ 8 of 12]-Minibatch[  41-  50, 0.00000000000028%]: CE = 0.69772644 * 1280; Err = 0.24296875 * 1280; time = 1.7385s; samplesPerSecond = 736.3
MPI Rank 1:  Epoch[ 8 of 12]-Minibatch[  51-  60, 0.00000000000033%]: CE = 0.68252869 * 1280; Err = 0.25234375 * 1280; time = 1.9064s; samplesPerSecond = 671.4
MPI Rank 1:  Epoch[ 8 of 12]-Minibatch[  61-  70, 0.00000000000039%]: CE = 0.70437241 * 1280; Err = 0.22578125 * 1280; time = 1.8358s; samplesPerSecond = 697.2
MPI Rank 1:  Epoch[ 8 of 12]-Minibatch[  71-  80, 0.00000000000044%]: CE = 0.74145317 * 1280; Err = 0.25546875 * 1280; time = 2.0096s; samplesPerSecond = 636.9
MPI Rank 1:  Epoch[ 8 of 12]-Minibatch[  81-  90, 0.00000000000050%]: CE = 0.73163872 * 1280; Err = 0.24687500 * 1280; time = 1.9179s; samplesPerSecond = 667.4
MPI Rank 1:  Epoch[ 8 of 12]-Minibatch[  91- 100, 0.00000000000056%]: CE = 0.72021332 * 1280; Err = 0.23671875 * 1280; time = 1.8019s; samplesPerSecond = 710.4
MPI Rank 1:  Epoch[ 8 of 12]-Minibatch[ 101- 110, 0.00000000000061%]: CE = 0.72987213 * 1280; Err = 0.26250000 * 1280; time = 1.7220s; samplesPerSecond = 743.3
MPI Rank 1:  Epoch[ 8 of 12]-Minibatch[ 111- 120, 0.00000000000067%]: CE = 0.67413788 * 1280; Err = 0.24218750 * 1280; time = 1.8986s; samplesPerSecond = 674.2
MPI Rank 1:  Epoch[ 8 of 12]-Minibatch[ 121- 130, 0.00000000000072%]: CE = 0.73837967 * 1280; Err = 0.26171875 * 1280; time = 1.8239s; samplesPerSecond = 701.8
MPI Rank 1:  Epoch[ 8 of 12]-Minibatch[ 131- 140, 0.00000000000078%]: CE = 0.70513763 * 1280; Err = 0.25625000 * 1280; time = 1.8277s; samplesPerSecond = 700.3
MPI Rank 1:  Epoch[ 8 of 12]-Minibatch[ 141- 150, 0.00000000000083%]: CE = 0.67812805 * 1280; Err = 0.23281250 * 1280; time = 1.8082s; samplesPerSecond = 707.9
MPI Rank 1:  Epoch[ 8 of 12]-Minibatch[ 151- 160, 0.00000000000089%]: CE = 0.68363190 * 1280; Err = 0.23359375 * 1280; time = 1.8230s; samplesPerSecond = 702.1
MPI Rank 1:  Epoch[ 8 of 12]-Minibatch[ 161- 170, 0.00000000000094%]: CE = 0.67091293 * 1280; Err = 0.24843750 * 1280; time = 1.7993s; samplesPerSecond = 711.4
MPI Rank 1:  Epoch[ 8 of 12]-Minibatch[ 171- 180, 0.00000000000100%]: CE = 0.69231644 * 1280; Err = 0.24296875 * 1280; time = 1.8766s; samplesPerSecond = 682.1
MPI Rank 1:  Epoch[ 8 of 12]-Minibatch[ 181- 190, 0.00000000000105%]: CE = 0.65727158 * 1280; Err = 0.23203125 * 1280; time = 1.6761s; samplesPerSecond = 763.7
MPI Rank 1: Finished Epoch[ 8 of 12]: [Training] CE = 0.70353820 * 25000; Err = 0.24612000 * 25000; totalSamplesSeen = 175000; learningRatePerSample = 0.0040000002; epochTime=36.209s
MPI Rank 1: Final Results: Minibatch[1-79]: CE = 0.72539377 * 10000; perplexity = 2.06554429; Err = 0.24530000 * 10000
MPI Rank 1: Finished Epoch[ 8 of 12]: [Validate] CE = 0.72539377 * 10000; Err = 0.24530000 * 10000
MPI Rank 1: 
MPI Rank 1: Starting Epoch 9: learning rate per sample = 0.004000  effective momentum = 0.000000  momentum as time constant = 0.0 samples
MPI Rank 1: 
MPI Rank 1: Starting minibatch loop, DataParallelASGD training (myRank = 1, numNodes = 2, SamplesSyncToServer = 128), Distributed Evaluation is DISABLED, distributed reading is ENABLED.
MPI Rank 1:  Epoch[ 9 of 12]-Minibatch[   1-   1, 0.000000000000006%]: CE = 0.80789196 * 128; Err = 0.25000000 * 128; time = 0.2965s; samplesPerSecond = 431.8
MPI Rank 1:  Epoch[ 9 of 12]-Minibatch[   2-  10, 0.00000000000006%]: CE = 0.62622318 * 1152; Err = 0.21527778 * 1152; time = 1.6613s; samplesPerSecond = 693.4
MPI Rank 1:  Epoch[ 9 of 12]-Minibatch[  11-  20, 0.00000000000011%]: CE = 0.61614614 * 1280; Err = 0.20625000 * 1280; time = 1.8569s; samplesPerSecond = 689.3
MPI Rank 1:  Epoch[ 9 of 12]-Minibatch[  21-  30, 0.00000000000017%]: CE = 0.63137989 * 1280; Err = 0.22890625 * 1280; time = 1.8098s; samplesPerSecond = 707.3
MPI Rank 1:  Epoch[ 9 of 12]-Minibatch[  31-  40, 0.00000000000022%]: CE = 0.69058723 * 1280; Err = 0.23828125 * 1280; time = 1.9339s; samplesPerSecond = 661.9
MPI Rank 1:  Epoch[ 9 of 12]-Minibatch[  41-  50, 0.00000000000028%]: CE = 0.72818851 * 1280; Err = 0.25234375 * 1280; time = 2.0978s; samplesPerSecond = 610.1
MPI Rank 1:  Epoch[ 9 of 12]-Minibatch[  51-  60, 0.00000000000033%]: CE = 0.72131348 * 1280; Err = 0.24218750 * 1280; time = 2.2079s; samplesPerSecond = 579.7
MPI Rank 1:  Epoch[ 9 of 12]-Minibatch[  61-  70, 0.00000000000039%]: CE = 0.64958992 * 1280; Err = 0.22656250 * 1280; time = 1.8007s; samplesPerSecond = 710.8
MPI Rank 1:  Epoch[ 9 of 12]-Minibatch[  71-  80, 0.00000000000044%]: CE = 0.68881035 * 1280; Err = 0.25390625 * 1280; time = 1.8807s; samplesPerSecond = 680.6
MPI Rank 1:  Epoch[ 9 of 12]-Minibatch[  81-  90, 0.00000000000050%]: CE = 0.69214211 * 1280; Err = 0.24921875 * 1280; time = 1.6423s; samplesPerSecond = 779.4
MPI Rank 1:  Epoch[ 9 of 12]-Minibatch[  91- 100, 0.00000000000056%]: CE = 0.67042961 * 1280; Err = 0.23046875 * 1280; time = 1.8983s; samplesPerSecond = 674.3
MPI Rank 1:  Epoch[ 9 of 12]-Minibatch[ 101- 110, 0.00000000000061%]: CE = 0.65099335 * 1280; Err = 0.23359375 * 1280; time = 1.8322s; samplesPerSecond = 698.6
MPI Rank 1:  Epoch[ 9 of 12]-Minibatch[ 111- 120, 0.00000000000067%]: CE = 0.67299881 * 1280; Err = 0.23359375 * 1280; time = 1.8140s; samplesPerSecond = 705.6
MPI Rank 1:  Epoch[ 9 of 12]-Minibatch[ 121- 130, 0.00000000000072%]: CE = 0.64230881 * 1280; Err = 0.21953125 * 1280; time = 1.7339s; samplesPerSecond = 738.2
MPI Rank 1:  Epoch[ 9 of 12]-Minibatch[ 131- 140, 0.00000000000078%]: CE = 0.70511932 * 1280; Err = 0.23671875 * 1280; time = 1.8349s; samplesPerSecond = 697.6
MPI Rank 1:  Epoch[ 9 of 12]-Minibatch[ 141- 150, 0.00000000000083%]: CE = 0.65871582 * 1280; Err = 0.23046875 * 1280; time = 1.7211s; samplesPerSecond = 743.7
MPI Rank 1:  Epoch[ 9 of 12]-Minibatch[ 151- 160, 0.00000000000089%]: CE = 0.67262115 * 1280; Err = 0.22656250 * 1280; time = 1.8152s; samplesPerSecond = 705.2
MPI Rank 1:  Epoch[ 9 of 12]-Minibatch[ 161- 170, 0.00000000000094%]: CE = 0.63113327 * 1280; Err = 0.21562500 * 1280; time = 1.8292s; samplesPerSecond = 699.8
MPI Rank 1:  Epoch[ 9 of 12]-Minibatch[ 171- 180, 0.00000000000100%]: CE = 0.61177673 * 1280; Err = 0.22265625 * 1280; time = 1.8577s; samplesPerSecond = 689.0
MPI Rank 1:  Epoch[ 9 of 12]-Minibatch[ 181- 190, 0.00000000000105%]: CE = 0.62056656 * 1280; Err = 0.21328125 * 1280; time = 1.7188s; samplesPerSecond = 744.7
MPI Rank 1: Finished Epoch[ 9 of 12]: [Training] CE = 0.66137266 * 25000; Err = 0.23000000 * 25000; totalSamplesSeen = 200000; learningRatePerSample = 0.0040000002; epochTime=36.0493s
MPI Rank 1: Final Results: Minibatch[1-79]: CE = 0.80890642 * 10000; perplexity = 2.24545106; Err = 0.26520000 * 10000
MPI Rank 1: Finished Epoch[ 9 of 12]: [Validate] CE = 0.80890642 * 10000; Err = 0.26520000 * 10000
MPI Rank 1: 
MPI Rank 1: Starting Epoch 10: learning rate per sample = 0.004000  effective momentum = 0.000000  momentum as time constant = 0.0 samples
MPI Rank 1: 
MPI Rank 1: Starting minibatch loop, DataParallelASGD training (myRank = 1, numNodes = 2, SamplesSyncToServer = 128), Distributed Evaluation is DISABLED, distributed reading is ENABLED.
MPI Rank 1:  Epoch[10 of 12]-Minibatch[   1-   1, 0.000000000000006%]: CE = 0.60049057 * 128; Err = 0.21093750 * 128; time = 0.3054s; samplesPerSecond = 419.2
MPI Rank 1:  Epoch[10 of 12]-Minibatch[   2-  10, 0.00000000000006%]: CE = 0.59902112 * 1152; Err = 0.22569444 * 1152; time = 1.8222s; samplesPerSecond = 632.2
MPI Rank 1:  Epoch[10 of 12]-Minibatch[  11-  20, 0.00000000000011%]: CE = 0.64356198 * 1280; Err = 0.22343750 * 1280; time = 1.7193s; samplesPerSecond = 744.5
MPI Rank 1:  Epoch[10 of 12]-Minibatch[  21-  30, 0.00000000000017%]: CE = 0.61087751 * 1280; Err = 0.21093750 * 1280; time = 1.8835s; samplesPerSecond = 679.6
MPI Rank 1:  Epoch[10 of 12]-Minibatch[  31-  40, 0.00000000000022%]: CE = 0.62380638 * 1280; Err = 0.22968750 * 1280; time = 1.8364s; samplesPerSecond = 697.0
MPI Rank 1:  Epoch[10 of 12]-Minibatch[  41-  50, 0.00000000000028%]: CE = 0.55373478 * 1280; Err = 0.19218750 * 1280; time = 1.9597s; samplesPerSecond = 653.2
MPI Rank 1:  Epoch[10 of 12]-Minibatch[  51-  60, 0.00000000000033%]: CE = 0.68988895 * 1280; Err = 0.23984375 * 1280; time = 1.7856s; samplesPerSecond = 716.9
MPI Rank 1:  Epoch[10 of 12]-Minibatch[  61-  70, 0.00000000000039%]: CE = 0.64922714 * 1280; Err = 0.22031250 * 1280; time = 2.0035s; samplesPerSecond = 638.9
MPI Rank 1:  Epoch[10 of 12]-Minibatch[  71-  80, 0.00000000000044%]: CE = 0.65011787 * 1280; Err = 0.22421875 * 1280; time = 1.7547s; samplesPerSecond = 729.4
MPI Rank 1:  Epoch[10 of 12]-Minibatch[  81-  90, 0.00000000000050%]: CE = 0.63724632 * 1280; Err = 0.23750000 * 1280; time = 1.7530s; samplesPerSecond = 730.2
MPI Rank 1:  Epoch[10 of 12]-Minibatch[  91- 100, 0.00000000000056%]: CE = 0.59282494 * 1280; Err = 0.21484375 * 1280; time = 1.8392s; samplesPerSecond = 696.0
MPI Rank 1:  Epoch[10 of 12]-Minibatch[ 101- 110, 0.00000000000061%]: CE = 0.61133347 * 1280; Err = 0.20156250 * 1280; time = 1.8251s; samplesPerSecond = 701.3
MPI Rank 1:  Epoch[10 of 12]-Minibatch[ 111- 120, 0.00000000000067%]: CE = 0.60492630 * 1280; Err = 0.20390625 * 1280; time = 1.7932s; samplesPerSecond = 713.8
MPI Rank 1:  Epoch[10 of 12]-Minibatch[ 121- 130, 0.00000000000072%]: CE = 0.62901764 * 1280; Err = 0.22187500 * 1280; time = 1.7086s; samplesPerSecond = 749.2
MPI Rank 1:  Epoch[10 of 12]-Minibatch[ 131- 140, 0.00000000000078%]: CE = 0.63456039 * 1280; Err = 0.21171875 * 1280; time = 1.7182s; samplesPerSecond = 745.0
MPI Rank 1:  Epoch[10 of 12]-Minibatch[ 141- 150, 0.00000000000083%]: CE = 0.57890778 * 1280; Err = 0.20078125 * 1280; time = 1.8287s; samplesPerSecond = 700.0
MPI Rank 1:  Epoch[10 of 12]-Minibatch[ 151- 160, 0.00000000000089%]: CE = 0.60179443 * 1280; Err = 0.20234375 * 1280; time = 1.6327s; samplesPerSecond = 784.0
MPI Rank 1:  Epoch[10 of 12]-Minibatch[ 161- 170, 0.00000000000094%]: CE = 0.63815918 * 1280; Err = 0.22265625 * 1280; time = 1.9008s; samplesPerSecond = 673.4
MPI Rank 1:  Epoch[10 of 12]-Minibatch[ 171- 180, 0.00000000000100%]: CE = 0.62120667 * 1280; Err = 0.20859375 * 1280; time = 1.7921s; samplesPerSecond = 714.3
MPI Rank 1:  Epoch[10 of 12]-Minibatch[ 181- 190, 0.00000000000105%]: CE = 0.60103683 * 1280; Err = 0.21484375 * 1280; time = 1.8455s; samplesPerSecond = 693.6
MPI Rank 1: Finished Epoch[10 of 12]: [Training] CE = 0.62002477 * 25000; Err = 0.21608000 * 25000; totalSamplesSeen = 225000; learningRatePerSample = 0.0040000002; epochTime=35.6254s
MPI Rank 1: Final Results: Minibatch[1-79]: CE = 1.03226201 * 10000; perplexity = 2.80740905; Err = 0.32840000 * 10000
MPI Rank 1: Finished Epoch[10 of 12]: [Validate] CE = 1.03226201 * 10000; Err = 0.32840000 * 10000
MPI Rank 1: 
MPI Rank 1: Starting Epoch 11: learning rate per sample = 0.004000  effective momentum = 0.000000  momentum as time constant = 0.0 samples
MPI Rank 1: 
MPI Rank 1: Starting minibatch loop, DataParallelASGD training (myRank = 1, numNodes = 2, SamplesSyncToServer = 128), Distributed Evaluation is DISABLED, distributed reading is ENABLED.
MPI Rank 1:  Epoch[11 of 12]-Minibatch[   1-   1, 0.000000000000006%]: CE = 0.47479239 * 128; Err = 0.14843750 * 128; time = 0.2986s; samplesPerSecond = 428.6
MPI Rank 1:  Epoch[11 of 12]-Minibatch[   2-  10, 0.00000000000006%]: CE = 0.57546638 * 1152; Err = 0.20138889 * 1152; time = 1.6546s; samplesPerSecond = 696.2
MPI Rank 1:  Epoch[11 of 12]-Minibatch[  11-  20, 0.00000000000011%]: CE = 0.64314499 * 1280; Err = 0.22031250 * 1280; time = 1.7444s; samplesPerSecond = 733.8
MPI Rank 1:  Epoch[11 of 12]-Minibatch[  21-  30, 0.00000000000017%]: CE = 0.58746185 * 1280; Err = 0.20312500 * 1280; time = 1.8639s; samplesPerSecond = 686.7
MPI Rank 1:  Epoch[11 of 12]-Minibatch[  31-  40, 0.00000000000022%]: CE = 0.64209785 * 1280; Err = 0.22734375 * 1280; time = 1.8664s; samplesPerSecond = 685.8
MPI Rank 1:  Epoch[11 of 12]-Minibatch[  41-  50, 0.00000000000028%]: CE = 0.56380310 * 1280; Err = 0.18437500 * 1280; time = 1.8181s; samplesPerSecond = 704.0
MPI Rank 1:  Epoch[11 of 12]-Minibatch[  51-  60, 0.00000000000033%]: CE = 0.55058308 * 1280; Err = 0.19062500 * 1280; time = 1.7357s; samplesPerSecond = 737.5
MPI Rank 1:  Epoch[11 of 12]-Minibatch[  61-  70, 0.00000000000039%]: CE = 0.54512138 * 1280; Err = 0.19062500 * 1280; time = 1.8945s; samplesPerSecond = 675.6
MPI Rank 1:  Epoch[11 of 12]-Minibatch[  71-  80, 0.00000000000044%]: CE = 0.58789444 * 1280; Err = 0.20000000 * 1280; time = 1.7995s; samplesPerSecond = 711.3
MPI Rank 1:  Epoch[11 of 12]-Minibatch[  81-  90, 0.00000000000050%]: CE = 0.60862045 * 1280; Err = 0.20625000 * 1280; time = 1.9936s; samplesPerSecond = 642.1
MPI Rank 1:  Epoch[11 of 12]-Minibatch[  91- 100, 0.00000000000056%]: CE = 0.57939377 * 1280; Err = 0.20546875 * 1280; time = 1.7071s; samplesPerSecond = 749.8
MPI Rank 1:  Epoch[11 of 12]-Minibatch[ 101- 110, 0.00000000000061%]: CE = 0.62359467 * 1280; Err = 0.21250000 * 1280; time = 1.8293s; samplesPerSecond = 699.7
MPI Rank 1:  Epoch[11 of 12]-Minibatch[ 111- 120, 0.00000000000067%]: CE = 0.60461731 * 1280; Err = 0.21875000 * 1280; time = 1.9253s; samplesPerSecond = 664.8
MPI Rank 1:  Epoch[11 of 12]-Minibatch[ 121- 130, 0.00000000000072%]: CE = 0.60369415 * 1280; Err = 0.20625000 * 1280; time = 1.9318s; samplesPerSecond = 662.6
MPI Rank 1:  Epoch[11 of 12]-Minibatch[ 131- 140, 0.00000000000078%]: CE = 0.54191666 * 1280; Err = 0.20312500 * 1280; time = 1.8179s; samplesPerSecond = 704.1
MPI Rank 1:  Epoch[11 of 12]-Minibatch[ 141- 150, 0.00000000000083%]: CE = 0.64403458 * 1280; Err = 0.22890625 * 1280; time = 1.8227s; samplesPerSecond = 702.2
MPI Rank 1:  Epoch[11 of 12]-Minibatch[ 151- 160, 0.00000000000089%]: CE = 0.58542557 * 1280; Err = 0.20937500 * 1280; time = 1.9019s; samplesPerSecond = 673.0
MPI Rank 1:  Epoch[11 of 12]-Minibatch[ 161- 170, 0.00000000000094%]: CE = 0.58519440 * 1280; Err = 0.20781250 * 1280; time = 1.8222s; samplesPerSecond = 702.5
MPI Rank 1:  Epoch[11 of 12]-Minibatch[ 171- 180, 0.00000000000100%]: CE = 0.56002197 * 1280; Err = 0.19843750 * 1280; time = 1.9493s; samplesPerSecond = 656.6
MPI Rank 1:  Epoch[11 of 12]-Minibatch[ 181- 190, 0.00000000000105%]: CE = 0.56146088 * 1280; Err = 0.19921875 * 1280; time = 1.7141s; samplesPerSecond = 746.7
MPI Rank 1: Finished Epoch[11 of 12]: [Training] CE = 0.58766348 * 25000; Err = 0.20552000 * 25000; totalSamplesSeen = 250000; learningRatePerSample = 0.0040000002; epochTime=35.9046s
MPI Rank 1: Final Results: Minibatch[1-79]: CE = 0.68442069 * 10000; perplexity = 1.98262296; Err = 0.23360000 * 10000
MPI Rank 1: Finished Epoch[11 of 12]: [Validate] CE = 0.68442069 * 10000; Err = 0.23360000 * 10000
MPI Rank 1: 
MPI Rank 1: Starting Epoch 12: learning rate per sample = 0.004000  effective momentum = 0.000000  momentum as time constant = 0.0 samples
MPI Rank 1: 
MPI Rank 1: Starting minibatch loop, DataParallelASGD training (myRank = 1, numNodes = 2, SamplesSyncToServer = 128), Distributed Evaluation is DISABLED, distributed reading is ENABLED.
MPI Rank 1:  Epoch[12 of 12]-Minibatch[   1-   1, 0.000000000000006%]: CE = 0.47684953 * 128; Err = 0.15625000 * 128; time = 0.3019s; samplesPerSecond = 423.9
MPI Rank 1:  Epoch[12 of 12]-Minibatch[   2-  10, 0.00000000000006%]: CE = 0.62880304 * 1152; Err = 0.21093750 * 1152; time = 1.5355s; samplesPerSecond = 750.3
MPI Rank 1:  Epoch[12 of 12]-Minibatch[  11-  20, 0.00000000000011%]: CE = 0.55633965 * 1280; Err = 0.19531250 * 1280; time = 1.8099s; samplesPerSecond = 707.2
MPI Rank 1:  Epoch[12 of 12]-Minibatch[  21-  30, 0.00000000000017%]: CE = 0.54094992 * 1280; Err = 0.19140625 * 1280; time = 1.8151s; samplesPerSecond = 705.2
MPI Rank 1:  Epoch[12 of 12]-Minibatch[  31-  40, 0.00000000000022%]: CE = 0.60759182 * 1280; Err = 0.20000000 * 1280; time = 1.9607s; samplesPerSecond = 652.8
MPI Rank 1:  Epoch[12 of 12]-Minibatch[  41-  50, 0.00000000000028%]: CE = 0.56563663 * 1280; Err = 0.19531250 * 1280; time = 1.8181s; samplesPerSecond = 704.0
MPI Rank 1:  Epoch[12 of 12]-Minibatch[  51-  60, 0.00000000000033%]: CE = 0.52482815 * 1280; Err = 0.18906250 * 1280; time = 1.8837s; samplesPerSecond = 679.5
MPI Rank 1:  Epoch[12 of 12]-Minibatch[  61-  70, 0.00000000000039%]: CE = 0.56316566 * 1280; Err = 0.18984375 * 1280; time = 1.9964s; samplesPerSecond = 641.2
MPI Rank 1:  Epoch[12 of 12]-Minibatch[  71-  80, 0.00000000000044%]: CE = 0.49520035 * 1280; Err = 0.17187500 * 1280; time = 1.8221s; samplesPerSecond = 702.5
MPI Rank 1:  Epoch[12 of 12]-Minibatch[  81-  90, 0.00000000000050%]: CE = 0.59656639 * 1280; Err = 0.20000000 * 1280; time = 1.8292s; samplesPerSecond = 699.8
MPI Rank 1:  Epoch[12 of 12]-Minibatch[  91- 100, 0.00000000000056%]: CE = 0.56413231 * 1280; Err = 0.18203125 * 1280; time = 1.7237s; samplesPerSecond = 742.6
MPI Rank 1:  Epoch[12 of 12]-Minibatch[ 101- 110, 0.00000000000061%]: CE = 0.51646309 * 1280; Err = 0.18671875 * 1280; time = 1.8141s; samplesPerSecond = 705.6
MPI Rank 1:  Epoch[12 of 12]-Minibatch[ 111- 120, 0.00000000000067%]: CE = 0.56112671 * 1280; Err = 0.18984375 * 1280; time = 1.7291s; samplesPerSecond = 740.3
MPI Rank 1:  Epoch[12 of 12]-Minibatch[ 121- 130, 0.00000000000072%]: CE = 0.59806747 * 1280; Err = 0.21093750 * 1280; time = 1.9100s; samplesPerSecond = 670.2
MPI Rank 1:  Epoch[12 of 12]-Minibatch[ 131- 140, 0.00000000000078%]: CE = 0.59625015 * 1280; Err = 0.21015625 * 1280; time = 1.8752s; samplesPerSecond = 682.6
MPI Rank 1:  Epoch[12 of 12]-Minibatch[ 141- 150, 0.00000000000083%]: CE = 0.55589676 * 1280; Err = 0.18593750 * 1280; time = 2.0278s; samplesPerSecond = 631.2
MPI Rank 1:  Epoch[12 of 12]-Minibatch[ 151- 160, 0.00000000000089%]: CE = 0.55306396 * 1280; Err = 0.18515625 * 1280; time = 1.8233s; samplesPerSecond = 702.0
MPI Rank 1:  Epoch[12 of 12]-Minibatch[ 161- 170, 0.00000000000094%]: CE = 0.56157227 * 1280; Err = 0.19375000 * 1280; time = 1.9441s; samplesPerSecond = 658.4
MPI Rank 1:  Epoch[12 of 12]-Minibatch[ 171- 180, 0.00000000000100%]: CE = 0.58512573 * 1280; Err = 0.19531250 * 1280; time = 1.8170s; samplesPerSecond = 704.4
MPI Rank 1:  Epoch[12 of 12]-Minibatch[ 181- 190, 0.00000000000105%]: CE = 0.55865097 * 1280; Err = 0.19609375 * 1280; time = 1.6660s; samplesPerSecond = 768.3
MPI Rank 1: Finished Epoch[12 of 12]: [Training] CE = 0.56193457 * 25000; Err = 0.19272000 * 25000; totalSamplesSeen = 275000; learningRatePerSample = 0.0040000002; epochTime=35.8986s
MPI Rank 1: Final Results: Minibatch[1-79]: CE = 0.62149374 * 10000; perplexity = 1.86170687; Err = 0.21700000 * 10000
MPI Rank 1: Finished Epoch[12 of 12]: [Validate] CE = 0.62149374 * 10000; Err = 0.21700000 * 10000
MPI Rank 1: ~MultiversoHelper
MPI Rank 1: [INFO] [2017-04-10 01:31:42] Multiverso Shutdown successfully
MPI Rank 1: 
MPI Rank 1: Action "train" complete.
MPI Rank 1: 
MPI Rank 1: COMPLETED.
MPI Rank 1: ~MPIWrapperMpi
