CPU info:
    CPU Model Name: Intel(R) Xeon(R) CPU E5-2690 v3 @ 2.60GHz
    Hardware threads: 12
    Total Memory: 57700428 kB
-------------------------------------------------------------------
=== Running /home/ubuntu/workspace/build/gpu/debug/bin/cntk configFile=/home/ubuntu/workspace/Tests/EndToEndTests/Speech/DNN/DiscriminativePreTraining/cntk_dpt.cntk currentDirectory=/home/ubuntu/workspace/Tests/EndToEndTests/Speech/Data RunDir=/tmp/cntk-test-20180108044804.549809/Speech/DNN_DiscriminativePreTraining@debug_gpu DataDir=/home/ubuntu/workspace/Tests/EndToEndTests/Speech/Data ConfigDir=/home/ubuntu/workspace/Tests/EndToEndTests/Speech/DNN/DiscriminativePreTraining OutputDir=/tmp/cntk-test-20180108044804.549809/Speech/DNN_DiscriminativePreTraining@debug_gpu DeviceId=0 timestamping=true
CNTK 2.3.1+ (HEAD ed450d, Jan  7 2018 20:10:59) at 2018/01/08 04:59:11

/home/ubuntu/workspace/build/gpu/debug/bin/cntk  configFile=/home/ubuntu/workspace/Tests/EndToEndTests/Speech/DNN/DiscriminativePreTraining/cntk_dpt.cntk  currentDirectory=/home/ubuntu/workspace/Tests/EndToEndTests/Speech/Data  RunDir=/tmp/cntk-test-20180108044804.549809/Speech/DNN_DiscriminativePreTraining@debug_gpu  DataDir=/home/ubuntu/workspace/Tests/EndToEndTests/Speech/Data  ConfigDir=/home/ubuntu/workspace/Tests/EndToEndTests/Speech/DNN/DiscriminativePreTraining  OutputDir=/tmp/cntk-test-20180108044804.549809/Speech/DNN_DiscriminativePreTraining@debug_gpu  DeviceId=0  timestamping=true
Changed current directory to /home/ubuntu/workspace/Tests/EndToEndTests/Speech/Data
01/08/2018 04:59:11: -------------------------------------------------------------------
01/08/2018 04:59:11: Build info: 

01/08/2018 04:59:11: 		Built time: Jan  7 2018 20:08:47
01/08/2018 04:59:11: 		Last modified date: Sun Jan  7 20:08:19 2018
01/08/2018 04:59:11: 		Build type: debug
01/08/2018 04:59:11: 		Build target: GPU
01/08/2018 04:59:11: 		With ASGD: yes
01/08/2018 04:59:11: 		Math lib: mkl
01/08/2018 04:59:11: 		CUDA version: 9.0.0
01/08/2018 04:59:11: 		CUDNN version: 7.0.4
01/08/2018 04:59:11: 		Build Branch: HEAD
01/08/2018 04:59:11: 		Build SHA1: ed450d284dda1f314dfdeee86ce68e5bbfb0a87f
01/08/2018 04:59:11: 		MPI distribution: Open MPI
01/08/2018 04:59:11: 		MPI version: 1.10.7
01/08/2018 04:59:11: -------------------------------------------------------------------
01/08/2018 04:59:11: -------------------------------------------------------------------
01/08/2018 04:59:11: GPU info:

01/08/2018 04:59:11: 		Device[0]: cores = 3072; computeCapability = 5.2; type = "Tesla M60"; total memory = 8123 MB; free memory = 8112 MB
01/08/2018 04:59:11: -------------------------------------------------------------------

Configuration, Raw:

01/08/2018 04:59:11: precision = "float"
deviceId = $DeviceId$
command = dptPre1:addLayer2:dptPre2:addLayer3:speechTrain
ndlMacros = "$ConfigDir$/macros.txt"
globalMeanPath   = "GlobalStats/mean.363"
globalInvStdPath = "GlobalStats/var.363"
globalPriorPath  = "GlobalStats/prior.132"
traceLevel = 1
SGD = [
    epochSize = 81920
    minibatchSize = 256
    learningRatesPerMB = 0.8
    numMBsToShowResult = 10
    momentumPerMB = 0.9
    dropoutRate = 0.0
    maxEpochs = 2
]
dptPre1 = [
    action = "train"
    modelPath = "$RunDir$/models/Pre1/cntkSpeech"
    NDLNetworkBuilder = [
        networkDescription = "$ConfigDir$/dnn_1layer.txt"
    ]
]
addLayer2 = [    
    action = "edit"
    currLayer = 1
    newLayer = 2
    currModel = "$RunDir$/models/Pre1/cntkSpeech"
    newModel  = "$RunDir$/models/Pre2/cntkSpeech.0"
    editPath  = "$ConfigDir$/add_layer.mel"
]
dptPre2 = [
    action = "train"
    modelPath = "$RunDir$/models/Pre2/cntkSpeech"
    NDLNetworkBuilder = [
        networkDescription = "$ConfigDir$/dnn_1layer.txt"
    ]
]
addLayer3 = [    
    action = "edit"
    currLayer = 2
    newLayer = 3
    currModel = "$RunDir$/models/Pre2/cntkSpeech"
    newModel  = "$RunDir$/models/cntkSpeech.0"
    editPath  = "$ConfigDir$/add_layer.mel"
]
speechTrain = [
    action = "train"
    modelPath = "$RunDir$/models/cntkSpeech"
    deviceId = $DeviceId$
    traceLevel = 1
    NDLNetworkBuilder = [
        networkDescription = "$ConfigDir$/dnn.txt"
    ]
    SGD = [
        epochSize = 81920
        minibatchSize = 256:512
        learningRatesPerMB = 0.8:1.6
        numMBsToShowResult = 10
        momentumPerSample = 0.999589
        dropoutRate = 0.0
        maxEpochs = 4
        gradUpdateType = "none"
        normWithAveMultiplier = true
        clippingThresholdPerSample = 1#INF
    ]
]
reader = [
    readerType = "HTKMLFReader"
    readMethod = "blockRandomize"
    miniBatchMode = "partial"
    randomize = "auto"
    verbosity = 0
    useMersenneTwisterRand=true
    features = [
        dim = 363
        type = "real"
        scpFile = "$DataDir$/glob_0000.scp"
    ]
    labels = [
        mlfFile = "$DataDir$/glob_0000.mlf"
        labelMappingFile = "$DataDir$/state.list"
        labelDim = 132
        labelType = "category"
    ]
]
currentDirectory=/home/ubuntu/workspace/Tests/EndToEndTests/Speech/Data
RunDir=/tmp/cntk-test-20180108044804.549809/Speech/DNN_DiscriminativePreTraining@debug_gpu
DataDir=/home/ubuntu/workspace/Tests/EndToEndTests/Speech/Data
ConfigDir=/home/ubuntu/workspace/Tests/EndToEndTests/Speech/DNN/DiscriminativePreTraining
OutputDir=/tmp/cntk-test-20180108044804.549809/Speech/DNN_DiscriminativePreTraining@debug_gpu
DeviceId=0
timestamping=true


Configuration After Variable Resolution:

01/08/2018 04:59:11: precision = "float"
deviceId = 0
command = dptPre1:addLayer2:dptPre2:addLayer3:speechTrain
ndlMacros = "/home/ubuntu/workspace/Tests/EndToEndTests/Speech/DNN/DiscriminativePreTraining/macros.txt"
globalMeanPath   = "GlobalStats/mean.363"
globalInvStdPath = "GlobalStats/var.363"
globalPriorPath  = "GlobalStats/prior.132"
traceLevel = 1
SGD = [
    epochSize = 81920
    minibatchSize = 256
    learningRatesPerMB = 0.8
    numMBsToShowResult = 10
    momentumPerMB = 0.9
    dropoutRate = 0.0
    maxEpochs = 2
]
dptPre1 = [
    action = "train"
    modelPath = "/tmp/cntk-test-20180108044804.549809/Speech/DNN_DiscriminativePreTraining@debug_gpu/models/Pre1/cntkSpeech"
    NDLNetworkBuilder = [
        networkDescription = "/home/ubuntu/workspace/Tests/EndToEndTests/Speech/DNN/DiscriminativePreTraining/dnn_1layer.txt"
    ]
]
addLayer2 = [    
    action = "edit"
    currLayer = 1
    newLayer = 2
    currModel = "/tmp/cntk-test-20180108044804.549809/Speech/DNN_DiscriminativePreTraining@debug_gpu/models/Pre1/cntkSpeech"
    newModel  = "/tmp/cntk-test-20180108044804.549809/Speech/DNN_DiscriminativePreTraining@debug_gpu/models/Pre2/cntkSpeech.0"
    editPath  = "/home/ubuntu/workspace/Tests/EndToEndTests/Speech/DNN/DiscriminativePreTraining/add_layer.mel"
]
dptPre2 = [
    action = "train"
    modelPath = "/tmp/cntk-test-20180108044804.549809/Speech/DNN_DiscriminativePreTraining@debug_gpu/models/Pre2/cntkSpeech"
    NDLNetworkBuilder = [
        networkDescription = "/home/ubuntu/workspace/Tests/EndToEndTests/Speech/DNN/DiscriminativePreTraining/dnn_1layer.txt"
    ]
]
addLayer3 = [    
    action = "edit"
    currLayer = 2
    newLayer = 3
    currModel = "/tmp/cntk-test-20180108044804.549809/Speech/DNN_DiscriminativePreTraining@debug_gpu/models/Pre2/cntkSpeech"
    newModel  = "/tmp/cntk-test-20180108044804.549809/Speech/DNN_DiscriminativePreTraining@debug_gpu/models/cntkSpeech.0"
    editPath  = "/home/ubuntu/workspace/Tests/EndToEndTests/Speech/DNN/DiscriminativePreTraining/add_layer.mel"
]
speechTrain = [
    action = "train"
    modelPath = "/tmp/cntk-test-20180108044804.549809/Speech/DNN_DiscriminativePreTraining@debug_gpu/models/cntkSpeech"
    deviceId = 0
    traceLevel = 1
    NDLNetworkBuilder = [
        networkDescription = "/home/ubuntu/workspace/Tests/EndToEndTests/Speech/DNN/DiscriminativePreTraining/dnn.txt"
    ]
    SGD = [
        epochSize = 81920
        minibatchSize = 256:512
        learningRatesPerMB = 0.8:1.6
        numMBsToShowResult = 10
        momentumPerSample = 0.999589
        dropoutRate = 0.0
        maxEpochs = 4
        gradUpdateType = "none"
        normWithAveMultiplier = true
        clippingThresholdPerSample = 1#INF
    ]
]
reader = [
    readerType = "HTKMLFReader"
    readMethod = "blockRandomize"
    miniBatchMode = "partial"
    randomize = "auto"
    verbosity = 0
    useMersenneTwisterRand=true
    features = [
        dim = 363
        type = "real"
        scpFile = "/home/ubuntu/workspace/Tests/EndToEndTests/Speech/Data/glob_0000.scp"
    ]
    labels = [
        mlfFile = "/home/ubuntu/workspace/Tests/EndToEndTests/Speech/Data/glob_0000.mlf"
        labelMappingFile = "/home/ubuntu/workspace/Tests/EndToEndTests/Speech/Data/state.list"
        labelDim = 132
        labelType = "category"
    ]
]
currentDirectory=/home/ubuntu/workspace/Tests/EndToEndTests/Speech/Data
RunDir=/tmp/cntk-test-20180108044804.549809/Speech/DNN_DiscriminativePreTraining@debug_gpu
DataDir=/home/ubuntu/workspace/Tests/EndToEndTests/Speech/Data
ConfigDir=/home/ubuntu/workspace/Tests/EndToEndTests/Speech/DNN/DiscriminativePreTraining
OutputDir=/tmp/cntk-test-20180108044804.549809/Speech/DNN_DiscriminativePreTraining@debug_gpu
DeviceId=0
timestamping=true


Configuration After Processing and Variable Resolution:

configparameters: cntk_dpt.cntk:addLayer2=[    
    action = "edit"
    currLayer = 1
    newLayer = 2
    currModel = "/tmp/cntk-test-20180108044804.549809/Speech/DNN_DiscriminativePreTraining@debug_gpu/models/Pre1/cntkSpeech"
    newModel  = "/tmp/cntk-test-20180108044804.549809/Speech/DNN_DiscriminativePreTraining@debug_gpu/models/Pre2/cntkSpeech.0"
    editPath  = "/home/ubuntu/workspace/Tests/EndToEndTests/Speech/DNN/DiscriminativePreTraining/add_layer.mel"
]

configparameters: cntk_dpt.cntk:addLayer3=[    
    action = "edit"
    currLayer = 2
    newLayer = 3
    currModel = "/tmp/cntk-test-20180108044804.549809/Speech/DNN_DiscriminativePreTraining@debug_gpu/models/Pre2/cntkSpeech"
    newModel  = "/tmp/cntk-test-20180108044804.549809/Speech/DNN_DiscriminativePreTraining@debug_gpu/models/cntkSpeech.0"
    editPath  = "/home/ubuntu/workspace/Tests/EndToEndTests/Speech/DNN/DiscriminativePreTraining/add_layer.mel"
]

configparameters: cntk_dpt.cntk:command=dptPre1:addLayer2:dptPre2:addLayer3:speechTrain
configparameters: cntk_dpt.cntk:ConfigDir=/home/ubuntu/workspace/Tests/EndToEndTests/Speech/DNN/DiscriminativePreTraining
configparameters: cntk_dpt.cntk:currentDirectory=/home/ubuntu/workspace/Tests/EndToEndTests/Speech/Data
configparameters: cntk_dpt.cntk:DataDir=/home/ubuntu/workspace/Tests/EndToEndTests/Speech/Data
configparameters: cntk_dpt.cntk:deviceId=0
configparameters: cntk_dpt.cntk:dptPre1=[
    action = "train"
    modelPath = "/tmp/cntk-test-20180108044804.549809/Speech/DNN_DiscriminativePreTraining@debug_gpu/models/Pre1/cntkSpeech"
    NDLNetworkBuilder = [
        networkDescription = "/home/ubuntu/workspace/Tests/EndToEndTests/Speech/DNN/DiscriminativePreTraining/dnn_1layer.txt"
    ]
]

configparameters: cntk_dpt.cntk:dptPre2=[
    action = "train"
    modelPath = "/tmp/cntk-test-20180108044804.549809/Speech/DNN_DiscriminativePreTraining@debug_gpu/models/Pre2/cntkSpeech"
    NDLNetworkBuilder = [
        networkDescription = "/home/ubuntu/workspace/Tests/EndToEndTests/Speech/DNN/DiscriminativePreTraining/dnn_1layer.txt"
    ]
]

configparameters: cntk_dpt.cntk:globalInvStdPath=GlobalStats/var.363
configparameters: cntk_dpt.cntk:globalMeanPath=GlobalStats/mean.363
configparameters: cntk_dpt.cntk:globalPriorPath=GlobalStats/prior.132
configparameters: cntk_dpt.cntk:ndlMacros=/home/ubuntu/workspace/Tests/EndToEndTests/Speech/DNN/DiscriminativePreTraining/macros.txt
configparameters: cntk_dpt.cntk:OutputDir=/tmp/cntk-test-20180108044804.549809/Speech/DNN_DiscriminativePreTraining@debug_gpu
configparameters: cntk_dpt.cntk:precision=float
configparameters: cntk_dpt.cntk:reader=[
    readerType = "HTKMLFReader"
    readMethod = "blockRandomize"
    miniBatchMode = "partial"
    randomize = "auto"
    verbosity = 0
    useMersenneTwisterRand=true
    features = [
        dim = 363
        type = "real"
        scpFile = "/home/ubuntu/workspace/Tests/EndToEndTests/Speech/Data/glob_0000.scp"
    ]
    labels = [
        mlfFile = "/home/ubuntu/workspace/Tests/EndToEndTests/Speech/Data/glob_0000.mlf"
        labelMappingFile = "/home/ubuntu/workspace/Tests/EndToEndTests/Speech/Data/state.list"
        labelDim = 132
        labelType = "category"
    ]
]

configparameters: cntk_dpt.cntk:RunDir=/tmp/cntk-test-20180108044804.549809/Speech/DNN_DiscriminativePreTraining@debug_gpu
configparameters: cntk_dpt.cntk:SGD=[
    epochSize = 81920
    minibatchSize = 256
    learningRatesPerMB = 0.8
    numMBsToShowResult = 10
    momentumPerMB = 0.9
    dropoutRate = 0.0
    maxEpochs = 2
]

configparameters: cntk_dpt.cntk:speechTrain=[
    action = "train"
    modelPath = "/tmp/cntk-test-20180108044804.549809/Speech/DNN_DiscriminativePreTraining@debug_gpu/models/cntkSpeech"
    deviceId = 0
    traceLevel = 1
    NDLNetworkBuilder = [
        networkDescription = "/home/ubuntu/workspace/Tests/EndToEndTests/Speech/DNN/DiscriminativePreTraining/dnn.txt"
    ]
    SGD = [
        epochSize = 81920
        minibatchSize = 256:512
        learningRatesPerMB = 0.8:1.6
        numMBsToShowResult = 10
        momentumPerSample = 0.999589
        dropoutRate = 0.0
        maxEpochs = 4
        gradUpdateType = "none"
        normWithAveMultiplier = true
        clippingThresholdPerSample = 1#INF
    ]
]

configparameters: cntk_dpt.cntk:timestamping=true
configparameters: cntk_dpt.cntk:traceLevel=1
01/08/2018 04:59:11: Commands: dptPre1 addLayer2 dptPre2 addLayer3 speechTrain
01/08/2018 04:59:11: precision = "float"

01/08/2018 04:59:11: ##############################################################################
01/08/2018 04:59:11: #                                                                            #
01/08/2018 04:59:11: # dptPre1 command (train action)                                             #
01/08/2018 04:59:11: #                                                                            #
01/08/2018 04:59:11: ##############################################################################

01/08/2018 04:59:11: 
Creating virgin network.
NDLBuilder Using GPU 0
SetUniformRandomValue (GPU): creating curand object with seed 1, sizeof(ElemType)==4
reading script file /home/ubuntu/workspace/Tests/EndToEndTests/Speech/Data/glob_0000.scp ... 948 entries
total 132 state names in state list /home/ubuntu/workspace/Tests/EndToEndTests/Speech/Data/state.list
htkmlfreader: reading MLF file /home/ubuntu/workspace/Tests/EndToEndTests/Speech/Data/glob_0000.mlf ... total 948 entries
...............................................................................................feature set 0: 252734 frames in 948 out of 948 utterances
label set 0: 129 classes
minibatchutterancesource: 948 utterances grouped into 3 chunks, av. chunk size: 316.0 utterances, 84244.7 frames
01/08/2018 04:59:11: 
Model has 19 nodes. Using GPU 0.

01/08/2018 04:59:11: Training criterion:   ce = CrossEntropyWithSoftmax
01/08/2018 04:59:11: Evaluation criterion: err = ClassificationError


Allocating matrices for forward and/or backward propagation.

Gradient Memory Aliasing: 2 are aliased.
	OL.t (gradient) reuses OL.z (gradient)

Memory Sharing: Out of 29 matrices, 12 are shared as 3, and 17 are not shared.

Here are the ones that share memory:
	{ HL1.W : [512 x 363] (gradient)
	  HL1.z : [512 x 1 x *]
	  HL1.z : [512 x 1 x *] (gradient)
	  OL.t : [132 x 1 x *]
	  OL.t : [132 x 1 x *] (gradient)
	  OL.z : [132 x 1 x *] (gradient) }
	{ HL1.b : [512 x 1] (gradient)
	  HL1.y : [512 x 1 x *] }
	{ HL1.t : [512 x *]
	  HL1.t : [512 x *] (gradient)
	  HL1.y : [512 x 1 x *] (gradient)
	  OL.z : [132 x 1 x *] }

Here are the ones that don't share memory:
	{scaledLogLikelihood : [132 x 1 x *]}
	{features : [363 x *]}
	{labels : [132 x *]}
	{globalInvStd : [363 x 1]}
	{globalMean : [363 x 1]}
	{globalPrior : [132 x 1]}
	{HL1.W : [512 x 363]}
	{HL1.b : [512 x 1]}
	{OL.W : [132 x 512]}
	{OL.b : [132 x 1]}
	{ce : [1] (gradient)}
	{OL.b : [132 x 1] (gradient)}
	{err : [1]}
	{featNorm : [363 x *]}
	{ce : [1]}
	{logPrior : [132 x 1]}
	{OL.W : [132 x 512] (gradient)}


01/08/2018 04:59:11: Training 254084 parameters in 4 out of 4 parameter tensors and 10 nodes with gradient:

01/08/2018 04:59:11: 	Node 'HL1.W' (LearnableParameter operation) : [512 x 363]
01/08/2018 04:59:11: 	Node 'HL1.b' (LearnableParameter operation) : [512 x 1]
01/08/2018 04:59:11: 	Node 'OL.W' (LearnableParameter operation) : [132 x 512]
01/08/2018 04:59:11: 	Node 'OL.b' (LearnableParameter operation) : [132 x 1]

01/08/2018 04:59:11: No PreCompute nodes found, or all already computed. Skipping pre-computation step.

01/08/2018 04:59:11: Starting Epoch 1: learning rate per sample = 0.003125  effective momentum = 0.900000  momentum as time constant = 2429.8 samples
minibatchiterator: epoch 0: frames [0..81920] (first utterance at frame 0), data subset 0 of 1, with 1 datapasses
requiredata: determined feature kind as 33-dimensional 'USER' with frame shift 10.0 ms

01/08/2018 04:59:12: Starting minibatch loop.
01/08/2018 04:59:12:  Epoch[ 1 of 2]-Minibatch[   1-  10, 3.12%]: ce = 3.77545433 * 2560; err = 0.83984375 * 2560; time = 0.2024s; samplesPerSecond = 12646.0
01/08/2018 04:59:12:  Epoch[ 1 of 2]-Minibatch[  11-  20, 6.25%]: ce = 2.92129173 * 2560; err = 0.69921875 * 2560; time = 0.0484s; samplesPerSecond = 52893.3
01/08/2018 04:59:12:  Epoch[ 1 of 2]-Minibatch[  21-  30, 9.38%]: ce = 2.54243622 * 2560; err = 0.64882812 * 2560; time = 0.0483s; samplesPerSecond = 52984.2
01/08/2018 04:59:12:  Epoch[ 1 of 2]-Minibatch[  31-  40, 12.50%]: ce = 2.20117416 * 2560; err = 0.60156250 * 2560; time = 0.0483s; samplesPerSecond = 53007.6
01/08/2018 04:59:12:  Epoch[ 1 of 2]-Minibatch[  41-  50, 15.62%]: ce = 1.98474121 * 2560; err = 0.55273438 * 2560; time = 0.0478s; samplesPerSecond = 53609.4
01/08/2018 04:59:12:  Epoch[ 1 of 2]-Minibatch[  51-  60, 18.75%]: ce = 1.87129364 * 2560; err = 0.51562500 * 2560; time = 0.0478s; samplesPerSecond = 53611.0
01/08/2018 04:59:12:  Epoch[ 1 of 2]-Minibatch[  61-  70, 21.88%]: ce = 1.83400879 * 2560; err = 0.52812500 * 2560; time = 0.0478s; samplesPerSecond = 53550.9
01/08/2018 04:59:12:  Epoch[ 1 of 2]-Minibatch[  71-  80, 25.00%]: ce = 1.71646271 * 2560; err = 0.49335937 * 2560; time = 0.0480s; samplesPerSecond = 53355.5
01/08/2018 04:59:12:  Epoch[ 1 of 2]-Minibatch[  81-  90, 28.12%]: ce = 1.66541901 * 2560; err = 0.46328125 * 2560; time = 0.0561s; samplesPerSecond = 45671.0
01/08/2018 04:59:12:  Epoch[ 1 of 2]-Minibatch[  91- 100, 31.25%]: ce = 1.57725677 * 2560; err = 0.46054688 * 2560; time = 0.0485s; samplesPerSecond = 52739.4
01/08/2018 04:59:13:  Epoch[ 1 of 2]-Minibatch[ 101- 110, 34.38%]: ce = 1.61621246 * 2560; err = 0.45390625 * 2560; time = 0.0481s; samplesPerSecond = 53230.0
01/08/2018 04:59:13:  Epoch[ 1 of 2]-Minibatch[ 111- 120, 37.50%]: ce = 1.56063843 * 2560; err = 0.44140625 * 2560; time = 0.0487s; samplesPerSecond = 52579.2
01/08/2018 04:59:13:  Epoch[ 1 of 2]-Minibatch[ 121- 130, 40.62%]: ce = 1.52853241 * 2560; err = 0.44492188 * 2560; time = 0.0503s; samplesPerSecond = 50850.9
01/08/2018 04:59:13:  Epoch[ 1 of 2]-Minibatch[ 131- 140, 43.75%]: ce = 1.53461304 * 2560; err = 0.46210937 * 2560; time = 0.0484s; samplesPerSecond = 52864.9
01/08/2018 04:59:13:  Epoch[ 1 of 2]-Minibatch[ 141- 150, 46.88%]: ce = 1.46378479 * 2560; err = 0.44140625 * 2560; time = 0.0477s; samplesPerSecond = 53660.9
01/08/2018 04:59:13:  Epoch[ 1 of 2]-Minibatch[ 151- 160, 50.00%]: ce = 1.43345032 * 2560; err = 0.42617187 * 2560; time = 0.0484s; samplesPerSecond = 52935.7
01/08/2018 04:59:13:  Epoch[ 1 of 2]-Minibatch[ 161- 170, 53.12%]: ce = 1.43222961 * 2560; err = 0.42226562 * 2560; time = 0.0478s; samplesPerSecond = 53592.5
01/08/2018 04:59:13:  Epoch[ 1 of 2]-Minibatch[ 171- 180, 56.25%]: ce = 1.38003845 * 2560; err = 0.41250000 * 2560; time = 0.0475s; samplesPerSecond = 53852.1
01/08/2018 04:59:13:  Epoch[ 1 of 2]-Minibatch[ 181- 190, 59.38%]: ce = 1.35853271 * 2560; err = 0.40039062 * 2560; time = 0.0497s; samplesPerSecond = 51524.4
01/08/2018 04:59:13:  Epoch[ 1 of 2]-Minibatch[ 191- 200, 62.50%]: ce = 1.44864807 * 2560; err = 0.42656250 * 2560; time = 0.0479s; samplesPerSecond = 53432.2
01/08/2018 04:59:13:  Epoch[ 1 of 2]-Minibatch[ 201- 210, 65.62%]: ce = 1.43953552 * 2560; err = 0.42578125 * 2560; time = 0.0480s; samplesPerSecond = 53289.8
01/08/2018 04:59:13:  Epoch[ 1 of 2]-Minibatch[ 211- 220, 68.75%]: ce = 1.41762695 * 2560; err = 0.42617187 * 2560; time = 0.0479s; samplesPerSecond = 53466.4
01/08/2018 04:59:13:  Epoch[ 1 of 2]-Minibatch[ 221- 230, 71.88%]: ce = 1.33197937 * 2560; err = 0.40390625 * 2560; time = 0.0484s; samplesPerSecond = 52864.6
01/08/2018 04:59:13:  Epoch[ 1 of 2]-Minibatch[ 231- 240, 75.00%]: ce = 1.36100464 * 2560; err = 0.40429688 * 2560; time = 0.0484s; samplesPerSecond = 52933.9
01/08/2018 04:59:13:  Epoch[ 1 of 2]-Minibatch[ 241- 250, 78.12%]: ce = 1.30899048 * 2560; err = 0.39648438 * 2560; time = 0.0478s; samplesPerSecond = 53510.7
01/08/2018 04:59:13:  Epoch[ 1 of 2]-Minibatch[ 251- 260, 81.25%]: ce = 1.25351562 * 2560; err = 0.36953125 * 2560; time = 0.0478s; samplesPerSecond = 53553.0
01/08/2018 04:59:13:  Epoch[ 1 of 2]-Minibatch[ 261- 270, 84.38%]: ce = 1.30351257 * 2560; err = 0.39648438 * 2560; time = 0.0482s; samplesPerSecond = 53100.6
01/08/2018 04:59:13:  Epoch[ 1 of 2]-Minibatch[ 271- 280, 87.50%]: ce = 1.36050720 * 2560; err = 0.40898438 * 2560; time = 0.0480s; samplesPerSecond = 53317.9
01/08/2018 04:59:13:  Epoch[ 1 of 2]-Minibatch[ 281- 290, 90.62%]: ce = 1.29572754 * 2560; err = 0.39531250 * 2560; time = 0.0480s; samplesPerSecond = 53377.3
01/08/2018 04:59:13:  Epoch[ 1 of 2]-Minibatch[ 291- 300, 93.75%]: ce = 1.35762024 * 2560; err = 0.40898438 * 2560; time = 0.0515s; samplesPerSecond = 49743.0
01/08/2018 04:59:14:  Epoch[ 1 of 2]-Minibatch[ 301- 310, 96.88%]: ce = 1.32346802 * 2560; err = 0.39843750 * 2560; time = 0.0488s; samplesPerSecond = 52416.5
01/08/2018 04:59:14:  Epoch[ 1 of 2]-Minibatch[ 311- 320, 100.00%]: ce = 1.27053833 * 2560; err = 0.38203125 * 2560; time = 0.0440s; samplesPerSecond = 58138.0
01/08/2018 04:59:14: Finished Epoch[ 1 of 2]: [Training] ce = 1.65219517 * 81920; err = 0.46722412 * 81920; totalSamplesSeen = 81920; learningRatePerSample = 0.003125; epochTime=2.54239s
01/08/2018 04:59:14: SGD: Saving checkpoint model '/tmp/cntk-test-20180108044804.549809/Speech/DNN_DiscriminativePreTraining@debug_gpu/models/Pre1/cntkSpeech.1'

01/08/2018 04:59:14: Starting Epoch 2: learning rate per sample = 0.003125  effective momentum = 0.900000  momentum as time constant = 2429.8 samples
minibatchiterator: epoch 1: frames [81920..163840] (first utterance at frame 81920), data subset 0 of 1, with 1 datapasses

01/08/2018 04:59:14: Starting minibatch loop.
01/08/2018 04:59:14:  Epoch[ 2 of 2]-Minibatch[   1-  10, 3.12%]: ce = 1.24904280 * 2560; err = 0.39492187 * 2560; time = 0.0495s; samplesPerSecond = 51727.1
01/08/2018 04:59:14:  Epoch[ 2 of 2]-Minibatch[  11-  20, 6.25%]: ce = 1.23917685 * 2560; err = 0.36992188 * 2560; time = 0.0486s; samplesPerSecond = 52724.7
01/08/2018 04:59:14:  Epoch[ 2 of 2]-Minibatch[  21-  30, 9.38%]: ce = 1.26081600 * 2560; err = 0.39531250 * 2560; time = 0.0479s; samplesPerSecond = 53423.3
01/08/2018 04:59:14:  Epoch[ 2 of 2]-Minibatch[  31-  40, 12.50%]: ce = 1.26097717 * 2560; err = 0.38281250 * 2560; time = 0.0481s; samplesPerSecond = 53197.8
01/08/2018 04:59:14:  Epoch[ 2 of 2]-Minibatch[  41-  50, 15.62%]: ce = 1.27839279 * 2560; err = 0.36953125 * 2560; time = 0.0485s; samplesPerSecond = 52821.2
01/08/2018 04:59:14:  Epoch[ 2 of 2]-Minibatch[  51-  60, 18.75%]: ce = 1.18358917 * 2560; err = 0.35742188 * 2560; time = 0.0479s; samplesPerSecond = 53472.9
01/08/2018 04:59:14:  Epoch[ 2 of 2]-Minibatch[  61-  70, 21.88%]: ce = 1.19746399 * 2560; err = 0.36992188 * 2560; time = 0.0479s; samplesPerSecond = 53479.7
01/08/2018 04:59:14:  Epoch[ 2 of 2]-Minibatch[  71-  80, 25.00%]: ce = 1.23055496 * 2560; err = 0.37070313 * 2560; time = 0.0480s; samplesPerSecond = 53348.3
01/08/2018 04:59:14:  Epoch[ 2 of 2]-Minibatch[  81-  90, 28.12%]: ce = 1.26867142 * 2560; err = 0.38828125 * 2560; time = 0.0476s; samplesPerSecond = 53819.1
01/08/2018 04:59:14:  Epoch[ 2 of 2]-Minibatch[  91- 100, 31.25%]: ce = 1.24915771 * 2560; err = 0.37500000 * 2560; time = 0.0481s; samplesPerSecond = 53267.4
01/08/2018 04:59:14:  Epoch[ 2 of 2]-Minibatch[ 101- 110, 34.38%]: ce = 1.20246201 * 2560; err = 0.36718750 * 2560; time = 0.0477s; samplesPerSecond = 53619.5
01/08/2018 04:59:14:  Epoch[ 2 of 2]-Minibatch[ 111- 120, 37.50%]: ce = 1.18079071 * 2560; err = 0.36289063 * 2560; time = 0.0478s; samplesPerSecond = 53558.8
01/08/2018 04:59:14:  Epoch[ 2 of 2]-Minibatch[ 121- 130, 40.62%]: ce = 1.16271973 * 2560; err = 0.36523438 * 2560; time = 0.0479s; samplesPerSecond = 53395.4
01/08/2018 04:59:14:  Epoch[ 2 of 2]-Minibatch[ 131- 140, 43.75%]: ce = 1.16420593 * 2560; err = 0.36484375 * 2560; time = 0.0477s; samplesPerSecond = 53692.9
01/08/2018 04:59:14:  Epoch[ 2 of 2]-Minibatch[ 141- 150, 46.88%]: ce = 1.14631195 * 2560; err = 0.34375000 * 2560; time = 0.0478s; samplesPerSecond = 53571.1
01/08/2018 04:59:14:  Epoch[ 2 of 2]-Minibatch[ 151- 160, 50.00%]: ce = 1.11735229 * 2560; err = 0.34609375 * 2560; time = 0.0479s; samplesPerSecond = 53468.6
01/08/2018 04:59:14:  Epoch[ 2 of 2]-Minibatch[ 161- 170, 53.12%]: ce = 1.17672577 * 2560; err = 0.35976562 * 2560; time = 0.0480s; samplesPerSecond = 53346.2
01/08/2018 04:59:14:  Epoch[ 2 of 2]-Minibatch[ 171- 180, 56.25%]: ce = 1.13726349 * 2560; err = 0.35312500 * 2560; time = 0.0479s; samplesPerSecond = 53463.9
01/08/2018 04:59:14:  Epoch[ 2 of 2]-Minibatch[ 181- 190, 59.38%]: ce = 1.14749298 * 2560; err = 0.35390625 * 2560; time = 0.0477s; samplesPerSecond = 53675.3
01/08/2018 04:59:15:  Epoch[ 2 of 2]-Minibatch[ 191- 200, 62.50%]: ce = 1.11475067 * 2560; err = 0.33515625 * 2560; time = 0.0478s; samplesPerSecond = 53519.5
01/08/2018 04:59:15:  Epoch[ 2 of 2]-Minibatch[ 201- 210, 65.62%]: ce = 1.16500397 * 2560; err = 0.35000000 * 2560; time = 0.0484s; samplesPerSecond = 52903.3
01/08/2018 04:59:15:  Epoch[ 2 of 2]-Minibatch[ 211- 220, 68.75%]: ce = 1.14435730 * 2560; err = 0.35234375 * 2560; time = 0.0492s; samplesPerSecond = 52009.9
01/08/2018 04:59:15:  Epoch[ 2 of 2]-Minibatch[ 221- 230, 71.88%]: ce = 1.09438171 * 2560; err = 0.34648438 * 2560; time = 0.0483s; samplesPerSecond = 53011.8
01/08/2018 04:59:15:  Epoch[ 2 of 2]-Minibatch[ 231- 240, 75.00%]: ce = 1.12633362 * 2560; err = 0.34531250 * 2560; time = 0.0477s; samplesPerSecond = 53627.1
01/08/2018 04:59:15:  Epoch[ 2 of 2]-Minibatch[ 241- 250, 78.12%]: ce = 1.09389648 * 2560; err = 0.33867188 * 2560; time = 0.0545s; samplesPerSecond = 46976.5
01/08/2018 04:59:15:  Epoch[ 2 of 2]-Minibatch[ 251- 260, 81.25%]: ce = 1.08799744 * 2560; err = 0.32968750 * 2560; time = 0.0494s; samplesPerSecond = 51868.3
01/08/2018 04:59:15:  Epoch[ 2 of 2]-Minibatch[ 261- 270, 84.38%]: ce = 1.12633667 * 2560; err = 0.33906250 * 2560; time = 0.0511s; samplesPerSecond = 50050.5
01/08/2018 04:59:15:  Epoch[ 2 of 2]-Minibatch[ 271- 280, 87.50%]: ce = 1.12987671 * 2560; err = 0.34375000 * 2560; time = 0.0484s; samplesPerSecond = 52888.8
01/08/2018 04:59:15:  Epoch[ 2 of 2]-Minibatch[ 281- 290, 90.62%]: ce = 1.11752319 * 2560; err = 0.34531250 * 2560; time = 0.0484s; samplesPerSecond = 52919.8
01/08/2018 04:59:15:  Epoch[ 2 of 2]-Minibatch[ 291- 300, 93.75%]: ce = 1.08401489 * 2560; err = 0.32695313 * 2560; time = 0.0479s; samplesPerSecond = 53440.4
01/08/2018 04:59:15:  Epoch[ 2 of 2]-Minibatch[ 301- 310, 96.88%]: ce = 1.08304138 * 2560; err = 0.34492187 * 2560; time = 0.0482s; samplesPerSecond = 53106.7
01/08/2018 04:59:15:  Epoch[ 2 of 2]-Minibatch[ 311- 320, 100.00%]: ce = 1.07171021 * 2560; err = 0.32734375 * 2560; time = 0.0448s; samplesPerSecond = 57083.9
01/08/2018 04:59:15: Finished Epoch[ 2 of 2]: [Training] ce = 1.16538725 * 81920; err = 0.35673828 * 81920; totalSamplesSeen = 163840; learningRatePerSample = 0.003125; epochTime=1.55611s
01/08/2018 04:59:15: SGD: Saving checkpoint model '/tmp/cntk-test-20180108044804.549809/Speech/DNN_DiscriminativePreTraining@debug_gpu/models/Pre1/cntkSpeech'

01/08/2018 04:59:15: Action "train" complete.


01/08/2018 04:59:15: ##############################################################################
01/08/2018 04:59:15: #                                                                            #
01/08/2018 04:59:15: # addLayer2 command (edit action)                                            #
01/08/2018 04:59:15: #                                                                            #
01/08/2018 04:59:15: ##############################################################################


01/08/2018 04:59:15: Action "edit" complete.


01/08/2018 04:59:15: ##############################################################################
01/08/2018 04:59:15: #                                                                            #
01/08/2018 04:59:15: # dptPre2 command (train action)                                             #
01/08/2018 04:59:15: #                                                                            #
01/08/2018 04:59:15: ##############################################################################

01/08/2018 04:59:15: 
Starting from checkpoint. Loading network from '/tmp/cntk-test-20180108044804.549809/Speech/DNN_DiscriminativePreTraining@debug_gpu/models/Pre2/cntkSpeech.0'.
NDLBuilder Using GPU 0
reading script file /home/ubuntu/workspace/Tests/EndToEndTests/Speech/Data/glob_0000.scp ... 948 entries
total 132 state names in state list /home/ubuntu/workspace/Tests/EndToEndTests/Speech/Data/state.list
htkmlfreader: reading MLF file /home/ubuntu/workspace/Tests/EndToEndTests/Speech/Data/glob_0000.mlf ... total 948 entries
...............................................................................................feature set 0: 252734 frames in 948 out of 948 utterances
label set 0: 129 classes
minibatchutterancesource: 948 utterances grouped into 3 chunks, av. chunk size: 316.0 utterances, 84244.7 frames
01/08/2018 04:59:15: 
Model has 24 nodes. Using GPU 0.

01/08/2018 04:59:15: Training criterion:   ce = CrossEntropyWithSoftmax
01/08/2018 04:59:15: Evaluation criterion: err = ClassificationError

01/08/2018 04:59:15: Training 516740 parameters in 6 out of 6 parameter tensors and 15 nodes with gradient:

01/08/2018 04:59:15: 	Node 'HL1.W' (LearnableParameter operation) : [512 x 363]
01/08/2018 04:59:15: 	Node 'HL1.b' (LearnableParameter operation) : [512 x 1]
01/08/2018 04:59:15: 	Node 'HL2.W' (LearnableParameter operation) : [512 x 512]
01/08/2018 04:59:15: 	Node 'HL2.b' (LearnableParameter operation) : [512 x 1]
01/08/2018 04:59:15: 	Node 'OL.W' (LearnableParameter operation) : [132 x 512]
01/08/2018 04:59:15: 	Node 'OL.b' (LearnableParameter operation) : [132 x 1]

01/08/2018 04:59:15: No PreCompute nodes found, or all already computed. Skipping pre-computation step.

01/08/2018 04:59:15: Starting Epoch 1: learning rate per sample = 0.003125  effective momentum = 0.900000  momentum as time constant = 2429.8 samples
minibatchiterator: epoch 0: frames [0..81920] (first utterance at frame 0), data subset 0 of 1, with 1 datapasses
requiredata: determined feature kind as 33-dimensional 'USER' with frame shift 10.0 ms

01/08/2018 04:59:16: Starting minibatch loop.
01/08/2018 04:59:16:  Epoch[ 1 of 2]-Minibatch[   1-  10, 3.12%]: ce = 3.95232048 * 2560; err = 0.81835938 * 2560; time = 0.0589s; samplesPerSecond = 43476.5
01/08/2018 04:59:16:  Epoch[ 1 of 2]-Minibatch[  11-  20, 6.25%]: ce = 2.59509544 * 2560; err = 0.63632813 * 2560; time = 0.0561s; samplesPerSecond = 45593.5
01/08/2018 04:59:16:  Epoch[ 1 of 2]-Minibatch[  21-  30, 9.38%]: ce = 2.15305252 * 2560; err = 0.58046875 * 2560; time = 0.0560s; samplesPerSecond = 45754.0
01/08/2018 04:59:16:  Epoch[ 1 of 2]-Minibatch[  31-  40, 12.50%]: ce = 1.80730286 * 2560; err = 0.50039062 * 2560; time = 0.0558s; samplesPerSecond = 45873.3
01/08/2018 04:59:17:  Epoch[ 1 of 2]-Minibatch[  41-  50, 15.62%]: ce = 1.62435303 * 2560; err = 0.47460938 * 2560; time = 0.0636s; samplesPerSecond = 40281.1
01/08/2018 04:59:17:  Epoch[ 1 of 2]-Minibatch[  51-  60, 18.75%]: ce = 1.57792664 * 2560; err = 0.45468750 * 2560; time = 0.0602s; samplesPerSecond = 42525.3
01/08/2018 04:59:17:  Epoch[ 1 of 2]-Minibatch[  61-  70, 21.88%]: ce = 1.57253876 * 2560; err = 0.46523437 * 2560; time = 0.0615s; samplesPerSecond = 41638.9
01/08/2018 04:59:17:  Epoch[ 1 of 2]-Minibatch[  71-  80, 25.00%]: ce = 1.49025574 * 2560; err = 0.45156250 * 2560; time = 0.0600s; samplesPerSecond = 42698.4
01/08/2018 04:59:17:  Epoch[ 1 of 2]-Minibatch[  81-  90, 28.12%]: ce = 1.43519135 * 2560; err = 0.41289063 * 2560; time = 0.0564s; samplesPerSecond = 45392.8
01/08/2018 04:59:17:  Epoch[ 1 of 2]-Minibatch[  91- 100, 31.25%]: ce = 1.39548492 * 2560; err = 0.41093750 * 2560; time = 0.0625s; samplesPerSecond = 40948.3
01/08/2018 04:59:17:  Epoch[ 1 of 2]-Minibatch[ 101- 110, 34.38%]: ce = 1.40931549 * 2560; err = 0.40351562 * 2560; time = 0.0608s; samplesPerSecond = 42117.5
01/08/2018 04:59:17:  Epoch[ 1 of 2]-Minibatch[ 111- 120, 37.50%]: ce = 1.35583801 * 2560; err = 0.39492187 * 2560; time = 0.0578s; samplesPerSecond = 44252.4
01/08/2018 04:59:17:  Epoch[ 1 of 2]-Minibatch[ 121- 130, 40.62%]: ce = 1.31971741 * 2560; err = 0.38828125 * 2560; time = 0.0561s; samplesPerSecond = 45597.7
01/08/2018 04:59:17:  Epoch[ 1 of 2]-Minibatch[ 131- 140, 43.75%]: ce = 1.33088074 * 2560; err = 0.40664062 * 2560; time = 0.0557s; samplesPerSecond = 45931.6
01/08/2018 04:59:17:  Epoch[ 1 of 2]-Minibatch[ 141- 150, 46.88%]: ce = 1.27847748 * 2560; err = 0.38242188 * 2560; time = 0.0557s; samplesPerSecond = 45925.0
01/08/2018 04:59:17:  Epoch[ 1 of 2]-Minibatch[ 151- 160, 50.00%]: ce = 1.28628845 * 2560; err = 0.39296875 * 2560; time = 0.0554s; samplesPerSecond = 46203.0
01/08/2018 04:59:17:  Epoch[ 1 of 2]-Minibatch[ 161- 170, 53.12%]: ce = 1.29282837 * 2560; err = 0.37734375 * 2560; time = 0.0555s; samplesPerSecond = 46140.9
01/08/2018 04:59:17:  Epoch[ 1 of 2]-Minibatch[ 171- 180, 56.25%]: ce = 1.26449585 * 2560; err = 0.38867188 * 2560; time = 0.0552s; samplesPerSecond = 46361.2
01/08/2018 04:59:17:  Epoch[ 1 of 2]-Minibatch[ 181- 190, 59.38%]: ce = 1.28384094 * 2560; err = 0.38828125 * 2560; time = 0.0554s; samplesPerSecond = 46190.0
01/08/2018 04:59:17:  Epoch[ 1 of 2]-Minibatch[ 191- 200, 62.50%]: ce = 1.32117004 * 2560; err = 0.40000000 * 2560; time = 0.0555s; samplesPerSecond = 46126.2
01/08/2018 04:59:17:  Epoch[ 1 of 2]-Minibatch[ 201- 210, 65.62%]: ce = 1.30416870 * 2560; err = 0.38085938 * 2560; time = 0.0558s; samplesPerSecond = 45847.7
01/08/2018 04:59:17:  Epoch[ 1 of 2]-Minibatch[ 211- 220, 68.75%]: ce = 1.31772766 * 2560; err = 0.39765625 * 2560; time = 0.0555s; samplesPerSecond = 46128.2
01/08/2018 04:59:18:  Epoch[ 1 of 2]-Minibatch[ 221- 230, 71.88%]: ce = 1.24123840 * 2560; err = 0.37148437 * 2560; time = 0.0557s; samplesPerSecond = 45988.1
01/08/2018 04:59:18:  Epoch[ 1 of 2]-Minibatch[ 231- 240, 75.00%]: ce = 1.26621399 * 2560; err = 0.38476562 * 2560; time = 0.0560s; samplesPerSecond = 45698.2
01/08/2018 04:59:18:  Epoch[ 1 of 2]-Minibatch[ 241- 250, 78.12%]: ce = 1.23011169 * 2560; err = 0.37031250 * 2560; time = 0.0561s; samplesPerSecond = 45603.2
01/08/2018 04:59:18:  Epoch[ 1 of 2]-Minibatch[ 251- 260, 81.25%]: ce = 1.19255066 * 2560; err = 0.35820313 * 2560; time = 0.0560s; samplesPerSecond = 45685.4
01/08/2018 04:59:18:  Epoch[ 1 of 2]-Minibatch[ 261- 270, 84.38%]: ce = 1.20788269 * 2560; err = 0.36914062 * 2560; time = 0.0555s; samplesPerSecond = 46101.9
01/08/2018 04:59:18:  Epoch[ 1 of 2]-Minibatch[ 271- 280, 87.50%]: ce = 1.24570618 * 2560; err = 0.37656250 * 2560; time = 0.0558s; samplesPerSecond = 45893.5
01/08/2018 04:59:18:  Epoch[ 1 of 2]-Minibatch[ 281- 290, 90.62%]: ce = 1.17422485 * 2560; err = 0.34257813 * 2560; time = 0.0552s; samplesPerSecond = 46408.2
01/08/2018 04:59:18:  Epoch[ 1 of 2]-Minibatch[ 291- 300, 93.75%]: ce = 1.17809753 * 2560; err = 0.35312500 * 2560; time = 0.0581s; samplesPerSecond = 44049.8
01/08/2018 04:59:18:  Epoch[ 1 of 2]-Minibatch[ 301- 310, 96.88%]: ce = 1.19910583 * 2560; err = 0.35625000 * 2560; time = 0.0566s; samplesPerSecond = 45200.4
01/08/2018 04:59:18:  Epoch[ 1 of 2]-Minibatch[ 311- 320, 100.00%]: ce = 1.15553284 * 2560; err = 0.34570312 * 2560; time = 0.0510s; samplesPerSecond = 50193.5
01/08/2018 04:59:18: Finished Epoch[ 1 of 2]: [Training] ce = 1.48309174 * 81920; err = 0.42297363 * 81920; totalSamplesSeen = 81920; learningRatePerSample = 0.003125; epochTime=2.70551s
01/08/2018 04:59:18: SGD: Saving checkpoint model '/tmp/cntk-test-20180108044804.549809/Speech/DNN_DiscriminativePreTraining@debug_gpu/models/Pre2/cntkSpeech.1'

01/08/2018 04:59:18: Starting Epoch 2: learning rate per sample = 0.003125  effective momentum = 0.900000  momentum as time constant = 2429.8 samples
minibatchiterator: epoch 1: frames [81920..163840] (first utterance at frame 81920), data subset 0 of 1, with 1 datapasses

01/08/2018 04:59:18: Starting minibatch loop.
01/08/2018 04:59:18:  Epoch[ 2 of 2]-Minibatch[   1-  10, 3.12%]: ce = 1.16412601 * 2560; err = 0.36210938 * 2560; time = 0.0569s; samplesPerSecond = 44953.5
01/08/2018 04:59:18:  Epoch[ 2 of 2]-Minibatch[  11-  20, 6.25%]: ce = 1.18867970 * 2560; err = 0.35742188 * 2560; time = 0.0558s; samplesPerSecond = 45887.8
01/08/2018 04:59:18:  Epoch[ 2 of 2]-Minibatch[  21-  30, 9.38%]: ce = 1.15690613 * 2560; err = 0.35625000 * 2560; time = 0.0553s; samplesPerSecond = 46258.7
01/08/2018 04:59:18:  Epoch[ 2 of 2]-Minibatch[  31-  40, 12.50%]: ce = 1.15283051 * 2560; err = 0.35390625 * 2560; time = 0.0555s; samplesPerSecond = 46155.9
01/08/2018 04:59:18:  Epoch[ 2 of 2]-Minibatch[  41-  50, 15.62%]: ce = 1.19624062 * 2560; err = 0.35000000 * 2560; time = 0.0557s; samplesPerSecond = 45992.5
01/08/2018 04:59:18:  Epoch[ 2 of 2]-Minibatch[  51-  60, 18.75%]: ce = 1.13569336 * 2560; err = 0.35000000 * 2560; time = 0.0557s; samplesPerSecond = 45964.1
01/08/2018 04:59:18:  Epoch[ 2 of 2]-Minibatch[  61-  70, 21.88%]: ce = 1.14269714 * 2560; err = 0.35390625 * 2560; time = 0.0577s; samplesPerSecond = 44372.9
01/08/2018 04:59:19:  Epoch[ 2 of 2]-Minibatch[  71-  80, 25.00%]: ce = 1.17199554 * 2560; err = 0.36562500 * 2560; time = 0.0645s; samplesPerSecond = 39716.6
01/08/2018 04:59:19:  Epoch[ 2 of 2]-Minibatch[  81-  90, 28.12%]: ce = 1.17918625 * 2560; err = 0.36679688 * 2560; time = 0.0651s; samplesPerSecond = 39346.9
01/08/2018 04:59:19:  Epoch[ 2 of 2]-Minibatch[  91- 100, 31.25%]: ce = 1.19158630 * 2560; err = 0.36484375 * 2560; time = 0.0572s; samplesPerSecond = 44721.9
01/08/2018 04:59:19:  Epoch[ 2 of 2]-Minibatch[ 101- 110, 34.38%]: ce = 1.14164963 * 2560; err = 0.34414062 * 2560; time = 0.0600s; samplesPerSecond = 42664.0
01/08/2018 04:59:19:  Epoch[ 2 of 2]-Minibatch[ 111- 120, 37.50%]: ce = 1.13930664 * 2560; err = 0.34257813 * 2560; time = 0.0573s; samplesPerSecond = 44714.5
01/08/2018 04:59:19:  Epoch[ 2 of 2]-Minibatch[ 121- 130, 40.62%]: ce = 1.09886627 * 2560; err = 0.33906250 * 2560; time = 0.0580s; samplesPerSecond = 44137.2
01/08/2018 04:59:19:  Epoch[ 2 of 2]-Minibatch[ 131- 140, 43.75%]: ce = 1.12534027 * 2560; err = 0.34882812 * 2560; time = 0.0566s; samplesPerSecond = 45192.6
01/08/2018 04:59:19:  Epoch[ 2 of 2]-Minibatch[ 141- 150, 46.88%]: ce = 1.10109558 * 2560; err = 0.33359375 * 2560; time = 0.0575s; samplesPerSecond = 44545.0
01/08/2018 04:59:19:  Epoch[ 2 of 2]-Minibatch[ 151- 160, 50.00%]: ce = 1.08001862 * 2560; err = 0.34101562 * 2560; time = 0.0556s; samplesPerSecond = 46011.0
01/08/2018 04:59:19:  Epoch[ 2 of 2]-Minibatch[ 161- 170, 53.12%]: ce = 1.12076874 * 2560; err = 0.33359375 * 2560; time = 0.0555s; samplesPerSecond = 46162.5
01/08/2018 04:59:19:  Epoch[ 2 of 2]-Minibatch[ 171- 180, 56.25%]: ce = 1.07955017 * 2560; err = 0.33476563 * 2560; time = 0.0570s; samplesPerSecond = 44927.9
01/08/2018 04:59:19:  Epoch[ 2 of 2]-Minibatch[ 181- 190, 59.38%]: ce = 1.11439514 * 2560; err = 0.34531250 * 2560; time = 0.0557s; samplesPerSecond = 45921.2
01/08/2018 04:59:19:  Epoch[ 2 of 2]-Minibatch[ 191- 200, 62.50%]: ce = 1.08090973 * 2560; err = 0.32578125 * 2560; time = 0.0553s; samplesPerSecond = 46309.7
01/08/2018 04:59:19:  Epoch[ 2 of 2]-Minibatch[ 201- 210, 65.62%]: ce = 1.12362366 * 2560; err = 0.33281250 * 2560; time = 0.0552s; samplesPerSecond = 46344.3
01/08/2018 04:59:19:  Epoch[ 2 of 2]-Minibatch[ 211- 220, 68.75%]: ce = 1.09581451 * 2560; err = 0.33554688 * 2560; time = 0.0552s; samplesPerSecond = 46397.7
01/08/2018 04:59:19:  Epoch[ 2 of 2]-Minibatch[ 221- 230, 71.88%]: ce = 1.04845886 * 2560; err = 0.32968750 * 2560; time = 0.0555s; samplesPerSecond = 46118.6
01/08/2018 04:59:19:  Epoch[ 2 of 2]-Minibatch[ 231- 240, 75.00%]: ce = 1.09396973 * 2560; err = 0.33945313 * 2560; time = 0.0551s; samplesPerSecond = 46454.8
01/08/2018 04:59:20:  Epoch[ 2 of 2]-Minibatch[ 241- 250, 78.12%]: ce = 1.09650269 * 2560; err = 0.34140625 * 2560; time = 0.0555s; samplesPerSecond = 46121.1
01/08/2018 04:59:20:  Epoch[ 2 of 2]-Minibatch[ 251- 260, 81.25%]: ce = 1.07186279 * 2560; err = 0.32734375 * 2560; time = 0.0555s; samplesPerSecond = 46134.6
01/08/2018 04:59:20:  Epoch[ 2 of 2]-Minibatch[ 261- 270, 84.38%]: ce = 1.11242065 * 2560; err = 0.34296875 * 2560; time = 0.0561s; samplesPerSecond = 45629.5
01/08/2018 04:59:20:  Epoch[ 2 of 2]-Minibatch[ 271- 280, 87.50%]: ce = 1.09167480 * 2560; err = 0.33437500 * 2560; time = 0.0559s; samplesPerSecond = 45809.2
01/08/2018 04:59:20:  Epoch[ 2 of 2]-Minibatch[ 281- 290, 90.62%]: ce = 1.10017090 * 2560; err = 0.34414062 * 2560; time = 0.0554s; samplesPerSecond = 46178.1
01/08/2018 04:59:20:  Epoch[ 2 of 2]-Minibatch[ 291- 300, 93.75%]: ce = 1.09057312 * 2560; err = 0.32578125 * 2560; time = 0.0559s; samplesPerSecond = 45783.8
01/08/2018 04:59:20:  Epoch[ 2 of 2]-Minibatch[ 301- 310, 96.88%]: ce = 1.08012695 * 2560; err = 0.33281250 * 2560; time = 0.0553s; samplesPerSecond = 46268.7
01/08/2018 04:59:20:  Epoch[ 2 of 2]-Minibatch[ 311- 320, 100.00%]: ce = 1.07658386 * 2560; err = 0.33593750 * 2560; time = 0.0509s; samplesPerSecond = 50280.7
01/08/2018 04:59:20: Finished Epoch[ 2 of 2]: [Training] ce = 1.12011328 * 81920; err = 0.34349365 * 81920; totalSamplesSeen = 163840; learningRatePerSample = 0.003125; epochTime=1.81905s
01/08/2018 04:59:20: SGD: Saving checkpoint model '/tmp/cntk-test-20180108044804.549809/Speech/DNN_DiscriminativePreTraining@debug_gpu/models/Pre2/cntkSpeech'

01/08/2018 04:59:20: Action "train" complete.


01/08/2018 04:59:20: ##############################################################################
01/08/2018 04:59:20: #                                                                            #
01/08/2018 04:59:20: # addLayer3 command (edit action)                                            #
01/08/2018 04:59:20: #                                                                            #
01/08/2018 04:59:20: ##############################################################################


01/08/2018 04:59:20: Action "edit" complete.


01/08/2018 04:59:20: ##############################################################################
01/08/2018 04:59:20: #                                                                            #
01/08/2018 04:59:20: # speechTrain command (train action)                                         #
01/08/2018 04:59:20: #                                                                            #
01/08/2018 04:59:20: ##############################################################################

01/08/2018 04:59:20: 
Starting from checkpoint. Loading network from '/tmp/cntk-test-20180108044804.549809/Speech/DNN_DiscriminativePreTraining@debug_gpu/models/cntkSpeech.0'.
NDLBuilder Using GPU 0
reading script file /home/ubuntu/workspace/Tests/EndToEndTests/Speech/Data/glob_0000.scp ... 948 entries
total 132 state names in state list /home/ubuntu/workspace/Tests/EndToEndTests/Speech/Data/state.list
htkmlfreader: reading MLF file /home/ubuntu/workspace/Tests/EndToEndTests/Speech/Data/glob_0000.mlf ... total 948 entries
...............................................................................................feature set 0: 252734 frames in 948 out of 948 utterances
label set 0: 129 classes
minibatchutterancesource: 948 utterances grouped into 3 chunks, av. chunk size: 316.0 utterances, 84244.7 frames
01/08/2018 04:59:20: 
Model has 29 nodes. Using GPU 0.

01/08/2018 04:59:20: Training criterion:   ce = CrossEntropyWithSoftmax
01/08/2018 04:59:20: Evaluation criterion: err = ClassificationError

01/08/2018 04:59:20: Training 779396 parameters in 8 out of 8 parameter tensors and 20 nodes with gradient:

01/08/2018 04:59:20: 	Node 'HL1.W' (LearnableParameter operation) : [512 x 363]
01/08/2018 04:59:20: 	Node 'HL1.b' (LearnableParameter operation) : [512 x 1]
01/08/2018 04:59:20: 	Node 'HL2.W' (LearnableParameter operation) : [512 x 512]
01/08/2018 04:59:20: 	Node 'HL2.b' (LearnableParameter operation) : [512 x 1]
01/08/2018 04:59:20: 	Node 'HL3.W' (LearnableParameter operation) : [512 x 512]
01/08/2018 04:59:20: 	Node 'HL3.b' (LearnableParameter operation) : [512 x 1]
01/08/2018 04:59:20: 	Node 'OL.W' (LearnableParameter operation) : [132 x 512]
01/08/2018 04:59:20: 	Node 'OL.b' (LearnableParameter operation) : [132 x 1]

01/08/2018 04:59:20: No PreCompute nodes found, or all already computed. Skipping pre-computation step.

01/08/2018 04:59:20: Starting Epoch 1: learning rate per sample = 0.003125  effective momentum = 0.900117  momentum as time constant = 2432.7 samples
minibatchiterator: epoch 0: frames [0..81920] (first utterance at frame 0), data subset 0 of 1, with 1 datapasses
requiredata: determined feature kind as 33-dimensional 'USER' with frame shift 10.0 ms

01/08/2018 04:59:21: Starting minibatch loop.
01/08/2018 04:59:21:  Epoch[ 1 of 4]-Minibatch[   1-  10, 3.12%]: ce = 4.04514732 * 2560; err = 0.84101563 * 2560; time = 0.0673s; samplesPerSecond = 38050.8
01/08/2018 04:59:21:  Epoch[ 1 of 4]-Minibatch[  11-  20, 6.25%]: ce = 2.57487679 * 2560; err = 0.61679688 * 2560; time = 0.0641s; samplesPerSecond = 39947.1
01/08/2018 04:59:21:  Epoch[ 1 of 4]-Minibatch[  21-  30, 9.38%]: ce = 2.06998596 * 2560; err = 0.56523437 * 2560; time = 0.0632s; samplesPerSecond = 40480.4
01/08/2018 04:59:21:  Epoch[ 1 of 4]-Minibatch[  31-  40, 12.50%]: ce = 1.69130554 * 2560; err = 0.47031250 * 2560; time = 0.0634s; samplesPerSecond = 40381.4
01/08/2018 04:59:21:  Epoch[ 1 of 4]-Minibatch[  41-  50, 15.62%]: ce = 1.51569901 * 2560; err = 0.43671875 * 2560; time = 0.0634s; samplesPerSecond = 40407.0
01/08/2018 04:59:21:  Epoch[ 1 of 4]-Minibatch[  51-  60, 18.75%]: ce = 1.45793076 * 2560; err = 0.41914062 * 2560; time = 0.0637s; samplesPerSecond = 40180.0
01/08/2018 04:59:21:  Epoch[ 1 of 4]-Minibatch[  61-  70, 21.88%]: ce = 1.46249542 * 2560; err = 0.43203125 * 2560; time = 0.0635s; samplesPerSecond = 40296.3
01/08/2018 04:59:22:  Epoch[ 1 of 4]-Minibatch[  71-  80, 25.00%]: ce = 1.37602081 * 2560; err = 0.40351562 * 2560; time = 0.0633s; samplesPerSecond = 40460.5
01/08/2018 04:59:22:  Epoch[ 1 of 4]-Minibatch[  81-  90, 28.12%]: ce = 1.32697144 * 2560; err = 0.38632813 * 2560; time = 0.0634s; samplesPerSecond = 40352.6
01/08/2018 04:59:22:  Epoch[ 1 of 4]-Minibatch[  91- 100, 31.25%]: ce = 1.28866119 * 2560; err = 0.37617187 * 2560; time = 0.0639s; samplesPerSecond = 40062.3
01/08/2018 04:59:22:  Epoch[ 1 of 4]-Minibatch[ 101- 110, 34.38%]: ce = 1.31844482 * 2560; err = 0.38437500 * 2560; time = 0.0640s; samplesPerSecond = 39973.8
01/08/2018 04:59:22:  Epoch[ 1 of 4]-Minibatch[ 111- 120, 37.50%]: ce = 1.27721405 * 2560; err = 0.36992188 * 2560; time = 0.0640s; samplesPerSecond = 39981.0
01/08/2018 04:59:22:  Epoch[ 1 of 4]-Minibatch[ 121- 130, 40.62%]: ce = 1.24465790 * 2560; err = 0.37382813 * 2560; time = 0.0633s; samplesPerSecond = 40422.2
01/08/2018 04:59:22:  Epoch[ 1 of 4]-Minibatch[ 131- 140, 43.75%]: ce = 1.25987854 * 2560; err = 0.38710937 * 2560; time = 0.0633s; samplesPerSecond = 40453.9
01/08/2018 04:59:22:  Epoch[ 1 of 4]-Minibatch[ 141- 150, 46.88%]: ce = 1.20045929 * 2560; err = 0.35976562 * 2560; time = 0.0641s; samplesPerSecond = 39961.5
01/08/2018 04:59:22:  Epoch[ 1 of 4]-Minibatch[ 151- 160, 50.00%]: ce = 1.22033539 * 2560; err = 0.36914062 * 2560; time = 0.0634s; samplesPerSecond = 40357.4
01/08/2018 04:59:22:  Epoch[ 1 of 4]-Minibatch[ 161- 170, 53.12%]: ce = 1.21545715 * 2560; err = 0.36093750 * 2560; time = 0.0634s; samplesPerSecond = 40353.0
01/08/2018 04:59:22:  Epoch[ 1 of 4]-Minibatch[ 171- 180, 56.25%]: ce = 1.19536133 * 2560; err = 0.36406250 * 2560; time = 0.0636s; samplesPerSecond = 40250.1
01/08/2018 04:59:22:  Epoch[ 1 of 4]-Minibatch[ 181- 190, 59.38%]: ce = 1.21321716 * 2560; err = 0.36796875 * 2560; time = 0.0657s; samplesPerSecond = 38940.6
01/08/2018 04:59:22:  Epoch[ 1 of 4]-Minibatch[ 191- 200, 62.50%]: ce = 1.25707092 * 2560; err = 0.38085938 * 2560; time = 0.0633s; samplesPerSecond = 40432.4
01/08/2018 04:59:22:  Epoch[ 1 of 4]-Minibatch[ 201- 210, 65.62%]: ce = 1.25220337 * 2560; err = 0.36484375 * 2560; time = 0.0631s; samplesPerSecond = 40563.0
01/08/2018 04:59:22:  Epoch[ 1 of 4]-Minibatch[ 211- 220, 68.75%]: ce = 1.25466614 * 2560; err = 0.38789062 * 2560; time = 0.0639s; samplesPerSecond = 40037.8
01/08/2018 04:59:22:  Epoch[ 1 of 4]-Minibatch[ 221- 230, 71.88%]: ce = 1.18672180 * 2560; err = 0.35429688 * 2560; time = 0.0634s; samplesPerSecond = 40400.9
01/08/2018 04:59:23:  Epoch[ 1 of 4]-Minibatch[ 231- 240, 75.00%]: ce = 1.21309814 * 2560; err = 0.37539062 * 2560; time = 0.0752s; samplesPerSecond = 34065.1
01/08/2018 04:59:23:  Epoch[ 1 of 4]-Minibatch[ 241- 250, 78.12%]: ce = 1.18207397 * 2560; err = 0.35585937 * 2560; time = 0.0686s; samplesPerSecond = 37323.4
01/08/2018 04:59:23:  Epoch[ 1 of 4]-Minibatch[ 251- 260, 81.25%]: ce = 1.14777222 * 2560; err = 0.34140625 * 2560; time = 0.0648s; samplesPerSecond = 39522.3
01/08/2018 04:59:23:  Epoch[ 1 of 4]-Minibatch[ 261- 270, 84.38%]: ce = 1.13528748 * 2560; err = 0.35156250 * 2560; time = 0.0663s; samplesPerSecond = 38613.3
01/08/2018 04:59:23:  Epoch[ 1 of 4]-Minibatch[ 271- 280, 87.50%]: ce = 1.19689026 * 2560; err = 0.36328125 * 2560; time = 0.0656s; samplesPerSecond = 39043.1
01/08/2018 04:59:23:  Epoch[ 1 of 4]-Minibatch[ 281- 290, 90.62%]: ce = 1.13403015 * 2560; err = 0.33554688 * 2560; time = 0.0642s; samplesPerSecond = 39875.0
01/08/2018 04:59:23:  Epoch[ 1 of 4]-Minibatch[ 291- 300, 93.75%]: ce = 1.14353638 * 2560; err = 0.35273437 * 2560; time = 0.0636s; samplesPerSecond = 40220.8
01/08/2018 04:59:23:  Epoch[ 1 of 4]-Minibatch[ 301- 310, 96.88%]: ce = 1.15669556 * 2560; err = 0.34765625 * 2560; time = 0.0643s; samplesPerSecond = 39797.5
01/08/2018 04:59:23:  Epoch[ 1 of 4]-Minibatch[ 311- 320, 100.00%]: ce = 1.10986328 * 2560; err = 0.33359375 * 2560; time = 0.0592s; samplesPerSecond = 43237.5
01/08/2018 04:59:23: Finished Epoch[ 1 of 4]: [Training] ce = 1.41637592 * 81920; err = 0.40404053 * 81920; totalSamplesSeen = 81920; learningRatePerSample = 0.003125; epochTime=2.89876s
01/08/2018 04:59:23: SGD: Saving checkpoint model '/tmp/cntk-test-20180108044804.549809/Speech/DNN_DiscriminativePreTraining@debug_gpu/models/cntkSpeech.1'

01/08/2018 04:59:23: Starting Epoch 2: learning rate per sample = 0.003125  effective momentum = 0.810210  momentum as time constant = 2432.7 samples
minibatchiterator: epoch 1: frames [81920..163840] (first utterance at frame 81920), data subset 0 of 1, with 1 datapasses

01/08/2018 04:59:23: Starting minibatch loop.
01/08/2018 04:59:23:  Epoch[ 2 of 4]-Minibatch[   1-  10, 6.25%]: ce = 1.26108437 * 5120; err = 0.37500000 * 5120; time = 0.1006s; samplesPerSecond = 50876.2
01/08/2018 04:59:23:  Epoch[ 2 of 4]-Minibatch[  11-  20, 12.50%]: ce = 1.40673923 * 5120; err = 0.40703125 * 5120; time = 0.0971s; samplesPerSecond = 52706.5
01/08/2018 04:59:23:  Epoch[ 2 of 4]-Minibatch[  21-  30, 18.75%]: ce = 1.22149639 * 5120; err = 0.35937500 * 5120; time = 0.0966s; samplesPerSecond = 52988.6
01/08/2018 04:59:24:  Epoch[ 2 of 4]-Minibatch[  31-  40, 25.00%]: ce = 1.13947525 * 5120; err = 0.35195312 * 5120; time = 0.0972s; samplesPerSecond = 52676.9
01/08/2018 04:59:24:  Epoch[ 2 of 4]-Minibatch[  41-  50, 31.25%]: ce = 1.15874138 * 5120; err = 0.35000000 * 5120; time = 0.0964s; samplesPerSecond = 53097.7
01/08/2018 04:59:24:  Epoch[ 2 of 4]-Minibatch[  51-  60, 37.50%]: ce = 1.13039322 * 5120; err = 0.33984375 * 5120; time = 0.0994s; samplesPerSecond = 51525.1
01/08/2018 04:59:24:  Epoch[ 2 of 4]-Minibatch[  61-  70, 43.75%]: ce = 1.10522003 * 5120; err = 0.34609375 * 5120; time = 0.1018s; samplesPerSecond = 50309.4
01/08/2018 04:59:24:  Epoch[ 2 of 4]-Minibatch[  71-  80, 50.00%]: ce = 1.07494049 * 5120; err = 0.33437500 * 5120; time = 0.0973s; samplesPerSecond = 52607.4
01/08/2018 04:59:24:  Epoch[ 2 of 4]-Minibatch[  81-  90, 56.25%]: ce = 1.08323288 * 5120; err = 0.32832031 * 5120; time = 0.0966s; samplesPerSecond = 52995.8
01/08/2018 04:59:24:  Epoch[ 2 of 4]-Minibatch[  91- 100, 62.50%]: ce = 1.10492630 * 5120; err = 0.35058594 * 5120; time = 0.0979s; samplesPerSecond = 52324.6
01/08/2018 04:59:24:  Epoch[ 2 of 4]-Minibatch[ 101- 110, 68.75%]: ce = 1.09108047 * 5120; err = 0.32636719 * 5120; time = 0.0971s; samplesPerSecond = 52723.4
01/08/2018 04:59:24:  Epoch[ 2 of 4]-Minibatch[ 111- 120, 75.00%]: ce = 1.06803894 * 5120; err = 0.33242187 * 5120; time = 0.0968s; samplesPerSecond = 52871.6
01/08/2018 04:59:24:  Epoch[ 2 of 4]-Minibatch[ 121- 130, 81.25%]: ce = 1.07306976 * 5120; err = 0.33281250 * 5120; time = 0.0972s; samplesPerSecond = 52690.0
01/08/2018 04:59:25:  Epoch[ 2 of 4]-Minibatch[ 131- 140, 87.50%]: ce = 1.20960236 * 5120; err = 0.36835937 * 5120; time = 0.0963s; samplesPerSecond = 53165.0
01/08/2018 04:59:25:  Epoch[ 2 of 4]-Minibatch[ 141- 150, 93.75%]: ce = 1.07773590 * 5120; err = 0.32851562 * 5120; time = 0.0973s; samplesPerSecond = 52626.1
01/08/2018 04:59:25:  Epoch[ 2 of 4]-Minibatch[ 151- 160, 100.00%]: ce = 1.05148163 * 5120; err = 0.31875000 * 5120; time = 0.0884s; samplesPerSecond = 57941.8
01/08/2018 04:59:25: Finished Epoch[ 2 of 4]: [Training] ce = 1.14107866 * 81920; err = 0.34686279 * 81920; totalSamplesSeen = 163840; learningRatePerSample = 0.003125; epochTime=1.56898s
01/08/2018 04:59:25: SGD: Saving checkpoint model '/tmp/cntk-test-20180108044804.549809/Speech/DNN_DiscriminativePreTraining@debug_gpu/models/cntkSpeech.2'

01/08/2018 04:59:25: Starting Epoch 3: learning rate per sample = 0.003125  effective momentum = 0.810210  momentum as time constant = 2432.7 samples
minibatchiterator: epoch 2: frames [163840..245760] (first utterance at frame 163840), data subset 0 of 1, with 1 datapasses

01/08/2018 04:59:25: Starting minibatch loop.
01/08/2018 04:59:25:  Epoch[ 3 of 4]-Minibatch[   1-  10, 6.25%]: ce = 1.07412243 * 5120; err = 0.33593750 * 5120; time = 0.0980s; samplesPerSecond = 52267.9
01/08/2018 04:59:25:  Epoch[ 3 of 4]-Minibatch[  11-  20, 12.50%]: ce = 1.09550304 * 5120; err = 0.33242187 * 5120; time = 0.0973s; samplesPerSecond = 52625.2
01/08/2018 04:59:25:  Epoch[ 3 of 4]-Minibatch[  21-  30, 18.75%]: ce = 1.08694725 * 5120; err = 0.33945313 * 5120; time = 0.0973s; samplesPerSecond = 52616.9
01/08/2018 04:59:25:  Epoch[ 3 of 4]-Minibatch[  31-  40, 25.00%]: ce = 1.04788971 * 5120; err = 0.32480469 * 5120; time = 0.0966s; samplesPerSecond = 52977.7
01/08/2018 04:59:25:  Epoch[ 3 of 4]-Minibatch[  41-  50, 31.25%]: ce = 1.07330208 * 5120; err = 0.32773438 * 5120; time = 0.0973s; samplesPerSecond = 52594.7
01/08/2018 04:59:25:  Epoch[ 3 of 4]-Minibatch[  51-  60, 37.50%]: ce = 1.08705254 * 5120; err = 0.33222656 * 5120; time = 0.0961s; samplesPerSecond = 53255.5
01/08/2018 04:59:25:  Epoch[ 3 of 4]-Minibatch[  61-  70, 43.75%]: ce = 1.06946869 * 5120; err = 0.33320312 * 5120; time = 0.0964s; samplesPerSecond = 53128.9
01/08/2018 04:59:26:  Epoch[ 3 of 4]-Minibatch[  71-  80, 50.00%]: ce = 1.07995758 * 5120; err = 0.33769531 * 5120; time = 0.0973s; samplesPerSecond = 52597.9
01/08/2018 04:59:26:  Epoch[ 3 of 4]-Minibatch[  81-  90, 56.25%]: ce = 1.10155334 * 5120; err = 0.35058594 * 5120; time = 0.0992s; samplesPerSecond = 51616.5
01/08/2018 04:59:26:  Epoch[ 3 of 4]-Minibatch[  91- 100, 62.50%]: ce = 1.02064209 * 5120; err = 0.31406250 * 5120; time = 0.0969s; samplesPerSecond = 52835.6
01/08/2018 04:59:26:  Epoch[ 3 of 4]-Minibatch[ 101- 110, 68.75%]: ce = 1.02912445 * 5120; err = 0.32519531 * 5120; time = 0.0975s; samplesPerSecond = 52535.0
01/08/2018 04:59:26:  Epoch[ 3 of 4]-Minibatch[ 111- 120, 75.00%]: ce = 1.05624924 * 5120; err = 0.32734375 * 5120; time = 0.0970s; samplesPerSecond = 52796.4
01/08/2018 04:59:26:  Epoch[ 3 of 4]-Minibatch[ 121- 130, 81.25%]: ce = 1.06630859 * 5120; err = 0.33730469 * 5120; time = 0.0972s; samplesPerSecond = 52681.3
01/08/2018 04:59:26:  Epoch[ 3 of 4]-Minibatch[ 131- 140, 87.50%]: ce = 1.09063416 * 5120; err = 0.34375000 * 5120; time = 0.0978s; samplesPerSecond = 52362.3
01/08/2018 04:59:26:  Epoch[ 3 of 4]-Minibatch[ 141- 150, 93.75%]: ce = 1.02371521 * 5120; err = 0.31660156 * 5120; time = 0.0970s; samplesPerSecond = 52789.8
01/08/2018 04:59:26:  Epoch[ 3 of 4]-Minibatch[ 151- 160, 100.00%]: ce = 1.03207855 * 5120; err = 0.32617188 * 5120; time = 0.0875s; samplesPerSecond = 58541.9
01/08/2018 04:59:26: Finished Epoch[ 3 of 4]: [Training] ce = 1.06465931 * 81920; err = 0.33153076 * 81920; totalSamplesSeen = 245760; learningRatePerSample = 0.003125; epochTime=1.56105s
01/08/2018 04:59:26: SGD: Saving checkpoint model '/tmp/cntk-test-20180108044804.549809/Speech/DNN_DiscriminativePreTraining@debug_gpu/models/cntkSpeech.3'

01/08/2018 04:59:26: Starting Epoch 4: learning rate per sample = 0.003125  effective momentum = 0.810210  momentum as time constant = 2432.7 samples
minibatchiterator: epoch 3: frames [245760..327680] (first utterance at frame 245760), data subset 0 of 1, with 1 datapasses

01/08/2018 04:59:26: Starting minibatch loop.
01/08/2018 04:59:27:  Epoch[ 4 of 4]-Minibatch[   1-  10, 6.25%]: ce = 1.02240515 * 5120; err = 0.32734375 * 5120; time = 0.0975s; samplesPerSecond = 52532.6
01/08/2018 04:59:27:  Epoch[ 4 of 4]-Minibatch[  11-  20, 12.50%]: ce = 1.00561662 * 4926; err = 0.31790499 * 4926; time = 0.3857s; samplesPerSecond = 12770.5
01/08/2018 04:59:27:  Epoch[ 4 of 4]-Minibatch[  21-  30, 18.75%]: ce = 1.02030258 * 5120; err = 0.31718750 * 5120; time = 0.0968s; samplesPerSecond = 52898.6
01/08/2018 04:59:27:  Epoch[ 4 of 4]-Minibatch[  31-  40, 25.00%]: ce = 1.03147869 * 5120; err = 0.32089844 * 5120; time = 0.0974s; samplesPerSecond = 52587.3
01/08/2018 04:59:27:  Epoch[ 4 of 4]-Minibatch[  41-  50, 31.25%]: ce = 1.03559341 * 5120; err = 0.32343750 * 5120; time = 0.0964s; samplesPerSecond = 53086.3
01/08/2018 04:59:27:  Epoch[ 4 of 4]-Minibatch[  51-  60, 37.50%]: ce = 0.99760704 * 5120; err = 0.31464844 * 5120; time = 0.0966s; samplesPerSecond = 53002.7
01/08/2018 04:59:27:  Epoch[ 4 of 4]-Minibatch[  61-  70, 43.75%]: ce = 1.01162643 * 5120; err = 0.31718750 * 5120; time = 0.0969s; samplesPerSecond = 52848.2
01/08/2018 04:59:28:  Epoch[ 4 of 4]-Minibatch[  71-  80, 50.00%]: ce = 1.00835876 * 5120; err = 0.30839844 * 5120; time = 0.0991s; samplesPerSecond = 51683.0
01/08/2018 04:59:28:  Epoch[ 4 of 4]-Minibatch[  81-  90, 56.25%]: ce = 0.97858810 * 5120; err = 0.31562500 * 5120; time = 0.0984s; samplesPerSecond = 52052.9
01/08/2018 04:59:28:  Epoch[ 4 of 4]-Minibatch[  91- 100, 62.50%]: ce = 0.98578568 * 5120; err = 0.30195312 * 5120; time = 0.0980s; samplesPerSecond = 52257.5
01/08/2018 04:59:28:  Epoch[ 4 of 4]-Minibatch[ 101- 110, 68.75%]: ce = 1.03475266 * 5120; err = 0.32089844 * 5120; time = 0.0969s; samplesPerSecond = 52833.7
01/08/2018 04:59:28:  Epoch[ 4 of 4]-Minibatch[ 111- 120, 75.00%]: ce = 0.98508987 * 5120; err = 0.30683594 * 5120; time = 0.0971s; samplesPerSecond = 52724.7
01/08/2018 04:59:28:  Epoch[ 4 of 4]-Minibatch[ 121- 130, 81.25%]: ce = 0.99634094 * 5120; err = 0.31250000 * 5120; time = 0.0974s; samplesPerSecond = 52562.4
01/08/2018 04:59:28:  Epoch[ 4 of 4]-Minibatch[ 131- 140, 87.50%]: ce = 0.96515045 * 5120; err = 0.29863281 * 5120; time = 0.0974s; samplesPerSecond = 52550.3
01/08/2018 04:59:28:  Epoch[ 4 of 4]-Minibatch[ 141- 150, 93.75%]: ce = 0.97302704 * 5120; err = 0.29843750 * 5120; time = 0.0981s; samplesPerSecond = 52199.8
01/08/2018 04:59:28:  Epoch[ 4 of 4]-Minibatch[ 151- 160, 100.00%]: ce = 0.96172943 * 5120; err = 0.30351563 * 5120; time = 0.0905s; samplesPerSecond = 56547.3
01/08/2018 04:59:28: Finished Epoch[ 4 of 4]: [Training] ce = 1.00105877 * 81920; err = 0.31295166 * 81920; totalSamplesSeen = 327680; learningRatePerSample = 0.003125; epochTime=1.85863s
01/08/2018 04:59:28: SGD: Saving checkpoint model '/tmp/cntk-test-20180108044804.549809/Speech/DNN_DiscriminativePreTraining@debug_gpu/models/cntkSpeech'

01/08/2018 04:59:28: Action "train" complete.

01/08/2018 04:59:28: __COMPLETED__