CPU info:
    CPU Model Name: Intel(R) Xeon(R) CPU E5-2630 v2 @ 2.60GHz
    Hardware threads: 24
    Total Memory: 268381192 kB
-------------------------------------------------------------------
=== Running /cygdrive/c/jenkins/workspace/CNTK-Test-Windows-SlaveTest/x64/release/cntk.exe configFile=C:\jenkins\workspace\CNTK-Test-Windows-SlaveTest\Tests\EndToEndTests\Speech\LSTM/cntk.cntk currentDirectory=C:\jenkins\workspace\CNTK-Test-Windows-SlaveTest\Tests\EndToEndTests\Speech\Data RunDir=C:\cygwin64\tmp\cntk-test-20161220071215.50799\Speech\LSTM_FullUtteranceCuDNN5@release_gpu DataDir=C:\jenkins\workspace\CNTK-Test-Windows-SlaveTest\Tests\EndToEndTests\Speech\Data ConfigDir=C:\jenkins\workspace\CNTK-Test-Windows-SlaveTest\Tests\EndToEndTests\Speech\LSTM OutputDir=C:\cygwin64\tmp\cntk-test-20161220071215.50799\Speech\LSTM_FullUtteranceCuDNN5@release_gpu DeviceId=0 timestamping=true truncated=false speechTrain=[reader=[nbruttsineachrecurrentiter=2]] speechTrain=[SGD=[epochSize=2560]] speechTrain=[SGD=[maxEpochs=2]] speechTrain=[SGD=[numMBsToShowResult=1]] modelSelector=2 shareNodeValueMatrices=true
CNTK 2.0.beta6.0+ (HEAD bbb440, Dec 20 2016 05:52:50) on DPHAIM-24 at 2016/12/20 15:31:03

C:\jenkins\workspace\CNTK-Test-Windows-SlaveTest\x64\release\cntk.exe  configFile=C:\jenkins\workspace\CNTK-Test-Windows-SlaveTest\Tests\EndToEndTests\Speech\LSTM/cntk.cntk  currentDirectory=C:\jenkins\workspace\CNTK-Test-Windows-SlaveTest\Tests\EndToEndTests\Speech\Data  RunDir=C:\cygwin64\tmp\cntk-test-20161220071215.50799\Speech\LSTM_FullUtteranceCuDNN5@release_gpu  DataDir=C:\jenkins\workspace\CNTK-Test-Windows-SlaveTest\Tests\EndToEndTests\Speech\Data  ConfigDir=C:\jenkins\workspace\CNTK-Test-Windows-SlaveTest\Tests\EndToEndTests\Speech\LSTM  OutputDir=C:\cygwin64\tmp\cntk-test-20161220071215.50799\Speech\LSTM_FullUtteranceCuDNN5@release_gpu  DeviceId=0  timestamping=true  truncated=false  speechTrain=[reader=[nbruttsineachrecurrentiter=2]]  speechTrain=[SGD=[epochSize=2560]]  speechTrain=[SGD=[maxEpochs=2]]  speechTrain=[SGD=[numMBsToShowResult=1]]  modelSelector=2  shareNodeValueMatrices=true
Changed current directory to C:\jenkins\workspace\CNTK-Test-Windows-SlaveTest\Tests\EndToEndTests\Speech\Data
12/20/2016 15:31:06: -------------------------------------------------------------------
12/20/2016 15:31:06: Build info: 

12/20/2016 15:31:06: 		Built time: Dec 20 2016 05:52:50
12/20/2016 15:31:06: 		Last modified date: Tue Dec 20 03:38:53 2016
12/20/2016 15:31:06: 		Build type: Release
12/20/2016 15:31:06: 		Build target: GPU
12/20/2016 15:31:06: 		With ASGD: yes
12/20/2016 15:31:06: 		Math lib: mkl
12/20/2016 15:31:06: 		CUDA_PATH: C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v8.0
12/20/2016 15:31:06: 		CUB_PATH: c:\src\cub-1.4.1
12/20/2016 15:31:06: 		CUDNN_PATH: C:\local\cudnn-8.0-windows10-x64-v5.1
12/20/2016 15:31:06: 		Build Branch: HEAD
12/20/2016 15:31:06: 		Build SHA1: bbb4406fcb69b190109ec665d8c0a619271f1b24 (modified)
12/20/2016 15:31:06: 		Built by svcphil on liana-08-w
12/20/2016 15:31:06: 		Build Path: C:\jenkins\workspace\CNTK-Build-Windows\Source\CNTK\
12/20/2016 15:31:06: -------------------------------------------------------------------
12/20/2016 15:31:08: -------------------------------------------------------------------
12/20/2016 15:31:08: GPU info:

12/20/2016 15:31:08: 		Device[0]: cores = 2880; computeCapability = 3.5; type = "GeForce GTX 780 Ti"; memory = 3072 MB
12/20/2016 15:31:08: 		Device[1]: cores = 2880; computeCapability = 3.5; type = "GeForce GTX 780 Ti"; memory = 3072 MB
12/20/2016 15:31:08: 		Device[2]: cores = 2880; computeCapability = 3.5; type = "GeForce GTX 780 Ti"; memory = 3072 MB
12/20/2016 15:31:08: 		Device[3]: cores = 2880; computeCapability = 3.5; type = "GeForce GTX 780 Ti"; memory = 3072 MB
12/20/2016 15:31:08: -------------------------------------------------------------------

Configuration After Processing and Variable Resolution:

configparameters: cntk.cntk:// Note: These options are overridden from the command line in some test cases.=true
configparameters: cntk.cntk:command=speechCreate:speechTrain
configparameters: cntk.cntk:ConfigDir=C:\jenkins\workspace\CNTK-Test-Windows-SlaveTest\Tests\EndToEndTests\Speech\LSTM
configparameters: cntk.cntk:currentDirectory=C:\jenkins\workspace\CNTK-Test-Windows-SlaveTest\Tests\EndToEndTests\Speech\Data
configparameters: cntk.cntk:DataDir=C:\jenkins\workspace\CNTK-Test-Windows-SlaveTest\Tests\EndToEndTests\Speech\Data
configparameters: cntk.cntk:deviceId=0
configparameters: cntk.cntk:frameMode=false
configparameters: cntk.cntk:modelPath=C:\cygwin64\tmp\cntk-test-20161220071215.50799\Speech\LSTM_FullUtteranceCuDNN5@release_gpu/models/cntkSpeech.dnn
configparameters: cntk.cntk:modelSelector=2
configparameters: cntk.cntk:OutputDir=C:\cygwin64\tmp\cntk-test-20161220071215.50799\Speech\LSTM_FullUtteranceCuDNN5@release_gpu
configparameters: cntk.cntk:parallelTrain=false
configparameters: cntk.cntk:precision=float
configparameters: cntk.cntk:RunDir=C:\cygwin64\tmp\cntk-test-20161220071215.50799\Speech\LSTM_FullUtteranceCuDNN5@release_gpu
configparameters: cntk.cntk:shareNodeValueMatrices=true
configparameters: cntk.cntk:speechCreate={
    action = "edit"
    outputModelPath = "C:\cygwin64\tmp\cntk-test-20161220071215.50799\Speech\LSTM_FullUtteranceCuDNN5@release_gpu/models/cntkSpeech.dnn.initial"
    BrainScriptNetworkBuilder = {
        useLayerNorm = false
        // dimensions (needed for both model and readers)
        baseFeatDim = 33
        featDim = 11 * baseFeatDim
        labelDim = 132
        // hidden dimensions
        innerCellDim  = 1024
        hiddenDim     = 256
        numLSTMLayers = 3        // number of hidden LSTM model layers
        modelUsingCuDNN5 = Sequential
        (
            MeanVarNorm :
            (_ => OptimizedRNNStack(ParameterTensor {0:0, initOutputRank=-1, init='heNormal', initValueScale=1/10}, _, hiddenDim, numLayers=numLSTMLayers, bidirectional=true)) :
            DenseLayer {labelDim, init='heUniform', initValueScale=1/3}
        )
        modelUsingLayersLikeCuDNN5 = Sequential
        (
            MeanVarNorm :
            LayerStack {numLSTMLayers, _ => Sequential (
                (x => Splice (
                    RecurrentLSTMLayer {hiddenDim, init='heUniform', initValueScale=1/10} (x) :
                    RecurrentLSTMLayer {hiddenDim, goBackwards=true, init='heUniform', initValueScale=1/10} (x)
                ))
            )} :
            DenseLayer {labelDim, init='heUniform', initValueScale=1/3}
        )
        modelUsingLayers = Sequential
        (
            MeanVarNorm :
            LayerStack {numLSTMLayers, _ => Sequential (
                if useLayerNorm then LayerNormalizationLayer{} else Identity :
                RecurrentLSTMLayer {hiddenDim, cellShape=innerCellDim, init='heUniform', initValueScale=1/3}
            )} :
            DenseLayer {labelDim, init='heUniform', initValueScale=1/3}
        )
        modelRegressionTest (features) =
        {
            useSelfStabilization = true
            featNorm = MeanVarNorm(features)
            // we define the LSTM locally for now, since the one in CNTK.core.bs has a slightly changed configuration that breaks this test
            Stabilize (x, enabled=true) =
                if enabled
                then {
beta = Exp (BS.Parameters.BiasParam ((1))) 
                    result = beta .* x
                }.result
                else x
            LSTMP (outputDim, cellDim=outputDim, x, inputDim=x.dim, prevState, enableSelfStabilization=false) =
            {
                _privateInnards = {       // encapsulate the inner workings
                    dh = prevState.h // previous values
                    dc = prevState.c
                    // parameter macros--these carry their own weight matrices
                    B() = BS.Parameters.BiasParam (cellDim)
                    W(v) = BS.Parameters.WeightParam (cellDim, Inferred)  * Stabilize (v, enabled=enableSelfStabilization) // input-to-hidden
                    H(h) = BS.Parameters.WeightParam (cellDim, outputDim) * Stabilize (h, enabled=enableSelfStabilization) // hidden-to-hidden
                    C(c) = BS.Parameters.DiagWeightParam (cellDim)       .* Stabilize (c, enabled=enableSelfStabilization) // cell-to-hiddden (note: applied elementwise)
                    // note: the W(x) here are all different, they all come with their own set of weights; same for H(dh), C(dc), and B()
                    it = Sigmoid (W(x) + B() + H(dh) + C(dc))          // input gate(t)
                    bit = it .* Tanh (W(x) + (H(dh) + B()))            // applied to tanh of input network
                    ft = Sigmoid (W(x) + B() + H(dh) + C(dc))          // forget-me-not gate(t)
                    bft = ft .* dc                                     // applied to cell(t-1)
                    ct = bft + bit                                     // c(t) is sum of both
                    ot = Sigmoid (W(x) + B() + H(dh) + C(ct))          // output gate(t)
                    ht = ot .* Tanh (ct)                               // applied to tanh(cell(t))
                }
                c = _privateInnards.ct          // cell value
                h = if outputDim != cellDim     // output/hidden state
                    then {                      // project
                        Wmr = BS.Parameters.WeightParam (outputDim, cellDim);
                        htp = Wmr * Stabilize (_privateInnards.ht, enabled=enableSelfStabilization)
                    }.htp         // TODO: ^^ extend BS syntax to allow to say: then { Wmr = WeightParam(outputDim, cellDim) } in Wmr * Stabilize (...)
                    else _privateInnards.ht     // no projection
                dim = outputDim
            }
            RecurrentLSTMP (outputDim, cellDim=outputDim.dim, x, inputDim=x.dim, previousHook=BS.RNNs.PreviousHC, enableSelfStabilization=false) =
            {
                prevState = previousHook (lstmState)
                inputDim1 = inputDim ; cellDim1 = cellDim ; enableSelfStabilization1 = enableSelfStabilization
                lstmState = LSTMP (outputDim, cellDim=cellDim1, x, inputDim=inputDim1, prevState, enableSelfStabilization=enableSelfStabilization1)
            }.lstmState // we return the state record (h,c)
            // define the stack of hidden LSTM layers  --TODO: change to RecurrentLSTMPStack(), change stabilizer config
            S(x) = Stabilize (x, enabled=useSelfStabilization)
            LSTMoutput[k:1..numLSTMLayers] =
                if k == 1
                then /*BS.RNNs.*/ RecurrentLSTMP (hiddenDim, cellDim=innerCellDim, /*S*/ (featNorm),        inputDim=baseFeatDim, enableSelfStabilization=useSelfStabilization).h
                else /*BS.RNNs.*/ RecurrentLSTMP (hiddenDim, cellDim=innerCellDim, /*S*/ (LSTMoutput[k-1]), inputDim=hiddenDim,   enableSelfStabilization=useSelfStabilization).h
            // and add a softmax layer on top
            W = BS.Parameters.WeightParam (labelDim, Inferred)
            B = BS.Parameters.BiasParam   (labelDim)
            // (unnecessarily using explicit Times with inferInputRankToMap in order to have a test for inferInputRankToMap parameter)
            z = Times (W, S(LSTMoutput[numLSTMLayers]), inferInputRankToMap=0) + B; // top-level input to Softmax
        }.z
        // features
        features = Input((1 : featDim),  tag='feature') // TEST: Artificially reading data transposed
        realFeatures = FlattenDimensions (Transpose (features), 1, 2)             //       and swapping them back to (featDim:1), for testing Transpose()
feashift = RowSlice(featDim - baseFeatDim, baseFeatDim, realFeatures);  
        labels   = Input(labelDim, tag='label')
        // link model to inputs
models = [| modelRegressionTest; modelUsingLayers; modelUsingCuDNN5; modelUsingLayersLikeCuDNN5 |]  
model = models[2]     
        z = model (feashift)
        // link model to training
        ce  = /*Pass*/ SumElements (ReduceLogSum (z) - TransposeTimes (labels,          z),  tag='criterion')  // manually-defined per-sample objective
        err = /*Pass*/ SumElements (BS.Constants.One - TransposeTimes (labels, Hardmax (z)), tag='evaluation') // also track frame errors
        // decoding
        logPrior = LogPrior(labels)	 
        scaledLogLikelihood = Pass (z - logPrior, tag='output') // using Pass() since we can't assign a tag to x - y
        featureNodes = (features)
        labelNodes = (labels)
        criterionNodes = (ce)
        evaluationNodes = (err)
        outputNodes = (scaledLogLikelihood)
    }
}

configparameters: cntk.cntk:speechTrain={
    action = "train"
    BrainScriptNetworkBuilder = (BS.Network.Load("C:\cygwin64\tmp\cntk-test-20161220071215.50799\Speech\LSTM_FullUtteranceCuDNN5@release_gpu/models/cntkSpeech.dnn.initial"))
    SGD = {
        epochSize = 20480 ; maxEpochs = 4 ; minibatchSize = 20
        learningRatesPerMB = 0.5 ; momentumAsTimeConstant = 2500
        numMBsToShowResult = 10
        keepCheckPointFiles = true       
    }
    reader = {
        readerType = "HTKMLFReader"
        randomize = "auto" ; readMethod = "blockRandomize"
        nbruttsineachrecurrentiter = 32
        miniBatchMode = "partial" ; verbosity = 0 ; useMersenneTwisterRand = true
        features = { dim =      363 ; type      = "real"     ; scpFile = "C:\jenkins\workspace\CNTK-Test-Windows-SlaveTest\Tests\EndToEndTests\Speech\Data/glob_0000.scp" ; }
        labels   = { labelDim = 132 ; labelType = "category" ; mlfFile = "C:\jenkins\workspace\CNTK-Test-Windows-SlaveTest\Tests\EndToEndTests\Speech\Data/glob_0000.mlf" ; labelMappingFile = "C:\jenkins\workspace\CNTK-Test-Windows-SlaveTest\Tests\EndToEndTests\Speech\Data/state.list" }
    }
} [reader=[nbruttsineachrecurrentiter=2]] [SGD=[epochSize=2560]] [SGD=[maxEpochs=2]] [SGD=[numMBsToShowResult=1]]

configparameters: cntk.cntk:timestamping=true
configparameters: cntk.cntk:traceLevel=1
configparameters: cntk.cntk:truncated=false
12/20/2016 15:31:08: Commands: speechCreate speechTrain
12/20/2016 15:31:08: precision = "float"

12/20/2016 15:31:08: ##############################################################################
12/20/2016 15:31:08: #                                                                            #
12/20/2016 15:31:08: # speechCreate command (edit action)                                         #
12/20/2016 15:31:08: #                                                                            #
12/20/2016 15:31:08: ##############################################################################

Node '<placeholder>' (LearnableParameter operation): Initializating Parameter[132 x 0] as heUniform later when dimensions are fully known.
Node '<placeholder>' (LearnableParameter operation): Initializating Parameter[0 x 0] as heNormal later when dimensions are fully known.

Post-processing network...

6 roots:
	ce = SumElements()
	err = SumElements()
	logPrior._ = Mean()
	scaledLogLikelihood = Pass()
	z.x._.invStdDev = InvStdDev()
	z.x._.mean = Mean()

Validating network. 28 nodes to process in pass 1.

Validating --> modelUsingCuDNN5.arrayOfFunctions[2].arrayOfFunctions[0].W = LearnableParameter() :  -> [132 x 0]
Validating --> z.x.weights = LearnableParameter() :  -> [0 x 0]
Validating --> features = InputValue() :  -> [1 x 363 x *]
Validating --> realFeatures.x = TransposeDimensions (features) : [1 x 363 x *] -> [363 x 1 x *]
Validating --> realFeatures = Reshape (realFeatures.x) : [363 x 1 x *] -> [363 x *]
Validating --> feashift = Slice (realFeatures) : [363 x *] -> [33 x *]
Validating --> z.x._.mean = Mean (feashift) : [33 x *] -> [33]
Validating --> z.x._.ElementTimesArgs[0] = Minus (feashift, z.x._.mean) : [33 x *], [33] -> [33 x *]
Validating --> z.x._.invStdDev = InvStdDev (feashift) : [33 x *] -> [33]
Validating --> z.x._ = ElementTimes (z.x._.ElementTimesArgs[0], z.x._.invStdDev) : [33 x *], [33] -> [33 x *]
Node 'z.x.weights' (LearnableParameter operation) operation: Tensor shape was inferred as [256 x 14648].
Node 'z.x.weights' (LearnableParameter operation): Initializing Parameter[256 x 14648] <- heNormal(seed=2, init dims=[14648 x 256], range=0.008839(0.088388*0.100000), onCPU=true.
)Validating --> z.x = OptimizedRNNStack (z.x.weights, z.x._) : [256 x 14648], [33 x *] -> [512 x *]
Node 'modelUsingCuDNN5.arrayOfFunctions[2].arrayOfFunctions[0].W' (LearnableParameter operation) operation: Tensor shape was inferred as [132 x 512].
Node 'modelUsingCuDNN5.arrayOfFunctions[2].arrayOfFunctions[0].W' (LearnableParameter operation): Initializing Parameter[132 x 512] <- heUniform(seed=1, init dims=[132 x 512], range=0.036084(0.108253*0.333333), onCPU=true.
)Validating --> z.x.PlusArgs[0] = Times (modelUsingCuDNN5.arrayOfFunctions[2].arrayOfFunctions[0].W, z.x) : [132 x 512], [512 x *] -> [132 x *]
Validating --> modelUsingCuDNN5.arrayOfFunctions[2].arrayOfFunctions[0].b = LearnableParameter() :  -> [132]
Validating --> z = Plus (z.x.PlusArgs[0], modelUsingCuDNN5.arrayOfFunctions[2].arrayOfFunctions[0].b) : [132 x *], [132] -> [132 x *]
Validating --> ce.matrix.MinusArgs[0].r = ReduceElements (z) : [132 x *] -> [1 x *]
Validating --> labels = InputValue() :  -> [132 x *]
Validating --> ce.matrix.MinusArgs[1] = TransposeTimes (labels, z) : [132 x *], [132 x *] -> [1 x *]
Validating --> ce.matrix = Minus (ce.matrix.MinusArgs[0].r, ce.matrix.MinusArgs[1]) : [1 x *], [1 x *] -> [1 x *]
Validating --> ce = SumElements (ce.matrix) : [1 x *] -> [1]
Validating --> BS.Constants.One = LearnableParameter() :  -> [1]
Validating --> err.matrix.MinusArgs[1].rightMatrix = Hardmax (z) : [132 x *] -> [132 x *]
Validating --> err.matrix.MinusArgs[1] = TransposeTimes (labels, err.matrix.MinusArgs[1].rightMatrix) : [132 x *], [132 x *] -> [1 x *]
Validating --> err.matrix = Minus (BS.Constants.One, err.matrix.MinusArgs[1]) : [1], [1 x *] -> [1 x *]
Validating --> err = SumElements (err.matrix) : [1 x *] -> [1]
Validating --> logPrior._ = Mean (labels) : [132 x *] -> [132]
Validating --> logPrior = Log (logPrior._) : [132] -> [132]
Validating --> scaledLogLikelihood._ = Minus (z, logPrior) : [132 x *], [132] -> [132 x *]
Validating --> scaledLogLikelihood = Pass (scaledLogLikelihood._) : [132 x *] -> [132 x *]

Validating network. 22 nodes to process in pass 2.


Validating network, final pass.




Post-processing network complete.

12/20/2016 15:31:09: 
Model with 28 nodes saved as 'C:\cygwin64\tmp\cntk-test-20161220071215.50799\Speech\LSTM_FullUtteranceCuDNN5@release_gpu/models/cntkSpeech.dnn.initial'.

12/20/2016 15:31:09: Action "edit" complete.


12/20/2016 15:31:09: ##############################################################################
12/20/2016 15:31:09: #                                                                            #
12/20/2016 15:31:09: # speechTrain command (train action)                                         #
12/20/2016 15:31:09: #                                                                            #
12/20/2016 15:31:09: ##############################################################################

parallelTrain option is not enabled. ParallelTrain config will be ignored.
12/20/2016 15:31:09: 
Creating virgin network.
Load: Loading model file: C:\cygwin64\tmp\cntk-test-20161220071215.50799\Speech\LSTM_FullUtteranceCuDNN5@release_gpu/models/cntkSpeech.dnn.initial
Post-processing network...

6 roots:
	ce = SumElements()
	err = SumElements()
	logPrior._ = Mean()
	scaledLogLikelihood = Pass()
	z.x._.invStdDev = InvStdDev()
	z.x._.mean = Mean()

Validating network. 28 nodes to process in pass 1.

Validating --> modelUsingCuDNN5.arrayOfFunctions[2].arrayOfFunctions[0].W = LearnableParameter() :  -> [132 x 512]
Validating --> z.x.weights = LearnableParameter() :  -> [256 x 14648]
Validating --> features = InputValue() :  -> [1 x 363 x *1]
Validating --> realFeatures.x = TransposeDimensions (features) : [1 x 363 x *1] -> [363 x 1 x *1]
Validating --> realFeatures = Reshape (realFeatures.x) : [363 x 1 x *1] -> [363 x *1]
Validating --> feashift = Slice (realFeatures) : [363 x *1] -> [33 x *1]
Validating --> z.x._.mean = Mean (feashift) : [33 x *1] -> [33]
Validating --> z.x._.ElementTimesArgs[0] = Minus (feashift, z.x._.mean) : [33 x *1], [33] -> [33 x *1]
Validating --> z.x._.invStdDev = InvStdDev (feashift) : [33 x *1] -> [33]
Validating --> z.x._ = ElementTimes (z.x._.ElementTimesArgs[0], z.x._.invStdDev) : [33 x *1], [33] -> [33 x *1]
Validating --> z.x = OptimizedRNNStack (z.x.weights, z.x._) : [256 x 14648], [33 x *1] -> [512 x *1]
Validating --> z.x.PlusArgs[0] = Times (modelUsingCuDNN5.arrayOfFunctions[2].arrayOfFunctions[0].W, z.x) : [132 x 512], [512 x *1] -> [132 x *1]
Validating --> modelUsingCuDNN5.arrayOfFunctions[2].arrayOfFunctions[0].b = LearnableParameter() :  -> [132]
Validating --> z = Plus (z.x.PlusArgs[0], modelUsingCuDNN5.arrayOfFunctions[2].arrayOfFunctions[0].b) : [132 x *1], [132] -> [132 x *1]
Validating --> ce.matrix.MinusArgs[0].r = ReduceElements (z) : [132 x *1] -> [1 x *1]
Validating --> labels = InputValue() :  -> [132 x *1]
Validating --> ce.matrix.MinusArgs[1] = TransposeTimes (labels, z) : [132 x *1], [132 x *1] -> [1 x *1]
Validating --> ce.matrix = Minus (ce.matrix.MinusArgs[0].r, ce.matrix.MinusArgs[1]) : [1 x *1], [1 x *1] -> [1 x *1]
Validating --> ce = SumElements (ce.matrix) : [1 x *1] -> [1]
Validating --> BS.Constants.One = LearnableParameter() :  -> [1]
Validating --> err.matrix.MinusArgs[1].rightMatrix = Hardmax (z) : [132 x *1] -> [132 x *1]
Validating --> err.matrix.MinusArgs[1] = TransposeTimes (labels, err.matrix.MinusArgs[1].rightMatrix) : [132 x *1], [132 x *1] -> [1 x *1]
Validating --> err.matrix = Minus (BS.Constants.One, err.matrix.MinusArgs[1]) : [1], [1 x *1] -> [1 x *1]
Validating --> err = SumElements (err.matrix) : [1 x *1] -> [1]
Validating --> logPrior._ = Mean (labels) : [132 x *1] -> [132]
Validating --> logPrior = Log (logPrior._) : [132] -> [132]
Validating --> scaledLogLikelihood._ = Minus (z, logPrior) : [132 x *1], [132] -> [132 x *1]
Validating --> scaledLogLikelihood = Pass (scaledLogLikelihood._) : [132 x *1] -> [132 x *1]

Validating network. 22 nodes to process in pass 2.


Validating network, final pass.




Post-processing network complete.

reading script file C:\jenkins\workspace\CNTK-Test-Windows-SlaveTest\Tests\EndToEndTests\Speech\Data/glob_0000.scp ... 948 entries
total 132 state names in state list C:\jenkins\workspace\CNTK-Test-Windows-SlaveTest\Tests\EndToEndTests\Speech\Data/state.list
htkmlfreader: reading MLF file C:\jenkins\workspace\CNTK-Test-Windows-SlaveTest\Tests\EndToEndTests\Speech\Data/glob_0000.mlf ... total 948 entries
...............................................................................................feature set 0: 252734 frames in 948 out of 948 utterances
label set 0: 129 classes
minibatchutterancesource: 948 utterances grouped into 3 chunks, av. chunk size: 316.0 utterances, 84244.7 frames
12/20/2016 15:31:10: 
Model has 28 nodes. Using GPU 0.

12/20/2016 15:31:10: Training criterion:   ce = SumElements
12/20/2016 15:31:10: Evaluation criterion: err = SumElements


Allocating matrices for forward and/or backward propagation.

Memory Sharing: Out of 38 matrices, 24 are shared as 7, and 14 are not shared.

	{ ce : [1] (gradient)
	  err.matrix : [1 x *1]
	  scaledLogLikelihood._ : [132 x *1] }
	{ ce.matrix.MinusArgs[0].r : [1 x *1] (gradient)
	  modelUsingCuDNN5.arrayOfFunctions[2].arrayOfFunctions[0].b : [132] (gradient) }
	{ ce.matrix.MinusArgs[0].r : [1 x *1]
	  feashift : [33 x *1]
	  realFeatures.x : [363 x 1 x *1]
	  z.x.PlusArgs[0] : [132 x *1]
	  z.x.PlusArgs[0] : [132 x *1] (gradient)
	  z.x._ : [33 x *1]
	  z.x.weights : [256 x 14648] (gradient) }
	{ realFeatures : [363 x *1]
	  z.x : [512 x *1]
	  z.x._.ElementTimesArgs[0] : [33 x *1] }
	{ modelUsingCuDNN5.arrayOfFunctions[2].arrayOfFunctions[0].W : [132 x 512] (gradient)
	  z : [132 x *1] }
	{ ce.matrix : [1 x *1] (gradient)
	  ce.matrix.MinusArgs[1] : [1 x *1]
	  err.matrix.MinusArgs[1] : [1 x *1]
	  z : [132 x *1] (gradient)
	  z.x : [512 x *1] (gradient) }
	{ ce.matrix : [1 x *1]
	  err.matrix.MinusArgs[1].rightMatrix : [132 x *1] }


12/20/2016 15:31:10: Training 3817604 parameters in 3 out of 3 parameter tensors and 10 nodes with gradient:

12/20/2016 15:31:10: 	Node 'modelUsingCuDNN5.arrayOfFunctions[2].arrayOfFunctions[0].W' (LearnableParameter operation) : [132 x 512]
12/20/2016 15:31:10: 	Node 'modelUsingCuDNN5.arrayOfFunctions[2].arrayOfFunctions[0].b' (LearnableParameter operation) : [132]
12/20/2016 15:31:10: 	Node 'z.x.weights' (LearnableParameter operation) : [256 x 14648]


12/20/2016 15:31:10: Precomputing --> 3 PreCompute nodes found.

12/20/2016 15:31:10: 	z.x._.mean = Mean()
12/20/2016 15:31:10: 	z.x._.invStdDev = InvStdDev()
12/20/2016 15:31:10: 	logPrior._ = Mean()
minibatchiterator: epoch 0: frames [0..252734] (first utterance at frame 0), data subset 0 of 1, with 1 datapasses
requiredata: determined feature kind as 33-dimensional 'USER' with frame shift 10.0 ms

12/20/2016 15:31:11: Precomputing --> Completed.


12/20/2016 15:31:11: Starting Epoch 1: learning rate per sample = 0.025000  effective momentum = 0.992031  momentum as time constant = 2499.8 samples
minibatchiterator: epoch 0: frames [0..2560] (first utterance at frame 0), data subset 0 of 1, with 1 datapasses

12/20/2016 15:31:11: Starting minibatch loop.
12/20/2016 15:31:12:  Epoch[ 1 of 2]-Minibatch[   1-   1, 0.78%]: ce = 4.88338290 * 886; err = 0.99661400 * 886; time = 0.7541s; samplesPerSecond = 1174.9
12/20/2016 15:31:12:  Epoch[ 1 of 2]-Minibatch[   2-   2, 1.56%]: ce = 4.54264899 * 226; err = 0.76106195 * 226; time = 0.1041s; samplesPerSecond = 2171.5
12/20/2016 15:31:12:  Epoch[ 1 of 2]-Minibatch[   3-   3, 2.34%]: ce = 4.22057502 * 526; err = 0.82889734 * 526; time = 0.2193s; samplesPerSecond = 2398.3
12/20/2016 15:31:13:  Epoch[ 1 of 2]-Minibatch[   4-   4, 3.13%]: ce = 4.44286129 * 946; err = 0.89217759 * 946; time = 0.4480s; samplesPerSecond = 2111.4
12/20/2016 15:31:13: Finished Epoch[ 1 of 2]: [Training] ce = 4.55738590 * 2584; err = 0.90363777 * 2584; totalSamplesSeen = 2584; learningRatePerSample = 0.025; epochTime=1.52781s
12/20/2016 15:31:13: SGD: Saving checkpoint model 'C:\cygwin64\tmp\cntk-test-20161220071215.50799\Speech\LSTM_FullUtteranceCuDNN5@release_gpu/models/cntkSpeech.dnn.1'

12/20/2016 15:31:13: Starting Epoch 2: learning rate per sample = 0.025000  effective momentum = 0.992031  momentum as time constant = 2499.8 samples
minibatchiterator: epoch 1: frames [2560..5120] (first utterance at frame 2584), data subset 0 of 1, with 1 datapasses

12/20/2016 15:31:13: Starting minibatch loop.
12/20/2016 15:31:14:  Epoch[ 2 of 2]-Minibatch[   1-   1, 0.78%]: ce = 4.28727666 * 594; err = 0.91582492 * 594; time = 0.2407s; samplesPerSecond = 2467.6
12/20/2016 15:31:14:  Epoch[ 2 of 2]-Minibatch[   2-   2, 1.56%]: ce = 4.73987702 * 386; err = 0.94818653 * 386; time = 0.1655s; samplesPerSecond = 2331.8
12/20/2016 15:31:14:  Epoch[ 2 of 2]-Minibatch[   3-   3, 2.34%]: ce = 4.02257527 * 884; err = 0.89705882 * 884; time = 0.3554s; samplesPerSecond = 2487.3
12/20/2016 15:31:15:  Epoch[ 2 of 2]-Minibatch[   4-   4, 3.13%]: ce = 4.43765485 * 946; err = 0.91014799 * 946; time = 0.4267s; samplesPerSecond = 2216.9
12/20/2016 15:31:15: Finished Epoch[ 2 of 2]: [Training] ce = 4.31680174 * 2810; err = 0.91245552 * 2810; totalSamplesSeen = 5394; learningRatePerSample = 0.025; epochTime=1.19148s
12/20/2016 15:31:15: SGD: Saving checkpoint model 'C:\cygwin64\tmp\cntk-test-20161220071215.50799\Speech\LSTM_FullUtteranceCuDNN5@release_gpu/models/cntkSpeech.dnn'

12/20/2016 15:31:15: Action "train" complete.

12/20/2016 15:31:15: __COMPLETED__
