CPU info:
    CPU Model Name: Intel(R) Xeon(R) CPU E5-2630 v2 @ 2.60GHz
    Hardware threads: 24
    Total Memory: 268381192 kB
-------------------------------------------------------------------
=== Running /cygdrive/c/jenkins/workspace/CNTK-Test-Windows-W1/x64/release/cntk.exe configFile=C:\jenkins\workspace\CNTK-Test-Windows-W1\Examples\Image\GettingStarted/04_OneConvBN.cntk currentDirectory=C:\Users\svcphil\AppData\Local\Temp\cntk-test-20160922162518.374503\Examples\Image\GettingStarted_04_OneConvBN@release_gpu\TestData RunDir=C:\Users\svcphil\AppData\Local\Temp\cntk-test-20160922162518.374503\Examples\Image\GettingStarted_04_OneConvBN@release_gpu DataDir=C:\Users\svcphil\AppData\Local\Temp\cntk-test-20160922162518.374503\Examples\Image\GettingStarted_04_OneConvBN@release_gpu\TestData ConfigDir=C:\jenkins\workspace\CNTK-Test-Windows-W1\Examples\Image\GettingStarted OutputDir=C:\Users\svcphil\AppData\Local\Temp\cntk-test-20160922162518.374503\Examples\Image\GettingStarted_04_OneConvBN@release_gpu DeviceId=0 timestamping=true forceDeterministicAlgorithms=true stderr=- trainNetwork=[SGD=[maxEpochs=3]]
CNTK 1.7+ (HEAD 216029, Sep 22 2016 16:13:35) on DPHAIM-22 at 2016/09/22 16:27:38

C:\jenkins\workspace\CNTK-Test-Windows-W1\x64\release\cntk.exe  configFile=C:\jenkins\workspace\CNTK-Test-Windows-W1\Examples\Image\GettingStarted/04_OneConvBN.cntk  currentDirectory=C:\Users\svcphil\AppData\Local\Temp\cntk-test-20160922162518.374503\Examples\Image\GettingStarted_04_OneConvBN@release_gpu\TestData  RunDir=C:\Users\svcphil\AppData\Local\Temp\cntk-test-20160922162518.374503\Examples\Image\GettingStarted_04_OneConvBN@release_gpu  DataDir=C:\Users\svcphil\AppData\Local\Temp\cntk-test-20160922162518.374503\Examples\Image\GettingStarted_04_OneConvBN@release_gpu\TestData  ConfigDir=C:\jenkins\workspace\CNTK-Test-Windows-W1\Examples\Image\GettingStarted  OutputDir=C:\Users\svcphil\AppData\Local\Temp\cntk-test-20160922162518.374503\Examples\Image\GettingStarted_04_OneConvBN@release_gpu  DeviceId=0  timestamping=true  forceDeterministicAlgorithms=true  stderr=-  trainNetwork=[SGD=[maxEpochs=3]]
Changed current directory to C:\Users\svcphil\AppData\Local\Temp\cntk-test-20160922162518.374503\Examples\Image\GettingStarted_04_OneConvBN@release_gpu\TestData
09/22/2016 16:27:40: Redirecting stderr to file -_trainNetwork_testNetwork.log
09/22/2016 16:27:40: -------------------------------------------------------------------
09/22/2016 16:27:40: Build info: 

09/22/2016 16:27:40: 		Built time: Sep 22 2016 16:13:35
09/22/2016 16:27:40: 		Last modified date: Thu Sep 22 13:24:23 2016
09/22/2016 16:27:40: 		Build type: Release
09/22/2016 16:27:40: 		Build target: GPU
09/22/2016 16:27:40: 		Math lib: mkl
09/22/2016 16:27:40: 		CUDA_PATH: C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.5
09/22/2016 16:27:40: 		CUB_PATH: C:\src\cub-1.4.1
09/22/2016 16:27:40: 		CUDNN_PATH: c:\NVIDIA\cudnn-5.1\cuda
09/22/2016 16:27:40: 		Build Branch: HEAD
09/22/2016 16:27:40: 		Build SHA1: 216029bfedd92253fd45034da1d1cc68c4d4c7f1
09/22/2016 16:27:40: 		Built by svcphil on liana-08-w
09/22/2016 16:27:40: 		Build Path: c:\jenkins\workspace\CNTK-Build-Windows\Source\CNTK\
09/22/2016 16:27:40: -------------------------------------------------------------------
09/22/2016 16:27:43: -------------------------------------------------------------------
09/22/2016 16:27:43: GPU info:

09/22/2016 16:27:43: 		Device[0]: cores = 2880; computeCapability = 3.5; type = "GeForce GTX 780 Ti"; memory = 3072 MB
09/22/2016 16:27:43: 		Device[1]: cores = 2880; computeCapability = 3.5; type = "GeForce GTX 780 Ti"; memory = 3072 MB
09/22/2016 16:27:43: 		Device[2]: cores = 2880; computeCapability = 3.5; type = "GeForce GTX 780 Ti"; memory = 3072 MB
09/22/2016 16:27:43: 		Device[3]: cores = 2880; computeCapability = 3.5; type = "GeForce GTX 780 Ti"; memory = 3072 MB
09/22/2016 16:27:43: -------------------------------------------------------------------

Configuration After Processing and Variable Resolution:

configparameters: 04_OneConvBN.cntk:command=trainNetwork:testNetwork
configparameters: 04_OneConvBN.cntk:ConfigDir=C:\jenkins\workspace\CNTK-Test-Windows-W1\Examples\Image\GettingStarted
configparameters: 04_OneConvBN.cntk:currentDirectory=C:\Users\svcphil\AppData\Local\Temp\cntk-test-20160922162518.374503\Examples\Image\GettingStarted_04_OneConvBN@release_gpu\TestData
configparameters: 04_OneConvBN.cntk:dataDir=C:\Users\svcphil\AppData\Local\Temp\cntk-test-20160922162518.374503\Examples\Image\GettingStarted_04_OneConvBN@release_gpu\TestData
configparameters: 04_OneConvBN.cntk:deviceId=0
configparameters: 04_OneConvBN.cntk:forceDeterministicAlgorithms=true
configparameters: 04_OneConvBN.cntk:modelPath=C:\Users\svcphil\AppData\Local\Temp\cntk-test-20160922162518.374503\Examples\Image\GettingStarted_04_OneConvBN@release_gpu/Models/04_OneConvBN
configparameters: 04_OneConvBN.cntk:outputDir=C:\Users\svcphil\AppData\Local\Temp\cntk-test-20160922162518.374503\Examples\Image\GettingStarted_04_OneConvBN@release_gpu
configparameters: 04_OneConvBN.cntk:precision=float
configparameters: 04_OneConvBN.cntk:rootDir=..
configparameters: 04_OneConvBN.cntk:RunDir=C:\Users\svcphil\AppData\Local\Temp\cntk-test-20160922162518.374503\Examples\Image\GettingStarted_04_OneConvBN@release_gpu
configparameters: 04_OneConvBN.cntk:stderr=-
configparameters: 04_OneConvBN.cntk:testNetwork={
    action = "test"
    minibatchSize = 1024
    reader = {
        readerType = "CNTKTextFormatReader"
        file = "C:\Users\svcphil\AppData\Local\Temp\cntk-test-20160922162518.374503\Examples\Image\GettingStarted_04_OneConvBN@release_gpu\TestData/Test-28x28_cntk_text.txt"
        input = {
            features = { dim = 784 ; format = "dense" }
            labels =   { dim = 10  ; format = "dense" }
        }
    }
}

configparameters: 04_OneConvBN.cntk:timestamping=true
configparameters: 04_OneConvBN.cntk:traceLevel=1
configparameters: 04_OneConvBN.cntk:trainNetwork={
    action = "train"
    BrainScriptNetworkBuilder = {
        imageShape = 28:28:1
        labelDim = 10
        featScale = 1/256
        Scale{f} = x => Constant(f) .* x
        ConvBnReluPoolLayer {outChannels, filterShape} = Sequential (
            ConvolutionalLayer      {outChannels, filterShape, pad=true, bias=false} :
            BatchNormalizationLayer {spatialRank = 2} :
            ReLU :
            MaxPoolingLayer         {(2:2), stride = (2:2)} 
        )
        DenseBnReluLayer {outDim} = Sequential (
            LinearLayer             {outDim} :   
            BatchNormalizationLayer {spatialRank = 1} : ReLU
        )
        model = Sequential (
            Scale {featScale} : 
            ConvBnReluPoolLayer {16, (5:5)} : 
            DenseBnReluLayer {64} : 
            LinearLayer {labelDim}
        )
        features = Input {imageShape}
        labels = Input (labelDim)
        ol = model (features)
        ce   = CrossEntropyWithSoftmax (labels, ol)
        errs = ClassificationError (labels, ol)
        featureNodes    = (features)
        labelNodes      = (labels)
        criterionNodes  = (ce)
        evaluationNodes = (errs)
        outputNodes     = (ol)
    }
    SGD = {
        epochSize = 60000
        minibatchSize = 64
        maxEpochs = 10
        learningRatesPerSample = 0.01*5:0.001
        momentumAsTimeConstant = 0
        numMBsToShowResult = 500
    }
    reader = {
        readerType = "CNTKTextFormatReader"
        file = "C:\Users\svcphil\AppData\Local\Temp\cntk-test-20160922162518.374503\Examples\Image\GettingStarted_04_OneConvBN@release_gpu\TestData/Train-28x28_cntk_text.txt"
        input = {
            features = { dim = 784 ; format = "dense" }
            labels =   { dim = 10  ; format = "dense" }
        }
    }   
} [SGD=[maxEpochs=3]]

09/22/2016 16:27:43: Commands: trainNetwork testNetwork
09/22/2016 16:27:43: precision = "float"

09/22/2016 16:27:43: ##############################################################################
09/22/2016 16:27:43: #                                                                            #
09/22/2016 16:27:43: # trainNetwork command (train action)                                        #
09/22/2016 16:27:43: #                                                                            #
09/22/2016 16:27:43: ##############################################################################

09/22/2016 16:27:43: 
Creating virgin network.
Node '<placeholder>' (LearnableParameter operation): Initializating Parameter[10 x 0] as glorotUniform later when dimensions are fully known.
Node '<placeholder>' (LearnableParameter operation): Initializating Parameter[64 x 0] as glorotUniform later when dimensions are fully known.
Node '<placeholder>' (LearnableParameter operation): Initializating Parameter[5 x 5 x 0 x 16] as glorotUniform later when dimensions are fully known.
Node '<placeholder>' (LearnableParameter operation): Initializating Parameter[0 x 1] as fromValue later when dimensions are fully known.
Node '<placeholder>' (LearnableParameter operation): Initializating Parameter[0 x 1] as fromValue later when dimensions are fully known.
Node '<placeholder>' (LearnableParameter operation): Initializating Parameter[0 x 1] as fromValue later when dimensions are fully known.
Node '<placeholder>' (LearnableParameter operation): Initializating Parameter[0 x 1] as fromValue later when dimensions are fully known.
Node '<placeholder>' (LearnableParameter operation): Initializating Parameter[0 x 1] as fromValue later when dimensions are fully known.
Node '<placeholder>' (LearnableParameter operation): Initializating Parameter[0 x 1] as fromValue later when dimensions are fully known.
Node '<placeholder>' (LearnableParameter operation): Initializating Parameter[0 x 1] as fromValue later when dimensions are fully known.
Node '<placeholder>' (LearnableParameter operation): Initializating Parameter[0 x 1] as fromValue later when dimensions are fully known.

Post-processing network...

3 roots:
	ce = CrossEntropyWithSoftmax()
	errs = ClassificationError()
	ol = Plus()

Validating network. 29 nodes to process in pass 1.

Validating --> labels = InputValue() :  -> [10 x *]
Validating --> model.arrayOfFunctions[3].W = LearnableParameter() :  -> [10 x 0]
Validating --> model.arrayOfFunctions[2].arrayOfFunctions[0].W = LearnableParameter() :  -> [64 x 0]
Validating --> model.arrayOfFunctions[1].arrayOfFunctions[0].W = LearnableParameter() :  -> [5 x 5 x 0 x 16]
Validating --> ol.x.x.x.ElementTimesArgs[0] = LearnableParameter() :  -> [1 x 1]
Validating --> features = InputValue() :  -> [28 x 28 x 1 x *]
Validating --> _ol.x.x.x = ElementTimes (ol.x.x.x.ElementTimesArgs[0], features) : [1 x 1], [28 x 28 x 1 x *] -> [28 x 28 x 1 x *]
Node 'model.arrayOfFunctions[1].arrayOfFunctions[0].W' (LearnableParameter operation) operation: Tensor shape was inferred as [5 x 5 x 1 x 16].
Node 'model.arrayOfFunctions[1].arrayOfFunctions[0].W' (LearnableParameter operation): Initializing Parameter[5 x 5 x 1 x 16] <- glorotUniform(seed=3, init dims=[400 x 25], range=0.118818*1.000000, onCPU=true) { -0.01709269, ... }
.
Validating --> ol.x.x.x._.x.c = Convolution (model.arrayOfFunctions[1].arrayOfFunctions[0].W, _ol.x.x.x) : [5 x 5 x 1 x 16], [28 x 28 x 1 x *] -> [28 x 28 x 16 x *]
Validating --> model.arrayOfFunctions[1].arrayOfFunctions[1].scale = LearnableParameter() :  -> [0 x 1]
Validating --> model.arrayOfFunctions[1].arrayOfFunctions[1].bias = LearnableParameter() :  -> [0 x 1]
Validating --> model.arrayOfFunctions[1].arrayOfFunctions[1].runMean = LearnableParameter() :  -> [0 x 1]
Validating --> model.arrayOfFunctions[1].arrayOfFunctions[1].runVariance = LearnableParameter() :  -> [0 x 1]
Node 'model.arrayOfFunctions[1].arrayOfFunctions[1].scale' (LearnableParameter operation) operation: Tensor shape was inferred as [16 x 1].
Node 'model.arrayOfFunctions[1].arrayOfFunctions[1].scale' (LearnableParameter operation): Initializing Parameter[16 x 1] <- 1.000000.
Node 'model.arrayOfFunctions[1].arrayOfFunctions[1].bias' (LearnableParameter operation) operation: Tensor shape was inferred as [16 x 1].
Node 'model.arrayOfFunctions[1].arrayOfFunctions[1].bias' (LearnableParameter operation): Initializing Parameter[16 x 1] <- 0.000000.
Node 'model.arrayOfFunctions[1].arrayOfFunctions[1].runMean' (LearnableParameter operation) operation: Tensor shape was inferred as [16 x 1].
Node 'model.arrayOfFunctions[1].arrayOfFunctions[1].runMean' (LearnableParameter operation): Initializing Parameter[16 x 1] <- 0.000000.
Node 'model.arrayOfFunctions[1].arrayOfFunctions[1].runVariance' (LearnableParameter operation) operation: Tensor shape was inferred as [16 x 1].
Node 'model.arrayOfFunctions[1].arrayOfFunctions[1].runVariance' (LearnableParameter operation): Initializing Parameter[16 x 1] <- 0.000000.
Validating --> ol.x.x.x._ = BatchNormalization (ol.x.x.x._.x.c, model.arrayOfFunctions[1].arrayOfFunctions[1].scale, model.arrayOfFunctions[1].arrayOfFunctions[1].bias, model.arrayOfFunctions[1].arrayOfFunctions[1].runMean, model.arrayOfFunctions[1].arrayOfFunctions[1].runVariance) : [28 x 28 x 16 x *], [16 x 1], [16 x 1], [16 x 1], [16 x 1] -> [28 x 28 x 16 x *]
Validating --> ol.x.x.x = RectifiedLinear (ol.x.x.x._) : [28 x 28 x 16 x *] -> [28 x 28 x 16 x *]
Validating --> ol.x.x = Pooling (ol.x.x.x) : [28 x 28 x 16 x *] -> [14 x 14 x 16 x *]
Node 'model.arrayOfFunctions[2].arrayOfFunctions[0].W' (LearnableParameter operation) operation: Tensor shape was inferred as [64 x 14 x 14 x 16].
Node 'model.arrayOfFunctions[2].arrayOfFunctions[0].W' (LearnableParameter operation): Initializing Parameter[64 x 14 x 14 x 16] <- glorotUniform(seed=2, init dims=[64 x 3136], range=0.043301*1.000000, onCPU=true) { -0.01858653, ... }
.
Validating --> ol.x._.x.PlusArgs[0] = Times (model.arrayOfFunctions[2].arrayOfFunctions[0].W, ol.x.x) : [64 x 14 x 14 x 16], [14 x 14 x 16 x *] -> [64 x *]
Validating --> model.arrayOfFunctions[2].arrayOfFunctions[0].b = LearnableParameter() :  -> [64]
Validating --> ol.x._.x = Plus (ol.x._.x.PlusArgs[0], model.arrayOfFunctions[2].arrayOfFunctions[0].b) : [64 x *], [64] -> [64 x *]
Validating --> model.arrayOfFunctions[2].arrayOfFunctions[1].scale = LearnableParameter() :  -> [0 x 1]
Validating --> model.arrayOfFunctions[2].arrayOfFunctions[1].bias = LearnableParameter() :  -> [0 x 1]
Validating --> model.arrayOfFunctions[2].arrayOfFunctions[1].runMean = LearnableParameter() :  -> [0 x 1]
Validating --> model.arrayOfFunctions[2].arrayOfFunctions[1].runVariance = LearnableParameter() :  -> [0 x 1]
Node 'model.arrayOfFunctions[2].arrayOfFunctions[1].scale' (LearnableParameter operation) operation: Tensor shape was inferred as [64 x 1].
Node 'model.arrayOfFunctions[2].arrayOfFunctions[1].scale' (LearnableParameter operation): Initializing Parameter[64 x 1] <- 1.000000.
Node 'model.arrayOfFunctions[2].arrayOfFunctions[1].bias' (LearnableParameter operation) operation: Tensor shape was inferred as [64 x 1].
Node 'model.arrayOfFunctions[2].arrayOfFunctions[1].bias' (LearnableParameter operation): Initializing Parameter[64 x 1] <- 0.000000.
Node 'model.arrayOfFunctions[2].arrayOfFunctions[1].runMean' (LearnableParameter operation) operation: Tensor shape was inferred as [64 x 1].
Node 'model.arrayOfFunctions[2].arrayOfFunctions[1].runMean' (LearnableParameter operation): Initializing Parameter[64 x 1] <- 0.000000.
Node 'model.arrayOfFunctions[2].arrayOfFunctions[1].runVariance' (LearnableParameter operation) operation: Tensor shape was inferred as [64 x 1].
Node 'model.arrayOfFunctions[2].arrayOfFunctions[1].runVariance' (LearnableParameter operation): Initializing Parameter[64 x 1] <- 0.000000.
Validating --> ol.x._ = BatchNormalization (ol.x._.x, model.arrayOfFunctions[2].arrayOfFunctions[1].scale, model.arrayOfFunctions[2].arrayOfFunctions[1].bias, model.arrayOfFunctions[2].arrayOfFunctions[1].runMean, model.arrayOfFunctions[2].arrayOfFunctions[1].runVariance) : [64 x *], [64 x 1], [64 x 1], [64 x 1], [64 x 1] -> [64 x *]
Validating --> ol.x = RectifiedLinear (ol.x._) : [64 x *] -> [64 x *]
Node 'model.arrayOfFunctions[3].W' (LearnableParameter operation) operation: Tensor shape was inferred as [10 x 64].
Node 'model.arrayOfFunctions[3].W' (LearnableParameter operation): Initializing Parameter[10 x 64] <- glorotUniform(seed=1, init dims=[10 x 64], range=0.284747*1.000000, onCPU=true) { -0.20348585, ... }
.
Validating --> ol.PlusArgs[0] = Times (model.arrayOfFunctions[3].W, ol.x) : [10 x 64], [64 x *] -> [10 x *]
Validating --> model.arrayOfFunctions[3].b = LearnableParameter() :  -> [10]
Validating --> ol = Plus (ol.PlusArgs[0], model.arrayOfFunctions[3].b) : [10 x *], [10] -> [10 x *]
Validating --> ce = CrossEntropyWithSoftmax (labels, ol) : [10 x *], [10 x *] -> [1]
Validating --> errs = ClassificationError (labels, ol) : [10 x *], [10 x *] -> [1]

Validating network. 13 nodes to process in pass 2.


Validating network, final pass.

ol.x.x.x._.x.c: using cuDNN convolution engine for geometry: Input: 28 x 28 x 1, Output: 28 x 28 x 16, Kernel: 5 x 5 x 1, Map: 16, Stride: 1 x 1 x 1, Sharing: (1, 1, 1), AutoPad: (1, 1, 0), LowerPad: 0 x 0 x 0, UpperPad: 0 x 0 x 0.
Using CNTK batch normalization engine.
ol.x.x: using cuDNN convolution engine for geometry: Input: 28 x 28 x 16, Output: 14 x 14 x 16, Kernel: 2 x 2 x 1, Map: 1, Stride: 2 x 2 x 1, Sharing: (1, 1, 1), AutoPad: (0, 0, 0), LowerPad: 0 x 0 x 0, UpperPad: 0 x 0 x 0.
Using CNTK batch normalization engine.



Post-processing network complete.

09/22/2016 16:27:44: 
Model has 29 nodes. Using GPU 0.

09/22/2016 16:27:44: Training criterion:   ce = CrossEntropyWithSoftmax
09/22/2016 16:27:44: Evaluation criterion: errs = ClassificationError


Allocating matrices for forward and/or backward propagation.

Memory Sharing: Out of 49 matrices, 22 are shared as 10, and 27 are not shared.

	{ ol.x.x : [14 x 14 x 16 x *]
	  ol.x.x.x._ : [28 x 28 x 16 x *] (gradient) }
	{ model.arrayOfFunctions[1].arrayOfFunctions[1].scale : [16 x 1] (gradient)
	  ol.x._.x.PlusArgs[0] : [64 x *]
	  ol.x.x.x : [28 x 28 x 16 x *] (gradient) }
	{ ol.x.x.x : [28 x 28 x 16 x *]
	  ol.x.x.x._.x.c : [28 x 28 x 16 x *] (gradient) }
	{ ol.PlusArgs[0] : [10 x *]
	  ol.x._ : [64 x *] (gradient) }
	{ model.arrayOfFunctions[3].W : [10 x 64] (gradient)
	  ol : [10 x *] (gradient) }
	{ model.arrayOfFunctions[2].arrayOfFunctions[0].b : [64] (gradient)
	  ol.PlusArgs[0] : [10 x *] (gradient) }
	{ model.arrayOfFunctions[2].arrayOfFunctions[1].scale : [64 x 1] (gradient)
	  ol.x : [64 x *] (gradient) }
	{ ol.x : [64 x *]
	  ol.x._.x : [64 x *] (gradient)
	  ol.x.x : [14 x 14 x 16 x *] (gradient) }
	{ model.arrayOfFunctions[2].arrayOfFunctions[0].W : [64 x 14 x 14 x 16] (gradient)
	  ol.x._.x : [64 x *] }
	{ model.arrayOfFunctions[1].arrayOfFunctions[0].W : [5 x 5 x 1 x 16] (gradient)
	  ol.x._.x.PlusArgs[0] : [64 x *] (gradient) }


09/22/2016 16:27:44: Training 201978 parameters in 9 out of 9 parameter tensors and 20 nodes with gradient:

09/22/2016 16:27:44: 	Node 'model.arrayOfFunctions[1].arrayOfFunctions[0].W' (LearnableParameter operation) : [5 x 5 x 1 x 16]
09/22/2016 16:27:44: 	Node 'model.arrayOfFunctions[1].arrayOfFunctions[1].bias' (LearnableParameter operation) : [16 x 1]
09/22/2016 16:27:44: 	Node 'model.arrayOfFunctions[1].arrayOfFunctions[1].scale' (LearnableParameter operation) : [16 x 1]
09/22/2016 16:27:44: 	Node 'model.arrayOfFunctions[2].arrayOfFunctions[0].W' (LearnableParameter operation) : [64 x 14 x 14 x 16]
09/22/2016 16:27:44: 	Node 'model.arrayOfFunctions[2].arrayOfFunctions[0].b' (LearnableParameter operation) : [64]
09/22/2016 16:27:44: 	Node 'model.arrayOfFunctions[2].arrayOfFunctions[1].bias' (LearnableParameter operation) : [64 x 1]
09/22/2016 16:27:44: 	Node 'model.arrayOfFunctions[2].arrayOfFunctions[1].scale' (LearnableParameter operation) : [64 x 1]
09/22/2016 16:27:44: 	Node 'model.arrayOfFunctions[3].W' (LearnableParameter operation) : [10 x 64]
09/22/2016 16:27:44: 	Node 'model.arrayOfFunctions[3].b' (LearnableParameter operation) : [10]

09/22/2016 16:27:44: No PreCompute nodes found, or all already computed. Skipping pre-computation step.

09/22/2016 16:27:44: Starting Epoch 1: learning rate per sample = 0.010000  effective momentum = 0.000000  momentum as time constant = 0.0 samples

09/22/2016 16:27:44: Starting minibatch loop.
09/22/2016 16:27:47:  Epoch[ 1 of 3]-Minibatch[   1- 500, 53.33%]: ce = 0.15361041 * 32000; errs = 4.659% * 32000; time = 3.2004s; samplesPerSecond = 9998.9
09/22/2016 16:27:49: Finished Epoch[ 1 of 3]: [Training] ce = 0.11345640 * 60000; errs = 3.435% * 60000; totalSamplesSeen = 60000; learningRatePerSample = 0.0099999998; epochTime=4.6244s
09/22/2016 16:27:49: SGD: Saving checkpoint model 'C:\Users\svcphil\AppData\Local\Temp\cntk-test-20160922162518.374503\Examples\Image\GettingStarted_04_OneConvBN@release_gpu/Models/04_OneConvBN.1'

09/22/2016 16:27:49: Starting Epoch 2: learning rate per sample = 0.010000  effective momentum = 0.000000  momentum as time constant = 0.0 samples

09/22/2016 16:27:49: Starting minibatch loop.
09/22/2016 16:27:50:  Epoch[ 2 of 3]-Minibatch[   1- 500, 53.33%]: ce = 0.04455460 * 32000; errs = 1.394% * 32000; time = 1.6107s; samplesPerSecond = 19866.6
09/22/2016 16:27:52: Finished Epoch[ 2 of 3]: [Training] ce = 0.04574103 * 60000; errs = 1.442% * 60000; totalSamplesSeen = 120000; learningRatePerSample = 0.0099999998; epochTime=3.0276s
09/22/2016 16:27:52: SGD: Saving checkpoint model 'C:\Users\svcphil\AppData\Local\Temp\cntk-test-20160922162518.374503\Examples\Image\GettingStarted_04_OneConvBN@release_gpu/Models/04_OneConvBN.2'

09/22/2016 16:27:52: Starting Epoch 3: learning rate per sample = 0.010000  effective momentum = 0.000000  momentum as time constant = 0.0 samples

09/22/2016 16:27:52: Starting minibatch loop.
09/22/2016 16:27:54:  Epoch[ 3 of 3]-Minibatch[   1- 500, 53.33%]: ce = 0.02988900 * 32000; errs = 0.963% * 32000; time = 1.6309s; samplesPerSecond = 19620.9
09/22/2016 16:27:55: Finished Epoch[ 3 of 3]: [Training] ce = 0.03209849 * 60000; errs = 1.035% * 60000; totalSamplesSeen = 180000; learningRatePerSample = 0.0099999998; epochTime=3.06063s
09/22/2016 16:27:55: SGD: Saving checkpoint model 'C:\Users\svcphil\AppData\Local\Temp\cntk-test-20160922162518.374503\Examples\Image\GettingStarted_04_OneConvBN@release_gpu/Models/04_OneConvBN'

09/22/2016 16:27:55: Action "train" complete.


09/22/2016 16:27:55: ##############################################################################
09/22/2016 16:27:55: #                                                                            #
09/22/2016 16:27:55: # testNetwork command (test action)                                          #
09/22/2016 16:27:55: #                                                                            #
09/22/2016 16:27:55: ##############################################################################


Post-processing network...

3 roots:
	ce = CrossEntropyWithSoftmax()
	errs = ClassificationError()
	ol = Plus()

Validating network. 29 nodes to process in pass 1.

Validating --> labels = InputValue() :  -> [10 x *1]
Validating --> model.arrayOfFunctions[3].W = LearnableParameter() :  -> [10 x 64]
Validating --> model.arrayOfFunctions[2].arrayOfFunctions[0].W = LearnableParameter() :  -> [64 x 14 x 14 x 16]
Validating --> model.arrayOfFunctions[1].arrayOfFunctions[0].W = LearnableParameter() :  -> [5 x 5 x 1 x 16]
Validating --> ol.x.x.x.ElementTimesArgs[0] = LearnableParameter() :  -> [1 x 1]
Validating --> features = InputValue() :  -> [28 x 28 x 1 x *1]
Validating --> _ol.x.x.x = ElementTimes (ol.x.x.x.ElementTimesArgs[0], features) : [1 x 1], [28 x 28 x 1 x *1] -> [28 x 28 x 1 x *1]
Validating --> ol.x.x.x._.x.c = Convolution (model.arrayOfFunctions[1].arrayOfFunctions[0].W, _ol.x.x.x) : [5 x 5 x 1 x 16], [28 x 28 x 1 x *1] -> [28 x 28 x 16 x *1]
Validating --> model.arrayOfFunctions[1].arrayOfFunctions[1].scale = LearnableParameter() :  -> [16 x 1]
Validating --> model.arrayOfFunctions[1].arrayOfFunctions[1].bias = LearnableParameter() :  -> [16 x 1]
Validating --> model.arrayOfFunctions[1].arrayOfFunctions[1].runMean = LearnableParameter() :  -> [16 x 1]
Validating --> model.arrayOfFunctions[1].arrayOfFunctions[1].runVariance = LearnableParameter() :  -> [16 x 1]
Validating --> ol.x.x.x._ = BatchNormalization (ol.x.x.x._.x.c, model.arrayOfFunctions[1].arrayOfFunctions[1].scale, model.arrayOfFunctions[1].arrayOfFunctions[1].bias, model.arrayOfFunctions[1].arrayOfFunctions[1].runMean, model.arrayOfFunctions[1].arrayOfFunctions[1].runVariance) : [28 x 28 x 16 x *1], [16 x 1], [16 x 1], [16 x 1], [16 x 1] -> [28 x 28 x 16 x *1]
Validating --> ol.x.x.x = RectifiedLinear (ol.x.x.x._) : [28 x 28 x 16 x *1] -> [28 x 28 x 16 x *1]
Validating --> ol.x.x = Pooling (ol.x.x.x) : [28 x 28 x 16 x *1] -> [14 x 14 x 16 x *1]
Validating --> ol.x._.x.PlusArgs[0] = Times (model.arrayOfFunctions[2].arrayOfFunctions[0].W, ol.x.x) : [64 x 14 x 14 x 16], [14 x 14 x 16 x *1] -> [64 x *1]
Validating --> model.arrayOfFunctions[2].arrayOfFunctions[0].b = LearnableParameter() :  -> [64]
Validating --> ol.x._.x = Plus (ol.x._.x.PlusArgs[0], model.arrayOfFunctions[2].arrayOfFunctions[0].b) : [64 x *1], [64] -> [64 x *1]
Validating --> model.arrayOfFunctions[2].arrayOfFunctions[1].scale = LearnableParameter() :  -> [64 x 1]
Validating --> model.arrayOfFunctions[2].arrayOfFunctions[1].bias = LearnableParameter() :  -> [64 x 1]
Validating --> model.arrayOfFunctions[2].arrayOfFunctions[1].runMean = LearnableParameter() :  -> [64 x 1]
Validating --> model.arrayOfFunctions[2].arrayOfFunctions[1].runVariance = LearnableParameter() :  -> [64 x 1]
Validating --> ol.x._ = BatchNormalization (ol.x._.x, model.arrayOfFunctions[2].arrayOfFunctions[1].scale, model.arrayOfFunctions[2].arrayOfFunctions[1].bias, model.arrayOfFunctions[2].arrayOfFunctions[1].runMean, model.arrayOfFunctions[2].arrayOfFunctions[1].runVariance) : [64 x *1], [64 x 1], [64 x 1], [64 x 1], [64 x 1] -> [64 x *1]
Validating --> ol.x = RectifiedLinear (ol.x._) : [64 x *1] -> [64 x *1]
Validating --> ol.PlusArgs[0] = Times (model.arrayOfFunctions[3].W, ol.x) : [10 x 64], [64 x *1] -> [10 x *1]
Validating --> model.arrayOfFunctions[3].b = LearnableParameter() :  -> [10]
Validating --> ol = Plus (ol.PlusArgs[0], model.arrayOfFunctions[3].b) : [10 x *1], [10] -> [10 x *1]
Validating --> ce = CrossEntropyWithSoftmax (labels, ol) : [10 x *1], [10 x *1] -> [1]
Validating --> errs = ClassificationError (labels, ol) : [10 x *1], [10 x *1] -> [1]

Validating network. 13 nodes to process in pass 2.


Validating network, final pass.

ol.x.x.x._.x.c: using cuDNN convolution engine for geometry: Input: 28 x 28 x 1, Output: 28 x 28 x 16, Kernel: 5 x 5 x 1, Map: 16, Stride: 1 x 1 x 1, Sharing: (1, 1, 1), AutoPad: (1, 1, 0), LowerPad: 0 x 0 x 0, UpperPad: 0 x 0 x 0.
Using CNTK batch normalization engine.
ol.x.x: using cuDNN convolution engine for geometry: Input: 28 x 28 x 16, Output: 14 x 14 x 16, Kernel: 2 x 2 x 1, Map: 1, Stride: 2 x 2 x 1, Sharing: (1, 1, 1), AutoPad: (0, 0, 0), LowerPad: 0 x 0 x 0, UpperPad: 0 x 0 x 0.
Using CNTK batch normalization engine.



Post-processing network complete.

evalNodeNames are not specified, using all the default evalnodes and training criterion nodes.


Allocating matrices for forward and/or backward propagation.

Memory Sharing: Out of 29 matrices, 0 are shared as 0, and 29 are not shared.


09/22/2016 16:27:56: Minibatch[1-10]: errs = 1.230% * 10000; ce = 0.03794491 * 10000
09/22/2016 16:27:56: Final Results: Minibatch[1-10]: errs = 1.230% * 10000; ce = 0.03794491 * 10000; perplexity = 1.03867401

09/22/2016 16:27:56: Action "test" complete.

09/22/2016 16:27:56: __COMPLETED__
