CPU info:
    CPU Model Name: Intel(R) Xeon(R) CPU E5-2630 v2 @ 2.60GHz
    Hardware threads: 24
    Total Memory: 268381192 kB
-------------------------------------------------------------------
=== Running /cygdrive/c/jenkins/workspace/CNTK-Test-Windows-W1/x64/release/cntk.exe configFile=C:\jenkins\workspace\CNTK-Test-Windows-W1\Examples\Image\MNIST\Config\Ndl_deprecated/03_ConvBatchNorm_ndl_deprecated.cntk currentDirectory=C:\Users\svcphil\AppData\Local\Temp\cntk-test-20160913085011.264477\Examples\Image\MNIST_03_ConvBatchNorm_ndl@release_gpu\TestData RunDir=C:\Users\svcphil\AppData\Local\Temp\cntk-test-20160913085011.264477\Examples\Image\MNIST_03_ConvBatchNorm_ndl@release_gpu DataDir=C:\Users\svcphil\AppData\Local\Temp\cntk-test-20160913085011.264477\Examples\Image\MNIST_03_ConvBatchNorm_ndl@release_gpu\TestData ConfigDir=C:\jenkins\workspace\CNTK-Test-Windows-W1\Examples\Image\MNIST\Config\Ndl_deprecated OutputDir=C:\Users\svcphil\AppData\Local\Temp\cntk-test-20160913085011.264477\Examples\Image\MNIST_03_ConvBatchNorm_ndl@release_gpu DeviceId=0 timestamping=true train=[SGD=[maxEpochs=3]] imageLayout="cudnn" stderr=-
-------------------------------------------------------------------
Build info: 

		Built time: Sep 13 2016 08:34:18
		Last modified date: Tue Sep 13 08:14:18 2016
		Build type: Release
		Build target: GPU
		Math lib: mkl
		CUDA_PATH: C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.5
		CUB_PATH: C:\src\cub-1.4.1
		CUDNN_PATH: c:\NVIDIA\cudnn-5.1\cuda
		Build Branch: HEAD
		Build SHA1: dc27f22964e37ed721e445787ed58be17efe8f24
		Built by svcphil on liana-08-w
		Build Path: c:\jenkins\workspace\CNTK-Build-Windows\Source\CNTK\
-------------------------------------------------------------------
Changed current directory to C:\Users\svcphil\AppData\Local\Temp\cntk-test-20160913085011.264477\Examples\Image\MNIST_03_ConvBatchNorm_ndl@release_gpu\TestData
09/13/2016 08:53:50: Redirecting stderr to file -_train_test.log
09/13/2016 08:53:50: -------------------------------------------------------------------
09/13/2016 08:53:50: Build info: 

09/13/2016 08:53:50: 		Built time: Sep 13 2016 08:34:18
09/13/2016 08:53:50: 		Last modified date: Tue Sep 13 08:14:18 2016
09/13/2016 08:53:50: 		Build type: Release
09/13/2016 08:53:50: 		Build target: GPU
09/13/2016 08:53:50: 		Math lib: mkl
09/13/2016 08:53:50: 		CUDA_PATH: C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.5
09/13/2016 08:53:50: 		CUB_PATH: C:\src\cub-1.4.1
09/13/2016 08:53:50: 		CUDNN_PATH: c:\NVIDIA\cudnn-5.1\cuda
09/13/2016 08:53:50: 		Build Branch: HEAD
09/13/2016 08:53:50: 		Build SHA1: dc27f22964e37ed721e445787ed58be17efe8f24
09/13/2016 08:53:50: 		Built by svcphil on liana-08-w
09/13/2016 08:53:50: 		Build Path: c:\jenkins\workspace\CNTK-Build-Windows\Source\CNTK\
09/13/2016 08:53:50: -------------------------------------------------------------------
09/13/2016 08:53:51: -------------------------------------------------------------------
09/13/2016 08:53:51: GPU info:

09/13/2016 08:53:51: 		Device[0]: cores = 2880; computeCapability = 3.5; type = "GeForce GTX 780 Ti"; memory = 3072 MB
09/13/2016 08:53:51: 		Device[1]: cores = 2880; computeCapability = 3.5; type = "GeForce GTX 780 Ti"; memory = 3072 MB
09/13/2016 08:53:51: 		Device[2]: cores = 2880; computeCapability = 3.5; type = "GeForce GTX 780 Ti"; memory = 3072 MB
09/13/2016 08:53:51: -------------------------------------------------------------------

09/13/2016 08:53:51: Running on DPHAIM-22 at 2016/09/13 08:53:51
09/13/2016 08:53:51: Command line: 
C:\jenkins\workspace\CNTK-Test-Windows-W1\x64\release\cntk.exe  configFile=C:\jenkins\workspace\CNTK-Test-Windows-W1\Examples\Image\MNIST\Config\Ndl_deprecated/03_ConvBatchNorm_ndl_deprecated.cntk  currentDirectory=C:\Users\svcphil\AppData\Local\Temp\cntk-test-20160913085011.264477\Examples\Image\MNIST_03_ConvBatchNorm_ndl@release_gpu\TestData  RunDir=C:\Users\svcphil\AppData\Local\Temp\cntk-test-20160913085011.264477\Examples\Image\MNIST_03_ConvBatchNorm_ndl@release_gpu  DataDir=C:\Users\svcphil\AppData\Local\Temp\cntk-test-20160913085011.264477\Examples\Image\MNIST_03_ConvBatchNorm_ndl@release_gpu\TestData  ConfigDir=C:\jenkins\workspace\CNTK-Test-Windows-W1\Examples\Image\MNIST\Config\Ndl_deprecated  OutputDir=C:\Users\svcphil\AppData\Local\Temp\cntk-test-20160913085011.264477\Examples\Image\MNIST_03_ConvBatchNorm_ndl@release_gpu  DeviceId=0  timestamping=true  train=[SGD=[maxEpochs=3]]  imageLayout="cudnn"  stderr=-


Configuration After Processing and Variable Resolution:

configparameters: 03_ConvBatchNorm_ndl_deprecated.cntk:command=train:test
configparameters: 03_ConvBatchNorm_ndl_deprecated.cntk:ConfigDir=C:\jenkins\workspace\CNTK-Test-Windows-W1\Examples\Image\MNIST\Config\Ndl_deprecated
configparameters: 03_ConvBatchNorm_ndl_deprecated.cntk:currentDirectory=C:\Users\svcphil\AppData\Local\Temp\cntk-test-20160913085011.264477\Examples\Image\MNIST_03_ConvBatchNorm_ndl@release_gpu\TestData
configparameters: 03_ConvBatchNorm_ndl_deprecated.cntk:DataDir=C:\Users\svcphil\AppData\Local\Temp\cntk-test-20160913085011.264477\Examples\Image\MNIST_03_ConvBatchNorm_ndl@release_gpu\TestData
configparameters: 03_ConvBatchNorm_ndl_deprecated.cntk:deviceId=0
configparameters: 03_ConvBatchNorm_ndl_deprecated.cntk:imageLayout=cudnn
configparameters: 03_ConvBatchNorm_ndl_deprecated.cntk:ModelDir=C:\Users\svcphil\AppData\Local\Temp\cntk-test-20160913085011.264477\Examples\Image\MNIST_03_ConvBatchNorm_ndl@release_gpu/Models
configparameters: 03_ConvBatchNorm_ndl_deprecated.cntk:modelPath=C:\Users\svcphil\AppData\Local\Temp\cntk-test-20160913085011.264477\Examples\Image\MNIST_03_ConvBatchNorm_ndl@release_gpu/Models/03_ConvBatchNorm
configparameters: 03_ConvBatchNorm_ndl_deprecated.cntk:numMBsToShowResult=500
configparameters: 03_ConvBatchNorm_ndl_deprecated.cntk:OutputDir=C:\Users\svcphil\AppData\Local\Temp\cntk-test-20160913085011.264477\Examples\Image\MNIST_03_ConvBatchNorm_ndl@release_gpu
configparameters: 03_ConvBatchNorm_ndl_deprecated.cntk:precision=float
configparameters: 03_ConvBatchNorm_ndl_deprecated.cntk:RootDir=..
configparameters: 03_ConvBatchNorm_ndl_deprecated.cntk:RunDir=C:\Users\svcphil\AppData\Local\Temp\cntk-test-20160913085011.264477\Examples\Image\MNIST_03_ConvBatchNorm_ndl@release_gpu
configparameters: 03_ConvBatchNorm_ndl_deprecated.cntk:stderr=-
configparameters: 03_ConvBatchNorm_ndl_deprecated.cntk:test=[
    action = "test"
    minibatchSize = 1024
    modelPath=C:\Users\svcphil\AppData\Local\Temp\cntk-test-20160913085011.264477\Examples\Image\MNIST_03_ConvBatchNorm_ndl@release_gpu/Models/03_ConvBatchNorm
    reader = [
        readerType = "CNTKTextFormatReader"
        file = "C:\Users\svcphil\AppData\Local\Temp\cntk-test-20160913085011.264477\Examples\Image\MNIST_03_ConvBatchNorm_ndl@release_gpu\TestData/Test-28x28_cntk_text.txt"
        input = [
            features = [
                dim = 784
                format = "dense"
            ]
            labels = [
                dim = 10
                format = "dense"
            ]
        ]
    ]
]

configparameters: 03_ConvBatchNorm_ndl_deprecated.cntk:timestamping=true
configparameters: 03_ConvBatchNorm_ndl_deprecated.cntk:traceLevel=1
configparameters: 03_ConvBatchNorm_ndl_deprecated.cntk:train=[
    action = "train"
    NDLNetworkBuilder = [
        imageLayout = "cudnn"
        initOnCPUOnly=true
        ndlMacros = "C:\jenkins\workspace\CNTK-Test-Windows-W1\Examples\Image\MNIST\Config\Ndl_deprecated/Macros.ndl"
        networkDescription = "C:\jenkins\workspace\CNTK-Test-Windows-W1\Examples\Image\MNIST\Config\Ndl_deprecated/03_ConvBatchNorm.ndl"
    ]
    SGD = [
        epochSize = 60000
        minibatchSize = 32
        learningRatesPerMB = 0.5:0.1
        momentumPerMB = 0.9
        maxEpochs = 2
        batchNormalizationBlendTimeConstant=0:1#INF
    ]
    reader = [
        readerType = "CNTKTextFormatReader"
        file = "C:\Users\svcphil\AppData\Local\Temp\cntk-test-20160913085011.264477\Examples\Image\MNIST_03_ConvBatchNorm_ndl@release_gpu\TestData/Train-28x28_cntk_text.txt"
        input = [
            features = [
                dim = 784
                format = "dense"
            ]
            labels = [
                dim = 10
                format = "dense"
            ]
        ]
    ]
] [SGD=[maxEpochs=3]]

09/13/2016 08:53:51: Commands: train test
09/13/2016 08:53:51: Precision = "float"
09/13/2016 08:53:51: CNTKModelPath: C:\Users\svcphil\AppData\Local\Temp\cntk-test-20160913085011.264477\Examples\Image\MNIST_03_ConvBatchNorm_ndl@release_gpu/Models/03_ConvBatchNorm
09/13/2016 08:53:51: CNTKCommandTrainInfo: train : 3
09/13/2016 08:53:51: CNTKCommandTrainInfo: CNTKNoMoreCommands_Total : 3

09/13/2016 08:53:51: ##############################################################################
09/13/2016 08:53:51: #                                                                            #
09/13/2016 08:53:51: # Action "train"                                                             #
09/13/2016 08:53:51: #                                                                            #
09/13/2016 08:53:51: ##############################################################################

09/13/2016 08:53:51: CNTKCommandTrainBegin: train

09/13/2016 08:53:51: Creating virgin network.
NDLBuilder Using GPU 0
Node 'featScale' (LearnableParameter operation): Initializing Parameter[1 x 1] <- 0.000000.
Node 'conv1.c.W' (LearnableParameter operation): Initializing Parameter[16 x 25] <- 0.000000.
Node 'conv1.c.c.b' (LearnableParameter operation): Initializing Parameter[16 x 1] <- 0.000000.
Node 'conv1.c.c.sc' (LearnableParameter operation): Initializing Parameter[16 x 1] <- 0.000000.
Node 'conv1.c.c.m' (LearnableParameter operation): Initializing Parameter[16 x 1] <- 0.000000.
Node 'conv1.c.c.v' (LearnableParameter operation): Initializing Parameter[16 x 1] <- 0.000000.
Node 'conv2.c.W' (LearnableParameter operation): Initializing Parameter[32 x 400] <- 0.000000.
Node 'conv2.c.c.b' (LearnableParameter operation): Initializing Parameter[32 x 1] <- 0.000000.
Node 'conv2.c.c.sc' (LearnableParameter operation): Initializing Parameter[32 x 1] <- 0.000000.
Node 'conv2.c.c.m' (LearnableParameter operation): Initializing Parameter[32 x 1] <- 0.000000.
Node 'conv2.c.c.v' (LearnableParameter operation): Initializing Parameter[32 x 1] <- 0.000000.
Node 'h1.W' (LearnableParameter operation): Initializing Parameter[128 x 1568] <- 0.000000.
Node 'h1.b' (LearnableParameter operation): Initializing Parameter[128 x 1] <- 0.000000.
Node 'h1.sc' (LearnableParameter operation): Initializing Parameter[128 x 1] <- 0.000000.
Node 'h1.m' (LearnableParameter operation): Initializing Parameter[128 x 1] <- 0.000000.
Node 'h1.v' (LearnableParameter operation): Initializing Parameter[128 x 1] <- 0.000000.
Node 'ol.W' (LearnableParameter operation): Initializing Parameter[10 x 128] <- 0.000000.
Node 'ol.b' (LearnableParameter operation): Initializing Parameter[10 x 1] <- 0.000000.
Node 'featScale' (LearnableParameter operation): Initializing Parameter[1 x 1] <- 0.003906.
Node 'featScale' (LearnableParameter operation): Initializing Parameter[1 x 1] <- 0.003906.
Node 'featScale' (LearnableParameter operation): Initializing Parameter[1 x 1] <- 0.003906.
Node 'conv1.c.W' (LearnableParameter operation): Initializing Parameter[16 x 25] <- gaussian(seed=1, init dims=[16 x 25], range=0.040000*10.000000, onCPU=true).
Node 'conv1.c.c.b' (LearnableParameter operation): Initializing Parameter[16 x 1] <- 0.000000.
Node 'conv1.c.c.sc' (LearnableParameter operation): Initializing Parameter[16 x 1] <- 1.000000.
Node 'conv1.c.c.m' (LearnableParameter operation): Initializing Parameter[16 x 1] <- 0.000000.
Node 'conv1.c.c.v' (LearnableParameter operation): Initializing Parameter[16 x 1] <- 0.000000.
Node 'conv2.c.W' (LearnableParameter operation): Initializing Parameter[32 x 400] <- gaussian(seed=2, init dims=[32 x 400], range=0.010000*10.000000, onCPU=true).
Node 'conv2.c.c.b' (LearnableParameter operation): Initializing Parameter[32 x 1] <- 0.000000.
Node 'conv2.c.c.sc' (LearnableParameter operation): Initializing Parameter[32 x 1] <- 1.000000.
Node 'conv2.c.c.m' (LearnableParameter operation): Initializing Parameter[32 x 1] <- 0.000000.
Node 'conv2.c.c.v' (LearnableParameter operation): Initializing Parameter[32 x 1] <- 0.000000.
Node 'h1.W' (LearnableParameter operation): Initializing Parameter[128 x 1568] <- gaussian(seed=3, init dims=[128 x 1568], range=0.005051*1.000000, onCPU=true).
Node 'h1.b' (LearnableParameter operation): Initializing Parameter[128 x 1] <- 0.000000.
Node 'h1.sc' (LearnableParameter operation): Initializing Parameter[128 x 1] <- 1.000000.
Node 'h1.m' (LearnableParameter operation): Initializing Parameter[128 x 1] <- 0.000000.
Node 'h1.v' (LearnableParameter operation): Initializing Parameter[128 x 1] <- 0.000000.
Node 'ol.W' (LearnableParameter operation): Initializing Parameter[10 x 128] <- uniform(seed=4, init dims=[10 x 128], range=0.050000*1.000000, onCPU=true).
Node 'ol.b' (LearnableParameter operation): Initializing Parameter[10 x 1] <- uniform(seed=5, init dims=[10 x 1], range=0.050000*1.000000, onCPU=true).

Post-processing network...

3 roots:
	ce = CrossEntropyWithSoftmax()
	errs = ClassificationError()
	ol.z = Plus()

Validating network. 36 nodes to process in pass 1.

Validating --> labels = InputValue() :  -> [10 x *]
Validating --> ol.W = LearnableParameter() :  -> [10 x 128]
Validating --> h1.W = LearnableParameter() :  -> [128 x 1568]
Validating --> conv2.c.W = LearnableParameter() :  -> [32 x 400]
Validating --> conv1.c.W = LearnableParameter() :  -> [16 x 25]
Validating --> featScale = LearnableParameter() :  -> [1 x 1]
Validating --> features = InputValue() :  -> [28 x 28 x 1 x *]
Validating --> featScaled = ElementTimes (featScale, features) : [1 x 1], [28 x 28 x 1 x *] -> [28 x 28 x 1 x *]
Validating --> conv1.c.c.c = Convolution (conv1.c.W, featScaled) : [16 x 25], [28 x 28 x 1 x *] -> [28 x 28 x 16 x *]
Validating --> conv1.c.c.sc = LearnableParameter() :  -> [16 x 1]
Validating --> conv1.c.c.b = LearnableParameter() :  -> [16 x 1]
Validating --> conv1.c.c.m = LearnableParameter() :  -> [16 x 1]
Validating --> conv1.c.c.v = LearnableParameter() :  -> [16 x 1]
Validating --> conv1.c.c.y = BatchNormalization (conv1.c.c.c, conv1.c.c.sc, conv1.c.c.b, conv1.c.c.m, conv1.c.c.v) : [28 x 28 x 16 x *], [16 x 1], [16 x 1], [16 x 1], [16 x 1] -> [28 x 28 x 16 x *]
Validating --> conv1.y = RectifiedLinear (conv1.c.c.y) : [28 x 28 x 16 x *] -> [28 x 28 x 16 x *]
Validating --> pool1 = MaxPooling (conv1.y) : [28 x 28 x 16 x *] -> [14 x 14 x 16 x *]
Validating --> conv2.c.c.c = Convolution (conv2.c.W, pool1) : [32 x 400], [14 x 14 x 16 x *] -> [14 x 14 x 32 x *]
Validating --> conv2.c.c.sc = LearnableParameter() :  -> [32 x 1]
Validating --> conv2.c.c.b = LearnableParameter() :  -> [32 x 1]
Validating --> conv2.c.c.m = LearnableParameter() :  -> [32 x 1]
Validating --> conv2.c.c.v = LearnableParameter() :  -> [32 x 1]
Validating --> conv2.c.c.y = BatchNormalization (conv2.c.c.c, conv2.c.c.sc, conv2.c.c.b, conv2.c.c.m, conv2.c.c.v) : [14 x 14 x 32 x *], [32 x 1], [32 x 1], [32 x 1], [32 x 1] -> [14 x 14 x 32 x *]
Validating --> conv2.y = RectifiedLinear (conv2.c.c.y) : [14 x 14 x 32 x *] -> [14 x 14 x 32 x *]
Validating --> pool2 = MaxPooling (conv2.y) : [14 x 14 x 32 x *] -> [7 x 7 x 32 x *]

h1.t Times operation: For legacy compatibility, the sample layout of left input (h1.W LearnableParameter operation) was patched to [128 x 7 x 7 x 32] (from [128 x 1568])
Validating --> h1.t = Times (h1.W, pool2) : [128 x 7 x 7 x 32], [7 x 7 x 32 x *] -> [128 x *]
Validating --> h1.sc = LearnableParameter() :  -> [128 x 1]
Validating --> h1.b = LearnableParameter() :  -> [128 x 1]
Validating --> h1.m = LearnableParameter() :  -> [128 x 1]
Validating --> h1.v = LearnableParameter() :  -> [128 x 1]
Validating --> h1.bn = BatchNormalization (h1.t, h1.sc, h1.b, h1.m, h1.v) : [128 x *], [128 x 1], [128 x 1], [128 x 1], [128 x 1] -> [128 x *]
Validating --> h1.y = RectifiedLinear (h1.bn) : [128 x *] -> [128 x *]
Validating --> ol.t = Times (ol.W, h1.y) : [10 x 128], [128 x *] -> [10 x *]
Validating --> ol.b = LearnableParameter() :  -> [10 x 1]
Validating --> ol.z = Plus (ol.t, ol.b) : [10 x *], [10 x 1] -> [10 x 1 x *]
Validating --> ce = CrossEntropyWithSoftmax (labels, ol.z) : [10 x *], [10 x 1 x *] -> [1]
Validating --> errs = ClassificationError (labels, ol.z) : [10 x *], [10 x 1 x *] -> [1]

Validating network. 16 nodes to process in pass 2.


Validating network, final pass.

conv1.c.c.c: using cuDNN convolution engine for geometry: Input: 28 x 28 x 1, Output: 28 x 28 x 16, Kernel: 5 x 5 x 1, Map: 1 x 1 x 16, Stride: 1 x 1 x 1, Sharing: (1), AutoPad: (1), LowerPad: 0, UpperPad: 0.
Using CNTK batch normalization engine.
pool1: using cuDNN convolution engine for geometry: Input: 28 x 28 x 16, Output: 14 x 14 x 16, Kernel: 2 x 2 x 1, Map: 1, Stride: 2 x 2 x 1, Sharing: (1), AutoPad: (0), LowerPad: 0, UpperPad: 0.
conv2.c.c.c: using cuDNN convolution engine for geometry: Input: 14 x 14 x 16, Output: 14 x 14 x 32, Kernel: 5 x 5 x 16, Map: 1 x 1 x 32, Stride: 1 x 1 x 16, Sharing: (1), AutoPad: (1), LowerPad: 0, UpperPad: 0.
Using CNTK batch normalization engine.
pool2: using cuDNN convolution engine for geometry: Input: 14 x 14 x 32, Output: 7 x 7 x 32, Kernel: 2 x 2 x 1, Map: 1, Stride: 2 x 2 x 1, Sharing: (1), AutoPad: (0), LowerPad: 0, UpperPad: 0.
Using CNTK batch normalization engine.


20 out of 36 nodes do not share the minibatch layout with the input data.

Post-processing network complete.

09/13/2016 08:53:53: Created model with 36 nodes on GPU 0.

09/13/2016 08:53:53: Training criterion node(s):
09/13/2016 08:53:53: 	ce = CrossEntropyWithSoftmax

09/13/2016 08:53:53: Evaluation criterion node(s):
09/13/2016 08:53:53: 	errs = ClassificationError


Allocating matrices for forward and/or backward propagation.

Memory Sharing: Out of 61 matrices, 28 are shared as 12, and 33 are not shared.

	{ h1.bn : [128 x *] (gradient)
	  ol.t : [10 x *] }
	{ ol.t : [10 x *] (gradient)
	  pool1 : [14 x 14 x 16 x *] (gradient)
	  pool2 : [7 x 7 x 32 x *] (gradient) }
	{ h1.sc : [128 x 1] (gradient)
	  h1.y : [128 x *] (gradient) }
	{ conv1.c.W : [16 x 25] (gradient)
	  conv2.c.c.c : [14 x 14 x 32 x *] }
	{ conv2.c.c.y : [14 x 14 x 32 x *] (gradient)
	  pool2 : [7 x 7 x 32 x *] }
	{ conv1.c.c.b : [16 x 1] (gradient)
	  conv2.c.c.c : [14 x 14 x 32 x *] (gradient)
	  conv2.y : [14 x 14 x 32 x *] }
	{ conv2.c.W : [32 x 400] (gradient)
	  h1.t : [128 x *] (gradient)
	  h1.y : [128 x *] }
	{ conv1.c.c.sc : [16 x 1] (gradient)
	  conv1.y : [28 x 28 x 16 x *] (gradient) }
	{ conv2.c.c.sc : [32 x 1] (gradient)
	  conv2.y : [14 x 14 x 32 x *] (gradient)
	  h1.t : [128 x *] }
	{ ol.W : [10 x 128] (gradient)
	  ol.z : [10 x 1 x *] (gradient) }
	{ conv1.c.c.y : [28 x 28 x 16 x *] (gradient)
	  pool1 : [14 x 14 x 16 x *] }
	{ conv1.c.c.c : [28 x 28 x 16 x *] (gradient)
	  conv1.y : [28 x 28 x 16 x *] }


09/13/2016 08:53:53: Training 215546 parameters in 11 out of 11 parameter tensors and 25 nodes with gradient:

09/13/2016 08:53:53: 	Node 'conv1.c.W' (LearnableParameter operation) : [16 x 25]
09/13/2016 08:53:53: 	Node 'conv1.c.c.b' (LearnableParameter operation) : [16 x 1]
09/13/2016 08:53:53: 	Node 'conv1.c.c.sc' (LearnableParameter operation) : [16 x 1]
09/13/2016 08:53:53: 	Node 'conv2.c.W' (LearnableParameter operation) : [32 x 400]
09/13/2016 08:53:53: 	Node 'conv2.c.c.b' (LearnableParameter operation) : [32 x 1]
09/13/2016 08:53:53: 	Node 'conv2.c.c.sc' (LearnableParameter operation) : [32 x 1]
09/13/2016 08:53:53: 	Node 'h1.W' (LearnableParameter operation) : [128 x 7 x 7 x 32]
09/13/2016 08:53:53: 	Node 'h1.b' (LearnableParameter operation) : [128 x 1]
09/13/2016 08:53:53: 	Node 'h1.sc' (LearnableParameter operation) : [128 x 1]
09/13/2016 08:53:53: 	Node 'ol.W' (LearnableParameter operation) : [10 x 128]
09/13/2016 08:53:53: 	Node 'ol.b' (LearnableParameter operation) : [10 x 1]

09/13/2016 08:53:53: No PreCompute nodes found, or all already computed. Skipping pre-computation step.

09/13/2016 08:53:53: Starting Epoch 1: learning rate per sample = 0.015625  effective momentum = 0.900000  momentum as time constant = 303.7 samples

09/13/2016 08:53:53: Starting minibatch loop.
09/13/2016 08:53:56:  Epoch[ 1 of 3]-Minibatch[   1- 500, 26.67%]: ce = 0.17360495 * 16000; errs = 5.431% * 16000; time = 3.4445s; samplesPerSecond = 4645.1
09/13/2016 08:53:58:  Epoch[ 1 of 3]-Minibatch[ 501-1000, 53.33%]: ce = 0.08595715 * 16000; errs = 2.594% * 16000; time = 1.6553s; samplesPerSecond = 9666.0
09/13/2016 08:54:00:  Epoch[ 1 of 3]-Minibatch[1001-1500, 80.00%]: ce = 0.06540656 * 16000; errs = 2.031% * 16000; time = 1.6421s; samplesPerSecond = 9743.9
09/13/2016 08:54:01: Finished Epoch[ 1 of 3]: [Training] ce = 0.09696341 * 60000; errs = 3.013% * 60000; totalSamplesSeen = 60000; learningRatePerSample = 0.015625; epochTime=7.99169s
09/13/2016 08:54:01: SGD: Saving checkpoint model 'C:\Users\svcphil\AppData\Local\Temp\cntk-test-20160913085011.264477\Examples\Image\MNIST_03_ConvBatchNorm_ndl@release_gpu/Models/03_ConvBatchNorm.1'
Setting batch normalization blend time constant to 1.#INF.

09/13/2016 08:54:01: Starting Epoch 2: learning rate per sample = 0.003125  effective momentum = 0.900000  momentum as time constant = 303.7 samples

09/13/2016 08:54:01: Starting minibatch loop.
09/13/2016 08:54:03:  Epoch[ 2 of 3]-Minibatch[   1- 500, 26.67%]: ce = 0.02389898 * 16000; errs = 0.750% * 16000; time = 1.6323s; samplesPerSecond = 9801.9
09/13/2016 08:54:04:  Epoch[ 2 of 3]-Minibatch[ 501-1000, 53.33%]: ce = 0.01968084 * 16000; errs = 0.569% * 16000; time = 1.6322s; samplesPerSecond = 9802.9
09/13/2016 08:54:06:  Epoch[ 2 of 3]-Minibatch[1001-1500, 80.00%]: ce = 0.02329140 * 16000; errs = 0.744% * 16000; time = 1.6312s; samplesPerSecond = 9808.7
09/13/2016 08:54:07: Finished Epoch[ 2 of 3]: [Training] ce = 0.02231229 * 60000; errs = 0.690% * 60000; totalSamplesSeen = 120000; learningRatePerSample = 0.003125; epochTime=6.12732s
09/13/2016 08:54:07: SGD: Saving checkpoint model 'C:\Users\svcphil\AppData\Local\Temp\cntk-test-20160913085011.264477\Examples\Image\MNIST_03_ConvBatchNorm_ndl@release_gpu/Models/03_ConvBatchNorm.2'

09/13/2016 08:54:07: Starting Epoch 3: learning rate per sample = 0.003125  effective momentum = 0.900000  momentum as time constant = 303.7 samples

09/13/2016 08:54:07: Starting minibatch loop.
09/13/2016 08:54:09:  Epoch[ 3 of 3]-Minibatch[   1- 500, 26.67%]: ce = 0.01647731 * 16000; errs = 0.506% * 16000; time = 1.6329s; samplesPerSecond = 9798.5
09/13/2016 08:54:10:  Epoch[ 3 of 3]-Minibatch[ 501-1000, 53.33%]: ce = 0.01558325 * 16000; errs = 0.413% * 16000; time = 1.6318s; samplesPerSecond = 9805.2
09/13/2016 08:54:12:  Epoch[ 3 of 3]-Minibatch[1001-1500, 80.00%]: ce = 0.01761477 * 16000; errs = 0.519% * 16000; time = 1.6308s; samplesPerSecond = 9811.1
09/13/2016 08:54:13: Finished Epoch[ 3 of 3]: [Training] ce = 0.01708807 * 60000; errs = 0.510% * 60000; totalSamplesSeen = 180000; learningRatePerSample = 0.003125; epochTime=6.13331s
09/13/2016 08:54:13: SGD: Saving checkpoint model 'C:\Users\svcphil\AppData\Local\Temp\cntk-test-20160913085011.264477\Examples\Image\MNIST_03_ConvBatchNorm_ndl@release_gpu/Models/03_ConvBatchNorm'
09/13/2016 08:54:13: CNTKCommandTrainEnd: train

09/13/2016 08:54:13: Action "train" complete.


09/13/2016 08:54:13: ##############################################################################
09/13/2016 08:54:13: #                                                                            #
09/13/2016 08:54:13: # Action "test"                                                              #
09/13/2016 08:54:13: #                                                                            #
09/13/2016 08:54:13: ##############################################################################


Post-processing network...

3 roots:
	ce = CrossEntropyWithSoftmax()
	errs = ClassificationError()
	ol.z = Plus()

Validating network. 36 nodes to process in pass 1.

Validating --> labels = InputValue() :  -> [10 x *1]
Validating --> ol.W = LearnableParameter() :  -> [10 x 128]
Validating --> h1.W = LearnableParameter() :  -> [128 x 7 x 7 x 32]
Validating --> conv2.c.W = LearnableParameter() :  -> [32 x 400]
Validating --> conv1.c.W = LearnableParameter() :  -> [16 x 25]
Validating --> featScale = LearnableParameter() :  -> [1 x 1]
Validating --> features = InputValue() :  -> [28 x 28 x 1 x *1]
Validating --> featScaled = ElementTimes (featScale, features) : [1 x 1], [28 x 28 x 1 x *1] -> [28 x 28 x 1 x *1]
Validating --> conv1.c.c.c = Convolution (conv1.c.W, featScaled) : [16 x 25], [28 x 28 x 1 x *1] -> [28 x 28 x 16 x *1]
Validating --> conv1.c.c.sc = LearnableParameter() :  -> [16 x 1]
Validating --> conv1.c.c.b = LearnableParameter() :  -> [16 x 1]
Validating --> conv1.c.c.m = LearnableParameter() :  -> [16 x 1]
Validating --> conv1.c.c.v = LearnableParameter() :  -> [16 x 1]
Validating --> conv1.c.c.y = BatchNormalization (conv1.c.c.c, conv1.c.c.sc, conv1.c.c.b, conv1.c.c.m, conv1.c.c.v) : [28 x 28 x 16 x *1], [16 x 1], [16 x 1], [16 x 1], [16 x 1] -> [28 x 28 x 16 x *1]
Validating --> conv1.y = RectifiedLinear (conv1.c.c.y) : [28 x 28 x 16 x *1] -> [28 x 28 x 16 x *1]
Validating --> pool1 = MaxPooling (conv1.y) : [28 x 28 x 16 x *1] -> [14 x 14 x 16 x *1]
Validating --> conv2.c.c.c = Convolution (conv2.c.W, pool1) : [32 x 400], [14 x 14 x 16 x *1] -> [14 x 14 x 32 x *1]
Validating --> conv2.c.c.sc = LearnableParameter() :  -> [32 x 1]
Validating --> conv2.c.c.b = LearnableParameter() :  -> [32 x 1]
Validating --> conv2.c.c.m = LearnableParameter() :  -> [32 x 1]
Validating --> conv2.c.c.v = LearnableParameter() :  -> [32 x 1]
Validating --> conv2.c.c.y = BatchNormalization (conv2.c.c.c, conv2.c.c.sc, conv2.c.c.b, conv2.c.c.m, conv2.c.c.v) : [14 x 14 x 32 x *1], [32 x 1], [32 x 1], [32 x 1], [32 x 1] -> [14 x 14 x 32 x *1]
Validating --> conv2.y = RectifiedLinear (conv2.c.c.y) : [14 x 14 x 32 x *1] -> [14 x 14 x 32 x *1]
Validating --> pool2 = MaxPooling (conv2.y) : [14 x 14 x 32 x *1] -> [7 x 7 x 32 x *1]
Validating --> h1.t = Times (h1.W, pool2) : [128 x 7 x 7 x 32], [7 x 7 x 32 x *1] -> [128 x *1]
Validating --> h1.sc = LearnableParameter() :  -> [128 x 1]
Validating --> h1.b = LearnableParameter() :  -> [128 x 1]
Validating --> h1.m = LearnableParameter() :  -> [128 x 1]
Validating --> h1.v = LearnableParameter() :  -> [128 x 1]
Validating --> h1.bn = BatchNormalization (h1.t, h1.sc, h1.b, h1.m, h1.v) : [128 x *1], [128 x 1], [128 x 1], [128 x 1], [128 x 1] -> [128 x *1]
Validating --> h1.y = RectifiedLinear (h1.bn) : [128 x *1] -> [128 x *1]
Validating --> ol.t = Times (ol.W, h1.y) : [10 x 128], [128 x *1] -> [10 x *1]
Validating --> ol.b = LearnableParameter() :  -> [10 x 1]
Validating --> ol.z = Plus (ol.t, ol.b) : [10 x *1], [10 x 1] -> [10 x 1 x *1]
Validating --> ce = CrossEntropyWithSoftmax (labels, ol.z) : [10 x *1], [10 x 1 x *1] -> [1]
Validating --> errs = ClassificationError (labels, ol.z) : [10 x *1], [10 x 1 x *1] -> [1]

Validating network. 16 nodes to process in pass 2.


Validating network, final pass.

conv1.c.c.c: using cuDNN convolution engine for geometry: Input: 28 x 28 x 1, Output: 28 x 28 x 16, Kernel: 5 x 5 x 1, Map: 1 x 1 x 16, Stride: 1 x 1 x 1, Sharing: (1), AutoPad: (1), LowerPad: 0, UpperPad: 0.
Using CNTK batch normalization engine.
pool1: using cuDNN convolution engine for geometry: Input: 28 x 28 x 16, Output: 14 x 14 x 16, Kernel: 2 x 2 x 1, Map: 1, Stride: 2 x 2 x 1, Sharing: (1), AutoPad: (0), LowerPad: 0, UpperPad: 0.
conv2.c.c.c: using cuDNN convolution engine for geometry: Input: 14 x 14 x 16, Output: 14 x 14 x 32, Kernel: 5 x 5 x 16, Map: 1 x 1 x 32, Stride: 1 x 1 x 16, Sharing: (1), AutoPad: (1), LowerPad: 0, UpperPad: 0.
Using CNTK batch normalization engine.
pool2: using cuDNN convolution engine for geometry: Input: 14 x 14 x 32, Output: 7 x 7 x 32, Kernel: 2 x 2 x 1, Map: 1, Stride: 2 x 2 x 1, Sharing: (1), AutoPad: (0), LowerPad: 0, UpperPad: 0.
Using CNTK batch normalization engine.


20 out of 36 nodes do not share the minibatch layout with the input data.

Post-processing network complete.

evalNodeNames are not specified, using all the default evalnodes and training criterion nodes.


Allocating matrices for forward and/or backward propagation.

Memory Sharing: Out of 36 matrices, 0 are shared as 0, and 36 are not shared.


09/13/2016 08:54:14: Minibatch[1-10]: errs = 0.760% * 10000; ce = 0.02298447 * 10000
09/13/2016 08:54:14: Final Results: Minibatch[1-10]: errs = 0.760% * 10000; ce = 0.02298447 * 10000; perplexity = 1.02325065

09/13/2016 08:54:14: Action "test" complete.

09/13/2016 08:54:14: __COMPLETED__
