CPU info:
    CPU Model Name: Intel(R) Xeon(R) CPU W3565 @ 3.20GHz
    Hardware threads: 8
    Total Memory: 12580436 kB
-------------------------------------------------------------------
=== Running /cygdrive/c/jenkins/workspace/CNTK-Test-Windows-SlaveTest/x64/release/cntk.exe configFile=C:\jenkins\workspace\CNTK-Test-Windows-SlaveTest\Examples\SequenceToSequence\PennTreebank\Config/rnn.cntk currentDirectory=C:\jenkins\workspace\CNTK-Test-Windows-SlaveTest\Examples\SequenceToSequence\PennTreebank\Data RunDir=C:\Users\svcphil\AppData\Local\Temp\cntk-test-20161221004215.44492\Examples\Text\PennTreebank_RNN@release_gpu DataDir=C:\jenkins\workspace\CNTK-Test-Windows-SlaveTest\Examples\SequenceToSequence\PennTreebank\Data ConfigDir=C:\jenkins\workspace\CNTK-Test-Windows-SlaveTest\Examples\SequenceToSequence\PennTreebank\Config OutputDir=C:\Users\svcphil\AppData\Local\Temp\cntk-test-20161221004215.44492\Examples\Text\PennTreebank_RNN@release_gpu DeviceId=0 timestamping=true initOnCPUOnly=true command=train:test train=[SGD=[maxEpochs=3]] train=[epochSize=2048] test=[SGD=[maxEpochs=3]] test=[epochSize=2048] train=[reader=[wordclass="$DataDir$/vocab.txt"]] train=[cvreader=[wordclass="$DataDir$/vocab.txt"]] test=[reader=[wordclass="$DataDir$/vocab.txt"]]
CNTK 2.0.beta6.0+ (HEAD bbb440, Dec 20 2016 05:52:50) on cntk-muc01 at 2016/12/21 00:52:08

C:\jenkins\workspace\CNTK-Test-Windows-SlaveTest\x64\release\cntk.exe  configFile=C:\jenkins\workspace\CNTK-Test-Windows-SlaveTest\Examples\SequenceToSequence\PennTreebank\Config/rnn.cntk  currentDirectory=C:\jenkins\workspace\CNTK-Test-Windows-SlaveTest\Examples\SequenceToSequence\PennTreebank\Data  RunDir=C:\Users\svcphil\AppData\Local\Temp\cntk-test-20161221004215.44492\Examples\Text\PennTreebank_RNN@release_gpu  DataDir=C:\jenkins\workspace\CNTK-Test-Windows-SlaveTest\Examples\SequenceToSequence\PennTreebank\Data  ConfigDir=C:\jenkins\workspace\CNTK-Test-Windows-SlaveTest\Examples\SequenceToSequence\PennTreebank\Config  OutputDir=C:\Users\svcphil\AppData\Local\Temp\cntk-test-20161221004215.44492\Examples\Text\PennTreebank_RNN@release_gpu  DeviceId=0  timestamping=true  initOnCPUOnly=true  command=train:test  train=[SGD=[maxEpochs=3]]  train=[epochSize=2048]  test=[SGD=[maxEpochs=3]]  test=[epochSize=2048]  train=[reader=[wordclass="$DataDir$/vocab.txt"]]  train=[cvreader=[wordclass="$DataDir$/vocab.txt"]]  test=[reader=[wordclass="$DataDir$/vocab.txt"]]
Changed current directory to C:\jenkins\workspace\CNTK-Test-Windows-SlaveTest\Examples\SequenceToSequence\PennTreebank\Data
12/21/2016 00:52:08: -------------------------------------------------------------------
12/21/2016 00:52:08: Build info: 

12/21/2016 00:52:08: 		Built time: Dec 20 2016 05:52:50
12/21/2016 00:52:08: 		Last modified date: Tue Dec 20 03:38:53 2016
12/21/2016 00:52:08: 		Build type: Release
12/21/2016 00:52:08: 		Build target: GPU
12/21/2016 00:52:08: 		With ASGD: yes
12/21/2016 00:52:08: 		Math lib: mkl
12/21/2016 00:52:08: 		CUDA_PATH: C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v8.0
12/21/2016 00:52:08: 		CUB_PATH: c:\src\cub-1.4.1
12/21/2016 00:52:08: 		CUDNN_PATH: C:\local\cudnn-8.0-windows10-x64-v5.1
12/21/2016 00:52:08: 		Build Branch: HEAD
12/21/2016 00:52:08: 		Build SHA1: bbb4406fcb69b190109ec665d8c0a619271f1b24 (modified)
12/21/2016 00:52:08: 		Built by svcphil on liana-08-w
12/21/2016 00:52:08: 		Build Path: C:\jenkins\workspace\CNTK-Build-Windows\Source\CNTK\
12/21/2016 00:52:08: -------------------------------------------------------------------
12/21/2016 00:52:08: -------------------------------------------------------------------
12/21/2016 00:52:08: GPU info:

12/21/2016 00:52:08: 		Device[0]: cores = 2496; computeCapability = 5.2; type = "Quadro M4000"; memory = 8192 MB
12/21/2016 00:52:08: -------------------------------------------------------------------

Configuration After Processing and Variable Resolution:

configparameters: rnn.cntk:command=train:test
configparameters: rnn.cntk:confClassSize=50
configparameters: rnn.cntk:ConfigDir=C:\jenkins\workspace\CNTK-Test-Windows-SlaveTest\Examples\SequenceToSequence\PennTreebank\Config
configparameters: rnn.cntk:confVocabSize=10000
configparameters: rnn.cntk:currentDirectory=C:\jenkins\workspace\CNTK-Test-Windows-SlaveTest\Examples\SequenceToSequence\PennTreebank\Data
configparameters: rnn.cntk:DataDir=C:\jenkins\workspace\CNTK-Test-Windows-SlaveTest\Examples\SequenceToSequence\PennTreebank\Data
configparameters: rnn.cntk:deviceId=0
configparameters: rnn.cntk:initOnCPUOnly=true
configparameters: rnn.cntk:ModelDir=C:\Users\svcphil\AppData\Local\Temp\cntk-test-20161221004215.44492\Examples\Text\PennTreebank_RNN@release_gpu/Models
configparameters: rnn.cntk:modelPath=C:\Users\svcphil\AppData\Local\Temp\cntk-test-20161221004215.44492\Examples\Text\PennTreebank_RNN@release_gpu/Models/rnn.dnn
configparameters: rnn.cntk:numCPUThreads=1
configparameters: rnn.cntk:OutputDir=C:\Users\svcphil\AppData\Local\Temp\cntk-test-20161221004215.44492\Examples\Text\PennTreebank_RNN@release_gpu
configparameters: rnn.cntk:precision=float
configparameters: rnn.cntk:RootDir=..
configparameters: rnn.cntk:RunDir=C:\Users\svcphil\AppData\Local\Temp\cntk-test-20161221004215.44492\Examples\Text\PennTreebank_RNN@release_gpu
configparameters: rnn.cntk:test=[
    action = "eval"
    minibatchSize = 8192
    traceLevel = 1
    epochSize = 0
    reader = [
        readerType = "LMSequenceReader"
        randomize = "none"
        nbruttsineachrecurrentiter = 0
        cacheBlockSize = 2000000
        wordclass = "C:\Users\svcphil\AppData\Local\Temp\cntk-test-20161221004215.44492\Examples\Text\PennTreebank_RNN@release_gpu/Models/vocab.txt"
        wfile = "C:\Users\svcphil\AppData\Local\Temp\cntk-test-20161221004215.44492\Examples\Text\PennTreebank_RNN@release_gpu/sequenceSentence.bin"
        wsize = 256
        wrecords = 1000
        windowSize = "10000"
        file = "C:\jenkins\workspace\CNTK-Test-Windows-SlaveTest\Examples\SequenceToSequence\PennTreebank\Data/ptb.test.txt"
        features = [
            dim = 0
            sectionType = "data"
        ]
        labelIn = [
            dim = 1
            labelDim = "10000"
            labelMappingFile = "C:\Users\svcphil\AppData\Local\Temp\cntk-test-20161221004215.44492\Examples\Text\PennTreebank_RNN@release_gpu/sentenceLabels.txt"
            labelType = "Category"
            beginSequence = "</s>"
            endSequence = "</s>"
            elementSize = 4
            sectionType = "labels"
            mapping = [
                wrecords = 11
                elementSize = 10
                sectionType = "labelMapping"
            ]
            category = [
                dim = 11
                sectionType = "categoryLabels"
            ]
        ]
        labels = [
            dim = 1
            labelType = "NextWord"
            beginSequence = "O"
            endSequence = "O"
            labelDim = "10000"
            labelMappingFile = "C:\Users\svcphil\AppData\Local\Temp\cntk-test-20161221004215.44492\Examples\Text\PennTreebank_RNN@release_gpu/sentenceLabels.out.txt"
            elementSize = 4
            sectionType = "labels"
            mapping = [
                wrecords = 3
                elementSize = 10
                sectionType = "labelMapping"
            ]
            category = [
                dim = 3
                sectionType = "categoryLabels"
            ]
        ]
    ]
] [SGD=[maxEpochs=3]] [epochSize=2048] [reader=[wordclass="C:\jenkins\workspace\CNTK-Test-Windows-SlaveTest\Examples\SequenceToSequence\PennTreebank\Data/vocab.txt"]]

configparameters: rnn.cntk:testFile=ptb.test.txt
configparameters: rnn.cntk:timestamping=true
configparameters: rnn.cntk:traceLevel=1
configparameters: rnn.cntk:train=[
    action = "train"
    traceLevel = 1
    epochSize = 0
    SimpleNetworkBuilder = [
        rnnType = "CLASSLSTM"
        recurrentLayer = 1
        trainingCriterion = "classCrossEntropyWithSoftmax"
        evalCriterion     = "classCrossEntropyWithSoftmax"
        initValueScale = 6.0
        uniformInit = true
        layerSizes = "10000:150:200:10000"
        defaultHiddenActivity = 0.1
        addPrior = false
        addDropoutNodes = false
        applyMeanVarNorm = false
        lookupTableOrder = 1
        vocabSize = "10000"
        nbrClass  = "50"
    ]
    SGD = [
        minibatchSize = 128:256:512
        learningRatesPerSample = 0.1
        momentumPerMB = 0
        gradientClippingWithTruncation = true
        clippingThresholdPerSample = 15.0
        maxEpochs = 16
        numMBsToShowResult = 100
        gradUpdateType = "none"
        loadBestModel = true
        dropoutRate = 0.0
        AutoAdjust = [
            autoAdjustLR = "adjustAfterEpoch"
            reduceLearnRateIfImproveLessThan = 0.001
            continueReduce = false
            increaseLearnRateIfImproveMoreThan = 1000000000
            learnRateDecreaseFactor = 0.5
            learnRateIncreaseFactor = 1.382
            numMiniBatch4LRSearch = 100
            numPrevLearnRates = 5
            numBestSearchEpoch = 1
        ]
    ]
    reader = [
        readerType = "LMSequenceReader"
        randomize = "none"
        nbruttsineachrecurrentiter = 0
        cacheBlockSize = 2000000
        wordclass = "C:\Users\svcphil\AppData\Local\Temp\cntk-test-20161221004215.44492\Examples\Text\PennTreebank_RNN@release_gpu/Models/vocab.txt"
        wfile = "C:\Users\svcphil\AppData\Local\Temp\cntk-test-20161221004215.44492\Examples\Text\PennTreebank_RNN@release_gpu/sequenceSentence.bin"
        wsize = 256
        wrecords = 1000
        windowSize = "10000"
        file = "C:\jenkins\workspace\CNTK-Test-Windows-SlaveTest\Examples\SequenceToSequence\PennTreebank\Data/ptb.train.txt"
        features = [
            dim = 0
            sectionType = "data"
        ]
        labelIn = [
            dim = 1
            labelType = "Category"
            beginSequence = "</s>"
            endSequence = "</s>"
            labelDim = "10000"
            labelMappingFile = "C:\Users\svcphil\AppData\Local\Temp\cntk-test-20161221004215.44492\Examples\Text\PennTreebank_RNN@release_gpu/sentenceLabels.txt"
            elementSize = 4
            sectionType = "labels"
            mapping = [
                wrecords = 11                
                elementSize = 10
                sectionType = "labelMapping"
            ]
            category = [
                dim = 11
                sectionType = "categoryLabels"
            ]
        ]
        labels = [
            dim = 1
            labelType = "NextWord"
            beginSequence = "O"
            endSequence = "O"
            labelDim = "10000"
            labelMappingFile = "C:\Users\svcphil\AppData\Local\Temp\cntk-test-20161221004215.44492\Examples\Text\PennTreebank_RNN@release_gpu/sentenceLabels.out.txt"
            elementSize = 4
            sectionType = "labels"
            mapping = [
                wrecords = 3
                elementSize = 10
                sectionType = "labelMapping"
            ]
            category = [
                dim = 3
                sectionType = "categoryLabels"
            ]
        ]
    ]
    cvReader = [
        readerType = "LMSequenceReader"
        randomize = "none"
        nbruttsineachrecurrentiter = 0
        cacheBlockSize = 2000000
        wordclass = "C:\Users\svcphil\AppData\Local\Temp\cntk-test-20161221004215.44492\Examples\Text\PennTreebank_RNN@release_gpu/Models/vocab.txt"
        wfile = "C:\Users\svcphil\AppData\Local\Temp\cntk-test-20161221004215.44492\Examples\Text\PennTreebank_RNN@release_gpu/sequenceSentence.valid.bin"
        wsize = 256
        wrecords = 1000
        windowSize = "10000"
        file = "C:\jenkins\workspace\CNTK-Test-Windows-SlaveTest\Examples\SequenceToSequence\PennTreebank\Data/ptb.valid.txt"
        features = [
            dim = 0
            sectionType = "data"
        ]
        labelIn = [
            dim = 1
            labelDim = "10000"
            labelMappingFile = "C:\Users\svcphil\AppData\Local\Temp\cntk-test-20161221004215.44492\Examples\Text\PennTreebank_RNN@release_gpu/sentenceLabels.out.txt"
            labelType = "Category"
            beginSequence = "</s>"
            endSequence = "</s>"
            elementSize = 4
            sectionType = "labels"
            mapping = [
                wrecords = 11
                elementSize = 10
                sectionType = "labelMapping"
            ]
            category = [
                dim = 11
                sectionType = "categoryLabels"
            ]
        ]
        labels = [
            dim = 1
            labelType = "NextWord"
            beginSequence = "O"
            endSequence = "O"
            labelDim = "10000"
            labelMappingFile = "C:\Users\svcphil\AppData\Local\Temp\cntk-test-20161221004215.44492\Examples\Text\PennTreebank_RNN@release_gpu/sentenceLabels.out.txt"
            elementSize = 4
            sectionType = "labels"
            mapping = [
                wrecords = 3
                elementSize = 10
                sectionType = "labelMapping"
            ]
            category = [
                dim = 3
                sectionType = "categoryLabels"
            ]
        ]
    ]
] [SGD=[maxEpochs=3]] [epochSize=2048] [reader=[wordclass="C:\jenkins\workspace\CNTK-Test-Windows-SlaveTest\Examples\SequenceToSequence\PennTreebank\Data/vocab.txt"]] [cvreader=[wordclass="C:\jenkins\workspace\CNTK-Test-Windows-SlaveTest\Examples\SequenceToSequence\PennTreebank\Data/vocab.txt"]]

configparameters: rnn.cntk:trainFile=ptb.train.txt
configparameters: rnn.cntk:validFile=ptb.valid.txt
configparameters: rnn.cntk:write=[
    action = "write"
    outputPath = "C:\Users\svcphil\AppData\Local\Temp\cntk-test-20161221004215.44492\Examples\Text\PennTreebank_RNN@release_gpu/Write"
    outputNodeNames = TrainNodeClassBasedCrossEntropy
    format = [
        sequencePrologue = "log P(W)="
        type = "real"
    ]
    minibatchSize = 8192
    traceLevel = 1
    epochSize = 0
    reader = [
        readerType = "LMSequenceReader"
        randomize = "none"
        nbruttsineachrecurrentiter = 1
        cacheBlockSize = 1
        wordclass = "C:\Users\svcphil\AppData\Local\Temp\cntk-test-20161221004215.44492\Examples\Text\PennTreebank_RNN@release_gpu/Models/vocab.txt"
        wfile = "C:\Users\svcphil\AppData\Local\Temp\cntk-test-20161221004215.44492\Examples\Text\PennTreebank_RNN@release_gpu/sequenceSentence.bin"
        wsize = 256
        wrecords = 1000
        windowSize = "10000"
        file = "C:\jenkins\workspace\CNTK-Test-Windows-SlaveTest\Examples\SequenceToSequence\PennTreebank\Data/ptb.test.txt"
        features = [
            dim = 0
            sectionType = "data"
        ]
        labelIn = [
            dim = 1
            labelDim = "10000"
            labelMappingFile = "C:\Users\svcphil\AppData\Local\Temp\cntk-test-20161221004215.44492\Examples\Text\PennTreebank_RNN@release_gpu/sentenceLabels.txt"
            labelType = "Category"
            beginSequence = "</s>"
            endSequence = "</s>"
            elementSize = 4
            sectionType = "labels"
            mapping = [
                wrecords = 11
                elementSize = 10
                sectionType = "labelMapping"
            ]
            category = [
                dim = 11
                sectionType = "categoryLabels"
            ]
        ]
        labels = [
            dim = 1
            labelType = "NextWord"
            beginSequence = "O"
            endSequence = "O"
            labelDim = "10000"
            labelMappingFile = "C:\Users\svcphil\AppData\Local\Temp\cntk-test-20161221004215.44492\Examples\Text\PennTreebank_RNN@release_gpu/sentenceLabels.out.txt"
            elementSize = 4
            sectionType = "labels"
            mapping = [
                wrecords = 3
                elementSize = 10
                sectionType = "labelMapping"
            ]
            category = [
                dim = 3
                sectionType = "categoryLabels"
            ]
        ]
    ]
]

configparameters: rnn.cntk:writeWordAndClassInfo=[
    action = "writeWordAndClass"
    inputFile = "C:\jenkins\workspace\CNTK-Test-Windows-SlaveTest\Examples\SequenceToSequence\PennTreebank\Data/ptb.train.txt"
    beginSequence = "</s>"
    endSequence   = "</s>"
    outputVocabFile = "C:\Users\svcphil\AppData\Local\Temp\cntk-test-20161221004215.44492\Examples\Text\PennTreebank_RNN@release_gpu/Models/vocab.txt"
    outputWord2Cls  = "C:\Users\svcphil\AppData\Local\Temp\cntk-test-20161221004215.44492\Examples\Text\PennTreebank_RNN@release_gpu/Models/word2cls.txt"
    outputCls2Index = "C:\Users\svcphil\AppData\Local\Temp\cntk-test-20161221004215.44492\Examples\Text\PennTreebank_RNN@release_gpu/Models/cls2idx.txt"
    vocabSize = "10000"
    nbrClass = "50"
    cutoff = 0
    printValues = true
]

12/21/2016 00:52:08: Commands: train test
12/21/2016 00:52:08: precision = "float"
12/21/2016 00:52:08: Using 1 CPU threads.

12/21/2016 00:52:08: ##############################################################################
12/21/2016 00:52:08: #                                                                            #
12/21/2016 00:52:08: # train command (train action)                                               #
12/21/2016 00:52:08: #                                                                            #
12/21/2016 00:52:08: ##############################################################################

12/21/2016 00:52:08: WARNING: 'numMiniBatch4LRSearch' is deprecated, please remove it and use 'numSamples4Search' instead.
12/21/2016 00:52:08: 
Creating virgin network.
SimpleNetworkBuilder Using GPU 0
Microsoft::MSR::CNTK::GPUMatrix<ElemType>::SetUniformRandomValue (GPU): creating curand object with seed 1, sizeof(ElemType)==4
LMSequenceReader: Label mapping will be created internally on the fly because the labelMappingFile was not found: C:\Users\svcphil\AppData\Local\Temp\cntk-test-20161221004215.44492\Examples\Text\PennTreebank_RNN@release_gpu/sentenceLabels.txt
LMSequenceReader: Label mapping will be created internally on the fly because the labelMappingFile was not found: C:\Users\svcphil\AppData\Local\Temp\cntk-test-20161221004215.44492\Examples\Text\PennTreebank_RNN@release_gpu/sentenceLabels.out.txt
LMSequenceReader: Input file is 'C:\jenkins\workspace\CNTK-Test-Windows-SlaveTest\Examples\SequenceToSequence\PennTreebank\Data/ptb.train.txt'.
LMSequenceReader: Label mapping will be created internally on the fly because the labelMappingFile was not found: C:\Users\svcphil\AppData\Local\Temp\cntk-test-20161221004215.44492\Examples\Text\PennTreebank_RNN@release_gpu/sentenceLabels.out.txt
LMSequenceReader: Label mapping will be created internally on the fly because the labelMappingFile was not found: C:\Users\svcphil\AppData\Local\Temp\cntk-test-20161221004215.44492\Examples\Text\PennTreebank_RNN@release_gpu/sentenceLabels.out.txt
LMSequenceReader: Input file is 'C:\jenkins\workspace\CNTK-Test-Windows-SlaveTest\Examples\SequenceToSequence\PennTreebank\Data/ptb.valid.txt'.
12/21/2016 00:52:09: 
Model has 63 nodes. Using GPU 0.

12/21/2016 00:52:09: Training criterion:   TrainNodeClassBasedCrossEntropy = ClassBasedCrossEntropyWithSoftmax


Allocating matrices for forward and/or backward propagation.

Memory Sharing: Out of 122 matrices, 26 are shared as 12, and 96 are not shared.

	{ AutoName3 : [200 x 1 x *]
	  WXC0 : [200 x 150] (gradient) }
	{ AutoName32 : [200 x *]
	  E0 : [150 x 10000] (gradient) }
	{ AutoName23 : [200 x *]
	  AutoName32 : [200 x *] (gradient) }
	{ AutoName24 : [200 x 1 x *]
	  WXF0 : [200 x 150] (gradient) }
	{ AutoName11 : [200 x 1 x *]
	  WXI0 : [200 x 150] (gradient) }
	{ AutoName33 : [200 x 1 x *]
	  WXO0 : [200 x 150] (gradient) }
	{ AutoName10 : [200 x *]
	  AutoName23 : [200 x *] (gradient)
	  bo0 : [200 x 1] (gradient) }
	{ AutoName10 : [200 x *] (gradient)
	  AutoName17 : [200 x *]
	  bf0 : [200 x 1] (gradient) }
	{ AutoName36 : [200 x 1 x *] (gradient)
	  ClassPostProb : [50 x 1 x *] }
	{ AutoName37 : [200 x 1 x *] (gradient)
	  ClassPostProb : [50 x 1 x *] (gradient) }
	{ AutoName17 : [200 x *] (gradient)
	  bi0 : [200 x 1] (gradient) }
	{ AutoName3 : [200 x 1 x *] (gradient)
	  LookupTable : [150 x *] (gradient) }


12/21/2016 00:52:09: Training 3791400 parameters in 18 out of 18 parameter tensors and 59 nodes with gradient:

12/21/2016 00:52:09: 	Node 'E0' (LearnableParameter operation) : [150 x 10000]
12/21/2016 00:52:09: 	Node 'W2' (LearnableParameter operation) : [200 x 10000]
12/21/2016 00:52:09: 	Node 'WCF0' (LearnableParameter operation) : [200 x 1]
12/21/2016 00:52:09: 	Node 'WCI0' (LearnableParameter operation) : [200 x 1]
12/21/2016 00:52:09: 	Node 'WCO0' (LearnableParameter operation) : [200 x 1]
12/21/2016 00:52:09: 	Node 'WHC0' (LearnableParameter operation) : [200 x 200]
12/21/2016 00:52:09: 	Node 'WHF0' (LearnableParameter operation) : [200 x 200]
12/21/2016 00:52:09: 	Node 'WHI0' (LearnableParameter operation) : [200 x 200]
12/21/2016 00:52:09: 	Node 'WHO0' (LearnableParameter operation) : [200 x 200]
12/21/2016 00:52:09: 	Node 'WXC0' (LearnableParameter operation) : [200 x 150]
12/21/2016 00:52:09: 	Node 'WXF0' (LearnableParameter operation) : [200 x 150]
12/21/2016 00:52:09: 	Node 'WXI0' (LearnableParameter operation) : [200 x 150]
12/21/2016 00:52:09: 	Node 'WXO0' (LearnableParameter operation) : [200 x 150]
12/21/2016 00:52:09: 	Node 'WeightForClassPostProb' (LearnableParameter operation) : [50 x 200]
12/21/2016 00:52:09: 	Node 'bc0' (LearnableParameter operation) : [200 x 1]
12/21/2016 00:52:09: 	Node 'bf0' (LearnableParameter operation) : [200 x 1]
12/21/2016 00:52:09: 	Node 'bi0' (LearnableParameter operation) : [200 x 1]
12/21/2016 00:52:09: 	Node 'bo0' (LearnableParameter operation) : [200 x 1]

12/21/2016 00:52:09: No PreCompute nodes found, or all already computed. Skipping pre-computation step.

12/21/2016 00:52:10: Starting Epoch 1: learning rate per sample = 0.100000  effective momentum = 0.000000  momentum as time constant = 0.0 samples

12/21/2016 00:52:10: Starting minibatch loop.
LMSequenceReader: Reading epoch data... 42068 sequences read.
WARNING: The same matrix with dim [4, 112] has been transferred between different devices for 20 times.
12/21/2016 00:52:12: Finished Epoch[ 1 of 3]: [Training] TrainNodeClassBasedCrossEntropy = 6.86386169 * 2061; totalSamplesSeen = 2061; learningRatePerSample = 0.1; epochTime=2.29748s
LMSequenceReader: Reading epoch data... 3370 sequences read.
LMSequenceReader: Reading epoch data... 0 sequences read.
12/21/2016 00:52:36: Final Results: Minibatch[1-704]: TrainNodeClassBasedCrossEntropy = 6.68271886 * 73760; perplexity = 798.48713417
12/21/2016 00:52:36: Finished Epoch[ 1 of 3]: [Validate] TrainNodeClassBasedCrossEntropy = 6.68271886 * 73760
12/21/2016 00:52:37: SGD: Saving checkpoint model 'C:\Users\svcphil\AppData\Local\Temp\cntk-test-20161221004215.44492\Examples\Text\PennTreebank_RNN@release_gpu/Models/rnn.dnn.1'

12/21/2016 00:52:37: Starting Epoch 2: learning rate per sample = 0.100000  effective momentum = 0.000000  momentum as time constant = 0.0 samples

12/21/2016 00:52:37: Starting minibatch loop.
LMSequenceReader: Reading epoch data... 42068 sequences read.
12/21/2016 00:52:38: Finished Epoch[ 2 of 3]: [Training] TrainNodeClassBasedCrossEntropy = 6.67562029 * 2081; totalSamplesSeen = 4142; learningRatePerSample = 0.1; epochTime=1.30858s
LMSequenceReader: Reading epoch data... 3370 sequences read.
LMSequenceReader: Reading epoch data... 0 sequences read.
12/21/2016 00:52:55: Final Results: Minibatch[1-353]: TrainNodeClassBasedCrossEntropy = 6.68034719 * 73760; perplexity = 796.59563218
12/21/2016 00:52:55: Finished Epoch[ 2 of 3]: [Validate] TrainNodeClassBasedCrossEntropy = 6.68034719 * 73760
12/21/2016 00:52:55: learnRatePerSample reduced to 0.050000001
12/21/2016 00:52:55: SGD: Saving checkpoint model 'C:\Users\svcphil\AppData\Local\Temp\cntk-test-20161221004215.44492\Examples\Text\PennTreebank_RNN@release_gpu/Models/rnn.dnn.2'

12/21/2016 00:52:56: Starting Epoch 3: learning rate per sample = 0.050000  effective momentum = 0.000000  momentum as time constant = 0.0 samples

12/21/2016 00:52:56: Starting minibatch loop.
LMSequenceReader: Reading epoch data... 42068 sequences read.
12/21/2016 00:52:57: Finished Epoch[ 3 of 3]: [Training] TrainNodeClassBasedCrossEntropy = 6.64467671 * 2343; totalSamplesSeen = 6485; learningRatePerSample = 0.050000001; epochTime=1.24087s
LMSequenceReader: Reading epoch data... 3370 sequences read.
LMSequenceReader: Reading epoch data... 0 sequences read.
12/21/2016 00:53:11: Final Results: Minibatch[1-193]: TrainNodeClassBasedCrossEntropy = 7.01536072 * 73760; perplexity = 1113.60827096
12/21/2016 00:53:11: Finished Epoch[ 3 of 3]: [Validate] TrainNodeClassBasedCrossEntropy = 7.01536072 * 73760
12/21/2016 00:53:11: Loading (rolling back to) previous model with best training-criterion value: C:\Users\svcphil\AppData\Local\Temp\cntk-test-20161221004215.44492\Examples\Text\PennTreebank_RNN@release_gpu/Models/rnn.dnn.2.
12/21/2016 00:53:11: learnRatePerSample reduced to 0.025
12/21/2016 00:53:11: SGD: revoke back to and update checkpoint file for epoch 2

12/21/2016 00:53:11: Starting Epoch 3: learning rate per sample = 0.025000  effective momentum = 0.000000  momentum as time constant = 0.0 samples

12/21/2016 00:53:11: Starting minibatch loop.
LMSequenceReader: Reading epoch data... 42068 sequences read.
12/21/2016 00:53:13: Finished Epoch[ 3 of 3]: [Training] TrainNodeClassBasedCrossEntropy = 6.37243835 * 2343; totalSamplesSeen = 6485; learningRatePerSample = 0.025; epochTime=1.11836s
LMSequenceReader: Reading epoch data... 3370 sequences read.
LMSequenceReader: Reading epoch data... 0 sequences read.
12/21/2016 00:53:27: Final Results: Minibatch[1-193]: TrainNodeClassBasedCrossEntropy = 6.48770243 * 73760; perplexity = 657.01209369
12/21/2016 00:53:27: Finished Epoch[ 3 of 3]: [Validate] TrainNodeClassBasedCrossEntropy = 6.48770243 * 73760
12/21/2016 00:53:27: SGD: Saving checkpoint model 'C:\Users\svcphil\AppData\Local\Temp\cntk-test-20161221004215.44492\Examples\Text\PennTreebank_RNN@release_gpu/Models/rnn.dnn'

12/21/2016 00:53:27: Action "train" complete.


12/21/2016 00:53:27: ##############################################################################
12/21/2016 00:53:27: #                                                                            #
12/21/2016 00:53:27: # test command (eval action)                                                 #
12/21/2016 00:53:27: #                                                                            #
12/21/2016 00:53:27: ##############################################################################

LMSequenceReader: Label mapping will be created internally on the fly because the labelMappingFile was not found: C:\Users\svcphil\AppData\Local\Temp\cntk-test-20161221004215.44492\Examples\Text\PennTreebank_RNN@release_gpu/sentenceLabels.txt
LMSequenceReader: Label mapping will be created internally on the fly because the labelMappingFile was not found: C:\Users\svcphil\AppData\Local\Temp\cntk-test-20161221004215.44492\Examples\Text\PennTreebank_RNN@release_gpu/sentenceLabels.out.txt
LMSequenceReader: Input file is 'C:\jenkins\workspace\CNTK-Test-Windows-SlaveTest\Examples\SequenceToSequence\PennTreebank\Data/ptb.test.txt'.

Post-processing network...

3 roots:
	PosteriorProb = Softmax()
	TrainNodeClassBasedCrossEntropy = ClassBasedCrossEntropyWithSoftmax()
	outputs = TransposeTimes()

Loop[0] --> Loop_AutoName38 -> 31 nodes

	AutoName3	AutoName31	AutoName34
	AutoName2	AutoName22	AutoName25
	AutoName6	AutoName21	AutoName26
	AutoName27	AutoName7	AutoName28
	AutoName1	AutoName9	AutoName12
	AutoName5	AutoName8	AutoName13
	AutoName14	AutoName4	AutoName15
	AutoName16	AutoName18	AutoName19
	AutoName20	AutoName29	AutoName30
	AutoName35	AutoName36	AutoName37
	AutoName38

Validating network. 63 nodes to process in pass 1.

Validating --> W2 = LearnableParameter() :  -> [200 x 10000]
Validating --> WXO0 = LearnableParameter() :  -> [200 x 150]
Validating --> E0 = LearnableParameter() :  -> [150 x 10000]
Validating --> features = SparseInputValue() :  -> [10000 x *1]
Validating --> LookupTable = LookupTable (E0, features) : [150 x 10000], [10000 x *1] -> [150 x *1]
Validating --> AutoName32 = Times (WXO0, LookupTable) : [200 x 150], [150 x *1] -> [200 x *1]
Validating --> bo0 = LearnableParameter() :  -> [200 x 1]
Validating --> AutoName33 = Plus (AutoName32, bo0) : [200 x *1], [200 x 1] -> [200 x 1 x *1]
Validating --> WHO0 = LearnableParameter() :  -> [200 x 200]
Validating --> WCO0 = LearnableParameter() :  -> [200 x 1]
Validating --> WXF0 = LearnableParameter() :  -> [200 x 150]
Validating --> AutoName23 = Times (WXF0, LookupTable) : [200 x 150], [150 x *1] -> [200 x *1]
Validating --> bf0 = LearnableParameter() :  -> [200 x 1]
Validating --> AutoName24 = Plus (AutoName23, bf0) : [200 x *1], [200 x 1] -> [200 x 1 x *1]
Validating --> WHF0 = LearnableParameter() :  -> [200 x 200]
Validating --> WCF0 = LearnableParameter() :  -> [200 x 1]
Validating --> WXI0 = LearnableParameter() :  -> [200 x 150]
Validating --> AutoName10 = Times (WXI0, LookupTable) : [200 x 150], [150 x *1] -> [200 x *1]
Validating --> bi0 = LearnableParameter() :  -> [200 x 1]
Validating --> AutoName11 = Plus (AutoName10, bi0) : [200 x *1], [200 x 1] -> [200 x 1 x *1]
Validating --> WHI0 = LearnableParameter() :  -> [200 x 200]
Validating --> WCI0 = LearnableParameter() :  -> [200 x 1]
Validating --> WXC0 = LearnableParameter() :  -> [200 x 150]
Validating --> AutoName17 = Times (WXC0, LookupTable) : [200 x 150], [150 x *1] -> [200 x *1]
Validating --> WHC0 = LearnableParameter() :  -> [200 x 200]
Validating --> bc0 = LearnableParameter() :  -> [200 x 1]
Validating --> AutoName31 = Times (WHO0, AutoName3) : [200 x 200], [200 x 1] -> [200 x 1]
Validating --> AutoName34 = Plus (AutoName33, AutoName31) : [200 x 1 x *1], [200 x 1] -> [200 x 1 x *1]
Validating --> AutoName22 = Times (WHF0, AutoName2) : [200 x 200], [200 x 1] -> [200 x 1]
Validating --> AutoName25 = Plus (AutoName24, AutoName22) : [200 x 1 x *1], [200 x 1] -> [200 x 1 x *1]
Validating --> AutoName21 = DiagTimes (WCF0, AutoName6) : [200 x 1], [200 x 1] -> [200 x 1]
Validating --> AutoName26 = Plus (AutoName25, AutoName21) : [200 x 1 x *1], [200 x 1] -> [200 x 1 x *1]
Validating --> AutoName27 = Sigmoid (AutoName26) : [200 x 1 x *1] -> [200 x 1 x *1]
Validating --> AutoName28 = ElementTimes (AutoName27, AutoName7) : [200 x 1 x *1], [200 x 1] -> [200 x 1 x *1]
Validating --> AutoName9 = Times (WHI0, AutoName1) : [200 x 200], [200 x 1] -> [200 x 1]
Validating --> AutoName12 = Plus (AutoName11, AutoName9) : [200 x 1 x *1], [200 x 1] -> [200 x 1 x *1]
Validating --> AutoName8 = DiagTimes (WCI0, AutoName5) : [200 x 1], [200 x 1] -> [200 x 1]
Validating --> AutoName13 = Plus (AutoName12, AutoName8) : [200 x 1 x *1], [200 x 1] -> [200 x 1 x *1]
Validating --> AutoName14 = Sigmoid (AutoName13) : [200 x 1 x *1] -> [200 x 1 x *1]
Validating --> AutoName15 = Times (WHC0, AutoName4) : [200 x 200], [200 x 1] -> [200 x 1]
Validating --> AutoName16 = Plus (AutoName15, bc0) : [200 x 1], [200 x 1] -> [200 x 1]
Validating --> AutoName18 = Plus (AutoName17, AutoName16) : [200 x *1], [200 x 1] -> [200 x 1 x *1]
Validating --> AutoName19 = Tanh (AutoName18) : [200 x 1 x *1] -> [200 x 1 x *1]
Validating --> AutoName20 = ElementTimes (AutoName14, AutoName19) : [200 x 1 x *1], [200 x 1 x *1] -> [200 x 1 x *1]
Validating --> AutoName29 = Plus (AutoName28, AutoName20) : [200 x 1 x *1], [200 x 1 x *1] -> [200 x 1 x *1]
Validating --> AutoName30 = DiagTimes (WCO0, AutoName29) : [200 x 1], [200 x 1 x *1] -> [200 x 1 x *1]
Validating --> AutoName35 = Plus (AutoName34, AutoName30) : [200 x 1 x *1], [200 x 1 x *1] -> [200 x 1 x *1]
Validating --> AutoName36 = Sigmoid (AutoName35) : [200 x 1 x *1] -> [200 x 1 x *1]
Validating --> AutoName37 = Tanh (AutoName29) : [200 x 1 x *1] -> [200 x 1 x *1]
Validating --> AutoName38 = ElementTimes (AutoName36, AutoName37) : [200 x 1 x *1], [200 x 1 x *1] -> [200 x 1 x *1]
Validating --> outputs = TransposeTimes (W2, AutoName38) : [200 x 10000], [200 x 1 x *1] -> [10000 x 1 x *1]
Validating --> PosteriorProb = Softmax (outputs) : [10000 x 1 x *1] -> [10000 x 1 x *1]
Validating --> labels = InputValue() :  -> [4 x *1]
Validating --> WeightForClassPostProb = LearnableParameter() :  -> [50 x 200]
Validating --> ClassPostProb = Times (WeightForClassPostProb, AutoName38) : [50 x 200], [200 x 1 x *1] -> [50 x 1 x *1]
Validating --> TrainNodeClassBasedCrossEntropy = ClassBasedCrossEntropyWithSoftmax (labels, AutoName38, W2, ClassPostProb) : [4 x *1], [200 x 1 x *1], [200 x 10000], [50 x 1 x *1] -> [1]

Validating network. 43 nodes to process in pass 2.

Validating --> AutoName3 = PastValue (AutoName38) : [200 x 1 x *1] -> [200 x 1 x *1]
Validating --> AutoName31 = Times (WHO0, AutoName3) : [200 x 200], [200 x 1 x *1] -> [200 x 1 x *1]
Validating --> AutoName2 = PastValue (AutoName38) : [200 x 1 x *1] -> [200 x 1 x *1]
Validating --> AutoName22 = Times (WHF0, AutoName2) : [200 x 200], [200 x 1 x *1] -> [200 x 1 x *1]
Validating --> AutoName6 = PastValue (AutoName29) : [200 x 1 x *1] -> [200 x 1 x *1]
Validating --> AutoName21 = DiagTimes (WCF0, AutoName6) : [200 x 1], [200 x 1 x *1] -> [200 x 1 x *1]
Validating --> AutoName7 = PastValue (AutoName29) : [200 x 1 x *1] -> [200 x 1 x *1]
Validating --> AutoName1 = PastValue (AutoName38) : [200 x 1 x *1] -> [200 x 1 x *1]
Validating --> AutoName9 = Times (WHI0, AutoName1) : [200 x 200], [200 x 1 x *1] -> [200 x 1 x *1]
Validating --> AutoName5 = PastValue (AutoName29) : [200 x 1 x *1] -> [200 x 1 x *1]
Validating --> AutoName8 = DiagTimes (WCI0, AutoName5) : [200 x 1], [200 x 1 x *1] -> [200 x 1 x *1]
Validating --> AutoName4 = PastValue (AutoName38) : [200 x 1 x *1] -> [200 x 1 x *1]
Validating --> AutoName15 = Times (WHC0, AutoName4) : [200 x 200], [200 x 1 x *1] -> [200 x 1 x *1]
Validating --> AutoName16 = Plus (AutoName15, bc0) : [200 x 1 x *1], [200 x 1] -> [200 x 1 x *1]

Validating network. 14 nodes to process in pass 3.


Validating network, final pass.




Post-processing network complete.

evalNodeNames are not specified, using all the default evalnodes and training criterion nodes.


Allocating matrices for forward and/or backward propagation.

Memory Sharing: Out of 63 matrices, 2 are shared as 1, and 61 are not shared.

	{ PosteriorProb : [10000 x 1 x *1]
	  outputs : [10000 x 1 x *1] }

LMSequenceReader: Reading epoch data... 3760 sequences read.
12/21/2016 00:53:29: Minibatch[1-1]: TrainNodeClassBasedCrossEntropy = 6.46412263 * 3456
12/21/2016 00:53:29: Final Results: Minibatch[1-1]: TrainNodeClassBasedCrossEntropy = 6.46412263 * 3456; perplexity = 641.70110769

12/21/2016 00:53:29: Action "eval" complete.

12/21/2016 00:53:29: __COMPLETED__
