
Conversation

borisgin

I implemented a convolutional layer with an FFT-based Forward(). There is no FFT support in Backward() yet.
The implementation is based on the FFTW3 library. It was tested both with native FFTW3 and with MKL.
In addition, it supports OpenMP to utilize all cores; this was tested with native gcc OpenMP and with MKL.
The current version is CPU-only. Is anybody interested in doing a CUDA version?
My impression, based on the current CPU implementation (FFT + OpenMP), is that an FFT-based convolutional layer makes sense only for large kernels (kernel_size / stride >= 7). More details on the benchmark are below:
I modified net_speed_benchmark to test Forward() only, then took the "examples/imagenet" topology and modified the first two convolutional layers:

  • batch = 128

  • stride = 1

  • kernel = {5, 7, 9, 11, 13, 15}

  • 10 forward iterations

For each kernel I slightly changed the crop size in the data layers to make the map size FFT-friendly (128, 256, ...). The results are below (time is in seconds for 10 forward iterations):

| Layer | Kernel | Input           | Output           | base, sec | FFT, sec |
|-------|--------|-----------------|------------------|-----------|----------|
| conv1 | 15     | 128x3x242x242   | 128x96x228x228   | 79        | 28       |
| conv2 | 15     | 128x96x114x114  | 128x256x104x104  | 549       | 168      |
| conv1 | 13     | 128x3x244x244   | 128x96x232x232   | 58        | 30       |
| conv2 | 13     | 128x96x116x116  | 128x256x108x108  | 431       | 170      |
| conv1 | 11     | 128x3x246x246   | 128x96x236x236   | 44        | 28       |
| conv2 | 11     | 128x96x118x118  | 128x256x112x112  | 314       | 168      |
| conv1 | 9      | 128x3x248x248   | 128x96x240x240   | 33        | 29       |
| conv2 | 9      | 128x96x120x120  | 128x256x116x116  | 230       | 170      |
| conv1 | 7      | 128x3x250x250   | 128x96x244x244   | 23        | 29       |
| conv2 | 7      | 128x96x122x122  | 128x256x122x122  | 152       | 170      |
| conv1 | 5      | 128x3x252x252   | 128x96x248x248   | 16        | 28       |
| conv2 | 5      | 128x96x124x124  | 128x256x120x120  | 83        | 167      |
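For reference, the Forward() path described above follows the standard FFT-convolution recipe: zero-pad the input map and the kernel to a common FFT-friendly size, take a real-to-complex FFT of each, multiply the spectra pointwise, and inverse-transform, keeping only the valid region. Below is a minimal single-channel sketch against the FFTW3 single-precision API; the function name, padding strategy, and stride-1 assumption are illustrative only, not the code in this PR (which handles multiple channels, batching, and OpenMP).

```cpp
// Minimal sketch only: single channel, stride 1, no OpenMP, FFTW3 float API.
// Names below are illustrative, not identifiers from this PR.
#include <fftw3.h>
#include <cstring>
#include <vector>

void conv2d_fft_sketch(const float* image, int h, int w,
                       const float* kernel, int kh, int kw,
                       float* output) {  // output is (h-kh+1) x (w-kw+1)
  const int fh = h, fw = w;    // assume h, w are already FFT-friendly
  const int fcw = fw / 2 + 1;  // width of the r2c (half) spectrum

  // Zero-pad image and kernel into fh x fw real buffers.
  std::vector<float> img(fh * fw, 0.f), ker(fh * fw, 0.f);
  for (int y = 0; y < h; ++y)  std::memcpy(&img[y * fw], &image[y * w], w * sizeof(float));
  for (int y = 0; y < kh; ++y) std::memcpy(&ker[y * fw], &kernel[y * kw], kw * sizeof(float));

  fftwf_complex* img_f = (fftwf_complex*)fftwf_malloc(sizeof(fftwf_complex) * fh * fcw);
  fftwf_complex* ker_f = (fftwf_complex*)fftwf_malloc(sizeof(fftwf_complex) * fh * fcw);
  fftwf_plan p1 = fftwf_plan_dft_r2c_2d(fh, fw, img.data(), img_f, FFTW_ESTIMATE);
  fftwf_plan p2 = fftwf_plan_dft_r2c_2d(fh, fw, ker.data(), ker_f, FFTW_ESTIMATE);
  fftwf_execute(p1);
  fftwf_execute(p2);

  // Pointwise product with the conjugated kernel spectrum: this yields
  // cross-correlation, which is what Caffe calls "convolution".
  for (int i = 0; i < fh * fcw; ++i) {
    const float ar = img_f[i][0], ai = img_f[i][1];
    const float br = ker_f[i][0], bi = -ker_f[i][1];  // conjugate
    img_f[i][0] = ar * br - ai * bi;
    img_f[i][1] = ar * bi + ai * br;
  }

  // Inverse transform and copy out the valid region; FFTW transforms are
  // unnormalized, so divide by fh * fw.
  std::vector<float> full(fh * fw);
  fftwf_plan p3 = fftwf_plan_dft_c2r_2d(fh, fw, img_f, full.data(), FFTW_ESTIMATE);
  fftwf_execute(p3);
  const float scale = 1.f / (fh * fw);
  const int oh = h - kh + 1, ow = w - kw + 1;
  for (int y = 0; y < oh; ++y)
    for (int x = 0; x < ow; ++x)
      output[y * ow + x] = full[y * fw + x] * scale;

  fftwf_destroy_plan(p1); fftwf_destroy_plan(p2); fftwf_destroy_plan(p3);
  fftwf_free(img_f); fftwf_free(ker_f);
}
```

The crossover seen in the benchmark (FFT wins only for roughly kernel_size / stride >= 7) is consistent with the usual cost argument: direct convolution does O(K²) work per output pixel, while the FFT path pays a roughly kernel-size-independent transform cost plus a pointwise product, so larger kernels amortize the transforms better. The obvious optimizations (reusing each input map's spectrum across output channels, parallelizing over the batch with OpenMP) are natural here, though the exact scheme in this PR may differ.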

jeffdonahue and others added 30 commits May 9, 2014 19:51
Specify net params in solver; log {Net,Solver} parameters; multiple test nets
Conflicts:

	src/caffe/proto/caffe.proto
Conflicts:

	src/caffe/proto/caffe.proto
Conflicts:

	include/caffe/vision_layers.hpp
	src/caffe/proto/caffe.proto
1701 is the canonical random seed, and as this test makes only one call
for seeding there's no need for a member var.
Reproduce elementwise product layer in more generality.
Add elementwise operation parameter.
Prepare for elementwise sum operation choice.
mavenlin and others added 25 commits June 12, 2014 22:23
cpu/gpu and leveldb/lmdb; now just one copy of each test body
Minor Net::Init refactoring: name loop indices, add helpers
Add support for LMDB (LevelDB alternative)
Otherwise initialization will be performed on whichever device is
default.
Fix Caffe::SetDevice to avoid initializing on default device
flatten Forward_fft_task
added condition when fft is used: kernel_size / stride > 3.
added openmp support
put all fftw wrappers into caffe/util/fft.hpp and caffe.cpp files
cleaned makefile and makefile.config
replaced built-in std::complex multiplication by c-style implementation
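
On the last commit: a plain "c-style" complex multiply over interleaved (re, im) data often vectorizes better than operator* on std::complex, which, depending on the compiler and flags, may compile to a library call that handles Inf/NaN corner cases. The snippet below only illustrates that kind of rewrite; the names and signatures are assumptions, not code from this PR.

```cpp
// Pointwise complex multiply-accumulate over a spectrum, written two ways.
#include <complex>
#include <cstddef>

// Using the built-in std::complex operator*.
void mul_acc_std(const std::complex<float>* a, const std::complex<float>* b,
                 std::complex<float>* out, std::size_t n) {
  for (std::size_t i = 0; i < n; ++i) out[i] += a[i] * b[i];
}

// Equivalent c-style version on interleaved (re, im) pairs:
// (ar + i*ai) * (br + i*bi) = (ar*br - ai*bi) + i*(ar*bi + ai*br)
void mul_acc_c(const float* a, const float* b, float* out, std::size_t n) {
  for (std::size_t i = 0; i < n; ++i) {
    const float ar = a[2 * i], ai = a[2 * i + 1];
    const float br = b[2 * i], bi = b[2 * i + 1];
    out[2 * i]     += ar * br - ai * bi;
    out[2 * i + 1] += ar * bi + ai * br;
  }
}
```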
@kloudkl
Contributor

kloudkl commented Jun 26, 2014

This is great! But you need to set the target of the PR to BVLC:dev. GitHub does not allow changing it, so you will have to replace this with a new PR.

@borisgin closed this Jun 26, 2014
@borisgin
Author

Done

@borisgin
Author

I used borisgin/dev to rebase. Should I rebase with respect to BVLC/dev?

@sguada
Contributor

sguada commented Jun 26, 2014

Yeah, you should rebase against BVLC/dev.

