Implement Krizhevsky-style relighting augmentation #1865
Conversation
The Travis CI build for "WITH_CUDA=false WITH_CMAKE=false" failed on BlobMathTest, which as far as I can see is unrelated to my patch. Running the same test on my machine (CPU only, without CMake), it passes. Looks like numerical error:
EDIT: re-running the Travis CI build passed with no changes; this test needs to be fixed.
Just discovered there is an overflow issue when running the compute_image_pca util on very large datasets (e.g. the ImageNet training set), since it sums over all pixels; I had only used it on smaller datasets so far. Will re-implement using a numerically stable algorithm ASAP.
Thanks for the augmentation!
Good choice. It's best to bundle these in the definition for clarity.
No argument. The data transformer was at least a step to reduce duplication across data layers, but the data pipeline deserves attention to make it simpler and more general.
Re: #1865 (comment) @jeffdonahue could you relax the
Wow, doing the covariance calculations on ImageNet is a numerical nightmare! I've implemented an online method for calculating the mean/covariance that works well, though. The side benefit is that it only needs one pass over the dataset now, and it seems to work nicely for ImageNet. I get the following eigenvectors/values, which are close to what I got when I ran the naive implementation on a 1% subset of ImageNet:
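The one-pass mean/covariance update described above can be sketched roughly as follows (a minimal Welford-style illustration in numpy, not the actual C++ in compute_image_pca.cpp; function and variable names are mine):

```python
import numpy as np

def online_mean_cov(samples):
    """Single-pass (Welford-style) running mean and covariance over an
    iterable of d-dimensional samples. Avoids the catastrophic
    cancellation of the naive sum / sum-of-squares approach on large
    datasets such as ImageNet."""
    n = 0
    mean = None
    comoment = None  # running sum of outer products of deviations
    for x in samples:
        x = np.asarray(x, dtype=np.float64)
        if mean is None:
            mean = np.zeros_like(x)
            comoment = np.zeros((x.size, x.size))
        n += 1
        delta = x - mean          # deviation from the *old* mean
        mean += delta / n
        # pairing the pre-update delta with the post-update deviation
        # accumulates sum_i (x_i - mean)(x_i - mean)^T stably
        comoment += np.outer(delta, x - mean)
    cov = comoment / (n - 1)      # unbiased sample covariance
    return mean, cov
```

The eigenvectors/values then come from an eigendecomposition of `cov` (e.g. `np.linalg.eigh(cov)` for the 3x3 RGB case).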
There don't seem to be any published numbers to compare against, unfortunately. Future work might be to vectorize this code by calculating the covariance for a block of samples at a time and combining it with the online covariance according to the rule at the end of the covariance section of http://en.wikipedia.org/wiki/Algorithms_for_calculating_variance#Covariance. I tried this already, but it's tricky to get numerically correct for an image-sized block, and it didn't speed things up as much as I'd hoped. If speed is critical, it might make more sense to just run on a smaller subset of the data.
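For reference, the combining rule mentioned above (merging the statistics of two disjoint blocks of samples) can be sketched like this; the function name and argument layout are illustrative:

```python
import numpy as np

def merge_cov(n_a, mean_a, comoment_a, n_b, mean_b, comoment_b):
    """Combine per-block statistics (count, mean, and co-moment matrix
    sum_i (x_i - mean)(x_i - mean)^T) of two disjoint sample sets, per
    the pairwise rule at the end of the Wikipedia covariance section."""
    n = n_a + n_b
    delta = mean_b - mean_a
    mean = mean_a + delta * (n_b / n)
    # cross term corrects for the two blocks having different means
    comoment = comoment_a + comoment_b + np.outer(delta, delta) * (n_a * n_b / n)
    return n, mean, comoment
```

Dividing the merged co-moment by `n - 1` gives the same unbiased covariance as a single pass over all the samples, which is what makes the block-at-a-time vectorization possible in principle.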
…incipal components specified in TransformParameter. Also adds a utility to calculate the pixelwise principal components (eigenvectors/values) for a given dataset. Uses an online algorithm for calculating the mean and covariance matrix to avoid numerical issues on large datasets.
Force-pushed from 59eed7c to 3b9f419
Do you have a demo of how to use compute_image_pca.cpp?
Simply run it like you would the compute_image_mean util, i.e. give it the lmdb/leveldb as a parameter; that's all. It prints usage when called with no arguments, although the usage message looks slightly wrong at the moment, in that no output file is required (or created): the mean and PCA are simply printed to the screen.
@yanii What type of accuracy improvements did you see after adding this augmentation step?
@vimalthilak sorry for the late response! Was working like crazy towards the ICCV deadline. I haven't run it with a standard network, I'm afraid; I will try to do so when I have a spare GPU and report back. Please feel free to try it out and let me know, though.
Closing since the dev branch is deprecated. Please send PRs to master. |
Hi all,
I'm trying to add the various augmentation tricks to the Caffe code that are sorely missed at the moment, and this is my first patch towards that. A later, much more involved patch for full-image cropping/scale augmentation is to come.
This attempts to implement the relighting augmentation as described in the Krizhevsky SuperVision paper. Towards this, there is a utility to calculate both the pixelwise mean (in floating-point precision) and the pixelwise principal components of a given dataset (e.g. an LMDB).
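For readers unfamiliar with the technique: the Krizhevsky-style relighting adds, to every pixel of an image, a single RGB offset built from the dataset's pixelwise principal components, with per-image random magnitudes. A minimal numpy sketch (function name, argument names, and the default sigma are illustrative, not the patch's actual interface):

```python
import numpy as np

def relight(image, eigvecs, eigvals, sigma=0.1, rng=None):
    """PCA colour-noise augmentation: add P @ (alpha * lam) to every
    pixel, where the columns of P are the 3 pixelwise RGB eigenvectors,
    lam the eigenvalues, and alpha ~ N(0, sigma^2) is drawn once per
    image (so the same offset is applied to all pixels)."""
    rng = np.random.default_rng() if rng is None else rng
    alpha = rng.normal(0.0, sigma, size=3)
    shift = eigvecs @ (alpha * eigvals)  # a single RGB offset, shape (3,)
    return image + shift                 # broadcast over H x W
```

Because alpha is sampled once per image rather than per pixel, the augmentation changes the overall illumination/colour cast while leaving spatial structure untouched.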
There are some design decisions to think about, most notably how to store/load the eigenvectors/eigenvalues. I took the approach of loading these directly from the prototxt parameters, just as the pixelwise mean is done, since they are not very large and this is a bit more transparent. I could also see the view that they should be stored as a separate proto file, however, as is done for the image-wise mean.
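To illustrate the prototxt-parameter approach, a sketch of what a layer definition might look like; the field names below are hypothetical placeholders, not the actual fields the patch adds to TransformParameter, and the values are dummies:

```
layer {
  name: "data"
  type: "Data"
  transform_param {
    # hypothetical field names and dummy values for illustration only;
    # the real names are whatever the patch defines in caffe.proto
    eigen_value: 1.0
    eigen_value: 0.5
    eigen_value: 0.1
    # eigenvectors flattened in some fixed order
    eigen_vector: 0.58
    # ...
  }
}
```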
I really don't like the code duplication that currently exists in data_transformer (i.e. for the mean/cropping, etc.). I followed the present format and repeated the relighting code for each of the various data formats, but I personally think some work should go into refactoring this in future. I recognize that, due to differences between the formats, this isn't that simple.
Yani