In #508, I thought that the DataLayer and the WindowDataLayer were free of this issue. I was wrong.
In the DataLayer, tools/test_net.bin core dumped again.
(mirror && layer->PrefetchRand() % 2) {
As before, the trouble was caused by the complex condition that decides whether to initialize prefetch_rng_. Maybe such a condition is not necessary at all.
const bool prefetch_needs_rand = (phase_ == Caffe::TRAIN) &&
(this->layer_param_.data_param().mirror() ||
this->layer_param_.data_param().crop_size());
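To illustrate how such a condition can go wrong, here is a minimal, self-contained sketch. It is not the Caffe code: `SketchDataLayer`, `DataParam`, `Phase`, `always_init`, `TryPrefetchRand` and `UseSiteWantsRand` are hypothetical names, and the assumption that the crash comes from a use site that checks only `mirror` while the initialization also requires the TRAIN phase is one plausible reading of the two fragments above, not something confirmed here.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <random>

// Hypothetical stand-ins for the Caffe phase and the data_param fields.
enum class Phase { TRAIN, TEST };

struct DataParam {
  bool mirror = false;
  int crop_size = 0;
};

// Hypothetical, simplified layer: only the RNG lifecycle is modeled.
class SketchDataLayer {
 public:
  SketchDataLayer(Phase phase, DataParam param, bool always_init)
      : phase_(phase), param_(param) {
    // Same shape as the condition quoted in the issue.
    const bool prefetch_needs_rand =
        (phase_ == Phase::TRAIN) && (param_.mirror || param_.crop_size);
    // With always_init == true the condition is bypassed, which is the
    // simplification the issue suggests ("maybe not necessary at all").
    if (always_init || prefetch_needs_rand) {
      prefetch_rng_.reset(new std::mt19937(1701));  // arbitrary fixed seed
    }
  }

  bool HasRng() const { return prefetch_rng_ != nullptr; }

  // Checked accessor: reports failure instead of dereferencing a null RNG.
  bool TryPrefetchRand(std::uint32_t* out) {
    if (!prefetch_rng_) return false;
    *out = static_cast<std::uint32_t>((*prefetch_rng_)());
    return true;
  }

  // Assumed use-site condition: looks at `mirror` only, not at the phase.
  bool UseSiteWantsRand() const { return param_.mirror; }

 private:
  Phase phase_;
  DataParam param_;
  std::unique_ptr<std::mt19937> prefetch_rng_;
};
```

In the TEST phase with `mirror` enabled, the use site in this sketch asks for a random number although the RNG was never created. An unchecked dereference at that point would be undefined behavior, which is consistent with a core dump. Initializing the RNG unconditionally removes the mismatch at the cost of one unused generator.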
I am not opening a new PR to fix it directly because these bugs reflect more fundamental design defects in the data layers. The logic in the data layers has become too intricate to reason about, develop and debug intuitively. When the ImageDataLayer and the WindowDataLayer were written, many lines were in fact copied from the DataLayer. Therefore, it is not surprising that they suffer from the same segmentation fault bug. The duplicated code may also contain other complicated issues that have not been exposed yet.
In the past, several PRs including #407 have tried to separate the data IO and processing modules. Unfortunately, none of them completed the mission. It is very important to refactor the data layers as soon as possible so that more data sources and formats can be added with little to no overhead.
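As a rough sketch of what separating data IO from processing could look like, the following splits reading and transforming into two small interfaces. All names here (`DataSource`, `Transform`, `InMemorySource`, `MirrorTransform`, `LoadAll`) are hypothetical and are not taken from #407 or from any existing Caffe class; a real design would also have to cover prefetching, batching and random number state.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical record type: a flat vector of values.
using Datum = std::vector<float>;

// IO side: knows where data comes from, nothing about processing.
class DataSource {
 public:
  virtual ~DataSource() {}
  // Writes the next record to *out; returns false when exhausted.
  virtual bool Next(Datum* out) = 0;
};

// Processing side: knows how to modify a record, nothing about IO.
class Transform {
 public:
  virtual ~Transform() {}
  virtual void Apply(Datum* datum) const = 0;
};

// Example source backed by a vector held in memory.
class InMemorySource : public DataSource {
 public:
  explicit InMemorySource(std::vector<Datum> items)
      : items_(std::move(items)), pos_(0) {}
  bool Next(Datum* out) override {
    if (pos_ >= items_.size()) return false;
    *out = items_[pos_++];
    return true;
  }

 private:
  std::vector<Datum> items_;
  std::size_t pos_;
};

// Example transform: reverses the record, standing in for mirroring.
class MirrorTransform : public Transform {
 public:
  void Apply(Datum* datum) const override {
    std::reverse(datum->begin(), datum->end());
  }
};

// Shared loop: any source can be combined with any transform.
std::vector<Datum> LoadAll(DataSource* source, const Transform& transform) {
  std::vector<Datum> batch;
  Datum datum;
  while (source->Next(&datum)) {
    transform.Apply(&datum);
    batch.push_back(datum);
  }
  return batch;
}
```

With a split like this, a new data format would only need a new `DataSource`, and logic such as mirroring or cropping would live in one place instead of being copied into each layer.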