RoBin is a Robustness Benchmark for range indexes (especially for updatable learned indexes).
A robin is also an insect-eating bird that offers great benefits to agriculture.
- The fb dataset contains std::numeric_limits&lt;uint64_t&gt;::max(); some indexes may use this value as a sentinel to simplify their implementation. Therefore, we shift all fb keys down by one to create the fb-1 dataset.
- We modify LIPP's and SALI's hyperparameter MAX_DEPTH so that they can successfully run all the test cases (otherwise they crash on a runtime assertion).
- We modify the bulkload process of the STX B+tree so that its nodes are half filled (load factor = 0.5) after bulkloading, which aligns its subsequent insertions and splits to show its performance robustness.
- All other parameters of the indexes are the same as in their original implementations.
- All of our tested index implementations can be found in this repo; each branch corresponds to one index.
- We add profiling stats for art, btree, alex, and lipp (e.g., the distribution of node depth, the comparison count during leaf-node search, and the model of the root node) with only minor code changes.
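The fb-1 key shift described above can be sketched as follows. This is a minimal illustration (`shift_keys` is a hypothetical helper; the real transformation lives in the dataset scripts):

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <vector>

// Shift every key down by one so that numeric_limits<uint64_t>::max() can
// never appear in the data; some indexes reserve that value as a sentinel.
std::vector<uint64_t> shift_keys(const std::vector<uint64_t>& keys) {
    std::vector<uint64_t> out;
    out.reserve(keys.size());
    for (uint64_t k : keys) {
        assert(k > 0 && "key 0 would underflow when shifted");
        out.push_back(k - 1);
    }
    return out;
}
```

After the shift, the sentinel value is free for internal use by the index, while the relative order of all keys is preserved.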
For a quick start, you can run the following script to install the dependencies, download the datasets, and build the project:
bash prepare.sh
RoBin depends on the TBB, jemalloc, and Boost libraries. You can install them with the following commands:
sudo apt update
sudo apt install -y libtbb-dev libjemalloc-dev libboost-dev
If the repository was not cloned with the --recursive option, you can run the following command to initialize the submodules:
git submodule update --init --recursive
Download the datasets from the remote server and construct the linear and fb-1 datasets:
cd datasets
bash download.sh
python3 gen_linear_fb-1.py
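For reference, a linear dataset is simply a sorted array of evenly spaced keys. The sketch below writes one in a SOSD-style binary layout (an 8-byte key count followed by little-endian uint64 keys); this layout is an assumption here, so consult the dataset scripts for the exact format:

```cpp
#include <cstdint>
#include <fstream>
#include <string>
#include <vector>

// Write keys in a SOSD-style layout (assumed): u64 count, then u64 keys.
void write_dataset(const std::string& path, const std::vector<uint64_t>& keys) {
    std::ofstream out(path, std::ios::binary);
    uint64_t n = keys.size();
    out.write(reinterpret_cast<const char*>(&n), sizeof(n));
    out.write(reinterpret_cast<const char*>(keys.data()),
              static_cast<std::streamsize>(n * sizeof(uint64_t)));
}

// A perfectly linear key set: key_i = i * step.
std::vector<uint64_t> make_linear(uint64_t n, uint64_t step) {
    std::vector<uint64_t> keys(n);
    for (uint64_t i = 0; i < n; ++i) keys[i] = i * step;
    return keys;
}
```

A learned index can fit such a key set with a single linear model, which is why the linear dataset serves as an easy baseline for robustness comparisons.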
rm -rf build
mkdir -p build
cd build
cmake .. -DCMAKE_BUILD_TYPE=Release -DCMAKE_EXPORT_COMPILE_COMMANDS=ON
make -j
or just run the following script:
bash build.sh
Benchmark all the competitors via RoBin with the following command (it may take a while to finish):
bash reproduce.sh
The results will be stored in the results directory.
Use the Jupyter notebooks to plot the results:
cd results
# open and run the Jupyter notebooks, such as single_thread.ipynb, to reproduce the figures in our paper
Build with the PROFILING flag:
rm -rf build
mkdir -p build
cd build
cmake .. -DCMAKE_BUILD_TYPE=Release -DCMAKE_EXPORT_COMPILE_COMMANDS=ON -DPROFILING=ON
make -j
or just run the following script:
bash build.sh profiling
Note that our profiling-related code changes have no impact on index performance when the project is built without this flag for benchmarking.
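Conceptually, the profiling hooks are compiled in only when the flag is set, along these lines. This is a sketch under the assumption that -DPROFILING=ON defines a PROFILING macro; the actual counters in the index branches differ:

```cpp
#include <cstdint>

// Compile profiling counters in only when PROFILING is defined, so a
// regular Release build pays zero cost for the instrumentation.
#ifdef PROFILING
#define PROFILE_INC(counter) ((counter)++)
#else
#define PROFILE_INC(counter) ((void)0)
#endif

uint64_t g_leaf_compare_count = 0;  // e.g., comparisons during leaf search

// Hypothetical leaf-node binary search with an instrumented comparison.
int lower_bound_in_leaf(const uint64_t* keys, int n, uint64_t target) {
    int lo = 0, hi = n;
    while (lo < hi) {
        PROFILE_INC(g_leaf_compare_count);  // counted only under PROFILING
        int mid = lo + (hi - lo) / 2;
        if (keys[mid] < target) lo = mid + 1;
        else hi = mid;
    }
    return lo;
}
```

Because the macro expands to a no-op without the flag, the benchmark binaries contain no trace of the instrumentation.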
Run profiling script:
bash run_case_profiling.sh # (recommended) minimal case study to reproduce the figures in our paper [2~3 hours]
# bash run_all_profiling.sh # full profiling of all cases, which may take a large amount of running time and disk space
Use the Jupyter notebooks to plot the profiling results:
cd profiling_result
mkdir -p fig
## open and run the following jupyter notebooks to reproduce the figures in our paper
## analysis_depth.ipynb
## analysis_memory.ipynb
## analysis_overfit.ipynb
## analysis_smo.ipynb
We also provide a script to run RoBin with custom parameters. You can run the following command to see the help message:
python3 run.py --help
- We build this benchmark on top of GRE, a well-designed benchmark. The related paper is:
Wongkham, Chaichon, et al. "Are updatable learned indexes ready?." Proceedings of the VLDB Endowment 15.11 (2022): 3004-3017.