Skip to content

songzLi/Coded-TeraSort

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

24 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Coded-TeraSort

Utilizing in-network coding to trade extra computations for more communication bandwidth in a distributed sorting algorithm (TeraSort). The paper for this project is available at https://arxiv.org/abs/1702.04850.

Requirements

  • C/C++ compiler (g++)
  • OpenMPI library

Usage

Input File

A file containing data to be sorted must be placed in the input directory. Note that the format of the data points follows standard TeraSort input data. Each record contains a 10-byte key and a 90-byte value. An input file can be generated by TeraSort Example.

TeraSort Execution

Specify in Configuration.h:

  • numReducer: number of distributed computing nodes
  • inputPath: a path to the input file

Run make to compile TeraSort.

Run ./Splitter to split the input data points.

Run mpirun -np 4 ./TeraSort.

The above execution creates 3 computing processes and 1 master process to sorts data according to TeraSort algorithm. All processes are local.

Coded-TeraSort Execution

Specify in CodedConfiguration.h:

  • numReducer: number of distributed computing nodes
  • load: number of nodes on which each data point is processed (computation load)
  • numInput is set to be equal to (numReducer choose load)
  • inputPath: a path to the input file

Run make to compile CodedTeraSort.

Run ./Splitter code to split the input data points.

Run mpirun -np 4 ./CodedTeraSort.

The above execution creates 3 computing processes and 1 master process to sorts data according to CodedTeraSort algorithm. All processes are local.

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Contributors 2

  •  
  •