This document describes how to run the main experiments in the SOSP 2019 paper. The goal of this document is to satisfy the ACM "Artifact Functional" requirements.
Most of the setup required in this document is already done in the AWS AMI described below. In particular, the steps that are pre-installed on this AMI are marked as "(preinstalled)."
We have only tested the experiments on an m4.10xlarge Amazon EC2 instance. We have made a public AMI that can be used to run everything, with all necessary packages etc. pre-installed.
| Field | Value |
|---|---|
| Cloud Provider | AWS |
| Region | us-west-2 |
| AMI ID | ami-0312f629d1551e6b2 |
| AMI Name | split-annotations-public-sosp19 |
| Instance Type | m4.10xlarge |
See this link for how to find and launch a public AMI (this assumes you have a valid, billable AWS account set up).
For anyone running outside of this environment, the assumed system requirements are:
- At least 150GB of RAM
- At least 200GB of disk space
- At least 16 cores that can compile an optimized version of Intel MKL
- Running Ubuntu 16.04 with a recent Linux kernel (we've only tested it on 4.4.0)
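The requirements above can be checked before starting with a short script. This is a minimal sketch, not part of the artifact: the function names are hypothetical, it assumes a Linux host with GNU coreutils, and it interprets "disk space" as free space on the filesystem holding $HOME.

```shell
#!/usr/bin/env bash
# Sketch: compare this machine against the minimums listed above.
# Function names are hypothetical; thresholds come from the list.

# meets_minimum LABEL ACTUAL REQUIRED: print a status line, return 1 if short.
meets_minimum() {
    local label=$1 actual=$2 required=$3
    if [ "$actual" -ge "$required" ]; then
        echo "ok: $label ($actual >= $required)"
    else
        echo "too low: $label ($actual < $required)"
        return 1
    fi
}

check_host() {
    local ram_gb disk_gb cores status=0
    # MemTotal is reported in kB; convert to whole GB.
    ram_gb=$(awk '/^MemTotal:/ {print int($2 / 1024 / 1024)}' /proc/meminfo 2>/dev/null)
    # Free space in GB on the filesystem holding $HOME (an assumption).
    disk_gb=$(df -BG --output=avail "$HOME" 2>/dev/null | tail -n 1 | tr -dc '0-9')
    cores=$(nproc 2>/dev/null)
    meets_minimum "RAM (GB)" "${ram_gb:-0}" 150 || status=1
    meets_minimum "free disk (GB)" "${disk_gb:-0}" 200 || status=1
    meets_minimum "cores" "${cores:-0}" 16 || status=1
    return $status
}

check_host || echo "This machine is below the tested configuration."
```

The script only reports; it does not check the Ubuntu or kernel version.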
- (preinstalled) Follow the build instructions in the README in this directory. Add the following to your rc file:
export WELD_HOME=$HOME/weld/ # We will install Weld here for the Weld baselines
export SA_HOME=<path-to-this-repo> # this directory should live in $HOME.
export PATH=$SA_HOME/c/target/release:$PATH
export LD_LIBRARY_PATH=$SA_HOME/c/target/release:$LD_LIBRARY_PATH
export LD_LIBRARY_PATH=$SA_HOME/c/lib/composer_mkl:$LD_LIBRARY_PATH
export LD_LIBRARY_PATH=$SA_HOME/c/lib/ImageMagick:$LD_LIBRARY_PATH
And then:
# Or whatever your rc file is
source ~/.bashrc
- (preinstalled) Install Intel MKL and ImageMagick, as described below. If these are preinstalled, skip to step 3 below.
We tested our code with MKL 2018 (Update 2). To install, try the following:
wget http://registrationcenter-download.intel.com/akdlm/irc_nas/tec/12725/l_mkl_2018.2.199.tgz
tar zxvf l_mkl_2018.2.199.tgz
cd l_mkl_2018.2.199
./install.sh
and follow the on-screen instructions. If using EC2, we suggest using the second installation option, "Install using sudo privileges."
If the wget doesn't work, visit this link and follow the instructions below.
- Fill out the information requested in the form and click "Submit"
- In the dropdown menu stating "Please select a Product" choose "Intel Math Kernel Library for Linux"
- Under "Choose a Version" choose "Intel MKL 2018 (Update 2)"
- Right-click "Full Package" and copy the link.
- Run wget on the copied link as above, and then continue below.
Once MKL is set up, make sure that the $MKLROOT environment variable is set to the correct value. On our system, it is set to the following:
/opt/intel/compilers_and_libraries_2018.2.199/linux/mkl
We suggest adding it to your rc file:
export MKLROOT=/opt/intel/compilers_and_libraries_2018.2.199/linux/mkl
source /opt/intel/bin/compilervars.sh intel64
And then source ~/.bashrc (or whatever your rc file is for your shell).
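After sourcing the rc file, it can help to confirm that the exported variables are visible and point at real directories. The following is a sketch under the assumption that bash is the shell; `check_dir_var` is a hypothetical helper name. Note that WELD_HOME will be reported as missing until Weld has been cloned.

```shell
#!/usr/bin/env bash
# Sketch: verify that the variables from the rc file are set in this
# shell and name existing directories. check_dir_var is hypothetical.

# check_dir_var NAME: succeed if variable NAME is set and is a directory.
check_dir_var() {
    local name=$1
    local value=${!name}   # indirect expansion (bash-specific)
    if [ -z "$value" ]; then
        echo "unset: $name"
        return 1
    elif [ ! -d "$value" ]; then
        echo "not a directory: $name=$value"
        return 1
    fi
    echo "ok: $name=$value"
}

missing=0
for var in SA_HOME WELD_HOME MKLROOT; do
    check_dir_var "$var" || missing=1
done
[ "$missing" -eq 0 ] || echo "Fix the entries above, then source the rc file again."
```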
We use ImageMagick-7 in our benchmarks. To install:
- Make sure build tools are available and up to date, and install libtiff5:
sudo apt-get update
sudo apt-get install build-essential libtiff5-dev
sudo ldconfig
- Install ImageMagick from source:
cd $HOME
wget https://www.imagemagick.org/download/ImageMagick.tar.gz
tar xvzf ImageMagick.tar.gz
# Your minor version may be different, but the major version should be 7
cd ImageMagick-7.0.8-59
- Configure, build and install:
./configure --with-tiff=yes
# Set to number of cores on your machine
make -j 40
sudo make install
sudo ldconfig
- Make sure everything worked:
magick -version | head -1
You should see ImageMagick 7.xxxx.
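The version check can also be automated. This sketch assumes the first line of the output contains the text "ImageMagick " followed by a dotted version number; `imagemagick_major` is a hypothetical helper, not something the artifact provides.

```shell
#!/usr/bin/env bash
# Sketch: extract the ImageMagick major version and compare it to 7.
# Assumes the version line contains "ImageMagick <major>.<rest>".

# imagemagick_major LINE: print the major version found in LINE, if any.
imagemagick_major() {
    sed -n 's/.*ImageMagick \([0-9][0-9]*\)\..*/\1/p' <<< "$1"
}

if command -v magick >/dev/null 2>&1; then
    major=$(imagemagick_major "$(magick -version | head -1)")
    if [ "$major" = "7" ]; then
        echo "ImageMagick 7 found"
    else
        echo "unexpected ImageMagick major version: '$major'"
    fi
else
    echo "magick not found on PATH"
fi
```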
- Build the annotated libraries. Assuming $SA_HOME is the root directory:
cd $SA_HOME/c/lib/composer_mkl
make
cd $SA_HOME/c/lib/ImageMagick
make
- Run the benchmarks using the provided script. We suggest doing this in a tmux session, since it will take some time to complete. This will also download all the data needed to run the benchmarks. Make sure you change to the correct directory first, because some things use relative directories:
cd $SA_HOME/c/benchmarks/
./run-all.sh
The results will be in the $SA_HOME/c/benchmarks/results directory.
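One way to follow the tmux suggestion is to launch the script in a detached session, so the run survives a dropped SSH connection. This is only a sketch: the function name, the session name, and the run-all.log file are hypothetical choices, and it assumes a tmux version that supports the -c start-directory option.

```shell
#!/usr/bin/env bash
# Sketch: run the benchmark script inside a detached tmux session.
# start_benchmarks, the session name, and run-all.log are hypothetical.

# start_benchmarks DIR SESSION: launch DIR/run-all.sh in tmux session SESSION.
start_benchmarks() {
    local dir=$1 session=$2
    if [ ! -x "$dir/run-all.sh" ]; then
        echo "no executable run-all.sh in $dir"
        return 1
    fi
    if ! command -v tmux >/dev/null 2>&1; then
        echo "tmux is not installed"
        return 1
    fi
    # -d: start detached, -s: session name,
    # -c: start directory (the script relies on relative paths).
    tmux new-session -d -s "$session" -c "$dir" \
        './run-all.sh 2>&1 | tee run-all.log' || return 1
    echo "started; attach with: tmux attach -t $session"
}

start_benchmarks "${SA_HOME:-}/c/benchmarks" sa-c-benchmarks \
    || echo "benchmarks were not started"
```

Detach from an attached session with Ctrl-b d; the run continues in the background.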
- (preinstalled) Install the necessary packages:
sudo apt-get install python2.7-dev python3.5-dev unzip virtualenv
- To run the Python experiments, go to the benchmark directory and run the provided run-all.sh script. This will set up an environment, download the necessary data, and run everything:
cd $SA_HOME/python/benchmarks
./run-all.sh
The results will be in the $SA_HOME/python/benchmarks/results directory.
Since Weld has slightly different Python distribution requirements and other dependencies, we run the Weld experiments in a separate virtual environment. Make sure everything is run from the appropriate directory (e.g., $HOME if cd $HOME is specified):
- (preinstalled) Clone the Weld repo. Make sure you are on the v0.2.0 branch, which supports multi-threading.
cd $HOME
git clone -b v0.2.0 https://github.com/weld-project/weld.git
- (preinstalled) Make sure LLVM is installed and that everything is configured properly. In particular, you should be able to run llvm-config --version and see 6.x.x. If you don't have LLVM, run the following, which downloads all the Weld requirements:
wget -O - https://apt.llvm.org/llvm-snapshot.gpg.key | sudo apt-key add -
sudo apt-add-repository "deb http://apt.llvm.org/xenial/ llvm-toolchain-xenial-6.0 main"
sudo apt-get update
Then install:
sudo apt-get install llvm-6.0-dev clang-6.0 zlib1g-dev
And link:
sudo ln -s /usr/bin/llvm-config-6.0 /usr/local/bin/llvm-config
- (preinstalled) Build Weld. You should already have Rust installed for Mozart:
cd $HOME/weld
cargo build --release
- (preinstalled) Clone the Weld experiments.
cd $HOME
git clone https://github.com/sppalkia/weld-experiments-mozart.git
- Run the experiments. This will build the Weld versions of Pandas and NumPy, set up an environment, install the requirements, and run each experiment. NOTE: these should be run after running the experiments in the main repository, because the main repository's run-all.sh script generates data that this script accesses.
cd weld-experiments-mozart
# Run all the benchmarks
./run-all.sh
This should conclude the main results of the paper. Please email shoumik@cs.stanford.edu with any questions.