[opendc-sc18-software] A Reference Architecture for Datacenter Scheduling: Software Artifacts
This release contains the software artifacts of the paper A Reference Architecture for Datacenter Scheduling, presented at Supercomputing 2018.
For the paper, experiments have been run on the following traces:
- Askalon (W-Eng) -
- Chronos (W-Ind) -
Each of the directories for the traces has the following structure:
This text file describes the trace used for the experiment, in addition to the number of times the experiment was repeated and the number of warm-up runs.
This JSON file describes the topology of the datacenter used in the experiments. Each item lists the identifier of the resource (here, the CPU type) to use in the machine. The available CPU types are (1) Intel i7 (4 cores, 4100 MHz) and (2) Intel i5 (2 cores, 3500 MHz).
This directory contains the trace used in the simulation. The trace is stored in the Grid Workload Format. See the Grid Workload Archive for more information.
A CSV file containing information on all simulations that have been run on the OpenDC platform for this experiment.
A CSV file containing metrics (NSL, JMS, etc.) for each job that ran during the simulations.
A CSV file containing timing measurements for the scheduling stages that ran during the simulations.
A CSV file containing metrics for each task that ran during the simulations.
A CSV file containing information about the tasks (submit time, runtime, etc.) that ran during the simulations as extracted from the traces.
Additionally, we describe the format of each data file in the associated metadata file.
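The per-job metrics files can be inspected with standard command-line tools. The sketch below is illustrative only: the column names (`NSL`, `JMS`, as mentioned above) and layout are assumptions, so consult the accompanying metadata file for the real schema. It builds a small sample CSV and computes the mean NSL with awk:

```shell
# Illustrative sample; the real per-job metrics file has the schema
# described in the associated metadata file.
cat > sample_job_metrics.csv <<'EOF'
job_id,NSL,JMS
1,1.4,0.9
2,2.0,1.1
3,1.6,1.0
EOF

# Mean NSL over all jobs (NSL assumed to be column 2; skip the header).
awk -F, 'NR > 1 { s += $2; n++ } END { printf "mean NSL = %.2f\n", s / n }' sample_job_metrics.csv
```

The same pattern applies to the per-task and stage-timing files once the relevant column is known.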
The hardware used for running the experiments was a MacBook Pro with a 2.9 GHz Intel Core i7 processor and 16 GB of 2133 MHz LPDDR3 memory.
This section describes the instructions for reproducing the paper results using a provided Docker image. Please make sure you have Docker installed and running.
For reproduction, you will run the following experiments:
This is the larger experiment of the paper and will take approximately 4 hours to complete on similar hardware.
This is the smaller experiment of the paper and will take approximately 5 minutes to complete on similar hardware.
The Docker image atlargeresearch/sc18-experiment-runner can be used for running the experiments. A volume can be attached to the directory /home/gradle/simulator/data to capture the results of the experiments.
Make sure you have, in your current working directory, the following files:
This JSON file describes the topology of the datacenter and can be found in this archive at
This file contains the trace for the Askalon workload. This file can be found in the archive at
This file contains the trace for the Chronos workload. This file can be found in the archive at
Then, you can start the Askalon experiments as follows:
$ docker run -it --rm -v $(pwd):/home/gradle/simulator/data atlargeresearch/sc18-experiment-runner -r 32 -w 4 -s data/setup.json data/askalon_workload_ee.gwf
The experiment runner can be configured with the following options:
- -r, --repeat
The number of times to repeat the experiment for each scheduler.
- -w, --warm-up
The number of warm-up runs to perform for each scheduler.
- -p, --parallelism
The number of experiments to run in parallel.
The list of schedulers to test, separated by spaces. The following schedulers are available:
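The options above can be collected into a small wrapper script. This is a sketch, not part of the release: the flag values and trace path are examples, and the command is printed rather than executed so it can be reviewed first.

```shell
#!/bin/sh
# Sketch: parameterize the experiment-runner invocation.
# The values below are examples; adjust them to your experiment.
REPEATS=32
WARMUPS=4
PARALLEL=2
TRACE=data/askalon_workload_ee.gwf

CMD="docker run -it --rm -v $(pwd):/home/gradle/simulator/data \
  atlargeresearch/sc18-experiment-runner \
  -r $REPEATS -w $WARMUPS -p $PARALLEL -s data/setup.json $TRACE"

echo "$CMD"     # print the command for review
# eval "$CMD"   # uncomment to actually launch the experiment
```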
After the Askalon experiments have finished, you can start the Chronos experiments. Make sure to copy the result files elsewhere first, as they will be overwritten.
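One way to preserve the Askalon results is to archive them into a timestamped directory before launching the Chronos run. The `*.csv` glob below is an assumption; match it to the actual result file names in your working directory.

```shell
# Copy any result CSVs into a timestamped archive directory.
outdir="results-askalon-$(date +%Y%m%d-%H%M%S)"
mkdir -p "$outdir"
for f in *.csv; do
  # The guard keeps the loop safe when no CSV files are present.
  if [ -e "$f" ]; then
    cp "$f" "$outdir/"
  fi
done
```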
$ docker run -it --rm -v $(pwd):/home/gradle/simulator/data atlargeresearch/sc18-experiment-runner -r 32 -w 4 -s data/setup.json data/chronos_exp_noscaler_ca.gwf