RackSched: A Microsecond-Scale Scheduler for Rack-Scale Computers

0. Introduction

This repository contains the source code for our OSDI'20 paper "RackSched: A Microsecond-Scale Scheduler for Rack-Scale Computers".

1. Content

  • client_code/
    • client/: dpdk code for default client.
    • cs-client/: Client(100) in Figure 14, where the clients track the queue lengths.
    • failure-client/: used in Figure 17(a).
    • r2p2-client/: used for R2P2 in Figure 14.
    • reconfig-client/: used in Figure 17(b).
  • server_code/
    • r2p2/: used for R2P2 in Figure 14.
    • shinjuku/: default Shinjuku server.
    • shinjuku-rocskdb/: used in Figure 13.
  • switch_code/
    • basic_switch/: performs IPv4 routing only.
    • ht/: multi-stage register arrays for ReqTable.
    • includes/: packet header, parser and routing table.
    • qlen/: register arrays for LoadTable.
    • int2/: used in Figure 16, which only tracks the minimum number of outstanding requests.
    • p4_sq/: used for Sampling-4 in Figure 15.
    • proactive/: used for Proactive in Figure 16.
    • r2p2/: used for R2P2 in Figure 14.
    • random_schedule/: used for Shinjuku by default.
    • random_schedule_2server/: used for Shinjuku(2) in Figure 12.
    • random_schedule_4server/: used for Shinjuku(4) in Figure 12.
    • rr_schedule/: used for RR in Figure 15.
    • rscs/: used for RSCS by default.
    • rscs_2server/: used for RSCS(2) in Figure 12.
    • rscs_4server/: used for RSCS(4) in Figure 12.
    • rscs_multi/: used in Figure 17(a), which has the ReqTable to store the connection states.
    • server_reconfig/: used in Figure 17(b).
    • shortest/: used for Shortest in Figure 15.
    • po2.p4: used for power-of-2 choices
  • console.py: A script to help run evaluations.
  • config.py: Some parameters to configure.
  • README.md: This file.

2. Environment requirement

  • Hardware
    • A Barefoot Tofino switch.
    • Servers with a DPDK-compatible NIC (we used an Intel XL710 for 40GbE QSFP+) and multi-core CPU.
  • Software
    The current version of RSCS is tested on:
    • Barefoot P4 Studio (version 8.9.1 or later).
    • DPDK (16.11.1) for the clients.
    • Linux kernel 4.4 and gcc version 5.5 for Shinjuku servers.
      We provide easy-to-use scripts to run the experiments and to analyze the results. To use the scripts, you need:
    • Python 3.6+ and paramiko on your end host.
      pip3 install paramiko
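
The end-host requirements above can be checked before running the scripts. The following is a minimal sketch and not part of the repository; the function name `check_endhost` and its parameters are hypothetical.

```python
import importlib.util
import sys


def check_endhost(min_version=(3, 6), required_modules=("paramiko",)):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    if sys.version_info < min_version:
        problems.append(
            "Python %d.%d+ required, found %d.%d"
            % (tuple(min_version) + tuple(sys.version_info[:2]))
        )
    for name in required_modules:
        # find_spec returns None when a top-level module is not installed.
        if importlib.util.find_spec(name) is None:
            problems.append(
                "missing module: %s (try: pip3 install %s)" % (name, name)
            )
    return problems


if __name__ == "__main__":
    for problem in check_endhost():
        print(problem)
```

An empty result means the interpreter version and the paramiko import both look fine; it does not check the switch or DPDK setup.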

3. How to run

  • Configure the parameters in the files based on your environment
    • config.py: provide the information of your servers (username, passwd, hostname, dir).
  • Environment setup
    • Set up the switch
      • Set the necessary environment variables to point to the appropriate locations.
      • Copy the files to the switch.
      • python3 console.py sync_switch
    • Compile p4 programs.
      • python3 console.py compile_switch <prog_name>
        This takes a couple of minutes. You can check switch_code/logs/p4_compile.log on the switch to see whether it has finished. Example: python3 console.py compile_switch rscs
  • Set up the servers
    • Set up the DPDK environment (install DPDK and set the correct environment variables).
    • Copy the files to the servers.
      • python3 console.py sync_server
    • For Shinjuku-based basic and RocksDB servers: install the necessary libraries and pull the dependencies
      • python3 console.py setup_basic_server or python3 console.py setup_rocksdb_server
      • Note that the time the above command takes depends on your machine configuration and network conditions. We have reserved 60s for the basic server and 90s for the RocksDB server.
      • Also note that the dependency source sometimes resets the connection when too many servers fetch dependencies at the same time, so some servers fetch the dependencies successfully while others fail. In this case, run the command again or fetch the dependencies manually by running ./deps/fetch-deps.sh in the corresponding server directory.
    • For the R2P2 server: build the customized DPDK and set up the DPDK-related environment variables
      • python3 console.py setup_r2p2_server
      • Building DPDK takes time; we have reserved 180s for it to complete.
    • Building the server is incorporated in running the server.
  • Build the clients
    • For all kinds of clients, install DPDK and set the correct environment variables.
      • python3 console.py install_dpdk
    • make/build the client.
  • Run the programs
    • Run p4 program on the switch
      • python3 console.py run_switch rscs
        This brings up both the data-plane module and the control-plane module. It may take up to 150 seconds (the time varies between devices). You can check switch_code/logs/run_ptf_test.log on the switch to see whether it has finished (it outputs the real-time queue length list).
    • Run Shinjuku servers and clients to reproduce results in the paper
      • python3 console.py fig_*
  • Kill the processes
    • Kill the switch process
      • python3 console.py kill_switch
    • Kill the Shinjuku or R2P2 processes
      • python3 console.py kill_server
    • Kill the client processes
      • python3 console.py kill_client
    • Kill all the processes (switch, servers, clients)
      • python3 console.py kill_all
  • Other commands
    There are also some other commands you can use:
    • python3 console.py sync_switch
      copy the local "switch_code" to the switch
    • python3 console.py sync_server
      copy the local "server_code" to the servers
    • python3 console.py sync_client
      copy the local "client_code" to the clients
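
The setup and run steps above can be collected into one ordered command list. The following is a minimal sketch, not part of the repository: the helper names `setup_commands` and `run_all` are hypothetical, and only console.py subcommands named in this section are used. It prints the commands by default rather than executing them, because this README does not say whether each console.py command returns only after its work has finished (for example, compilation may still be running on the switch), so check the logs between steps before running them back to back.

```python
import shlex
import subprocess


def setup_commands(prog_name="rscs", server_setup="setup_basic_server"):
    """Build the console.py invocations from this section, in order."""
    steps = [
        ["sync_switch"],                 # copy switch_code to the switch
        ["compile_switch", prog_name],   # compile the P4 program
        ["sync_server"],                 # copy server_code to the servers
        [server_setup],                  # e.g. setup_rocksdb_server
        ["install_dpdk"],                # DPDK for the clients
        ["run_switch", prog_name],       # start data and control plane
    ]
    return [["python3", "console.py"] + step for step in steps]


def run_all(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for argv in commands:
        print(" ".join(shlex.quote(arg) for arg in argv))
        if not dry_run:
            # check=True stops the sequence at the first failing command.
            subprocess.run(argv, check=True)


if __name__ == "__main__":
    run_all(setup_commands("rscs"))
```

Passing `dry_run=False` executes the commands from the directory that contains console.py.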

4. How to reproduce the results

NOTE: We recommend running RocksDB (python3 console.py fig_13) and R2P2 (python3 console.py fig_14_r2p2) last, as these experiments may require a server reboot.

  • Configure the parameters in the files based on your environment
    • config.py: provide the information of your servers (username, passwd, hostname, dir).
  • Set up the switch
    • Set the necessary environment variables to point to the appropriate locations.
    • Copy the files to the switch: python3 console.py sync_switch
    • Compile the rscs program: python3 console.py compile_switch
      Again, this takes a couple of minutes. You can check switch_code/logs/p4_compile.log on the switch to see whether it has finished.
  • Set up the servers
    • Copy the necessary files to server: python3 console.py sync_server
    • Set up the basic server, RocksDB server (for Figure 13), and R2P2 server (for Figure 14) before reproducing the figures.
      • Basic Shinjuku server: python3 console.py setup_basic_server
      • RocksDB Shinjuku server: python3 console.py setup_rocksdb_server
      • R2P2 server: python3 console.py setup_r2p2_server
  • Build the clients
    • Basic client: python3 console.py build_basic_client
    • CS client: python3 console.py build_cs_client
    • Failure client: python3 console.py build_failure_client
    • R2P2 client: python3 console.py build_r2p2_client
    • Reconfig client: python3 console.py build_reconfig_client
  • After both the switch and the servers are correctly configured, you can reproduce the results by running console.py. Each of the following commands executes the switch program, Shinjuku server programs, and client programs automatically and outputs the results to the terminal.
    • Figure 10: python3 console.py fig_10
    • Figure 11: python3 console.py fig_11
    • Figure 12: python3 console.py fig_12
    • Figure 13: python3 console.py fig_13
    • R2P2 in Figure 14: python3 console.py fig_14_r2p2
    • Client(100) in Figure 14: python3 console.py fig_14_client
    • Figure 15: python3 console.py fig_15
    • Figure 16: python3 console.py fig_16
    • Figure 17(a): python3 console.py fig_17_failure
    • Figure 17(b): python3 console.py fig_17_reconfig
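
The recommendation to run the RocksDB and R2P2 experiments last can be encoded in a small ordering helper. The following is a minimal sketch and not part of the repository; `ordered_figures` and the two constants are hypothetical names, while the subcommand strings are the ones listed above.

```python
# console.py subcommands for each figure, as listed in this section.
FIGURE_COMMANDS = [
    "fig_10", "fig_11", "fig_12", "fig_13", "fig_14_r2p2",
    "fig_14_client", "fig_15", "fig_16", "fig_17_failure", "fig_17_reconfig",
]

# Experiments that may require a server reboot, so they go last.
RUN_LAST = ("fig_13", "fig_14_r2p2")


def ordered_figures(figures=FIGURE_COMMANDS, run_last=RUN_LAST):
    """Keep the listed order, but move the reboot-prone runs to the end."""
    first = [name for name in figures if name not in run_last]
    last = [name for name in figures if name in run_last]
    return first + last


if __name__ == "__main__":
    for name in ordered_figures():
        print("python3 console.py " + name)
```

The script only prints the commands in the suggested order; running them is left to the operator, since a reboot may be needed between the last two.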

5. Contact

For any questions, please contact hzhu at jhu dot edu.
