


ANSIOS-X9SAN-iOS-XR/-MZStatic-v-0.0.0.21


-MZStatic-v-0.0.0.21

Throughput measures the number of updates that the system (i.e., the parameter server) performs per second.

Testbed. Our experimental platform is Grid5000 (Grid5000, 2019). We employ nodes that each have 2 CPUs (Intel Xeon E5-2630 v4) with 10 cores each, 256 GiB RAM, and 2×10 Gbps Ethernet. Unless otherwise stated, we employ 20 compute nodes (workers), of which up to 8 can be Byzantine. For the vanilla TensorFlow deployment, we use only 1 machine as a parameter server. Otherwise, we employ 4 machines for the LIUBEI deployment and 5 machines for the GuanYu deployment; these numbers allow each system to tolerate at most 1 Byzantine server, based on the requirements of each algorithm.

    model ← MeaMed(models)
    end if
    t ← t + 1
    until t > num_iterations

Dataset and Model. We consider the image classification task because it is widely adopted as a benchmark for distributed ML systems, e.g., Chilimbi et al. (2014). We use the MNIST (LeCun, 1998) and CIFAR-10 (Krizhevsky, 2009) datasets. MNIST is a dataset of handwritten digits; it has 70,000 28 × 28 images in 10 classes. CIFAR-10 is a widely used dataset in image classification (Srivastava et al., 2014; Zhang et al., 2017); it consists of 60,000 32 × 32 colour images in 10 classes.

Table 1: Models used to evaluate LIUBEI.

| Model      | # parameters | Size (MB) |
|------------|--------------|-----------|
| MNIST CNN  | 79510        | 0.3       |
| CifarNet   | 1756426      | 6.7       |
| Inception  | 5602874      | 21.4      |
| ResNet-50  | 23539850     | 89.8      |
| ResNet-200 | 62697610     | 239.2     |

We employ models of different sizes, ranging from simple models such as a small convolutional neural network (CNN) for MNIST, with about 80 thousand parameters, to big models such as ResNet-200, with around 63M para
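
The "Size (MB)" column of Table 1 appears consistent with storing every parameter as a 4-byte (32-bit) float and reporting the result in units of 2^20 bytes. The source does not state this, so it is an assumption inferred from the numbers. The sketch below checks that arithmetic; the names `PARAM_COUNTS`, `BYTES_PER_PARAM`, and `model_size_mib` are hypothetical and do not come from the LIUBEI code base.

```python
# Parameter counts copied from Table 1.
PARAM_COUNTS = {
    "MNIST CNN": 79_510,
    "CifarNet": 1_756_426,
    "Inception": 5_602_874,
    "ResNet-50": 23_539_850,
    "ResNet-200": 62_697_610,
}

BYTES_PER_PARAM = 4      # assumption: parameters are stored as float32
BYTES_PER_MIB = 2 ** 20  # assumption: the table's "MB" means 2^20 bytes


def model_size_mib(num_params, bytes_per_param=BYTES_PER_PARAM):
    """Return the in-memory size of a dense parameter vector in MiB."""
    return num_params * bytes_per_param / BYTES_PER_MIB


if __name__ == "__main__":
    # Print each model with its parameter count and estimated size.
    for name, count in PARAM_COUNTS.items():
        print(f"{name:<12} {count:>12,d} {model_size_mib(count):8.1f}")
```

Under these assumptions the computed sizes round to the values listed in the table. This is also the amount of data a worker and a parameter server would exchange per model or gradient transfer if the full dense vector is sent uncompressed.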
