Generate configuration from template

Index

Step 1. Write Quick Start Configuration
Step 2. Generate OpenPAI configuration files
Optional Step 3. Customize the OpenPAI configuration

Step 1. Write Quick Start Configuration

An example file is available in the OpenPAI repository; the command in Step 2 reads it from /pai/deployment/quick-start/quick-start.yaml.

An example YAML file is shown below. Change the machine IP addresses and the SSH information to match your environment.

# quick-start.yaml

# (Required) IP addresses of the servers on which you would like to deploy OpenPAI.
machines:

  - 192.168.1.11
  - 192.168.1.12
  - 192.168.1.13

# (Required) Login info for all machines. The system administrator must ensure
# that the username/password pair or username/key-filename is valid and has sudo privileges.
ssh-username: pai
ssh-password: pai-password

# (Optional, default=None) The key file that the ssh client uses. It takes priority over the password.
#ssh-keyfile-path: <keyfile-path>

# (Optional, default=22) Port number of ssh service on each machine.
#ssh-port: 22

# (Optional, default=DNS of the first machine) Cluster DNS.
#dns: <ip-of-dns>

# (Optional, default=10.254.0.0/16) IP range used by Kubernetes. Note that
# this IP range should NOT conflict with the current network.
#service-cluster-ip-range: <ip-range-for-k8s>
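
The note about `service-cluster-ip-range` can be checked before generating anything. The following is a minimal sketch, not part of paictl: the helper name `find_conflicts` is hypothetical, and it only tests the listed machine addresses, so other hosts on your network still need a manual check.

```python
import ipaddress


def find_conflicts(machine_ips, service_range):
    """Return the machine IPs that fall inside the Kubernetes service range."""
    network = ipaddress.ip_network(service_range)
    return [ip for ip in machine_ips if ipaddress.ip_address(ip) in network]


# Values copied from the example quick-start.yaml above.
machines = ["192.168.1.11", "192.168.1.12", "192.168.1.13"]
conflicts = find_conflicts(machines, "10.254.0.0/16")

if conflicts:
    print("These machines overlap the service range:", conflicts)
else:
    print("No machine address falls inside the service range.")
```

If the list is non-empty, choose a different range in `service-cluster-ip-range` before running Step 2.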

Step 2. Generate OpenPAI configuration files

(1) generate configuration files

# Run these commands from the pai directory inside the dev-box.

cd /pai

python paictl.py config generate -i /pai/deployment/quick-start/quick-start.yaml -o ~/pai-config -f

(2) update docker tag to release version

vi ~/pai-config/services-configuration.yaml

For example, if you are deploying from the v0.x.y branch, change docker-tag to v0.x.y.

docker-tag: v0.x.y
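
If you prefer to script this edit instead of using vi, the sketch below rewrites the `docker-tag` value in the file's text. It assumes the key sits on a single line of the form `docker-tag: <value>`; the helper name `set_docker_tag` and the surrounding keys in the sample are hypothetical and do not describe the real layout of services-configuration.yaml.

```python
import re


def set_docker_tag(config_text, new_tag):
    """Replace the value on every uncommented `docker-tag:` line, keeping indentation."""
    pattern = re.compile(r"^([ \t]*docker-tag:[ \t]*).*$", flags=re.MULTILINE)
    # A function replacement avoids any backslash interpretation in new_tag.
    return pattern.sub(lambda match: match.group(1) + new_tag, config_text)


# Hypothetical sample text; only the docker-tag line mirrors the guide.
sample = "cluster:\n  docker-tag: latest\n  other-key: value\n"
print(set_docker_tag(sample, "v0.x.y"))
```

To apply it, read ~/pai-config/services-configuration.yaml into a string, pass it through the function, and write the result back after reviewing the change.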

(3) change GPU count and type

By default, quick start generates every node with 1 GPU of type generic, which may not match your hardware. For example, if you have two types of machines, one with 4 Tesla K80 GPU cards and the other with 2 Tesla P100 cards, modify ~/pai-config/layout.yaml as follows:

machine-sku:
  k80-node:
    mem: 40G
    gpu:
      type: Tesla K80
      count: 4
    cpu:
      vcore: 24
    os: ubuntu16.04
  p100-node:
    mem: 20G
    gpu:
      type: Tesla P100
      count: 2
    cpu:
      vcore: 24
    os: ubuntu16.04

machine-list:

  - hostname: xxx
    hostip: yyy
    machine-type: k80-node
  - hostname: xxx
    hostip: yyy
    machine-type: p100-node
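
A quick consistency check on layout.yaml can catch a `machine-type` that has no matching `machine-sku` entry. The sketch below mirrors the structure above as plain Python dictionaries, so no YAML library is needed; the helper name `gpu_totals`, the hostnames, and the IP addresses are placeholders for illustration only.

```python
from collections import Counter

# Mirrors the machine-sku section above (only the gpu fields are needed here).
machine_sku = {
    "k80-node": {"gpu": {"type": "Tesla K80", "count": 4}},
    "p100-node": {"gpu": {"type": "Tesla P100", "count": 2}},
}

# Placeholder hosts standing in for the xxx / yyy entries above.
machine_list = [
    {"hostname": "host-a", "hostip": "192.168.1.11", "machine-type": "k80-node"},
    {"hostname": "host-b", "hostip": "192.168.1.12", "machine-type": "p100-node"},
]


def gpu_totals(skus, machines):
    """Sum GPU counts per GPU type; fail if a machine uses an undefined SKU."""
    unknown = [m["hostname"] for m in machines if m["machine-type"] not in skus]
    if unknown:
        raise ValueError(f"machines with undefined machine-type: {unknown}")
    totals = Counter()
    for machine in machines:
        gpu = skus[machine["machine-type"]]["gpu"]
        totals[gpu["type"]] += gpu["count"]
    return dict(totals)


print(gpu_totals(machine_sku, machine_list))
```

Comparing the totals against the hardware you actually have is a simple way to confirm the layout before deployment.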

(4) default values in the generated configuration

The paictl tool sets the following default values in the 4 configuration files:

<table>
<tr>
  <th>Configuration Property</th>
  <th>Default value</th>
</tr>
<tr>
  <td><code>master node</code></td>
  <td>The first machine in the machine list will be configured as the master node.</td>
</tr>
<tr>
  <td><code>SSH port</code></td>
  <td>If not explicitly specified, the SSH port is set to <code>22</code>.</td>
</tr>
<tr>
  <td><code>cluster DNS</code></td>
  <td>If not explicitly specified, the cluster DNS is set to the value of the <code>nameserver</code> field in the <code>/etc/resolv.conf</code> file of the master node.</td>
</tr>
<tr>
  <td><code>IP range used by Kubernetes</code></td>
  <td>If not explicitly specified, the IP range used by Kubernetes is set to <code>10.254.0.0/16</code>.</td>
</tr>
<tr>
  <td><code>docker registry</code></td>
  <td>The docker registry is set to <code>docker.io</code>, and the docker namespace is set to <code>openpai</code>. In other words, all PAI service images will be pulled from <code>docker.io/openpai</code> (see <a href="https://hub.docker.com/r/openpai/">this link</a> on DockerHub for the details of all images).</td>
</tr>
<tr>
  <td><code>Cluster id</code></td>
  <td>Cluster id is set to <code>pai-example</code>.</td>
</tr>
<tr>
  <td><code>REST server's admin user</code></td>
  <td>REST server's admin user is set to <code>admin</code>, and its password is set to <code>admin-password</code>.</td>
</tr>
<tr>
  <td><code>VC</code></td>
  <td>There is only one VC in the system, <code>default</code>, which has 100% of the resource capacity.</td>
</tr>
</table>
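
The defaulting rules for the quick-start fields can be illustrated with a small sketch. This is not paictl's implementation: the function name `apply_defaults` and the `master` key in the result are hypothetical, and the cluster DNS default is left out because it depends on the master node's /etc/resolv.conf.

```python
# Defaults taken from the list of default values in this section.
DEFAULTS = {
    "ssh-port": 22,
    "service-cluster-ip-range": "10.254.0.0/16",
}


def apply_defaults(quick_start):
    """Return a copy of the quick-start settings with missing optional values filled in."""
    explicit = {key: value for key, value in quick_start.items() if value is not None}
    resolved = {**DEFAULTS, **explicit}
    # The first machine in the list becomes the master node.
    resolved["master"] = quick_start["machines"][0]
    return resolved


quick_start = {
    "machines": ["192.168.1.11", "192.168.1.12", "192.168.1.13"],
    "ssh-username": "pai",
    "ssh-password": "pai-password",
}
print(apply_defaults(quick_start))
```

Values given explicitly in quick-start.yaml take precedence over the defaults, which is why the explicit settings are merged last.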

Optional Step 3. Customize the OpenPAI configuration

This method is for advanced users.

The description of each field in these configuration files can be found in A Guide For Cluster Configuration.

To customize the configuration, see the lists below.

Configure OpenPAI from scenarios
- placement
  - configure node placement of services
  - configure which servers the GPU driver is installed on
- scheduling
  - configure virtual cluster capacity
- account
  - configure a custom docker repository
  - configure the OpenPAI admin user account
- port / data folder etc.
  - configure service entry
  - configure HDFS data / OpenPAI temp data folder
- component version
  - configure K8s component version
  - configure NVIDIA GPU driver version
- HA
  - Kubernetes High Availability Configuration

Configure OpenPAI from files
- Cluster related configuration: configuration of layout.yaml
- Kubernetes role related configuration: will be deprecated
- Kubernetes related configuration: configuration of kubernetes-configuration.yaml
- Service related configuration: configuration of services-configuration.yaml

Configure OpenPAI services (Note: this part is for advanced users who want to customize each OpenPAI service.)
