Merge branch 'gary/bare_metal_docs' into 'master'
Bare metal deployment improvements

* Revamp docs
* Add SSH identity file arg for CI 

See merge request dfinity-lab/public/ic!16750
garym-dfinity committed Dec 18, 2023
2 parents 91bbd24 + 7594e14 commit f6a498c
Showing 3 changed files with 69 additions and 39 deletions.
92 changes: 57 additions & 35 deletions ic-os/utils/bare_metal_deployment/README.md
@@ -1,25 +1,73 @@
# The Remote Image Deployer

Get SetupOS images to run remotely given BMC info. Works only for iDRAC (currently). Use bazel target or run in poetry shell.
Deploy SetupOS to bare metal remotely using BMC.
Works only for iDRAC (currently).
Reserve the target machine in Dee before deploying.

## What do you need?

* A Dell machine with iDRAC version 6 or higher
* SSH key access to a file share, preferably close to the target machine
* A [yaml file](#whats-in-the-yaml-configuration-file) containing info to configure deployment
* A [csv file](#whats-in-the-csv-secrets-file) containing the BMC info and credentials


### Run it via bazel target

From inside the devenv container! I.e.: first run `./gitlab-ci/container/container-run.sh` -
Must be run inside the devenv container. Use `./gitlab-ci/container/container-run.sh`.

The config files must be accessible from inside the container - e.g., at the root of the ic directory, which maps to `/ic` inside the container.

```
bazel run //ic-os/setupos/envs/dev:launch_bare_metal --config=local -- \
--config_path $(realpath ./ic-os/utils/bare_metal_deployment/example_config.yaml) \
--csv_filename $(realpath ./zh2-dll01.csv)
```

This is all you need for local usage.

To develop or use finer grained features, read on.
#### What's in the yaml configuration file?

```
file_share_url: <NFS share on which to upload the file>
file_share_dir: <directory on NFS share which is exposed via NFS>
file_share_image_filename: <name of image file to appear over NFS>
file_share_username: <SSH username to log into file share> # NOTE SSH KEYS ARE ASSUMED TO BE FUNCTIONAL
inject_image_ipv6_prefix: <config.ini: ipv6_prefix>
inject_image_ipv6_gateway: <config.ini: ipv6_gateway>
```

These are CLI args submitted in yaml form. See [why](#why-two-config-files) or `./deploy.py --help` for detailed docs on the arguments.
See `./example_config.yaml` for a functional example.

#### What's in the csv secrets file?

Per-machine BMC secrets. Each row represents a machine. The tool will deploy to each with the given information.

```
<ip_address>,<username>,<password>,<guestos ipv6 address>
```

See [CSV secrets file](#csv-secrets-file) for more info.

##### Where can I find csv files for the bare metal test machines?

Next to each machine entry in 1Pass. Ask node team for details.


#### Why two config files?

`deploy.py` accepts many CLI arguments and can read those same arguments from a yaml configuration file. The file is a convenient way to manage them, but every argument can also be specified on the command line.

The csv file contains secrets which should _not_ be submitted via the command line. It also supports an arbitrary number of rows to deploy to an arbitrary number of machines.
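
The precedence described here (file values first, command-line values on top) can be sketched as follows. This is a minimal illustration, not how `deploy.py` is implemented: `merge_args` is a hypothetical helper, a plain dict stands in for the parsed yaml file, and only three of the options from the yaml example are declared.

```python
import argparse


def merge_args(file_config: dict, argv: list) -> dict:
    """Return settings where command-line values override file values.

    Hypothetical helper for illustration only.
    """
    parser = argparse.ArgumentParser()
    # Assumed subset of the options listed in the yaml example.
    parser.add_argument("--file_share_url")
    parser.add_argument("--file_share_dir")
    parser.add_argument("--file_share_username")
    parsed = vars(parser.parse_args(argv))
    # Keep only the options that were actually given on the command line.
    cli = {key: value for key, value in parsed.items() if value is not None}
    # Later dict wins, so command-line values replace file values.
    return {**file_config, **cli}


# Stand-in for the contents of a yaml configuration file.
file_config = {"file_share_url": "10.10.101.254", "file_share_dir": "/srv/images"}
merged = merge_args(file_config, ["--file_share_dir", "/tmp/images"])
print(merged)
```

Secrets are deliberately left out of this merge: they stay in the csv file so that they never appear in shell history or process listings.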


### This is all you need for local usage.

# To develop or use finer grained features, read on.

## Requirements

Ignore if running via bazel + devenv container from the ic repo (see below).
*Ignore if running via bazel + devenv container from the ic repo.*

* Python 3.10 (maybe lower works)
* Poetry - `pip install poetry`
@@ -33,11 +81,11 @@ Ignore if running via bazel + devenv container from the ic repo (see below).

### Prep + Review input data

#### CSV files
#### CSV secrets file

`deploy.py` requires a CSV file listing the BMCs to deploy to. Include one row _for each BMC_, in the form "ip address, username, password".

Each row can include an extra parameter - the GuestOS ipv6 address. This is used to check if the resulting machine has deployed successfully.
Each row optionally includes a final parameter - the GuestOS ipv6 address. This is used to check whether the resulting machine has deployed successfully. The address is calculated deterministically; see bazel target /rs/ic_os/deterministic_ips to calculate it.

This file is plaintext readable - make it readable only by the current user.
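
One way to restrict the file to the current user is sketched below, assuming a POSIX system. `restrict_to_owner` is a hypothetical helper and the temporary file with a made-up row only stands in for a real secrets file.

```python
import os
import stat
import tempfile


def restrict_to_owner(path: str) -> None:
    """Allow only the owning user to read and write the file (mode 0600)."""
    os.chmod(path, stat.S_IRUSR | stat.S_IWUSR)


# Demo on a throwaway file with a placeholder row, not real credentials.
with tempfile.NamedTemporaryFile(suffix=".csv", delete=False) as handle:
    handle.write(b"10.10.10.123,root,password\n")
    demo_path = handle.name

os.chmod(demo_path, 0o644)  # simulate a file that is readable by everyone
restrict_to_owner(demo_path)
mode = stat.S_IMODE(os.stat(demo_path).st_mode)
print(oct(mode))
os.remove(demo_path)
```

From a shell, `chmod 600 <csv file>` has the same effect.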

@@ -54,35 +102,9 @@ or
10.10.10.124,root,password,2a00:fb01:400:200:6801::1235
```
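
The row format can be checked before deploying with a short script like the one below. It is a sketch under stated assumptions: `BmcRow` and `parse_rows` are hypothetical names (not the `BMCInfo` class in `deploy.py`), and the addresses and credentials are placeholders.

```python
import csv
import io
import ipaddress
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class BmcRow:
    """One row of the secrets file (hypothetical type for illustration)."""
    ip_address: str
    username: str
    password: str
    guestos_ipv6_address: Optional[str] = None


def parse_rows(text: str) -> List[BmcRow]:
    """Parse rows of 3 fields, or 4 when the GuestOS ipv6 address is given."""
    rows = []
    for line_number, fields in enumerate(csv.reader(io.StringIO(text)), start=1):
        if not fields:
            continue  # skip blank lines
        fields = [field.strip() for field in fields]
        if len(fields) not in (3, 4):
            raise ValueError(f"row {line_number}: expected 3 or 4 fields, got {len(fields)}")
        ipaddress.ip_address(fields[0])  # raises ValueError if malformed
        if len(fields) == 4:
            ipaddress.IPv6Address(fields[3])  # raises ValueError if malformed
        rows.append(BmcRow(*fields))
    return rows


sample = (
    "10.10.10.123,root,password\n"
    "10.10.10.124,root,password,2a00:fb01:400:200:6801::1235\n"
)
rows = parse_rows(sample)
print(len(rows), rows[1].guestos_ipv6_address)
```

Rejecting malformed rows up front avoids starting a deployment that would fail only once the tool reaches the bad entry.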

#### SetupOS image

If running via the bazel target, skip this section.

Prepare the image for deployment (config.ini, etc.). See the related google doc for details: 'SetupOS bare-metal hardware installation guide'.

Skip these instructions if using the `--upload_file` flag, passing in the compressed file from the above step. Skip directly to running `deploy.py`.

Send to NFS file share:
```bash
# Send to nfs file share machine. Alternatively mount the nfs (if you're allowlisted) and cp to it
scp sh1-setupos.img.zst [email protected]:
```

Log in, decompress, host image:
```bash
# Commands run after `ssh [email protected]`
zstd -d sh1-setupos.img.zst
sudo mv sh1-setupos.img /srv/images
```

Consider the network url format expected by the tool+iDRAC: "<IP_ADDRESS>:<PATH_TO_IMAGE>"
E.g., "10.10.101.254:/srv/images/sh1-setupos.img"

The network image url must point to an NFS file share.

**The file share machine firewall must allow traffic from the target bmc ip addresses!**
#### Manually preparing SetupOS image

We've been using `zh2-rmu` for testing. Add the new DCs to the allowlist.
See related google doc for details: 'SetupOS bare-metal hardware installation guide'.


## Run it
16 changes: 12 additions & 4 deletions ic-os/utils/bare_metal_deployment/deploy.py
@@ -42,7 +42,7 @@ class Args:
    Use --config_path <yaml file> to load args from a file. Args on the command line will override config file args.
    """

    # URL for NFS enabled fileshare, e.g. 10.10.101.254
    # Endpoint for NFS enabled fileshare, e.g. zh2-rmu or 10.10.101.254
    file_share_url: str = field(alias="-u")

    # Directory on the remote file share where files are served from. E.g. /srv/images. This will be postfixed to the file_share_url, e.g.: 10.10.101.254:/srv/images
@@ -59,6 +59,9 @@ class Args:
    # Username for SSH/SCP access to file share. Defaults to the current username
    file_share_username: Optional[str] = None

    # SSH private key file for access to file share. This is passed via the '-i' flag to `scp`. If omitted, the '-i' flag is omitted.
    file_share_ssh_key: Optional[str] = None

    upload_img: Optional[str] = field(default=None, alias="-f")
    """
    If specified, file will be scp'd to `file_share_url`, decompressed, and the contained disk.img file moved to `file_share_dir` and renamed to `file_share_image_filename`.
@@ -105,7 +108,9 @@ def __post_init__(self):
        if self.inject_image_ipv6_prefix:
            assert self.inject_configuration_tool, \
                "setupos_inject_configuration tool required to modify image"

        assert self.file_share_ssh_key is None \
            or Path(self.file_share_ssh_key).exists(), \
            "File share ssh key path does not exist"

@dataclass(frozen=True)
class BMCInfo:
@@ -387,6 +392,7 @@ def upload_to_file_share(
    file_share_url: str,
    file_share_dir: str,
    file_share_image_name: str,
    file_share_ssh_key: str,
):
    endpoint = (
        file_share_url
@@ -401,7 +407,8 @@ def upload_to_file_share(
result = conn.run("mktemp --directory", hide="both", echo=True)
tmp_dir = str.strip(result.stdout)
# scp is faster than fabric's built-in transfer.
invoke.run(f"scp -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null {upload_img} {endpoint}:{tmp_dir}", echo=True)
ssh_key_arg = f"-i {file_share_ssh_key}" if file_share_ssh_key else ""
invoke.run(f"scp -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null {ssh_key_arg} {upload_img} {endpoint}:{tmp_dir}", echo=True)

upload_img_filename = upload_img.name
# Decompress in place. disk.img should appear in the same directory
@@ -495,7 +502,8 @@ def main():
args.file_share_username,
args.file_share_url,
args.file_share_dir,
args.file_share_image_filename)
args.file_share_image_filename,
args.file_share_ssh_key)

elif args.upload_img:
upload_to_file_share(
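
The optional `-i` handling added to `upload_to_file_share` can also be expressed as an argument list, which avoids an empty token when no key is given and keeps paths containing spaces intact. This is a sketch of an alternative, not the code in this commit: `build_scp_command` is a hypothetical helper, and the paths and endpoint are placeholders.

```python
import shlex
from typing import List, Optional


def build_scp_command(
    upload_img: str,
    endpoint: str,
    tmp_dir: str,
    file_share_ssh_key: Optional[str] = None,
) -> List[str]:
    """Build the scp argument list, adding '-i <key>' only when a key is set."""
    command = [
        "scp",
        "-o", "StrictHostKeyChecking=no",
        "-o", "UserKnownHostsFile=/dev/null",
    ]
    if file_share_ssh_key:
        command += ["-i", file_share_ssh_key]
    command += [upload_img, f"{endpoint}:{tmp_dir}"]
    return command


# Placeholder values for illustration only.
without_key = build_scp_command("disk.img.zst", "user@fileshare", "/tmp/tmp.abc")
with_key = build_scp_command("disk.img.zst", "user@fileshare", "/tmp/tmp.abc", "/keys/ci key")

# shlex.join quotes each argument for display or for a shell-based runner.
print(shlex.join(with_key))
```

The list form can be passed directly to `subprocess.run`, while `shlex.join` produces a single quoted string for runners that expect one, such as the `invoke.run` call used in `deploy.py`.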
