Update README and run all plugins by default
Also remove use of verbosity levels until we have figured out
a useful way to use them.
dosaboy committed Feb 25, 2021
1 parent 2003fb8 commit 8ea3fc7
Showing 4 changed files with 206 additions and 67 deletions.
155 changes: 141 additions & 14 deletions README.md
@@ -1,26 +1,153 @@
# hotsos

Tool to extract application-specific information from a [sosreport](https://github.com/sosreport/sos) which are commonly used as a source of debugging information.
This tool has two uses: creating an application-specific summary of the contents of a [sosreport](https://github.com/sosreport/sos) or a live system, and optionally performing extended analysis of those components.

When running hotsos you can choose from a selection of plugins and can run it against either a sosreport or live host.

There are multiple *verbosity* levels (`-v` flag) to choose from, each providing more fine-grained information from each plugin.
By default all plugins are run, but each only displays information if any is found. To run specific plugins, choose from the available selection (`--list-plugins`). The default output format (stdout) is YAML to allow easy parsing.

NOTE: hotsos is not intended to replace the functionality of [xsos](https://github.com/ryran/xsos) but rather to provide extra application-specific information to get a useful view of applications running on a host.

### Usage

- Get the list of plugins:
> hotsos --list-plugins
- Get details about *openstack*:
> hotsos /path/to/sos/report --openstack
- Get more details on *openstack*:
> hotsos /path/to/sos/report --openstack -v
- Get all information (run all plugins):
```> hotsos /path/to/sos/report
hotsos:
version: development
repo-info: 2003fb8
system:
hostname: acmehost1
os: ubuntu bionic
num-cpus: 80
load: 2.56, 2.44, 2.27
rootfs: /dev/mapper/vg0-lvroot 364219208 107557132 238091020 32% /
unattended-upgrades: disabled
openstack:
release: ussuri
services:
- ceilometer-polling (1)
- dnsmasq (18)
- haproxy (47)
- neutron-dhcp-agent (1)
- neutron-l3-agent (1)
- neutron-metadata-agent (21)
- neutron-openvswitch-agent (1)
- nova-api-metadata (21)
- nova-compute (1)
- ovs-vswitchd (1)
- ovsdb-client (1)
- ovsdb-server (1)
- qemu-system-x86_64 (2)
debug-logging-enabled:
ceilometer: false
neutron: false
nova: false
instances:
- 9fcc851a-821e-4a90-8272-fad52e0c5617
- 5ad5d022-a5ab-472d-8307-894a15e64354
dpkg:
- ceilometer-agent-compute 1:14.0.0-0ubuntu0.20.04.1~cloud0
- ceilometer-common 1:14.0.0-0ubuntu0.20.04.1~cloud0
- conntrack 1:1.4.4+snapshot20161117-6ubuntu2
- dnsmasq-base 2.79-1ubuntu0.2
- dnsmasq-utils 2.79-1ubuntu0.2
- haproxy 1.8.8-1ubuntu0.11
- keepalived 1:1.3.9-1ubuntu0.18.04.2
- keystone-common 2:17.0.0-0ubuntu0.20.04.1~cloud0
- libc-bin 2.27-3ubuntu1.4
- libvirt-daemon 6.0.0-0ubuntu8.5~cloud0
- libvirt-daemon-driver-qemu 6.0.0-0ubuntu8.5~cloud0
- libvirt-daemon-driver-storage-rbd 6.0.0-0ubuntu8.5~cloud0
- libvirt-daemon-system 6.0.0-0ubuntu8.5~cloud0
- libvirt-daemon-system-systemd 6.0.0-0ubuntu8.5~cloud0
- neutron-common 2:16.2.0-0ubuntu2~cloud0
- neutron-dhcp-agent 2:16.2.0-0ubuntu2~cloud0
- neutron-fwaas-common 1:16.0.0-0ubuntu0.20.04.1~cloud0
- neutron-l3-agent 2:16.2.0-0ubuntu2~cloud0
- neutron-metadata-agent 2:16.2.0-0ubuntu2~cloud0
- neutron-openvswitch-agent 2:16.2.0-0ubuntu2~cloud0
- nova-api-metadata 2:21.1.0-0ubuntu1~cloud0
- nova-common 2:21.1.0-0ubuntu1~cloud0
- nova-compute 2:21.1.0-0ubuntu1~cloud0
- nova-compute-kvm 2:21.1.0-0ubuntu1~cloud0
- nova-compute-libvirt 2:21.1.0-0ubuntu1~cloud0
- openvswitch-switch 2.13.1-0ubuntu0.20.04.2~cloud0
- python3-oslo.cache 2.3.0-0ubuntu1~cloud0
- python3-oslo.concurrency 4.0.2-0ubuntu1~cloud0
- python3-oslo.config 1:8.0.2-0ubuntu1~cloud0
- python3-oslo.context 1:3.0.2-0ubuntu1~cloud0
- python3-oslo.db 8.1.0-0ubuntu1~cloud0
- python3-oslo.i18n 4.0.1-0ubuntu1~cloud0
- python3-oslo.log 4.1.1-0ubuntu1~cloud0
- python3-oslo.messaging 12.1.0-0ubuntu1~cloud0
- python3-oslo.middleware 4.0.2-0ubuntu1~cloud0
- python3-oslo.policy 3.1.0-0ubuntu1.1~cloud0
- python3-oslo.privsep 2.1.1-0ubuntu1~cloud0
- python3-oslo.reports 2.0.1-0ubuntu1~cloud0
- python3-oslo.rootwrap 6.0.2-0ubuntu1~cloud0
- python3-oslo.serialization 3.1.1-0ubuntu1~cloud0
- python3-oslo.service 2.1.1-0ubuntu1.1~cloud0
- python3-oslo.upgradecheck 1.0.1-0ubuntu1~cloud0
- python3-oslo.utils 4.1.1-0ubuntu1~cloud0
- python3-oslo.versionedobjects 2.0.1-0ubuntu1~cloud0
- python3-oslo.vmware 3.3.1-0ubuntu1~cloud0
- qemu-kvm 1:4.2-3ubuntu6.10~cloud0
network:
namespaces:
qrouter: 1
qdhcp: 1
fip: 1
config:
nova:
my_ip: 10.100.22.1 (bond1)
neutron:
local_ip: 10.120.22.1 (bond2.231@bond2)
port-health:
num-vms-checked: 68
stats:
9fcc851a-821e-4a90-8272-fad52e0c5617:
fa:16:3e:4c:84:c3:
dropped: 23534 (11%)
5ad5d022-a5ab-472d-8307-894a15e64354:
fa:16:3e:97:ce:99:
dropped: 22236 (10%)
features:
neutron:
neutron:
availability_zone: az1
openvswitch-agent:
l2_population: true
l3-agent:
agent_mode: dvr
dhcp-agent:
enable_metadata_network: true
enable_isolated_metadata: true
ovs_use_veth: false
kubernetes:
snaps:
core: 16-2.48.2.1
juju:
machines:
running:
- 40 (version=2.7.6)
charm-versions:
- ceilometer-agent-267
- neutron-openvswitch-278
- nova-compute-323
- ntp-39
units:
local:
- ceilometer-agent-14
- logrotate-41
- neutron-openvswitch-14
- nova-compute-4
- ntp-60
kernel:
boot: BOOT_IMAGE=/vmlinuz-4.15.0-128-generic root=/dev/mapper/vg0-lvroot ro console=tty0 console=ttyS0,115200 console=ttyS1,115200 raid=noautodetect pti=off
memory-checks: no issues found
systemd:
- CPUAffinity not set
- And even more details:
> hotsos /path/to/sos/report --openstack -vvv
INFO: see --help for more display options
```

## Install

56 changes: 36 additions & 20 deletions hotsos.sh
@@ -43,45 +43,47 @@ declare -A PLUGINS=(
[system]=true # always do system by default
[all]=false
)
override_all_default=false
# output ordering
declare -a PLUGIN_NAMES=( system openstack kubernetes storage juju kernel )


usage ()
{
cat << EOF
USAGE: hotsos [OPTIONS] SOSPATH
USAGE: hotsos [OPTIONS] [SOSPATH]
OPTIONS
-h|--help
This message.
--juju
Include Juju info.
Use the Juju plugin.
--kernel
Include Kernel info.
Use the Kernel plugin.
--list-plugins
Show available plugins.
--openstack
Include Openstack services info.
Use the Openstack plugin.
--openstack-show-cpu-pinning-results
The Openstack plugin will check for cpu pinning configurations and
perform checks. By default only brief messages will be displayed when
issues are found. Use this flag to get more detailed results.
--kubernetes
Include info about Kubernetes
Use the Kubernetes plugin.
--storage
Include storage info including Ceph.
Use the Storage plugin.
--system
Include system info.
Use the System plugin.
-s|--save
Save output to a file.
Save yaml output to a file.
-a|--all
Enable all plugins.
Enable all plugins. This is the default.
-v
Increase amount of information displayed.
SOSPATH
Path to a sosreport. Can be provided multiple times.
Path to a sosreport. Can be provided multiple times. If none provided,
will run against local host.
EOF
}
@@ -92,32 +94,42 @@ while (($#)); do
usage
exit 0
;;
## PLUGINS ############
--juju)
override_all_default=true
PLUGINS[juju]=true
;;
--kernel)
override_all_default=true
PLUGINS[kernel]=true
;;
--kubernetes)
override_all_default=true
PLUGINS[kubernetes]=true
;;
--openstack)
override_all_default=true
PLUGINS[openstack]=true
;;
--openstack-show-cpu-pinning-results)
OPENSTACK_SHOW_CPU_PINNING_RESULTS=true
--system)
override_all_default=true
PLUGINS[system]=true
;;
--storage)
override_all_default=true
PLUGINS[storage]=true
;;
--kernel)
PLUGINS[kernel]=true
;;
--kubernetes)
PLUGINS[kubernetes]=true
#######################
## PLUGIN OPTS ########
--openstack-show-cpu-pinning-results)
OPENSTACK_SHOW_CPU_PINNING_RESULTS=true
;;
#######################
--list-plugins)
echo "Available plugins:"
echo "${!PLUGINS[@]}"| tr ' ' '\n'| grep -v all| xargs -l -I{} echo " - {}"
exit
;;
--system)
PLUGINS[system]=true
;;
-s|--save)
SAVE_OUTPUT=true
;;
@@ -143,6 +155,10 @@ done

((${#SOS_PATHS[@]})) || SOS_PATHS=( / )

if ! $override_all_default && ! ${PLUGINS[all]}; then
PLUGINS[all]=true
fi

if ${PLUGINS[all]}; then
PLUGINS[openstack]=true
PLUGINS[storage]=true
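
The new default in hotsos.sh can be easier to follow when modelled outside of bash. The sketch below is a hypothetical Python model of that selection logic, not code from hotsos: `resolve_plugins` and its variables are illustrative names. It assumes that `-a|--all` sets `PLUGINS[all]=true` and that the truncated `if ${PLUGINS[all]}` block enables every plugin, neither of which is fully visible in the hunks shown here.

```python
# Hypothetical Python model of the plugin selection in hotsos.sh.
# The function and variable names are illustrative, not part of hotsos.
PLUGIN_NAMES = ["system", "openstack", "kubernetes", "storage", "juju", "kernel"]


def resolve_plugins(args):
    """Return the plugins that would run for the given command-line args."""
    # Mirrors the PLUGINS defaults: only "system" starts out enabled.
    enabled = {name: name == "system" for name in PLUGIN_NAMES}
    run_all = False
    override_all_default = False

    for arg in args:
        if arg.startswith("--") and arg[2:] in enabled:
            # Naming any plugin explicitly switches off the run-all default.
            override_all_default = True
            enabled[arg[2:]] = True
        elif arg in ("-a", "--all"):
            # Assumption: this is what the (not shown) --all branch does.
            run_all = True

    # No plugin named and --all not given: fall back to running everything.
    if not override_all_default and not run_all:
        run_all = True

    if run_all:
        # Assumption: the truncated block enables every plugin.
        enabled = {name: True for name in PLUGIN_NAMES}

    # Keep the output ordering defined by PLUGIN_NAMES.
    return [name for name in PLUGIN_NAMES if enabled[name]]
```

Under these assumptions, `resolve_plugins([])` selects every plugin, while `resolve_plugins(["--openstack"])` selects only `system` and `openstack`, because `system` is enabled by default.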
5 changes: 2 additions & 3 deletions plugins/kernel/01kernel
@@ -91,10 +91,9 @@ check_nodes_memory () {
echo " node$node-${zones_type,,}:"
zone_issue_found=true
msg_suffix=""
((VERBOSITY_LEVEL>=1)) || msg_suffix=" (use -v to show zones)"
echo "${subsubsub_indent_str}may have limited high order memory - check ${DATA_ROOT}proc/buddyinfo$msg_suffix"
# if verbosity is requested, show the zones
((VERBOSITY_LEVEL>=1)) && check_mallocinfo $node $zones_type 1
check_mallocinfo $node $zones_type 1
fi
done

@@ -120,7 +119,7 @@ if $zone_issue_found; then
echo -e " compaction:\n${subsubsub_indent_str}failures are at $pcent% of successes (see ${DATA_ROOT}proc/vmstat)"
fi
fi
((VERBOSITY_LEVEL>=1)) && get_slab_major_consumers
get_slab_major_consumers
else
echo " memory-checks: no issues found"
fi
57 changes: 27 additions & 30 deletions plugins/storage/01ceph
@@ -4,7 +4,6 @@ import re
import subprocess

from common import (
constants,
helpers
)

@@ -130,37 +129,35 @@ def get_osd_info():
if 'mark' in osd_info[osd_id]:
del osd_info[osd_id]['mark']

if constants.VERBOSITY_LEVEL >= 1:
for line in helpers.get_ps():
ret = re.compile(r".+/ceph-osd\s+.+--id {}\s+.+".format(
osd_id)).match(line)
if ret:
rss = int(int(line.split()[5]) / 1024)
osd_info[osd_id]["rss"] = "{}M".format(rss)
for line in helpers.get_ps():
ret = re.compile(r".+/ceph-osd\s+.+--id {}\s+.+".format(
osd_id)).match(line)
if ret:
rss = int(int(line.split()[5]) / 1024)
osd_info[osd_id]["rss"] = "{}M".format(rss)
break

for line in helpers.get_ps_axo_flags():
ret = re.compile(r".+/ceph-osd\s+.+--id {}\s+.+".format(
osd_id)).match(line)
if ret:
osd_start = ' '.join(line.split()[13:17])
if sos_time_secs and osd_start:
cmd = ["date", "--date={}".format(osd_start),
"+%s"]
osd_start_secs = subprocess.check_output(cmd)
osd_uptime_secs = (int(sos_time_secs) -
int(osd_start_secs))
osd_uptime_str = seconds_to_date(osd_uptime_secs)
osd_info[osd_id]["etime"] = osd_uptime_str
break

for line in helpers.get_ps_axo_flags():
ret = re.compile(r".+/ceph-osd\s+.+--id {}\s+.+".format(
osd_id)).match(line)
if ret:
osd_start = ' '.join(line.split()[13:17])
if sos_time_secs and osd_start:
cmd = ["date", "--date={}".format(osd_start),
"+%s"]
osd_start_secs = subprocess.check_output(cmd)
osd_uptime_secs = (int(sos_time_secs) -
int(osd_start_secs))
osd_uptime_str = seconds_to_date(osd_uptime_secs)
osd_info[osd_id]["etime"] = osd_uptime_str
break

if constants.VERBOSITY_LEVEL >= 3:
ceph_osd_tree = helpers.get_ceph_osd_tree()
if ceph_osd_tree:
for line in helpers.get_ceph_osd_tree():
if line.split()[3] == "osd.{}".format(osd_id):
osd_info[osd_id]["devtype"] = line.split()[1]
break
ceph_osd_tree = helpers.get_ceph_osd_tree()
if ceph_osd_tree:
for line in helpers.get_ceph_osd_tree():
if line.split()[3] == "osd.{}".format(osd_id):
osd_info[osd_id]["devtype"] = line.split()[1]
break

if osd_info:
CEPH_INFO["osds"] = osd_info
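
The RSS lookup that now runs unconditionally can be illustrated in isolation. This is a minimal sketch, assuming `helpers.get_ps()` yields `ps aux`-style lines in which the sixth column is the resident set size in KiB. The helper name `osd_rss` and the sample line are hypothetical and only serve the illustration.

```python
import re


def osd_rss(ps_lines, osd_id):
    """Return the RSS of the given ceph-osd process as e.g. "2048M", or None."""
    # Same pattern as in the plugin, compiled once instead of once per line.
    pattern = re.compile(r".+/ceph-osd\s+.+--id {}\s+.+".format(osd_id))
    for line in ps_lines:
        if pattern.match(line):
            # Assumption: column 6 of `ps aux` output is RSS in KiB.
            rss_mib = int(line.split()[5]) // 1024
            return "{}M".format(rss_mib)
    return None


# Hypothetical sample line in `ps aux` layout.
sample = [
    "ceph 1234 1.0 2.5 1500000 2097152 ? Ssl Feb24 10:00 "
    "/usr/bin/ceph-osd -f --cluster ceph --id 3 --setuser ceph --setgroup ceph",
]
```

With this sample, `osd_rss(sample, 3)` gives `"2048M"`, and `osd_rss(sample, 30)` gives `None` because the pattern requires whitespace directly after the id.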