This check monitors AWS Neuron through the Datadog Agent. It enables monitoring of the Inferentia and Trainium devices and delivers insights into your machine learning model's performance.
Follow the instructions below to install and configure this check for an Agent running on an EC2 instance. For containerized environments, see the Autodiscovery Integration Templates for guidance on applying these instructions.
The AWS Neuron check is included in the Datadog Agent package.
Apart from the AWS Neuron Tools package, which you also need to install, no additional installation is needed on your server.
- Ensure that Neuron Monitor is being used to expose the Prometheus endpoint.
- Edit the aws_neuron.d/conf.yaml file, which is located in the conf.d/ folder at the root of your Agent's configuration directory, to start collecting your AWS Neuron performance data. See the sample aws_neuron.d/conf.yaml for all available configuration options.
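A minimal instance configuration might look like the sketch below. It assumes the check accepts the standard openmetrics_endpoint option used by OpenMetrics-based Agent checks, and that Neuron Monitor's Prometheus exporter listens on localhost port 8000. Both are assumptions, so confirm the option name and the port against the sample aws_neuron.d/conf.yaml and your Neuron Monitor setup.

```yaml
# aws_neuron.d/conf.yaml (sketch; the option name and port are assumptions)
init_config:

instances:
    # Assumed URL of the Prometheus endpoint exposed by Neuron Monitor.
  - openmetrics_endpoint: http://localhost:8000
```

Restart the Agent after editing the file so that the new configuration is loaded.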
The AWS Neuron integration can collect logs from the Neuron containers and forward them to Datadog.
- Collecting logs is disabled by default in the Datadog Agent. Enable it in your datadog.yaml file:

    logs_enabled: true

- Uncomment and edit the logs configuration block in your aws_neuron.d/conf.yaml file. Here's an example:

    logs:
      - type: docker
        source: aws_neuron
        service: aws_neuron
In Kubernetes environments, log collection is likewise disabled by default in the Datadog Agent. To enable it, see Kubernetes Log Collection.
Then, set Log Integrations as pod annotations. This can also be configured with a file, a configmap, or a key-value store. For more information, see the configuration section of Kubernetes Log Collection.
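As a sketch of the pod-annotation approach, the manifest below attaches a log configuration to a single container. The pod name, the container name neuron-monitor, and the image placeholder are hypothetical. The source and service values are the same ones used in the aws_neuron.d/conf.yaml logs example.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: neuron-monitor  # hypothetical pod name
  annotations:
    # The segment between "/" and ".logs" must match the container name below.
    ad.datadoghq.com/neuron-monitor.logs: '[{"source": "aws_neuron", "service": "aws_neuron"}]'
spec:
  containers:
    - name: neuron-monitor  # hypothetical container name
      image: "<NEURON_MONITOR_IMAGE>"  # placeholder; substitute your own image
```

Annotations keep the log configuration next to the workload. A file, a ConfigMap, or a key-value store may suit you better if you manage Agent configuration centrally.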
Run the Agent's status subcommand and look for aws_neuron under the Checks section.
See metadata.csv for a list of metrics provided by this integration.
The AWS Neuron integration does not include any events.
See service_checks.json for a list of service checks provided by this integration.
In containerized environments, ensure that the Agent has network access to the endpoints specified in the aws_neuron.d/conf.yaml file.
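One way to avoid a hard-coded, unreachable address in a Kubernetes pod is to configure the check through Autodiscovery annotations and let the Agent resolve the container's IP with the %%host%% template variable. The fragment below is a sketch under assumptions: the container name neuron-monitor is hypothetical, and both the openmetrics_endpoint option and port 8000 should be verified against the sample aws_neuron.d/conf.yaml and your Neuron Monitor setup.

```yaml
metadata:
  annotations:
    # "neuron-monitor" is a hypothetical container name; it must match spec.containers[].name.
    ad.datadoghq.com/neuron-monitor.checks: |
      {
        "aws_neuron": {
          "init_config": {},
          "instances": [
            {"openmetrics_endpoint": "http://%%host%%:8000"}
          ]
        }
      }
```

If the check still cannot connect, confirm that no network policy or security group blocks traffic from the Agent to that port.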
Need help? Contact Datadog support.