
Securely Managing Secrets in Datadog Agent: Standalone and Kubernetes Deployments

Nicolas Narbais

19 Mar 2025 — 9 min read

Keeping API keys, passwords, and other credentials out of plaintext configuration is essential for security. The Datadog Agent provides a secrets management mechanism to handle sensitive data both in standalone installations and in Kubernetes environments. In this deep dive, we’ll explore how Datadog Agent’s secret handling works, best practices for using it, real-world implementation examples, additional configuration options, recent enhancements, and known challenges (with solutions).

Why Use Datadog Secrets Management

Storing secrets in plaintext (e.g., directly in datadog.yaml or integration config files) poses a security risk. Datadog Agent’s secrets management package allows you to avoid plaintext secrets by delegating decryption/retrieval to an external tool. In practice, you mark secret values in configs with a special notation and let the Agent invoke a helper program to fetch the real values securely at runtime. This means credentials like API keys, database passwords, etc., never appear in cleartext on disk — they’re loaded only in memory by the Agent.

How it works in a nutshell: You replace sensitive config values with placeholders using the format ENC[<secret_name>]. At startup (or when loading new configs), the Datadog Agent detects these placeholders and calls an external “secret backend” command to retrieve the actual secret values. This enables integration with any secret management backend (HashiCorp Vault, AWS Secrets Manager, Kubernetes secrets, etc.) via a user-provided executable.

Understanding secret_backend_command and the External Secrets API

The core of Datadog’s secret management is the secret_backend_command setting. It holds the path to an executable that the Agent runs whenever it needs to resolve ENC[...] secrets. The Agent communicates with this executable through standard input/output using a simple JSON-based API:

Agent → Command (input): A JSON payload listing the secret handles to fetch. For example: {"version": "1.0", "secrets": ["secret1", "secret2"]}. Each entry in the secrets list corresponds to one ENC[...] identifier found in the configs.
Command → Agent (output): A JSON object mapping each secret handle to its decrypted value or an error. For example: { "secret1": {"value": "actual_value", "error": null}, "secret2": {"value": null, "error": "could not fetch the secret"} }. The Agent will replace the value for each secret in memory.

This approach is quite flexible since you can fetch from your chosen secret store and authentication method. The Agent itself doesn’t need direct knowledge of the secret backend; it just relies on the script’s output.
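To make the exchange concrete, here is a minimal Python sketch of a secret backend executable that follows the request and response shapes shown above. The in-memory FAKE_STORE and the fetch_secret helper are hypothetical stand-ins for a real secret store client; this is an illustration under those assumptions, not Datadog's implementation.

```python
import json
import sys

# Hypothetical in-memory store standing in for a real secret manager.
FAKE_STORE = {"secret1": "actual_value"}


def fetch_secret(handle):
    """Look up one handle; a real script would call Vault, AWS, etc. here."""
    return FAKE_STORE[handle]


def resolve(request_text, fetch=fetch_secret):
    """Turn the Agent's JSON request into the JSON response it expects."""
    request = json.loads(request_text)
    response = {}
    for handle in request.get("secrets", []):
        try:
            response[handle] = {"value": fetch(handle), "error": None}
        except Exception:
            # Report a generic per-handle error; never echo secret material.
            response[handle] = {"value": None, "error": "could not fetch the secret"}
    return json.dumps(response)


def main(stdin=sys.stdin, stdout=sys.stdout):
    """Read the request from STDIN and write the response to STDOUT."""
    stdout.write(resolve(stdin.read()))

# In a real executable, the file would end with a call to main().
```

Keeping resolve separate from the STDIN/STDOUT plumbing makes the logic easy to exercise with a hand-written request before wiring it into the Agent.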

Supported secret backends: Because the mechanism is external, virtually any secret management system can be used. Common choices include HashiCorp Vault, AWS Secrets Manager, AWS SSM Parameter Store, Azure Key Vault, and Kubernetes secrets. In fact, Datadog provides an open-source utility that implements a secret_backend_command supporting multiple backends out of the box (Vault, AWS, Azure, Akeyless, local files, etc.) (Secrets Management With Datadog Secret Backend Utility). This utility, called datadog-secret-backend, is a Go binary you can use instead of writing a custom script – it reads a config file to know how to access your secret stores and can even handle multiple providers simultaneously.

The ENC[...] notation: In your Datadog config files (including integration config YAMLs and even datadog.yaml), use ENC[<secret_handle>] as the value for any field that should be kept secret. The <secret_handle> is an identifier you’ll use in your secret backend. It can be a simple key name (e.g., db_password) or even a structured reference (JSON string, etc.) – Datadog doesn’t impose format restrictions inside the brackets. The only rules are that the ENC[...] must be the entire value (you can’t have a partial secret within a larger string), and secrets are always treated as strings (you can’t ENC[] an integer or boolean config). For example:

# postgres.yaml (integration config)
instances:
  - host: db.example.com
    username: ENC[db_readonly_user]
    password: ENC[db_readonly_password]

Here, db_readonly_user and db_readonly_password are handles. The Agent will invoke the secret_backend_command with these two identifiers, and expects the script to return their actual values, which will then be used by the Postgres check.
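The whole-value rule can be illustrated with a short Python sketch that walks an already-parsed config and collects handles. The function names are hypothetical, and the Agent's own parser may differ in details such as whitespace handling; the sketch only shows that a placeholder embedded in a larger string, or a non-string value, is not treated as a secret.

```python
import re

# Matches a value that is exactly ENC[...], with nothing before or after.
ENC_PATTERN = re.compile(r"ENC\[(.*)\]")


def extract_handle(value):
    """Return the handle if value is a whole-string ENC[...] placeholder, else None."""
    if not isinstance(value, str):
        return None
    match = ENC_PATTERN.fullmatch(value)
    return match.group(1) if match else None


def collect_handles(config):
    """Walk nested dicts and lists and gather every handle found, in order."""
    handles = []
    if isinstance(config, dict):
        for item in config.values():
            handles.extend(collect_handles(item))
    elif isinstance(config, list):
        for item in config:
            handles.extend(collect_handles(item))
    else:
        handle = extract_handle(config)
        if handle is not None:
            handles.append(handle)
    return handles
```

Applied to the postgres.yaml example above (once loaded into dicts and lists), collect_handles would yield the two handles that end up in the request sent to the secret backend command.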

Configuring the Secret Backend Command (Standalone Instances)

In a non-container (standalone) installation, configuring secret management involves a few steps:

1. Provide a secrets-fetching executable: This is your implementation of how to get secrets from your vault/secret-store. It could be a custom script or a binary. You specify its path in the main config file:

# datadog.yaml
secret_backend_command: <EXECUTABLE_PATH>
# https://github.com/DataDog/datadog-agent/blob/main/pkg/config/config_template.yaml

For example, you might set secret_backend_command: "/opt/datadog-agent/bin/fetch-secrets.sh". This tells the Agent which command to run for resolving secrets. If the command needs command-line arguments, provide them via the secret_backend_arguments setting (a list in YAML); otherwise, omit that setting and let the executable handle everything internally. In containerized setups, the same settings can be supplied through environment variables such as DD_SECRET_BACKEND_COMMAND.

2. Secure the permissions: The Agent enforces strict file permission requirements on the secret command for security. On Linux, the script/binary must be owned by the user the Agent runs as (typically dd-agent on Linux hosts, or root if the Agent runs as root in a container) and must not be accessible to anyone else (no permissions for group or world). In practice, set the file mode to 0700 and the owner to the Agent’s user. On Windows, the executable must be a proper Win32 application (so a .exe – a plain PowerShell or Python script won’t work unless compiled) and restricted so that only the Agent’s user (and admins/System) have access. These measures prevent unauthorized users from reading or tampering with the script. If the permissions are too open, the Agent will refuse to execute it and log an error. You can verify the permission check by running the datadog-agent secret command, which reports whether the rights are OK.

3. Implement the secret retrieval logic: Your script needs to read the JSON from STDIN and output the JSON response to STDOUT as described.

4. Configure environment as needed: The secret command runs as a subprocess of the Agent and inherits the Agent’s environment. This means any env vars (like cloud credentials, Vault tokens, etc.) available to the Agent will also be available to your script. You might use this for authentication – for example, setting VAULT_TOKEN or AWS credentials in the environment so the script can use them to auth to the secret service. Just ensure those env vars are themselves provided securely (e.g., injected via your orchestration, not hardcoded).

5. Restart the Agent: After configuring secret_backend_command (and possibly secret_backend_arguments) in datadog.yaml, restart the Agent. On startup, it will log that it’s using the secret backend command and attempt to decrypt any ENC[] entries. You can run datadog-agent secret after startup to see a report of the secrets loaded (it lists the handles found and whether any errors occurred).
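Before restarting, it can help to rehearse the steps above by hand. The following Python sketch (Linux only) checks ownership and mode as described in step 2, confirms that the environment variables from step 4 are present, and sends one request to the executable the way steps 3 and 5 describe. The function names, the REQUIRED_VARS list, and the 30-second timeout are assumptions made for illustration; the datadog-agent secret command remains the authoritative check.

```python
import json
import os
import stat
import subprocess

# Hypothetical variable names; use whatever your secret store's client expects.
REQUIRED_VARS = ("VAULT_ADDR", "VAULT_TOKEN")


def permission_problems(path, expected_uid):
    """Step 2: list ownership and mode problems; an empty list means OK."""
    info = os.stat(path)
    problems = []
    if info.st_uid != expected_uid:
        problems.append("file is not owned by the expected user")
    if info.st_mode & (stat.S_IRWXG | stat.S_IRWXO):
        problems.append("group or others have permissions on the file")
    if not info.st_mode & stat.S_IXUSR:
        problems.append("owner cannot execute the file")
    return problems


def missing_env(env=None):
    """Step 4: name required variables that are absent (never print values)."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


def query_backend(argv, handles, timeout=30):
    """Steps 3 and 5: send one request to the executable and parse the reply.

    argv is the executable path followed by any arguments; the timeout is an
    arbitrary choice for this sketch.
    """
    payload = json.dumps({"version": "1.0", "secrets": list(handles)})
    completed = subprocess.run(
        argv, input=payload, capture_output=True, text=True,
        timeout=timeout, check=True,  # non-zero exit raises CalledProcessError
    )
    return json.loads(completed.stdout)


def failed_handles(response):
    """Return the handles whose entry carries an error."""
    return sorted(h for h, entry in response.items() if entry.get("error"))
```

Running these checks as the same user the Agent runs as gives the most representative result, since both file ownership and the inherited environment depend on that user.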

Handling Secrets in Kubernetes Environments

Running the Datadog Agent on Kubernetes (typically via the Datadog Helm chart or the Datadog Operator) introduces some convenient options for secrets management. The Agent runs in a container (often as a DaemonSet on each node), and Datadog includes helper scripts in the container image to assist with secrets.

Built-in helper scripts: In containerized deployments, you don’t necessarily have to supply your own secret_backend_command script; Datadog Agent images come with some pre-packaged scripts for common scenarios. As of Agent v7.32.0, a new script readsecret_multiple_providers.sh is available, which supports multiple sources – not just files, but also directly reading Kubernetes Secrets. This newer script is recommended for modern deployments, as it generalizes secret retrieval.

With the Helm chart, enabling the secret backend is straightforward. For example, you can set the value datadog.secretBackend.command: "/readsecret_multiple_providers.sh" to tell the Agent to use the multi-provider helper; the chart’s values.yaml lists the related options. Under the hood, this populates the appropriate environment variable (DD_SECRET_BACKEND_COMMAND) in the Agent pods.

Using Kubernetes Secrets as files: One common pattern is to create a Kubernetes Secret containing your sensitive data and mount it into the Datadog Agent container as a file. For example, say you have a K8s Secret named db-creds with a key db_password. You can mount it at a known path (via a volume) in the Agent pod, e.g. mount to /etc/secret-volume/db_password. Then in your Datadog integration config you reference ENC[file@/etc/secret-volume/db_password] as the secret handle. The multi-provider script will recognize the file@ prefix and read the file directly.
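The behavior described for file@ handles can be sketched in Python as follows. This is only an illustration of the idea (strip the prefix, read the file, return the value in the response shape), not the bundled shell script itself; the function name and error strings are hypothetical.

```python
from pathlib import Path

FILE_PREFIX = "file@"


def read_file_handle(handle):
    """Resolve a file@<path> handle to the file's contents in the response shape."""
    if not handle.startswith(FILE_PREFIX):
        return {"value": None, "error": "unsupported handle prefix"}
    path = Path(handle[len(FILE_PREFIX):])
    try:
        value = path.read_text()
    except OSError:
        # Keep the message generic so paths and contents stay out of logs.
        return {"value": None, "error": "could not read the secret file"}
    return {"value": value, "error": None}
```

With the mount from the example, read_file_handle("file@/etc/secret-volume/db_password") would return the mounted key's contents as the value.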

A few best practices for Kubernetes secrets:

Use a dedicated volume mount: Datadog recommends mounting secrets into a separate directory (like /etc/secret-volume) rather than using default paths like /var/run/secrets. The reason is that the Agent’s helper script, if misused, could read other files, including the pod’s service account token. Isolating secrets under a specific path minimizes that risk and avoids confusion.
Namespace considerations: If you mount a Secret via a volume, it must be in the same namespace as the Agent pod (since pods can only directly mount secrets from their own namespace).

Using Kubernetes Secrets via the API: The /readsecret_multiple_providers.sh script allows the Agent to directly query Kubernetes for secrets by name, which is powerful in some situations. Instead of mounting the secret as a file, you can reference it with a handle like ENC[k8s_secret@<namespace>/<secret_name>/<key>]. For example: ENC[k8s_secret@database/db-creds/password] could fetch the password field from the db-creds Secret in the database namespace. For this to work:

The Agent’s service account needs permission to read that Secret (and any other secret you want to access this way). You’d create a Role that allows get on the secrets resource, ideally restricted to the specific secret names, and a RoleBinding to attach it to the Agent’s ServiceAccount. The Datadog docs provide an example RBAC policy that grants access to a specific secret in another namespace.
This approach can be useful if, say, your Datadog Agent is running in the default namespace but needs to get a secret from the database namespace – using the API method avoids having to duplicate secrets across namespaces or do complex mounting. Just be cautious with RBAC: only give the Agent access to the secrets it truly needs.
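As a rough sketch of the two points above, the Python below parses a k8s_secret@ handle and builds a least-privilege Role and RoleBinding for it as plain dictionaries. The naming scheme for the Role and the function names are hypothetical; the manifests use the standard rbac.authorization.k8s.io/v1 fields, but treat the result as a starting point to review rather than Datadog's reference policy.

```python
K8S_PREFIX = "k8s_secret@"


def parse_k8s_handle(handle):
    """Split k8s_secret@<namespace>/<secret_name>/<key> into its three parts."""
    if not handle.startswith(K8S_PREFIX):
        raise ValueError("not a k8s_secret handle")
    parts = handle[len(K8S_PREFIX):].split("/")
    if len(parts) != 3 or not all(parts):
        raise ValueError("expected <namespace>/<secret_name>/<key>")
    return tuple(parts)


def rbac_for_handle(handle, sa_name, sa_namespace):
    """Build a Role and RoleBinding granting get on just the referenced Secret."""
    namespace, secret_name, _key = parse_k8s_handle(handle)
    role_name = "datadog-read-" + secret_name  # hypothetical naming scheme
    role = {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "Role",
        "metadata": {"name": role_name, "namespace": namespace},
        "rules": [{
            "apiGroups": [""],
            "resources": ["secrets"],
            "resourceNames": [secret_name],  # limit access to this one Secret
            "verbs": ["get"],
        }],
    }
    binding = {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "RoleBinding",
        "metadata": {"name": role_name, "namespace": namespace},
        "subjects": [{
            "kind": "ServiceAccount",
            "name": sa_name,
            "namespace": sa_namespace,  # may differ from the Secret's namespace
        }],
        "roleRef": {
            "apiGroup": "rbac.authorization.k8s.io",
            "kind": "Role",
            "name": role_name,
        },
    }
    return role, binding
```

Both dictionaries can be serialized to JSON or YAML and applied with kubectl. Note that the Role and RoleBinding live in the Secret's namespace, while the subject points at the Agent's ServiceAccount in its own namespace.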

Datadog Operator considerations: If you deploy the Datadog Agent via the Datadog Operator, the operator has built-in support for secrets management. Recent versions (v1.11+) let you specify spec.global.secretBackend.command and args in the DatadogAgent spec, which correspond to DD_SECRET_BACKEND_COMMAND and arguments.

Best Practices for Secure Secret Handling

To recap, here are some best practices and additional tips for using Datadog Agent’s secret management effectively:

Use the ENC[...] notation consistently: Replace all plaintext secrets in your check configurations with ENC[...] handles.
Keep secrets out of logs and errors: Design your secret backend script so that it never prints sensitive data to stderr. If the script exits with a non-zero status, the Agent will log whatever was written to stderr to help you debug. That’s useful for error messages, but you wouldn’t want an error like “Vault token expired: ABCDE12345” leaking a token. Log generic errors and use exit codes judiciously.
Use the secret CLI for troubleshooting: The Datadog Agent has a built-in command datadog-agent secret (run inside the Agent container or on the host) that helps debug the secret setup. It will check the permissions of your secret_backend_command and list all secret handles it discovered in configs along with any decryption errors. If an integration isn’t appearing in Datadog, run this command to see if a secret failed to load (e.g., bad permissions or script error).
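One way to follow the advice about logs and errors is to funnel every failure through a sanitizer before anything reaches stderr. The Python sketch below is a hypothetical pattern, not part of the Agent: messages raised deliberately as SecretBackendError are treated as safe, and any other exception is reduced to its type name so that details such as tokens never reach the log.

```python
import sys


class SecretBackendError(Exception):
    """Error whose message was written to be safe to show in logs."""


def safe_message(exc):
    """Map any exception to a log-safe message without echoing its details."""
    if isinstance(exc, SecretBackendError):
        return str(exc)
    # Unknown exceptions may embed tokens or URLs, so only name the type.
    return "secret backend failed: " + type(exc).__name__


def run_safely(action, stderr=sys.stderr):
    """Run action(); on failure write a sanitized message and return exit code 1."""
    try:
        action()
    except Exception as exc:
        stderr.write(safe_message(exc) + "\n")
        return 1
    return 0
```

A script built this way would pass its main function to run_safely and exit with the returned code, so an error like the "Vault token expired" example above is logged only as a generic failure.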

Conclusion

Datadog Agent’s secrets management feature provides a robust way to keep sensitive credentials secure in both traditional and cloud-native environments. By using the ENC[] notation and configuring a secret_backend_command, you can integrate Datadog with enterprise-grade secret managers like Vault, cloud secrets services, and Kubernetes Secrets, ensuring that your API keys and passwords never reside in plain text in your configs. We’ve explored how this mechanism works under the hood, best practices for setup (from file permissions to using provided scripts), and examples of how real organizations are using it to protect everything from database passwords to cloud credentials.

Remember that security is a journey: implement the best practices, stay updated with Datadog’s releases (for any new features or fixes in this area), and always test your secret retrieval process in a safe environment. With the Datadog Agent’s secrets management and the guidelines outlined above, you can significantly reduce the risk of secret leaks and ensure your monitoring infrastructure adheres to your organization’s security standards.