[ARCHIVED] containerdbg: Automate container debugging tasks

Deprecation Notice

This project has been archived and is no longer actively maintained. It is a historical record of the codebase as it existed on May 13, 2025. The project remains accessible for reference, but no further contributions or ongoing support will be provided.

containerdbg is an all-in-one command-line tool to help debug Kubernetes containers with common issues that arise when moving to containers as part of legacy application modernization.

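Currently the tool looks for the following common issues, matching the sections of its analysis report: missing files, missing libraries, file moves that fail because of a Docker limitation, and failed network connections.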

Installation

Download the pre-compiled binaries:

VERSION=0.0.9
OS=linux
ARCH=amd64
tar xf containerdbg_${VERSION}_${OS}_${ARCH}.tar.gz
chmod +x containerdbg
sudo mv containerdbg /usr/local/bin

containerdbg components

Building

Check out BUILDING

Usage

This section shows the two main usage scenarios for containerdbg.

For a step-by-step walkthrough with an example application, please refer to the guide.

Analyzing a deployment yaml

This use case applies when you have a Kubernetes YAML file that contains a Deployment resource. If the file contains more than one Deployment resource, consider splitting it for simplicity.
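If a manifest bundles several resources separated by `---` lines, one way to split it is sketched below. The helper name `split_yaml_docs` and the `doc-N.yaml` output naming are illustrative assumptions, not part of containerdbg. The sketch splits on document boundaries only; it does not inspect `kind`, so you still need to pick the files that hold each Deployment.

```shell
#!/bin/sh
# split_yaml_docs FILE OUTDIR
# Hypothetical helper: writes each YAML document in FILE (documents are
# separated by lines containing only '---') to OUTDIR/doc-N.yaml.
split_yaml_docs() {
  input=$1
  outdir=$2
  mkdir -p "$outdir" || return 1
  awk -v dir="$outdir" '
    BEGIN { n = 1 }
    # A separator closes the current document, if it has any content.
    /^---[ \t]*$/ { if (seen) { close(out); n++; seen = 0 }; next }
    # Any other line belongs to the current document.
    { out = sprintf("%s/doc-%d.yaml", dir, n); print > out; seen = 1 }
  ' "$input"
}
```

For example, `split_yaml_docs app.yaml split/` would leave one file per document under `split/`, each of which can then be passed to containerdbg on its own.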

  1. First, run containerdbg debug -f <yaml file> -o record.pb. This applies all the resources in the YAML file and modifies the Deployment inside it so that it is debugged by containerdbg.
  2. The output should look something like:
Installing containerdbg node daemon
NAMESPACE   RESOURCE                                  ACTION        STATUS      RECONCILED  CONDITIONS                                AGE     MESSAGE
            Namespace/containerdbg-system             Unchanged     Current                 <None>                                    4s      Resource is current
containerd  DaemonSet/containerdbg-daemonset          Created       Current                 <None>                                    2s      All replicas scheduled as expected. Repl

Press Ctrl-C to finish the debugging session and download the collected report

At this point, you can work with your deployment for a while until some errors occur. Once you are done, press Ctrl-C in the terminal where you ran containerdbg to finish collecting information.

  3. You can now get a summary of the issues recorded in record.pb by running containerdbg analyze -f record.pb. This prints a short summary of what could have gone wrong during the execution of your container.

While executing the container the following files were missing:
===============================================================
/var/lib/dpkg/arch is missing
/var/lib/dpkg/triggers/File is missing

While executing the container the library type files were missing:
==================================================================

While executing the container the following files where attempted to be moved but failed to docker limitation:
==============================================================================================================
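The record-then-analyze steps above can be chained in a small wrapper. This is a minimal sketch that uses only the two commands and flags shown in this section; the function name `debug_and_analyze` and the `CONTAINERDBG` override variable are assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical wrapper; CONTAINERDBG may point at a different binary.
CONTAINERDBG=${CONTAINERDBG:-containerdbg}

# debug_and_analyze MANIFEST [RECORD]
debug_and_analyze() {
  manifest=$1
  record=${2:-record.pb}

  if [ ! -f "$manifest" ]; then
    echo "manifest not found: $manifest" >&2
    return 1
  fi

  # Catch Ctrl-C with a no-op handler so that this shell survives it.
  # Caught (not ignored) signals are reset in child processes, so
  # containerdbg still receives the interrupt and ends the session.
  trap : INT
  "$CONTAINERDBG" debug -f "$manifest" -o "$record"
  trap - INT

  # Analyze only if a non-empty record file was written.
  if [ ! -s "$record" ]; then
    echo "no record written to $record" >&2
    return 1
  fi
  "$CONTAINERDBG" analyze -f "$record"
}
```

Calling `debug_and_analyze app.yaml` would record until you press Ctrl-C and then print the summary. The trap is there because Ctrl-C is delivered to the whole foreground process group, and without it the wrapper itself could exit before the analyze step runs.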

Analyzing a container image

If you don't have a Kubernetes YAML file and simply want to test an image, run containerdbg debug <image> -o record.pb. As in the previous section, the output should look like:

Installing containerdbg node daemon
NAMESPACE   RESOURCE                                  ACTION        STATUS      RECONCILED  CONDITIONS                                AGE     MESSAGE
            Namespace/containerdbg-system             Unchanged     Current                 <None>                                    4s      Resource is current
containerd  DaemonSet/containerdbg-daemonset          Created       Current                 <None>                                    2s      All replicas scheduled as expected. Repl

Press Ctrl-C to finish the debugging session and download the collected report

At this point, you can work with your deployment for a while until some errors occur. Once you are done, press Ctrl-C in the terminal where you ran containerdbg to finish collecting information.

You can now get a summary of the issues recorded in record.pb by running containerdbg analyze -f record.pb. This prints a short summary of what could have gone wrong during the execution of your container.

While executing the container the following files were missing:
===============================================================
/usr/local/tomcat/work/Catalina/localhost/petclinic/SESSIONS.ser is missing

While executing the container the library type files were missing:
==================================================================

While executing the container the following files where attempted to be moved but failed to docker limitation:
==============================================================================================================

While executing the container the following connections failed:
==============================================================================================================
10.108.0.105:5432

In this example we can see a missing file that has no real importance to the application, and a failed connection to a Postgres DB (10.108.0.105:5432), which might be the reason the application is failing.
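When a report is long, it can help to pull out a single section, such as the failed connections. The sketch below assumes the summary was saved to a text file (for instance by redirecting the output of containerdbg analyze, which assumes the summary is written to standard output) and that it follows the layout shown above: a heading line, a line of `=` characters, then one entry per line. The function name `report_section` is hypothetical.

```shell
#!/bin/sh
# report_section REPORT_FILE HEADING_TEXT
# Hypothetical helper: prints the entries listed under the first heading
# that contains HEADING_TEXT, stopping at the next blank line.
report_section() {
  awk -v want="$2" '
    index($0, want)       { in_head = 1; next }               # heading found
    in_head && /^=+$/     { in_head = 0; in_body = 1; next }  # underline
    in_head               { in_head = 0 }                     # not a heading
    in_body && /^[ \t]*$/ { exit }                            # section ends
    in_body               { print }
  ' "$1"
}
```

With the report above saved as report.txt, `report_section report.txt "connections failed"` would list the addresses that could not be reached, which you can then check from inside the cluster.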

Troubleshooting

If you see the following error:

program sys_enter_open: apply CO-RE relocations: no BTF found for kernel version <version>: not supported

It means the kernel on your cluster nodes does not provide BTF support. To resolve this, download the BTF file that matches <version> from https://github.com/aquasecurity/btfhub-archive/ and then do the following:

  1. Extract the downloaded file using tar xf <filename>.
  2. Copy the resulting .btf file into the btf-install folder.
  3. Run export TARGET_REPO=<repo name>, where <repo name> is an image registry accessible to your cluster.
  4. Run make install-btf.
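The steps above can be sketched as a script. It assumes you run it from the root of the containerdbg repository (where the install-btf make target lives) and that the archive holds the .btf file at its top level, so extracting directly into btf-install covers steps 1 and 2. The names `install_btf`, `has_kernel_btf`, and the `MAKE` override are hypothetical. `has_kernel_btf` relies on the fact that kernels built with BTF expose it at /sys/kernel/btf/vmlinux; it is only meaningful when run on a cluster node.

```shell
#!/bin/sh
# has_kernel_btf [PATH]
# Succeeds if the running kernel exposes BTF. Run this on a cluster node.
has_kernel_btf() {
  [ -r "${1:-/sys/kernel/btf/vmlinux}" ]
}

# install_btf ARCHIVE REPO
# ARCHIVE: file downloaded from btfhub-archive for your kernel version.
# REPO:    image registry accessible to your cluster.
install_btf() {
  archive=$1
  repo=$2

  if [ ! -f "$archive" ]; then
    echo "archive not found: $archive" >&2
    return 1
  fi
  if [ -z "$repo" ]; then
    echo "usage: install_btf ARCHIVE REPO" >&2
    return 1
  fi

  # Steps 1 and 2: extract so the .btf file ends up in btf-install/.
  mkdir -p btf-install || return 1
  tar xf "$archive" -C btf-install || return 1

  # Step 3: tell the build which registry to use.
  TARGET_REPO=$repo
  export TARGET_REPO

  # Step 4: run the make target (MAKE is an optional override).
  "${MAKE:-make}" install-btf
}
```

A call would look like `install_btf <filename> <repo name>`, with both placeholders taken from the steps above.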

The program should now run successfully.

Technical background

The tool works by first deploying a workload, either from a YAML file or from a container image. It can also connect to an existing deployment.

It then uses eBPF and a sidecar to instrument the workload while it runs, so you can use the workload as usual.

When you are done, you finish the debugging session and the tool collects the recorded data. It then analyzes that data and displays a report of the issues found.

Contributing

Contributions are welcome, see CONTRIBUTING

Community

Come and ask us questions

Thanks