Jaeger Agent

This article describes how EverQuote uses the Jaeger Agent (among others) to receive traces from instrumented workloads and forward them to the Jaeger backend.

Disclaimer: Just as in the previous article, this particular setup should not be construed as an endorsement or recommendation. Use your own judgment before adopting anything covered by this article.

It is possible to accomplish what I describe here using the Jaeger Operator. However, my goal is to show how to add and configure a Jaeger Agent sidecar container “by hand” so you can infer from that how to configure other agents, such as the OpenTelemetry Collector.

Also, the Jaeger Operator is designed to deploy the entire Jaeger stack, so deploying it to all of our Kubernetes clusters just for Jaeger Agent sidecar injection seems like overkill.

Assumptions

I will assume that we are sending traces to a Jaeger Backend such as the one I described in the previous article.

Specifically, I will assume that the Jaeger Collector ingress requires TLS client authentication.

I will also assume that we are using cert-manager to automatically issue/renew certs for Jaeger Agent containers.

Private CA

TLS client authentication really only makes sense when you control the Certificate Authorities whose client certificates you trust. There are ways to selectively trust client certificates from public CAs but that’s beyond the scope of this article.

So the first thing we’re going to do is create a private CA that can issue TLS [client] certificates.

$ cat <<EOF > /tmp/openssl.cnf
[ req ]
string_mask = utf8only
default_md = sha256
policy = policy_match
distinguished_name = req_distinguished_name
x509_extensions = v3_ca

[ policy_match ]
commonName = supplied

[ req_distinguished_name ]
commonName =

[ v3_ca ]
basicConstraints = critical, CA:true, pathlen:0
keyUsage = critical, cRLSign, keyCertSign
subjectKeyIdentifier = hash
EOF

$ openssl req \
    -new \
    -x509 \
    -config /tmp/openssl.cnf \
    -subj "/CN=My Private CA @ my-kubernetes-cluster" \
    -days 3650 \
    -newkey rsa:4096 \
    -keyout my-private-ca.key \
    -nodes \
    -out my-private-ca.crt

⚠ Note: Whole books have been written about Public Key Infrastructures (PKI), so I will not go into details here. The only thing I want to point out is that the CA created by the above commands cannot be used to create subordinate CAs (that is what pathlen:0 enforces)… and that’s a deliberate design choice for our use case.
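
If you want to double-check those constraints, you can inspect the certificate we just created and look for CA:TRUE, pathlen:0 under “X509v3 Basic Constraints” (the exact formatting may vary with your OpenSSL version):

$ openssl x509 -in my-private-ca.crt -noout -text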

Using the CA cert and key we can create a ClusterIssuer so cert-manager can issue/renew client certs automatically.

$ kubectl config use-context my-kubernetes-cluster

$ kubectl create secret tls my-private-ca \
    --namespace=kube-system \
    --cert=my-private-ca.crt \
    --key=my-private-ca.key

$ kubectl apply -f - <<EOF
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: my-private-ca
spec:
  ca:
    secretName: my-private-ca
EOF

⚠ Note: The Secret with the CA cert and key for a ClusterIssuer must be located in the kube-system namespace.

Check your work.

$ kubectl get clusterissuer my-private-ca
NAME            READY   AGE
my-private-ca   True    32m

OK. Now we’re all set for everyone in our Kubernetes cluster to request TLS client certs using cert-manager’s Certificate custom resource.

apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: my-workload
  namespace: my-namespace
spec:
  commonName: my-workload@my-namespace
  secretName: my-workload-jaeger-agent-tls
  duration: 9600h # 400d
  renewBefore: 4800h # 200d
  issuerRef:
    name: my-private-ca
    kind: ClusterIssuer
  usages:
  - client auth
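
As a quick sanity check (assuming the Certificate above was applied to my-namespace), you can confirm that cert-manager has issued the cert and created the referenced Secret; the Certificate should report READY True:

$ kubectl -n my-namespace get certificate my-workload
$ kubectl -n my-namespace get secret my-workload-jaeger-agent-tls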

Example Workload

Let’s assume you have the following workload in Kubernetes.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: example-hotrod
spec:
  selector:
    matchLabels:
      app.kubernetes.io/name: example-hotrod
  template:
    metadata:
      labels:
        app.kubernetes.io/name: example-hotrod
    spec:
      containers:
      - name: example-hotrod
        image: jaegertracing/example-hotrod:latest
        args:
        - all
        resources:
          limits:
            cpu: 100m
            memory: 128Mi
          requests:
            cpu: 50m
            memory: 64Mi


⚠ Note: The above is a real application that can be deployed for testing purposes. To interact with it, run kubectl port-forward deploy/example-hotrod 8080:8080 and then open http://[::1]:8080/ in your browser.
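
If you prefer the command line over a browser, you can also generate a few traces with curl while the port-forward is running (123 is one of the sample customer IDs baked into the HotROD frontend):

$ curl "http://localhost:8080/dispatch?customer=123"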

⚠ Note: You can add the arg --jaeger-ui=https://jaeger (replace with the actual DNS name of your Jaeger Query Ingress) so that the “find trace” links on the web frontend take you straight to your traces in the Jaeger Backend.

The Jaeger client libraries support configuration via environment variables. So you can add the following container configuration to emit log messages containing trace IDs.

        env:
        - name: JAEGER_REPORTER_LOG_SPANS
          value: "true"

Sidecar Container

Let’s add a Jaeger Agent sidecar container to the above Deployment.

      - name: jaeger-agent
        image: jaegertracing/jaeger-agent:1.21
        args:
        - --reporter.grpc.host-port=dns:///jaeger-collector:443
        resources:
          limits:
            cpu: 100m
            memory: 128Mi
          requests:
            cpu: 50m
            memory: 64Mi


⚠ Note: Just as in the previous article, I am using an abbreviated, unqualified hostname here as a stand-in for an actual hostname: jaeger-collector. Be sure to update this with the real DNS name for your Jaeger Collector Ingress.
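
For reference: the app container reaches this sidecar over localhost, so you do not have to declare any ports for span ingestion to work. If you like self-documenting manifests, you can still list the agent’s default listening port for the compact Thrift protocol (6831/UDP, which most clients use by default):

        ports:
        - name: jaeger-compact
          protocol: UDP
          containerPort: 6831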

IMPORTANT: The Jaeger Agent sidecar container image is pulled from Docker Hub. Since Docker Hub started rate limiting image pulls for the free tier, adding this sidecar container may put your workload at risk of landing in the ImagePullBackOff state.

If you have Docker Hub credentials for a paid subscription, you should add them to your workload (or its associated ServiceAccount).

      imagePullSecrets:
      - name: dockerhub-credentials
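
In case you have not created that Secret yet, it looks something like this (username and password are placeholders, obviously):

$ kubectl create secret docker-registry dockerhub-credentials \
    --docker-username=<your-username> \
    --docker-password=<your-password>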

Alternatively, you could look into setting up a “pull-through” registry that caches the image.

TLS Authentication

Since our Jaeger Backend does not accept unauthenticated traffic, we have to request a TLS client cert from the private CA we created above.

apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: example-hotrod
spec:
  commonName: example-hotrod@my-namespace
  secretName: example-hotrod-tls
  duration: 9600h # 400d
  renewBefore: 4800h # 200d
  issuerRef:
    name: my-private-ca
    kind: ClusterIssuer
  usages:
  - client auth

⚠ Note: The Jaeger Agent container only reads the certificate at startup and will not notice when a renewed certificate becomes available. If you think your workload could run for more than 400 days without being restarted / evicted (which is very unlikely in a Kubernetes environment), feel free to create a Certificate with a longer duration.
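
If you ever wonder how close the currently issued cert is to expiring, you can pull it out of the Secret and ask openssl (assuming the Certificate above has been applied):

$ kubectl get secret example-hotrod-tls \
    -o jsonpath='{.data.tls\.crt}' | base64 -d \
    | openssl x509 -noout -enddate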

Now we can use the resulting certificate in our Jaeger Agent sidecar container with the following additional config.

        args:
        - --reporter.grpc.tls.enabled=true
        - --reporter.grpc.tls.cert=/var/run/jaeger-agent/tls/tls.crt
        - --reporter.grpc.tls.key=/var/run/jaeger-agent/tls/tls.key
        volumeMounts:
        - mountPath: /var/run/jaeger-agent/tls
          name: tls
          readOnly: true
      volumes:
      - name: tls
        secret:
          defaultMode: 420
          secretName: example-hotrod-tls
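
To verify that the agent can actually reach the Jaeger Collector over TLS, check the sidecar’s logs once the Pod is up; any connection or handshake errors will show up there:

$ kubectl logs deploy/example-hotrod -c jaeger-agent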

Tags

In order to make it easier for us to find our traces in Jaeger, we can attach useful tags to them (both in our code and the Jaeger Agent). Just add the following additional config to the Jaeger Agent sidecar container.

        args:
        - --agent.tags=pod.name=${POD_NAME:},pod.namespace=${POD_NAMESPACE:}
        env:
        - name: POD_NAMESPACE
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: metadata.namespace
        - name: POD_NAME
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: metadata.name
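
Once these tags flow through, you can narrow searches in the Jaeger UI by entering a tag query such as the following (substitute whatever namespace your workload actually runs in):

pod.namespace=my-namespace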

Metrics

In order to see what your Jaeger Agent sidecar is up to, you can expose a metrics/admin endpoint for it and have Prometheus scrape it.

Add the following additional config to the Jaeger Agent sidecar container.

        ports:
        - name: admin-http
          protocol: TCP
          containerPort: 14271
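
You can eyeball the raw metrics with another port-forward; 14271 is the agent’s admin port, which serves /metrics (run the curl in a second terminal):

$ kubectl port-forward deploy/example-hotrod 14271:14271
$ curl http://localhost:14271/metrics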

If you have Prometheus Operator deployed in your Kubernetes cluster, you can create a PodMonitor to tell Prometheus to start scraping your Jaeger Agent.

apiVersion: monitoring.coreos.com/v1
kind: PodMonitor
metadata:
  name: example-hotrod
spec:
  podMetricsEndpoints:
  - port: admin-http
    honorLabels: true
  namespaceSelector:
    matchNames:
    - my-namespace
  selector:
    matchLabels:
      app.kubernetes.io/name: example-hotrod

Putting it all together...

With all of the above additions we end up with the following Kubernetes resources.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: example-hotrod
spec:
  selector:
    matchLabels:
      app.kubernetes.io/name: example-hotrod
  template:
    metadata:
      labels:
        app.kubernetes.io/name: example-hotrod
    spec:
      containers:
      - name: example-hotrod
        image: jaegertracing/example-hotrod:latest
        args:
        - all
        env:
        - name: JAEGER_REPORTER_LOG_SPANS
          value: "true"
        resources:
          limits:
            cpu: 100m
            memory: 128Mi
          requests:
            cpu: 50m
            memory: 64Mi
      - name: jaeger-agent
        image: jaegertracing/jaeger-agent:1.21
        args:
        - --reporter.grpc.host-port=dns:///jaeger-collector:443
        - --reporter.grpc.tls.enabled=true
        - --reporter.grpc.tls.cert=/var/run/jaeger-agent/tls/tls.crt
        - --reporter.grpc.tls.key=/var/run/jaeger-agent/tls/tls.key
        - --agent.tags=pod.name=${POD_NAME:},pod.namespace=${POD_NAMESPACE:}
        env:
        - name: POD_NAMESPACE
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: metadata.namespace
        - name: POD_NAME
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: metadata.name
        ports:
        - name: admin-http
          protocol: TCP
          containerPort: 14271
        resources:
          limits:
            cpu: 100m
            memory: 128Mi
          requests:
            cpu: 50m
            memory: 64Mi
        volumeMounts:
        - mountPath: /var/run/jaeger-agent/tls
          name: tls
          readOnly: true
      imagePullSecrets:
      - name: dockerhub-credentials
      volumes:
      - name: tls
        secret:
          defaultMode: 420
          secretName: example-hotrod-tls

---

apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: example-hotrod
spec:
  commonName: example-hotrod@my-namespace
  secretName: example-hotrod-tls
  duration: 9600h # 400d
  renewBefore: 4800h # 200d
  issuerRef:
    name: my-private-ca
    kind: ClusterIssuer
  usages:
  - client auth

---

apiVersion: monitoring.coreos.com/v1
kind: PodMonitor
metadata:
  name: example-hotrod
spec:
  podMetricsEndpoints:
  - port: admin-http
    honorLabels: true
  namespaceSelector:
    matchNames:
    - my-namespace
  selector:
    matchLabels:
      app.kubernetes.io/name: example-hotrod

And that’s it!

When your workload emits traces, you will be able to find them in the Jaeger Backend.

If you have configured Jaeger as a data source in Grafana, you can also find your traces there.