Despite using memory_limiter processor, Kubernetes pod getting OOM killed #42312

@tanishsummit

Description

Component(s)

exporter/awskinesis, cmd/otelcontribcol

Describe the issue you're reporting

I am using the OpenTelemetry Collector's memory_limiter processor to prevent collector pods from getting OOM killed, and I am analyzing the performance of the collector setup (dropped/refused traces). Below is my collector configuration:

  extensions:
    sigv4auth/aws:
    health_check:
      endpoint: ${MY_POD_IP}:13133

  receivers:
    otlp:
      protocols:
        grpc:
          endpoint: 0.0.0.0:4317
        http:
          endpoint: 0.0.0.0:4318
          cors:
            allowed_origins:
              - "http://*"
              - "https://*"
    
    prometheus:
      config:
        scrape_configs:
        - job_name: otel-collector-metrics
          scrape_interval: 10s
          static_configs:
          - targets:
            - '${MY_POD_IP}:8888'

  processors:
    batch:
      timeout: 1s
      send_batch_size: 2048
      send_batch_max_size: 3000
    memory_limiter:
      check_interval: 1s
      limit_mib: 1700
      spike_limit_mib: 200

  exporters:
    debug: {}
    prometheusremotewrite/aws:
      endpoint: https://<amazon_managed_prometheus_url>/api/v1/remote_write
      auth:
        authenticator: sigv4auth/aws
    awskinesis:
      aws:
        stream_name: "${KINESIS_STREAM_NAME}"
        region: us-east-2
      encoding:
        name: otlp_json
  
  service:
    extensions: [sigv4auth/aws, health_check]
    pipelines:
      traces:
        receivers: [otlp]
        processors: [batch,memory_limiter]
        exporters: [debug,awskinesis]
      metrics:
        receivers: [prometheus]
        processors: []
        exporters: [prometheusremotewrite/aws]
    telemetry:
      metrics:
        address: ${MY_POD_IP}:8888
        level: detailed
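
For context (per the memory_limiter README, not anything specific to this setup): the processor derives a soft limit as limit_mib - spike_limit_mib; above the soft limit it starts refusing incoming data, and above the hard limit (limit_mib) it additionally forces garbage collection. Annotated with the values above:

  processors:
    memory_limiter:
      check_interval: 1s    # how often heap usage is checked
      limit_mib: 1700       # hard limit: above ~1700 MiB, GC is forced in addition to refusing data
      spike_limit_mib: 200  # soft limit = 1700 - 200 = 1500 MiB: data starts being refused above this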

Collector version - 0.130.1 (contrib)

Pod requests/limits:

resources:
  requests:
    cpu: 2
    memory: 2000Mi
  limits:
    cpu: 2
    memory: 2000Mi
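
A rough headroom calculation against the container limit (a back-of-the-envelope sketch, assuming the limiter only bounds heap usage):

# values taken from the configuration above
# container memory limit:            2000 MiB
# memory_limiter limit_mib (hard):   1700 MiB
# soft limit (1700 - 200):           1500 MiB
# headroom for non-heap memory:       300 MiB (2000 - 1700)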

Below is the Grafana dashboard of collector metrics for the period in which the pod got OOM killed:

[screenshot: Grafana dashboard of collector metrics]
