### Component(s)

exporter/awskinesis, cmd/otelcontribcol
### Describe the issue you're reporting
I am using the OpenTelemetry Collector's memory_limiter processor to prevent the collector pods from getting OOMKilled, and I am analyzing the performance of the collector setup (dropped/refused traces). Below is my collector configuration:
```yaml
extensions:
  sigv4auth/aws:
  health_check:
    endpoint: ${MY_POD_IP}:13133

receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
      http:
        endpoint: 0.0.0.0:4318
        cors:
          allowed_origins:
            - "http://*"
            - "https://*"
  prometheus:
    config:
      scrape_configs:
        - job_name: otel-collector-metrics
          scrape_interval: 10s
          static_configs:
            - targets:
                - '${MY_POD_IP}:8888'

processors:
  batch:
    timeout: 1s
    send_batch_size: 2048
    send_batch_max_size: 3000
  memory_limiter:
    check_interval: 1s
    limit_mib: 1700
    spike_limit_mib: 200

exporters:
  debug: {}
  prometheusremotewrite/aws:
    endpoint: https://<amazon_managed_prometheus_url>/api/v1/remote_write
    auth:
      authenticator: sigv4auth/aws
  awskinesis:
    aws:
      stream_name: "${KINESIS_STREAM_NAME}"
      region: us-east-2
    encoding:
      name: otlp_json

service:
  extensions: [sigv4auth/aws, health_check]
  pipelines:
    traces:
      receivers: [otlp]
      processors: [batch, memory_limiter]
      exporters: [debug, awskinesis]
    metrics:
      receivers: [prometheus]
      processors: []
      exporters: [prometheusremotewrite/aws]
  telemetry:
    metrics:
      address: ${MY_POD_IP}:8888
      level: detailed
```
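For reference, the memory_limiter README recommends placing it first in each pipeline's processors list, immediately after the receivers, so that back-pressure is applied before data reaches the batch processor. A minimal sketch of that ordering, reusing the component names from the configuration above:

```yaml
# Sketch only: memory_limiter placed ahead of batch, per the memory_limiter
# README's recommendation. Component names are taken from the config above.
service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [memory_limiter, batch]
      exporters: [debug, awskinesis]
```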
Collector version: 0.130.1 (contrib)
Pod request/limits:

```yaml
resources:
  requests:
    cpu: 2
    memory: 2000Mi
  limits:
    cpu: 2
    memory: 2000Mi
```
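For context, a rough sizing check derived from the values above (assuming the soft/hard limit semantics documented in the memory_limiter README, where the soft limit is `limit_mib - spike_limit_mib`):

```yaml
# Derived thresholds (sketch based on the memory_limiter README's definitions):
# soft limit = limit_mib - spike_limit_mib = 1700 - 200 = 1500 MiB
# hard limit = limit_mib                   = 1700 MiB
# pod limit  = 2000Mi                      = 2000 MiB
# headroom above the hard limit            = ~300 MiB for Go runtime overhead
#                                            and memory the limiter does not track
```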
Below is the Grafana dashboard of the collector's internal metrics for the period during which the pod was OOMKilled: