caylent/enlyte-observability

<<<<<<< HEAD

Enlyte Observability Project

Automated AWS logging compliance and remediation system that monitors CloudTrail events and ensures logging is enabled across various AWS resources.

Overview

This project provides an event-driven architecture that:

  • Monitors AWS CloudTrail for resource creation/modification events
  • Automatically checks if logging is enabled for the resource
  • Uses cross-account IAM roles for secure multi-account access
  • Supports multiple AWS resource types (EC2, EKS, ELB, S3, VPC)

Project Structure

enlyte-observability/
├── cms-v2-lambdas/              # Lambda function code
│   ├── code/                    # Python source code
│   ├── layers/                  # Lambda layer dependencies
│   └── Makefile                 # Build automation
├── cms-v2-terraform/            # Terraform infrastructure
│   └── stacks/
│       ├── base/                # Shared infrastructure (S3, KMS)
│       └── observability/       # Observability stack
│           └── logging/         # Logging Lambda and EventBridge
├── cms-v2-terraform-modules/    # Reusable Terraform modules
│   ├── eventbridge-rules/       # EventBridge rule module
│   ├── kms/                     # KMS key module
│   ├── lambda/                  # Lambda function module
│   └── s3/                      # S3 bucket module
└── docs/                        # Resource-specific documentation
    ├── ec2/                     # EC2 logging documentation
    ├── eks/                     # EKS logging documentation
    ├── elb/                     # ELB logging documentation
    ├── s3/                      # S3 logging documentation
    └── vpc/                     # VPC logging documentation

Quick Start

Prerequisites

  • Python 3.12.3 (via pyenv)
  • Terraform >= 1.0
  • AWS CLI configured with appropriate credentials
  • Make utility

=======

Enlyte Observability Infrastructure

Automated AWS observability solution that captures CloudTrail events via EventBridge and enables logging for AWS services (ELB, EC2, RDS, VPC, S3) using serverless Lambda functions.

🏗️ Architecture Overview

CloudTrail Events → EventBridge Rules → Lambda Functions → Enable Service Logging
                                               ↓
                                        Lambda Layers
                                        (Dependencies)
                                               ↓
                                    CloudWatch Logs (30-day retention)

Key Features

  • Event-Driven Automation: Automatically enables logging when AWS resources are created
  • Multi-Service Support: ELB (ALB/NLB), EC2, RDS, VPC, S3 logging enablement
  • Modular Design: Reusable Terraform modules for Lambda, EventBridge, KMS, and S3
  • Security First: KMS encryption for all sensitive data, IAM least privilege
  • Local Development: Run and test Lambda functions locally with sample events
  • CI/CD Ready: S3-based deployment with automatic change detection via ETag

📁 Repository Structure

enlyte-observability/
├── cms-v2-lambdas/              # Lambda function source code
│   ├── code/
│   │   ├── src/
│   │   │   ├── common/          # Shared utilities (events, security, utils)
│   │   │   ├── ec2_logs_enabler/
│   │   │   ├── elb_logs_enabler/
│   │   │   ├── process_events_enabler/
│   │   │   ├── rds_logs_enabler/
│   │   │   ├── s3_logs_enabler/
│   │   │   └── vpc_logs_enabler/
│   │   ├── tests/               # Sample CloudTrail events
│   │   ├── main_*.py            # Lambda entry points
│   │   └── launch.json          # VS Code debug configuration
│   ├── layers/                  # Lambda layer dependencies
│   │   └── requirements.txt
│   ├── dist/                    # Built Lambda packages (auto-generated)
│   └── Makefile                 # Build automation
│
├── cms-v2-terraform-modules/    # Reusable Terraform modules
│   ├── eventbridge-rules/       # EventBridge rules and targets
│   ├── kms/                     # KMS key management
│   ├── lambda/                  # Lambda functions and layers
│   ├── s3/                      # S3 bucket configuration
│   └── terraform-aws-s3-bucket/ # S3 bucket wrapper
│
├── cms-v2-terraform/            # Terraform infrastructure stacks
│   └── stacks/
│       ├── base/                # Base infrastructure
│       └── observability/
│           ├── logging/         # Logging stack (EventBridge + Lambda)
│           └── metrics/         # Metrics stack (future)
│
└── docs/                        # Service-specific documentation
    ├── ec2/
    ├── elb/
    ├── rds/
    ├── s3/
    └── vpc/

🚀 Quick Start

Prerequisites

  • Terraform >= 1.0
  • AWS CLI configured with appropriate credentials
  • Python 3.12+
  • Make (for building Lambda packages)
  • CloudTrail enabled in target AWS region

>>>>>>> origin/main

1. Build Lambda Functions

cd cms-v2-lambdas

<<<<<<< HEAD
# Install dependencies locally
make python-packages

# Build Lambda layer
make layers

# Package Lambda function
make process_events

# Upload to S3 (optional)
make copy-s3

=======

# Build all components
make layers          # Build Python dependencies layer
make process_events  # Build main event processor
make ec2             # Build EC2 logs enabler
make elb             # Build ELB logs enabler

# Or clean and rebuild everything
make clean
make layers process_events


Built packages will be in cms-v2-lambdas/dist/:

  • dependencies_layer.zip
  • process_events_lambda.zip
  • ec2_lambda.zip
  • elb_lambda.zip

>>>>>>> origin/main
2. Deploy Infrastructure

cd cms-v2-terraform/stacks/observability/logging

# Initialize Terraform
terraform init

<<<<<<< HEAD
# Plan deployment
terraform plan -var-file=environments/dev_us-east-1.tfvars

# Apply infrastructure
terraform apply -var-file=environments/dev_us-east-1.tfvars

Architecture

Event Flow

  1. CloudTrail captures AWS API calls for resource creation/modification
  2. EventBridge rule filters relevant events and triggers Lambda
  3. Lambda function:
    • Parses the event to identify resource type
    • Assumes cross-account IAM role (logging-remediation)
    • Checks if logging is enabled for the resource
    • Returns result (success if logging disabled, error if enabled)

Supported Resources

| Resource Type | Logging Type       | Check Method                      |
|---------------|--------------------|-----------------------------------|
| EC2           | CloudWatch Agent   | SSM command to check agent status |
| EKS           | Control Plane Logs | API logging configuration         |
| ELB (ALB/NLB) | Access Logs        | Load balancer attributes          |
| S3            | Server Access Logs | Bucket logging configuration      |
| VPC           | Flow Logs          | VPC flow log status               |
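
To make the check methods above concrete, here is a minimal sketch of how the API responses could be interpreted. The function names are hypothetical and not taken from the project; each function takes the dictionary returned by the boto3 call named in its comment, so the logic can be tested without AWS access. The EC2 check is omitted because it depends on the output of an SSM command.

```python
from typing import Any


def s3_logging_enabled(response: dict[str, Any]) -> bool:
    """Interpret s3.get_bucket_logging(Bucket=...).

    The "LoggingEnabled" key is present only when server access
    logging is configured on the bucket.
    """
    return "LoggingEnabled" in response


def elb_access_logs_enabled(response: dict[str, Any]) -> bool:
    """Interpret elbv2.describe_load_balancer_attributes(LoadBalancerArn=...)."""
    attributes = {a["Key"]: a["Value"] for a in response.get("Attributes", [])}
    # Attribute values are returned as strings, not booleans.
    return attributes.get("access_logs.s3.enabled") == "true"


def vpc_flow_logs_enabled(response: dict[str, Any]) -> bool:
    """Interpret ec2.describe_flow_logs(...) filtered to one VPC."""
    return any(
        flow_log.get("FlowLogStatus") == "ACTIVE"
        for flow_log in response.get("FlowLogs", [])
    )


def eks_control_plane_logging_enabled(response: dict[str, Any]) -> bool:
    """Interpret eks.describe_cluster(name=...)."""
    entries = response["cluster"].get("logging", {}).get("clusterLogging", [])
    # Enabled if at least one entry is switched on and lists log types.
    return any(bool(e.get("enabled")) and bool(e.get("types")) for e in entries)
```

Keeping the interpretation separate from the API call is a design choice for this sketch only; the project's handlers may combine both steps.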

Cross-Account Access

The Lambda function uses STS AssumeRole to access resources in other AWS accounts:

Lambda (Account A) 
  ↓ AssumeRole
Remediation Role (Account B)
  ↓ Use temporary credentials
AWS Resources (Account B)
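
The role assumption shown above could look roughly like the following sketch. It assumes the commercial `aws` partition, and the session name and helper names are hypothetical. The `sts` argument stands for a boto3 STS client (`boto3.client("sts")`); it is passed in so the helper can be exercised with a stub.

```python
from dataclasses import dataclass
from typing import Any, Protocol


class StsClient(Protocol):
    """The one method of the boto3 STS client that this sketch relies on."""

    def assume_role(self, *, RoleArn: str, RoleSessionName: str) -> dict[str, Any]: ...


@dataclass(frozen=True)
class TemporaryCredentials:
    access_key_id: str
    secret_access_key: str
    session_token: str


def remediation_role_arn(account_id: str, role_name: str = "logging-remediation") -> str:
    """Build the ARN of the remediation role in the target account."""
    return f"arn:aws:iam::{account_id}:role/{role_name}"


def assume_remediation_role(
    sts: StsClient, account_id: str, role_name: str = "logging-remediation"
) -> TemporaryCredentials:
    """Assume the role in Account B and return its temporary credentials."""
    response = sts.assume_role(
        RoleArn=remediation_role_arn(account_id, role_name),
        RoleSessionName="logging-compliance-check",  # hypothetical session name
    )
    creds = response["Credentials"]
    return TemporaryCredentials(
        access_key_id=creds["AccessKeyId"],
        secret_access_key=creds["SecretAccessKey"],
        session_token=creds["SessionToken"],
    )
```

The returned values can then be passed to `boto3.client(...)` as `aws_access_key_id`, `aws_secret_access_key`, and `aws_session_token` to create clients that act on resources in Account B.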

Components

Lambda Functions (cms-v2-lambdas/)

Event-driven Python 3.12 Lambda that processes CloudTrail events and checks logging status.

Key Features:

  • Abstract base classes for extensibility
  • Cross-account role assumption
  • Support for Linux and Windows EC2 instances
  • Comprehensive error handling and logging

See cms-v2-lambdas/README.md for detailed documentation.

Terraform Infrastructure (cms-v2-terraform/)

Infrastructure as Code for deploying the observability system.

Components:

  • EventBridge rules for CloudTrail event filtering
  • Lambda function and layer deployment
  • IAM roles for Lambda execution and cross-account access
  • S3 bucket for Lambda artifacts
  • KMS keys for encryption

Terraform Modules (cms-v2-terraform-modules/)

Reusable Terraform modules for common AWS resources:

  • eventbridge-rules/ - EventBridge rule creation
  • kms/ - KMS key management
  • lambda/ - Lambda function and layer deployment
  • s3/ - S3 bucket with security best practices

Documentation (docs/)

Resource-specific documentation for each supported AWS service.

Environment Variables

The Lambda requires these environment variables:

  • REMEDIATION_ROLE_NAME - IAM role name for cross-account access (default: logging-remediation)
  • AWS_REGION - AWS region where Lambda is deployed
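
As an illustration, the two variables could be read as in the sketch below. `Settings` and `load_settings` are hypothetical names, not part of the project. Note that the Lambda runtime sets `AWS_REGION` itself, so only `REMEDIATION_ROLE_NAME` normally needs to be configured on the function; local runs have to export `AWS_REGION` manually.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    remediation_role_name: str
    aws_region: str


def load_settings(environ: Mapping[str, str] = os.environ) -> Settings:
    """Read configuration, applying the documented default role name."""
    region = environ.get("AWS_REGION")
    if not region:
        # Fail early with a clear message instead of a later boto3 error.
        raise RuntimeError("AWS_REGION is not set")
    return Settings(
        remediation_role_name=environ.get("REMEDIATION_ROLE_NAME", "logging-remediation"),
        aws_region=region,
    )
```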

IAM Permissions

Lambda Execution Role

Required permissions:

  • Basic Lambda execution (logs:CreateLogGroup, logs:CreateLogStream, logs:PutLogEvents)
  • STS AssumeRole for the remediation role

Remediation Role

Required permissions (assumed by Lambda):

  • S3: s3:GetBucketLogging, s3:PutBucketLogging
  • EKS: eks:DescribeCluster, eks:UpdateClusterConfig
  • EC2: ec2:DescribeInstances, ec2:DescribeFlowLogs, ec2:CreateFlowLogs
  • ELB: elasticloadbalancing:DescribeLoadBalancers, elasticloadbalancing:DescribeLoadBalancerAttributes
  • VPC: ec2:DescribeFlowLogs, ec2:DescribeVpcs
  • SSM: ssm:SendCommand, ssm:GetCommandInvocation (for EC2 CloudWatch Agent checks)

Development

Local Testing

Use the VS Code debugger to test the Lambda locally:

  1. Open cms-v2-lambdas/code/main_process_events.py
  2. Set breakpoints as needed
  3. Press F5 to start debugging
  4. Test events are in cms-v2-lambdas/code/tests/

Adding New Resource Types

  1. Create handler class in cms-v2-lambdas/code/src/common/ (e.g., rds.py)
  2. Implement BaseResources interface:
    • is_event() - Check if event matches resource
    • parse_event() - Extract resource information
    • get_arn() - Construct resource ARN
    • check_logs_enabled() - Verify logging status
  3. Register handler in cms-v2-lambdas/code/src/common/events.py
  4. Create test event in cms-v2-lambdas/code/tests/
  5. Add documentation in docs/ directory
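
The steps above can be sketched as follows for a hypothetical RDS handler. The four method names come from the list; their signatures, the `RdsResources` class, the injected `describe_db_instance` callable, and the CloudTrail field names used in `parse_event` are assumptions for illustration, so check them against the real `BaseResources` class and a captured event before reuse.

```python
from abc import ABC, abstractmethod
from typing import Any, Callable


class BaseResources(ABC):
    """Assumed shape of the handler interface; real signatures may differ."""

    @abstractmethod
    def is_event(self, event: dict[str, Any]) -> bool: ...

    @abstractmethod
    def parse_event(self, event: dict[str, Any]) -> dict[str, str]: ...

    @abstractmethod
    def get_arn(self, resource: dict[str, str]) -> str: ...

    @abstractmethod
    def check_logs_enabled(self, resource: dict[str, str]) -> bool: ...


class RdsResources(BaseResources):
    """Hypothetical handler that would live in src/common/rds.py."""

    def __init__(self, describe_db_instance: Callable[[str], dict[str, Any]]):
        # Callable returning one DBInstance dict, for example a wrapper around
        # rds.describe_db_instances(DBInstanceIdentifier=...)["DBInstances"][0].
        self._describe = describe_db_instance

    def is_event(self, event: dict[str, Any]) -> bool:
        detail = event.get("detail", {})
        return (
            detail.get("eventSource") == "rds.amazonaws.com"
            and detail.get("eventName") in {"CreateDBInstance", "ModifyDBInstance"}
        )

    def parse_event(self, event: dict[str, Any]) -> dict[str, str]:
        detail = event["detail"]
        return {
            "account": event["account"],
            "region": detail["awsRegion"],
            # Assumed CloudTrail field name for the instance identifier.
            "identifier": detail["requestParameters"]["dBInstanceIdentifier"],
        }

    def get_arn(self, resource: dict[str, str]) -> str:
        return "arn:aws:rds:{region}:{account}:db:{identifier}".format(**resource)

    def check_logs_enabled(self, resource: dict[str, str]) -> bool:
        instance = self._describe(resource["identifier"])
        # The list is empty or absent when no log types are exported.
        return bool(instance.get("EnabledCloudwatchLogsExports"))
```

Registering the handler in `events.py` and adding a test event (steps 3 and 4) depend on project code that is not shown here.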

Deployment

Development Environment

cd cms-v2-terraform/stacks/observability/logging
terraform apply -var-file=environments/dev_us-east-1.tfvars

Production Environment

cd cms-v2-terraform/stacks/observability/logging
terraform apply -var-file=environments/prod_us-east-1.tfvars

Monitoring

The Lambda function uses CloudWatch Logs for observability:

# Follow Lambda logs
aws logs tail /aws/lambda/process-events --follow

# Show only log events containing "ERROR"
aws logs filter-log-events --log-group-name /aws/lambda/process-events --filter-pattern "ERROR"

Troubleshooting

Common Issues

  1. AccessDenied when assuming role

    • Check IAM trust policy on remediation role
    • Verify Lambda execution role has sts:AssumeRole permission
  2. Module import errors

    • Run make python-packages in cms-v2-lambdas/
    • Verify Python 3.12.3 is active
  3. Logging check failures

    • Verify remediation role has appropriate permissions
    • Check resource-specific documentation in docs/

Security Best Practices

  • All S3 buckets use server-side encryption (SSE-KMS)
  • Lambda functions use environment variables for configuration
  • Cross-account access uses IAM roles with least privilege
  • All data in transit uses TLS
  • CloudWatch Logs for audit trail

Contributing

When contributing to this project:

  1. Follow the existing code structure and naming conventions
  2. Add comprehensive error handling
  3. Update relevant documentation
  4. Test locally before deploying
  5. Use descriptive commit messages

License

[Add your license information here]

Support

For questions or issues, please contact the Enlyte DevOps team.

# Review planned changes
terraform plan

# Deploy
terraform apply


3. Upload Lambda Packages

After infrastructure is deployed, upload the Lambda packages:

# Get bucket name from Terraform
BUCKET_NAME=$(terraform output -raw s3_bucket_name)

# Upload packages
cd ../../../../cms-v2-lambdas
make BUCKET=observability-logging-artifacts-dev copy-s3

# Or manually:
aws s3 cp dist/process_events_lambda.zip s3://$BUCKET_NAME/
aws s3 cp dist/dependencies_layer.zip s3://$BUCKET_NAME/

4. Trigger Lambda Redeployment

cd cms-v2-terraform/stacks/observability/logging

# Force Lambda to pick up new code
terraform apply -replace="module.process_events_lambda_logging.module.lambda[0].aws_lambda_function.this[0]"

🧪 Local Development

Running Lambda Functions Locally

cd cms-v2-lambdas/code

# Run with test event
python main_process_events.py

# Set log level
export LOGLEVEL=DEBUG
python main_process_events.py

VS Code Debugging

Use the included launch.json configuration:

  1. Open cms-v2-lambdas/code/main_process_events.py
  2. Set breakpoints
  3. Press F5 or select "Debug Lambda Test Runner with Path"
  4. Edit tests/nlb_event.json to test different scenarios

Testing with Sample Events

Sample CloudTrail events are in cms-v2-lambdas/code/tests/:

  • alb_event.json - ALB creation event
  • nlb_event.json - NLB deletion event
  • ec2_event.json - EC2 event (placeholder)

Modify main_process_events.py to test different events:

file_path = "tests/alb_event.json"  # Change this line
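
For reference, a local runner along these lines could load the sample event and call the handler directly. This is a sketch, not the contents of main_process_events.py; `load_event` and `run_local` are hypothetical names, and the handler is passed in so any entry point with the usual `(event, context)` signature can be tried.

```python
import json
from pathlib import Path
from typing import Any, Callable


def load_event(file_path: str) -> dict[str, Any]:
    """Read a sample CloudTrail event from a JSON file."""
    return json.loads(Path(file_path).read_text(encoding="utf-8"))


def run_local(
    handler: Callable[[dict[str, Any], Any], Any],
    file_path: str = "tests/alb_event.json",
) -> Any:
    """Invoke a Lambda handler with a sample event and print the result."""
    event = load_event(file_path)
    # No Lambda context object exists locally, so None is passed instead.
    result = handler(event, None)
    print(json.dumps(result, indent=2, default=str))
    return result
```

Taking the path as a parameter avoids editing the source file each time a different sample event is needed.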

📦 Lambda Function Design

Process Events Lambda (Main Handler)

Purpose: Central event router that identifies AWS service events and enables appropriate logging.

Architecture:

EventsHandler → LoadBalancerEvents → Parse & Identify Service
                                                ↓
                                      Enable Service Logging

Supported Events:

  • ELB: CreateLoadBalancer, DeleteLoadBalancer
  • EC2: (future implementation)
  • RDS: (future implementation)
  • S3: (future implementation)
  • VPC: (future implementation)

Event Processing Flow:

  1. Event Reception: Lambda receives CloudTrail event from EventBridge
  2. Event Identification: Determines service type (ELB, EC2, etc.)
  3. Event Parsing: Extracts relevant information (ARN, region, account)
  4. Action Execution: Enables logging for the identified service
  5. Response: Returns success/failure status

Example Response:

{
  "statusCode": 200,
  "body": {
    "message": "Event processed successfully",
    "data": {
      "eventSource": "elasticloadbalancing.amazonaws.com",
      "eventName": "CreateLoadBalancer",
      "account": "131578276461",
      "region": "us-east-1",
      "arn": "arn:aws:elasticloadbalancing:us-east-1:131578276461:loadbalancer/app/test-lb/abc123"
    }
  }
}

Lambda Layer Structure

The Lambda layer includes shared dependencies:

  • requests - HTTP library
  • urllib3 - HTTP client
  • pytz - Timezone handling

Layer structure in Lambda:

/opt/python/
└── lib/
    └── python3.12/
        └── site-packages/
            ├── requests/
            ├── urllib3/
            └── pytz/

🔧 Terraform Modules

EventBridge Rules Module

Creates EventBridge rules and targets with automatic Lambda permissions.

Features:

  • Multiple rules and targets per deployment
  • Automatic IAM policy attachment
  • CloudWatch integration
  • Custom event patterns

Documentation: cms-v2-terraform-modules/eventbridge-rules/README.md

Lambda Module

Wrapper around terraform-aws-modules/lambda/aws with S3 change detection.

Features:

  • Automatic S3 package updates via ETag
  • Lambda functions and layers
  • KMS encryption support
  • CloudWatch Logs integration

Documentation: cms-v2-terraform-modules/lambda/README.md

KMS Module

Manages KMS keys for encryption with service principal access.

Documentation: cms-v2-terraform-modules/kms/README.md

S3 Module

S3 bucket configuration with versioning and encryption.

Documentation: cms-v2-terraform-modules/s3/README.md

🔐 Security

Encryption

  • Lambda Environment Variables: KMS encrypted
  • CloudWatch Logs: KMS encrypted with 30-day retention
  • S3 Artifacts: KMS encrypted with versioning enabled

IAM Permissions

Lambda execution roles follow least privilege:

  • CloudWatch Logs write access
  • KMS decrypt for environment variables
  • Service-specific permissions (e.g., ELB DescribeLoadBalancers)

Network Security

  • Lambdas run in AWS-managed VPC (no internet access required)
  • EventBridge uses IAM for authorization
  • S3 buckets have server-side encryption enforced

📊 Monitoring

CloudWatch Logs

Lambda execution logs:

  • Log Group: /aws/lambda/process-events-function
  • Retention: 30 days
  • Encryption: KMS encrypted

Log levels (controlled via LOGLEVEL env var):

  • DEBUG - Detailed execution flow
  • INFO - Standard operational messages
  • WARNING - Non-critical issues
  • ERROR - Failures and exceptions
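
A minimal sketch of how `LOGLEVEL` could be applied with the standard `logging` module follows. The logger name, the fallback to INFO for missing or unknown values, and the `configure_logging` helper are assumptions, not taken from the project.

```python
import logging
import os
from typing import Mapping

# The four levels documented above.
LEVELS = {
    "DEBUG": logging.DEBUG,
    "INFO": logging.INFO,
    "WARNING": logging.WARNING,
    "ERROR": logging.ERROR,
}


def configure_logging(environ: Mapping[str, str] = os.environ) -> logging.Logger:
    """Set the logger level from LOGLEVEL, falling back to INFO."""
    requested = environ.get("LOGLEVEL", "INFO").strip().upper()
    logger = logging.getLogger("process_events")  # hypothetical logger name
    logger.setLevel(LEVELS.get(requested, logging.INFO))
    return logger
```

Calling `configure_logging()` once at module import time keeps the level consistent across warm invocations of the same Lambda container.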

EventBridge Metrics

Monitor in CloudWatch:

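  • Invocations - Number of times a rule invoked a target
  • FailedInvocations - Invocations that failed permanently (EventBridge could not deliver the event to the Lambda)
  • TriggeredRules - Number of rules that matched an event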

Lambda Metrics

  • Duration - Execution time
  • Errors - Function errors
  • Throttles - Concurrency limits hit
  • ConcurrentExecutions - Parallel invocations

💰 Cost Estimation

Monthly Costs (for 1000 events/day)

| Service              | Usage                           | Cost         |
|----------------------|---------------------------------|--------------|
| Lambda (512MB ARM64) | ~30K invocations/month × 500ms  | ~$2.50       |
| CloudWatch Logs      | 30-day retention, ~1GB          | ~$1.50       |
| EventBridge          | AWS events (free)               | $0.00        |
| S3 (artifacts)       | 50MB storage + versioning       | ~$0.10       |
| KMS                  | 2 keys + API calls              | ~$2.10       |
| Total                |                                 | ~$6.20/month |

Cost Optimization Tips

  • Use ARM64 architecture (20% cheaper than x86)
  • Adjust CloudWatch Log retention based on compliance needs
  • Enable S3 lifecycle policies for old Lambda versions
  • Right-size Lambda memory based on profiling
  • Use reserved concurrency to prevent runaway costs

🐛 Troubleshooting

Lambda Not Triggered

Symptoms: Events occur but Lambda doesn't execute

Solutions:

  1. Verify CloudTrail is enabled
  2. Check EventBridge rule patterns
  3. Verify Lambda permissions
  4. Review EventBridge metrics for FailedInvocations

S3 Deployment Not Updating

Symptoms: New code uploaded but Lambda runs old version

Solutions:

  1. Verify ignore_source_code_hash = true in Lambda module
  2. Check S3 versioning enabled
  3. Confirm ETag changed after upload
  4. Force update with terraform apply -replace

Event Not Recognized

Symptoms: Lambda returns "Event type not supported"

Solutions:

  1. Add debug logging: export LOGLEVEL=DEBUG
  2. Check event format matches CloudTrail structure
  3. Verify event source and detail-type in EventBridge pattern
  4. Test locally with actual CloudTrail event JSON
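
When checking steps 2 to 4 locally, it can help to test an event against the rule pattern without deploying anything. The sketch below assumes a pattern of the usual form for CloudTrail API calls; the actual pattern lives in the Terraform stack and may differ. The matcher covers only exact-value matching on scalar fields, not the prefix, numeric, or array operators that EventBridge also supports.

```python
from typing import Any

# Assumed rule pattern for the ELB events handled by the Lambda.
ELB_PATTERN: dict[str, Any] = {
    "source": ["aws.elasticloadbalancing"],
    "detail-type": ["AWS API Call via CloudTrail"],
    "detail": {
        "eventSource": ["elasticloadbalancing.amazonaws.com"],
        "eventName": ["CreateLoadBalancer", "DeleteLoadBalancer"],
    },
}


def matches(pattern: dict[str, Any], event: dict[str, Any]) -> bool:
    """Return True if every pattern field matches the event.

    A list means "any of these values"; a nested dict is matched recursively.
    Fields that are present in the event but absent from the pattern are ignored.
    """
    for key, expected in pattern.items():
        if key not in event:
            return False
        actual = event[key]
        if isinstance(expected, dict):
            if not isinstance(actual, dict) or not matches(expected, actual):
                return False
        elif actual not in expected:
            return False
    return True
```

A sample event that fails `matches(ELB_PATTERN, event)` points to a mismatch in the source, detail-type, or event name rather than a problem inside the Lambda.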

📚 Additional Documentation

🗺️ Roadmap

  • Complete EC2 logging enablement
  • Complete RDS logging enablement
  • Complete S3 logging enablement
  • Complete VPC Flow Logs enablement
  • Add metrics collection stack
  • Implement automated testing (pytest)
  • Add CI/CD pipeline (GitHub Actions)

>>>>>>> origin/main
