Amazon CloudWatch
Monitoring and observability. Metrics, logs, and alarms in one place.
What is CloudWatch?
Monitor everything in AWS. Collect metrics, store logs, set alarms. See what's happening across your infrastructure. Get alerted when things break.
Think of it like a health monitoring dashboard
CloudWatch is like having vital signs monitors for all your servers and apps. It tracks their health, alerts you when something's wrong, and logs everything for later analysis.
Key Features
Metrics
Track CPU, network traffic, request counts, and more. Built-in for most AWS services; memory on EC2 needs the CloudWatch Agent.
Logs
Centralized log storage. Search with Logs Insights.
Alarms
Get notified when metrics cross thresholds. Trigger automatic actions such as Auto Scaling or EC2 recovery.
Dashboards
Visualize metrics. Share with your team.
Container Insights
Monitor ECS and EKS. See container-level metrics.
Anomaly Detection
ML-powered. Detects unusual patterns automatically.
When to Use
- Monitor AWS resources
- Centralize application logs
- Set up alerting
- Create operational dashboards
- Debug performance issues
- Track custom metrics
When Not to Use
- Full APM solution -> X-Ray + third-party
- Long-term log analysis -> S3 + Athena
- Complex log queries -> OpenSearch
- Multi-cloud monitoring -> Datadog/Grafana
- Code-level tracing -> X-Ray
- Real-time streaming -> Kinesis
Prerequisites
- An AWS account
- Resources to monitor (EC2, Lambda, etc.)
- CloudWatch Agent for detailed OS metrics (optional)
AWS Console Steps
Open CloudWatch Console
Navigate to CloudWatch in the AWS Console
Explore Metrics
Click 'All metrics' to see auto-collected service metrics
Create an Alarm
Select a metric, define threshold, and configure actions
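The same alarm can be created from code. The sketch below assumes boto3 is installed and credentials are configured. The alarm name scheme, the 80% threshold, and the SNS topic ARN are illustrative assumptions, not values from this guide.

```python
from typing import Any, Dict


def build_cpu_alarm(instance_id: str, sns_topic_arn: str,
                    threshold: float = 80.0) -> Dict[str, Any]:
    """Build keyword arguments for the CloudWatch PutMetricAlarm API."""
    return {
        "AlarmName": f"high-cpu-{instance_id}",  # hypothetical naming scheme
        "Namespace": "AWS/EC2",
        "MetricName": "CPUUtilization",
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "Statistic": "Average",
        "Period": 300,            # evaluate 5-minute averages
        "EvaluationPeriods": 2,   # require two consecutive breaching periods
        "Threshold": threshold,
        "ComparisonOperator": "GreaterThanThreshold",
        "AlarmActions": [sns_topic_arn],  # notified when the state becomes ALARM
    }


def create_alarm(params: Dict[str, Any]) -> None:
    """Send the alarm definition to CloudWatch (needs boto3 and credentials)."""
    import boto3  # imported lazily so the builder works without boto3

    boto3.client("cloudwatch").put_metric_alarm(**params)
```

Calling `create_alarm(build_cpu_alarm(instance_id, topic_arn))` would create or update the alarm. Keeping the builder separate makes the definition easy to review before anything is sent.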
View Logs
Go to 'Log groups' to see logs from Lambda, ECS, etc.
Build Dashboard
Create a custom dashboard with your key metrics
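Dashboards are defined by a JSON body, so they can also be created from code. This is a minimal sketch with a single metric widget. The dashboard title, default region, and widget sizes are assumptions chosen for illustration.

```python
import json


def build_dashboard_body(instance_id: str, region: str = "us-east-1") -> str:
    """Return a dashboard body (JSON string) with one EC2 CPU widget."""
    widget = {
        "type": "metric",
        "x": 0, "y": 0,            # top-left corner of the dashboard grid
        "width": 12, "height": 6,  # widget size in grid units
        "properties": {
            "title": "EC2 CPU utilization",  # illustrative title
            "metrics": [["AWS/EC2", "CPUUtilization", "InstanceId", instance_id]],
            "stat": "Average",
            "period": 300,
            "region": region,
        },
    }
    return json.dumps({"widgets": [widget]})


def publish_dashboard(name: str, body: str) -> None:
    """Create or replace the dashboard (needs boto3 and credentials)."""
    import boto3  # imported lazily so the builder works without boto3

    boto3.client("cloudwatch").put_dashboard(DashboardName=name, DashboardBody=body)
```

Storing the body in version control gives the team a reviewable history of dashboard changes.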
Set Up Logs Insights
Query logs with a purpose-built, pipe-based query language (fields, filter, stats, sort) for fast troubleshooting
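A Logs Insights query chains commands with pipes. The sketch below finds recent log lines containing ERROR and shows one way to run the query through boto3. The 15-minute window and the polling interval are assumptions.

```python
import time
from typing import Any, Dict, List, Optional

# Most recent 20 log events whose message contains "ERROR".
ERROR_QUERY = """fields @timestamp, @message
| filter @message like /ERROR/
| sort @timestamp desc
| limit 20"""


def build_start_query(log_groups: List[str], minutes: int = 15,
                      now: Optional[float] = None) -> Dict[str, Any]:
    """Build keyword arguments for the CloudWatch Logs StartQuery API."""
    end = int(now if now is not None else time.time())
    return {
        "logGroupNames": log_groups,
        "startTime": end - minutes * 60,  # epoch seconds; narrow windows scan less data
        "endTime": end,
        "queryString": ERROR_QUERY,
    }


def run_query(params: Dict[str, Any], poll_seconds: float = 1.0) -> list:
    """Start the query and poll until it leaves the Scheduled/Running states."""
    import boto3  # imported lazily so the builder works without boto3

    logs = boto3.client("logs")
    query_id = logs.start_query(**params)["queryId"]
    while True:
        response = logs.get_query_results(queryId=query_id)
        if response["status"] not in ("Scheduled", "Running"):
            return response["results"]
        time.sleep(poll_seconds)
```

Passing only the log groups you need and a short time range keeps the scanned volume, and therefore the cost, down.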
Pro Tips
High-resolution metrics cost more
Use 1-second resolution only for critical metrics. Standard 60-second resolution is cheaper.
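Resolution is chosen per data point when publishing custom metrics. In this sketch the metric names and namespace are hypothetical. `StorageResolution` accepts 1 (high resolution) or 60 (standard).

```python
from typing import Any, Dict, List


def build_metric_datum(name: str, value: float, unit: str = "Count",
                       high_resolution: bool = False) -> Dict[str, Any]:
    """Build one MetricData entry for the PutMetricData API."""
    return {
        "MetricName": name,
        "Value": value,
        "Unit": unit,
        # 1 = high resolution (1-second), 60 = standard resolution
        "StorageResolution": 1 if high_resolution else 60,
    }


def publish(namespace: str, data: List[Dict[str, Any]]) -> None:
    """Publish the data points (needs boto3 and credentials)."""
    import boto3  # imported lazily so the builder works without boto3

    boto3.client("cloudwatch").put_metric_data(Namespace=namespace, MetricData=data)
```

For example, a checkout latency metric might justify `high_resolution=True`, while a nightly job counter can stay at the standard default.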
Logs Insights is powerful but costly
You pay per GB scanned. Use narrow time ranges and specific log groups.
Use metric filters to save money
Extract metrics from logs you already ingest instead of calling the PutMetricData API. Often cheaper for simple counts such as errors.
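A metric filter that counts log events containing the term ERROR could look like the sketch below. The filter name, metric name, and namespace are hypothetical.

```python
from typing import Any, Dict


def build_error_count_filter(log_group: str, namespace: str = "MyApp") -> Dict[str, Any]:
    """Build keyword arguments for the CloudWatch Logs PutMetricFilter API."""
    return {
        "logGroupName": log_group,
        "filterName": "error-count",   # hypothetical filter name
        "filterPattern": "ERROR",      # match events containing the term ERROR
        "metricTransformations": [{
            "metricName": "ErrorCount",
            "metricNamespace": namespace,
            "metricValue": "1",        # add 1 to the metric per matching event
            "defaultValue": 0.0,       # report 0 when nothing matches
        }],
    }


def create_filter(params: Dict[str, Any]) -> None:
    """Create or update the metric filter (needs boto3 and credentials)."""
    import boto3  # imported lazily so the builder works without boto3

    boto3.client("logs").put_metric_filter(**params)
```

Once the filter exists, the resulting `ErrorCount` metric can drive an alarm like any other metric.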
Composite alarms reduce noise
Combine alarms with AND/OR logic. Alert only when multiple conditions are true.
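A composite alarm is driven by a rule expression over other alarms. This sketch builds an AND rule that fires only when every listed alarm is in the ALARM state. The alarm names are hypothetical and are assumed not to contain double quotes.

```python
from typing import Any, Dict, List


def alarm_rule_all(alarm_names: List[str]) -> str:
    """Return a rule that is true only when every named alarm is in ALARM."""
    if not alarm_names:
        raise ValueError("at least one alarm name is required")
    return " AND ".join(f'ALARM("{name}")' for name in alarm_names)


def build_composite_alarm(name: str, child_alarms: List[str],
                          sns_topic_arn: str) -> Dict[str, Any]:
    """Build keyword arguments for the PutCompositeAlarm API."""
    return {
        "AlarmName": name,
        "AlarmRule": alarm_rule_all(child_alarms),
        "AlarmActions": [sns_topic_arn],
    }


def create_composite_alarm(params: Dict[str, Any]) -> None:
    """Create or update the composite alarm (needs boto3 and credentials)."""
    import boto3  # imported lazily so the builders work without boto3

    boto3.client("cloudwatch").put_composite_alarm(**params)
```

With notifications attached only to the composite alarm, a CPU spike alone stays quiet, and a page goes out only when latency is degraded as well.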
Share dashboards publicly
Stakeholders without AWS access can view dashboards. Enable public sharing.
Embedded Metric Format for Lambda
Embed metrics in log output. CloudWatch extracts them automatically. No API calls.
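An Embedded Metric Format record is a JSON log line with an `_aws` metadata block that names which fields are metrics. The hand-rolled sketch below shows the shape. The namespace, metric, and dimension names are hypothetical, and a production function might prefer an existing EMF helper library.

```python
import json
import time
from typing import Dict, Optional


def emf_record(namespace: str, metric_name: str, value: float,
               dimensions: Dict[str, str], unit: str = "Milliseconds",
               timestamp_ms: Optional[int] = None) -> str:
    """Return one EMF log line describing a single metric value."""
    record = {
        "_aws": {
            # Event time in milliseconds since the epoch.
            "Timestamp": timestamp_ms if timestamp_ms is not None
            else int(time.time() * 1000),
            "CloudWatchMetrics": [{
                "Namespace": namespace,
                "Dimensions": [list(dimensions.keys())],  # one dimension set
                "Metrics": [{"Name": metric_name, "Unit": unit}],
            }],
        },
        metric_name: value,  # the metric value lives at the top level
    }
    record.update(dimensions)  # dimension values also live at the top level
    return json.dumps(record)
```

Inside a Lambda handler, `print(emf_record("MyApp", "Latency", 123, {"Service": "checkout"}))` writes the line to the function's log stream, where CloudWatch can extract the metric.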
Set log retention early
Logs never expire by default. Set a retention period (for example, 30 days) to avoid surprise storage costs.
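One way to enforce this is to sweep the account for log groups that have no retention policy. In the sketch below, the helper names are hypothetical and the 30-day value is an assumption. Log groups that never expire are identified by the absence of a `retentionInDays` field in the DescribeLogGroups response.

```python
from typing import Any, Dict, List


def groups_without_retention(log_groups: List[Dict[str, Any]]) -> List[str]:
    """Return names of log groups that have no retention policy set."""
    return [g["logGroupName"] for g in log_groups if "retentionInDays" not in g]


def apply_retention(days: int = 30) -> None:
    """Set a retention policy on every log group that lacks one.

    Needs boto3 and credentials. CloudWatch Logs accepts only specific
    retention values; 30 is one of them.
    """
    import boto3  # imported lazily so the helper above works without boto3

    logs = boto3.client("logs")
    for page in logs.get_paginator("describe_log_groups").paginate():
        for name in groups_without_retention(page["logGroups"]):
            logs.put_retention_policy(logGroupName=name, retentionInDays=days)
```

Log groups that already have a policy are left untouched, so deliberate longer retention settings survive the sweep.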
Free tier covers basics
10 custom metrics, 10 alarms, and 5 GB of logs are free. Good for small projects.
Key Facts
Metric retention: sub-minute (high-resolution) data kept 3 hours, 1-minute kept 15 days, 5-minute kept 63 days, 1-hour kept 455 days (about 15 months)
Metric resolution: Standard 60 seconds, high-resolution 1 second
Alarms: 10,000 per account limit, 3 states (OK, ALARM, INSUFFICIENT_DATA)
Log retention: 1 day to 10 years, default is never expire
Log groups: Query up to 50 log groups per Logs Insights query
Dashboard limits: 500 widgets per dashboard, $3 per custom dashboard per month
CloudWatch Agent: Required for memory and disk-space metrics on EC2
Subscription filters: Max 2 per log group for streaming to Kinesis/Lambda
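The metric retention tiers above determine the finest period you can still query for data of a given age. The lookup below is a small sketch of that rule. It includes the 5-minute tier (63 days) from the CloudWatch documentation and treats 15 months as 455 days. The function name is hypothetical.

```python
from datetime import timedelta
from typing import Optional

# (maximum data age, finest period in seconds still retained at that age)
RETENTION_TIERS = [
    (timedelta(hours=3), 1),      # high-resolution data, only if it was published
    (timedelta(days=15), 60),     # 1-minute data
    (timedelta(days=63), 300),    # 5-minute data
    (timedelta(days=455), 3600),  # 1-hour data
]


def finest_period_seconds(age: timedelta) -> Optional[int]:
    """Return the finest retained period for data of this age, or None if expired."""
    for max_age, period in RETENTION_TIERS:
        if age <= max_age:
            return period
    return None
```

For instance, a query for data from 30 days ago should use a period of at least 300 seconds, since the 1-minute data points have already been rolled up.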