Building a Custom CloudWatch Dashboard: CPU, Memory, and Error Metrics on One Screen

When an incident hits at 2 AM, you need answers in seconds — not minutes spent hunting across five different console pages. A well-structured CloudWatch Dashboard collapses your critical application signals (CPU, memory, error counts) into a single, actionable pane of glass.

TL;DR

Step | Action | Key Resource
1 | Create a CloudWatch Dashboard | AWS Console / CLI / CloudFormation
2 | Add a Line Widget for CPU Utilization | AWS/EC2 or AWS/ECS namespace
3 | Add a Line Widget for Memory | CloudWatch Agent custom namespace
4 | Add a Number Widget for Error Count | CloudWatch Logs Metric Filter or custom metric
5 | Set time range & auto-refresh | Dashboard settings
6 | Share or embed the dashboard | CloudWatch Sharing feature

Architecture Overview

Before building widgets, understand where each metric originates. CPU utilization is reported by the hypervisor and is available without any setup. Memory requires the CloudWatch Agent because the hypervisor cannot see inside the guest OS. Application errors reach CloudWatch as custom metrics, either derived from logs by a Metric Filter (the approach used in this post) or published directly with the PutMetricData API.

graph LR
  EC2["EC2 Instance"] -->|Built-in hypervisor metric| CWM["CloudWatch Metrics AWS/EC2 namespace"]
  EC2 -->|OS metrics via agent| CWA["CloudWatch Agent"]
  CWA -->|mem_used_percent| CWMC["CloudWatch Metrics CWAgent namespace"]
  EC2 -->|App logs| CWL["CloudWatch Logs"]
  CWL -->|Metric Filter on ERROR| CWME["CloudWatch Metrics MyApp/Errors namespace"]
  CWM --> DB["CloudWatch Dashboard"]
  CWMC --> DB
  CWME --> DB

  1. EC2 / ECS / Lambda — your compute layer emitting raw signals.
  2. CloudWatch Agent — installed on EC2/on-prem to push OS-level metrics (memory, disk) and application log files.
  3. CloudWatch Logs — receives structured log streams; Metric Filters extract numeric signals (e.g., ERROR count) from log data.
  4. CloudWatch Metrics — the time-series store for all numeric data points, organized by namespace, dimension, and metric name.
  5. CloudWatch Dashboard — the visualization layer that queries metrics and renders widgets.
Analogy: Think of CloudWatch Metrics as a flight data recorder and the Dashboard as the cockpit instrument panel. The recorder captures everything; the panel surfaces only what the pilot needs to fly safely right now.

Step 1 — Create the Dashboard

You can create a dashboard via the AWS Console, AWS CLI, or Infrastructure-as-Code. The CLI approach is repeatable and version-controllable.

# Create an empty dashboard named "AppMonitor"
aws cloudwatch put-dashboard \
  --dashboard-name AppMonitor \
  --dashboard-body '{"widgets":[]}'

Verify creation:

aws cloudwatch list-dashboards
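
Hand-quoting JSON inside a shell string gets fragile as the dashboard grows. The following is a minimal sketch, assuming you drive the CLI from Python: it builds the body as a dict and lets json.dumps produce the string. The helper name build_put_dashboard_command is hypothetical; the subcommand and flags are the ones used in the CLI example above.

```python
import json


def build_put_dashboard_command(name: str, body: dict) -> list[str]:
    """Return argv for `aws cloudwatch put-dashboard` with the body as a JSON string."""
    return [
        "aws", "cloudwatch", "put-dashboard",
        "--dashboard-name", name,
        # Compact separators keep the argument short; any valid JSON works.
        "--dashboard-body", json.dumps(body, separators=(",", ":")),
    ]


if __name__ == "__main__":
    cmd = build_put_dashboard_command("AppMonitor", {"widgets": []})
    print(cmd)
    # To actually run it (requires the AWS CLI and credentials):
    # import subprocess; subprocess.run(cmd, check=True)
```

Passing the command as a list avoids shell quoting entirely, which is the main reason to build it this way.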

Step 2 — CPU Utilization Widget

For EC2, CPUUtilization is a built-in metric in the AWS/EC2 namespace. No agent required.

Full dashboard JSON with CPU widget:
{
  "widgets": [
    {
      "type": "metric",
      "x": 0,
      "y": 0,
      "width": 12,
      "height": 6,
      "properties": {
        "title": "CPU Utilization (%)",
        "view": "timeSeries",
        "stacked": false,
        "metrics": [
          [
            "AWS/EC2",
            "CPUUtilization",
            "InstanceId",
            "i-0123456789abcdef0"
          ]
        ],
        "period": 60,
        "stat": "Average",
        "region": "us-east-1",
        "yAxis": {
          "left": { "min": 0, "max": 100 }
        }
      }
    }
  ]
}

Key properties explained:

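  • period: Granularity in seconds (60 = 1-minute resolution). EC2 publishes CPUUtilization every 5 minutes under basic monitoring; enable detailed monitoring on the instance to get 1-minute data points.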
  • stat: Aggregation function — Average, Maximum, p99, etc.
  • view: timeSeries (line chart) or singleValue (number tile).
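
Since every metric widget shares the same shape, the properties above can be captured in one small builder. This is a sketch under stated assumptions: metric_widget is a hypothetical helper, and its defaults simply mirror the values used in this post.

```python
def metric_widget(title, namespace, metric, dimensions, *, x, y,
                  width=12, height=6, view="timeSeries",
                  stat="Average", period=60, region="us-east-1"):
    """Build one CloudWatch dashboard metric widget as a plain dict."""
    # A metric row is [namespace, metric_name, dim_name, dim_value, ...].
    row = [namespace, metric]
    for dim_name, dim_value in dimensions.items():
        row += [dim_name, dim_value]
    return {
        "type": "metric",
        "x": x, "y": y, "width": width, "height": height,
        "properties": {
            "title": title,
            "view": view,
            "metrics": [row],
            "period": period,
            "stat": stat,
            "region": region,
        },
    }


cpu = metric_widget("CPU Utilization (%)", "AWS/EC2", "CPUUtilization",
                    {"InstanceId": "i-0123456789abcdef0"}, x=0, y=0)
```

The same call with a different namespace, metric name, and position yields the memory and error widgets defined later.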

Step 3 — Memory Utilization Widget

Memory is not available as a built-in EC2 metric. You must install and configure the CloudWatch Agent on your instance. The agent publishes memory metrics to a custom namespace — by default CWAgent.

3a — Install & Configure the CloudWatch Agent

CloudWatch Agent config (amazon-cloudwatch-agent.json):
{
  "metrics": {
    "namespace": "CWAgent",
    "metrics_collected": {
      "mem": {
        "measurement": [
          "mem_used_percent"
        ],
        "metrics_collection_interval": 60
      }
    },
    "append_dimensions": {
      "InstanceId": "${aws:InstanceId}"
    }
  }
}

Start the agent after placing the config at /opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.json:

sudo /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl \
  -a fetch-config \
  -m ec2 \
  -c file:/opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.json \
  -s
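
If you maintain this file for several hosts, generating it can keep the copies consistent. The sketch below assumes you want only mem_used_percent, as in the config above; agent_config is a hypothetical helper, and the collection interval is the only value it varies.

```python
import json


def agent_config(interval_seconds: int = 60, namespace: str = "CWAgent") -> dict:
    """Build a CloudWatch Agent config that collects mem_used_percent only."""
    return {
        "metrics": {
            "namespace": namespace,
            "metrics_collected": {
                "mem": {
                    "measurement": ["mem_used_percent"],
                    "metrics_collection_interval": interval_seconds,
                }
            },
            # Literal placeholder: the agent resolves it on the instance, not Python.
            "append_dimensions": {"InstanceId": "${aws:InstanceId}"},
        }
    }


if __name__ == "__main__":
    print(json.dumps(agent_config(), indent=2))
```

Redirect the printed JSON to the config path shown above, then run the same fetch-config command.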

3b — Memory Widget JSON

Memory widget definition:
{
  "type": "metric",
  "x": 12,
  "y": 0,
  "width": 12,
  "height": 6,
  "properties": {
    "title": "Memory Used (%)",
    "view": "timeSeries",
    "stacked": false,
    "metrics": [
      [
        "CWAgent",
        "mem_used_percent",
        "InstanceId",
        "i-0123456789abcdef0"
      ]
    ],
    "period": 60,
    "stat": "Average",
    "region": "us-east-1",
    "yAxis": {
      "left": { "min": 0, "max": 100 }
    }
  }
}
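
The x, y, width, and height values place each widget on the dashboard's 24-column grid, which is why the memory widget starts at x = 12, directly to the right of the 12-column CPU widget. A minimal sketch of that placement logic follows; place_widgets is a hypothetical helper, and its row-wrapping rule is a simplification rather than how the console lays widgets out.

```python
GRID_COLUMNS = 24  # CloudWatch dashboards use a 24-column grid


def place_widgets(sizes):
    """Assign (x, y) to (width, height) pairs, filling rows left to right.

    Simplification: a new row starts below the tallest widget of the current row.
    """
    placed = []
    x = y = row_height = 0
    for width, height in sizes:
        if width > GRID_COLUMNS:
            raise ValueError(f"widget width {width} exceeds {GRID_COLUMNS} columns")
        if x + width > GRID_COLUMNS:  # does not fit: wrap to the next row
            x, y, row_height = 0, y + row_height, 0
        placed.append({"x": x, "y": y, "width": width, "height": height})
        x += width
        row_height = max(row_height, height)
    return placed


# CPU (12x6), memory (12x6), error count (6x3), as used in this post
print(place_widgets([(12, 6), (12, 6), (6, 3)]))
```

For the three widgets in this post, the helper reproduces the coordinates used in the JSON definitions: (0, 0), (12, 0), and (0, 6).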

Step 4 — Application Error Count Widget

Error counts are typically derived from application logs. The pipeline is: Log Stream → Metric Filter → Custom Metric → Widget.

graph LR
  APP["Application"] -->|Writes logs| LG["Log Group /aws/ec2/my-application"]
  LG -->|Metric Filter pattern: ERROR| MF["Metric Filter"]
  MF -->|Increments counter| CM["Custom Metric MyApp/Errors :: ErrorCount"]
  CM --> WG["Dashboard Widget singleValue / Sum"]

  1. Application writes structured logs to a CloudWatch Log Group.
  2. A Metric Filter pattern (e.g., ERROR) matches log events and increments a counter.
  3. The counter is stored as a custom metric in a namespace you define (e.g., MyApp/Errors).
  4. The Dashboard widget queries that metric.
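
To make the pipeline concrete, the sketch below imitates it locally: each log event that matches the pattern contributes a value of 1, and the values are summed per period. It is an illustration only. The substring test only approximates CloudWatch Logs filter-pattern matching, the bucket alignment is a simplification, and error_counts and the sample events are made up for the example.

```python
from collections import Counter


def error_counts(events, pattern="ERROR", period=300):
    """Count matching log events per period bucket (the "Sum" statistic).

    events: iterable of (epoch_seconds, message) pairs.
    Matching is a plain case-sensitive substring test, which only
    approximates the real filter-pattern semantics.
    """
    buckets = Counter()
    for timestamp, message in events:
        if pattern in message:
            bucket_start = timestamp - (timestamp % period)
            buckets[bucket_start] += 1  # one matching event adds 1
    return dict(buckets)


sample = [
    (1000, "INFO request served"),
    (1010, "ERROR database timeout"),
    (1199, "ERROR upstream 502"),
    (1200, "ERROR disk full"),
    (1500, "WARN slow query"),
]
print(error_counts(sample))  # prints {900: 2, 1200: 1}
```

The two errors at 1010 and 1199 land in the same 300-second bucket, so a widget using Sum with a period of 300 would show 2 for that interval.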

4a — Create the Metric Filter

# Keep the shorthand value on one line: a backslash-newline followed by
# indentation would split it into separate arguments.
aws logs put-metric-filter \
  --log-group-name "/aws/ec2/my-application" \
  --filter-name "ErrorCount" \
  --filter-pattern "ERROR" \
  --metric-transformations \
      metricName=ErrorCount,metricNamespace=MyApp/Errors,metricValue=1,defaultValue=0

4b — Error Count Widget JSON (Single Value / Number tile)

Error count widget definition:
{
  "type": "metric",
  "x": 0,
  "y": 6,
  "width": 6,
  "height": 3,
  "properties": {
    "title": "Application Error Count",
    "view": "singleValue",
    "metrics": [
      [
        "MyApp/Errors",
        "ErrorCount"
      ]
    ],
    "period": 300,
    "stat": "Sum",
    "region": "us-east-1"
  }
}

Step 5 — Assemble the Full Dashboard via CLI

Combine all widget definitions into a single put-dashboard call. The --dashboard-body value must be a JSON string, and each call replaces the entire dashboard body, so always send the full set of widgets.

Full put-dashboard CLI command:
aws cloudwatch put-dashboard \
  --dashboard-name AppMonitor \
  --dashboard-body '{
    "widgets": [
      {
        "type": "metric",
        "x": 0, "y": 0, "width": 12, "height": 6,
        "properties": {
          "title": "CPU Utilization (%)",
          "view": "timeSeries",
          "metrics": [["AWS/EC2","CPUUtilization","InstanceId","i-0123456789abcdef0"]],
          "period": 60, "stat": "Average", "region": "us-east-1"
        }
      },
      {
        "type": "metric",
        "x": 12, "y": 0, "width": 12, "height": 6,
        "properties": {
          "title": "Memory Used (%)",
          "view": "timeSeries",
          "metrics": [["CWAgent","mem_used_percent","InstanceId","i-0123456789abcdef0"]],
          "period": 60, "stat": "Average", "region": "us-east-1"
        }
      },
      {
        "type": "metric",
        "x": 0, "y": 6, "width": 6, "height": 3,
        "properties": {
          "title": "Application Error Count",
          "view": "singleValue",
          "metrics": [["MyApp/Errors","ErrorCount"]],
          "period": 300, "stat": "Sum", "region": "us-east-1"
        }
      }
    ]
  }'
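
Because the whole body travels as one string, a stray comma or an oversized widget only surfaces when the call fails. A small pre-flight check can catch the obvious cases first. This is a sketch with checks chosen for illustration, not the service's full validation rules; check_dashboard is a hypothetical helper.

```python
import json

GRID_COLUMNS = 24  # width of the CloudWatch dashboard grid


def check_dashboard(body_json: str) -> list[str]:
    """Return human-readable problems found in a dashboard body; empty means none found."""
    problems = []
    body = json.loads(body_json)  # raises ValueError on malformed JSON
    for index, widget in enumerate(body.get("widgets", [])):
        label = f"widget {index}"
        if widget.get("x", 0) + widget.get("width", 0) > GRID_COLUMNS:
            problems.append(f"{label}: extends past column {GRID_COLUMNS}")
        if widget.get("type") == "metric":
            props = widget.get("properties", {})
            if not props.get("metrics"):
                problems.append(f"{label}: no metrics listed")
            if "region" not in props:
                problems.append(f"{label}: missing region")
    return problems


if __name__ == "__main__":
    print(check_dashboard('{"widgets": []}'))  # prints []
```

Run it against the same string you pass to --dashboard-body and send the dashboard only when the returned list is empty.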

Step 6 — CloudFormation (Infrastructure-as-Code)

For production environments, define your dashboard in CloudFormation so it is version-controlled and reproducible.

CloudFormation template snippet:
Resources:
  AppMonitorDashboard:
    Type: AWS::CloudWatch::Dashboard
    Properties:
      DashboardName: AppMonitor
      DashboardBody: !Sub |
        {
          "widgets": [
            {
              "type": "metric",
              "x": 0, "y": 0, "width": 12, "height": 6,
              "properties": {
                "title": "CPU Utilization (%)",
                "view": "timeSeries",
                "metrics": [["AWS/EC2","CPUUtilization","InstanceId","${InstanceId}"]],
                "period": 60,
                "stat": "Average",
                "region": "${AWS::Region}"
              }
            },
            {
              "type": "metric",
              "x": 12, "y": 0, "width": 12, "height": 6,
              "properties": {
                "title": "Memory Used (%)",
                "view": "timeSeries",
                "metrics": [["CWAgent","mem_used_percent","InstanceId","${InstanceId}"]],
                "period": 60,
                "stat": "Average",
                "region": "${AWS::Region}"
              }
            },
            {
              "type": "metric",
              "x": 0, "y": 6, "width": 6, "height": 3,
              "properties": {
                "title": "Application Error Count",
                "view": "singleValue",
                "metrics": [["MyApp/Errors","ErrorCount"]],
                "period": 300,
                "stat": "Sum",
                "region": "${AWS::Region}"
              }
            }
          ]
        }

Parameters:
  InstanceId:
    Type: AWS::EC2::Instance::Id
    Description: EC2 Instance ID to monitor
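
Before deploying, it can help to preview what the substituted DashboardBody will look like and confirm it is still valid JSON. The sketch below imitates only the simplest part of !Sub, replacing ${Name} placeholders from a dict. It is not CloudFormation's implementation: it ignores the ${!Literal} escape and resource attributes, and preview_sub is a hypothetical helper.

```python
import json
import re


def preview_sub(template: str, values: dict) -> str:
    """Replace ${Name} placeholders the way a simple Fn::Sub would.

    Sketch only: no ${!Literal} escape, no resource attribute lookups.
    Unknown names raise KeyError so typos surface early.
    """
    return re.sub(r"\$\{([^}]+)\}", lambda m: values[m.group(1)], template)


body_template = (
    '{"widgets":[{"type":"metric","properties":{"metrics":'
    '[["AWS/EC2","CPUUtilization","InstanceId","${InstanceId}"]],'
    '"region":"${AWS::Region}"}}]}'
)
rendered = preview_sub(
    body_template,
    {"InstanceId": "i-0123456789abcdef0", "AWS::Region": "us-east-1"},
)
json.loads(rendered)  # fails loudly if substitution broke the JSON
print(rendered)
```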

Step 7 — IAM Permissions (Least Privilege)

The IAM principal creating and viewing dashboards needs the following minimum permissions. cloudwatch:ListDashboards and the metric read actions do not support resource-level permissions, so they must be granted on "*"; only the dashboard-specific actions can be scoped to the dashboard ARN.

Least-privilege IAM policy:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "ManageDashboards",
      "Effect": "Allow",
      "Action": [
        "cloudwatch:PutDashboard",
        "cloudwatch:GetDashboard",
        "cloudwatch:DeleteDashboards"
      ],
      "Resource": "arn:aws:cloudwatch::123456789012:dashboard/AppMonitor"
    },
    {
      "Sid": "ListDashboardsAndReadMetrics",
      "Effect": "Allow",
      "Action": [
        "cloudwatch:ListDashboards",
        "cloudwatch:GetMetricData",
        "cloudwatch:GetMetricStatistics",
        "cloudwatch:ListMetrics"
      ],
      "Resource": "*"
    },
    {
      "Sid": "ManageMetricFilters",
      "Effect": "Allow",
      "Action": [
        "logs:PutMetricFilter",
        "logs:DescribeMetricFilters"
      ],
      "Resource": "arn:aws:logs:us-east-1:123456789012:log-group:/aws/ec2/my-application"
    }
  ]
}

Pro Tips

  • Auto-refresh: In the console, set the dashboard refresh interval (10s, 1m, 2m, 5m, 15m) via the top-right refresh control.
  • Alarms as widgets: Add an alarm type widget to surface CloudWatch Alarm state directly on the dashboard — no separate navigation needed.
  • Math expressions: Use Metric Math in widget definitions to compute derived signals (e.g., error rate = errors / requests * 100) without emitting a separate metric.
  • Cross-account dashboards: CloudWatch supports cross-account observability, allowing a central monitoring account to display metrics from multiple source accounts.
  • Sharing: Use the CloudWatch dashboard sharing feature to generate a read-only URL for stakeholders who do not have AWS Console access.
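
As an illustration of the Metric Math tip, the sketch below builds a widget that computes an error rate from two hidden source metrics. The namespace MyApp/Requests and the metric RequestCount are assumptions made for the example (this post only defines ErrorCount), and error_rate_widget is a hypothetical helper.

```python
def error_rate_widget(errors, requests, *, x, y, region="us-east-1"):
    """Build a singleValue widget showing errors / requests * 100 via Metric Math.

    errors and requests are metric rows such as ["MyApp/Errors", "ErrorCount"].
    """
    hidden = {"visible": False, "stat": "Sum"}
    return {
        "type": "metric",
        "x": x, "y": y, "width": 6, "height": 3,
        "properties": {
            "title": "Error Rate (%)",
            "view": "singleValue",
            "metrics": [
                # The expression refers to the ids assigned to the rows below.
                [{"expression": "100 * m1 / m2", "label": "Error rate", "id": "e1"}],
                [*errors, {"id": "m1", **hidden}],
                [*requests, {"id": "m2", **hidden}],
            ],
            "period": 300,
            "region": region,
        },
    }


widget = error_rate_widget(
    ["MyApp/Errors", "ErrorCount"],
    ["MyApp/Requests", "RequestCount"],  # hypothetical request-count metric
    x=6, y=6,
)
```

Hiding the two source rows keeps the tile to a single number while still letting the expression read both metrics.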

Glossary

Term | Definition
Namespace | A logical container for CloudWatch metrics (e.g., AWS/EC2, CWAgent). Prevents metric name collisions across services.
Dimension | A name/value pair that uniquely identifies a metric within a namespace (e.g., InstanceId=i-xxx).
Metric Filter | A CloudWatch Logs rule that scans log events for a pattern and increments a custom metric counter on each match.
Period | The length of time (in seconds) over which metric data is aggregated into a single data point.
Stat | The aggregation function applied over a period: Average, Sum, Maximum, Minimum, or percentile (e.g., p99).
