Monitoring & Management
AWS Health Dashboard
Service Status, Personal Health Events, EventBridge Integration
Overview
AWS Health Dashboard provides information about service health and about events that affect your AWS resources. Think of it as a "hospital" for AWS resources: it tells you when something is wrong and what you need to do about it.
AWS HEALTH DASHBOARD

SERVICE HEALTH (public: all of AWS)
  "All of EC2 in us-east-1 is experiencing an incident"
  "S3 is experiencing degraded performance"

YOUR ACCOUNT HEALTH (private: your account only)
  "Your EC2 instance i-1234567890abcdef0 is scheduled for retirement"
  "RDS instance prod-db needs a maintenance window"

Two Types of Dashboard
1. Service Health Dashboard (Public)
SERVICE HEALTH DASHBOARD
https://health.aws.amazon.com/

WHAT IT SHOWS:
• Current status of ALL AWS services
• Historical incidents
• Planned maintenance windows
• Service disruptions

SCOPE: Global view, no login required

| Service | Region | Status |
|---|---|---|
| Amazon EC2 | us-east-1 | ✅ Operational |
| Amazon S3 | us-west-2 | ⚠️ Degraded Performance |
| Amazon RDS | eu-west-1 | ✅ Operational |
| AWS Lambda | ap-south-1 | 🔴 Service Disruption |

LIMITATIONS:
• Does not tell you which of YOUR SPECIFIC RESOURCES are affected
• Shows service-level issues only
• No personalized alerts

2. Personal Health Dashboard (Account-specific)
PERSONAL HEALTH DASHBOARD
(AWS Console → Health)

WHAT IT SHOWS:
• Events affecting YOUR SPECIFIC RESOURCES
• Scheduled changes for your resources
• Account notifications
• Proactive recommendations

SCOPE: Account-specific, requires signing in to the AWS Console

⚠️ OPEN ISSUES

🔴 EC2 Instance Retirement
   Instance: i-1234567890abcdef0
   Region: us-east-1
   Retirement Date: 2024-02-15
   Action: Migrate to new instance

⚠️ RDS Maintenance Window
   Instance: prod-database
   Window: 2024-01-20 03:00-04:00 UTC
   Action: Plan for brief downtime

📅 SCHEDULED CHANGES

Certificate Expiration
   ACM Certificate: *.example.com
   Expires: 2024-03-01
   Action: Renew certificate

Event Types
Event Categories
AWS HEALTH EVENT TYPES

1. ACCOUNT NOTIFICATIONS
• General notifications about the account
• Service announcements
• Policy updates
• Billing alerts
Example: "AWS will deprecate Python 3.8 runtime in Lambda"

2. SCHEDULED CHANGES
• Planned maintenance
• Hardware retirement
• Software updates
• Certificate expirations
Example: "EC2 instance i-xxx scheduled for retirement on 2024-02-15"

3. ISSUES (Ongoing Problems)
• Active service issues
• Performance degradation
• Outages
• Resource-specific problems
Example: "Your EBS volume vol-xxx is impaired"

Event Status Timeline
EVENT LIFECYCLE

Open → Upcoming → Ongoing → Closed

Open:
• Event has been created
• Not yet started
• Requires action from the user

Upcoming:
• Scheduled but has not happened yet
• Usually maintenance windows

Ongoing:
• In progress
• AWS is working on it

Closed:
• Resolved
• Historical record

Note: these stages are conceptual. The Health API itself reports only three status codes (open, upcoming, closed), and an event that is in progress is returned with status open.

AWS Health API
API Overview
AWS HEALTH API

⚠️ IMPORTANT: Only available with a Business or Enterprise Support plan!

Programmatic access to:
• Personal Health Dashboard events
• Affected resources
• Event details and descriptions
• Historical events

Use cases:
• Build custom dashboards
• Integrate with alerting systems
• Automate responses to health events
• Feed into SIEM/monitoring tools

API Operations
| Operation | Description |
|---|---|
| DescribeEvents | List health events matching filter criteria |
| DescribeEventDetails | Get detailed info about specific events |
| DescribeAffectedEntities | Get resources affected by an event |
| DescribeEventTypes | List available event types |
| DescribeEventAggregates | Get aggregated event counts |
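These list-style operations return results in pages: when more results remain, the response carries a `nextToken` that is passed back on the next call. As a minimal sketch of walking every page of DescribeEvents, the helper below (the name `iter_health_events` is hypothetical) takes an already-constructed boto3 Health client as a parameter instead of creating one, so it can also be exercised with a stub.

```python
from typing import Any, Dict, Iterator, Optional


def iter_health_events(
    client: Any, event_filter: Optional[Dict[str, Any]] = None
) -> Iterator[Dict[str, Any]]:
    """Yield every event returned by DescribeEvents, following nextToken.

    `client` is assumed to be a boto3 Health client, for example
    boto3.client('health', region_name='us-east-1').
    """
    kwargs: Dict[str, Any] = {}
    if event_filter:
        kwargs["filter"] = event_filter

    while True:
        page = client.describe_events(**kwargs)
        yield from page.get("events", [])

        # A missing or empty nextToken means this was the last page.
        token = page.get("nextToken")
        if not token:
            break
        kwargs["nextToken"] = token
```

With a real client this would be used as `for event in iter_health_events(health, {'eventStatusCodes': ['open']}): ...`, so callers never handle tokens themselves.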
Code Examples
Python (boto3):
import boto3
from datetime import datetime, timedelta, timezone

# Create the Health client.
# Note: the Health API is served from a global endpoint whose active
# region is us-east-1, so the client is pinned to that region.
health = boto3.client('health', region_name='us-east-1')

# Get recent events
events = health.describe_events(
    filter={
        'eventStatusCodes': ['open', 'upcoming'],
        'eventTypeCategories': ['scheduledChange', 'issue'],
        'startTimes': [
            {
                'from': datetime.now(timezone.utc) - timedelta(days=7)
            }
        ]
    }
)
for event in events['events']:
    print(f"Event: {event['eventTypeCode']}")
    print(f"Service: {event['service']}")
    print(f"Region: {event.get('region', 'global')}")
    print(f"Status: {event['statusCode']}")
    print("---")

# Get affected resources for an event
affected = health.describe_affected_entities(
    filter={
        'eventArns': ['arn:aws:health:us-east-1::event/EC2/...']
    }
)
for entity in affected['entities']:
    print(f"Resource: {entity['entityValue']}")
    print(f"Status: {entity.get('statusCode', 'UNKNOWN')}")
AWS CLI:
# List open events
aws health describe-events \
    --region us-east-1 \
    --filter "eventStatusCodes=open,upcoming"

# Get event details
aws health describe-event-details \
    --region us-east-1 \
    --event-arns "arn:aws:health:us-east-1::event/EC2/..."

# Get affected entities
aws health describe-affected-entities \
    --region us-east-1 \
    --filter "eventArns=arn:aws:health:..."
EventBridge Integration
Automated Response to Health Events
HEALTH + EVENTBRIDGE INTEGRATION

AWS Health Event → (auto) → EventBridge Rule → Targets:
• Lambda
• SNS
• SQS
• Step Functions
• SSM Automation

Example Flow:

EC2 Retirement Notification → EventBridge Rule → Lambda Function → Create New Instance + Migrate Data
EC2 Retirement Notification → EventBridge Rule → SNS → Email/Slack Notification

EventBridge Rule Pattern
{
  "source": ["aws.health"],
  "detail-type": ["AWS Health Event"],
  "detail": {
    "service": ["EC2"],
    "eventTypeCategory": ["scheduledChange"],
    "eventTypeCode": ["AWS_EC2_INSTANCE_RETIREMENT_SCHEDULED"]
  }
}
Lambda Handler Example
import json
import boto3


def lambda_handler(event, context):
    """
    Handle AWS Health Event from EventBridge
    """
    print(f"Received Health Event: {json.dumps(event)}")
    detail = event['detail']
    event_type = detail['eventTypeCode']
    service = detail['service']

    # Get affected resources
    affected_entities = detail.get('affectedEntities', [])

    if event_type == 'AWS_EC2_INSTANCE_RETIREMENT_SCHEDULED':
        for entity in affected_entities:
            instance_id = entity['entityValue']
            handle_ec2_retirement(instance_id)
    elif event_type == 'AWS_RDS_MAINTENANCE_SCHEDULED':
        for entity in affected_entities:
            db_instance = entity['entityValue']
            notify_team_about_maintenance(db_instance, detail)

    return {'statusCode': 200}


def handle_ec2_retirement(instance_id):
    """
    Automated response to EC2 retirement
    """
    ec2 = boto3.client('ec2')
    sns = boto3.client('sns')

    # Get instance details
    response = ec2.describe_instances(InstanceIds=[instance_id])
    instance = response['Reservations'][0]['Instances'][0]

    # Send notification
    sns.publish(
        TopicArn='arn:aws:sns:us-east-1:123456789012:ops-alerts',
        Subject=f'EC2 Retirement Alert: {instance_id}',
        Message=f'''
        Instance {instance_id} is scheduled for retirement.

        Instance Details:
        - Type: {instance['InstanceType']}
        - AZ: {instance['Placement']['AvailabilityZone']}
        - Private IP: {instance.get('PrivateIpAddress', 'N/A')}

        Action Required:
        1. Create a new instance
        2. Migrate workloads
        3. Update DNS/Load Balancer
        '''
    )

    # Optional: Create AMI backup.
    # NoReboot=True avoids the reboot that create_image triggers by
    # default; the image is then not guaranteed to be file-system consistent.
    ec2.create_image(
        InstanceId=instance_id,
        Name=f'retirement-backup-{instance_id}',
        Description='Automated backup before retirement',
        NoReboot=True
    )


def notify_team_about_maintenance(db_instance, event_detail):
    """
    Notify team about scheduled RDS maintenance
    """
    sns = boto3.client('sns')
    sns.publish(
        TopicArn='arn:aws:sns:us-east-1:123456789012:ops-alerts',
        Subject=f'RDS Maintenance Scheduled: {db_instance}',
        Message=f'''
        Database {db_instance} has scheduled maintenance.

        Details:
        {json.dumps(event_detail, indent=2)}

        Please ensure:
        1. Application can handle brief downtime
        2. Maintenance window is acceptable
        3. Modify maintenance window if needed
        '''
    )
AWS Organizations Health
Aggregated View Across Accounts
AWS ORGANIZATIONS + HEALTH

Management Account (Org Master)
  Organization Health Dashboard
  Aggregated Health Events from ALL member accounts

  Account: Production (123456789012)
  • 2 EC2 retirements scheduled
  • 1 RDS maintenance

  Account: Development (234567890123)
  • 0 open issues

  Account: Staging (345678901234)
  • 1 certificate expiring

API: DescribeEventsForOrganization
(Requires enabling Organizational Health in AWS Organizations)

Enable Organizational Health
import boto3

# Run this from the organization's management account.
health = boto3.client('health', region_name='us-east-1')

# Enable the organizational view (health access for the organization)
health.enable_health_service_access_for_organization()

# Query health events across all accounts
events = health.describe_events_for_organization(
    filter={
        'eventStatusCodes': ['open', 'upcoming'],
    }
)
for event in events['events']:
    # Organization events do not carry an account ID, so the affected
    # accounts are looked up per event.
    accounts = health.describe_affected_accounts_for_organization(
        eventArn=event['arn']
    )['affectedAccounts']
    print(f"Accounts: {', '.join(accounts) or 'N/A'}")
    print(f"Service: {event['service']}")
    print(f"Event: {event['eventTypeCode']}")
    print("---")
Common Event Types
EC2 Events
| Event Type Code | Description | Severity |
|---|---|---|
| AWS_EC2_INSTANCE_RETIREMENT_SCHEDULED | Instance will be retired | High |
| AWS_EC2_INSTANCE_STORE_DRIVE_PERFORMANCE_DEGRADED | Disk performance issue | Medium |
| AWS_EC2_SYSTEM_MAINTENANCE_EVENT | Planned maintenance | Low |
| AWS_EC2_PERSISTENT_INSTANCE_RETIREMENT | Must migrate immediately | Critical |
RDS Events
| Event Type Code | Description | Severity |
|---|---|---|
| AWS_RDS_MAINTENANCE_SCHEDULED | Scheduled maintenance window | Low |
| AWS_RDS_HARDWARE_MAINTENANCE | Hardware needs replacement | Medium |
| AWS_RDS_SECURITY_NOTIFICATION | Security-related update | Critical |
EBS Events
| Event Type Code | Description | Severity |
|---|---|---|
| AWS_EBS_VOLUME_ISSUE | Volume impaired | Critical |
| AWS_EBS_VOLUME_IO_PERFORMANCE_ISSUE | I/O degradation | High |
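The tables above can be turned into a small lookup that routes an incoming event by its type code. This is a minimal sketch: the codes and severities are copied from the tables as listed (they are worth checking against DescribeEventTypes for your account), while the names `SEVERITY_BY_EVENT_TYPE`, `severity_for`, and `should_page`, the `"Unknown"` fallback, and the rule of paging only on Critical are assumptions made for illustration.

```python
# Severity per event type code, copied from the EC2, RDS, and EBS tables.
SEVERITY_BY_EVENT_TYPE = {
    # EC2
    "AWS_EC2_INSTANCE_RETIREMENT_SCHEDULED": "High",
    "AWS_EC2_INSTANCE_STORE_DRIVE_PERFORMANCE_DEGRADED": "Medium",
    "AWS_EC2_SYSTEM_MAINTENANCE_EVENT": "Low",
    "AWS_EC2_PERSISTENT_INSTANCE_RETIREMENT": "Critical",
    # RDS
    "AWS_RDS_MAINTENANCE_SCHEDULED": "Low",
    "AWS_RDS_HARDWARE_MAINTENANCE": "Medium",
    "AWS_RDS_SECURITY_NOTIFICATION": "Critical",
    # EBS
    "AWS_EBS_VOLUME_ISSUE": "Critical",
    "AWS_EBS_VOLUME_IO_PERFORMANCE_ISSUE": "High",
}


def severity_for(event_type_code: str) -> str:
    """Return the severity from the tables, or "Unknown" for unlisted codes."""
    return SEVERITY_BY_EVENT_TYPE.get(event_type_code, "Unknown")


def should_page(event_type_code: str) -> bool:
    """Assumed policy: page the on-call engineer only for Critical events."""
    return severity_for(event_type_code) == "Critical"
```

Keeping the mapping as data rather than a chain of `if` statements makes it easy to extend when new event type codes need handling.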
Terraform Configuration
Create EventBridge Rule for Health Events
# EventBridge Rule for Health Events
resource "aws_cloudwatch_event_rule" "health_events" {
  name        = "capture-health-events"
  description = "Capture all AWS Health events"

  event_pattern = jsonencode({
    source      = ["aws.health"]
    detail-type = ["AWS Health Event"]
  })
}

# SNS Topic for notifications
resource "aws_sns_topic" "health_alerts" {
  name = "aws-health-alerts"
}

# EventBridge Target - SNS
resource "aws_cloudwatch_event_target" "health_to_sns" {
  rule      = aws_cloudwatch_event_rule.health_events.name
  target_id = "send-to-sns"
  arn       = aws_sns_topic.health_alerts.arn

  input_transformer {
    input_paths = {
      eventTypeCode = "$.detail.eventTypeCode"
      service       = "$.detail.service"
      region        = "$.region"
      description   = "$.detail.eventDescription[0].latestDescription"
    }
    input_template = <<EOF
{
"message": "AWS Health Alert: <eventTypeCode>",
"service": "<service>",
"region": "<region>",
"description": "<description>"
}
EOF
  }
}

# Allow EventBridge to publish to SNS
resource "aws_sns_topic_policy" "health_alerts_policy" {
  arn = aws_sns_topic.health_alerts.arn

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Effect = "Allow"
        Principal = {
          Service = "events.amazonaws.com"
        }
        Action   = "sns:Publish"
        Resource = aws_sns_topic.health_alerts.arn
        Condition = {
          ArnEquals = {
            "aws:SourceArn" = aws_cloudwatch_event_rule.health_events.arn
          }
        }
      }
    ]
  })
}

# Lambda for automated response
# Note: aws_iam_role.health_lambda_role is not defined in this snippet
# and must be declared elsewhere in the configuration.
resource "aws_lambda_function" "health_handler" {
  filename      = "health_handler.zip"
  function_name = "health-event-handler"
  role          = aws_iam_role.health_lambda_role.arn
  # Assumes the Python handler shown earlier is packaged as index.py,
  # where the entry point is named lambda_handler.
  handler       = "index.lambda_handler"
  runtime       = "python3.11"
  timeout       = 60

  environment {
    variables = {
      SNS_TOPIC_ARN = aws_sns_topic.health_alerts.arn
    }
  }
}

# EventBridge Target - Lambda
resource "aws_cloudwatch_event_target" "health_to_lambda" {
  rule      = aws_cloudwatch_event_rule.health_events.name
  target_id = "invoke-lambda"
  arn       = aws_lambda_function.health_handler.arn
}

# Allow EventBridge to invoke Lambda
resource "aws_lambda_permission" "allow_eventbridge" {
  statement_id  = "AllowExecutionFromEventBridge"
  action        = "lambda:InvokeFunction"
  function_name = aws_lambda_function.health_handler.function_name
  principal     = "events.amazonaws.com"
  source_arn    = aws_cloudwatch_event_rule.health_events.arn
}
Best Practices
1. Monitoring & Alerting
BEST PRACTICES

✅ DO:
• Set up EventBridge rules for critical event types
• Create SNS topics for different severity levels
• Automate responses where possible (EC2 retirement → create AMI)
• Use Organization Health for multi-account visibility
• Check Health Dashboard during outages before debugging
• Integrate with incident management (PagerDuty, OpsGenie)

❌ DON'T:
• Ignore scheduled maintenance notifications
• Wait until retirement date to migrate resources
• Overlook certificate expiration warnings
• Skip Health Dashboard check during troubleshooting

2. Prioritization Matrix
| Event Category | Response Time | Action |
|---|---|---|
| Critical Issues | Immediate | Page on-call, investigate |
| Hardware Retirement | Within 24h | Plan migration |
| Scheduled Maintenance | Before window | Prepare, notify stakeholders |
| Account Notifications | Weekly | Review and plan |
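One way to make the matrix actionable is to compute a response deadline for each event. The sketch below is illustrative only: the function name `response_deadline` and the category keys are hypothetical, "Weekly" is interpreted as "within 7 days of detection", and "Before window" as "no later than the start of the maintenance window".

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def response_deadline(
    category: str,
    detected_at: datetime,
    window_start: Optional[datetime] = None,
) -> datetime:
    """Return the latest time by which the team should have responded."""
    if category == "critical_issue":
        # Immediate: page on-call and investigate right away.
        return detected_at
    if category == "hardware_retirement":
        # Within 24h: plan the migration.
        return detected_at + timedelta(hours=24)
    if category == "scheduled_maintenance":
        # Before window: prepare and notify stakeholders in advance.
        if window_start is None:
            raise ValueError("scheduled_maintenance requires window_start")
        return window_start
    if category == "account_notification":
        # Weekly: reviewed in the next weekly pass (assumed to be 7 days).
        return detected_at + timedelta(days=7)
    raise ValueError(f"unknown category: {category}")


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    print(response_deadline("hardware_retirement", now))
```

A deadline computed this way can be attached to the ticket or alert created for the event, so overdue items are easy to spot.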
3. Automation Checklist
AUTOMATION RECOMMENDATIONS

EC2 Retirement Events:
• Auto-create AMI backup
• Notify team via Slack/Email
• Create Jira ticket for migration
• Update CMDB/inventory

RDS Maintenance:
• Send calendar invite for maintenance window
• Notify application owners
• Check if maintenance window is acceptable

Certificate Expiration:
• Alert 30 days before expiry
• Auto-renew if using ACM managed certificates
• Create ticket for manual renewal if needed

EBS Volume Issues:
• Page on-call immediately
• Create snapshot automatically
• Prepare replacement volume

Comparison with Other Services
HEALTH DASHBOARD vs OTHER SERVICES

| Service | Purpose | Scope |
|---|---|---|
| Health Dashboard | AWS service issues, affected resources | Infrastructure health |
| CloudWatch | Metrics & alarms, logs analysis | Application monitoring |
| CloudTrail | API audit logs, security & compliance | Who did what |
| X-Ray | Distributed tracing, performance analysis | Request flow debugging |
| Systems Manager | Operations management, patch management | Resource management |

TIP: Use ALL of them together for complete observability!

Summary
AWS HEALTH DASHBOARD SUMMARY

TWO DASHBOARDS:
• Service Health (Public) - Status of all AWS services
• Personal Health (Private) - Issues affecting YOUR resources

THREE EVENT TYPES:
• Account Notifications - General announcements
• Scheduled Changes - Maintenance, retirements
• Issues - Ongoing problems

KEY INTEGRATIONS:
• EventBridge → Automated responses
• SNS → Notifications (Email, Slack, PagerDuty)
• Lambda → Custom automation
• Organizations → Multi-account visibility

⚠️ IMPORTANT:
• Health API requires Business/Enterprise Support
• Always check Health Dashboard first during outages
• Automate responses to critical events