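Datadog is a monitoring and analytics platform for cloud applications. It automatically collects and analyzes logs, metrics, traces, and other data, and monitors infrastructure events and cloud service events, giving you observability across servers, applications, and the data it collects. To send Datadog alert messages to Log Service, configure the URL of the Log Service open alert endpoint in the Datadog Webhooks integration.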
Prerequisites
An open alert application that uses the Datadog protocol has been created. For details, see Configure open alert endpoints.
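Datadog configuration
- Log on to the Datadog console.
- Configure a webhook.
  - In the top navigation bar, choose [icon] > Integrations.
  - On the Integrations tab, find webhooks, hover over the webhooks tile, and click Install.
  - After the installation completes, hover over the webhooks tile again and click Configure.
  - In the Webhooks section, click New.
  - In the New Webhook section, configure the following parameters, and then click Save.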
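Parameter | Description |
---|---|
Name | The name of the webhook. |
URL | The receiver of the alert messages. Set this to the endpoint information (the full URL) that is generated after you create the open alert service and application in Log Service. To get it, see Obtain endpoint information. |
Payload | The content of the alert message. Datadog generates each alert message from this configuration. For more information about the alert message variables that Datadog provides, see the official Datadog documentation. |

When you configure Payload, note the following:
- The labels field must contain the tags field.
- The annotations field must contain the title, event_msg, and text_only_msg fields.
- Any other variables that Datadog provides and that are not otherwise used can be added to either the labels field or the annotations field, as you choose.
- All fields other than labels and annotations must be configured exactly as in the following example.
You can configure Payload as follows.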
{
  "alert_instance_id": "$ID",
  "alert_id": "$ALERT_ID",
  "alert_name": "$ALERT_TITLE",
  "alert_time": "$LAST_UPDATED",
  "fire_time": "$DATE",
  "resolve_time": "$DATE",
  "status": "$ALERT_TRANSITION",
  "labels": {
    "tags": "$TAGS"
  },
  "annotations": {
    "title": "$EVENT_TITLE",
    "event_msg": "$EVENT_MSG",
    "text_only_msg": "$TEXT_ONLY_MSG",
    "alert_metric": "$ALERT_METRIC",
    "alert_query": "$ALERT_QUERY",
    "alert_scope": "$ALERT_SCOPE",
    "alert_status": "$ALERT_STATUS",
    "alert_type": "$ALERT_TYPE",
    "email": "$EMAIL",
    "event_type": "$EVENT_TYPE",
    "hostname": "$HOSTNAME",
    "logs_sample": "$LOGS_SAMPLE",
    "metric_namespace": "$METRIC_NAMESPACE",
    "priority": "$PRIORITY",
    "user": "$USER",
    "username": "$USERNAME",
    "__aggreg_key__": "$AGGREG_KEY",
    "__alert_cycle_key__": "$ALERT_CYCLE_KEY",
    "__incident_attachments__": "$INCIDENT_ATTACHMENTS",
    "__incident_commander__": "$INCIDENT_COMMANDER",
    "__incident_customer_impact__": "$INCIDENT_CUSTOMER_IMPACT",
    "__incident_fildes__": "$INCIDENT_FIELDS",
    "__incident_public_id__": "$INCIDENT_PUBLIC_ID",
    "__incident_title": "$INCIDENT_TITLE",
    "__incident_url__": "$INCIDENT_URL",
    "__org_id__": "$ORG_ID",
    "__org_name__": "$ORG_NAME",
    "__security_rule_name__": "$SECURITY_RULE_NAME",
    "__security_signal_id__": "$SECURITY_SIGNAL_ID",
    "__security_signal_severity__": "$SECURITY_SIGNAL_SEVERITY",
    "__security_signal_title__": "$SECURITY_SIGNAL_TITLE",
    "__security_signal_msg__": "$SECURITY_SIGNAL_MSG",
    "__security_signal_attributes__": "$SECURITY_SIGNAL_ATTRIBUTES",
    "__security_rule_id__": "$SECURITY_RULE_ID",
    "__security_rule_query__": "$SECURITY_RULE_QUERY",
    "__security_rule_group_by_fields__": "$SECURITY_RULE_GROUP_BY_FIELDS",
    "__security_rule_type__": "$SECURITY_RULE_TYPE",
    "__link_snapshot_url__": "$SNAPSHOT",
    "__synthetics_test_name__": "$SYNTHETICS_TEST_NAME",
    "__synthetics_first_failing_step_name__": "$SYNTHETICS_FIRST_FAILING_STEP_NAME"
  },
  "severity": "$ALERT_PRIORITY",
  "drill_down_query": "$LINK"
}
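Before pasting a customized Payload into Datadog, it can help to check that the required fields are still present. The following is a minimal sketch of such a check, not part of Datadog or Log Service; the function name `missing_required_fields` and the `draft` template are hypothetical and exist only for illustration.

```python
import json

# Required keys, taken from the Payload notes in this document.
REQUIRED_LABELS = {"tags"}
REQUIRED_ANNOTATIONS = {"title", "event_msg", "text_only_msg"}


def missing_required_fields(payload_text):
    """Return a sorted list of required fields absent from a Payload template."""
    payload = json.loads(payload_text)
    labels = payload.get("labels") or {}
    annotations = payload.get("annotations") or {}
    missing = [f"labels.{key}" for key in REQUIRED_LABELS - set(labels)]
    missing += [
        f"annotations.{key}" for key in REQUIRED_ANNOTATIONS - set(annotations)
    ]
    return sorted(missing)


# Hypothetical, deliberately incomplete template used only for illustration.
draft = '{"labels": {"tags": "$TAGS"}, "annotations": {"title": "$EVENT_TITLE"}}'
print(missing_required_fields(draft))
```

An empty list means the template carries every field that the notes mark as required. The check does not verify the fields outside labels and annotations.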
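- Configure the notification channel.
  - In the top navigation bar, choose [icon] > Manage Monitors.
  - Click the [icon] for the target monitor.
  - Set Notify your team to the webhook that you created in the previous step (Configure a webhook).
  - Click Save.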
Datadog alert messages
If you add all of the otherwise unused Datadog variables to the annotations field, Log Service receives a Datadog alert message like the following.
{
"alert_instance_id": "123456",
"alert_id": "123456",
"alert_name": "STOP on host:abcdefgh",
"alert_time": "1628647425000",
"fire_time": "1628647425000",
"resolve_time": "1627561306000",
"status": "Triggered",
"labels": {
"tags": "ali,host:abcdefgh,monitor"
},
"annotations": {
"title": "[P1] [Triggered on {host:abcdefgh}] STOP",
"event_msg": "%%%
warning
host stop
@webhook-webhook-test-all
The monitor was last triggered at Thu Jul 29 2021 12:21:45 UTC.
- - -
[[Monitor Status](https://app.datadoghq.com/monitors/1234?to_ts=1234&group=host%3Aabcdefgh&from_ts=1627560405000)] · [[Edit Monitor](https://app.datadoghq.com/monitors#1234/edit)] · [[View abcdefgh](https://app.datadoghq.com/infrastructure?filter=abcdefgh)] · [[Show Processes](https://app.datadoghq.com/process?sort=memory%2CASC&to_ts=1234&tags=host%abcdefgh&from_ts=1627560405000&live=false&showSummaryGraphs=true)]
%%%",
"text_only_msg": "
warning
host stop
@webhook-webhook-test-all
Metric Graph: https://app.datadoghq.com/monitors/1234?to_ts=1627561365000&group=host%abcdefgh&from_ts=1627557705000 · Monitor Status: https://app.datadoghq.com/monitors/1234?group=host%abcdefgh · Edit Monitor: https://app.datadoghq.com/monitors#42655965/edit · Event URL: https://app.datadoghq.com/event/event?id=1234 · View abcdefgh: https://app.datadoghq.com/infrastructure?filter=abcdefgh · Show Processes: https://app.datadoghq.com/process?sort=memory%2CASC&to_ts=None&tags=host%abcdefgh&from_ts=None&live=false&showSummaryGraphs=true",
"alert_metric": "null",
"alert_query": "\"datadog.agent.up\".over(\"host:abcdefgh\").by(\"host\").last(2).count_by_status()",
"alert_scope": "host:abcdefgh",
"alert_status": "",
"alert_type": "error",
"email": "",
"event_type": "service_check",
"hostname": "abcdefgh",
"logs_sample": "null",
"metric_namespace": "",
"priority": "normal",
"user": "null",
"username": "",
"__aggreg_key__": "a1b2c3",
"__alert_cycle_key__": "123456789",
"__incident_attachments__": "null",
"__incident_commander__": "null",
"__incident_customer_impact__": "null",
"__incident_fildes__": "null",
"__incident_public_id__": "null",
"__incident_title": "null",
"__incident_url__": "null",
"__org_id__": "123",
"__org_name__": "ali",
"__security_rule_name__": "null",
"__security_signal_id__": "null",
"__security_signal_severity__": "null",
"__security_signal_title__": "null",
"__security_signal_msg__": "null",
"__security_signal_attributes__": "null",
"__security_rule_id__": "null",
"__security_rule_query__": "$SECURITY_RULE_QUERY",
"__security_rule_group_by_fields__": "null",
"__security_rule_type__": "null",
"__link_snapshot_url__": "null",
"__synthetics_test_name__": "null",
"__synthetics_first_failing_step_name__": "null"
},
"severity": "P1",
"drill_down_query": "https://app.datadoghq.com/event/event?id=123456"
}
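To exercise the integration without waiting for a monitor to fire, a message shaped like the one above could be posted to the endpoint by hand. The sketch below only builds such a request and does not send it. `ENDPOINT_URL` is a placeholder for the full URL of your open alert application, the `sample` message is a trimmed version of the example above, and sending JSON with a `Content-Type: application/json` header is an assumption based on how Datadog webhooks usually deliver payloads; whether the endpoint needs anything more is not stated in this document.

```python
import json
import urllib.request

# Placeholder: replace with the full URL generated for your open alert application.
ENDPOINT_URL = "https://example.invalid/your-open-alert-endpoint"


def build_test_request(url, message):
    """Build (but do not send) a JSON POST request carrying a Datadog-style message."""
    body = json.dumps(message).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Trimmed copy of the example message; values are the sample values shown above.
sample = {
    "alert_instance_id": "123456",
    "alert_id": "123456",
    "alert_name": "STOP on host:abcdefgh",
    "alert_time": "1628647425000",
    "fire_time": "1628647425000",
    "resolve_time": "1627561306000",
    "status": "Triggered",
    "labels": {"tags": "ali,host:abcdefgh,monitor"},
    "annotations": {
        "title": "[P1] [Triggered on {host:abcdefgh}] STOP",
        "event_msg": "warning\nhost stop",
        "text_only_msg": "warning\nhost stop",
    },
    "severity": "P1",
    "drill_down_query": "https://app.datadoghq.com/event/event?id=123456",
}

request = build_test_request(ENDPOINT_URL, sample)
# Passing `request` to urllib.request.urlopen would send it; that call is
# left out here so the sketch has no side effects.
print(request.get_method(), request.full_url)
```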
Field mapping
After a Datadog alert message is ingested into Log Service, it is mapped to Log Service alert content. For example:
{
"aliuid": "aliuid1",
"alert_instance_id": "123456",
"alert_id": "123456",
"alert_type": "sls_pub",
"alert_name": "STOP on host:abcdefgh",
"region": "",
"project": "",
"project_id": 0,
"next_eval_interval": 0,
"alert_time": 1628647425,
"fire_time": 1628647425,
"fire_results": null,
"fire_results_count": 0,
"resolve_time": 0,
"status": "firing",
"results": null,
"labels":{
"__ali__": "ali",
"__host__": "abcdefgh",
"__monitor__": "monitor"
},
"annotations":{
"__aggreg_key__": "1a2b3c4d",
"__alert_cycle_key__": "123456",
"__config_app__": "sls_pub_alert",
"__link_edit_monitor__": "https://app.datadoghq.com/monitors#1234/edit",
"__link_metric_graph__": "https://app.datadoghq.com/monitors/1234?to_ts=1628647485000&group=host%abcdefgh&from_ts=1628643825000",
"__link_monitor_status__": "https://app.datadoghq.com/monitors/123?group=host%abcdefgh",
"__link_show_processes__": "https://app.datadoghq.com/process?sort=memory%2CASC&to_ts=None&tags=host%abcdefgh&from_ts=None&live=false&showSummaryGraphs=true",
"__link_view_izbp****hqpwt26z__": "https://app.datadoghq.com/infrastructure?filter=abcdefgh",
"__org_id__": "579186",
"__org_name__": "ali",
"__pub_alert_app__": "",
"__pub_alert_protocol__": "datadog",
"__pub_alert_region__": "",
"__pub_alert_service__": "",
"alert_query": "\"datadog.agent.up\".over(\"host:abcdefgh\").by(\"host\").last(2).count_by_status()",
"alert_scope": "host:izbp1cerzh0yyvrhqpwt26z",
"alert_type": "error",
"desc": "warning
host stop
@webhook-test
The monitor was last triggered at Wed Aug 11 2021 02:03:45 UTC.
- - -
",
"event_type": "service_check",
"hostname": "abcdefgh",
"priority": "normal",
"title": "[P1] [Triggered on {host:abcdefgh}] STOP"
},
"severity": 10,
"policy":{
"alert_policy_id": "",
"action_policy_id": "",
"use_default": false,
"repeat_interval": "0s"
},
"template": null,
"drill_down_query": "https://app.datadoghq.com/event/event?id=123456"
}
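The `__link_*__` annotations in the example above correspond to the labelled links in the Datadog text_only_msg field. The following sketch shows one way such links could be pulled out of the text. It is an illustration inferred from the two examples, not the Log Service implementation: the function name `extract_links` and the key-naming rule are assumptions, and the real service may skip or rename entries (the example above has no entry for the Event URL link, for instance).

```python
import re

# Matches a segment of the form "Some Label: https://..." and nothing else.
LINK_PATTERN = re.compile(r"^([A-Za-z][\w ]*?):\s+(https?://\S+)$")


def extract_links(text_only_msg):
    """Collect 'Label: URL' pairs separated by a middle dot into __link_*__ keys."""
    links = {}
    for line in text_only_msg.splitlines():
        # Links on one line are separated by a middle dot (U+00B7).
        for segment in line.split("\u00b7"):
            match = LINK_PATTERN.match(segment.strip())
            if match:
                # "Metric Graph" becomes "metric_graph" (assumed naming rule).
                name = "_".join(match.group(1).lower().split())
                links[f"__link_{name}__"] = match.group(2)
    return links


# Shortened sample in the same shape as the text_only_msg example.
sample_text = (
    "warning\nhost stop\n"
    "Metric Graph: https://app.datadoghq.com/monitors/1234?group=x \u00b7 "
    "Edit Monitor: https://app.datadoghq.com/monitors#1234/edit"
)
for key, url in extract_links(sample_text).items():
    print(key, url)
```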
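Log Service | Datadog | Description |
---|---|---|
aliuid | None | The ID of the Alibaba Cloud account that owns the open alert application used to ingest the alert. |
alert_id | alert_id | The ID of the alert monitoring rule. |
alert_instance_id | alert_instance_id | The ID of the alert message. |
alert_type | None | The alert type. Fixed to sls_pub. |
alert_name | alert_name | The name of the alert monitoring rule. |
status | status | The alert status. In the example above, the Datadog status Triggered is mapped to firing. |
next_eval_interval | None | The alert evaluation interval. Fixed to 0. |
alert_time | alert_time | The time when the alert was triggered. |
fire_time | fire_time | The time when the alert was first triggered. |
resolve_time | resolve_time | The time when the alert was resolved. In the example above, where the status is firing, the value is 0. |
labels | labels | The alert labels. The value of the tags field in the labels field of the Datadog alert message is split on commas (,) into multiple strings. For example, the tags value ali,host:abcdefgh,monitor produces the labels __ali__, __host__, and __monitor__ shown in the example above. In addition, any other unused fields with non-empty values in the labels field of the Datadog alert message are added, with their values, to the labels field of the Log Service alert message. |
annotations | annotations | After a Datadog alert is ingested into Log Service, extra fields are added to the annotations field of the Log Service alert, for example __config_app__ and the __pub_alert_*__ fields in the example above. Other fields are parsed from the text_only_msg field of the Datadog alert message, for example the __link_*__ fields in the example above. In addition, any other unused fields with non-empty values in the annotations field of the Datadog alert message are added, with their values, to the annotations field of the Log Service alert message. |
severity | severity | The alert severity. Datadog severities are mapped to Log Service severities. In the example above, the Datadog severity P1 is mapped to 10. |
policy | None | The alert policy configured in the open alert application. For more information, see Policy structure. |
project | None | The project to which the alert center belongs. For more information, see Project. |
drill_down_query | drill_down_query | Click the link in the field value to go to the management page of the Datadog alert event. |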
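Two of the mappings in the table can be sketched in a few lines: splitting the Datadog tags value into labels, and converting the millisecond timestamps of the Datadog message into the second-based values seen in the Log Service example. Both functions are illustrations inferred from the examples in this document, not the Log Service implementation; the names `tags_to_labels` and `ms_to_seconds`, and the handling of tags without a colon, are assumptions.

```python
def tags_to_labels(tags):
    """Split a comma-separated Datadog tags string into label key/value pairs."""
    labels = {}
    for tag in tags.split(","):
        tag = tag.strip()
        if not tag:
            continue
        # "host:abcdefgh" -> key "host", value "abcdefgh".
        # A bare tag such as "ali" is used as both key and value (assumed
        # from the example, where "ali" becomes "__ali__": "ali").
        key, sep, value = tag.partition(":")
        labels[f"__{key}__"] = value if sep else key
    return labels


def ms_to_seconds(timestamp_ms):
    """Convert a millisecond timestamp string to whole seconds."""
    return int(timestamp_ms) // 1000


print(tags_to_labels("ali,host:abcdefgh,monitor"))
print(ms_to_seconds("1628647425000"))
```

With the sample values from this document, the first call yields the three labels shown in the mapped example, and the second yields 1628647425.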