Health Checks
Health checks emit two events: check.passed and check.failed.
notification.yaml
apiVersion: mission-control.flanksource.com/v1
kind: Notification
metadata:
  name: api-http-fail-alert
  namespace: default
spec:
  events:
    - check.failed
  filter: check.type == 'http'
  title: API HTTP Check {{.check.name}} failing
  body: |
    ## Check Failed
    Error: {{.status.error}}
    Failed at {{.status.created_at}}
  to:
    email: alerts@acme.com
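
A companion notification for recoveries could subscribe to check.passed instead. The sketch below reuses only the fields shown in the example above. The resource name is hypothetical, and it assumes status.message is populated for passing checks, as its description in the CheckStatus table suggests.

```yaml
apiVersion: mission-control.flanksource.com/v1
kind: Notification
metadata:
  name: api-http-pass-alert # hypothetical name, chosen for illustration
  namespace: default
spec:
  events:
    - check.passed
  filter: check.type == 'http'
  title: API HTTP Check {{.check.name}} recovered
  body: |
    ## Check Passed
    Message: {{.status.message}}
    Passed at {{.status.created_at}}
  to:
    email: alerts@acme.com
```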

Default Templates
check.passed
Title
{{ if ne channel "slack"}}Check {{.check.name}} has passed{{end}}
Template
{{ if eq .channel "slack"}}
{
	"blocks": [
    {{slackSectionTextMD (printf `:large_green_circle: *%s* is _healthy_` .canary.name)}},
    {"type": "divider"},
    {{ if .check_status.message}}{{slackSectionTextMD check_status.message}},{{end}}
		{
			"type": "section",
			"fields": [
				{{slackSectionTextFieldMD (printf `*Canary*: %s` .canary.name) }},
				{{slackSectionTextFieldMD (printf `*Namespace*: %s` .canary.namespace) }}
				{{if ne .agent.name "local"}}
					,{{slackSectionTextFieldMD (printf `*Agent*: %s` .agent.name) }}
				{{end}}
			]
		},
    {{ if .check.labels}}{{slackSectionLabels .check}},{{end}}
		{{if .groupedResources}}{{slackSectionTextMD (printf `*Resources grouped with notification:* %s` (join .groupedResources "\n"))}},{{end}}
    {{ slackURLAction "View Health Check" .permalink "🔕 Silence" .silenceURL}}
  ]
}
{{ else }}
Canary: {{.canary.name}}
{{if .agent}}Agent: {{.agent.name}}{{end}}
{{if .check_status.message}}Message: {{.check_status.message}} {{end}}
{{labelsFormat .check.labels}}
[Reference]({{.permalink}})
{{end}}
check.failed
Title
{{ if ne channel "slack"}}Check {{.check.name}} has failed{{end}}
Template
{{ if eq channel "slack"}}
{
	"blocks": [
		{{slackSectionTextMD (printf `:red_circle: *%s* is _unhealthy_` .check.name)}},
    {"type": "divider"},
		{{ if .check_status.error}}{{slackSectionTextMD check_status.error}},{{end}}
		{
			"type": "section",
			"fields": [
				{{slackSectionTextFieldMD (printf `*Canary*: %s` .canary.name) }},
				{{slackSectionTextFieldMD (printf `*Namespace*: %s` .canary.namespace) }}
				{{if ne .agent.name "local"}}
					,{{slackSectionTextFieldMD (printf `*Agent*: %s` .agent.name) }}
				{{end}}
			]
		},
		{{ if .check.labels}}{{slackSectionLabels .check}},{{end}}
		{{if .groupedResources}}{{slackSectionTextMD (printf `*Resources grouped with notification:* %s` (join .groupedResources "\n"))}},{{end}}
		{{ slackURLAction "View Health Check" .permalink "🔕 Silence" .silenceURL}}
	]
}
{{ else }}
Canary: {{.canary.name}}
{{if .agent}}Agent: {{.agent.name}}{{end}}
Error: {{.check_status.error}}
{{labelsFormat .check.labels}}
[Reference]({{.permalink}})
{{end}}
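
Assuming the default templates above apply whenever a notification leaves out title and body (this page does not state that explicitly), a notification that relies on them could be as short as the following sketch. The resource name is hypothetical.

```yaml
apiVersion: mission-control.flanksource.com/v1
kind: Notification
metadata:
  name: http-check-default-templates # hypothetical name
  namespace: default
spec:
  events:
    - check.passed
    - check.failed
  filter: check.type == 'http'
  # title and body are omitted on the assumption that the defaults are used
  to:
    email: alerts@acme.com
```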
Template Variables
| Field | Description | Scheme |
|---|---|---|
| agent | Details of the agent that created the config | Agent |
| canary | The canary the check belongs to | Canary |
| check | The health check that emitted the event | Check |
| permalink | Link to the Catalog in Mission Control | string |
| status | The status of the check | CheckStatus |
Agent
| Field | Description | Scheme |
|---|---|---|
| description | Short description of the agent | string |
| id | The id of the agent | |
| name | The name of the agent | string |
Canary
| Field | Description | Scheme |
|---|---|---|
| created_at | When the canary was created | string |
| deleted_at | When the canary was deleted | string |
| id | The id of the canary | |
| labels | The labels of the canary | |
| name | The name of the canary | string |
| namespace | The namespace of the canary | string |
| source | The source of the canary | string |
| updated_at | When the canary was last updated | string |
Check
| Field | Description | Scheme |
|---|---|---|
| created_at | When the check was created | |
| deleted_at | When the check was deleted | |
| description | The description of the check | string |
| id | The id of the check | |
| labels | The labels of the check | |
| last_runtime | The last runtime of the check | |
| last_transition_time | The last transition time of the check | |
| latency | Latency summary for the past hour | Latency |
| name | The name of the check | string |
| next_runtime | The next runtime of the check | |
| severity | The severity of the check | string |
| status | Check status details | string |
| transformed | Whether the check has been transformed | bool |
| type | The type of the check | string |
| updated_at | When the check was last updated | |
| uptime | Uptime summary for the past hour | Uptime |
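
The filter in the earlier notification example is an expression over the check variable, so other Check fields should be usable in the same way. A minimal sketch, assuming the filter language accepts boolean operators such as && and that 'critical' is a valid severity value (neither is confirmed on this page):

```yaml
spec:
  events:
    - check.failed
  # assumption: && is supported and 'critical' is a real severity value
  filter: "check.type == 'http' && check.severity == 'critical'"
```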
CheckStatus
| Field | Description | Scheme |
|---|---|---|
| check_id | The id of the check associated with this status | |
| created_at | When the check status was created | |
| duration | The duration of the check | int |
| error | The error of the check in case of failure | string |
| invalid | Whether the check errored out | bool |
| message | The success message of the check | string |
| status | The status of the check | bool |
| time | The time of the check | string |
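
As an illustration of how the CheckStatus fields could be combined in a custom body, the fragment below follows the .status prefix used in the notification example at the top of this page. It is a sketch rather than a recommended layout, and the unit of duration is not stated here, so the value is printed as is.

```yaml
spec:
  events:
    - check.failed
  title: Check {{.check.name}} failed
  body: |
    Canary: {{.canary.namespace}}/{{.canary.name}}
    Error: {{.status.error}}
    Duration: {{.status.duration}}
    Checked at: {{.status.time}}
    [Open in Mission Control]({{.permalink}})
```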
Uptime
| Field | Description | Scheme |
|---|---|---|
| failed | The number of checks that failed | int |
| last_fail | The last time a check failed | |
| last_pass | The last time a check passed | |
| p100 | The percentage of checks that passed | |
| passed | The number of checks that passed | int |
Latency
| Field | Description | Scheme |
|---|---|---|
| p95 | The 95th percentile latency of the check | |
| p97 | The 97th percentile latency of the check | |
| p99 | The 99th percentile latency of the check | |
| rolling1h | The rolling one-hour latency of the check | |
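
The Uptime and Latency fields could add recent history to a message. The fragment below assumes they are reachable as .check.uptime and .check.latency, which is inferred from the Check table and not confirmed on this page.

```yaml
spec:
  body: |
    Passed: {{.check.uptime.passed}}, failed: {{.check.uptime.failed}} (past hour)
    p99 latency: {{.check.latency.p99}}
```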