Configuration
Application configuration is handled using pydantic-settings.
The bot reads environment variables from the system environment and from a .env file (dotenv), in that order.
All other settings are read from a config.yaml file. Almost every setting has a sensible default, so you can run the application without creating one — but you will want one as soon as you need to enable integrations or change any defaults.
A fully annotated config.yaml.sample is included in the repository root and covers every available option.
See the integrations documentation for details on enabling and configuring each integration.
Configurable Parameters
API
The application exposes a lightweight API that serves embeddable incident status widgets and a health check endpoint (GET /health). Widget routes are available under /api/v1/widgets.
To enable it, set the following in config.yaml:
Digest Channel
The digest channel receives updates about every incident the bot manages. The default channel is #incidents.
Pinned Images
Pinning images to incident channels is enabled by default.
Note
The API method used for pinning (files.sharedPublicURL) is restricted on free Slack plans.
Links
Custom quick-access buttons can be added to incident messages:
links:
  - title: Runbooks
    url: https://runbooks.example.com
  - title: Monitoring
    url: https://grafana.example.com
Options
options:
  # Messages posted to every new incident channel on creation.
  additional_welcome_messages:
    - message: "Welcome! Please review the runbook before taking action."
      pin: true # default: false
  # Prefix for incident channel names, e.g. "inc" → #inc-2024-01-15-outage
  channel_name_prefix: inc # default
  # Date format token used in channel names (dayjs / moment style).
  channel_name_date_format: YYYY-MM-DD # default
  # Prepend the date so channels sort chronologically.
  channel_name_use_date_prefix: false # default
  # A static meeting URL attached to every incident.
  # Leave unset when using the Zoom integration.
  meeting_link: https://meet.example.com/oncall
  # Pin the meeting link inside the incident channel on creation.
  pin_meeting_link_to_channel: false # default
  # Number of recent incidents shown on the Slack app home page.
  show_most_recent_incidents_app_home_limit: 5 # default
  # Items per page in paginated Slack list views.
  slack_items_pagination_per_page: 5 # default
  # Application timezone. Used for scheduling and all displayed timestamps.
  timezone: UTC # default
  # Post status updates directly to the digest channel instead of threading them.
  updates_in_threads: false # default
  # User-agents to suppress from API access logs (e.g. to silence health probes).
  skip_logs_for_user_agent:
    - kube-probe
Platform
The platform field controls which chat platform the bot connects to. Valid values are slack (default) and matrix.
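A minimal sketch of selecting the platform. It assumes platform is a top-level key in config.yaml; the text writes it as platform: slack and platform: matrix, but its exact placement in the file is not shown on this page, so confirm against config.yaml.sample.

```yaml
# config.yaml
# Assumption: `platform` sits at the top level of the file.
platform: matrix # valid values: slack (default) or matrix
```

Omitting the key should leave the bot on Slack, since slack is documented as the default.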
Slack
When platform: slack, the following environment variables are required:
- SLACK_APP_TOKEN: App-level token for WebSocket communication.
- SLACK_BOT_TOKEN: Bot-scoped OAuth token.
- SLACK_USER_TOKEN: User-scoped OAuth token.
Matrix
When platform: matrix, the following environment variables are required:
- MATRIX_HOMESERVER: URL of your Matrix homeserver (e.g. https://matrix.example.com).
- MATRIX_USER_ID: Full Matrix user ID for the bot (e.g. @incidentbot:example.com).
- MATRIX_ACCESS_TOKEN: Access token for the bot account.
- MATRIX_DIGEST_ROOM_ID: Room ID to use as the incident digest room.
Optional Matrix variables:
- MATRIX_DEVICE_ID: Device ID for the bot session (default: INCIDENTBOT).
- MATRIX_WIDGET_BASE_URL: Base URL to use for embedded incident widgets.
Roles
Define the roles participants can claim during an incident. Exactly one role must have is_lead: true — this is the Incident Commander role.
Note
Omit this section entirely to use the four built-in default roles.
roles:
  incident_commander:
    description: "Decision maker. Delegates tasks and drives resolution."
    is_lead: true
  scribe:
    description: "Maintains the incident timeline and tracks follow-up items."
  subject_matter_expert:
    description: "Domain expert for a specific service or component."
  communications_liaison:
    description: "Handles customer-facing communication during the incident."
Severities
Note
Omit this section entirely to use the four built-in default severities.
severities:
  sev1: "Critical - major impact on all users. All-hands response required."
  sev2: "High - significant degradation for a large portion of users."
  sev3: "Moderate - minor degradation; worth coordinating but not critical."
  sev4: "Investigating - potential issue, no confirmed impact yet. Default."
Statuses
Note
Omit this section entirely to use the four built-in default statuses. If you define custom statuses, exactly one must have initial: true and one must have final: true.
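As a rough illustration of the initial/final rule, here is a hypothetical statuses block. The layout is an assumption modeled on the roles section (a mapping of name to attributes); only the initial and final flags come from the note above. Status names other than monitoring, which appears in the jobs example, are placeholders. Check config.yaml.sample for the authoritative shape.

```yaml
# Hypothetical sketch: the structure is assumed, not confirmed by this page.
statuses:
  investigating:
    initial: true # exactly one status carries initial: true
  identified: {}
  monitoring: {}
  resolved:
    final: true # exactly one status carries final: true
```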
Slash Command
The bot's default slash command is /incidentbot. If you change this, update your Slack app manifest to match.
Jobs
Global scheduled maintenance jobs. All are enabled by default. Use this section to adjust intervals or disable individual jobs.
jobs:
  scrape_for_aging_incidents:
    enabled: true
    interval_days: 2 # how often to check (default: 2 days)
    max_age_days: 7 # incidents open longer than this trigger a digest alert
    ignore_statuses: # statuses excluded from the aging check
      - monitoring
  update_slack_cache:
    enabled: true
    interval_minutes: 15 # how often to refresh Slack channel/user lists
  update_pagerduty_oc_data:
    enabled: true
    interval_minutes: 30 # how often to refresh PagerDuty on-call data
    # only active when the PagerDuty integration is enabled
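A short sketch of adjusting an interval and disabling a single job. It uses only keys shown in this section, and assumes that keys you leave out keep their defaults; that merge behavior is not stated on this page, so verify it before relying on a partial block.

```yaml
jobs:
  update_slack_cache:
    interval_minutes: 60 # refresh Slack channel/user lists hourly instead of every 15 minutes
  update_pagerduty_oc_data:
    enabled: false # skip the PagerDuty on-call refresh entirely
  # Assumption: scrape_for_aging_incidents is omitted here and keeps its defaults.
```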
Reminders
Per-incident reminders are posted inside each incident channel on a repeating timer for as long as the incident is open. The system ships with two defaults — you can modify, remove, or extend them.
reminders:
  - id: comms_reminder
    message: "Time to send a status update?"
    interval_minutes: 30
    actions:
      - type: send_update
        label: "Send Update"
      - type: snooze
        intervals: [30, 60, 90]
      - type: dismiss
        label: "Stop Reminders"
  - id: role_watcher
    message: "No roles have been assigned yet - please review and claim as needed."
    interval_minutes: 10
    once: true
    conditions:
      no_roles_claimed: true
    include_role_buttons: true
    actions:
      - type: dismiss
        label: "Dismiss"
Reminder fields
| Field | Type | Default | Description |
|---|---|---|---|
| id | string | required | Unique identifier; used in Slack action IDs. |
| message | string | required | Text shown in the reminder post. |
| interval_minutes | int | required | How often the reminder fires. |
| once | bool | false | Fire at most once, then remove itself. |
| enabled | bool | true | Set to false to disable without removing the entry. |
| include_role_buttons | bool | false | Append a "Join as <role>" button for each configured role. |
| conditions | object | none | All conditions must be true for the reminder to fire. |
| actions | list | [] | Buttons shown on the reminder message. |
Conditions
All conditions in a conditions block must be satisfied for the reminder to fire.
| Condition | Type | Description |
|---|---|---|
| severity_is | list of strings | Only fire if the incident severity is in this list. |
| status_is | list of strings | Only fire if the incident status is in this list. |
| status_is_final | bool | Only fire when the incident is in a terminal status. |
| no_roles_claimed | bool | Only fire when no participant has claimed a role yet. |
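To show how several conditions combine, here is a hypothetical reminder that fires only when both conditions hold: the severity is sev1 or sev2 and the status is monitoring. The id and message are invented for illustration; the field names, condition names, and action types are the ones documented in this section.

```yaml
reminders:
  - id: monitoring_check # hypothetical id
    message: "Still stable? Consider sending an update or resolving the incident."
    interval_minutes: 60
    conditions:
      severity_is: [sev1, sev2] # must match one of these severities...
      status_is: [monitoring] # ...and the status must be monitoring
    actions:
      - type: send_update
        label: "Send Update"
      - type: dismiss
        label: "Stop Reminders"
```

A sev3 incident in monitoring, or a sev1 incident in any other status, would not trigger this reminder, because every condition in the block has to be satisfied.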
Action types
| Type | Fields | Description |
|---|---|---|
| send_update | label | Opens the status update modal. |
| snooze | intervals (list of minutes) | Reschedules the reminder after a chosen delay. |
| dismiss | label | Permanently stops this reminder for the incident. |
Automations
Automations fire once in response to incident lifecycle events, optionally filtered by conditions.
automations:
  # Invite the on-call team and page PagerDuty for all sev1 incidents
  - id: sev1_response
    trigger: on_open
    conditions:
      severity_is: [sev1]
    actions:
      - type: invite_group
        name: sre-oncall
      - type: page_pagerduty
        escalation_policy: PXXXXXX
        priority: high
  # Remind responders to update the status page for sev1/sev2
  - id: status_page_nudge
    trigger: on_open
    conditions:
      severity_is: [sev1, sev2]
    actions:
      - type: post_message
        message: "Remember to update the status page if customers may be affected."
  # Post a postmortem reminder when an incident resolves
  - id: postmortem_reminder
    trigger: on_final_status
    actions:
      - type: post_message
        message: "Incident resolved. Please open a postmortem within 48 hours."
Automation fields
| Field | Type | Default | Description |
|---|---|---|---|
| id | string | required | Unique identifier. |
| trigger | string | required | Lifecycle event that fires this automation (see below). |
| enabled | bool | true | Set to false to disable without removing the entry. |
| conditions | object | none | Same structure as reminder conditions. |
| actions | list | [] | Actions to execute when the automation fires. |
Triggers
| Trigger | Fires when |
|---|---|
| on_open | An incident is first created. |
| on_severity_change | The incident severity is updated. |
| on_status_change | The incident status is updated. |
| on_final_status | The incident reaches a terminal status. |
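The examples earlier in this section use on_open and on_final_status. As a sketch of one of the other triggers, this hypothetical automation reacts to a severity change. It assumes the severity_is condition is evaluated against the severity after the change, which this page does not spell out; the id and message are invented for illustration.

```yaml
automations:
  - id: sev1_escalation_notice # hypothetical id
    trigger: on_severity_change
    conditions:
      # Assumption: checked against the new severity, not the previous one.
      severity_is: [sev1]
    actions:
      - type: post_message
        message: "Severity raised to sev1. Please confirm the on-call team has been engaged."
```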
Automation action types
| Type | Fields | Description |
|---|---|---|
| invite_group | name | Invite a Slack user group into the incident channel. |
| page_pagerduty | escalation_policy, priority | Trigger a PagerDuty escalation. Requires the PagerDuty integration. priority: low (default) or high. |
| post_message | message | Post a plain-text message to the incident channel. |
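A brief sketch combining two details from the tables above: the enabled flag on an automation and the default priority for page_pagerduty. The id is hypothetical and PXXXXXX is the same placeholder escalation policy used in the earlier example.

```yaml
automations:
  - id: sev2_page # hypothetical id
    trigger: on_open
    enabled: false # kept in the file but switched off
    conditions:
      severity_is: [sev2]
    actions:
      - type: page_pagerduty
        escalation_policy: PXXXXXX
        # priority is omitted, so the documented default (low) applies
```

Setting enabled back to true (or deleting the line, since true is the default) reactivates the automation without any other changes.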