Incident Command Center Overview
Every incident on FireHydrant has a home page, which we call the Command Center. This page collects all information and activity about the incident, and users can run the incident from here if Slack is down or otherwise unavailable.
After you've started an incident, the link to the Command Center is available from multiple locations:
- As a bookmark in the incident's Slack channel, if the channel has been created
- As a link on a notification message in Slack, if a notification has been posted
- On the Incidents page under Active Incidents
- On your Dashboard page, in either the "My Active" or "All Active" tab, or at the very top if you are watching an incident
Each incident is also accessible from other locations across the application, including Service Catalog, so you can usually reach an incident from wherever you are working.
The Command Center is split into two sections: the Details panel on the right side and the main section, which takes up most of the page.
The Details panel shows some of the high-level details of the incident:
- Description is a general summary of the incident. Response teams typically use it as a brief overview for themselves, but it can serve other purposes via Liquid templating.
- Links shows any integration links, such as Slack, Jira tickets, or status pages, as well as external links that users add manually.
- Related Incidents lets responders mark multiple incidents as related to each other.
- Responders shows the assigned teams and the team members involved in the incident, along with their assigned roles, if relevant.
- Impact denotes which Catalog Items are impacted during the incident.
- Customer Support Issues shows any linked support tickets.
- Tags and Labels allow you to track and organize custom data about your incidents.
The main section of the page consists of a title bar at the top and multiple tabs, each serving a different purpose.
The title bar contains high-level details such as the name of the incident, who opened it and when, and who is currently viewing it. It also shows further key details about the incident, which can be modified:
- Priority is one way to describe how major or urgent an incident is.
- Severity is another way to describe how major or urgent an incident is.
- Milestone is the current status of the incident. You can also click it to modify the timestamps of previous milestones.
- Incident Slack channel, if one exists.
- Internal status page, which every incident on FireHydrant has.
Note: FireHydrant users may use only Severity, only Priority, or both. It's up to you to decide what makes the most sense, but Severities have much more functionality across the platform overall.
The Incident Timeline is a running timeline of all events that have occurred throughout the incident. Things we track include:
- Runbook steps executing
- Users performing actions like posting notes, updating task completion, etc.
- Any messages or images posted either to the Slack channel or in the user interface
To read more about the timeline, see Incident Timeline.
You can directly manage Tasks and Follow-Ups from this page as well as from Slack.
The Status Pages tab shows all attached (active) status pages for the incident. By default, FireHydrant won't post automatically to a Statuspage, but you can automate this via Runbooks.
FireHydrant also lets you post directly to your status page(s) from Slack.
The Runbooks tab shows all attached Runbooks, their steps, and the statuses of each step.
This lets you see which Runbooks are running on this particular incident and whether each step errored or executed successfully. It is useful both for keeping tabs on each incident's automation and for debugging.
The Linked Alerts tab shows any alerts linked to this incident from your alerting provider. If you use Alert Routing to create incidents on FireHydrant, then the corresponding alert will automatically be attached to the incident.
The final tab, Change Events, shows any recent changes to your system that FireHydrant automatically associated with the incident based on the impacted Catalog Items.
Alongside out-of-the-box integrations for GitHub and Kubernetes, FireHydrant has both a robust API and a CLI tool that let you automate logging changes to your systems from other sources.
Examples include logging changes from Continuous Integration workflows and from serverless function webhooks that fire when infrastructure changes are detected.
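As a rough illustration of the Continuous Integration case, the sketch below assembles a change-event payload from CI environment variables. It is a minimal sketch, not FireHydrant's actual schema: the function name `build_change_event`, the payload field names, and the environment variable names (`CI_COMMIT_SHA`, `DEPLOY_ENV`, `IMPACTED_SERVICES`) are all assumptions made for the example. Consult the FireHydrant API or CLI documentation for the real request format before sending anything.

```python
import json
from datetime import datetime, timezone


def build_change_event(env, now=None):
    """Build a change-event payload from CI environment variables.

    NOTE: every field name below is a hypothetical placeholder,
    not FireHydrant's documented schema.
    """
    now = now or datetime.now(timezone.utc)
    sha = env.get("CI_COMMIT_SHA", "unknown")[:7]
    target = env.get("DEPLOY_ENV", "unknown")
    # Comma-separated list of affected services, e.g. "checkout-api, payments"
    services = [s.strip() for s in env.get("IMPACTED_SERVICES", "").split(",") if s.strip()]
    return {
        "summary": f"Deploy {sha} to {target}",
        "starts_at": now.isoformat(),
        "catalog_items": services,
    }


if __name__ == "__main__":
    import os

    # Print the payload; a CI step could pipe this to the API or CLI.
    print(json.dumps(build_change_event(os.environ), indent=2))
```

Keeping payload construction separate from the network call makes the step easy to test in CI without credentials.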
Associating change events with incidents can potentially help your team identify contributing factors for the incident faster.
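To make the idea of association concrete, here is a minimal sketch of one way change events could be matched to an incident: keep the events that touch at least one impacted Catalog Item and occurred within a lookback window before the incident started. This is an illustrative assumption, not a description of FireHydrant's actual matching rules; the `ChangeEvent` class, the `related_changes` function, and the 24-hour default window are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class ChangeEvent:
    """Hypothetical in-memory representation of a logged change."""
    summary: str
    occurred_at: datetime
    catalog_items: frozenset


def related_changes(events, impacted_items, incident_start,
                    lookback=timedelta(hours=24)):
    """Return events that share a Catalog Item with the incident and
    occurred within `lookback` before it started, newest first."""
    impacted = set(impacted_items)
    window_start = incident_start - lookback
    matches = [
        e for e in events
        if window_start <= e.occurred_at <= incident_start
        and impacted & e.catalog_items  # non-empty overlap
    ]
    return sorted(matches, key=lambda e: e.occurred_at, reverse=True)
```

Sorting newest first puts the changes closest to the incident start at the top, which is usually where responders look first for contributing factors.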
Now that you've gotten an overview of the Command Center, you can learn more about FireHydrant by reading in greater detail about various aspects of incidents: