What is FireHydrant?
FireHydrant is a full-cycle incident management platform that makes your incidents less painful. From automating toil to efficiently assembling the right teams, standardizing communications, facilitating better retrospectives, and gathering metrics, FireHydrant helps organizations improve their reliability and resilience.
With FireHydrant, you can:
- Automate manual tasks such as creating incident Slack channels, Jira tickets, Zoom or Google Meet bridges, and more
- Integrate with alerting and monitoring tools to automatically create incidents and pull in on-call responders
- Standardize processes and tailor them for different situations, product areas, and teams, among other criteria
- Keep track of your apps, services, environments, and their relationships
- Maintain traceability of incident management data, communications, and action items
- Learn valuable lessons from your incidents, and use that knowledge to improve your infrastructure and processes
Incidents are the bread and butter of FireHydrant. FireHydrant's incidents help your team stay organized and respond to situations efficiently and effectively, while relieving your team of "scribe" or "note-taking" roles, since we track incident events automatically.
Incidents can be created through the FireHydrant UI, from Slack, through our API, or through integrations. Combined with the other platform offerings below, from automation to catalog tracking to task management and informed retrospectives, FireHydrant is here to help you Automate, Respond, Learn, and Improve.
Runbooks are what make FireHydrant unique and powerful. Say goodbye to wikis and endless static playbooks; FireHydrant enables you to automate your processes, including step execution with conditional logic.
Runbooks can be tailored to different situations, severities, teams, products or services, and more, and they can even be layered together. With FireHydrant Runbooks, your team can stay focused on fighting fires instead of reading documentation and doing manual toil.
See our Introduction to Runbooks for more information.
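To make the idea of step execution with conditional logic concrete, here is a minimal sketch in Python. It is not FireHydrant's Runbook format or API; the `Incident` and `Step` classes, the step names, and the severity labels are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass
from typing import Callable


# Hypothetical incident model; not FireHydrant's actual data structure.
@dataclass
class Incident:
    severity: str
    services: list[str]


# A step pairs a name with a condition that decides whether it should run.
@dataclass
class Step:
    name: str
    condition: Callable[[Incident], bool]


def steps_to_run(steps: list[Step], incident: Incident) -> list[str]:
    """Return the names of the steps whose condition matches the incident."""
    return [step.name for step in steps if step.condition(incident)]


# Illustrative runbook: step names and rules are assumptions.
RUNBOOK = [
    Step("create_slack_channel", lambda i: True),
    Step("open_zoom_bridge", lambda i: i.severity in {"SEV1", "SEV2"}),
    Step("page_payments_team", lambda i: "payments" in i.services),
]

incident = Incident(severity="SEV1", services=["checkout"])
print(steps_to_run(RUNBOOK, incident))
```

In this sketch, every incident gets a Slack channel, only higher-severity incidents get a Zoom bridge, and the payments team is paged only when the payments service is involved.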
FireHydrant hosts a Catalog for your infrastructure to help your team stay organized. With the Catalog, you can track which properties are impacted by an incident, any dependencies between items, and who owns these properties and should be involved in an incident, among other things.
You can manage your FireHydrant Catalog from the web UI, the API, or Terraform, or you can import services from various providers. See our Introduction to Service Catalog for more information.
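The value of tracking dependencies and owners together can be sketched in a few lines of Python. This is an illustration only, not how FireHydrant stores or queries its Catalog; the service names, the `DEPENDS_ON` and `OWNERS` mappings, and both functions are hypothetical.

```python
from collections import deque

# Hypothetical catalog: each service lists the services it depends on.
DEPENDS_ON = {
    "web-frontend": ["checkout-api"],
    "checkout-api": ["payments-db"],
    "search-api": [],
    "payments-db": [],
}

# Hypothetical ownership mapping.
OWNERS = {
    "web-frontend": "team-web",
    "checkout-api": "team-payments",
    "search-api": "team-search",
    "payments-db": "team-payments",
}


def impacted_by(failed: str) -> set[str]:
    """Return the failed service plus everything that depends on it, directly or indirectly."""
    # Invert the dependency edges so we can walk from a service to its dependents.
    dependents: dict[str, list[str]] = {}
    for service, deps in DEPENDS_ON.items():
        for dep in deps:
            dependents.setdefault(dep, []).append(service)

    seen = {failed}
    queue = deque([failed])
    while queue:
        current = queue.popleft()
        for dependent in dependents.get(current, []):
            if dependent not in seen:
                seen.add(dependent)
                queue.append(dependent)
    return seen


def owners_to_involve(failed: str) -> list[str]:
    """Return the sorted, de-duplicated owners of every impacted service."""
    return sorted({OWNERS[service] for service in impacted_by(failed)})


print(sorted(impacted_by("payments-db")))
print(owners_to_involve("payments-db"))
```

Walking the dependency graph from the failing item answers both questions at once: which items are affected and which teams should be pulled in.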
FireHydrant has complete task and role management built into the platform. Rather than context-switching to external wikis or playbooks, you can pre-define important to-do items within FireHydrant and track task completion as incident timeline items.
You can also customize incident roles for your organization's needs, so every responder knows exactly what is expected of them when they're pulled into an incident.
Once an incident is resolved, FireHydrant helps teams facilitate better incident reviews with built-in Retrospectives. Retrospectives gather and contextualize information and timeline events from throughout the incident, so your team has a clear view of what happened and why, and can have conversations about how to prevent future occurrences.
More importantly, teams can list out Contributing Factors, customize questions/answers in Lessons Learned, and create follow-up action items in linked external ticketing tools to plan work in future sprints.
FireHydrant supports a growing list of integrations. From chat providers like Slack, to alerting providers like PagerDuty and Opsgenie, to other tools like Zoom, Google Docs, Jira, Zendesk, Kubernetes, and Okta, FireHydrant can meet your team where it currently works.
For a complete overview, see An Overview of Integrations.
You can fill in information for your incidents ad hoc, or you can pre-define Incident Types for your operators to easily declare. Incident Types remove the cognitive load of declaring specific types of situations and pull in the right responders and resources every time.
FireHydrant teams allow you to quickly assign a group of people to an incident from Slack or the UI. They're also a great way to see which groups own the services in your application stack.
On top of customizing your severities, FireHydrant reduces the stress of figuring out how severe an incident is by enabling you to configure a severity matrix. If certain functionality is down, the matrix can assign the severity automatically. Your incident response team can then create incidents and be confident that the correct severity is applied.
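Conceptually, a severity matrix is a lookup from what is affected and how badly to a severity level. The Python sketch below illustrates that idea only; the functionality names, condition labels, severity names, and the `lookup_severity` function are assumptions, not FireHydrant's configuration format.

```python
# Hypothetical matrix: (affected functionality, condition) -> severity.
SEVERITY_MATRIX = {
    ("checkout", "unavailable"): "SEV1",
    ("checkout", "degraded"): "SEV2",
    ("search", "unavailable"): "SEV2",
    ("search", "degraded"): "SEV3",
}

# Fallback for combinations the matrix does not cover (an assumption).
DEFAULT_SEVERITY = "SEV4"


def lookup_severity(functionality: str, condition: str) -> str:
    """Return the severity for an impact, or the default when no rule matches."""
    return SEVERITY_MATRIX.get((functionality, condition), DEFAULT_SEVERITY)


print(lookup_severity("checkout", "unavailable"))
print(lookup_severity("search", "degraded"))
```

Because the mapping is decided ahead of time, responders only need to state what is broken, not debate how severe it is.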
Many incidents are caused by deploys or configuration changes. With FireHydrant, you can easily view the deploy events associated with different pieces of infrastructure so you can track down the cause of your incidents more quickly. FireHydrant supports change event ingestion via the API as well as through our Kubernetes and AWS CloudTrail integrations.
FireHydrant offers two types of status pages out of the box: incident-specific status pages and a global status page. Incident-specific status pages are private, temporary pages for a single incident, and they expire 48 hours after the incident is resolved. Global status pages can be public or private, and they are meant to show the status of your platform or application at any given time. FireHydrant also offers an integration with Atlassian Statuspage.
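The reasoning behind linking change events to incidents can be sketched as a simple time-window filter: look for changes to the affected service shortly before the incident started. The example below is illustrative only and does not use the FireHydrant API; the event records, the one-hour window, and the `recent_changes` function are all assumptions.

```python
from datetime import datetime, timedelta

# Hypothetical change events; field names are assumptions.
CHANGE_EVENTS = [
    {"service": "checkout-api", "summary": "config change: raise timeouts",
     "at": datetime(2024, 1, 10, 9, 0)},
    {"service": "checkout-api", "summary": "deploy checkout-api v2",
     "at": datetime(2024, 1, 10, 13, 50)},
    {"service": "search-api", "summary": "deploy search-api v7",
     "at": datetime(2024, 1, 10, 13, 55)},
]


def recent_changes(events, service, incident_start, window=timedelta(hours=1)):
    """Return summaries of changes to `service` within `window` before the incident began."""
    earliest = incident_start - window
    return [
        event["summary"]
        for event in events
        if event["service"] == service and earliest <= event["at"] <= incident_start
    ]


incident_start = datetime(2024, 1, 10, 14, 0)
print(recent_changes(CHANGE_EVENTS, "checkout-api", incident_start))
```

A change that landed minutes before an incident on the same service is a natural first suspect, which is why having these events next to the incident timeline is useful.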
FireHydrant gives you a quick view of your historical incidents and infrastructure health so you know where to focus your efforts and how you can improve your incident response process moving forward. Analytics include how healthy your infrastructure has been, incident response metrics including remediation time, and much more.