Opsgenie - a modern tool for incident management
This blog post introduces Opsgenie, Atlassian’s modern incident management tool.
Fast diagnosis and DevOps team alerting are two of the IT service management capabilities that act as the first line of defense during service disruptions.
Opsgenie – control over incidents
Opsgenie enables DevOps teams to monitor service disruptions effectively and provides a good overview of active incidents. Below we introduce Opsgenie's features in more detail.
Opsgenie helps ensure you never miss a critical alert. You can choose between multiple communication channels (including email, SMS, and mobile push) so that recipients are notified in a timely manner about the cause and scope of an incident.
Opsgenie provides the flexibility to suppress, delay, or expedite alerts based on their content and timing.
With Opsgenie you can build and modify schedules within one interface. Your team will always know who is on-call and accountable during incidents and have the confidence that critical alerts will always be acknowledged.
You can easily create on-call schedules with daily, weekly and custom rotations.
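To illustrate the idea behind a rotation, here is a minimal Python sketch of a round-robin schedule. It is not Opsgenie's implementation or API; the function name `on_call_at`, the participant names, and the start date are all assumptions made up for this example.

```python
from datetime import datetime, timedelta


def on_call_at(participants, rotation_start, shift_length, moment):
    """Return who is on call at `moment` in a simple round-robin rotation.

    Hypothetical helper for illustration only, not part of Opsgenie.
    """
    if moment < rotation_start:
        raise ValueError("moment is before the rotation starts")
    # Whole shifts elapsed since the rotation began (timedelta // timedelta -> int).
    shifts_elapsed = (moment - rotation_start) // shift_length
    return participants[shifts_elapsed % len(participants)]


team = ["alice", "bob", "carol"]        # assumed participants
start = datetime(2024, 1, 1, 9, 0)      # assumed rotation start (a Monday, 09:00)
moment = datetime(2024, 1, 10, 12, 0)

weekly = on_call_at(team, start, timedelta(weeks=1), moment)
daily = on_call_at(team, start, timedelta(days=1), moment)
print(weekly, daily)
```

Changing `shift_length` is all it takes to switch between daily, weekly, and custom rotations in this sketch.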
Opsgenie automatically notifies users when their shifts begin and end. Escalations ensure that an alert still gets the necessary attention when it is not acknowledged within a certain amount of time.
For example, if the person on call does not respond to a high-priority alert within 5 minutes, Opsgenie can automatically notify another person or team.
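The escalation logic described above can be sketched in a few lines of Python. This is only an illustration under our own assumptions: the `EscalationStep` class, the `notified_targets` function, and the 0/5/15-minute policy are hypothetical and do not reflect how Opsgenie models escalations internally.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class EscalationStep:
    delay: timedelta  # time since alert creation without acknowledgement
    notify: str       # person or team to notify at this step


def notified_targets(steps, elapsed, acknowledged_after=None):
    """Return everyone notified so far; escalation stops once the alert is acknowledged."""
    cutoff = elapsed if acknowledged_after is None else min(elapsed, acknowledged_after)
    return [step.notify for step in steps if step.delay <= cutoff]


# Assumed policy: on-call engineer first, then escalate after 5 and 15 minutes.
policy = [
    EscalationStep(timedelta(minutes=0), "on-call engineer"),
    EscalationStep(timedelta(minutes=5), "secondary on-call"),
    EscalationStep(timedelta(minutes=15), "team lead"),
]

unacknowledged = notified_targets(policy, timedelta(minutes=6))
acknowledged = notified_targets(policy, timedelta(minutes=20),
                                acknowledged_after=timedelta(minutes=3))
print(unacknowledged)
print(acknowledged)
```

In the first call the alert has gone six minutes without a response, so the second step has fired. In the second call the alert was acknowledged after three minutes, so nobody beyond the on-call engineer is notified.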
Opsgenie enables you to map alerts to the business services they impact and have a clear understanding of which teams need to respond and who needs to be kept up to date on the progress towards resolution.
You can design your incident response and set up different workflows for incidents of differing priority using Opsgenie’s incident templates. For each type of incident, predefine the needed response teams, the stakeholders, and the best collaboration channels to resolve problems quickly and communicate them effectively.
Status updates show the latest updates for each incident separately. You can also view the service status page for an overview of system health.
Opsgenie on-call schedule view
With integrations you can link Jira Service Management issues to an incident to keep track of the full scope and customer impact of an incident. Additionally, stay on top of follow-on tasks by linking or creating Jira Software issues directly from the Incident details.
The Incident Timeline is your source of truth throughout the lifecycle of an incident. It lists key details such as incident status, associated alerts, and Incident Command Center (ICC) activity, so your teams can view a full record of events.
Advanced reporting and analytics
The post-incident analysis report helps you identify how fast people acknowledged the issues, when status changes were communicated, and how teams participated in the resolution.
The report also lets you compare different incident responses to identify opportunities for improvement.
Operational efficiency analytics helps you understand the volume of alerts your company has handled over a specified period of time, along with the corresponding mean time to acknowledge and mean time to resolve.
You can easily visualize how these metrics are trending over time and, with a mouse click, drill down into areas of concern to understand which alerts required more time and attention.
Operational efficiency analytics report
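For readers unfamiliar with the two metrics, the following Python sketch shows how mean time to acknowledge and mean time to resolve can be computed from alert timestamps. The sample data and field names (`created`, `acknowledged`, `resolved`) are invented for this example, and we assume both metrics are measured from the moment the alert was created; Opsgenie's reports may define them differently.

```python
from datetime import datetime, timedelta


def mean_duration(durations):
    """Average a collection of timedeltas; returns None for an empty collection."""
    durations = list(durations)
    if not durations:
        return None
    return sum(durations, timedelta()) / len(durations)


# Invented sample alerts, all on the same day.
alerts = [
    {"created": datetime(2024, 3, 1, 10, 0),
     "acknowledged": datetime(2024, 3, 1, 10, 4),
     "resolved": datetime(2024, 3, 1, 10, 40)},
    {"created": datetime(2024, 3, 1, 11, 0),
     "acknowledged": datetime(2024, 3, 1, 11, 10),
     "resolved": datetime(2024, 3, 1, 12, 0)},
    {"created": datetime(2024, 3, 1, 12, 0),
     "acknowledged": datetime(2024, 3, 1, 12, 1),
     "resolved": datetime(2024, 3, 1, 12, 20)},
]

mtta = mean_duration(a["acknowledged"] - a["created"] for a in alerts)
mttr = mean_duration(a["resolved"] - a["created"] for a in alerts)
print(mtta, mttr)
```

With this sample data the acknowledgement times are 4, 10, and 1 minutes, and the resolution times are 40, 60, and 20 minutes, so the means come out at 5 and 40 minutes.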
Monthly overview analytics gives insight into the monthly alert distribution and response trends. You can easily compare them with the previous month and drill into any areas of interest.
The Incident investigation dashboard enables you to investigate deployment-related incidents directly from Opsgenie. The dashboard displays a timeline of successful and failed code deployments from Bitbucket or Bamboo, as well as past and ongoing incidents.
With all this information in one place, you can correlate incidents with code deployments and identify a deployment as the potential cause of an incident.
Incident investigation dashboard
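The kind of correlation described above can be approximated with a simple time-window check. The Python sketch below is only an illustration: the function `deployments_before`, the one-hour window, and the deployment names are our own assumptions, not data or logic taken from Opsgenie, Bitbucket, or Bamboo.

```python
from datetime import datetime, timedelta


def deployments_before(incident_start, deployments, window=timedelta(hours=1)):
    """Return deployments that finished within `window` before the incident, most recent first."""
    candidates = [
        d for d in deployments
        if timedelta(0) <= incident_start - d["finished"] <= window
    ]
    return sorted(candidates, key=lambda d: d["finished"], reverse=True)


# Invented deployment history.
deployments = [
    {"name": "api-v41", "finished": datetime(2024, 3, 1, 9, 0)},
    {"name": "web-v7", "finished": datetime(2024, 3, 1, 13, 30)},
    {"name": "api-v42", "finished": datetime(2024, 3, 1, 13, 50)},
    {"name": "api-v43", "finished": datetime(2024, 3, 1, 14, 30)},
]

suspects = deployments_before(datetime(2024, 3, 1, 14, 0), deployments)
print([d["name"] for d in suspects])
```

For an incident starting at 14:00, only the two deployments from the preceding hour are returned, with the most recent one first. A time window is a crude heuristic, but it shows why seeing deployments and incidents on a single timeline speeds up the search for a cause.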
Opsgenie on Cloud or on-premise platforms
Opsgenie is available for Cloud as a standalone product on Free (14-day trial license for up to 5 users), Essentials, Standard and Enterprise plans.
Atlassian categorizes Opsgenie as a Cloud product, but organizations on Data Center can use Opsgenie’s Edge Encryption, which encrypts your user data so that Opsgenie never receives the raw version of the payload.
The encryption application is hosted on your own environment and acts as a bridge between Opsgenie and 3rd party tools.
In addition, Opsgenie is now included in all Jira Service Management Cloud plans; we will cover this integration in more detail in the next blog post. If you need more information about Opsgenie or any other Atlassian product, we are happy to help.