On-Call Scheduling and Escalation Policies
On-call schedules ensure that a team member is always available to handle issues, including at night, on weekends, and on holidays, so that support is continuous. The schedules rotate among team members to distribute the responsibility fairly.
Escalation policies detail the procedure for escalating unresolved issues to higher levels of support. For example, if a server goes down, alerts are sent to the designated person on the escalation list, who is responsible for addressing the issue promptly.
Clear communication channels, such as Slack, email, and voice calls, are crucial for prompt notification and an effective response to incidents such as major bugs, server capacity problems, or system downtime.
Step 1: In the top menu, go to Temperstack Notifications and click "On-Call Policies". An on-call policy determines who is contacted when an incident arises and defines the escalation process for a particular service or group of services.
Step 2: To create a new on-call policy, click the "Add On-Call Policy" button at the top right. Here, you can create rotations that escalate an alert to other engineers if the engineer on call fails to respond within the specified time.
Step 3: Enter the "Policy name". This is the name of the group or team that can be assigned to the Temperstack services. The "Start date" indicates the date when this policy was created.
Step 4: Next to "Start date", you'll find a "Repeat" option where you can select either "Yes" or "No". This option determines whether the escalation schedule repeats after the last user in the rotation has been reached.
Step 5: Click "Add Rotation". This creates a rotation level through which alerts are escalated to on-call engineers within a given timeframe.
Step 6: After adding a rotation, you'll see "Levels" indicating the number of rotations, including the users added to this policy.
Escalate at: The time after which the alert is escalated to the next person in line if the current person does not acknowledge it.
Rotation Frequency: Choose whether the on-call engineers selected at this level rotate daily or weekly.
On-Call Engineers: Here, you select the names of users to add to the rotation. The on-call rotation will proceed in the order they are added, based on the rotation frequency.
Step 7: On the right side of the rotation box, you'll find a calendar displaying both monthly and weekly views of the user rotations added to it.
Note: Only on-call engineers with verified numbers are eligible.
Once Level 1 is set up, you have the option to set up Level 2, if necessary, by clicking on "Add Rotation" and following the same steps as described above. If you do not require another level, click on "Submit" beside the "Add Rotation" button to finalize the schedule creation. The newly created on-call schedule will now appear on the list.
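The settings described in the steps above can be pictured as a small data model. The Python sketch below is illustrative only and is not Temperstack's API: the class names, field names, the "Payments team" policy, and the level_to_notify helper are all hypothetical. It assumes that each level hands over to the next once its "Escalate at" time has passed, that "Repeat" set to "Yes" restarts from Level 1, and that with "Repeat" set to "No" nobody further is notified once the last level has been reached.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class Engineer:
    name: str
    number_verified: bool = True  # only verified numbers are eligible


@dataclass
class RotationLevel:
    escalate_at_minutes: int   # "Escalate at"
    rotation_frequency: str    # "daily" or "weekly"
    engineers: List[Engineer] = field(default_factory=list)

    def eligible_engineers(self) -> List[Engineer]:
        # Keep the order in which engineers were added to the rotation.
        return [e for e in self.engineers if e.number_verified]


@dataclass
class OnCallPolicy:
    name: str
    start_date: date
    repeat: bool               # the "Repeat" option (Yes/No)
    levels: List[RotationLevel] = field(default_factory=list)


def level_to_notify(policy: OnCallPolicy, minutes_unacknowledged: int) -> Optional[int]:
    """Return the 0-based level index to notify (0 means Level 1), or None."""
    cycle = sum(level.escalate_at_minutes for level in policy.levels)
    if cycle <= 0:
        return None
    if minutes_unacknowledged >= cycle:
        if not policy.repeat:
            return None        # assumption: escalation stops after the last level
        minutes_unacknowledged %= cycle
    elapsed = 0
    for index, level in enumerate(policy.levels):
        elapsed += level.escalate_at_minutes
        if minutes_unacknowledged < elapsed:
            return index
    return None


# Hypothetical policy with two levels.
policy = OnCallPolicy(
    name="Payments team",
    start_date=date(2024, 1, 1),
    repeat=True,
    levels=[
        RotationLevel(10, "weekly", [Engineer("Mohan"), Engineer("Hari", number_verified=False)]),
        RotationLevel(15, "daily", [Engineer("Amal")]),
    ],
)
print(level_to_notify(policy, 12))  # index of the level notified after 12 minutes
```

In this model, an alert that has been unacknowledged for 12 minutes has passed Level 1's 10-minute window, so the second level is the one being notified.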
To edit a policy, click the pencil icon on the right-hand side of its entry in the list of all schedules.
Edit the escalation policy as needed.
Click the "Submit" button to save the changes.
To delete a policy, locate the dustbin icon on the right-hand side of its entry in the list of all schedules.
Click the dustbin icon to delete the policy.
The Final Escalation Policy serves as the last resort for ensuring incident resolution within Temperstack's incident management system. This policy maximizes the utilization of available resources and ensures a timely response to critical incidents. Here's a breakdown of the key components and how to configure them:
Calendar of Rotations:
The rotation calendar provides a comprehensive overview of the availability of on-call engineers.
With rotation-based shift assignment, you create a list of team members, and Temperstack automatically generates the shifts in the order in which the members are listed.
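As a rough illustration of how shifts could be derived from an ordered list of team members, consider the Python sketch below. It is not Temperstack's implementation: the generate_shifts function, its parameters, and the start date are assumptions made for this example, and it assumes that a daily rotation lasts one day and a weekly rotation seven days.

```python
from datetime import date, timedelta

# Assumed shift lengths for the two rotation frequencies.
SHIFT_LENGTH_DAYS = {"daily": 1, "weekly": 7}


def generate_shifts(engineers, start, frequency, count):
    """Return `count` shifts as (first_day, last_day, engineer) tuples.

    Engineers are assigned in the order listed, wrapping around to the
    first engineer after the last one has had a shift.
    """
    if not engineers:
        raise ValueError("at least one engineer is required")
    length = timedelta(days=SHIFT_LENGTH_DAYS[frequency])
    shifts = []
    for i in range(count):
        first_day = start + i * length
        last_day = first_day + length - timedelta(days=1)
        shifts.append((first_day, last_day, engineers[i % len(engineers)]))
    return shifts


# Hypothetical start date; four weekly shifts for a three-person team.
for first_day, last_day, engineer in generate_shifts(
    ["Mohan", "Hari", "Amal"], date(2024, 1, 1), "weekly", 4
):
    print(first_day, last_day, engineer)
```

With three engineers and four weekly shifts, the fourth shift wraps around to the first engineer in the list.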
Example Scenario:
Consider a team consisting of Mohan, Hari, and Amal, who rotate weekly for on-call duties.
If the first on-call engineer fails to respond within 10 minutes, the notification escalates to the next person in the team.
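The scenario can be traced with a short Python sketch. This is a simplified model, not Temperstack's actual logic: the function names and the rotation start date are hypothetical, and it assumes that the alert moves to the next team member every 10 minutes while it stays unacknowledged and wraps back to the first person after the last one.

```python
from datetime import date

TEAM = ["Mohan", "Hari", "Amal"]   # weekly rotation, in listed order
ESCALATE_AFTER_MINUTES = 10


def primary_index(team, rotation_start, today):
    """Index of the engineer who is first on call on `today`."""
    if today < rotation_start:
        raise ValueError("rotation has not started yet")
    weeks_elapsed = (today - rotation_start).days // 7
    return weeks_elapsed % len(team)


def notified_engineer(team, primary, minutes_unacknowledged,
                      escalate_after=ESCALATE_AFTER_MINUTES):
    """Engineer being notified after the alert has gone unacknowledged."""
    steps = minutes_unacknowledged // escalate_after
    # Assumption: escalation wraps around to the start of the team.
    return team[(primary + steps) % len(team)]


# Hypothetical dates: the rotation starts on 1 January 2024 and an
# incident occurs on 10 January 2024, during the second week.
primary = primary_index(TEAM, date(2024, 1, 1), date(2024, 1, 10))
for minutes in (0, 10, 20):
    print(minutes, notified_engineer(TEAM, primary, minutes))
```

In the second week Hari is first on call, so under these assumptions an alert left unacknowledged for 10 minutes moves to Amal, and after 20 minutes to Mohan.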
Key Functions:
Timely Notifications: Connect services to on-call schedules for prompt incident alerts to designated engineers.
Sequential Escalation: Configure alerts to escalate in a predefined order and time frame, ensuring prompt resolution by the appropriate personnel.
Customization: Tailor escalation policies to specific service requirements, optimizing incident management across diverse portfolios.
Implementing effective escalation policies empowers organizations to proactively address incidents, minimize downtime, and maintain service reliability, ultimately enhancing overall operational resilience.