Add service alerting using PagerDuty

Alerting is managed in the alert-config repository. Use it to:

  • create PagerDuty users, services, schedules and escalation policies
  • manage alert thresholds (for example 5xx exceptions and container kills)
  • enable alerts in specific environments

Get the latest repository locally

Pull the latest alert-config before you start.

git checkout main && git pull

Create PagerDuty configuration for your team

If your team already has a PagerDuty account, skip to "Create alerts for your microservice".

  1. Create a feature branch in the alert-config repo.
  2. Add your team user to the users list in src/main/scala/uk/gov/hmrc/alertconfig/pagerdutyconfigs/users/Users.scala.

    Replace the placeholders and remove the angle brackets:

    • <MyDigitalService>
    • <my-team-email-address>

    object <MyDigitalService> extends PagerDutyUser(name = "<MyDigitalService>", email = "<my-team-email-address>")

    Ensure the email address can receive emails from external sources.

  3. Create a Scala file (named after your digital service) in:
    src/main/scala/uk/gov/hmrc/alertconfig/pagerdutyconfigs
  4. In that file, create objects for a PagerDuty Schedule, Escalation Policy, Integration and Service (integrated with Grafana alerting). Replace the placeholders and remove the angle brackets:
    • <MyDigitalService>
    • <my-digital-service>
    • <my-alerts-slack-channel>

    Specify the environments you want to enable. In the example, Staging and Production are enabled.

    import uk.gov.hmrc.alertconfig.pagerdutyconfigs.users.<MyDigitalService>
    
    object <MyDigitalService>Schedule extends PagerDutySchedule(name = "<MyDigitalService> Schedule", admins = <MyDigitalService>)
    
    object <MyDigitalService>EscalationPolicy
      extends PagerDutyEscalationPolicy(
        name = "<MyDigitalService> Escalation Policy",
        rules = Seq(
          PagerDutyEscalationPolicyRule(targets = Seq(PagerDutyTarget(schedule = Some(<MyDigitalService>Schedule))))
        )
      )
    
    object <MyDigitalService>Integration extends PagerDutyIntegration(
      integrationName = "<my-digital-service>",
      enabledEnvironments = Map(
        Staging    -> Set(Warning, Critical),
        Production -> Set(Warning, Critical)
      )
    )
    
    object <MyDigitalService>Service
      extends PagerDutyService(
        serviceName   = "<MyDigitalService>",
        integrations  = Seq(<MyDigitalService>Integration),
        slackChannels = Seq("<my-alerts-slack-channel>")
      )
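
The enabledEnvironments map above pairs each environment with the severities that should notify in it. The sketch below illustrates that shape in isolation. It is not the alert-config implementation: the Environment and Severity types are stand-ins defined here so that the block compiles on its own, OtherEnvironment and the isEnabled helper are hypothetical, and the rule that an unlisted environment sends nothing is an assumption.

```scala
// Stand-in types for illustration only; the real definitions live in alert-config.
sealed trait Environment
case object Staging extends Environment
case object Production extends Environment
case object OtherEnvironment extends Environment // hypothetical, stands for any environment you leave out

sealed trait Severity
case object Warning extends Severity
case object Critical extends Severity

object EnabledEnvironmentsSketch {
  // Same shape as enabledEnvironments in the example above:
  // Staging notifies on Critical only, Production on Warning and Critical.
  // The explicit Set[Severity] annotations are only needed for these stand-in types.
  val enabledEnvironments: Map[Environment, Set[Severity]] = Map(
    Staging    -> Set[Severity](Critical),
    Production -> Set[Severity](Warning, Critical)
  )

  // Hypothetical helper: true when the environment is listed
  // and the severity is enabled for it.
  def isEnabled(env: Environment, severity: Severity): Boolean =
    enabledEnvironments.get(env).exists(_.contains(severity))

  def main(args: Array[String]): Unit = {
    println(isEnabled(Production, Warning))        // true
    println(isEnabled(Staging, Warning))           // false
    println(isEnabled(OtherEnvironment, Critical)) // false
  }
}
```

Under these assumptions, narrowing the set for an environment, or omitting the environment altogether, is how you would keep lower-severity alerts out of PagerDuty for that environment.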

Merge your changes

Open a PR in alert-config. Telemetry must approve it, because each new account requires an extra PagerDuty licence. Raise a SUP ticket in Jira with this summary:

Summary: PagerDuty account creation for <team name>

In the Description, link to your PR, then create the ticket.

Once Telemetry has triaged and approved the ticket and the PR is merged, the account is created. A password setup email is then sent to the team email address. Your Delivery Lead must set the password and share it securely offline.

Check you’ve completed this part

  1. Sign in to the HMRC PagerDuty account.
  2. Confirm the schedule, escalation policy and service exist.
  3. In Service Directory, find your service and create a New Incident.
  4. Verify your Slack channel received the notification.

Set up PagerDuty in Slack

  1. Open the PagerDuty extensions page.
  2. Select Slack integration and open the Slack workspace.
  3. Authorise the integration if prompted.
  4. Choose View for the HMRC Digital workspace.

First-time users: you’ll see a banner to link Slack and PagerDuty. Click Link Account, allow permissions, and confirm in Slack.

In your Slack channel, you should see the test incident and be able to acknowledge or resolve it. First-time users may need to authorise again from Slack:

  1. Click Link Account in Slack.
  2. Sign in to PagerDuty when prompted.

Create alerts for your microservice

  1. Create a branch in alert-config.
  2. Add a Scala file (named after your digital service) in:
    src/main/scala/uk/gov/hmrc/alertconfig/configs
  3. Create an object that extends AlertConfig. Replace placeholders and remove angle brackets:
    • <MyDigitalService>
    • <microservice-name>

    import uk.gov.hmrc.alertconfig.pagerdutyconfigs.<MyDigitalService>Integration
    
    object <MyDigitalService> extends AlertConfig {
      override val alertConfig: Seq[AlertConfigBuilder] = teamAlerts(Seq(
        "<microservice-name>"
      )).withIntegrations(<MyDigitalService>Integration)
    }
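
Because teamAlerts takes a Seq of service names, one config object can presumably cover several microservices owned by the same team. The fragment below is a sketch under that assumption, not a confirmed pattern: <another-microservice-name> is a hypothetical second placeholder, and the fragment only compiles inside alert-config once the placeholders are replaced.

```scala
import uk.gov.hmrc.alertconfig.pagerdutyconfigs.<MyDigitalService>Integration

object <MyDigitalService> extends AlertConfig {
  // Assumption: every name listed here is routed to the same integration.
  override val alertConfig: Seq[AlertConfigBuilder] = teamAlerts(Seq(
    "<microservice-name>",
    "<another-microservice-name>" // hypothetical second service
  )).withIntegrations(<MyDigitalService>Integration)
}
```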

You can add many alert types. See the alert-config README for the full list and configuration.

Merge your changes

Open a PR in alert-config and have a teammate review it. After merge, the alert-config pipeline will:

  • generate alerts and an alerting dashboard
  • deploy to Grafana Alerting in each MDTP environment

The change will be live ~30 minutes after the job completes.

Grafana Alerting URLs

Check you’ve completed this task

  1. Go to the MDTP Catalogue and open your microservice.
  2. Under Code, open Commissioning State.
  3. Confirm alerting is configured correctly.

You can also view dashboards via the Grafana Alerting links above and search for your microservice.


Need support?

Ask in #team-telemetry on Slack.

Got feedback?

We’re always improving our documentation. Share your feedback with the team.

Configure PagerDuty for your microservice by setting up users, schedules, escalation policies and alerts in the alert-config repository.
