Intermittent Network Issues
Incident Report for GearHost
Postmortem

Situation Summary:

On Tuesday, Oct 13th, at approximately 3 AM MST, GearHost began experiencing a routing blackhole through our upstream provider (Zayo IP). This was due to an equipment failure on the regional transit provider’s router, which was outside of GearHost’s control. Due to the nature of the failure, routing protocols did not recognize the outage, and traffic to and from the internet was blackholed in the peer router. Once the issue was identified, GearHost de-peered from the affected upstream at all peering points, which restored regular traffic for GearHost customers.

Customer Impact:

Partial or total loss of connectivity

Mitigation Strategy:

This outage was due to traffic being dropped by a routing peer while it kept its routing protocol sessions up. Normally, dynamic routing would detect an outage and internet traffic would route around it. When an upstream provider drops or “blackholes” traffic without sending routing updates, the only way to mitigate it is to identify where the traffic is being dropped and manually intervene with routing changes at our borders.
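
As an illustration only, the detection problem described above can be sketched as a comparison of control-plane state against data-plane probe results: a peer whose routing session is still up but whose probes go unanswered is a blackhole candidate. This is a minimal sketch and not GearHost's actual tooling. The names UpstreamStatus, loss_ratio, and suspected_blackholes, the upstream labels, and the 90% loss threshold are all assumptions made for this example, and the session state and probe counts are assumed to be collected elsewhere.

```python
from dataclasses import dataclass


@dataclass
class UpstreamStatus:
    """Snapshot of one upstream peer (all fields are hypothetical inputs)."""
    name: str                  # label for the upstream, e.g. "upstream-a"
    session_established: bool  # control plane: is the routing session up?
    probes_sent: int           # data plane: probes sent through this upstream
    probes_answered: int       # data plane: probes that received a reply


def loss_ratio(status: UpstreamStatus) -> float:
    """Fraction of probes that went unanswered; 0.0 when nothing was sent."""
    if status.probes_sent == 0:
        # No probe data, so no evidence of loss either way.
        return 0.0
    return 1.0 - status.probes_answered / status.probes_sent


def suspected_blackholes(statuses, loss_threshold=0.9):
    """Return names of upstreams whose session is up but whose traffic is lost.

    A peer with a down session is excluded on purpose: dynamic routing
    already withdraws its routes, so it is an ordinary outage, not a blackhole.
    """
    return [
        s.name
        for s in statuses
        if s.session_established and loss_ratio(s) >= loss_threshold
    ]


if __name__ == "__main__":
    snapshot = [
        UpstreamStatus("upstream-a", True, 20, 0),    # session up, no replies
        UpstreamStatus("upstream-b", True, 20, 20),   # healthy
        UpstreamStatus("upstream-c", False, 20, 0),   # session down
    ]
    print(suspected_blackholes(snapshot))
```

A check like this only flags a candidate. Confirming where the traffic is being dropped and making the routing change at the border remain manual steps, as described above.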

Activity that ultimately restored service:

Depeering with Zayo IP at the blendwidth border routers.

Identified Root Cause:

Upstream provider equipment failure dropping traffic while maintaining dynamic routing protocols

Posted Oct 18, 2020 - 18:20 MDT

Resolved
The implemented fix has resolved the overall issue. We will provide a postmortem once we gather all information. Thank you all for your patience and understanding. We are working internally and with our partners to ensure this does not happen again.
Posted Oct 13, 2020 - 10:52 MDT
Monitoring
A fix has been implemented and we are monitoring the results.
Posted Oct 13, 2020 - 05:08 MDT
Identified
The issue has been identified. It is at the ISP level. We are currently awaiting more information. We will provide an ETA when possible.
Posted Oct 13, 2020 - 04:34 MDT
Investigating
Our engineers are looking into the issues some users are experiencing. We will provide updates as we get them.
Posted Oct 13, 2020 - 01:30 MDT
This incident affected: DEN1, CloudSites, Databases, DNS, and Email.