News
[Core Network] Service Alert 22nd May 2018 - Update 1
Posted by David Croft on 11 June 2018 08:33 AM
This is an incident notification regarding an outage on our core network.

Date: Tuesday, 22nd May 2018
Start time: 14:46 BST
End time: 16:44 BST (Final clear)
Services affected: Intermittent partial and total outage across our IP network

Report: A failure of a network device in our core IP network led to a cascade of additional failures across the network. Controlled shutdowns of portions of our network were necessary to bring it back to a stable state in order to fully restore service. Both the failure and the controlled shutdowns caused packet loss and temporary routing failures to services hosted on or delivered through our network.

Root Cause Analysis: Memory exhaustion on an edge transit router caused it to restart its BGP process. The consequent withdrawal and re-announcement of all routes from BGP caused other devices on the network to suffer similar failures in a cascading fashion.

Next Steps: We are bringing forward our upcoming planned network upgrades, which will now take place in July. Emergency maintenance sessions will be announced once lab testing has been completed. In the meantime we will continue to observe the change freeze on our network to prevent a further incident.

Regards,
David Croft