Latest Updates
Jul
9
[Core Network] EMERGENCY Maintenance window 10th July 2018
Posted by David Croft on 09 July 2018 05:56 PM
Please be advised of the following maintenance window scheduled on
our core network:

Start date: Tuesday, 10th July 2018
Start time: 22:00 BST (UTC+0100)
End time: 04:00 BST (UTC+0100) Wednesday

Purpose:

Our emergency engineering works this past weekend were successful with
one exception: one router could not be replaced due to an
incompatibility with our long-range fibre SFPs. The original router
was therefore returned to service. This incompatibility was not
detected during testing as the lab links were short-range.

This maintenance window is to attempt the router replacement again
with new optics that are arriving tomorrow. The new optics will be
installed and the replacement router brought into service after
testing.

Services affected:

Cloud Ethernet customers single-homed out of Interxion LON1 (CLBIS02).

Impact:

HIGH - Customers with a single connection to our Interxion LON1
(CLBIS02) PoP will experience an interruption in service while the
device is replaced.

Please note: This maintenance window is being declared as EMERGENCY
without the normal notice period due to the need to perform this work
urgently in order to prevent a recurrence of the outage of 22nd May.

Regards,

David Croft



Jun
29
[Core Network] EMERGENCY Maintenance windows 6-8th July 2018
Posted by David Croft on 29 June 2018 01:07 PM
Please be advised of the following maintenance windows scheduled on
our core network:

Start date: Friday, 6th July 2018
Start time: 18:00 BST (UTC+0100)
End time: 04:00 BST (UTC+0100) Saturday

Start date: Saturday, 7th July 2018
Start time: 14:00 BST (UTC+0100)
End time: 04:00 BST (UTC+0100) Sunday

Start date: Sunday, 8th July 2018
Start time: 14:00 BST (UTC+0100)
End time: 04:00 BST (UTC+0100) Monday

Purpose:

We will be replacing core network devices to increase capacity and
prevent a recurrence of the outage suffered on 22nd May.

Services affected:

All

Impact:

LOW - Prior to 20:00 BST on each date
No works will take place that are known to have a possibility of
causing any service degradation; however, the network is "at-risk" due
to preparatory work taking place in core network PoPs.

MEDIUM - After 20:00 BST on each date
No works will take place that are known to have any possibility
of causing a total network outage; however, works will be carried out
that will cause a reduction of resilience or the removal of a single
PoP from service.

HIGH - After 23:59 BST on each date
Works that have any possibility of triggering a cascading failure
will be reserved to take place after midnight.

Due to the nature of the instability suffered on 22nd May, the
maintenance plan has been carefully developed to categorise works
according to both their expected (normal) risk and their potential
risk (due to the current situation), and to allocate them to the
maintenance windows as detailed above. The plan has been developed
conservatively due to the unexpected cascading nature of the previous
failure.

Prior to 20:00 BST on each date, it is anticipated that no customer
will suffer loss or degradation of service.

After 20:00 BST it is anticipated that single-homed customers will
experience loss of service; however, customers with resilience options
will fail over to alternate paths.

After 23:59 BST it is anticipated that all customers will suffer loss
of service.
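
For customers who wish to plan around these tiers, the schedule above
can be expressed as a small lookup. This is only an illustrative
sketch: the function name impact_level and its parameters are
hypothetical, it looks at BST wall-clock time only (the caller is
assumed to know that a window is open that night), and it treats
exactly 20:00 as MEDIUM and everything from midnight to 04:00 as HIGH,
which is one reading of the notice rather than a formal definition.

```python
from datetime import time
from typing import Optional

# Thresholds taken from the notice above; all times are BST wall-clock.
WINDOW_END = time(4, 0)     # every window closes at 04:00 the next morning
MEDIUM_FROM = time(20, 0)   # assumption: 20:00 itself counts as MEDIUM


def impact_level(local: time, window_start: time) -> Optional[str]:
    """Return the announced impact tier at a given BST time.

    window_start is 18:00 for the Friday window and 14:00 for the
    Saturday and Sunday windows. Returns None outside the window.
    """
    if local < WINDOW_END:
        # After midnight and before the window closes.
        return "HIGH"
    if local < window_start:
        # Daytime gap between one window closing and the next opening.
        return None
    if local >= MEDIUM_FROM:
        return "MEDIUM"
    return "LOW"
```

For example, impact_level(time(19, 0), time(18, 0)) falls in the LOW
tier of the Friday window, while any time between midnight and 04:00
falls in the HIGH tier.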

In all cases we will endeavour to keep periods of disruption to a minimum.

Please note: These maintenance windows are being declared as EMERGENCY
without the normal notice period due to the need to perform these
works urgently in order to prevent a recurrence of the outage of 22nd
May.

The immediate goal of this work is to stabilise the network to prevent
a further network outage. The second, non-essential part of the
upgrades will take place in August.

Regards,

David Croft



Jun
11
[Core Network] Service Alert 22nd May 2018 - Update 1
Posted by David Croft on 11 June 2018 08:33 AM
This is an incident notification regarding an outage on our core network.

Date: Tuesday, 22nd May 2018
Start time: 14:46 BST
End time: 16:44 BST (Final clear)

Services affected:

Intermittent partial and total outage across our IP network

Report:

A failure of a network device in our core IP network led to a cascade
of additional failures across the network.

Controlled shutdowns of portions of our network were necessary to
bring it back to a stable state in order to fully restore service.

Both the failure and the controlled shutdowns caused packet loss and
temporary routing failures to services hosted on or delivered through
our network.

Root Cause Analysis:

Memory exhaustion on an edge transit router caused it to restart its
BGP process, and the consequent withdrawal and announcement of all
routes from BGP caused other devices on the network to suffer similar
failures in a cascading fashion.

Next Steps:

We are bringing forward our upcoming planned network upgrades, which
will now take place in July. Emergency maintenance sessions will be
announced once lab testing has been completed.

In the meantime we will continue to observe the change freeze on our
network to prevent a further incident.

Regards,

David Croft



May
22
[Core Network] Service Alert 22nd May 2018
Posted by David Croft on 22 May 2018 05:35 PM
This is an incident notification regarding an outage on our core network.

Date: Tuesday, 22nd May 2018
Start time: 14:46 BST
End time: 16:44 BST (Final clear)

Services affected:

Intermittent partial and total outage across our network

Report:

A failure of a network device in our core network led to a cascade of
additional failures across the network.

Controlled shutdowns of portions of our network were necessary to
bring it back to a stable state in order to fully restore service.

Both the failure and the controlled shutdowns caused packet loss and
temporary routing failures to services hosted on or delivered through
our network.

Our network is built with a high level of resilience, and we and our
vendor are investigating as a matter of urgency how this failure was
able to have such a wide-reaching impact and the steps we will need to
take to prevent it recurring.

Regards,



Apr
25
[ADSL/FTTC] Service Alert 10th April 2018 - Update 3
Posted by David Croft on 25 April 2018 10:37 AM
This is an incident notification regarding ADSL/FTTC services.

Date: Tuesday, 10th April 2018
Start time: 11:00 BST
End time: 11:58 BST

Services affected:

ADSL/FTTC circuits provided by TTB (excluding BT and Cloud Ethernet FTTC).

Report:

An incident occurred on the TTB network that caused a large number of
ADSL/FTTC customers to disconnect and be mostly unable to reconnect.
This affected all customers across all ISPs using the TTB broadband
network.

TTB have informed us that the outage was due to a faulty card at
Telecity Manchester which caused a backplane failure. The faulty card
was replaced and traffic was re-established.

The carrier has postponed all planned internal network changes while
it investigates why the card failed and how this was able to cause
such a widespread outage, so that a recurrence can be prevented.

This is the final update on this service alert.

Regards,

David Croft