News
[Core Network] Service Alert 31st July 2014 - Update 2
Posted by David Croft on 31 July 2014 11:41 AM
This is an incident notification regarding Internet connectivity.

Date: Thursday, 31st July 2014
Start time: 09:13 BST (UTC+0100)
End time: 10:41 BST (UTC+0100)

Services affected:

Internet access.

Report:

We were observing intermittent degraded performance towards some
Internet destinations. VoIP services and networks accessed over
peering connections were not affected.

We determined that PT Telekomunikasi Indonesia (AS7713) were
redistributing our routes without our authorisation, interfering with
our global routing. We shut down the affected peering session and
notified the London Internet Exchange (LINX). This resolved the issue.
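For illustration only, the kind of check that reveals a route leak of this sort can be sketched as below. This is a minimal sketch under stated assumptions, not the tooling used during this incident: AS64500 stands in for our own AS number and AS64496 and AS64511 for other networks (all documentation-range placeholders), and the function name is hypothetical. It flags an AS path, as seen from an outside vantage point, in which the peer sits directly in front of our origin AS and is carrying our routes onward to a third party.

```python
from typing import List

def leaked_through_peer(as_path: List[int], origin_asn: int, peer_asn: int) -> bool:
    """Return True if as_path suggests peer_asn is re-announcing origin_asn's routes.

    as_path is ordered as BGP presents it: nearest AS first, origin AS last.
    """
    # Only consider paths that actually end at our own AS.
    if not as_path or as_path[-1] != origin_asn:
        return False

    # Collapse AS-path prepending (consecutive repeats of the same ASN).
    collapsed = [asn for i, asn in enumerate(as_path)
                 if i == 0 or asn != as_path[i - 1]]

    # A path of just [peer, origin] is the peer using our routes itself,
    # which is what a peering session is for. A leak needs a third AS
    # in front of the peer.
    if len(collapsed) < 3:
        return False

    # The peer is adjacent to our origin and someone else is behind it.
    return collapsed[-2] == peer_asn


OUR_ASN = 64500   # hypothetical placeholder for our own AS
PEER_ASN = 7713   # the peer named in this report

print(leaked_through_peer([64496, PEER_ASN, OUR_ASN], OUR_ASN, PEER_ASN))
print(leaked_through_peer([PEER_ASN, OUR_ASN], OUR_ASN, PEER_ASN))
```

A real check would also need to exclude networks that are authorised to carry the routes onward (transit providers), which this sketch does not model.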

PT Telekomunikasi Indonesia have informed us that their customer
Myanmar Telecom had a misconfiguration and that they have shut them
down. We will wait until this evening before re-establishing the
peering session.

This is the final update.

Comtec NOC is tracking this issue under ticket [#TWW-162-82542].

Regards,

David Croft



[Core Network] Service Alert 31st July 2014
Posted by David Croft on 31 July 2014 10:48 AM
This is an incident notification regarding Internet connectivity.

Date: Thursday, 31st July 2014
Start time: 09:13 BST (UTC+0100)

Services affected:

Internet access.

Report:

We were observing intermittent degraded performance towards some
Internet destinations. VoIP services were not affected. Engineering
are currently investigating.

We have determined that PT Telekomunikasi Indonesia (AS7713) were
redistributing our routes without our authorisation, interfering with
our global routing. We have shut down the affected peering session and
notified the London Internet Exchange (LINX).

We will continue to monitor the situation.

The next update will be issued in 30 minutes.

Comtec NOC is tracking this issue under ticket [#FXV-431-24381].

Regards,

David Croft



[Core Network] Service Alert 31st July 2014
Posted by David Croft on 31 July 2014 10:28 AM
This is an incident notification regarding Internet connectivity.

Date: Thursday, 31st July 2014
Start time: 09:13 BST (UTC+0100)

Services affected:

Internet access.

Report:

We are observing intermittent degraded performance towards some
Internet destinations. VoIP services are not affected. Engineering are
currently investigating.

The next update will be issued in 30 minutes.

Comtec NOC is tracking this issue under ticket [#FXV-431-24381].

Regards,

David Croft



[IP Voice Services] Service Alert 26th July 2014 - Update 2
Posted by David Croft on 30 July 2014 01:27 PM
This is an incident notification regarding IP Voice Services.

Date: Saturday, 26th July 2014
Start time: 12:36 BST (UTC+0100)
End time: 13:32 BST (UTC+0100)

Services affected:

IP Voice Services.

Report:

An incident occurred affecting registration on IPVS and network
connectivity to all ancillary services.

Root Cause:

All layer 3 connections on a primary core router dropped due to a
sudden failure with this element.

Due to the nature of the failure, whilst BGP sessions on the primary
router did fail over to the secondary as expected, the primary did not
release primary routing responsibility to its peer to complete the
failover. Traffic therefore continued to be routed to the primary
router but no further, which caused the service impact.
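The failure mode described above, a router that keeps the primary role after losing most of its BGP sessions, can be expressed as a simple health check. The following is a hypothetical sketch only: the `RouterState` fields, the function name and the 50% threshold are assumptions made for illustration and do not describe the monitoring actually deployed on the platform.

```python
from dataclasses import dataclass

@dataclass
class RouterState:
    name: str                 # hypothetical router label
    holds_primary_role: bool  # True while the router still claims primary routing responsibility
    bgp_established: int      # BGP sessions currently established
    bgp_configured: int       # BGP sessions configured on the router

def is_stranded_primary(state: RouterState, min_fraction: float = 0.5) -> bool:
    """Return True if the router holds the primary role but too few of its
    BGP sessions are up for it to forward traffic onward."""
    # A router that has released the primary role cannot strand traffic this way.
    if not state.holds_primary_role:
        return False
    # Avoid dividing by zero for a router with no BGP sessions configured.
    if state.bgp_configured == 0:
        return False
    return state.bgp_established / state.bgp_configured < min_fraction


# Illustrative values only: primary role retained with 1 of 10 sessions up.
print(is_stranded_primary(RouterState("core-1", True, 1, 10)))
```

An alert on this condition would indicate that traffic is being attracted to a router that can no longer pass it on, which is the state a forced failover (for example, a reboot) is meant to clear.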

At this time the root cause is being associated with the issue that
planned engineering works PEWA0195, scheduled for 03/08/2014, were
intended to correct. PEWA0195 was a planned reboot of one of the core
network routers to correct a memory issue.

In close coordination with our vendor it was expected that the
impacted router should have remained stable until that time. Due to
this incident however, the work is no longer necessary.

Symptoms:

Up to 50% of active call traffic was affected and up to 90% of users
on the platform experienced a drop in registration on the SBC (Session
Border Controller).

Whilst unregistered, users would have experienced failed outbound
calls, and inbound calls may not have been presented to user devices.

Inbound calls from the PSTN to DDIs that were forwarded to off-net
destinations routed as normal.

This also affected access to the platform for some supplementary
portals and services.

Resolution:

The engineering team rebooted the affected router at 13:32. The
failover to the secondary router took place as expected. At 13:37 the
primary router was brought back into service without issues.

Engineering are continuing to monitor all services to ensure there are
no ongoing problems or a recurrence of this issue.

All of the necessary logs have been taken from the affected router and
will be analysed in conjunction with the vendor to identify and
further confirm the underlying cause of the failure. Whilst this
investigation is ongoing, enhanced monitoring has been configured for
the router, based on the logs taken, to give advance warning of this
event recurring so that a maintenance window can be scheduled if
required. The extended measures now in place should ensure that prompt
and controlled actions are taken if required to prevent further
negative impact.

Timeline:

26/07/2014

12:36 - All layer 3 connections on a particular primary core router
were lost due to a failure with this element. Automated alerts were
sent to key Engineering and Support representatives to notify them of
the issue. We also picked this up with our own monitoring.

Due to the nature of the failure, most BGP sessions on the primary
router were down, but the router did not release service to the
secondary as expected, so traffic was routed to the primary but did
not progress beyond this point.

This affected access to the platform for most portals and services,
including registrations and call processing.

Calls in to the platform from the PSTN continued to be accepted by
platform services or redirected as configured back out to the PSTN.
However, calls would not have been presented to end user
devices/systems affected by the routing failure.

13:32 - Once the Engineering Team had fully investigated the issue and
identified the cause of the failure it was decided to reboot the
primary router in order to restore service.

The reboot correctly took the primary router out of service, the
failover from the primary to the secondary core router took place as
expected, and services were restored.

13:37 - The affected router returned to service as expected without
any errors and resumed the primary role for traffic processing.

Engineering are continuing to monitor all services to ensure there are
no ongoing problems or a recurrence of this issue.

Apologies for the inconvenience caused.

This is the final update.

Comtec NOC was tracking this issue under ticket [#KOH-381-15457].

Best regards,

David Croft



[IP Voice Services] Service Alert 26th July 2014 - Update 1
Posted by David Croft on 26 July 2014 01:35 PM
This is an incident notification regarding IP Voice Services.

Date: Saturday, 26th July 2014
Start time: 12:48 BST (UTC+0100)
End time: 13:32 BST (UTC+0100)

Services affected:

IP Voice Services.

Report:

We are currently investigating an issue affecting all registration on
IPVS and network connectivity to all ancillary services.

Our supplier has now restored service and we will issue further
information when it becomes available.

The next update will be issued when further information becomes available.

Comtec NOC is tracking this issue under ticket [#KOH-381-15457].

Best regards,

David Croft



[IP Voice Services] Service Alert 26th July 2014
Posted by David Croft on 26 July 2014 01:11 PM
This is an incident notification regarding IP Voice Services.

Date: Saturday, 26th July 2014
Start time: 12:48 BST (UTC+0100)

Services affected:

IP Voice Services.

Report:

We are currently investigating an issue affecting all registration on
IPVS and network connectivity to all ancillary services.

We have determined that this issue is not on our network and have
escalated it to our supplier. Further details will be supplied as they
become available.

The next update will be issued in 30 minutes.

Comtec NOC is tracking this issue under ticket [#KOH-381-15457].

Best regards,

David Croft