Data Center Outage
Incident Report for VPSBlocks Pty Ltd
Postmortem

Major Incident Postmortem

At 4:45 AM this morning a major fault occurred at the OMNIConnect Data Center, the facility where our infrastructure is located. The fault cut power to our equipment for longer than the UPS backup runtime OMNIConnect currently provides. The impact was severe: all servers, SANs, switches and other equipment were powered off. Once power was restored we worked as fast as possible to get services online by business hours. This was nearly achieved, with 98% of services back online at that time and 99.8% functioning correctly by 9:30 AM.

A report from OMNIConnect told us that the cause was a C-phase fuse which burnt out (not an unforeseeable event), taking down one of the three phases. However, the outage itself occurred because the transfer switch did not recognise that, with only one phase down, it needed to switch to generator power. As a result, although the generator was running, the transfer switch did not kick in to transfer power into the OMNI grid.
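
To illustrate the failure mode described above, here is a minimal sketch of two transfer decision rules. It is not OMNIConnect's actual transfer switch logic, which we have not seen. The names PhaseReadings, failed_phases and the two should_transfer functions are hypothetical, and the 230 V nominal voltage and 85% undervoltage threshold are assumed values chosen only for the example.

```python
from dataclasses import dataclass

# Assumed values for illustration only; real equipment settings will differ.
NOMINAL_VOLTS = 230.0       # assumed nominal phase-to-neutral voltage
UNDERVOLT_FRACTION = 0.85   # assumed fraction of nominal below which a phase counts as lost


@dataclass(frozen=True)
class PhaseReadings:
    """Hypothetical container for one voltage reading per phase."""
    a: float
    b: float
    c: float


def failed_phases(readings, nominal=NOMINAL_VOLTS, fraction=UNDERVOLT_FRACTION):
    """Return the names of the phases whose voltage is below the threshold."""
    threshold = nominal * fraction
    named = (("A", readings.a), ("B", readings.b), ("C", readings.c))
    return [name for name, volts in named if volts < threshold]


def should_transfer_all_phases_only(readings):
    """Rule that transfers only on a total loss of mains.

    A single blown fuse leaves two healthy phases, so this rule never fires.
    """
    return len(failed_phases(readings)) == 3


def should_transfer_any_phase(readings):
    """Rule that transfers as soon as any single phase is lost."""
    return len(failed_phases(readings)) >= 1


# One phase down, as in the reported C-phase fuse failure.
blown_c_fuse = PhaseReadings(a=230.0, b=230.0, c=0.0)
```

With the blown_c_fuse reading, the first rule leaves the load on mains even though one phase is dead, while the second rule requests a transfer to the generator.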

OMNI have assured us that they will be installing new redundant transfer switches, a secondary switchboard, and replacing equipment with higher-rated equipment (5x) within the next two weeks. In the meantime, new monitoring equipment will also be fitted. There are also plans for further upgrades over the next couple of months.

VPSBlocks works very hard to ensure our service and support are top quality, and we are disappointed to have been let down by one of our main suppliers in this instance. We are working with OMNIConnect to ensure that this does not recur, so we can focus on providing top-notch cloud hosting to our valued customers.

Thank you very much for your continued support.

Will Kruss
Technical Director

Posted 11 months ago. Sep 07, 2017 - 19:45 AEST

Resolved
All services have been restored. If you cannot reach your VM, please contact us via phone or email so we can check your server manually.
Posted 11 months ago. Sep 07, 2017 - 09:13 AEST
Identified
Apologies for the delay in updating. We are working hard to bring everything back online after a major data center issue.
Posted 11 months ago. Sep 07, 2017 - 07:47 AEST
Update
We are still investigating an outage at the data center.
Posted 11 months ago. Sep 07, 2017 - 05:11 AEST
Investigating
We are currently investigating an outage.
Posted 11 months ago. Sep 07, 2017 - 04:49 AEST