Unexpected downtime due to storage failure [update 3:48pm]

gocept.net is currently experiencing infrastructure-wide downtime caused by an error in the storage layer.

We are currently recovering the affected services and will provide further updates soon.

2011-04-12 2:58pm CEST


We identified an issue in the iSCSI server software for which the vendor has a patch available. After initial tests in our staging environment, we are preparing a package to apply this patch to our production systems within the next 30 minutes.

2011-04-12 3:26pm CEST

The patch has been applied to the storage server that had hung, and the VMs are now being brought back into a working state.

The other storage server, which has the same issue but is currently working, will receive this patch immediately as well. We do not expect further interruptions on VMs that are currently working, because the patch applied cleanly in the staging systems while under load.

2011-04-12 3:48pm CEST

The patch has now been applied cleanly on all storage servers. VMs that had a read-only filesystem have been rebooted and are back online.

If you are affected, please check that your services are back online. Nagios still needs some time to return to a completely green state, and we will follow up on any remaining issues in the near future.


We apologize for any inconvenience.