There is currently an ongoing, unplanned outage of our IPv6 connectivity in the data center.
We are working with our uplink providers to restore connectivity.
IPv4 connectivity is not affected. However, services that rely on outside services providing dual-stack networking (IPv4+IPv6) may experience timeouts and delays.
We will post updates here as we work towards a solution.
We are sorry for any inconvenience.
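For client code that hits the dual-stack timeouts described above, one possible mitigation is to try IPv4 addresses before IPv6 ones and to use a short connect timeout. The following is a minimal sketch using only the Python standard library; the function names and the 3-second timeout are illustrative assumptions, not part of our platform.

```python
import socket


def ipv4_first(addrinfos):
    """Return getaddrinfo() results reordered so IPv4 entries come first."""
    # sorted() is stable, so the resolver's order is kept within each family.
    return sorted(addrinfos, key=lambda info: info[0] != socket.AF_INET)


def connect_preferring_ipv4(host, port, timeout=3.0):
    """Try each resolved address in turn, IPv4 first, with a short timeout."""
    last_error = None
    candidates = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    for family, socktype, proto, _canonname, sockaddr in ipv4_first(candidates):
        sock = None
        try:
            sock = socket.socket(family, socktype, proto)
            sock.settimeout(timeout)
            sock.connect(sockaddr)
            return sock
        except OSError as exc:
            # Remember the failure and fall through to the next address.
            last_error = exc
            if sock is not None:
                sock.close()
    raise last_error or OSError("no usable address for %s" % host)
```

Calling `connect_preferring_ipv4("example.org", 80)` returns a connected socket or raises the last connection error. This only changes the order in which addresses are tried, so IPv6 is still used when no IPv4 address works.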
[Update 1 - 11:52 CET]
The cause seems to have been in the data center's upstream router infrastructure. Connectivity is improving and we see IPv6 traffic picking up again. We know of a couple of remaining edge cases and continue to work on restoring full connectivity.
[Update 2 - 14:51 CET, solved]
IPv6 connectivity has been restored. Some nodes are still recovering from the outage but we are seeing continuous improvement in our monitoring.
The root cause was traced to a Cisco IOS bug involving more than 10,000 IPv6 routing entries, and our upstream provider has implemented a workaround for the problem. We expect network maintenance some time in the coming weeks to install a fixed IOS version on the network operator's equipment.
Major OS update roll-out starting from 2013-03-04
It has been a long time since our last operating system update, longer than we expected. However, we are happy to announce that we have finished our QA on a major set of updates covering many of the packages maintained in our platform.
We will roll out the update incrementally between 2013-03-04 and 2013-03-09. All machines will be assigned a maintenance slot in the next few days, and we will inform you of the assigned slots by email.
If you own any testing or staging environments then those will be updated at least 24 hours before the production systems.
Downtime expectations
Your VMs will be restarted at least two times:
- once during the week in their assigned slot to activate a new kernel configuration, and
- once on Saturday 2013-03-09 around 10:00-14:00 CET to restart all physical KVM hosts. This will take less than 30 minutes for any specific VM.
Important changes
There have been many improvements and bugfixes, most notably on our infrastructure, to make the platform more reliable and faster without you having to worry about it.
However, there are a few changes that we would like you to be aware of to avoid pitfalls:
- Varnish will be upgraded to version 3 - most configurations from Varnish 2 will continue to work correctly, but some subtle changes may break your config. The Varnish community has provided a nice upgrade document that summarises relevant changes.
- Due to a change in the license for Oracle/Sun JDK we decided to switch the Java VM to OpenJDK as we are no longer allowed to automatically distribute the Oracle/Sun JDK.
- Python 2.4 is now in its "sunset" period: we will no longer install it on any additional machines, and we will announce a roadmap for uninstalling Python 2.4 within the next few months. This version of Python is very old, unsupported and has known security issues.
- We no longer install "swftools" and will even actively uninstall them as the upstream developers do not maintain them any longer.
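To get a first impression of whether a Varnish 2 configuration might be affected, a rough scan for constructs that changed in Varnish 3 can help. The sketch below is only an illustration: the pattern list reflects our reading of the Varnish 2.1 to 3.0 upgrade notes and is neither complete nor authoritative, and the function name is made up for this example. Please check every hit against the upgrade document.

```python
import re

# Assumed checklist of Varnish 2 constructs that changed in Varnish 3.
# This list is an illustrative subset, not a complete migration guide.
VARNISH2_PATTERNS = [
    (re.compile(r"\breq\.hash\s*\+="),
     "req.hash += ... was replaced by hash_data(...)"),
    (re.compile(r"\bpurge(_url)?\s*\("),
     "purge()/purge_url() were renamed to ban()/ban_url()"),
    (re.compile(r"^\s*esi\s*;"),
     "esi; was replaced by set beresp.do_esi = true;"),
    (re.compile(r"\b(beresp|obj)\.cacheable\b"),
     "the cacheable flag was removed"),
]


def find_varnish2_constructs(vcl_text):
    """Return (line_number, line, hint) tuples for lines that look Varnish 2 specific."""
    findings = []
    for number, line in enumerate(vcl_text.splitlines(), start=1):
        for pattern, hint in VARNISH2_PATTERNS:
            if pattern.search(line):
                findings.append((number, line.strip(), hint))
    return findings
```

Reading a VCL file and passing its text to `find_varnish2_constructs` yields one entry per suspicious line. The scan is purely textual, so it may also flag matches inside comments or strings and will miss anything not in the list.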
If you are interested in more technical details, feel free to take a look at our change log. A detailed list of updated package versions will appear there.
What's next?
Having finished this major release, we are already looking forward to the next big thing. In the coming months we expect to drastically improve our storage system by introducing Ceph instead of iSCSI. Preliminary work on the new storage systems has started already and we are really excited to get this done.
If you look closely you will notice that we haven't upgraded the kernel with this update. This is intentional and related to the upcoming storage overhaul: the existing Linux iSCSI stack unfortunately doesn't port very well to newer kernel versions, and instead of switching our iSCSI implementation while working on Ceph, we decided to stay on a kernel that allows a smooth transition to Ceph.
[Update 1, 2013-02-14]
The date for the second reboot contained a typo: it showed 2013-02-09 for the KVM server reboot, but should have read 2013-03-09.