With a Server Failure Behind Us, We’re Moving Forward

As you may know, in the middle of last week we suffered a catastrophic failure of our oldest server, Paul. This happened because a customer account was breached, likely through an unpatched vulnerability in either their content management system or one of its plugins. Our security measures thwarted the attack temporarily, but it appears that, through brute force, the attacker eventually got one of their Perl- or Python-based scripts to create a new user account (firefart) and then elevate that account to root-level access, allowing them to do almost anything they pleased on the server.

In more than four years of operating CleverHost, and several more years working at other web hosting companies, this is the first time we’ve ever seen this happen. We wanted to complete a thorough forensic analysis of the server to determine the precise nature of the breach, but the attacker left it in such a state that this will not be possible.

Service was restored late last Friday night, just before midnight. Paul is now running again, along with a new server called Alia, and all but a handful of minor issues have been dealt with. We have taken, and are continuing to take, numerous steps with both our new and existing servers to help prevent something like this from happening again, including:

  • Purchasing KernelCare, a service from well-respected vendor CloudLinux that automatically patches the operating system kernel without the need for a server reboot;
  • Running scripts to bring any supported applications (Drupal, WordPress, MyBB, etc.) under Installatron’s management; this makes it easier for our users to apply security patches to applications, plugins, and themes, including automatically;
  • Moving to cloud virtual private servers (VPS) instead of dedicated servers, with automatic daily backups; this will allow us to restore a server in hours rather than days, and provides redundant backups;
  • Procuring a new, faster, and more reliable backup system for individual accounts and files; we’re currently evaluating software from R1Soft and JetBackup Manager;
  • Investigating other measures, such as Imunify360, implementing Google BBR for better network performance, and installing RKHunter’s cPanel plugin for more frequent rootkit checks.

Earlier today our other legacy server, Chani, also experienced one hour of downtime due to an unknown issue. Data centre technicians couldn’t get the server to respond, so they rebooted it manually to restore service. As this is not the first time Chani has experienced a random outage, we will also be moving it to the new cloud infrastructure that Paul is now on. Chani’s migration, during which it will be split into two servers, will be scheduled for a few weeks from now. Customers will be notified and given ample opportunity to work with us on alternative dates and times to perform DNS and related updates if needed.

We understand that being able to trust your service providers is paramount to doing business with them, and we hope to restore your faith and trust in CleverHost.