CloudScale Crash Recovery

Free & Open Source


A bad plugin update can take your entire site down in seconds — and if WordPress itself is broken, you cannot log in to fix it. CloudScale Crash Recovery watches your site from outside WordPress, detects crashes within minutes, and automatically rolls back the offending plugin before most visitors even notice. Completely free, no subscription, no premium tier.


Compatibility Checks


The Compatibility Checks tab runs 10 server-side tests to confirm your instance is ready for the system cron watchdog before you deploy it. Click Run Compatibility Checks to execute all checks in one pass.

What is tested:

  • PHP CLI — verifies that a PHP binary is available at the path the watchdog script will use. Required for WP-CLI calls.
  • shell_exec — confirms the shell_exec() PHP function is not disabled. The watchdog uses it to invoke WP-CLI from within PHP recovery fallbacks.
  • curl — confirms curl is available on the server PATH. The watchdog script probes your site URL using curl directly.
  • Probe endpoint — makes a test HTTP request to your configured probe URL and verifies a 200 response with the CLOUDSCALE_OK marker in the body.
  • Plugin directory permissions — checks that the watchdog can read and modify the wp-content/plugins/ directory (needed to identify and delete the crashing plugin).
  • WP-CLI — verifies WP-CLI is installed and executable at the configured path. The watchdog uses wp plugin deactivate and wp plugin delete during recovery.
  • Watchdog script presence — confirms the watchdog shell script has been deployed to /usr/local/bin/cs-crash-watchdog.sh and is executable.
  • Cron entry — checks whether a crontab entry for the watchdog script exists for the web server user.
  • Log file — verifies the log file at /var/log/cloudscale-crash-recovery.log is writable.
  • Legacy WP cron — checks whether DISABLE_WP_CRON is set in wp-config.php. WP-Cron should be disabled on production sites that use a real system cron, to avoid it triggering recovery logic at unexpected times.

Critical failures (marked in red) must be resolved before the watchdog can protect the site. Warnings (amber) are advisory and will not prevent the watchdog from running, but may reduce its reliability.
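A few of these checks can be approximated by hand from a shell. The following is a minimal sketch, not the plugin's own implementation: the helper names (check_command, check_writable, probe_ok) and the PASS/FAIL output format are assumptions made for illustration.

```shell
# Hypothetical helpers; names and output format are illustrative only.

# Report whether a command (php, curl, wp, ...) is available on PATH.
check_command() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "PASS: $1"
  else
    echo "FAIL: $1"
  fi
}

# Report whether a file is writable. If the file does not exist yet,
# fall back to checking the directory that would contain it.
check_writable() {
  target="$1"
  [ -e "$target" ] || target=$(dirname "$target")
  if [ -w "$target" ]; then
    echo "PASS: $1 writable"
  else
    echo "FAIL: $1 not writable"
  fi
}

# Succeed only if the probe returned 200 and the body contains the marker.
probe_ok() {
  status="$1"
  body="$2"
  [ "$status" = "200" ] || return 1
  case "$body" in
    *CLOUDSCALE_OK*) return 0 ;;
    *) return 1 ;;
  esac
}
```

For example, check_command curl and check_writable /var/log/cloudscale-crash-recovery.log, run as the web server user, mirror the curl and Log file checks listed above.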


Setup & Configuration


The watchdog requires a system cron job — not WordPress WP-Cron — to run reliably every minute. WP-Cron is visitor-triggered: it only fires when a page is requested, cannot guarantee minute-level frequency, and does not run at all if your site is down (which is exactly when you need it).

Setup steps:

  1. Copy the exact cron command shown in this panel. It is pre-populated with the correct PHP binary path and WordPress root for your server. It will look similar to:
    * * * * * /usr/bin/php /var/www/html/wp-content/plugins/cloudscale-crash-recovery/watchdog.php >> /var/log/cs-watchdog.log 2>&1
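  2. Open the crontab for the web server user: sudo crontab -u apache -e (Apache on RHEL/CentOS) or sudo crontab -u www-data -e (Apache or Nginx on Debian/Ubuntu). If your web server or PHP-FPM pool runs as a different user, substitute that user name. Using the web server user ensures the watchdog has the same filesystem permissions as WordPress itself.
  3. Paste the command, save, and exit. Verify the cron was registered by listing the crontab for the same user, for example: sudo crontab -u apache -l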
  4. Wait 2 minutes, then return to this panel. The Last probe timestamp should update and Watchdog Status should show Active.
  5. Click Test Connectivity to verify the watchdog can reach your site’s probe URL and that the HTTP response is being recorded correctly.
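Step 3 can also be checked from a script. The sketch below assumes the crontab text is piped in on standard input; has_watchdog_entry is a hypothetical helper name, not something the plugin ships.

```shell
# Hypothetical helper: succeed if the crontab text on stdin contains an
# active (non-comment) line that mentions the given script path.
has_watchdog_entry() {
  script_path="$1"
  # Drop comment lines first, then look for the path as a fixed string.
  grep -v '^[[:space:]]*#' | grep -qF -- "$script_path"
}
```

Typical use would be sudo crontab -u www-data -l | has_watchdog_entry watchdog.php && echo "cron entry found". Adjust the user name to match the one chosen in step 2.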

Configuration options:

  • Probe URL — the URL the watchdog sends a GET request to. Defaults to your WordPress home_url(). Change this if: your homepage has a redirect (use the final destination URL), your homepage requires authentication, or you want to probe a specific health-check endpoint. The probe uses wp_remote_get() with a 10-second timeout and sslverify = true.
  • Recovery window — how recently (in minutes) a plugin must have been modified on disk to be considered the crash culprit. Default: 10 minutes. If you deploy or update plugins frequently, reduce this to 5 minutes to avoid false positives. If you batch-install many plugins at once, increase it to 20–30 minutes.
  • Notification email — the watchdog sends a plain-text email via wp_mail() to this address on every recovery event. Leave blank to disable email notifications. Check your server’s mail delivery logs if emails are not arriving: tail -n 50 /var/log/maillog (RHEL/CentOS) or tail -n 50 /var/log/mail.log (Debian/Ubuntu)
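The recovery window can be pictured as a filter on file modification times. The sketch below is an illustration under stated assumptions, not the watchdog's actual code: find_culprit is a hypothetical function, it relies on the -mindepth and -mmin options found in GNU and BSD find, and it does not handle file names that contain newlines.

```shell
# Hypothetical helper: print the folder name of the plugin that owns the most
# recently modified file, considering only files changed within the window.
# Usage: find_culprit <plugins-dir> <window-in-minutes>
find_culprit() {
  plugins_dir="${1%/}"
  window_min="$2"
  newest=""
  while IFS= read -r file; do
    [ -n "$file" ] || continue
    # Keep whichever file has the later modification time.
    if [ -z "$newest" ] || [ "$file" -nt "$newest" ]; then
      newest="$file"
    fi
  done <<EOF
$(find "$plugins_dir" -mindepth 2 -type f -mmin "-$window_min")
EOF
  # Nothing changed inside the window: report no culprit.
  [ -n "$newest" ] || return 1
  # Strip the plugins directory prefix and keep the first path component.
  rel="${newest#"$plugins_dir"/}"
  echo "${rel%%/*}"
}
```

For example, find_culprit /var/www/html/wp-content/plugins 10 prints the folder of the plugin touched most recently in the last 10 minutes, or prints nothing and returns a non-zero status if no plugin file changed in that period. A shorter window narrows the set of candidates, which is why frequent deployers may prefer 5 minutes.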

Status & Log


Why CloudScale Crash Recovery?

Every WordPress site owner has experienced it: you click “Update” on a plugin, the page goes white, and suddenly your site is serving a 500 error to every visitor. Worse, you cannot log into wp-admin to fix it — because WordPress itself is broken.

CloudScale Crash Recovery watches your site from outside WordPress using your server’s system cron. It probes your site every minute. The moment it detects a crash, it automatically deactivates and removes the most recently modified plugin — the most likely cause — and re-probes to confirm recovery. The whole process takes under two minutes.

It is completely free. No premium version, no upgrade nag, no monthly fee. Install it, configure a system cron entry, and your site has automatic crash recovery running silently in the background.

The Watchdog Dashboard shows the real-time status of your automated crash recovery system. The watchdog is a PHP script invoked by a system cron job that makes an HTTP GET request to your site’s frontend URL every minute. If the response code is not 200 OK, it triggers the recovery sequence.

  • Watchdog status — Active: the system cron is configured and the watchdog has probed the site within the last 2 minutes. Inactive: no recent probe recorded; check your crontab or see Setup & Configuration.
  • Last probe — exact timestamp of the most recent health check. If this timestamp is more than 3 minutes old, the cron job may be stalled or the PHP process may be timing out. Investigate with grep CRON /var/log/syslog | tail -20 (Debian/Ubuntu), tail -20 /var/log/cron (RHEL/CentOS, including most cPanel servers), or grep cron /var/log/system.log | tail -20 (macOS).
  • Last response code — the HTTP status code from the most recent probe request. 200 = healthy. 500 = PHP fatal error (most common cause of plugin-induced crashes). 503 = maintenance mode active. 301/302 = your homepage redirects — update the Probe URL to the redirect target to avoid false positives.
  • Recovery count — cumulative number of automatic recoveries since the plugin was activated. A count above 0 means the watchdog has saved your site at least once. Each event is logged with full details in the Crash Log tab.
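The dashboard fields above reduce to two small decisions: how to read a status code, and whether the last probe is recent enough to count as Active. The following sketch shows both; the function names and the exact wording of the messages are assumptions, and timestamps are taken as Unix epoch seconds for simplicity.

```shell
# Hypothetical helper: translate a probe status code into the meaning
# described in the list above.
describe_status() {
  case "$1" in
    200) echo "healthy" ;;
    301|302) echo "redirect: update the Probe URL" ;;
    500) echo "PHP fatal error" ;;
    503) echo "maintenance mode" ;;
    *) echo "unexpected status" ;;
  esac
}

# Hypothetical helper: Active if the last probe is at most 120 seconds old.
# Usage: watchdog_state <now-epoch-seconds> <last-probe-epoch-seconds>
watchdog_state() {
  now="$1"
  last_probe="$2"
  if [ $((now - last_probe)) -le 120 ]; then
    echo "Active"
  else
    echo "Inactive"
  fi
}
```

For instance, watchdog_state "$(date +%s)" 0 reports Inactive, because a last probe at epoch 0 is far older than two minutes.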

How the recovery sequence works: On detecting a non-200 response, the watchdog calls WP-CLI (wp plugin deactivate and wp plugin delete) on the plugin whose files in wp-content/plugins/ have the most recent mtime within the configured recovery window (default: 10 minutes). It then re-probes the site. If the re-probe returns 200 OK, recovery is confirmed and the event is logged. If the re-probe still fails, the watchdog logs the failure and waits for the next cron cycle — it does not cascade-delete additional plugins.
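The sequence described above can be sketched in shell as follows. This is an illustration under stated assumptions rather than the shipped watchdog: the function names probe, run_wp, and recover_once are hypothetical, WP_ROOT defaults to /var/www/html as in the example cron command, and the culprit's folder name is passed in by the caller. The --skip-plugins and --skip-themes flags are standard WP-CLI global parameters; they are used here on the assumption that WP-CLI should not load the broken plugin while removing it.

```shell
# Assumed WordPress root; override by exporting WP_ROOT before sourcing.
WP_ROOT="${WP_ROOT:-/var/www/html}"

# Print the HTTP status code for a URL (curl prints 000 if the request fails).
probe() {
  curl -s -o /dev/null -w '%{http_code}' --max-time 10 "$1"
}

# Run WP-CLI without loading plugins or themes, so a broken plugin
# cannot crash WP-CLI itself.
run_wp() {
  wp --path="$WP_ROOT" --skip-plugins --skip-themes "$@"
}

# One recovery attempt: probe, remove the culprit, re-probe.
# Usage: recover_once <site-url> <culprit-plugin-folder>
recover_once() {
  site_url="$1"
  culprit="$2"
  if [ "$(probe "$site_url")" = "200" ]; then
    echo "healthy"
    return 0
  fi
  if [ -z "$culprit" ]; then
    echo "no-culprit"
    return 1
  fi
  run_wp plugin deactivate "$culprit" >/dev/null 2>&1
  run_wp plugin delete "$culprit" >/dev/null 2>&1
  # Re-probe once. On failure, stop and leave other plugins untouched.
  if [ "$(probe "$site_url")" = "200" ]; then
    echo "recovered"
  else
    echo "still-failing"
    return 1
  fi
}
```

The function removes at most one plugin per call, which mirrors the rule that the watchdog does not cascade-delete additional plugins when the re-probe still fails.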


Logs & Debug


The Crash Log records every recovery event with enough detail to conduct a post-incident review and take preventive action.

  • Timestamp — exact date and time (server timezone) when the crash was first detected by the watchdog probe.
  • HTTP status — the error code that triggered the recovery. 500 Internal Server Error is the most common; it indicates a PHP fatal error, typically caused by a plugin update that introduced a parse error or an incompatible function call. 502 Bad Gateway or 504 Gateway Timeout indicates a PHP-FPM or web server process crash rather than a PHP code error.
  • Plugin deactivated — the plugin folder name that was identified as the culprit and removed. Identification is based on the plugin directory with the most recent filesystem mtime within the recovery window. This is accurate when a plugin was updated or installed shortly before the crash, which covers the vast majority of plugin-induced outages.
  • Recovery confirmed — Yes if the re-probe returned 200 OK after the plugin was removed. No means the site was still returning an error after the plugin was deactivated; the crash may have a different root cause (database corruption, server resource exhaustion, theme error).
  • Response time — total elapsed time from crash detection to confirmed recovery, in seconds. Typical recovery time: 60–90 seconds (one cron cycle to detect + time to run WP-CLI + one re-probe).

Post-incident actions: After reviewing the log, reinstall the deactivated plugin only after confirming the crash root cause. Check the plugin’s changelog for known PHP compatibility issues, or test the plugin on a staging environment before reactivating on production.


Settings


The Settings tab contains miscellaneous options that operate independently of the watchdog.

Custom 404 Page

When enabled, this option replaces the default theme-rendered WordPress 404 response with a clean, self-contained branded page that has no dependency on the active theme or any page builder. The 404 page therefore renders correctly even if the active theme is broken or missing, which is particularly useful during a crash recovery event.

To preview: enable the toggle, then visit any URL on your site that does not exist (e.g. /this-page-does-not-exist). Disable the toggle to revert to the default theme 404 behaviour.