I’m running a simple development server (Ubuntu) on which MySQL and MongoDB sometimes crash. I always restart them with sudo service mysql restart.
Although I know I need to investigate why they crash—and I will—I’m currently looking for a way to automatically restart them after they've crashed. I guess I need to have some kind of daemon which pings them and restarts them if they are not responsive anymore, but I'm unsure of how to do this.
I read about tools like Nagios, but I guess that’s a bit overkill for my situation.
Does anybody know how I can get started?
Answer
Easy. Look into setting up monitoring configurations with Monit. It's a lightweight, easy-to-set-up system monitoring tool that is very useful in exactly the scenario you describe: a service goes down, Monit restarts it and alerts you about it.
I've mainly used it for Apache web servers, but there are lots of examples of what can be done for other programs such as MySQL.
Setting up Monit.
The way I set it up is like this. First, install the Monit package itself:
sudo apt-get install monit
Once it's installed, edit the main config file; I prefer to use nano, but feel free to use whatever text editor you prefer:
sudo nano /etc/monit/monitrc
Adjust the default daemon values so that services are checked every 60 seconds, with a start delay of 60 seconds:
set daemon 60
with start delay 60
Then find the mailserver area of monitrc and add the following line. Postfix or another SMTP server needs to be running for this to work; I typically have Postfix installed on my servers, so I use the following setup:
set mailserver localhost
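Optionally, if you'd rather set one global alert recipient and a nicer From: address instead of repeating the alert line in every service file, I believe directives along these lines also go in monitrc (the addresses here are just placeholders):
# optional: global From: address and alert recipient (placeholder addresses)
set mail-format { from: monit@example.com }
set alert email_address@example.com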
Then I make sure a Monit config directory is set up like this:
sudo mkdir -p /etc/monit/conf.d
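Also make sure monitrc actually includes that directory; the stock Ubuntu monitrc usually already has an include line like the one below near the bottom, but if it is missing or commented out, add it so the per-service files get picked up:
include /etc/monit/conf.d/*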
Setting up a Monit Apache2 monitoring ruleset.
Now, like I said, I mainly use Monit for Apache monitoring, so this is a simple config I like to use, but the basic concept is similar for MySQL, MongoDB, or other services. I would save it in this file:
sudo nano /etc/monit/conf.d/apache2.conf
And this would be the contents of that file:
check process apache with pidfile /var/run/apache2.pid
start "/usr/sbin/service apache2 start"
stop "/usr/sbin/service apache2 stop"
if failed host 127.0.0.1 port 80
with timeout 15 seconds
then restart
alert email_address@example.com only on { timeout, nonexist }
The syntax is fairly self-explanatory, but basically:
- The check hinges on the apache2.pid file; be sure to change that path to match the actual location of your apache2.pid or httpd.pid in your environment.
- Then it has start and stop commands connected to the process.
- And it has logic that monitors the web server on port 80 on localhost (127.0.0.1).
- And it only acts if the server is unreachable for 15 seconds.
- And if it has to act, it attempts a restart.
- And it then sends an alert to the specified email address whenever the server times out or the process does not exist.
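Once you have a ruleset like this in place, you can also have Monit validate the control file and everything it includes before restarting it; monit -t just checks the syntax and exits:
sudo monit -t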
Setting up a Monit MySQL monitoring ruleset.
Based on the examples I linked to above, I would assume a config like this would work for MySQL. First, create a file like this:
sudo nano /etc/monit/conf.d/mysql.conf
And I have adapted the example so that, I would assume, it behaves similarly to what I have set up for Apache:
check process mysqld with pidfile /var/run/mysqld/mysqld.pid
start program = "/usr/sbin/service mysql start"
stop program = "/usr/sbin/service mysql stop"
if failed host 127.0.0.1 port 3306 protocol mysql
with timeout 15 seconds
then restart
alert email_address@example.com only on { timeout, nonexist }
Of course that should be tweaked to match your actual working environment, such as adjusting the location of mysqld.pid and the email address, but past that it's fairly generic in idea and implementation.
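As a side note, if your MySQL only listens on a Unix socket rather than TCP, I believe Monit can test the socket instead; the socket path below is the usual Debian/Ubuntu default and is an assumption you should check against your my.cnf:
# hypothetical alternative to the TCP check above; verify the socket path first
if failed unixsocket /var/run/mysqld/mysqld.sock protocol mysql
with timeout 15 seconds
then restart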
Once that is set, restart Monit and all should be good:
sudo service monit restart
Setting up a Monit MongoDB monitoring ruleset.
To create a MongoDB monitoring ruleset, create a file like this:
sudo nano /etc/monit/conf.d/mongod.conf
And here is the MongoDB monitoring rule; note that it matches the active MongoDB daemon process rather than a PID file (i.e. mongod.lock), since that didn't seem to work:
check process mongod matching "/usr/bin/mongod"
start program = "/usr/sbin/service mongod start"
stop program = "/usr/sbin/service mongod stop"
if failed host 127.0.0.1 port 27017 protocol http
with timeout 15 seconds
then restart
alert email_address@example.com only on { timeout, nonexist }
Of course that should be tweaked to match your actual working environment, such as adjusting the actual path of the /usr/bin/mongod binary and the email address, but past that it's fairly generic in idea and implementation.
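If your mongod does write a proper pidfile (for example via the pidfilepath setting in its config file), a pidfile-based check in the same style as the MySQL one might work too; treat this as a hypothetical variant with assumed paths, since the matching approach above is what actually worked for me:
# hypothetical pidfile-based variant; assumes mongod writes this pidfile
check process mongod with pidfile /var/run/mongodb/mongod.pid
start program = "/usr/sbin/service mongod start"
stop program = "/usr/sbin/service mongod stop"
if failed host 127.0.0.1 port 27017 protocol http
with timeout 15 seconds
then restart
alert email_address@example.com only on { timeout, nonexist }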
Once that is set, restart Monit and all should be good:
sudo service monit restart
Monitoring Monit.
You can follow the Monit log to see it in action:
sudo tail -f -n 200 /var/log/monit.log
And as a test, you can simply stop the MySQL or MongoDB server and then see what shows up in that log. If all goes well you should see the whole monitoring and restart process happen, including an email being sent to the address you set in the config.
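For example, assuming MySQL is the service you want to test, something like this should show Monit noticing the dead process and bringing it back within a check cycle or two:
# simulate a crash by stopping the service yourself
sudo service mysql stop
# then watch Monit detect the failure and restart the service
sudo tail -f -n 200 /var/log/monit.log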