I’m running a simple development server (Ubuntu) on which MySQL and MongoDB sometimes crash. I always restart them with sudo service mysql restart.
Although I know I need to investigate why they crash—and I will—I’m currently looking for a way to automatically restart them after they've crashed. I guess I need to have some kind of daemon which pings them and restarts them if they are not responsive anymore, but I'm unsure of how to do this.
I read about tools like Nagios, but I guess that’s a bit overkill for my situation.
Does anybody know how I can get started?
Easy. Look into setting up monitoring configurations with Monit. It’s a lightweight, easy-to-set-up system monitoring tool that is very useful in exactly the scenario you describe: a service goes down, Monit restarts it and alerts you about it.
I’ve mainly used it for Apache web servers, but there are lots of examples of what can be done for other software such as MySQL.
Setting up Monit.
The way I set it up is like this. First, install Monit itself:
sudo apt-get install monit
Once installed, edit the config; I prefer to use nano, but feel free to use whatever text editor you prefer:
sudo nano /etc/monit/monitrc
Adjust the default daemon values so Monit checks services every 60 seconds with a start delay of 60:
set daemon 60
with start delay 60
Then find the mailserver area of monitrc and add the following line. Postfix or another SMTP server needs to be running for this to work; I typically have Postfix installed on my servers, so I use the following setup:
set mailserver localhost
Then I make sure a Monit config directory is set up like this:
sudo mkdir -p /etc/monit/conf.d
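On Ubuntu, the stock monitrc normally ends with an include directive that pulls everything in from that directory; if yours doesn’t have it, add one so the per-service files below actually get loaded (this assumes the conf.d path created above):

```
include /etc/monit/conf.d/*
```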
Setting up a Monit Apache2 monitoring ruleset.
Now, like I said, I mainly use Monit for Apache monitoring, so this is a simple config I like to use, but the basic concept is similar for MySQL, MongoDB, or other services. I would save it in this file:
sudo nano /etc/monit/conf.d/apache2.conf
And this would be the contents of that file:
check process apache with pidfile /var/run/apache2.pid
    start program = "/usr/sbin/service apache2 start"
    stop program = "/usr/sbin/service apache2 stop"
    if failed host 127.0.0.1 port 80
        with timeout 15 seconds
        then restart
    alert email_address@example.com only on { timeout, nonexist }
The syntax is fairly self-explanatory, but basically:
- The process check hinges on apache2.pid; be sure to change that path to match the actual location of your apache2.pid or httpd.pid in your environment.
- The start and stop lines tell Monit which commands start and stop the process.
- The if failed rule monitors the web server on port 80 on localhost (127.0.0.1).
- It only acts if the server is unreachable for 15 seconds.
- And if it has to act, it attempts a restart.
- Finally, it sends an alert to the specified email address when the server times out or the process does not exist.
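Before relying on a new ruleset, it’s worth validating it; Monit has a built-in syntax check. The commands below assume the sysvinit-style service setup used throughout this answer:

```shell
# Check monitrc and everything it includes for syntax errors
sudo monit -t

# Restart Monit so it picks up the new ruleset
sudo service monit restart

# Show the current status of all monitored services
sudo monit status
```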
Setting up a Monit MySQL monitoring ruleset.
Based on the examples mentioned above, I would assume a config like this would work for MySQL. First, create a file like this:
sudo nano /etc/monit/conf.d/mysql.conf
And I have adapted the example so that, I would assume, it behaves similarly to my Apache setup:
check process mysqld with pidfile /var/run/mysqld/mysqld.pid
    start program = "/usr/sbin/service mysql start"
    stop program = "/usr/sbin/service mysql stop"
    if failed host 127.0.0.1 port 3306 protocol mysql
        with timeout 15 seconds
        then restart
    alert email_address@example.com only on { timeout, nonexist }
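Once Monit has been reloaded, you can confirm it is actually watching MySQL. The service name below (mysqld) is the check process name from the config above; recent Monit versions accept a service name as a filter:

```shell
# One-line summary of all monitored services
sudo monit summary

# Detailed status for the mysqld check defined above
sudo monit status mysqld
```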
Of course, that should be tweaked to match your actual working environment, such as adjusting the location of mysqld.pid, the email address and such, but past that it’s fairly generic in ideas/implementation.
Once that is set, restart monit and all should be good:
sudo service monit restart
Setting up a Monit MongoDB monitoring ruleset.
To create a MongoDB monitoring ruleset, create a file like this:
sudo nano /etc/monit/conf.d/mongod.conf
And here is the MongoDB monitoring rule; note that it matches the running MongoDB daemon binary rather than a PID file (aka mongod.lock), since the PID-file approach didn’t seem to work:
check process mongod matching "/usr/bin/mongod"
    start program = "/usr/sbin/service mongod start"
    stop program = "/usr/sbin/service mongod stop"
    if failed host 127.0.0.1 port 27017 protocol http
        with timeout 15 seconds
        then restart
    alert email_address@example.com only on { timeout, nonexist }
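You can also manually poke the port Monit will be checking. Older MongoDB builds reply to a plain HTTP request on the driver port with a short warning string, which is presumably why the protocol http check above works. This is just a sanity check and assumes curl is installed:

```shell
# Quick sanity check that something is listening on the MongoDB port;
# older MongoDB versions answer plain HTTP here with a warning message.
curl -s http://127.0.0.1:27017 || echo "nothing listening on 27017"
```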
Of course, that should be tweaked to match your actual working environment, such as adjusting the actual path of the /usr/bin/mongod binary, the email address and such, but past that it’s fairly generic in ideas/implementation.
Once that is set, restart monit and all should be good:
sudo service monit restart
Monitoring Monit.
You can follow the Monit log to see it in action:
sudo tail -f -n 200 /var/log/monit.log
And as a test, you can simply stop the MySQL or MongoDB server and then watch what shows up in that log. If all goes well, you should see the whole monitoring-and-restart cycle happen, including an email being sent to the address you set in the config.
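For example, to simulate a MySQL crash and watch Monit react (with the 60-second daemon interval set above, it can take up to a minute for the failed check to be noticed):

```shell
# Stop MySQL to simulate a crash
sudo service mysql stop

# Watch Monit notice the failed connection test and run the restart
sudo tail -f /var/log/monit.log
```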