jim.shamlin.com

Automation via crontab

A crontab (short for "cron table") is a file that instructs the server to execute commands at certain times, which is ideal for performing maintenance and batch processing of collected data. Of importance: the Cron daemon itself does nothing but kick off other programs.


Accessing Crontab

Each user account has its own crontab file (provided that access to crontab is not disabled). It can be edited by logging into the shell and executing crontab -e - or better still, edit it locally, upload it to the server as an ASCII file, and execute crontab filename.txt. Be aware that installing a file this way replaces the account's existing crontab entirely.
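The edit-locally-and-install approach might look something like the sketch below. The file name mycrontab.txt and the backup file name are placeholders chosen for illustration, and the install commands are left commented out so that nothing is overwritten by accident.

```shell
# Hypothetical file name; any plain ASCII text file will do.
CRONTAB_FILE="${CRONTAB_FILE:-mycrontab.txt}"

# Write the schedule to a local file, one entry per line.
cat > "$CRONTAB_FILE" <<'EOF'
0 * * * * /usr/local/etc/maint/hourly.script
EOF

# Installing the file REPLACES the account's whole crontab, so these
# commands are left commented out here:
# crontab -l > crontab.backup    # save the current entries first
# crontab "$CRONTAB_FILE"        # install the new file
# crontab -l                     # list what is now installed

# Show what would be installed.
cat "$CRONTAB_FILE"
```

Running crontab -l after the install is a cheap way to confirm that the server accepted the file.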


Basics

A crontab file usually has multiple lines that look like this:

0 * * * * /usr/local/etc/maint/hourly.script
0 0 * * * /usr/local/etc/maint/daily.script
0 0 1 * * /usr/local/etc/maint/monthly.script

Each line has six fields - five indicating what time to take action, and a sixth indicating the action to take. When the appointed time arrives, the Cron daemon executes the action indicated as if it were being done by the user account to which the crontab file belongs.


Specifying Time

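Time is specified as five fields, separated by spaces, in this order:

minute (0-59)
hour (0-23)
day of month (1-31)
month (1-12)
day of week (0-6, where 0 is Sunday)

An asterisk (*) is used as a wildcard for any of the above.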

Values can also be specified as a comma-separated list. For a program to run every 15 minutes, specify the minute value as 0,15,30,45 (being careful not to insert spaces after the commas).
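To make the matching rule concrete, here is a small sketch of how a single field could be compared against the current time. The function field_matches is a made-up helper for illustration, not Cron's actual code, and it only understands the wildcard and comma-separated lists described above.

```shell
# field_matches FIELD VALUE
# Succeeds if VALUE is selected by FIELD, where FIELD is either "*"
# or a comma-separated list of numbers such as 0,15,30,45.
# (Illustrative helper only; it is not how Cron is implemented.)
field_matches() {
    field=$1
    value=$2
    [ "$field" = "*" ] && return 0
    old_ifs=$IFS
    IFS=,                       # split the list on commas
    for item in $field; do
        if [ "$item" -eq "$value" ]; then
            IFS=$old_ifs
            return 0
        fi
    done
    IFS=$old_ifs
    return 1
}

field_matches "0,15,30,45" 30 && echo "minute 30: run"
field_matches "0,15,30,45" 31 || echo "minute 31: skip"
field_matches "*" 31 && echo "minute 31 with wildcard: run"
```

An entry fires only when every one of its time fields matches the current minute in this way.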

Caveat #1: Beware of Wildcards

Keep in mind that the Cron daemon will check the tab every minute and, if the current time matches, the command will be executed. For that reason, you have to be careful with wildcards. Consider the following:

* * * * 1 /usr/local/etc/maint/weekly.script

This is a botched crontab line for a script that is intended to run once per week, each Monday. However, it's going to execute 1,440 times in a row - each minute of each hour - on each Monday. To set it to run once a week, you have to specify a time:

0 0 * * 1 /usr/local/etc/maint/weekly.script

With that setting, the script will run once per week, at midnight on each Monday.
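The difference between the two lines is easy to check by brute force. The sketch below walks through every minute of a single day and counts how many match a given pair of minute and hour fields; count_matches is a hypothetical helper that handles only a wildcard or a single number in each field.

```shell
# count_matches MINUTE_FIELD HOUR_FIELD
# Counts the minutes in one day that match the two fields.
# Each field may be "*" or a single number (illustration only).
count_matches() {
    count=0
    hour=0
    while [ "$hour" -lt 24 ]; do
        minute=0
        while [ "$minute" -lt 60 ]; do
            if { [ "$1" = "*" ] || [ "$1" -eq "$minute" ]; } &&
               { [ "$2" = "*" ] || [ "$2" -eq "$hour" ]; }; then
                count=$((count + 1))
            fi
            minute=$((minute + 1))
        done
        hour=$((hour + 1))
    done
    echo "$count"
}

count_matches "*" "*"    # wildcard minute and hour: 24 x 60 = 1440 runs
count_matches 0 0        # minute 0 of hour 0: 1 run
```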

Caveat #2: Beware of the Last Day of the Month

Maybe this is obvious, but it's easily overlooked: If you set a crontab for day-of-month 31, it will not execute in February, April, June, September, or November (which have fewer than 31 days). Either use the first day of the month (which is close enough) or pick a date no later than the 28th.
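If a task genuinely has to run on the last day of each month, one possible workaround (my own suggestion, not a feature of Cron) is to schedule it for days 28,29,30,31 and have the script quit early unless today really is the last day. The helper names below are invented for this sketch.

```shell
# days_in_month YEAR MONTH: print the number of days in that month.
days_in_month() {
    year=$1
    month=${2#0}                # allow "02" as well as "2"
    case $month in
        4|6|9|11) echo 30 ;;
        2)
            if [ $((year % 4)) -eq 0 ] &&
               { [ $((year % 100)) -ne 0 ] || [ $((year % 400)) -eq 0 ]; }; then
                echo 29
            else
                echo 28
            fi ;;
        *) echo 31 ;;
    esac
}

# is_last_day YEAR MONTH DAY: succeed only on the final day of the month.
is_last_day() {
    [ "${3#0}" -eq "$(days_in_month "$1" "$2")" ]
}

is_last_day 2024 02 29 && echo "2024-02-29 is the last day of its month"
is_last_day 2024 02 28 || echo "2024-02-28 is not"

# Possible guard at the top of the script that Cron runs:
#   set -- $(date '+%Y %m %d')
#   is_last_day "$1" "$2" "$3" || exit 0
```

For most maintenance work the simpler advice above still applies: the first of the month is close enough.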

Caveat #3: Beware of Midnight

People seem overly fond of doing things at specific times - and when it comes to maintenance tasks, that specific time is often set to exactly midnight each night (0 0 * * * in the crontab file).

Technically, that's not a problem - a crontab can do that - but procedurally, what do you suppose happens when five or six hundred users on a single server all kick off a few data-intensive processes at exactly the same time? Incidentally, what time of day do you think most Web servers crash?

Even if you're not motivated to take on the burden of being a good neighbor (or even if you have a dedicated server), your processes will run more smoothly if you schedule them for a low-traffic time (which is usually between 3 and 5 am eastern, in the USA - but check your activity reports to find your own server's particular trend) and spread them out, so you don't have a lot of demand on the server's capacity all at once.
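As a rough illustration of spreading tasks out, the sketch below prints crontab entries for three daily jobs, starting at 3:10 am server time and spaced twenty minutes apart. The script names, the start time, and the spacing are all arbitrary assumptions; pick values that suit your own server's quiet hours.

```shell
# staggered_entries: print one crontab entry per job, 20 minutes apart,
# all within the 3 am hour. Job names here are hypothetical.
staggered_entries() {
    minute=10
    for job in cleanup.script report.script backup.script; do
        echo "$minute 3 * * * /usr/local/etc/maint/$job"
        minute=$((minute + 20))
    done
}

staggered_entries
```

The output can be pasted into the crontab file in place of a stack of 0 0 * * * lines.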


Commands

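The command passed to cron is usually in the form of a path to a script, but it may also be a literal command, just as it would be entered from the command prompt. If you need to execute multiple commands, they can be chained on one line with ; or && rather than having multiple lines for the same minute. Be careful with the % character: an unescaped % in the command is treated as a newline, and everything after the first one is fed to the command as standard input, so a literal percent sign must be written as \%.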
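Cron hands the command portion of the line to a shell, so the shell's usual rules for chaining commands apply. The sketch below imitates that with a made-up wrapper, run_like_cron, which simply passes a string to sh -c; the commands themselves are placeholders.

```shell
# run_like_cron COMMAND: pass the command string to a shell, roughly
# as Cron does with the last field of a crontab line (illustration only).
run_like_cron() {
    sh -c "$1"
}

# With ';' the second command runs no matter what the first one did.
run_like_cron 'false; echo "after a semicolon: still runs"'

# With '&&' the second command runs only if the first one succeeded.
run_like_cron 'true && echo "after &&: runs because the first succeeded"'
run_like_cron 'false && echo "never printed"' || echo "after &&: skipped because the first failed"
```

Use && when the second step depends on the first, such as compressing a file only after it has been exported successfully.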

Any output of the script will be sent to the mail account of the crontab owner on the server, unless specified otherwise (it can be written to file or sent into a void).
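Writing output to a file or sending it into a void is done with ordinary shell redirection at the end of the command. The sketch below shows both forms using a stand-in job; noisy_job and the temporary log file are assumptions made for the demonstration.

```shell
# noisy_job: hypothetical task that writes to both output streams.
noisy_job() {
    echo "normal output"
    echo "error output" >&2
}

LOG_FILE=$(mktemp)                 # stand-in for a real log file path

noisy_job >> "$LOG_FILE" 2>&1      # append normal and error output to a file
noisy_job > /dev/null 2>&1         # discard everything, so no mail is sent

cat "$LOG_FILE"

# The same redirections go at the end of a crontab line, for example:
#   0 3 * * * /path/job.script >> /path/job.log 2>&1
```

Discarding output also discards error messages, so a log file is usually the safer choice while a task is still new.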


Optional Stuff

There are a number of optional lines that can be included ...

MAILTO = me@myserver.com
# this is a comment
@reboot /path/reboot.script

The first line will mail the output of any of the crontab lines following it to the address specified. You can have multiple MAILTO lines to send the output to different addresses, if you so desire: each one applies to the entries below it, until the next MAILTO line.

The second line is a comment - the # tells Cron to ignore it. Comments in crontab are generally a waste of space unless you have a lot of tasks in a crontab and/or a bad memory.

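The third line shows a method of running a command when the server is rebooted. The @reboot substitutes for the five time fields. Strictly speaking, the command fires when the Cron daemon starts up, which normally happens at boot, and not every version of Cron supports @reboot, so check your server's documentation.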


In closing ...

One last thought in closing: use crontab judiciously. It can be (ab)used to do a lot of frivolous things (displaying a holiday message on your home page, sending birthday greetings via e-mail, sending a text to your cell phone to remind you to pick up a dozen eggs on the way home). I recommend using other methods to take care of these tasks, and only using Cron to do boring maintenance tasks that you are certain you'd want and need to do every day/month/week for as long as the server is operational.