This module provides an easy to use interface for cron-like task scheduling. The latest version and its unit tests can be found at github.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 | #!/usr/bin/env python
"""
This module provides a class for cron-like scheduling systems, and
exposes the function used to convert static cron expressions to Python
sets.
CronExpression objects are instantiated with a cron formatted string
that represents the times when the trigger is active. When using
expressions that contain periodic terms, an extension of cron created
for this module, a starting epoch should be explicitly defined. When the
epoch is not explicitly defined, it defaults to the Unix epoch. Periodic
terms provide a method of recurring triggers based on arbitrary time
periods.
Standard Cron Triggers:
>>> job = CronExpression("0 0 * * 1-5/2 find /var/log -delete")
>>> job.check_trigger((2010, 11, 17, 0, 0))
True
>>> job.check_trigger((2012, 12, 21, 0 , 0))
False
Periodic Trigger:
>>> job = CronExpression("0 %9 * * * Feed 'it'", (2010, 5, 1, 7, 0, -6))
>>> job.comment
"Feed 'it'"
>>> job.check_trigger((2010, 5, 1, 7, 0), utc_offset=-6)
True
>>> job.check_trigger((2010, 5, 1, 16, 0), utc_offset=-6)
True
>>> job.check_trigger((2010, 5, 2, 1, 0), utc_offset=-6)
True
"""
import datetime
import calendar
__all__ = ["CronExpression", "parse_atom", "DEFAULT_EPOCH", "SUBSTITUTIONS"]
__license__ = "Public Domain"
DAY_NAMES = zip(('sun', 'mon', 'tue', 'wed', 'thu', 'fri', 'sat'), xrange(7))
MINUTES = (0, 59)
HOURS = (0, 23)
DAYS_OF_MONTH = (1, 31)
MONTHS = (1, 12)
DAYS_OF_WEEK = (0, 6)
L_FIELDS = (DAYS_OF_WEEK, DAYS_OF_MONTH)
FIELD_RANGES = (MINUTES, HOURS, DAYS_OF_MONTH, MONTHS, DAYS_OF_WEEK)
MONTH_NAMES = zip(('jan', 'feb', 'mar', 'apr', 'may', 'jun',
'jul', 'aug', 'sep', 'oct', 'nov', 'dec'), xrange(1, 13))
DEFAULT_EPOCH = (1970, 1, 1, 0, 0, 0)
SUBSTITUTIONS = {
"@yearly": "0 0 1 1 *",
"@anually": "0 0 1 1 *",
"@monthly": "0 0 1 * *",
"@weekly": "0 0 * * 0",
"@daily": "0 0 * * *",
"@midnight": "0 0 * * *",
"@hourly": "0 * * * *"
}
class CronExpression(object):
def __init__(self, line, epoch=DEFAULT_EPOCH, epoch_utc_offset=0):
"""
Instantiates a CronExpression object with an optionally defined epoch.
If the epoch is defined, the UTC offset can be specified one of two
ways: as the sixth element in 'epoch' or supplied in epoch_utc_offset.
The epoch should be defined down to the minute sorted by
descending significance.
"""
for key, value in SUBSTITUTIONS.items():
if line.startswith(key):
line = line.replace(key, value)
break
fields = line.split(None, 5)
if len(fields) == 5:
fields.append('')
minutes, hours, dom, months, dow, self.comment = fields
dow = dow.replace('7', '0').replace('?', '*')
dom = dom.replace('?', '*')
for monthstr, monthnum in MONTH_NAMES:
months = months.lower().replace(monthstr, str(monthnum))
for dowstr, downum in DAY_NAMES:
dow = dow.lower().replace(dowstr, str(downum))
self.string_tab = [minutes, hours, dom.upper(), months, dow.upper()]
self.compute_numtab()
if len(epoch) == 5:
y, mo, d, h, m = epoch
self.epoch = (y, mo, d, h, m, epoch_utc_offset)
else:
self.epoch = epoch
def __str__(self):
base = self.__class__.__name__ + "(%s)"
cron_line = self.string_tab + [str(self.comment])
if not self.comment:
cron_line.pop()
arguments = '"' + ' '.join(cron_line) + '"'
if self.epoch != DEFAULT_EPOCH:
return base % (arguments + ", epoch=" + repr(self.epoch))
else:
return base % arguments
def __repr__(self):
return str(self)
def compute_numtab(self):
"""
Recomputes the sets for the static ranges of the trigger time.
This method should only be called by the user if the string_tab
member is modified.
"""
self.numerical_tab = []
for field_str, span in zip(self.string_tab, FIELD_RANGES):
split_field_str = field_str.split(',')
if len(split_field_str) > 1 and "*" in split_field_str:
raise ValueError("\"*\" must be alone in a field.")
unified = set()
for cron_atom in split_field_str:
# parse_atom only handles static cases
for special_char in ('%', '#', 'L', 'W'):
if special_char in cron_atom:
break
else:
unified.update(parse_atom(cron_atom, span))
self.numerical_tab.append(unified)
if self.string_tab[2] == "*" and self.string_tab[4] != "*":
self.numerical_tab[2] = set()
def check_trigger(self, date_tuple, utc_offset=0):
"""
Returns boolean indicating if the trigger is active at the given time.
The date tuple should be in the local time. Unless periodicities are
used, utc_offset does not need to be specified. If periodicities are
used, specifically in the hour and minutes fields, it is crucial that
the utc_offset is specified.
"""
year, month, day, hour, mins = date_tuple
given_date = datetime.date(year, month, day)
zeroday = datetime.date(*self.epoch[:3])
last_dom = calendar.monthrange(year, month)[-1]
dom_matched = True
# In calendar and datetime.date.weekday, Monday = 0
given_dow = (datetime.date.weekday(given_date) + 1) % 7
first_dow = (given_dow + 1 - day) % 7
# Figure out how much time has passed from the epoch to the given date
utc_diff = utc_offset - self.epoch[5]
mod_delta_yrs = year - self.epoch[0]
mod_delta_mon = month - self.epoch[1] + mod_delta_yrs * 12
mod_delta_day = (given_date - zeroday).days
mod_delta_hrs = hour - self.epoch[3] + mod_delta_day * 24 + utc_diff
mod_delta_min = mins - self.epoch[4] + mod_delta_hrs * 60
# Makes iterating through like components easier.
quintuple = zip(
(mins, hour, day, month, given_dow),
self.numerical_tab,
self.string_tab,
(mod_delta_min, mod_delta_hrs, mod_delta_day, mod_delta_mon,
mod_delta_day),
FIELD_RANGES)
for value, valid_values, field_str, delta_t, field_type in quintuple:
# All valid, static values for the fields are stored in sets
if value in valid_values:
continue
# The following for loop implements the logic for context
# sensitive and epoch sensitive constraints. break statements,
# which are executed when a match is found, lead to a continue
# in the outer loop. If there are no matches found, the given date
# does not match expression constraints, so the function returns
# False as seen at the end of this for...else... construct.
for cron_atom in field_str.split(','):
if cron_atom[0] == '%':
if not(delta_t % int(cron_atom[1:])):
break
elif field_type == DAYS_OF_WEEK and '#' in cron_atom:
D, N = int(cron_atom[0]), int(cron_atom[2])
# Computes Nth occurence of D day of the week
if (((D - first_dow) % 7) + 1 + 7 * (N - 1)) == day:
break
elif field_type == DAYS_OF_MONTH and cron_atom[-1] == 'W':
target = min(int(cron_atom[:-1]), last_dom)
lands_on = (first_dow + target - 1) % 7
if lands_on == 0:
# Shift from Sun. to Mon. unless Mon. is next month
target += 1 if target < last_dom else -2
elif lands_on == 6:
# Shift from Sat. to Fri. unless Fri. in prior month
target += -1 if target > 1 else 2
# Break if the day is correct, and target is a weekday
if target == day and (first_dow + target - 7) % 7 > 1:
break
elif field_type in L_FIELDS and cron_atom.endswith('L'):
# In dom field, L means the last day of the month
target = last_dom
if field_type == DAYS_OF_WEEK:
# Calculates the last occurence of given day of week
desired_dow = int(cron_atom[:-1])
target = (((desired_dow - first_dow) % 7) + 29)
target -= 7 if target > last_dom else 0
if target == day:
break
else:
# See 2010.11.15 of CHANGELOG
if field_type == DAYS_OF_MONTH and self.string_tab[4] != '*':
dom_matched = False
continue
elif field_type == DAYS_OF_WEEK and self.string_tab[2] != '*':
# If we got here, then days of months validated so it does
# not matter that days of the week failed.
return dom_matched
# None of the expressions matched which means this field fails
return False
# Arriving at this point means the date landed within the constraints
# of all fields; the associated trigger should be fired.
return True
def parse_atom(parse, minmax):
"""
Returns a set containing valid values for a given cron-style range of
numbers. The 'minmax' arguments is a two element iterable containing the
inclusive upper and lower limits of the expression.
Examples:
>>> parse_atom("1-5",(0,6))
set([1, 2, 3, 4, 5])
>>> parse_atom("*/6",(0,23))
set([0, 6, 12, 18])
>>> parse_atom("18-6/4",(0,23))
set([18, 22, 0, 4])
>>> parse_atom("*/9",(0,23))
set([0, 9, 18])
"""
parse = parse.strip()
increment = 1
if parse == '*':
return set(xrange(minmax[0], minmax[1] + 1))
elif parse.isdigit():
# A single number still needs to be returned as a set
value = int(parse)
if value >= minmax[0] and value <= minmax[1]:
return set((value,))
else:
raise ValueError("Invalid bounds: \"%s\"" % parse)
elif '-' in parse or '/' in parse:
divide = parse.split('/')
subrange = divide[0]
if len(divide) == 2:
# Example: 1-3/5 or */7 increment should be 5 and 7 respectively
increment = int(divide[1])
if '-' in subrange:
# Example: a-b
prefix, suffix = [int(n) for n in subrange.split('-')]
if prefix < minmax[0] or suffix > minmax[1]:
raise ValueError("Invalid bounds: \"%s\"" % parse)
elif subrange == '*':
# Include all values with the given range
prefix, suffix = minmax
else:
raise ValueError("Unrecognized symbol: \"%s\"" % subrange)
if prefix < suffix:
# Example: 7-10
return set(xrange(prefix, suffix + 1, increment))
else:
# Example: 12-4/2; (12, 12 + n, ..., 12 + m*n) U (n_0, ..., 4)
noskips = list(xrange(prefix, minmax[1] + 1))
noskips+= list(xrange(minmax[0], suffix + 1))
return set(noskips[::increment])
|
Standard Cron Fields:
>>> job = CronExpression("0 0 * * 1-5/2 find /var/log -delete")
>>> job.check_trigger((2010, 11, 17, 0, 0))
True
>>> job.check_trigger((2012, 12, 21, 0 , 0))
False
Periodic Trigger:
>>> job = CronExpression("0 %9 * * * Feed 'it'", (2010, 5, 1, 7, 0, -6))
>>> job.comment
"Feed 'it'"
>>> job.check_trigger((2010, 5, 1, 7, 0), utc_offset=-6)
True
>>> job.check_trigger((2010, 5, 1, 16, 0), utc_offset=-6)
True
>>> job.check_trigger((2010, 5, 2, 1, 0), utc_offset=-6)
True
Simple user-space cron in less than ten lines:
import time
import os
import cronex
while True:
for line in open("crontab"):
job = cronex.CronExpression(line.strip())
if job.check_trigger(time.gmtime(time.time())[:5]):
os.system(job.comment)
time.sleep(60)
SYNTAX
If you are familiar with cron, skip down to the section titled "PERIODIC," and consider reading the last half of "RANGES." Everything else is standard "man 5 crontab" information that probably isn't worded as well as in the man page.
FIELDS
Cron fields must be defined in the following order: minute, hour, day of the month, month, and day of the week. All fields permit arbitrary listing of values, ranges, wild-cards, steps and periodicity. In the field for days of the week, 0 represents Sunday and 6 Saturday. The months and days of the week fields permit using three letter abbreviations such as "dec" for December and "tue" for Tuesday in place of numbers. All abbreviations are case insensitive.
Lone integers can be specified for each field, or multiple values can be used by separating them with commas. In the field representing days of the month, "5,10,25" would represent the 5th, 10th and 25th day of each month.
RANGES
Ranges are defined with a hyphen separating the initial and terminal values. In the hour field, "6-14" would mean all from 6:00am to 2:00pm. All ranges are inclusive of both values. If the first value is greater than the second value, this will be interpreted to mean the range should wrap-around. If "10-2" were specified in the months field, it would be interpreted to mean October, November, December, January and February.
WILD-CARDS
Wild-cards, indicated with a "*", in a field represents all valid values. It is the same as 0-59 for minutes, 0-23 for hours, 1-31 for days, 1-12 for months and 0-6 for weekdays.
STEPS
Steps are specified with a "/" and number following a range or wild-card. When iterating through a range with a step, the specified number of values will be skipped each time. "1-10/2" is the functional equivalent to "1,3,5,7,9".
PERIODIC
In standard cron format, an approximation to trigger an event every 10 days might look like this: "0 0 */10 * *". This works fine for January: The trigger is active on January 1st, 11th, and 31st; but in February, the trigger will be active on February 1st, far ahead of schedule. One "solution" would be to instead use "0 0 10,20,30 * *", but this still does not produce an event consistently every 10 days. This is the problem that periodicity addresses. Periodicity is represented as a "%" followed by the length of each period. The length of the period can be outside the bounds normal ranges of each field. "0 0 %45 * *" would be active every 45 days. All periodicities are calculated starting from the epoch and are independent of each other.
SPECIAL SYMBOLS
There are three additional special symbols: "L", "W" and "#".
When used in the day of the month field, a number followed by "L" represents the occurrence of a day of the week represented by the value preceding "L". In the day of the month field, "L" without a prefixed integer represents the last day of the month. "0 0 * * 5L" represent a midnight trigger for the last Friday of each month whereas "0 0 L 2 *" represents a midnight trigger for the last day of every February.
"W" is only valid for the field representing days of the month, and must be prefixed with an integer. It specifies the weekday (Monday-Friday) nearest the given day. In the construct "0 0 7W * *", when the 7th falls on a Saturday, the trigger will be active on the 6th. If the 7th falls on a Sunday, the trigger will be active on the 8th.
"#" is only valid for the field representing days of the week. The "#" has a prefix and suffix that represent the day of the week and the Nth occurrence of that day of the week. "0 0 * * 0#5" would trigger every 5th Sunday.
All of the constructs above can be combined in individual fields using commas: "0,30 */7,5 1,%90,L 9-4/6,5-8 4#2" is a completely valid, albeit it hideous, expression.
SPECIAL STRINGS
There are several special strings that can substitute common cron expressions. These strings _replace_, not augment the cron fields.
String Equivalent
------ ----------
@yearly 0 0 1 1 *
@anually 0 0 1 1 *
@monthly 0 0 1 * *
@weekly 0 0 * * 0
@daily 0 0 * * *
@midnight 0 0 * * *
@hourly 0 * * * *
Is it correct behavior for days of month validation to override days of week validation? I was under the impression that both needed to pass in crons, for example 1 0 22,23,24,25,26,27,28 11 4 for Thanksgiving.
Yep. I did not know about this in my first implementation, but I fixed it later on.
Also, an easier way to write that would be "1 0 * 11 4#4". I posted the full syntax documentation as well.
Awesome, thanks for the info.