>>> import datetime
>>> str(datetime.datetime.now())
'2010-03-21 21:33:32.750246'
>>> str(datetime.date.today())
'2010-03-21'

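This function goes the other way: it parses date and datetime strings of this format back into a datetime.datetime instance, and also reports the scope (precision) of the string it was given.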

Python, 59 lines
def _datetime_from_str(time_str):
    """Return (<scope>, <datetime.datetime() instance>) for the given
    datetime string.
    
    >>> _datetime_from_str("2009")
    ('year', datetime.datetime(2009, 1, 1, 0, 0))
    >>> _datetime_from_str("2009-12")
    ('month', datetime.datetime(2009, 12, 1, 0, 0))
    >>> _datetime_from_str("2009-12-25")
    ('day', datetime.datetime(2009, 12, 25, 0, 0))
    >>> _datetime_from_str("2009-12-25 13")
    ('hour', datetime.datetime(2009, 12, 25, 13, 0))
    >>> _datetime_from_str("2009-12-25 13:05")
    ('minute', datetime.datetime(2009, 12, 25, 13, 5))
    >>> _datetime_from_str("2009-12-25 13:05:14")
    ('second', datetime.datetime(2009, 12, 25, 13, 5, 14))
    >>> _datetime_from_str("2009-12-25 13:05:14.453728")
    ('microsecond', datetime.datetime(2009, 12, 25, 13, 5, 14, 453728))
    """
    import time
    import datetime
    formats = [
        # <scope>, <pattern>, <format>
        ("year", "YYYY", "%Y"),
        ("month", "YYYY-MM", "%Y-%m"),
        ("day", "YYYY-MM-DD", "%Y-%m-%d"),
        ("hour", "YYYY-MM-DD HH", "%Y-%m-%d %H"),
        ("minute", "YYYY-MM-DD HH:MM", "%Y-%m-%d %H:%M"),
        ("second", "YYYY-MM-DD HH:MM:SS", "%Y-%m-%d %H:%M:%S"),
        # ".<microsecond>" at end is manually handled below
        ("microsecond", "YYYY-MM-DD HH:MM:SS", "%Y-%m-%d %H:%M:%S"),
    ]
    for scope, pattern, fmt in formats:
        candidate = time_str  # keep time_str intact for the error message
        if scope == "microsecond":
            # datetime's "%f" code needs Python 2.6+, so handle it manually.
            if time_str.count('.') != 1:
                continue
            candidate, microseconds_str = time_str.split('.')
            try:
                microsecond = int((microseconds_str + '000000')[:6])
            except ValueError:
                continue
        try:
            # This comment here is the modern way. The subsequent two
            # lines are for Python 2.4 support.
            #t = datetime.datetime.strptime(candidate, fmt)
            t_tuple = time.strptime(candidate, fmt)
            t = datetime.datetime(*t_tuple[:6])
        except ValueError:
            pass
        else:
            if scope == "microsecond":
                t = t.replace(microsecond=microsecond)
            return scope, t
    else:
        raise ValueError("could not determine date from %r: does not "
            "match any of the accepted patterns ('%s')"
            % (time_str, "', '".join(s for s,p,f in formats)))
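
For anyone who does not need Python 2.4 support, the "modern way" mentioned in the code comment can be taken a step further. The following is only a sketch, not the recipe author's code: it assumes Python 2.6 or later, where datetime.datetime.strptime understands "%f", and the function name datetime_from_str_f is made up for illustration.

```python
import datetime

def datetime_from_str_f(time_str):
    """Sketch: return (<scope>, <datetime>) using strptime's "%f" directive."""
    formats = [
        # <scope>, <format>
        ("year", "%Y"),
        ("month", "%Y-%m"),
        ("day", "%Y-%m-%d"),
        ("hour", "%Y-%m-%d %H"),
        ("minute", "%Y-%m-%d %H:%M"),
        ("second", "%Y-%m-%d %H:%M:%S"),
        # "%f" accepts one to six digits and zero-pads on the right
        ("microsecond", "%Y-%m-%d %H:%M:%S.%f"),
    ]
    for scope, fmt in formats:
        try:
            return scope, datetime.datetime.strptime(time_str, fmt)
        except ValueError:
            # strptime rejects both mismatches and leftover characters,
            # so just move on to the next, longer format.
            continue
    raise ValueError("could not determine date from %r" % (time_str,))
```

With this variant the manual split on '.' is no longer needed, at the cost of dropping support for older interpreters.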

5 comments

Bard Aase 11 years, 8 months ago
Bard Aase 11 years, 8 months ago

Aha. I see..

Gabriel Genellina 11 years, 8 months ago

I'd use something like this:

microsecond = int((microseconds_str + '000000')[:6])

so _datetime_from_str("2009-12-25 13:05:14.453") is parsed correctly.
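
A quick sketch of why the padding matters: the fraction ".453" means 453 thousandths of a second, i.e. 453000 microseconds, whereas a plain int() of the digits would yield 453. The helper name microseconds_from_fraction is hypothetical and only wraps the expression above.

```python
def microseconds_from_fraction(fraction_str):
    # Right-pad with zeros to six digits, then cut off anything beyond
    # microsecond precision before converting to an integer.
    return int((fraction_str + "000000")[:6])

print(microseconds_from_fraction("453"))      # short fraction, padded
print(microseconds_from_fraction("453728"))   # already six digits
print(microseconds_from_fraction("4537289"))  # extra digits are dropped
```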

(Also, why the _ in the name? _datetime_from_str looks like an internal function, not to be used...)

Trent Mick 11 years, 8 months ago

@Gabriel: thanks for microsecond correction: added.

Regarding the leading "_": I tend to re-use a lot of my recipes in various scripts and modules of mine. My typical usage of those recipe functions is for internal-only support use.

Scott S-Allen 9 years, 9 months ago

For a slightly different approach to date-time parsing, I've posted the "Cheap Date" recipe 578064. Two years late and many dollars short, I know, but to some it may be of use. It handles the microseconds without exception, but is not meant to diminish the value here.

On a different tangent from the other recipe, the main loop here caught my eye. This script matches progressively longer, higher-resolution dates, so the pattern and format strings could be built up piece by piece.

Joining on an empty string works well with replacement patterns and at times eliminates the need to escape characters. Because the string builds left to right, each delimiter character is attached to the left-hand side of the next value.

PAT_PARTS = ["YYYY","-MM","-DD"," HH",":MM",":SS"]
FMT_PARTS = ["%Y","-%m","-%d"," %H",":%M",":%S"]

rep_pat = ''.join(FMT_PARTS[:4])

One of several ways would be to enumerate the lists above to generate an index, or to iterate over pairs like this:

PARTS_PAIRS = [["YYYY","%Y"],["-MM","-%m"],["-DD","-%d"],
            [" HH"," %H"],[":MM",":%M"],[":SS",":%S"],]

cur_pattern = ''
cur_format = ''
for n_pat, n_fmt in PARTS_PAIRS:
    # str.join takes a single iterable; each string extends itself
    cur_pattern = ''.join([cur_pattern, n_pat])
    cur_format = ''.join([cur_format, n_fmt])
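
As a rough sketch of where that loop could lead, the cumulative strings can be collected into the same (<scope>, <pattern>, <format>) table the recipe spells out by hand. The SCOPES list and the build_formats name are assumptions added for illustration; the microsecond entry is left out because the recipe handles it separately.

```python
SCOPES = ["year", "month", "day", "hour", "minute", "second"]
PARTS_PAIRS = [("YYYY", "%Y"), ("-MM", "-%m"), ("-DD", "-%d"),
               (" HH", " %H"), (":MM", ":%M"), (":SS", ":%S")]

def build_formats():
    """Build the cumulative (scope, pattern, format) table left to right."""
    formats = []
    cur_pattern = ""
    cur_format = ""
    for scope, (n_pat, n_fmt) in zip(SCOPES, PARTS_PAIRS):
        # Each part carries its own leading delimiter, so plain
        # concatenation is enough.
        cur_pattern += n_pat
        cur_format += n_fmt
        formats.append((scope, cur_pattern, cur_format))
    return formats

for row in build_formats():
    print(row)
```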

Of course, it's also possible to start with the full string and shorten the pattern until a match succeeds.

for i in range(len(PAT_PARTS), 0, -1):
    cur_pattern = ''.join(PAT_PARTS[:i])
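
Filled out into something runnable, the longest-first idea might look like the sketch below. It is an illustration under assumptions rather than a drop-in replacement: the SCOPES list and the parse_longest_first name are invented here, datetime.datetime.strptime requires Python 2.5 or later, and microseconds are not handled.

```python
import datetime

FMT_PARTS = ["%Y", "-%m", "-%d", " %H", ":%M", ":%S"]
SCOPES = ["year", "month", "day", "hour", "minute", "second"]

def parse_longest_first(time_str):
    """Try the full format first, dropping one trailing part per attempt."""
    for i in range(len(FMT_PARTS), 0, -1):
        fmt = "".join(FMT_PARTS[:i])
        try:
            parsed = datetime.datetime.strptime(time_str, fmt)
        except ValueError:
            continue  # too long (or otherwise mismatched): shorten and retry
        return SCOPES[i - 1], parsed
    raise ValueError("could not determine date from %r" % (time_str,))

print(parse_longest_first("2009-12-25 13:05"))
```

Either direction gives the same answer for well-formed input, since strptime only accepts a format that consumes the whole string.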

Cheers,