Converting a date from a human-readable string to a more standard format

I have dates in the form Fri 27th Aug, which is a nightmare to handle programmatically, as I am sure you can imagine.

I am wondering what the best way is to convert them to the US date format, e.g. 08/27/13. I need to infer the year from the month, i.e. Aug-Dec implies 13, and Jan-Jul implies 14.

I was thinking about how to do this in regex, or even just do a series of line replacements.

But the complication is that I have a list of strings, not all of which are dates of this form, and some of the others contain digits too. How can I check whether a string is a date of this form, and replace it if it is?

eg.

list = ['not a date', 'als0 not a dat3', 'Wed 5th Jan', ... , 'no date here']

The need to test each string makes a regular expression seem suitable, but I have read a lot on SO against using re in Python, although I don't know why. Should I (learn enough to use it, and) use it?

With @Allan's answer, I was able to solve my problem with

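from time import strptime

def is_date(string):
    # Strip the ordinal suffixes, which strptime cannot parse
    tmp = string.replace('th','')
    string = tmp.replace('rd','')
    tmp = string.replace('nd','')
    string = tmp.replace('st','')
    try:
        # d is a struct_time: d[1] is the month, d[2] the day of the month
        d = strptime(string, "%a %d %b")
        date = str(d[1]) + "/" + str(d[2]) + "/"
        # Aug-Dec belong to 2013, Jan-Jul to 2014
        if d[1] >= 8:
            date += "13"
        else:
            date += "14"
        return date
    except ValueError:
        # Not a date of the expected form
        return 0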

Thanks for your answers, @Allan, @adsmith and @codnodder.

+3
3 answers

Take a look at time.strptime. It raises ValueError when the string does not match the format, so you might want to catch this exception and ignore strings that are not dates.
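A minimal sketch of that idea, assuming the ordinal suffix is stripped before parsing (strptime has no directive for "st", "nd", "rd" or "th"). The helper name parse_or_none is made up for illustration:

```python
import re
import time

def parse_or_none(text):
    """Return a struct_time for strings like 'Wed 5th Jan', else None."""
    # Drop the ordinal suffix only when it directly follows a digit,
    # so ordinary words containing "st" or "th" are left alone.
    cleaned = re.sub(r'(\d)(st|nd|rd|th)\b', r'\1', text)
    try:
        return time.strptime(cleaned, "%a %d %b")
    except ValueError:
        # The string does not match the format: treat it as "not a date".
        return None

strings = ['not a date', 'als0 not a dat3', 'Wed 5th Jan', 'no date here']
parsed = [parse_or_none(s) for s in strings]
```

Only the third entry of parsed is a struct_time; the others are None. Since the format has no year, the parsed year is just strptime's default and still has to be set separately.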

@OllieFord:

import datetime

def is_date(string):
    for suffix in ("th", "rd", "nd", "st"):
        string = string.replace(suffix, "")

    try:
        d = datetime.datetime.strptime(string, "%a %d %b")
        y = 2014
        if d.month >= 8:
            y = 2013            
        d = d.replace(year = y)
        return d.strftime("%x")
    except ValueError:
        return None

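This uses datetime rather than time. %x formats the date using the locale's date representation, so the exact output depends on the locale settings.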
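Since %x follows the locale, one way to pin the output to MM/DD/YY regardless of locale is to spell the format out. This is only a sketch of a variant of the function above; the name to_us_date is hypothetical, and the 2013/2014 cut-off comes from the question:

```python
import datetime
import re

def to_us_date(text):
    """Return 'MM/DD/YY' for strings like 'Fri 27th Aug', else None."""
    # Remove an ordinal suffix only when it follows the day digits.
    cleaned = re.sub(r'(\d)(st|nd|rd|th)\b', r'\1', text)
    try:
        d = datetime.datetime.strptime(cleaned, "%a %d %b")
    except ValueError:
        return None
    # Aug-Dec belong to 2013, Jan-Jul to 2014 (rule from the question).
    year = 2013 if d.month >= 8 else 2014
    # Explicit directives instead of %x, so the result is locale-independent.
    return d.replace(year=year).strftime("%m/%d/%y")
```

Note that the weekday name is only matched against the format here; it is not checked against the resulting calendar date.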

There is also dateutil.parser, mentioned by @Marian.

+2

A regular expression doesn't seem like the worst idea for this particular task. Below is a long-winded example; I am sure there are more efficient approaches.

import re

# Convert dates like "Fri 27th Aug" with year fudge
mons = {
    'Aug' : ( 8, 13),
    'Sep' : ( 9, 13),
    'Oct' : (10, 13),
    'Nov' : (11, 13),
    'Dec' : (12, 13),
    'Jan' : ( 1, 14),
    'Feb' : ( 2, 14),
    'Mar' : ( 3, 14),
    'Apr' : ( 4, 14),
    'May' : ( 5, 14),
    'Jun' : ( 6, 14),
    'Jul' : ( 7, 14),
}
days = ('Mon', 'Tue', 'Wed', 'Thu', 'Fri', 'Sat', 'Sun')

# pattern is purposefully strict to avoid false matches against
# other arbitrary strings
pat = re.compile(r'^(%s) (\d+)(st|nd|rd|th) (%s)$' %
                 ('|'.join(days), '|'.join(mons.keys())))
strlist = ['not a date', 'als0 not a dat3', 'Wed 5th Jan', 'no date here']
newlist = []
for tok in strlist:
    m = pat.match(tok)
    if m:
        day = int(m.group(2))
        mon = m.group(4)
        newlist.append('%02d/%02d/%02d' % (mons[mon][0], day, mons[mon][1]))
    else:
        newlist.append(tok)

for tok in newlist:
    print(tok)

EDIT: Changed date format to match OP correction.
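The pattern accepts any digits for the day, so a string such as 'Fri 31st Feb' would still be converted. One possible refinement, which is not part of the answer above, is to let datetime.date reject impossible days. The names MONTHS, PATTERN and convert are made up for this sketch:

```python
import datetime
import re

MONTHS = {'Jan': 1, 'Feb': 2, 'Mar': 3, 'Apr': 4, 'May': 5, 'Jun': 6,
          'Jul': 7, 'Aug': 8, 'Sep': 9, 'Oct': 10, 'Nov': 11, 'Dec': 12}

PATTERN = re.compile(
    r'^(?:Mon|Tue|Wed|Thu|Fri|Sat|Sun) (\d{1,2})(?:st|nd|rd|th) (%s)$'
    % '|'.join(MONTHS))

def convert(tok):
    """Return 'MM/DD/YY' for a valid date string, else tok unchanged."""
    m = PATTERN.match(tok)
    if not m:
        return tok
    day = int(m.group(1))
    month = MONTHS[m.group(2)]
    # Aug-Dec belong to 2013, Jan-Jul to 2014 (rule from the question).
    year = 2013 if month >= 8 else 2014
    try:
        # Raises ValueError for impossible days such as 31 Feb.
        datetime.date(year, month, day)
    except ValueError:
        return tok
    return '%02d/%02d/%02d' % (month, day, year % 100)

strlist = ['not a date', 'als0 not a dat3', 'Wed 5th Jan', 'no date here']
newlist = [convert(tok) for tok in strlist]
```

Strings that fail either the pattern or the calendar check are passed through unchanged, which matches how the loop above treats non-dates.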

+1