English-style dates in logs are awkward to read when you are used to purely numeric dates. The following is an excerpt from a log:

Python can replace the English dates in a log with numeric ones. Only two functions are needed for this kind of date format replacement.
>>> from datetime import datetime
>>> help(datetime.strptime)
strptime(...) method of builtins.type instance
    string, format -> new datetime parsed from a string (like time.strptime())
>>> help(datetime.strftime)
strftime(...)
    format -> strftime() style string
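These are strptime and strftime: the first parses a string into a datetime object, and the second formats a datetime object back into a string.
Both conversions are driven by a format string that describes the layout of the date. Commonly used format codes are: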
Commonly used format codes:
%Y  Year with century as a decimal number.
%m  Month as a decimal number [01,12].
%d  Day of the month as a decimal number [01,31].
%H  Hour (24-hour clock) as a decimal number [00,23].
%M  Minute as a decimal number [00,59].
%S  Second as a decimal number [00,61].
%z  Time zone offset from UTC.
%a  Locale's abbreviated weekday name.
%A  Locale's full weekday name.
%b  Locale's abbreviated month name.
%B  Locale's full month name.
%c  Locale's appropriate date and time representation.
%I  Hour (12-hour clock) as a decimal number [01,12].
%p  Locale's equivalent of either AM or PM.
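To make a few of these codes concrete, here is a minimal sketch that formats one fixed datetime with several of them. The variable names are illustrative only, and the results of the name-based codes (%a, %b, %p) depend on the current locale, so no output is shown for them.

```python
from datetime import datetime

# A fixed moment, so the numeric results are reproducible.
moment = datetime(2022, 7, 14, 10, 7, 55)

numeric = moment.strftime('%Y-%m-%d %H:%M:%S')  # year, month, day, 24-hour time
twelve_hour = moment.strftime('%I:%M:%S %p')    # 12-hour clock plus AM/PM marker
names = moment.strftime('%a %b')                # weekday and month names (locale-dependent)

print(numeric)      # 2022-07-14 10:07:55
print(twelve_hour)
print(names)
```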
The demo here uses the following input format:
# 14/Jul/2022:10:07:55
format_from = '%d/%b/%Y:%H:%M:%S'
The numeric format we want to convert to is:
# 2022-07-14 10:07:55
format_to = '%Y-%m-%d %H:%M:%S'
The conversion goes like this:
>>> from datetime import datetime
>>> s = '14/Jul/2022:10:07:55'
>>> format_from = '%d/%b/%Y:%H:%M:%S'
>>> format_to = '%Y-%m-%d %H:%M:%S'
>>> d = datetime.strptime(s, format_from)
>>> d
datetime.datetime(2022, 7, 14, 10, 7, 55)
>>> s1 = datetime.strftime(d, format_to)
>>> s1
'2022-07-14 10:07:55'
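The two steps can be wrapped in a small helper. This is only a sketch: the name convert_date and the choice to return the input unchanged when it does not parse are assumptions made for illustration, and parsing %b assumes an English (or C) locale.

```python
from datetime import datetime

FORMAT_FROM = '%d/%b/%Y:%H:%M:%S'
FORMAT_TO = '%Y-%m-%d %H:%M:%S'


def convert_date(text, format_from=FORMAT_FROM, format_to=FORMAT_TO):
    """Return text rewritten in format_to, or text unchanged if it does not parse."""
    try:
        parsed = datetime.strptime(text, format_from)
    except ValueError:
        # strptime raises ValueError when the string does not fit the format.
        return text
    return parsed.strftime(format_to)


print(convert_date('14/Jul/2022:10:07:55'))  # 2022-07-14 10:07:55
print(convert_date('not a date'))            # not a date
```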
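A regular expression identifies the date in each line; once the date has been converted, it replaces the original text in that line. The result after processing looks like this: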

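Because the input is read from standard input, the script can sit in a pipeline, which suits stream-style file processing. It can even be fed by tail -f to convert the dates in a log in real time.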
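The streaming idea can be sketched as a function that copies lines from any text stream to another, rewriting dates as they arrive. The names convert_stream and DATE_PATTERN, the stricter pattern, and the sample log line are all assumptions for illustration. The explicit flush is one way to keep output prompt when the input comes from something like tail -f.

```python
import io
import re
from datetime import datetime

FORMAT_FROM = '%d/%b/%Y:%H:%M:%S'
FORMAT_TO = '%Y-%m-%d %H:%M:%S'
# Assumed pattern: DD/Mon/YYYY:HH:MM:SS
DATE_PATTERN = re.compile(r'\d{2}/[A-Za-z]{3}/\d{4}:\d{2}:\d{2}:\d{2}')


def _replace(match):
    """Convert one matched date; leave it untouched if it does not parse."""
    try:
        parsed = datetime.strptime(match.group(), FORMAT_FROM)
    except ValueError:
        return match.group()
    return parsed.strftime(FORMAT_TO)


def convert_stream(source, sink):
    """Copy lines from source to sink, rewriting dates line by line."""
    for line in source:
        sink.write(DATE_PATTERN.sub(_replace, line))
        sink.flush()  # push each line out immediately


# Demonstration on an in-memory stream with a made-up log line.
sample = io.StringIO('GET /index.html [14/Jul/2022:10:07:55] 200\n')
out = io.StringIO()
convert_stream(sample, out)
result = out.getvalue()
print(result, end='')
```

Passing sys.stdin and sys.stdout in place of the StringIO objects gives the pipeline behaviour described above.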
The test code is as follows:
#!python3
import sys
import re
from datetime import datetime

'''
Commonly used format codes:
%Y  Year with century as a decimal number.
%m  Month as a decimal number [01,12].
%d  Day of the month as a decimal number [01,31].
%H  Hour (24-hour clock) as a decimal number [00,23].
%M  Minute as a decimal number [00,59].
%S  Second as a decimal number [00,61].
%z  Time zone offset from UTC.
%a  Locale's abbreviated weekday name.
%A  Locale's full weekday name.
%b  Locale's abbreviated month name.
%B  Locale's full month name.
%c  Locale's appropriate date and time representation.
%I  Hour (12-hour clock) as a decimal number [01,12].
%p  Locale's equivalent of either AM or PM.
'''

def date_convert():
    format_from = '%d/%b/%Y:%H:%M:%S'
    format_to = '%Y-%m-%d %H:%M:%S'
    # Loose pattern: any characters in the shape DD/Mon/YYYY:HH:MM:SS
    pattern = '../.*/....:..:..:..'
    f = sys.stdin
    line = f.readline()
    while line:
        match = re.search(pattern,line)
        if match:
            # Cut the matched date out of the line and convert it
            s = match.start()
            e = match.end()
            date_ori = line[s:e]
            d = datetime.strptime(date_ori, format_from)
            date_conv = datetime.strftime(d, format_to)
            # Put the converted date back into the line
            line_conv = re.sub(pattern,date_conv, line)
            print(line_conv, end='')
        else:
            # No date found: print the line unchanged
            print(line, end='')
        line = f.readline()

if __name__ == '__main__':
    date_convert()
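Date formats differ from log to log, so for a specific log you need to adjust the format strings to match its date format, and adjust the regular expression that matches the date as well. This article only shows the approach; the regular expression used here is not rigorous.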
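One way to make those adjustments explicit is to build the converter from a pattern and a pair of format strings. The sketch below is an assumption-laden illustration, not the article's script: make_converter is a hypothetical helper, the stricter pattern is one possible tightening, and the second log format is made up purely to show what changes between logs. Using a callback with re.sub also converts each date in a line on its own.

```python
import re
from datetime import datetime


def make_converter(pattern, format_from, format_to='%Y-%m-%d %H:%M:%S'):
    """Build a line converter for one log format (hypothetical helper)."""
    regex = re.compile(pattern)

    def replace(match):
        try:
            parsed = datetime.strptime(match.group(), format_from)
        except ValueError:
            return match.group()  # matched text is not a valid date; keep it
        return parsed.strftime(format_to)

    return lambda line: regex.sub(replace, line)


# The format used in this article, with a stricter pattern.
access_log = make_converter(
    r'\d{2}/[A-Za-z]{3}/\d{4}:\d{2}:\d{2}:\d{2}', '%d/%b/%Y:%H:%M:%S')

# A second, made-up format, only to show which parts change.
other_log = make_converter(
    r'[A-Za-z]{3} \d{2} \d{4} \d{2}:\d{2}:\d{2}', '%b %d %Y %H:%M:%S')

first = access_log('start 14/Jul/2022:10:07:55 end 15/Jul/2022:00:00:01')
second = other_log('Jul 14 2022 10:07:55 service started')
print(first)   # start 2022-07-14 10:07:55 end 2022-07-15 00:00:01
print(second)  # 2022-07-14 10:07:55 service started
```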
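Note:
The WeChat official account editor automatically converts pasted code, which may leave single quotes misplaced. After pasting, the code appears to contain an extra space; deleting or retyping it only gets it replaced again. I have not yet tested whether the code runs without errors when copied and pasted directly.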

End of article.
If you repost this article, be sure to note at the end: "Reposted from the WeChat official account: 生有可恋".




