
Replacing English-Style Dates in Logs with Python

生有可恋 2023-02-08

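English-format dates in logs can be awkward to read; most of the time a purely numeric date is more convenient. In the log used here, timestamps look like 14/Jul/2022:10:07:55.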

We can use Python to replace the English dates in a log with numeric ones. Only two functions are needed for this conversion.

    >>> from datetime import datetime
    >>> help(datetime.strptime)
    strptime(...) method of builtins.type instance
        string, format -> new datetime parsed from a string (like time.strptime())
    >>> help(datetime.strftime)
    strftime(...)
        format -> strftime() style string

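These are strptime and strftime: the first parses a string into a datetime object, and the second formats a datetime object back into a string.

Both are driven by a format string that describes the layout of the date. Commonly used format codes are: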

    Commonly used format codes:

    %Y  Year with century as a decimal number.
    %m  Month as a decimal number [01,12].
    %d  Day of the month as a decimal number [01,31].
    %H  Hour (24-hour clock) as a decimal number [00,23].
    %M  Minute as a decimal number [00,59].
    %S  Second as a decimal number [00,61].
    %z  Time zone offset from UTC.
    %a  Locale's abbreviated weekday name.
    %A  Locale's full weekday name.
    %b  Locale's abbreviated month name.
    %B  Locale's full month name.
    %c  Locale's appropriate date and time representation.
    %I  Hour (12-hour clock) as a decimal number [01,12].
    %p  Locale's equivalent of either AM or PM.
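As a quick illustration of how these codes combine, here is a minimal sketch that formats one fixed timestamp in several ways. The variable names are made up for the example, and the English weekday and month names assume Python's default "C" locale for time formatting; under another locale `%a`, `%A`, `%b`, `%B` and `%p` may render differently.

```python
from datetime import datetime

# A fixed timestamp so the results are reproducible (hypothetical sample value).
d = datetime(2022, 7, 14, 22, 7, 55)

numeric = d.strftime('%Y-%m-%d %H:%M:%S')      # all-numeric, 24-hour clock
log_style = d.strftime('%d/%b/%Y:%H:%M:%S')    # abbreviated month name
long_form = d.strftime('%A, %d %B %Y')         # full weekday and month names
twelve_hour = d.strftime('%I:%M:%S %p')        # 12-hour clock with AM/PM

for text in (numeric, log_style, long_form, twelve_hour):
    print(text)
```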

The demo here uses the following input format:

    # 14/Jul/2022:10:07:55
    format_from = '%d/%b/%Y:%H:%M:%S'

The numeric format we want to convert to is:

    # 2022-07-14 10:07:55
    format_to = '%Y-%m-%d %H:%M:%S'

The conversion goes like this:

    >>> from datetime import datetime
    >>> s = '14/Jul/2022:10:07:55'
    >>> format_from = '%d/%b/%Y:%H:%M:%S'
    >>> format_to = '%Y-%m-%d %H:%M:%S'
    >>> d = datetime.strptime(s, format_from)
    >>> d
    datetime.datetime(2022, 7, 14, 10, 7, 55)
    >>> s1 = datetime.strftime(d, format_to)
    >>> s1
    '2022-07-14 10:07:55'
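The same two calls can be wrapped in a small helper. This is only a sketch: the name `convert_date` and the choice to return `None` for text that does not parse are assumptions made for the example, not part of the original script. `strptime` raises `ValueError` when the text does not match the format or names an impossible date, so the helper catches that.

```python
from datetime import datetime

FORMAT_FROM = '%d/%b/%Y:%H:%M:%S'   # e.g. 14/Jul/2022:10:07:55
FORMAT_TO = '%Y-%m-%d %H:%M:%S'     # e.g. 2022-07-14 10:07:55


def convert_date(text, format_from=FORMAT_FROM, format_to=FORMAT_TO):
    """Re-render text in format_to, or return None if it does not parse."""
    try:
        parsed = datetime.strptime(text, format_from)
    except ValueError:
        # Wrong layout or an impossible date such as 31 February.
        return None
    return parsed.strftime(format_to)


print(convert_date('14/Jul/2022:10:07:55'))
print(convert_date('31/Feb/2022:10:07:55'))
```

Returning `None` keeps the decision about bad input with the caller, which can then leave the original text untouched.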

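We use a regular expression to find the date inside each line, convert it, and then substitute the converted date back into the original string, so that 14/Jul/2022:10:07:55 in a line becomes 2022-07-14 10:07:55.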
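One way to sketch the find, convert, and substitute step is to pass a callback to `re.sub`, so every date found in a line is converted individually. The stricter pattern, the names `DATE_RE`, `_replace` and `convert_line`, and the sample log line are all assumptions for illustration.

```python
import re
from datetime import datetime

FORMAT_FROM = '%d/%b/%Y:%H:%M:%S'
FORMAT_TO = '%Y-%m-%d %H:%M:%S'

# Two digits, a three-letter month, four digits, then hh:mm:ss (assumed layout).
DATE_RE = re.compile(r'\d{2}/[A-Za-z]{3}/\d{4}:\d{2}:\d{2}:\d{2}')


def _replace(match):
    """Convert one matched date; leave it unchanged if it does not parse."""
    try:
        d = datetime.strptime(match.group(0), FORMAT_FROM)
    except ValueError:
        return match.group(0)
    return d.strftime(FORMAT_TO)


def convert_line(line):
    """Return line with every recognizable date rewritten numerically."""
    return DATE_RE.sub(_replace, line)


# Hypothetical access-log style line.
sample = '192.0.2.1 - - [14/Jul/2022:10:07:55 +0800] "GET / HTTP/1.1" 200 512'
print(convert_line(sample))
```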

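Because the script reads from standard input, it can be used in a pipeline, which suits stream-style processing of files; it can even be placed after tail -f to convert the dates in a log in real time.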
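The stream-oriented idea can be sketched as a function that takes any iterable of lines and any writable object, which makes it easy to try with in-memory text before wiring it to a pipe. The names `convert_stream`, `DATE_RE` and `_replace` are hypothetical, and flushing after every line is an assumption meant to keep output prompt when the input comes from something like `tail -f`.

```python
import io
import re
from datetime import datetime

FORMAT_FROM = '%d/%b/%Y:%H:%M:%S'
FORMAT_TO = '%Y-%m-%d %H:%M:%S'
DATE_RE = re.compile(r'\d{2}/[A-Za-z]{3}/\d{4}:\d{2}:\d{2}:\d{2}')


def _replace(match):
    """Convert one matched date; leave it unchanged if it does not parse."""
    try:
        d = datetime.strptime(match.group(0), FORMAT_FROM)
    except ValueError:
        return match.group(0)
    return d.strftime(FORMAT_TO)


def convert_stream(lines, out):
    """Copy lines to out one at a time, rewriting dates along the way."""
    for line in lines:
        out.write(DATE_RE.sub(_replace, line))
        out.flush()  # do not hold lines back when the input is a live stream


# In-memory demonstration; in a real script the call would be
# convert_stream(sys.stdin, sys.stdout) after "import sys".
source = io.StringIO('a [14/Jul/2022:10:07:55 +0800] x\nno date here\n')
sink = io.StringIO()
convert_stream(source, sink)
print(sink.getvalue(), end='')
```

With the function wired to standard input and output, a pipeline along the lines of `tail -f access.log | python3 date_convert.py` would apply; both file names are placeholders.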

The test code is as follows:

    #!python3

    import sys
    import re
    from datetime import datetime


    '''
    Commonly used format codes:

    %Y  Year with century as a decimal number.
    %m  Month as a decimal number [01,12].
    %d  Day of the month as a decimal number [01,31].
    %H  Hour (24-hour clock) as a decimal number [00,23].
    %M  Minute as a decimal number [00,59].
    %S  Second as a decimal number [00,61].
    %z  Time zone offset from UTC.
    %a  Locale's abbreviated weekday name.
    %A  Locale's full weekday name.
    %b  Locale's abbreviated month name.
    %B  Locale's full month name.
    %c  Locale's appropriate date and time representation.
    %I  Hour (12-hour clock) as a decimal number [01,12].
    %p  Locale's equivalent of either AM or PM.
    '''


    def date_convert():
        format_from = '%d/%b/%Y:%H:%M:%S'
        format_to = '%Y-%m-%d %H:%M:%S'
        # Loose pattern: anything shaped like xx/.../xxxx:xx:xx:xx
        pattern = r'../.*/....:..:..:..'

        f = sys.stdin

        # Read line by line so the script also works behind a pipe.
        line = f.readline()
        while line:
            match = re.search(pattern, line)
            if match:
                # Cut out the matched date and convert it.
                s = match.start()
                e = match.end()
                date_ori = line[s:e]
                d = datetime.strptime(date_ori, format_from)
                date_conv = datetime.strftime(d, format_to)
                # Put the converted date back into the line.
                line_conv = re.sub(pattern, date_conv, line)
                print(line_conv, end='')
            else:
                # No date found: pass the line through unchanged.
                print(line, end='')
            line = f.readline()


    if __name__ == '__main__':
        date_convert()

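Date formats differ from one log to another, so when processing a specific log you need to adjust the format string to match its dates, and adjust the regular expression that matches the date as well. This article only shows the general approach, and the regular expression used here is not rigorous.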
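To make that adjustment concrete, here is a sketch in which the regular expression and the format string are passed in together, so each log type gets its own matching pair. The function name `make_converter`, both patterns, and the second (12-hour) date layout are assumptions chosen for illustration, not formats taken from a particular log.

```python
import re
from datetime import datetime


def make_converter(pattern, format_from, format_to='%Y-%m-%d %H:%M:%S'):
    """Build a line converter from a date regex and its strptime format."""
    regex = re.compile(pattern)

    def _replace(match):
        try:
            d = datetime.strptime(match.group(0), format_from)
        except ValueError:
            return match.group(0)  # looks like a date but is not one
        return d.strftime(format_to)

    def convert(line):
        return regex.sub(_replace, line)

    return convert


# The format used in this article: 14/Jul/2022:10:07:55
access_log = make_converter(
    r'\d{2}/[A-Za-z]{3}/\d{4}:\d{2}:\d{2}:\d{2}',
    '%d/%b/%Y:%H:%M:%S',
)

# A hypothetical 12-hour layout: Jul 14, 2022 10:07:55 PM
twelve_hour = make_converter(
    r'[A-Za-z]{3} \d{2}, \d{4} \d{2}:\d{2}:\d{2} [AP]M',
    '%b %d, %Y %I:%M:%S %p',
)

print(access_log('[14/Jul/2022:10:07:55] start'))
print(twelve_hour('Jul 14, 2022 10:07:55 PM job finished'))
```

Keeping the pattern and the format side by side makes it harder for the two to drift apart when a new log type is added.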

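Note:

The WeChat official account editor automatically converts pasted code, which may misplace single quotes. After pasting, the code appears to contain an extra space; deleting or retyping it just gets it replaced again. I have not yet tested whether copying and running the code directly produces an error.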

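End of article.

If you repost this article, be sure to note at the end: "转自微信公众号:生有可恋" (reposted from the WeChat official account 生有可恋).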

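This article is reposted from 生有可恋. If you believe it infringes your rights, email contact@modb.pro with supporting evidence; once verified, 墨天轮 will remove the content immediately.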
