python - Read only lines that contain a certain specific string and apply a regex to them

Here's my code: I have a script that reads a file, but not all the lines in the file are similar, and I'd like to extract information only from the lines that contain i doc o:.

I've tried an if condition, but it still doesn't work when there are lines that the regex doesn't match:

    #!/usr/bin/env python
    # -*- coding: utf-8 -*-
    import re

    def extraire(data):
        ms = re.match(r'(\S+).*?(o:\S+).*(r:\S+).*mid:(\d+)', data) # heure & mid
        return {'heure':ms.group(1), 'mid':ms.group(2),"origine":ms.group(3),"destination":ms.group(4)}

    tableau = []

    fichier = open("/home/test/file.log")
    f = fichier.readlines()
    for line in f:
        if (re.findall(".*i doc o:.*",line)):
            tableau = [extraire(line) for line in f ]

    print tableau
    fichier.close()

And here's an example of the lines in my file; here I want the first and fourth lines:

    01:09:25.258 mta         messages       doc o:nvs:smtp/alarm@yyy.xx r:nvs:sms/+654811 mid:6261
    01:09:41.965 mta         messages       rep o:nvs:smtp/alarmes.techniques@xxx.de r:nvs:sms/+455451 mid:6261
    01:09:41.965 mta         messages       rep 6261 ok, accepted (id: 26)
    08:14:14.469 mta         messages       doc o:nvs:smtp/alarm@xxxx.en r:nvs:sms/+654646 mid:6262
    08:14:30.630 mta         messages       rep o:nvs:smtp/alarm@azea.er r:nvs:sms/+33688704859 mid:6262
    08:14:30.630 mta         messages       rep 6262 ok, accepted (id: 28)

From: http://docs.python.org/2/library/re.html

*?, +?, ??: The '*', '+', and '?' qualifiers are all greedy; they match as much text as possible. Sometimes this behaviour isn't desired; if the RE <.*> is matched against ...
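To make the quoted distinction concrete, here is a small sketch. The sample string is made up purely for illustration and has nothing to do with the log file in the question.

```python
import re

# Hypothetical sample text, chosen only to contrast greedy and non-greedy matching.
text = "<H1>title</H1>"

# Greedy: '.*' consumes as much as possible, so the match runs to the last '>'.
greedy = re.match(r"<.*>", text).group(0)

# Non-greedy: '.*?' consumes as little as possible, so the match stops at the first '>'.
lazy = re.match(r"<.*?>", text).group(0)

print(greedy)
print(lazy)
```

The greedy pattern returns the whole string, while the non-greedy one returns only the opening tag.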

Also, findall is best used with the entire buffer, and it returns a list, so looping over the matches saves you from having a conditional against each line of the file.

    buff = fichier.read()
    matches = re.findall(".*?i doc o:.*", buff)
    for match in matches:
        tableau = ...

Here is my test code; tell me what it's doing that you didn't want:

    >>> import re
    >>> a = """
    ... 01:09:25.258 mta         messages       doc o:nvs:smtp/alarm@yyy.xx r:nvs:sms/+654811 mid:6261
    ... 01:09:41.965 mta         messages       rep o:nvs:smtp/alarmes.techniques@xxx.de r:nvs:sms/+455451 mid:6261
    ... 01:09:41.965 mta         messages       rep 6261 ok, accepted (id: 26)
    ... 08:14:14.469 mta         messages       doc o:nvs:smtp/alarm@xxxx.en r:nvs:sms/+654646 mid:6262
    ... 08:14:30.630 mta         messages       rep o:nvs:smtp/alarm@azea.er r:nvs:sms/+33688704859 mid:6262
    ... 08:14:30.630 mta         messages       rep 6262 ok, accepted (id: 28)"""
    >>> m = re.findall(".*?i doc o:.*",a)
    >>> m
    ['01:09:25.258 mta         messages       doc o:nvs:smtp/alarm@yyy.xx r:nvs:sms/+654811 mid:6261', '08:14:14.469 mta         messages       doc o:nvs:smtp/alarm@xxxx.en r:nvs:sms/+654646 mid:6262']

    >>> tableau = []
    >>> for line in m:
    ...     tableau.append( extraire(line) )
    ...
    >>> tableau
    [{'origine': 'r:nvs:sms/+654811', 'destination': '6261', 'heure': '01:09:25.258', 'mid': 'o:nvs:smtp/alarm@yyy.xx'}, {'origine': 'r:nvs:sms/+654646', 'destination': '6262', 'heure': '08:14:14.469', 'mid': 'o:nvs:smtp/alarm@xxxx.en'}]

You can do it in a single line as:

    >>> tableau = [ extraire(line) for line in re.findall( ".*?i doc o:.*", fichier.read() ) ]
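As an alternative to filtering with findall and then calling extraire, one pattern can both select the wanted lines and capture the fields, so a non-matching line never reaches a `.group()` call on `None`. This is only a sketch under some assumptions: the marker is taken to be `doc o:` exactly as the sample lines appear above, the names `LOG`, `PATTERN` and `extract_all` are hypothetical, and the mapping of `o:` to origine and `r:` to destination is a guess at the intended meaning rather than the mapping used in the question's code.

```python
import re

# Sample lines copied from the question as shown above.
# Assumption: the marker that identifies a wanted line is "doc o:".
LOG = """\
01:09:25.258 mta         messages       doc o:nvs:smtp/alarm@yyy.xx r:nvs:sms/+654811 mid:6261
01:09:41.965 mta         messages       rep o:nvs:smtp/alarmes.techniques@xxx.de r:nvs:sms/+455451 mid:6261
01:09:41.965 mta         messages       rep 6261 ok, accepted (id: 26)
08:14:14.469 mta         messages       doc o:nvs:smtp/alarm@xxxx.en r:nvs:sms/+654646 mid:6262
"""

# With re.MULTILINE, '^' anchors at the start of every line. '.' never matches a
# newline and the separators are limited to spaces and tabs, so a match cannot
# spill over into the next line.
PATTERN = re.compile(
    r"^(?P<heure>\S+).*?\bdoc (?P<origine>o:\S+)[ \t]+(?P<destination>r:\S+)[ \t]+mid:(?P<mid>\d+)",
    re.MULTILINE,
)

def extract_all(text):
    """Return one dict per matching line; lines that do not match are skipped."""
    return [m.groupdict() for m in PATTERN.finditer(text)]

tableau = extract_all(LOG)
for row in tableau:
    print(row)
```

To run it against the real file, pass the result of `fichier.read()` to `extract_all` instead of `LOG`.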
