Processing multiple files in Python, and matching on fields
I have a number of CSV files that need to be compared against one 'master list'. I want to determine, based on a unique ID, whether each of these other files contains an entry for that key.

What would be the easiest way to do this in Python? i.e. what kind of structures would you suggest reading the data into, and how would you suggest iterating through them?
Here is an example of the data and the output I am looking for.
**Master list**

    unique id : file name : file version : responsible party
    j578221 : expander : 1.23 : joe bloggs
    kk89821 : top : 0.9 : mike smith

**Location X**

    region : file name : unique id
    usa : acme expander : j578221
    usa : acme tail : mk33431

**Location Z**

    region : file name : unique id : date added
    china : expander : j578221 : 03-04-2012
    hk : acme top : kk89821 : 06-07-2012

**Output:**

    unique id : file name : file version : responsible party : in location x : in location z
    j578221 : expander : 1.23 : joe bloggs : yes : yes
    kk89821 : top : 0.9 : mike smith : no : yes
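A minimal sketch of reading the master list into a dictionary keyed by unique ID, using the standard `csv` module. The data is embedded as a string here for illustration; in practice you would pass an open file instead. The " : " separator is taken from the sample above (a real CSV would use `delimiter=","`), and the stripping of stray whitespace is an assumption about how the files are formatted.

```python
import csv
import io

# Sample master-list data from the question; in practice this would be
# an open file object such as open("master.csv") (file name assumed).
master_data = """unique id : file name : file version : responsible party
j578221 : expander : 1.23 : joe bloggs
kk89821 : top : 0.9 : mike smith
"""

# Read each row into a dict keyed by the unique id.
reader = csv.reader(io.StringIO(master_data), delimiter=":", skipinitialspace=True)
header = next(reader)  # skip the header row
master = {row[0].strip(): [c.strip() for c in row[1:]] for row in reader}

print(master["j578221"])  # ['expander', '1.23', 'joe bloggs']
```

A dictionary keyed by the unique ID gives O(1) lookups later, which is why it is a natural structure for the master list.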
The easiest way might be to use regular expressions (see the documentation) to retrieve the key from each line of the master file. (You might need to evaluate the structure of the files first and modify the expression if the position of the unique ID changes.)
Store the IDs in a dictionary, with the ID as the key and a list as the value, indicating which of the files each master key is included in.
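The dictionary described above might be built like this. The location data is embedded as in-memory strings here (the file names and the `locations` mapping are assumptions for illustration; in practice these would be open files). Looking up the `unique id` column by header name handles the fact that its position differs between files:

```python
import csv
import io

# Hypothetical in-memory stand-ins for the location files from the
# question; in practice these would be open file objects.
locations = {
    "location x": (
        "region : file name : unique id\n"
        "usa : acme expander : j578221\n"
        "usa : acme tail : mk33431\n"
    ),
    "location z": (
        "region : file name : unique id : date added\n"
        "china : expander : j578221 : 03-04-2012\n"
        "hk : acme top : kk89821 : 06-07-2012\n"
    ),
}

master_ids = ["j578221", "kk89821"]

# id -> list of location files that contain it
presence = {uid: [] for uid in master_ids}
for name, data in locations.items():
    reader = csv.reader(io.StringIO(data), delimiter=":")
    header = [h.strip() for h in next(reader)]
    id_col = header.index("unique id")  # column position may differ per file
    ids_in_file = {row[id_col].strip() for row in reader}
    for uid in master_ids:
        if uid in ids_in_file:
            presence[uid].append(name)

print(presence)
# {'j578221': ['location x', 'location z'], 'kk89821': ['location z']}
```

Collecting each location file's IDs into a set first makes the membership test for every master key cheap.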
Afterwards you can filter that dictionary for IDs (keys) contained in one, multiple, or all of the files, or for files containing one particular ID.
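Given a mapping from ID to the list of files containing it (the `presence` dict and its contents below are assumed as an example), the filtering step reduces to simple comprehensions:

```python
# Example presence mapping: id -> list of location files containing it.
presence = {
    "j578221": ["location x", "location z"],
    "kk89821": ["location z"],
}
all_files = {"location x", "location z"}

# IDs found in every location file:
in_all = [uid for uid, files in presence.items() if set(files) == all_files]

# IDs found in one particular file:
in_x = [uid for uid, files in presence.items() if "location x" in files]

print(in_all)  # ['j578221']
print(in_x)    # ['j578221']
```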