awk - Splitting a file into multiple files based on common line prefix
Suppose I have a file in the following format.
prefix1: line 1
prefix1: line 2
prefix1: line 3
prefix2: line 4
prefix2: line 5
prefix3: line 6
prefix3: line 7
prefix3: line 8
prefix3: line 9
prefix3: line 10
I want to split it into 3 files named prefix1, prefix2, and prefix3, with any newlines kept intact as part of whichever file contains them.
In the real file, there might be n prefixes, not just 3.
I could write a Python script to implement this functionality directly, but I wonder if there's a shorter way to do it in awk.
This one-liner does the job:
awk -F: '{f=$1?$1:f; print > f}' file
With your example:
kent$ cat file
prefix1: line 1
prefix1: line 2
prefix1: line 3
prefix2: line 4
prefix2: line 5
prefix3: line 6
prefix3: line 7
prefix3: line 8
prefix3: line 9
prefix3: line 10
kent$ awk -F: '{f=$1?$1:f; print > f}' file
kent$ head prefix*
==> prefix1 <==
prefix1: line 1
prefix1: line 2
prefix1: line 3

==> prefix2 <==
prefix2: line 4
prefix2: line 5

==> prefix3 <==
prefix3: line 6
prefix3: line 7
prefix3: line 8
prefix3: line 9
prefix3: line 10
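To see what the one-liner is doing, it can be written out with comments. This is only a sketch: it uses an explicit test for an empty first field instead of the original truthiness check (which would also treat a prefix of 0 as empty), and the scratch directory, the sample contents, and the blank line in the input are assumptions made for illustration.

```shell
# Work in a scratch directory so the demo does not clobber real files.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Assumed sample input; the blank line is there to show the fallback.
printf '%s\n' 'prefix1: line 1' 'prefix1: line 2' '' 'prefix2: line 3' > file

awk -F: '
{
    # With ":" as the field separator, $1 is the prefix. A non-empty $1
    # names the output file; a line with an empty $1 (such as a blank
    # line) keeps the previous file name.
    if ($1 != "") f = $1
    # Send the whole line to the file named by f.
    print > f
}' file

head prefix*
```

Here the blank line ends up in prefix1, because it follows a prefix1 line and has no prefix of its own.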
You mentioned n prefixes. If n is large enough, the awk line above may fail with a "too many open files" error. To solve that, close the file after writing/appending text to it.
That is, this line:
awk -F: '{f=$1?$1:f; print >> f; close(f)}' file
This line works whether or not the input file is sorted by prefix.
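A quick way to check the unsorted case is to feed the closing variant some interleaved input. The prefixes a and b, the file contents, and the scratch directory below are made up for the demonstration.

```shell
# Work in a scratch directory so no earlier output files are present.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical unsorted input: the two prefixes are interleaved.
printf '%s\n' 'a: 1' 'b: 2' 'a: 3' 'b: 4' > file

# Append each line to its prefix file, then close the file so that
# only one output file is open at any time.
awk -F: '{f=$1?$1:f; print >> f; close(f)}' file

cat a
cat b
```

The file a ends up with "a: 1" and "a: 3", and b with "b: 2" and "b: 4". The append redirection matters here: with a plain > the file would be truncated each time it is reopened after close(f), leaving only the last line for each prefix. For the same reason, output files left over from an earlier run get extended rather than replaced, so it may be worth removing them before running the command again.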