Bash script - Construct a single line out of many lines having duplicates in a single column


I have an instrumented log file that has 6 lines per duplicated first column (id), shown below.

//sc001@1/1/1@1/1,get,clientstart,1363178707755
//sc001@1/1/1@1/1,get,talktosocketstart,1363178707760
//sc001@1/1/1@1/1,get,decoderequest,1363178707765
//sc001@1/1/1@1/1,get-reply,encodereponse,1363178707767
//sc001@1/1/1@1/2,get,decoderequest,1363178708765
//sc001@1/1/1@1/2,get-reply,encodereponse,1363178708767
//sc001@1/1/1@1/2,get,talktosocketend,1363178708770
//sc001@1/1/1@1/2,get,clientend,1363178708775
//sc001@1/1/1@1/1,get,talktosocketend,1363178707770
//sc001@1/1/1@1/1,get,clientend,1363178707775
//sc001@1/1/1@1/2,get,clientstart,1363178708755
//sc001@1/1/1@1/2,get,talktosocketstart,1363178708760

Note: the comma (,) is the delimiter here.

Likewise, there are many duplicate first-column values (ids) in the log file (the above example has 2 ids: //sc001@1/1/1@1/1 and //sc001@1/1/1@1/2). I need to consolidate the log records into the format below.

id,clientstart,talktosocketstart,decoderequest,encodereponse,talktosocketend,clientend
//sc001@1/1/1@1/1,1363178707755,1363178707760,1363178707765,1363178707767,1363178707770,1363178707775
//sc001@1/1/1@1/2,1363178708755,1363178708760,1363178708765,1363178708767,1363178708770,1363178708775

I suppose this is a bash scripting exercise, and I would appreciate expert support with it. I hope there may be a sed or awk solution that is more efficient.

Thanks very much.

One way:

sort -t, -k4n,4 file | awk -F, '{a[$1]=a[$1]?a[$1] FS $NF:$NF} END{for(i in a){print i","a[i]}}'
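Run against the sample input above, this should produce the two consolidated rows (without the header row; the row order may vary, since awk's for (i in a) iteration order is unspecified):

//sc001@1/1/1@1/1,1363178707755,1363178707760,1363178707765,1363178707767,1363178707770,1363178707775
//sc001@1/1/1@1/2,1363178708755,1363178708760,1363178708765,1363178708767,1363178708770,1363178708775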

The sort command sorts the file on the basis of the last (4th) column, the timestamp. awk then takes the sorted input and forms an array with the 1st field as the key and, as the value, the accumulated last-column values joined by FS (the comma).
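If the header row from the desired output is also needed, and each event should land in a fixed column regardless of timestamp order, a variant that keys on the event name (field 3) instead of sorting could look like the sketch below. It assumes the input is in a file named file, that the six event names are exactly those in the header, and that every id logs all six events (a missing event would leave an empty column):

awk -F, '
BEGIN {
    # fixed column order for the events; adjust to match the real log
    n = split("clientstart,talktosocketstart,decoderequest,encodereponse,talktosocketend,clientend", ev, ",")
    printf "id"
    for (j = 1; j <= n; j++) { printf ",%s", ev[j]; col[ev[j]] = j }
    print ""
}
{
    # field 3 is the event name, field 4 its timestamp
    ts[$1, col[$3]] = $4
    ids[$1] = 1
}
END {
    for (id in ids) {
        printf "%s", id
        for (j = 1; j <= n; j++) printf ",%s", ts[id, j]
        print ""
    }
}' file

Keying on the event name avoids relying on the timestamps to put the columns in the right order; as with the one-liner, the row order after the header is unspecified.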

