Shell script - removing duplicate records


I am facing a problem removing duplicate entries (I am not well versed in shell scripting!). Here is the situation: an application creates a flat text file. Each line is one record, and each field is separated by the delimiter "~|" (quotes excluded). A record looks like this:

field1~|field2~|field3~|field4~|field5~|field6~|field7~| 

Some of these records are duplicates. Whether a record is a duplicate is decided by the value of one field: field2. How can I write a shell/awk/sed script that removes duplicate records based on this criterion? The script has to write its output to another file. This cannot be done inside the application itself because of a performance problem. Please help.

input file

field1~|aba~|field3~|field4~|field5~|field6~|field7~|
field1~|pqr~|field3~|field4~|field5~|field6~|field7~|
field1~|xyz~|field3~|field4~|field5~|field6~|field7~|
field1~|aba~|field3~|field4~|field5~|field6~|field7~|
field1~|rst~|field3~|field4~|field5~|field6~|field7~|
field1~|pqr~|field3~|field4~|field5~|field6~|field7~|

output should be-

field1~|aba~|field3~|field4~|field5~|field6~|field7~|
field1~|pqr~|field3~|field4~|field5~|field6~|field7~|
field1~|xyz~|field3~|field4~|field5~|field6~|field7~|
field1~|rst~|field3~|field4~|field5~|field6~|field7~|

(The order of the records doesn't matter.)

I'm not sure if I understood the question correctly; is this what you're looking for?

test.txt:

field1~|field2~|field3~|field4~|field5~|field6~|field7~|
foo~|field2~|bar~|field4~|field5~|field6~|field7~|
field1~|foobar~|field3~|field4~|field5~|field6~|field7~|

calling sort:

sort --field-separator="~" --key 2,2 --unique test.txt 

results in:

field1~|field2~|field3~|field4~|field5~|field6~|field7~|
field1~|foobar~|field3~|field4~|field5~|field6~|field7~|
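Since sort's --field-separator is a single character, it splits on "~" alone and the leading "|" just ends up in the key. awk can split on the full two-character "~|" delimiter, and it also preserves the original record order. A minimal sketch (the file names input.txt and output.txt are placeholders):

```shell
# Build a small sample file in the question's format.
printf '%s\n' \
  'field1~|aba~|field3~|field4~|field5~|field6~|field7~|' \
  'field1~|pqr~|field3~|field4~|field5~|field6~|field7~|' \
  'field1~|aba~|field3~|field4~|field5~|field6~|field7~|' > input.txt

# Split on the two-character delimiter "~|" (as a regex, the "|" must be
# bracketed, otherwise it means alternation) and print a line only the
# first time its second field is seen.
awk -F'~[|]' '!seen[$2]++' input.txt > output.txt
```

The !seen[$2]++ idiom keeps the first record for each distinct field2 value and silently drops later duplicates, which avoids the cost of sorting the whole file.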
