I'm having a problem removing duplicate entries. (I'm not experienced with shell scripting!) Here is the situation: an application creates a flat text file. Each line is one record, and each field is separated by the delimiter "~|" (quotes excluded). A record looks like this:
field1~|field2~|field3~|field4~|field5~|field6~|field7~|
Some records are duplicates. Whether a record is a duplicate is decided by the value of one field, field2. How can I write a shell script (or use awk/sed) to remove duplicate records based on this criterion? The script has to write its output to another file. Doing this inside the application is not an option because of performance problems. Please help.
Input file:
field1~|aba~|field3~|field4~|field5~|field6~|field7~|
field1~|pqr~|field3~|field4~|field5~|field6~|field7~|
field1~|xyz~|field3~|field4~|field5~|field6~|field7~|
field1~|aba~|field3~|field4~|field5~|field6~|field7~|
field1~|rst~|field3~|field4~|field5~|field6~|field7~|
field1~|pqr~|field3~|field4~|field5~|field6~|field7~|
Output should be:
field1~|aba~|field3~|field4~|field5~|field6~|field7~|
field1~|pqr~|field3~|field4~|field5~|field6~|field7~|
field1~|xyz~|field3~|field4~|field5~|field6~|field7~|
field1~|rst~|field3~|field4~|field5~|field6~|field7~|
(order of records doesn't matter.)
I'm not sure I understood the question correctly, but is this what you're looking for?
test.txt:
field1~|field2~|field3~|field4~|field5~|field6~|field7~|
foo~|field2~|bar~|field4~|field5~|field6~|field7~|
field1~|foobar~|field3~|field4~|field5~|field6~|field7~|
calling sort:
sort --field-separator="~" --key 2,2 --unique test.txt
results in:
field1~|field2~|field3~|field4~|field5~|field6~|field7~|
field1~|foobar~|field3~|field4~|field5~|field6~|field7~|
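Since the question also mentions awk and asks for the result in a separate file, here is a minimal awk sketch as an alternative to sort. It assumes the delimiter is always the literal two characters "~|", that field2 is the second field, and that the number of distinct field2 values fits in memory. The file names input.txt and output.txt and the array name seen are placeholders chosen for illustration.

```shell
# Sample input taken from the question (one record per line).
cat > input.txt <<'EOF'
field1~|aba~|field3~|field4~|field5~|field6~|field7~|
field1~|pqr~|field3~|field4~|field5~|field6~|field7~|
field1~|xyz~|field3~|field4~|field5~|field6~|field7~|
field1~|aba~|field3~|field4~|field5~|field6~|field7~|
field1~|rst~|field3~|field4~|field5~|field6~|field7~|
field1~|pqr~|field3~|field4~|field5~|field6~|field7~|
EOF

# Split each line on the literal delimiter "~|" (the bracket makes "|"
# literal inside the regex) and print a record only the first time its
# second field is seen. Results go to a separate file.
awk -F'~[|]' '!seen[$2]++' input.txt > output.txt
```

Unlike sort, this keeps the first occurrence of each field2 value and leaves the records in their original order, so no sorting pass is needed. The sort command above can write to a file as well, by redirecting its output with > or by using sort's -o option.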