ruby - Comparing strings of equal lengths and noting where the differences occur


Given two strings of equal length, such as

s1 = "acct"
s2 = "atct"

I want to find the positions where the strings differ. I have done it like this (please suggest a better way of doing it; I bet there is one):

z = s1.chars.zip(s2.chars).each_with_index.map { |(a, b), index| index + 1 if a != b }.compact

z is the array of (1-based) positions where the two strings differ. In this case z is [2].
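
A slightly more direct variant, as a sketch only: walk the indices and keep those where the characters differ. The helper name `diff_positions` is made up for illustration, and the code assumes both strings have the same length.

```ruby
s1 = "acct"
s2 = "atct"

# Hypothetical helper: 1-based positions at which two equal-length strings differ.
def diff_positions(a, b)
  (0...a.length).select { |i| a[i] != b[i] }.map { |i| i + 1 }
end

z = diff_positions(s1, s2)
p z # => [2]
```

This avoids building the intermediate array of character pairs, though for short strings the difference is unlikely to matter.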

Now imagine I add a new string:

s3 = "agct" 

and wish to compare it with the others, to see where the three strings differ. Taking the same approach as above, this time

s1.chars.zip(s2.chars,s3.chars) 

returns an array of arrays. With two strings I was relying on comparing two chars for equality, but as I add more strings this starts to become overwhelming, especially as the strings become longer.

#=> [["a", "a", "a"], ["c", "t", "g"], ["c", "c", "c"], ["t", "t", "t"]] 

Running

s1.chars.zip(s2.chars, s3.chars).map { |item| item.uniq }   #=> [["a"], ["c", "t", "g"], ["c"], ["t"]]

reduces the redundancy and shows which positions are the same (non-empty subarrays of size 1). I can then print out the indices and contents of the subarrays of size > 1.

s1.chars.zip(s2.chars, s3.chars).map { |item| item.uniq }.each_with_index.map { |a, index| [index + 1, a] unless a.size == 1 }.compact.map { |h| Hash[*h] } #=> [{2=>["c", "t", "g"]}]

I feel this will grind to a halt or slow down as I increase the number of strings and as the strings get longer. Are there alternative ways of doing this optimally? Thank you.
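
For reference, the chain above could be condensed into a single pass over the character columns. This is a minimal sketch, not a benchmarked solution: the helper name `differing_positions` is invented here, and it assumes every string has the same length (`Array#transpose` raises `IndexError` otherwise).

```ruby
# Hypothetical helper: maps each 1-based position where the strings disagree
# to the distinct characters seen at that position.
# Assumes all strings have the same length.
def differing_positions(strings)
  result = {}
  strings.map(&:chars).transpose.each_with_index do |column, index|
    distinct = column.uniq
    result[index + 1] = distinct if distinct.size > 1
  end
  result
end

diffs = differing_positions(%w[acct atct agct])
# diffs has a single entry: position 2 maps to ["c", "t", "g"]
puts diffs.inspect
```

Each character is visited once, so the work grows in proportion to the number of strings times their length.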

Here's where I'd start. I'm purposely using different strings to make it easier to see the differences:

str1 = 'jackdaws love my giant sphinx of quartz'
str2 = 'jackdaws l0ve my gi4nt sphinx 0f qu4rtz'

To find the first string's characters:

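# each_char returns an Enumerator; on Ruby 2.0+ String#chars returns an Array, which has no with_index
str1.each_char.with_index.to_a - str2.each_char.with_index.to_a
# => [["o", 10], ["a", 19], ["o", 30], ["a", 35]]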

To find the second string's characters:

str2.each_char.with_index.to_a - str1.each_char.with_index.to_a
# => [["0", 10], ["4", 19], ["0", 30], ["4", 35]]

There will be a little slowdown as the strings get bigger, but it won't be bad.
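
If both sides of each difference are wanted at once, the two subtractions could be folded into one pass. The following is only a sketch: `char_diffs` is a made-up name, it assumes equal-length strings, and it uses 0-based indices like the output above.

```ruby
# Hypothetical helper: [index, char_in_a, char_in_b] for every 0-based index
# at which the two strings differ. Assumes a and b have the same length.
def char_diffs(a, b)
  diffs = []
  a.chars.zip(b.chars).each_with_index do |(x, y), i|
    diffs << [i, x, y] if x != y
  end
  diffs
end

p char_diffs('love giant', 'l0ve gi4nt')
# => [[1, "o", "0"], [7, "a", "4"]]
```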


EDIT: Added more info.

If you have an arbitrary number of strings and need to compare them all, use Array#combination:

str1 = 'acct'
str2 = 'atct'
str3 = 'agct'

require 'pp'

pp [str1, str2, str3].combination(2).to_a
# => [["acct", "atct"], ["acct", "agct"], ["atct", "agct"]]

In the above output you can see that combination cycles through the array, returning the various n-sized combinations of the array elements.

pp [str1, str2, str3].combination(2).map { |a, b| a.each_char.with_index.to_a - b.each_char.with_index.to_a }
# => [[["c", 1]], [["c", 1]], [["t", 1]]]

This uses combination's output to cycle through the array, comparing the elements against each other. So, in the returned array above, for the "acct" and "atct" pair, 'c' is the difference between the two, located at position 1 in the string. Similarly, for "acct" and "agct" the difference is 'c' again, at position 1. For 'atct' and 'agct' it's 't' at position 1.

Because we saw with the longer string samples that the code returns multiple changed characters, this should get you pretty close.
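
To keep track of which pair each difference belongs to, the pairs themselves can be carried along with the result. This is a sketch under the same assumptions as above (equal-length strings, 0-based indices); the helper name `pairwise_diffs` is invented for illustration.

```ruby
# Hypothetical helper: for every pair of strings, the [char, index] entries
# of the first string that do not appear in the second.
def pairwise_diffs(strings)
  strings.combination(2).map do |a, b|
    [[a, b], a.each_char.with_index.to_a - b.each_char.with_index.to_a]
  end
end

pairwise_diffs(%w[acct atct agct]).each do |(a, b), diff|
  puts "#{a} vs #{b}: #{diff.inspect}"
end
# acct vs atct: [["c", 1]]
# acct vs agct: [["c", 1]]
# atct vs agct: [["t", 1]]
```

Note that n strings produce n(n-1)/2 pairs, so the number of comparisons grows quadratically with the number of strings.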

