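Each instance of the DNA class here corresponds to a string, such as 'GCCCAC'. From such a string you can build arrays of its substrings of length k, called k-mers. For this string there are 1-mers, 2-mers, 3-mers, 4-mers, 5-mers and a single 6-mer: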
["G", "C", "C", "C", "A", "C"]
["GC", "CC", "CC", "CA", "AC"]
["GCC", "CCC", "CCA", "CAC"]
["GCCC", "CCCA", "CCAC"]
["GCCCA", "CCCAC"]
["GCCCAC"]
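The pattern should be clear: the k-mers of a string are all of its contiguous substrings of length k, listed in order of starting position. See the Wiki for details.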
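The listing above can be reproduced with a few lines of Ruby. This is only a minimal sketch: the helper name all_kmers is hypothetical and not part of the question, and it assumes the sequence is a plain String.

```ruby
# Hypothetical helper (not from the original question): returns one array
# of k-mers for every k from 1 up to the length of the sequence.
def all_kmers(sequence)
  chars = sequence.chars
  (1..chars.length).map do |k|
    # each_cons(k) yields every window of k consecutive characters.
    chars.each_cons(k).map(&:join)
  end
end

# Prints one array per value of k, from the 1-mers down to the single 6-mer.
all_kmers('GCCCAC').each { |kmers| p kmers }
```

A string of length n has n - k + 1 k-mers, which is why each array is one element shorter than the previous one.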
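The problem is to write a shared_kmers(k, dna2) method for the DNA class that returns an array of all pairs [i, j] such that this DNA object (the receiver of the message) and dna2 share a common k-mer, located at position i in this DNA and at position j in dna2.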
dna1 = DNA.new('GCCCAC')
dna2 = DNA.new('CCACGC')
dna1.shared_kmers(2, dna2)
#=> [[0, 4], [1, 0], [2, 0], [3, 1], [4, 2]]
dna2.shared_kmers(2, dna1)
#=> [[0, 1], [0, 2], [1, 3], [2, 4], [4, 0]]
dna1.shared_kmers(3, dna2)
#=> [[2, 0], [3, 1]]
dna1.shared_kmers(4, dna2)
#=> [[2, 0]]
dna1.shared_kmers(5, dna2)
#=> []
Best answer
class DNA
  attr_accessor :sequencing

  def initialize(sequencing)
    @sequencing = sequencing
  end

  # Returns every substring of length k, in order of starting position.
  def kmers(k)
    @sequencing.each_char.each_cons(k).map(&:join)
  end

  # Returns all pairs [i, j] such that the k-mer at position i of this
  # sequence equals the k-mer at position j of dna.
  def shared_kmers(k, dna)
    kmers(k).each_with_object([]).with_index do |(kmer, result), index|
      dna.kmers(k).each_with_index do |other_kmer, other_kmer_index|
        result << [index, other_kmer_index] if kmer.eql?(other_kmer)
      end
    end
  end
end
dna1 = DNA.new('GCCCAC')
dna2 = DNA.new('CCACGC')
dna1.kmers(2)
#=> ["GC", "CC", "CC", "CA", "AC"]
dna2.kmers(2)
#=> ["CC", "CA", "AC", "CG", "GC"]
dna1.shared_kmers(2, dna2)
#=> [[0, 4], [1, 0], [2, 0], [3, 1], [4, 2]]
dna2.shared_kmers(2, dna1)
#=> [[0, 1], [0, 2], [1, 3], [2, 4], [4, 0]]
dna1.shared_kmers(3, dna2)
#=> [[2, 0], [3, 1]]
dna1.shared_kmers(4, dna2)
#=> [[2, 0]]
dna1.shared_kmers(5, dna2)
#=> []
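The answer above compares every k-mer of one sequence with every k-mer of the other. For long sequences, one possible alternative is to index the second sequence's k-mers in a Hash first and then look each k-mer of the first sequence up in it. The sketch below is not from the original answer: the free-standing function name shared_kmer_positions is hypothetical, and it takes plain strings rather than DNA objects.

```ruby
# Hypothetical hash-based variant (not from the original answer).
# Returns all pairs [i, j] where the k-mer starting at position i of seq1
# equals the k-mer starting at position j of seq2.
def shared_kmer_positions(k, seq1, seq2)
  # Map each k-mer of seq2 to the list of positions where it starts.
  positions = Hash.new { |hash, key| hash[key] = [] }
  seq2.each_char.each_cons(k).each_with_index do |chars, j|
    positions[chars.join] << j
  end

  # For each k-mer of seq1, collect a pair for every matching position.
  # fetch with a default avoids adding keys for k-mers absent from seq2.
  seq1.each_char.each_cons(k).each_with_index.flat_map do |chars, i|
    positions.fetch(chars.join, []).map { |j| [i, j] }
  end
end

p shared_kmer_positions(2, 'GCCCAC', 'CCACGC')
#=> [[0, 4], [1, 0], [2, 0], [3, 1], [4, 2]]
p shared_kmer_positions(5, 'GCCCAC', 'CCACGC')
#=> []
```

The pairs come out in the same order as in the nested-loop version, because both sequences are scanned from left to right.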
Regarding "ruby - How to find the indices of identical subsequences in two strings in Ruby?", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/58684821/