java codility training基因组范围查询

本文介绍了java codility training基因组范围查询的处理方法，对大家解决问题具有一定的参考价值，需要的朋友们下面随着小编来一起学习吧！

问题描述

任务是：

给出非空的零索引字符串S.字符串S由大写英文字母A，C，G，T组成的N个字符组成。

A non-empty zero-indexed string S is given. String S consists of N characters from the set of upper-case English letters A, C, G, T.

此字符串实际上代表DNA序列，而上层字符串 - 大小写字母代表单个核苷酸。

This string actually represents a DNA sequence, and the upper-case letters represent single nucleotides.

您还会获得由M个整数组成的非空零索引数组P和Q.这些阵列代表关于最小核苷酸的查询。我们将字符串S的字母表示为数组P和Q中的整数1,2,3,4，其中A = 1，C = 2，G = 3，T = 4，并且我们假设A

You are also given non-empty zero-indexed arrays P and Q consisting of M integers. These arrays represent queries about minimal nucleotides. We represent the letters of string S as integers 1, 2, 3, 4 in arrays P and Q, where A = 1, C = 2, G = 3, T = 4, and we assume that A < C < G < T.

查询K要求您找到范围内的最小核苷酸（P [K]，Q [K]），0≤P[i]≤Q[我]< N。

Query K requires you to find the minimal nucleotide from the range (P[K], Q[K]), 0 ≤ P[i] ≤ Q[i] < N.

例如，考虑字符串S = GACACCATA和数组P，Q，使得：

For example, consider string S = GACACCATA and arrays P, Q such that:

P[0] = 0    Q[0] = 8
P[1] = 0    Q[1] = 2
P[2] = 4    Q[2] = 5
P[3] = 7    Q[3] = 7

来自的最小核苷酸这些范围如下：

The minimal nucleotides from these ranges are as follows:

    (0, 8) is A identified by 1,
    (0, 2) is A identified by 1,
    (4, 5) is C identified by 2,
    (7, 7) is T identified by 4.

写一个函数：

class Solution { public int[] solution(String S, int[] P, int[] Q); }

给定非空零索引字符串S由N个字符和两个非字符组成空零索引数组P和Q由M个整数组成，返回一个由M个字符组成的数组，指定所有查询的连续答案。

that, given a non-empty zero-indexed string S consisting of N characters and two non-empty zero-indexed arrays P and Q consisting of M integers, returns an array consisting of M characters specifying the consecutive answers to all queries.

序列应返回为：

    a Results structure (in C), or
    a vector of integers (in C++), or
    a Results record (in Pascal), or
    an array of integers (in any other programming language).

例如，给定字符串S = GACACCATA和数组P，Q，使得：

For example, given the string S = GACACCATA and arrays P, Q such that:

P[0] = 0    Q[0] = 8
P[1] = 0    Q[1] = 2
P[2] = 4    Q[2] = 5
P[3] = 7    Q[3] = 7

该函数应返回值[1,1,2,4]，如上所述。

the function should return the values [1, 1, 2, 4], as explained above.

假设：

    N is an integer within the range [1..100,000];
    M is an integer within the range [1..50,000];
    each element of array P, Q is an integer within the range [0..N − 1];
    P[i] ≤ Q[i];
    string S consists only of upper-case English letters A, C, G, T.

复杂性：

    expected worst-case time complexity is O(N+M);
    expected worst-case space complexity is O(N),
         beyond input storage
         (not counting the storage required for input arguments).

可以修改输入数组的元素。

Elements of input arrays can be modified.

我的解决方案是：

class Solution {
    public int[] solution(String S, int[] P, int[] Q) {
        final  char c[] = S.toCharArray();
        final int answer[] = new int[P.length];
        int tempAnswer;
        char tempC;

        for (int iii = 0; iii < P.length; iii++) {
            tempAnswer = 4;
            for (int zzz = P[iii]; zzz <= Q[iii]; zzz++) {
                tempC = c[zzz];
                if (tempC == 'A') {
                    tempAnswer = 1;
                    break;
                } else if (tempC == 'C') {
                    if (tempAnswer > 2) {
                        tempAnswer = 2;
                    }
                } else if (tempC == 'G') {
                    if (tempAnswer > 3) {
                        tempAnswer = 3;
                    }

                }
            }
            answer[iii] = tempAnswer;
        }

        return answer;
    }
}

这不是最佳，我相信它应该是在一个循环中完成，任何提示我该如何实现？

It is not optimal, I believe it's supposed to be done within one loop, any hint how can I achieve it?

您可以在此检查解决方案的质量测试名称是Genomic-range-query。

You can check quality of your solution here https://codility.com/train/ test name is Genomic-range-query.

推荐答案

这是在codility.com中获得100分100的解决方案。请阅读前缀总和以了解解决方案：

Here is the solution that got 100 out of 100 in codility.com. Please read about prefix sums to understand the solution:

public static int[] solveGenomicRange(String S, int[] P, int[] Q) {
        //used jagged array to hold the prefix sums of each A, C and G genoms
        //we don't need to get prefix sums of T, you will see why.
        int[][] genoms = new int[3][S.length()+1];
        //if the char is found in the index i, then we set it to be 1 else they are 0
        //3 short values are needed for this reason
        short a, c, g;
        for (int i=0; i<S.length(); i++) {
            a = 0; c = 0; g = 0;
            if ('A' == (S.charAt(i))) {
                a=1;
            }
            if ('C' == (S.charAt(i))) {
                c=1;
            }
            if ('G' == (S.charAt(i))) {
                g=1;
            }
            //here we calculate prefix sums. To learn what's prefix sums look at here https://codility.com/media/train/3-PrefixSums.pdf
            genoms[0][i+1] = genoms[0][i] + a;
            genoms[1][i+1] = genoms[1][i] + c;
            genoms[2][i+1] = genoms[2][i] + g;
        }

        int[] result = new int[P.length];
        //here we go through the provided P[] and Q[] arrays as intervals
        for (int i=0; i<P.length; i++) {
            int fromIndex = P[i];
            //we need to add 1 to Q[i],
            //because our genoms[0][0], genoms[1][0] and genoms[2][0]
            //have 0 values by default, look above genoms[0][i+1] = genoms[0][i] + a;
            int toIndex = Q[i]+1;
            if (genoms[0][toIndex] - genoms[0][fromIndex] > 0) {
                result[i] = 1;
            } else if (genoms[1][toIndex] - genoms[1][fromIndex] > 0) {
                result[i] = 2;
            } else if (genoms[2][toIndex] - genoms[2][fromIndex] > 0) {
                result[i] = 3;
            } else {
                result[i] = 4;
            }
        }

        return result;
    }

这篇关于java codility training基因组范围查询的文章就介绍到这了，希望我们推荐的答案对大家有所帮助，也希望大家多多支持！