I have a sample HTML page, taken from PracticalCryptography.com, that helps me compute the index of coincidence of an arbitrary text:

<html>

<body>
  <script type="text/javascript">
    function GetIC() {
      plaintext = document.getElementById("p").value.toLowerCase().replace(/[^a-z]/g, "");
      var counts = new Array(26);
      var totcount = 0;
      for (i = 0; i < 26; i++) counts[i] = 0;
      for (i = 0; i < plaintext.length; i++) {
        counts[plaintext.charCodeAt(i) - 97] ++;
        totcount++;
      }
      var sum = 0;
      for (i = 0; i < 26; i++) sum = sum + counts[i] * (counts[i] - 1);
      ic = sum / (totcount * (totcount - 1));
      document.getElementById("ic").value = ic;
      document.getElementById("count").value = totcount;
    }
  </script>
  <form>
    <p>
      <textarea name="p" id="p" rows="2" cols="50" wrap="soft">Defend the east wall of the castle</textarea>
    </p>
    <p>
      <input name="b" id="b" value="Get I.C." onclick="GetIC()" type="button">
    </p>
    <p>Index of Coincidence =
      <input id="ic" name="ic" size="15" maxchars="15" value="" type="text">Character Count =
      <input id="count" name="count" size="8" maxchars="8" value="" type="text">
    </p>
  </form>
</body>

</html>
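
For reference, the quantity the script computes can be written as below. The symbols are chosen here for illustration and do not appear in the page itself: n_i is the number of occurrences of the i-th letter (the script's counts[i]) and N is the total number of letters (totcount).

```latex
\mathrm{IC} = \frac{\sum_{i=1}^{26} n_i \,(n_i - 1)}{N\,(N - 1)}
```

For the default text "Defend the east wall of the castle", N = 28 and the numerator is 62, so IC = 62 / 756, roughly 0.082.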


I want to port the JavaScript code to Java. Here is my attempt:
private static boolean TestIOC(String text) {
    // Replace any character *not* in the range a-z
    // /g     -- global tag means find all, not just find one
    String plaintext = text.toLowerCase().replaceAll("/[^a-z]/g", "");
    int counts[] = new int[26];
    int totcount = 0;

    for (int i = 0; i < 26; i++) counts[i] = 0;
    for (int i = 0; i < plaintext.length(); i++) {
        int codePointAtI = Character.codePointAt(plaintext, i);
        //System.out.println(codePointAtI);

        counts[codePointAtI - 97]++; // Problematic line
        totcount++;
    }

    int sum = 0;
    float ic = 0;

    for (int i = 0; i < 26; i++) {
        sum = sum + counts[i]*(counts[i]-1);
        ic = sum / (totcount*(totcount-1));
    }

    if (ic >= 0.062 || ic <= 0.072) {
        DecimalFormat df = new DecimalFormat("#.###");
        System.out.println(df.format(ic));
        return true;
    }

    else {
        return false;
    }
}

However, it fails on this line:

counts[codePointAtI - 97]++;

[Plaintext output]

[Output of the commented-out line: System.out.println(codePointAtI);]
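
A likely explanation for the failure, sketched under the assumption that the error is an ArrayIndexOutOfBoundsException (the post does not name it): String.replaceAll in Java takes the bare regular expression, so the JavaScript delimiters and flag in "/[^a-z]/g" become literal characters that must appear in the input. Nothing in ordinary text matches, spaces and punctuation survive, and a space (code point 32) yields the index 32 - 97 = -65. The class and method names below are hypothetical and only serve the demonstration.

```java
final class RegexPortDemo {

    // Pattern copied verbatim from the JavaScript literal, slashes and flag included.
    // In Java this only matches a literal "/", one non-letter, then "/g".
    public static String stripJsStyle(String text) {
        return text.toLowerCase().replaceAll("/[^a-z]/g", "");
    }

    // Bare pattern: replaceAll already replaces every match, so no "g" flag is needed.
    public static String stripJavaStyle(String text) {
        return text.toLowerCase().replaceAll("[^a-z]", "");
    }

    public static void main(String[] args) {
        String sample = "Defend the east wall";
        String kept = stripJsStyle(sample);
        System.out.println(kept);                 // the spaces are still there
        System.out.println(kept.charAt(6) - 97);  // index computed for the first space
        System.out.println(stripJavaStyle(sample));
    }
}
```

With the bare pattern only the letters a-z remain, so every index falls in the range 0 to 25.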

Best answer

private static boolean TestIOC(String text) {
    // Replace any character *not* in the range a-z, and change all characters to lowercase.
    String plaintext = text.toLowerCase().replaceAll("[^a-z]", "");

    //System.out.println(plaintext);

    int counts[] = new int[26];
    int totcount = 0;
    double sum = 0;
    double ic = 0;

    for (int i = 0; i < 26; i++) counts[i] = 0;
    for (int i = 0; i < plaintext.length(); i++) {
        int codePointAtI = Character.codePointAt(plaintext, i);
        //System.out.println(codePointAtI);

        counts[codePointAtI - 97]++;
        totcount++;
    }

    //System.out.println("Totcount: " + totcount);

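    // Sum n * (n - 1) over all 26 letters; the casts avoid int overflow on long texts.
    for (int i = 0; i < 26; i++) {
        sum = sum + (double) counts[i] * (counts[i] - 1);
    }
    // Normalise once, after the loop.
    ic = sum / ((double) totcount * (totcount - 1));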

    //System.out.println("Sum:" + sum);
    //System.out.println("ic: " + ic);

    if (ic >= 0.059 && ic <= 0.068) {
        System.out.println("This file contains an English text.");
        return true;
    }

    else {
        System.out.println("This file does not contain an English text.");
        return false;
    }
}
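
Besides the regular expression, the code above differs from the attempt in two other ways: sum and ic are doubles, so the division is no longer an integer division that truncates to 0, and the range check uses && instead of ||, which was true for every value. The following is a minimal sketch of the same calculation split into a function that returns the value and a separate range check. The class and method names are hypothetical, the thresholds 0.059 and 0.068 are the ones used above, and returning 0.0 for inputs with fewer than two letters is an assumption made here to avoid dividing by zero.

```java
import java.util.Locale;

final class IndexOfCoincidence {

    /** Returns the index of coincidence of the letters a-z found in text. */
    public static double compute(String text) {
        // Lower-case, then drop everything that is not a-z.
        String letters = text.toLowerCase(Locale.ROOT).replaceAll("[^a-z]", "");

        long[] counts = new long[26]; // Java zero-initialises new arrays
        for (int i = 0; i < letters.length(); i++) {
            counts[letters.charAt(i) - 'a']++;
        }

        long n = letters.length();
        if (n < 2) {
            return 0.0; // assumption: too little text is reported as 0.0
        }

        long sum = 0;
        for (long c : counts) {
            sum += c * (c - 1);
        }
        return (double) sum / (n * (n - 1));
    }

    /** Range check with the thresholds from the answer above. */
    public static boolean looksLikeEnglish(String text) {
        double ic = compute(text);
        return ic >= 0.059 && ic <= 0.068;
    }

    public static void main(String[] args) {
        // 28 letters, numerator 62, so this prints a value close to 0.082.
        System.out.println(compute("Defend the east wall of the castle"));
    }
}
```

A short sample such as the default sentence lands outside the 0.059 to 0.068 window even though it is English, so the range check is only meaningful for longer texts.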

09-04 02:30