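I have created two HashMaps that hold the strings from two separate txt files.

Now I am trying to compare the two HashMaps and count how many duplicate values each file contains. For example, if file1 and file2 both contain the string "hello" twice, my console should print: hello occurs 2 times.

This is my first HashMap: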

 List<String> word_list = new ArrayList<>();
        //Load your words to the word_list here


         while (INPUT_TEXT1.hasNext()) {
            String input_word = INPUT_TEXT1.next();

            word_list.add(input_word);

        }

        INPUT_TEXT1.close();

        String regexPattern = "[^a-zA-Z]";

        int index = 0;

        for (String s : word_list) {

            word_list.set(index++, s.replaceAll(regexPattern, "").toLowerCase());
        }

        //Find the unique words now from list
        String[] uniqueWords = word_list.stream().distinct().
                                       toArray(size -> new String[size]);
        Map<String, Integer> wordsMap = new HashMap<>();
        int frequency = 0;

        //Load the words to Map with each uniqueword as Key and frequency as Value
        for (String uniqueWord : uniqueWords) {
            frequency = Collections.frequency(word_list, uniqueWord);
            System.out.println(uniqueWord+" occurred "+frequency+" times");
            wordsMap.put(uniqueWord, frequency);
        }

       //Now, Sort the words with the reverse order of frequency(value of HashMap)
       Stream<Entry<String, Integer>> topWords = wordsMap.entrySet().stream().
         sorted(Map.Entry.<String,Integer>comparingByValue().reversed()).limit(5);

        //Now print the Top 5 words to console
        System.out.println("Top 5 Words:::");
        topWords.forEach(System.out::println);


        System.out.println("\n\n");


This is my second HashMap:

List<String> wordList = new ArrayList<>();
        //Load your words to the word_list here


         while (INPUT_TEXT2.hasNext()) {
            String input_word1 = INPUT_TEXT2.next();

            wordList.add(input_word1);

        }

        INPUT_TEXT2.close();

        String regex = "[^a-zA-Z]";

        int index1 = 0;

        for (String s : wordList) {

            wordList.set(index1++, s.replaceAll(regex, "").toLowerCase());
        }

        String[] uniqueWords1 = wordList.stream().distinct().
                                       toArray(size -> new String[size]);
        Map<String, Integer> wordsMap1 = new HashMap<>();

         //Load the words to Map with each uniqueword as Key and frequency as Value
        for (String uniqueWord : uniqueWords1) {
            frequency = Collections.frequency(wordList, uniqueWord);
            System.out.println(uniqueWord+" occurred "+frequency+" times");
            wordsMap1.put(uniqueWord, frequency);
        }

       //Now, Sort the words with the reverse order of frequency(value of HashMap)
       Stream<Entry<String, Integer>> topWords1 = wordsMap1.entrySet().stream().
         sorted(Map.Entry.<String,Integer>comparingByValue().reversed()).limit(5);


This is my original approach for finding the duplicate values:

 boolean val = wordsMap.keySet().containsAll(wordsMap1.keySet());

    for (Entry<String, Integer> str : wordsMap.entrySet()) {
        System.out.println("================= " + str.getKey());


        if(wordsMap1.containsKey(str.getKey())){
            System.out.println("Map2 Contains Map 1 Key");
        }
    }

    System.out.println("================= " + val);


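Does anyone have other suggestions for this? Thanks

Edit
How can I count the number of occurrences of each individual value?

Best answer

I think your code works as well. If your goal is to find a better way to perform the final check, you can try the following: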

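// Copy the keys first: calling retainAll on wordsMap.keySet() directly would remove entries from wordsMap
Set<String> keySetMap1 = new HashSet<>(wordsMap.keySet());
Set<String> keySet2 = wordsMap1.keySet();
// Keep only the keys that also appear in the second map
keySetMap1.retainAll(keySet2);
keySetMap1.forEach(x -> System.out.println("Map2 Contains Map 1 Key: " + x));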
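
As for the edit (counting the occurrences of each shared word), here is a rough sketch of one way it could be done. It is not taken from your code: the class name `CommonWordCounts` and the helpers `countWords` and `commonCounts` are made up for illustration. It assumes that "occurs N times" means the smaller of the two per-file counts, and it skips tokens that become empty after the non-letter characters are stripped; adjust both if you need something different. The counting step uses `Map.merge`, which builds the frequency map in one pass instead of calling `Collections.frequency` for every unique word.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

class CommonWordCounts {

    // Builds a word -> frequency map in a single pass.
    // Non-letters are stripped and words are lower-cased, as in the question.
    static Map<String, Integer> countWords(Iterable<String> words) {
        Map<String, Integer> counts = new HashMap<>();
        for (String raw : words) {
            String word = raw.replaceAll("[^a-zA-Z]", "").toLowerCase();
            if (!word.isEmpty()) {            // assumption: ignore tokens with no letters
                counts.merge(word, 1, Integer::sum);
            }
        }
        return counts;
    }

    // For every word present in both maps, stores the smaller of the two counts
    // (assumption: that is what "duplicate count" should mean here).
    static Map<String, Integer> commonCounts(Map<String, Integer> first,
                                             Map<String, Integer> second) {
        Map<String, Integer> common = new TreeMap<>(); // sorted keys for stable output
        for (Map.Entry<String, Integer> entry : first.entrySet()) {
            Integer other = second.get(entry.getKey());
            if (other != null) {
                common.put(entry.getKey(), Math.min(entry.getValue(), other));
            }
        }
        return common;
    }

    public static void main(String[] args) {
        // Stand-ins for the words read from the two files
        Map<String, Integer> wordsMap = countWords(Arrays.asList("Hello", "world", "hello!"));
        Map<String, Integer> wordsMap1 = countWords(Arrays.asList("hello", "HELLO", "there"));

        commonCounts(wordsMap, wordsMap1)
                .forEach((word, n) -> System.out.println(word + " occurs " + n + " times"));
        // prints: hello occurs 2 times
    }
}
```

If you would rather report the combined total from both files, replacing `Math.min(...)` with a sum of the two values should be enough.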

08-26 02:08