Alternative to streams for sorting words by frequency

Question

I have a method that takes a List of Strings as an argument and reads it. It then sorts the words by frequency, and words with the same frequency are printed alphabetically. (Note that there are also Russian words, and they always go below the English words.)

Here is an example of good output:

лицами-18
Apex-15
azet-15
xder-15
анатолю-15
андреевич-15
батальона-15
hello-13
zello-13
полноте-13

Here is my code:

public class Words {

public String countWords(List<String> lines) {

    StringBuilder input = new StringBuilder();
    StringBuilder answer = new StringBuilder();

    for (String line : lines){
        if(line.length() > 3){
            if(line.substring(line.length() - 1).matches("[.?!,]+")){
                input.append(line.substring(0,line.length()-1)).append(" ");
            }else{
                input.append(line).append(" ");
            }
        }
    }

    String[] strings = input.toString().split("\\s");

    List<String> list = new ArrayList<>(Arrays.asList(strings));

    Map<String, Integer> unsortMap = new HashMap<>();
    while (list.size() != 0){
        String word = list.get(0);
        int freq = Collections.frequency(list, word);
        if (word.length() >= 4 && freq >= 10){
            unsortMap.put(word.toLowerCase(), freq);
        }

        list.removeAll(Collections.singleton(word));
    }
    //The Stream logic is here
    List<String> sortedEntries = unsortMap.entrySet().stream()
            .sorted(Comparator.comparingLong(Map.Entry<String, Integer>::getValue)
                    .reversed()
                    .thenComparing(Map.Entry::getKey)
            )
            .map(it -> it.getKey() + " - " + it.getValue())
            .collect(Collectors.toList());
    
    //Logic ends here

    for (int i = 0; i < sortedEntries.size(); i++) {
        if(i<sortedEntries.size()-1) {
            answer.append(sortedEntries.get(i)).append("\n");
        }
        else{
            answer.append(sortedEntries.get(i));
        }
    }

    return answer.toString();

 }
}

My issue: the code currently works and gives correct results, but as you can see, I am using streams to sort the strings. I am interested in whether there is another way to write this without streams. More precisely, is there a way to sort Strings by frequency, and then alphabetically when they have the same frequency, without using streams?

Recommended answer

Anything you can do with streams you can do in conventional Java. But using streams usually makes for much shorter, simpler, and easier-to-read code!

By the way, the counting in the first half of your code could be replaced with simply this, where words is the list of individual words:

Map < String, AtomicInteger > map = new HashMap <>();
for ( String word : words ) {
    map.putIfAbsent( word , new AtomicInteger( 0 ) );
    map.get( word ).incrementAndGet();
}
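
As a side note, the same counting step can be sketched with plain Integer values and Map.merge instead of AtomicInteger. This is only an illustrative variant, not part of the original answer; the class name WordCounter and the method countWords are made up for the example, and the count is case-sensitive, like the loop above.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper class, named here only for illustration.
class WordCounter {

    // Counts how often each word occurs (case-sensitive).
    static Map<String, Integer> countWords(List<String> words) {
        Map<String, Integer> counts = new HashMap<>();
        for (String word : words) {
            // merge() stores 1 for a new key, otherwise adds 1 to the existing count.
            counts.merge(word, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> words = List.of("hello", "Apex", "hello", "лицами", "Apex", "hello");
        // HashMap iteration order is unspecified, so the printed order may vary.
        System.out.println(countWords(words));
    }
}
```

Either form works; merge avoids the separate putIfAbsent and get calls, while AtomicInteger avoids re-boxing the count on every increment.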

The second half of your code reports on a map by sorting first on value, then on key.

That challenge is discussed in the Questions, Sorting a HashMap based on Value then Key? and Sort a Map<Key, Value> by values. There are some clever solutions among those Answers, such as this one by Sean.

But I would rather keep things simple. I would translate the map of word to word-count into objects of our own custom class, each object holding the word and its count as fields.

Java 16+ brings the records feature, making such a custom class definition much easier. A record is a briefer way to write a class whose main purpose is to communicate data transparently and immutably. The compiler implicitly creates the constructor, getters, equals & hashCode, and toString.

record WordAndCount (String word , int count ) {}

Before Java 16, use a conventional class in place of that record. Here is the source-code equivalent of that record one-liner, roughly 30 lines long.

final class WordAndCount {
    private final String word;
    private final int count;

    WordAndCount ( String word , int count ) {
        this.word = word;
        this.count = count;
    }

    public String word () { return word; }

    public int count () { return count; }

    @Override
    public boolean equals ( Object obj ) {
        if ( obj == this ) return true;
        if ( obj == null || obj.getClass() != this.getClass() ) return false;
        WordAndCount that = ( WordAndCount ) obj;
        return Objects.equals( this.word , that.word ) && this.count == that.count;
    }

    @Override
    public int hashCode () {
        return Objects.hash( word , count );
    }

    @Override
    public String toString () {
        return "WordAndCount[" + "word=" + word + ", " + "count=" + count + ']';
    }
}

We make a list of objects of that record type and populate it.

List<WordAndCount> wordAndCounts = new ArrayList <>(map.size()) ;
for ( String word : map.keySet() ) {
    wordAndCounts.add( new WordAndCount( word, map.get( word ).get() ) );
}

Now sort. The Comparator interface has some handy factory methods to which we can pass a method reference.

wordAndCounts.sort(
        Comparator
                .comparingInt( WordAndCount ::count )
                .reversed()
                .thenComparing( WordAndCount ::word )
);
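
For readers who want to avoid the factory methods and method references as well, the same ordering can be spelled out by hand. The following is a minimal sketch under stated assumptions, not the original answer's code: ExplicitSort, Entry, and sortedWords are hypothetical names, and Entry is a plain class standing in for WordAndCount so the example also compiles before Java 16. Ties are broken with String.compareTo, which compares UTF-16 code units, so Latin words sort ahead of Cyrillic ones and uppercase letters ahead of lowercase.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

// Hypothetical demo class; ExplicitSort and Entry are illustrative names.
class ExplicitSort {

    // Minimal stand-in for WordAndCount, written as a plain class.
    static final class Entry {
        final String word;
        final int count;

        Entry(String word, int count) {
            this.word = word;
            this.count = count;
        }
    }

    // Higher counts first; equal counts fall back to String's natural order.
    static final Comparator<Entry> BY_COUNT_DESC_THEN_WORD = new Comparator<Entry>() {
        @Override
        public int compare(Entry a, Entry b) {
            // Swapping the arguments yields descending order by count.
            int byCount = Integer.compare(b.count, a.count);
            if (byCount != 0) {
                return byCount;
            }
            return a.word.compareTo(b.word);
        }
    };

    // Returns the words in sorted order, leaving the input list untouched.
    static List<String> sortedWords(List<Entry> entries) {
        List<Entry> copy = new ArrayList<>(entries);
        Collections.sort(copy, BY_COUNT_DESC_THEN_WORD);
        List<String> result = new ArrayList<>(copy.size());
        for (Entry entry : copy) {
            result.add(entry.word);
        }
        return result;
    }

    public static void main(String[] args) {
        List<Entry> entries = List.of(
                new Entry("hello", 13), new Entry("анатолю", 15), new Entry("xder", 15),
                new Entry("лицами", 18), new Entry("Apex", 15), new Entry("azet", 15));
        System.out.println(sortedWords(entries));
        // prints [лицами, Apex, azet, xder, анатолю, hello]
    }
}
```

If the words should be ordered by language-aware rules rather than raw code units, a java.text.Collator could replace compareTo, but that goes beyond what the Question asks for.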

Let's pull all that code together.

package work.basil.text;

import java.util.*;
import java.util.concurrent.atomic.AtomicInteger;

public class EngRus {
    public static void main ( String[] args ) {
        // Populate input data.
        List < String > words = EngRus.generateText(); // Recreate the original data seen in the Question.
        System.out.println( "words = " + words );

        // Count words in the input list.
        Map < String, AtomicInteger > map = new HashMap <>();
        for ( String word : words ) {
            map.putIfAbsent( word , new AtomicInteger( 0 ) );
            map.get( word ).incrementAndGet();
        }
        System.out.println( "map = " + map );

        // Report on word count, sorting first by word-count numerically and then by word alphabetically.
        record WordAndCount( String word , int count ) { }
        List < WordAndCount > wordAndCounts = new ArrayList <>( map.size() );
        for ( String word : map.keySet() ) {
            wordAndCounts.add( new WordAndCount( word , map.get( word ).get() ) );
        }
        wordAndCounts.sort( Comparator.comparingInt( WordAndCount :: count ).reversed().thenComparing( WordAndCount :: word ) );
        System.out.println( "wordAndCounts = " + wordAndCounts );
    }

    public static List < String > generateText () {
        String input = """
                лицами-18
                Apex-15
                azet-15
                xder-15
                анатолю-15
                андреевич-15
                батальона-15
                hello-13
                zello-13
                полноте-13
                """;

        List < String > words = new ArrayList <>();
        input.lines().forEach( line -> {
            String[] parts = line.split( "-" );
            for ( int i = 0 ; i < Integer.parseInt( parts[ 1 ] ) ; i++ ) {
                words.add( parts[ 0 ] );
            }
        } );
        Collections.shuffle( words );
        return words;
    }
}

When run:

words = [андреевич, hello, xder, батальона, лицами, полноте, анатолю, лицами, полноте, полноте, …]

(The list holds 147 words in total. It is shuffled, so the order differs on each run; the rest is omitted here.)

map = {андреевич=15, xder=15, zello=13, батальона=15, azet=15, лицами=18, анатолю=15, hello=13, Apex=15, полноте=13}

wordAndCounts = [WordAndCount[word=лицами, count=18], WordAndCount[word=Apex, count=15], WordAndCount[word=azet, count=15], WordAndCount[word=xder, count=15], WordAndCount[word=анатолю, count=15], WordAndCount[word=андреевич, count=15], WordAndCount[word=батальона, count=15], WordAndCount[word=hello, count=13], WordAndCount[word=zello, count=13], WordAndCount[word=полноте, count=13]]
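
To get from a sorted list back to the one-line-per-word report shown in the Question, again without streams, a plain loop plus String.join is enough. This is a small sketch rather than the answer's own code: ReportFormatter and format are hypothetical names, and the sorted pairs are passed in as Map.Entry objects (created with Map.entry, Java 9+) only to keep the example self-contained.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper class, named here only for illustration.
class ReportFormatter {

    // Turns already-sorted (word, count) pairs into lines such as "hello-13".
    static String format(List<Map.Entry<String, Integer>> sorted) {
        List<String> lines = new ArrayList<>(sorted.size());
        for (Map.Entry<String, Integer> entry : sorted) {
            lines.add(entry.getKey() + "-" + entry.getValue());
        }
        // String.join puts the separator only between elements, so there is no trailing newline.
        return String.join("\n", lines);
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> sorted = List.of(
                Map.entry("лицами", 18),
                Map.entry("Apex", 15),
                Map.entry("hello", 13));
        System.out.println(format(sorted));
    }
}
```

String.join also removes the need for the index check in the Question's final loop, which exists only to skip the newline after the last entry.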
