I have an ArrayList<Dico> that I am trying to split into several ArrayLists, one per document id, but the output contains duplicates.

Here is the Dico class:

public class Dico implements Comparable {
    private final String m_term;
    private double m_weight;
    private final int m_Id_doc;

    public Dico(int Id_Doc, String Term, double tf_ief) {
        this.m_Id_doc = Id_Doc;
        this.m_term = Term;
        this.m_weight = tf_ief;
    }

    public String getTerm() {
        return this.m_term;
    }

    public double getWeight() {
        return this.m_weight;
    }

    public void setWeight(double weight) {
        this.m_weight = weight;
    }

    public int getDocId() {
        return this.m_Id_doc;
    }

    @Override
    public int compareTo(Object another) throws ClassCastException {
        if (!(another instanceof Dico))
            throw new ClassCastException("A Dico object expected.");
        int anotherDocid = ((Dico) another).getDocId();
        return this.getDocId() - anotherDocid;
    }

    @Override
    public String toString() {
        return "id" + getDocId() + "term" + getTerm() + "weight" + getWeight() + "";
    }
}


And here is the split_dico function that does the splitting:

public static void split_dico(List<Dico> list) {
    int[] changes = new int[list.size() + 1]; // worst case: every entry starts a new sublist; holds each sublist's start index
    Arrays.fill(changes, -1); // if an index is not used, will remain -1
    changes[0] = 0;
    int change = 1;
    int id = list.get(0).getDocId();
    for (int i = 1; i < list.size(); i++) {
        Dico dic_entry = list.get(i);
        if (id != dic_entry.getDocId()) {
            changes[change++] = i;
            id = dic_entry.getDocId();
        }
    }
    changes[change] = list.size(); // end of last change segment
    List<List<Dico>> sublists = new ArrayList<>(change);
    for (int i = 0; i < change; i++) {
        sublists.add(list.subList(changes[i], changes[i + 1]));
        System.out.println(sublists);
    }
}


Test:

List<Dico> list = Arrays.asList(new Dico(1, "foo", 1),
    new Dico(7, "zoo", 5),
    new Dico(2, "foo", 1),
    new Dico(3, "foo", 1),
    new Dico(1, "bar", 2),
    new Dico(4, "zoo", 0.5),
    new Dico(2, "bar", 2),
    new Dico(3, "baz", 3));
Collections.sort(list);
split_dico(list);


Output:

[[doc id : 1 term : foo weight : 2.2, doc id : 1 term : bar weight : 6.6]]

[[doc id : 1 term : foo weight : 2.2, doc id : 1 term : bar weight : 6.6], [doc id : 2 term : foo weight : 2.2, doc id : 2 term : bar weight : 6.6]]

[[doc id : 1 term : foo weight : 2.2, doc id : 1 term : bar weight : 6.6], [doc id : 2 term : foo weight : 2.2, doc id : 2 term : bar weight : 6.6], [doc id : 3 term : foo weight : 2.2]]

[[doc id : 1 term : foo weight : 2.2, doc id : 1 term : bar weight : 6.6], [doc id : 2 term : foo weight : 2.2, doc id : 2 term : bar weight : 6.6], [doc id : 3 term : foo weight : 2.2], [doc id : 4 term : zoo weight : 0.15]]

[[doc id : 1 term : foo weight : 2.2, doc id : 1 term : bar weight : 6.6], [doc id : 2 term : foo weight : 2.2, doc id : 2 term : bar weight : 6.6], [doc id : 3 term : foo weight : 2.2], [doc id : 4 term : zoo weight : 0.15], [doc id : 7 term : zoo weight : 1.5]]


I don't understand what is wrong with this function.

Best answer

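In the loop that builds the result, you print the entire list of sublists every time a new sublist is added, so each earlier sublist shows up again on every later line. The split itself is correct; only the printing repeats.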

Instead, depending on what you need, print only once, after the loop has finished filling the list of sublists.
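As a minimal sketch of that fix (not the asker's exact code): the class name SplitDicoSketch, the method name splitByDocId, and the trimmed-down Dico stand-in below are all hypothetical and exist only to keep the example self-contained. The method returns the sublists instead of printing them, and the caller prints each sublist exactly once after the split is complete. It assumes the input is already sorted by document id, as in the question.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class SplitDicoSketch {

    // Hypothetical, trimmed-down stand-in for the question's Dico class.
    static class Dico implements Comparable<Dico> {
        private final int docId;
        private final String term;

        Dico(int docId, String term) {
            this.docId = docId;
            this.term = term;
        }

        int getDocId() {
            return docId;
        }

        @Override
        public int compareTo(Dico other) {
            return Integer.compare(docId, other.docId);
        }

        @Override
        public String toString() {
            return docId + ":" + term;
        }
    }

    // Splits a list that is already sorted by doc id into one sublist per doc id.
    // Nothing is printed here; the caller decides when and what to print.
    static List<List<Dico>> splitByDocId(List<Dico> sorted) {
        List<List<Dico>> sublists = new ArrayList<>();
        int start = 0; // index where the current run of equal doc ids begins
        for (int i = 1; i <= sorted.size(); i++) {
            boolean endOfRun = i == sorted.size()
                    || sorted.get(i).getDocId() != sorted.get(start).getDocId();
            if (endOfRun) {
                sublists.add(sorted.subList(start, i));
                start = i;
            }
        }
        return sublists;
    }

    public static void main(String[] args) {
        List<Dico> list = Arrays.asList(
                new Dico(1, "foo"), new Dico(7, "zoo"), new Dico(2, "foo"),
                new Dico(3, "foo"), new Dico(1, "bar"), new Dico(4, "zoo"),
                new Dico(2, "bar"), new Dico(3, "baz"));
        Collections.sort(list);

        // Print after the split is finished: one line per sublist, no repeats.
        for (List<Dico> sublist : splitByDocId(list)) {
            System.out.println(sublist);
        }
    }
}
```

If the intermediate state is still wanted while debugging, printing only the sublist that was just added (rather than the whole list of sublists) inside the loop avoids the repetition as well.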
