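I am trying to find groups of 10 or more duplicates in field [1] (the name entity), but the output only shows the duplicates for name entity = "rick ross". The program does not go on to read the rows that follow the "rick ross" name entity.

Here is the code: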

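import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.FileNotFoundException;
import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.ArrayList;
import java.util.List;

public class EventDetectioncopy {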
    public static void main(String[] args) throws FileNotFoundException, IOException{
        System.out.print("Enter a name for new Tweet Cluster (csv file): ");
        BufferedReader scanFile = new BufferedReader(new InputStreamReader(System.in));
        String newFile = scanFile.readLine();

        try {
            eventDetection(newFile);
        }
        catch (FileNotFoundException e) {
            System.out.println(e);
        }
        catch (IOException e){

        }
    }

    public static void eventDetection(String filename) throws FileNotFoundException, IOException{
        String csv = "1day/clusters.sortedby.clusterid.csv";
        FileWriter newCsv = new FileWriter(filename + "." + "csv");
        BufferedWriter newCsvBW = new BufferedWriter(newCsv);
        BufferedReader reader = new BufferedReader(new FileReader(csv));
        String data;

        try{
            String temp = null;
            List<String> tempList = new ArrayList<String>();
            List<String> list = new ArrayList<String>();

            while((data = reader.readLine()) != null)
            {
                String[] splitText = data.split(",");
                //list = Arrays.asList(splitText);
                String nameEntity = splitText[1];
                if(temp != null)
                {
                    if(!(nameEntity.equals(temp)))
                    {
                        if(tempList.size() >= 10)
                        {
                            for(int i = 0; i < tempList.size(); i ++)
                            {
                                newCsvBW.append(tempList.get(i));
                                newCsvBW.append("\n");
                                System.out.println(tempList.get(i));
                            }
                            break;
                        }
                    }
                    else
                    {
                        tempList.add(data);
                    }
                }
                else
                {
                    temp = nameEntity;
                    tempList.add(data);
                }

            }
        }
        finally
        {
            reader.close();
            newCsvBW.close();
        }

    }
}


Here is some of the content of the csv file:

[clusterid],[name entity],[tweetid],[timestamp],[userid],[tweet tokens],[tweet text]

1   rick ross   2.5582E+17  1.34983E+12 389746870   rick ross dice pineappl Rick Ross x diced pineapples
1   rick ross   2.5582E+17  1.34983E+12 56082039    dice pineappl uhhh rick ross voic   Diced Pineapples. UHHH *Rick Ross voice*
1   rick ross   2.55821E+17 1.34983E+12 870278689   rick ross trend Why is Rick Ross trending?
1   rick ross   2.55822E+17 1.34983E+12 379948188   lmfao rick ross grunt   Lmfao he did that rick ross grunt .
1   rick ross   2.55822E+17 1.34983E+12 276594374   play rick ross  they played w| rick ross !
1   rick ross   2.55822E+17 1.34983E+12 386219877   rick ross ugli  Rick Ross So Ugly ..
1   rick ross   2.55822E+17 1.34983E+12 53327754    wanna play rick ross belli  I Wanna Play in Rick Ross Belly..!
1   rick ross   2.55824E+17 1.34983E+12 19690034    rick ross dice pineappl ft wale amp drake video via laleak  Rick Ross - Diced Pineapples ft. Wale &amp; Drake (Video) via @laleakers
1   rick ross   2.55825E+17 1.34983E+12 357250991   husband rick ross   where my husband rick ross
1   rick ross   2.55825E+17 1.34983E+12 53734179    throw rick ross kirko bangz *Throws Rick ross At Kirko Bangz*
1   rick ross   2.55825E+17 1.34983E+12 462179553   rick ross stay fresh    Rick Ross Stay Fresh!!!!
2   tyler oakley    2.55821E+17 1.34983E+12 867420925   know someth trend new asktyl tyleroakley live   HOW DO YOU KNOW WHEN SOMETHING IS TRENDING? IM NEW TO THIS... #aSKTYLER
2   tyler oakley    2.55822E+17 1.34983E+12 504044044   asktyl get perfect quiff tyleroakley live   #AskTyler How do you get a perfect quiff :)?
2   tyler oakley    2.55822E+17 1.34983E+12 709347721   asktyl realli homework right now tyleroakley live   #asktyler i really should be doing homework right now
2   tyler oakley    2.55822E+17 1.34983E+12 171667747   obsess right now asktyl tyleroakley live    what is your obsession right now? #asktyler
3   wiz khalifa 2.5582E+17  1.34983E+12 588829718   dont like wiz khalifa look sexi I don't like Wiz Khalifa but he looks sexy.
3   wiz khalifa 2.55856E+17 1.34984E+12 502086440   feel like wiz khalifa right now I feel like wiz Khalifa right now..
3   wiz khalifa 2.55866E+17 1.34984E+12 446056049   like wiz khalifa hes ador realli look like hot cheeto man thingi    I like Wiz Khalifa he's adorable, but he really do look like the hot cheeto man thingy
3   wiz khalifa 2.55883E+17 1.34984E+12 67747115    np ne yo ft wiz khalifa dont make em like   #Np Ne-Yo ft. Wiz Khalifa - They don't make em like you


How do I keep reading the following rows of field [1] in the csv file after the first name entity ends?

Best answer

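At the end of the if block

if(!(nameEntity.equals(temp)))


(after the inner size check and its for loop), remove the break, which ends the while loop as soon as the first group has been written, and start the next group by adding:

 tempList.clear();
 temp = nameEntity;
 tempList.add(data);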


Edit:

Based on the input from the comments, replace the while loop with the following do-while:

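    do {
        data = reader.readLine();
        String nameEntity = null;
        if (data != null) {
            String[] splitText = data.split(",");
            nameEntity = splitText[1];
        }
        if (temp != null) {
            // End of input or a new name entity: the current group is complete.
            if (data == null || !(nameEntity.equals(temp))) {
                // Write the group out only if it has at least 10 rows.
                if (tempList.size() >= 10) {
                    for (int i = 0; i < tempList.size(); i++) {
                        newCsvBW.append(tempList.get(i));
                        newCsvBW.append("\n");
                        System.out.println(tempList.get(i));
                    }
                }
                // Start collecting the next group.
                tempList.clear();
                temp = nameEntity;
            }
        } else {
            // First row: remember its name entity.
            temp = nameEntity;
        }
        // Do not add the null that marks the end of the file.
        if (data != null) {
            tempList.add(data);
        }
    } while (data != null);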
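
The same grouping logic can be pulled out of the file handling so it is easier to check on its own. What follows is only a sketch, not the asker's code: the class name ConsecutiveRuns and the method keepLargeRuns are hypothetical, the rows are assumed to be comma-separated with the name entity in field [1], and rows with fewer than two fields are assumed safe to skip.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ConsecutiveRuns {

    // Hypothetical helper: returns the rows that belong to a run of at least
    // minSize consecutive rows sharing the same value in field [1].
    static List<String> keepLargeRuns(List<String> rows, int minSize) {
        List<String> kept = new ArrayList<>();
        List<String> run = new ArrayList<>();
        String currentKey = null;

        for (String row : rows) {
            String[] fields = row.split(",");
            if (fields.length < 2) {
                continue; // assumption: malformed rows are ignored
            }
            String key = fields[1];
            if (!key.equals(currentKey)) {
                // The key changed: flush the finished run, then start a new one.
                if (run.size() >= minSize) {
                    kept.addAll(run);
                }
                run.clear();
                currentKey = key;
            }
            run.add(row);
        }
        // Flush the last run, which has no following row to trigger it.
        if (run.size() >= minSize) {
            kept.addAll(run);
        }
        return kept;
    }

    public static void main(String[] args) {
        // Made-up sample rows in the shape of the csv above.
        List<String> rows = Arrays.asList(
                "1,rick ross,a", "1,rick ross,b", "1,rick ross,c",
                "2,tyler oakley,d",
                "3,wiz khalifa,e", "3,wiz khalifa,f");
        for (String row : keepLargeRuns(rows, 2)) {
            System.out.println(row);
        }
    }
}
```

With a threshold of 2, the single "tyler oakley" row is dropped and the other two groups are kept. Flushing once more after the loop plays the same role as the data == null check in the do-while above: the final group has no following row to trigger the comparison.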

08-16 19:24