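I can't get the HTML text of this HTML file over FTP. I use Beautiful Soup to read HTML files over http/https, but for some reason I can't download or read this one from FTP. Please help!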

Here is the URL (it is the same address assigned to ur in the code below).

Here is my code so far.

BufferedReader reader = null;
String total = "";
String line;
String ur = "ftp://ftp.legis.state.tx.us/bills/832/billtext/html/house_resolutions/HR00001_HR00099/HR00014I.htm";
try {
    URL url = new URL(ur);
    URLConnection urlc = url.openConnection();
    InputStream is = urlc.getInputStream(); // To download
    reader = new BufferedReader(new InputStreamReader(is, "UTF-8"));
    while ((line = reader.readLine()) != null)
        total += reader.readLine();

} finally {
    if (reader != null)
        try { reader.close();
        } catch (IOException logOrIgnore) {}
}

Best answer

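This code works for me on Java 1.7.0_25. Note that you were storing only every other line, because you call reader.readLine() both in the while loop's condition and in its body.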

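import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLConnection;

// The class name is arbitrary; it is only here so the snippet compiles on its own.
public class FtpRead {
    public static void main(String[] args) throws MalformedURLException, IOException {
        BufferedReader reader = null;
        String total = "";
        String line;
        String ur = "ftp://ftp.legis.state.tx.us/bills/832/billtext/html/house_resolutions/HR00001_HR00099/HR00014I.htm";
        try {
            URL url = new URL(ur);
            URLConnection urlc = url.openConnection();
            InputStream is = urlc.getInputStream(); // To download
            reader = new BufferedReader(new InputStreamReader(is, "UTF-8"));
            // readLine() is called once per iteration, and that same line is appended.
            // Note: readLine() strips the line terminator, so total has no line breaks.
            while ((line = reader.readLine()) != null) {
                total += line;
            }
        } finally {
            if (reader != null) {
                try {
                    reader.close();
                } catch (IOException logOrIgnore) {
                }
            }
        }
    }
}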
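
To see the every-other-line effect without depending on the FTP server, here is a minimal sketch that runs both loop shapes against an in-memory StringReader. The class name ReadLineDemo, the method names readSkipping and readAll, and the sample text are made up for illustration; they are not part of the original question or answer. The fixed variant also uses a StringBuilder and re-adds a newline after each line, which is my own choice rather than something the answer does.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;

public class ReadLineDemo {

    // Mirrors the loop from the question: readLine() is called in the
    // condition AND in the body, so the line read by the condition is lost.
    public static String readSkipping(BufferedReader reader) throws IOException {
        String total = "";
        String line;
        while ((line = reader.readLine()) != null) {
            total += reader.readLine(); // second read; `line` is never used
        }
        return total;
    }

    // Mirrors the loop from the answer: one readLine() per iteration.
    // readLine() strips the terminator, so a '\n' is appended back here.
    public static String readAll(BufferedReader reader) throws IOException {
        StringBuilder total = new StringBuilder();
        String line;
        while ((line = reader.readLine()) != null) {
            total.append(line).append('\n');
        }
        return total.toString();
    }

    public static void main(String[] args) throws IOException {
        String sample = "one\ntwo\nthree\nfour\n"; // hypothetical input

        // prints "twofour": lines 1 and 3 were consumed by the condition
        System.out.println(readSkipping(new BufferedReader(new StringReader(sample))));

        // prints all four lines, each on its own line
        System.out.print(readAll(new BufferedReader(new StringReader(sample))));
    }
}
```

With an odd number of lines, the skipping loop's last body call returns null, and string concatenation turns that into the literal text "null" at the end of the result.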
