How do I read a Word document with a StreamReader?

Question

I have an ASP.NET 2.0 app. I am trying to upload a file, read its lines, and display them in a text box. This works fine for a .txt file. But if I upload a Word doc, I get all kinds of gibberish (it looks like XML-based formatting) surrounding the text. Here is my code:

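    ' Requires: Imports System.IO and Imports System.Text
    Dim s As New StringBuilder()

    If FileUpload1.HasFile Then

        ' Wrap the uploaded stream in a reader; Using disposes it when done.
        Using rdr As New StreamReader(FileUpload1.FileContent)
            ' Copy the upload line by line; this only suits plain-text files.
            Do Until rdr.EndOfStream
                s.AppendLine(rdr.ReadLine())
            Loop
        End Using

        TextBox1.Text = s.ToString()

    End If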

Recommended answer

That's because the Word document file itself contains that XML-based formatting. StreamReader simply returns the raw contents of the file, markup included. You will see the same thing if you use a dumb text reader (e.g. Notepad.exe, or type from the command line) to look at what's in the file.

To extract the text from the surrounding formatting, you'll need to use software that understands the Word format (e.g. Word itself, winword.exe) to save or get the document in plain-text format.
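As a rough illustration of getting the document in plain-text form without launching Word, the sketch below pulls the paragraph text out of a .docx upload. It rests on several assumptions that are not in the original question: the file is in the .docx format (a ZIP package whose main body is the XML part word/document.xml), and the app runs on .NET Framework 4.5 or later with a reference to System.IO.Compression, so it would not work as-is on ASP.NET 2.0 or on older binary .doc files. The names DocxText and ExtractDocxText are hypothetical helpers made up for this example.

```vbnet
Imports System.IO
Imports System.IO.Compression   ' assumption: .NET 4.5+, reference System.IO.Compression
Imports System.Text
Imports System.Xml

' Hypothetical helper, not part of the original question or answer.
Module DocxText

    ' Returns the visible paragraph text of a .docx stream, one line per paragraph.
    ' Tabs, line breaks, headers, footers and footnotes are ignored in this sketch.
    Function ExtractDocxText(ByVal docxStream As Stream) As String
        Const W As String = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
        Dim sb As New StringBuilder()

        ' A .docx file is a ZIP archive; the main body lives in word/document.xml.
        Using zip As New ZipArchive(docxStream, ZipArchiveMode.Read)
            Dim entry As ZipArchiveEntry = zip.GetEntry("word/document.xml")
            If entry Is Nothing Then Throw New InvalidDataException("Not a .docx package.")

            Using part As Stream = entry.Open()
                Dim doc As New XmlDocument()
                doc.Load(part)

                Dim ns As New XmlNamespaceManager(doc.NameTable)
                ns.AddNamespace("w", W)

                ' Each w:p is a paragraph; its text is split across w:t runs.
                For Each p As XmlNode In doc.SelectNodes("//w:p", ns)
                    For Each t As XmlNode In p.SelectNodes(".//w:t", ns)
                        sb.Append(t.InnerText)
                    Next
                    sb.AppendLine()
                Next
            End Using
        End Using

        Return sb.ToString()
    End Function

End Module
```

With such a helper in place, the upload handler could call TextBox1.Text = DocxText.ExtractDocxText(FileUpload1.FileContent) for .docx uploads and keep the StreamReader loop for .txt files. Automating Word itself, as the answer suggests, remains the option that covers every format Word can open.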
