Question
I have an ASP.NET 2.0 app. I am trying to upload a file, read its lines, and display them in a text box. This works fine for a .txt file, but when I upload a Word document I get all kinds of gibberish (it looks like XML-based formatting) surrounding the text. Here is my code:
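' Read the uploaded file line by line and show it in the text box.
Dim s As New StringBuilder
Dim rdr As StreamReader
If FileUpload1.HasFile Then
    ' FileContent is the raw stream of the uploaded file.
    rdr = New StreamReader(FileUpload1.FileContent)
    Do Until rdr.EndOfStream
        s.Append(rdr.ReadLine() & ControlChars.NewLine)
    Loop
    TextBox1.Text = s.ToString()
End If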
Recommended answer
That's because the Word document file itself contains that XML-based formatting. StreamReader returns the file's contents exactly as stored, so you will see the same thing if you open the file in a plain-text viewer (e.g. Notepad.exe, or the type command at the command prompt).
To extract the text from the surrounding formatting, you'll need to use software that understands the Word format (e.g. Word itself, winword.exe) to save or read the document as plain text.
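As one possible illustration of reading the document with format-aware code rather than a raw StreamReader, here is a minimal sketch. It is not the approach named above (Word itself); instead it parses the markup directly, and it assumes the upload is a Word 2003 XML (WordprocessingML) file, which is what the XML-looking output suggests. A .docx file (a ZIP package) or a binary .doc file would not load this way. The module name, the ExtractText helper, and the sample document are hypothetical and exist only for this example.

```vbnet
Imports System
Imports System.IO
Imports System.Text
Imports System.Xml

Module WordXmlTextDemo

    ' Namespace URI used by Word 2003 XML (WordprocessingML) documents.
    Private Const WordMlNamespace As String = _
        "http://schemas.microsoft.com/office/word/2003/wordml"

    ' Hypothetical helper: returns the visible text of a Word 2003 XML
    ' document, one line per paragraph (w:p), by joining its text runs (w:t).
    ' Assumption: the stream holds well-formed WordprocessingML.
    Function ExtractText(ByVal source As Stream) As String
        Dim doc As New XmlDocument()
        doc.Load(source)

        ' XPath queries need the "w" prefix bound to the Word namespace.
        Dim ns As New XmlNamespaceManager(doc.NameTable)
        ns.AddNamespace("w", WordMlNamespace)

        Dim result As New StringBuilder()
        For Each paragraph As XmlNode In doc.SelectNodes("//w:body//w:p", ns)
            For Each run As XmlNode In paragraph.SelectNodes(".//w:t", ns)
                result.Append(run.InnerText)
            Next
            result.Append(Environment.NewLine)
        Next
        Return result.ToString()
    End Function

    Sub Main()
        ' A tiny made-up document with two paragraphs, for demonstration only.
        Dim sample As String = "<?xml version=""1.0""?>" & _
            "<w:wordDocument xmlns:w=""" & WordMlNamespace & """>" & _
            "<w:body><w:p><w:r><w:t>Hello</w:t></w:r></w:p>" & _
            "<w:p><w:r><w:t>World</w:t></w:r></w:p></w:body></w:wordDocument>"

        Using ms As New MemoryStream(Encoding.UTF8.GetBytes(sample))
            ' Writes the text of the two paragraphs, each on its own line.
            Console.Write(ExtractText(ms))
        End Using
    End Sub
End Module
```

In the page from the question, the call would look something like TextBox1.Text = ExtractText(FileUpload1.FileContent), guarded by the same FileUpload1.HasFile check. It would be wise to catch XmlException there, since any upload that is not XML will fail to load.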