This article describes how to parse HTML with VB.NET.

Problem description

I am trying to parse some data from a website to get specific items from its tables. I know that any <tr> tag with the bgcolor attribute set to #ffffff or #f4f4ff is where I want to start, and my actual data sits in the 2nd <td> within that <tr>.

Currently I have:

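Private Sub runForm()
    ' Collect every <tr> element in the loaded document
    Dim theElementCollection As HtmlElementCollection = WebBrowser1.Document.GetElementsByTagName("TR")
    For Each curElement As HtmlElement In theElementCollection
        ' GetAttribute already returns a String, so no ToString call is needed
        Dim controlValue As String = curElement.GetAttribute("bgcolor")
        MsgBox(controlValue) ' debug output
        If controlValue.Equals("#f4f4ff") OrElse controlValue.Equals("#ffffff") Then
            ' This is a row of interest; its second <td> still has to be read here
        End If
    Next
End Sub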

This code gets the TR elements that I need, but I have no idea how (or whether it is possible) to then inspect their inner elements. If it is not possible, what do you think would be the best route to take? The site does not really label any of its tables. The <td>s I am looking for basically look like:

<td><b><font size="2"><a href="/movie/?id=movieTitle.htm">The Movie</a></font></b></td>

I want to pull out "The Movie" text and add it to a text file.

Answer

Use the InnerHtml property of the HtmlElement object (curElement) you have, like this:

For Each curElement As HtmlElement In theElementCollection
    Dim controlValue As String = curElement.GetAttribute("bgcolor")
    MsgBox(controlValue) ' debug output
    If controlValue.Equals("#f4f4ff") OrElse controlValue.Equals("#ffffff") Then
        ' InnerHtml holds the markup of everything inside this <tr>
        Dim elementValue As String = curElement.InnerHtml
    End If
Next

Read the documentation of HtmlElement.InnerHtml Property for more information.
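
Since the goal in the question is the plain text "The Movie" rather than markup, it may help to note that HtmlElement also exposes an InnerText property, which returns the element's text without the surrounding tags. The following is only a minimal sketch: the module name RowTextSketch and the helper GetVisibleText are hypothetical names introduced here for illustration, and the null checks are a defensive assumption rather than something the original answer requires.

```vbnet
Imports System
Imports System.Windows.Forms

Module RowTextSketch
    ' Hypothetical helper (not part of the original answer):
    ' returns the element's text without HTML tags, or an empty string.
    Function GetVisibleText(element As HtmlElement) As String
        If element Is Nothing Then Return String.Empty

        ' InnerText drops tags such as <b>, <font> and <a>;
        ' InnerHtml would keep them.
        Dim text As String = element.InnerText
        If text Is Nothing Then Return String.Empty

        Return text.Trim()
    End Function
End Module
```

For the <td> shown in the question, InnerHtml would contain the nested <b>, <font> and <a> tags, while GetVisibleText should yield just the link text.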

UPDATE:

To get the second child of the <tr> HTML element, use a combination of FirstChild and then NextSibling, like this:

For Each curElement As HtmlElement In theElementCollection
    Dim controlValue As String = curElement.GetAttribute("bgcolor")
    MsgBox(controlValue) ' debug output
    If controlValue.Equals("#f4f4ff") OrElse controlValue.Equals("#ffffff") Then
        Dim firstChildElement As HtmlElement = curElement.FirstChild
        ' Skip rows that have no cells at all
        If firstChildElement Is Nothing Then Continue For
        Dim secondChildElement As HtmlElement = firstChildElement.NextSibling
        ' Skip rows that have only one cell
        If secondChildElement Is Nothing Then Continue For

        ' secondChildElement should be the second <td>, now get the value of the inner HTML
        Dim elementValue As String = secondChildElement.InnerHtml
    End If
Next
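
To connect this back to the original goal of writing the text to a file, here is one possible sketch rather than a definitive implementation. It swaps FirstChild/NextSibling for the Children collection (index 1 is the second child), compares the bgcolor value case-insensitively, and appends the collected text with File.AppendAllLines. The module name MovieTitleSketch, the helpers CollectSecondCellText and AppendTitles, and the file name movies.txt in the usage note are all hypothetical; the code also assumes .NET Framework 4.0 or later and that the document has finished loading before it runs.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.IO
Imports System.Windows.Forms

Module MovieTitleSketch
    ' Hypothetical helper: collects the text of the second cell
    ' of every row whose bgcolor matches one of the two colors.
    Function CollectSecondCellText(rows As HtmlElementCollection) As List(Of String)
        Dim titles As New List(Of String)
        For Each row As HtmlElement In rows
            Dim color As String = row.GetAttribute("bgcolor")
            Dim isTarget As Boolean =
                String.Equals(color, "#f4f4ff", StringComparison.OrdinalIgnoreCase) OrElse
                String.Equals(color, "#ffffff", StringComparison.OrdinalIgnoreCase)

            ' Children(1) is the second child; skip rows with fewer than two cells.
            If isTarget AndAlso row.Children.Count >= 2 Then
                Dim text As String = row.Children(1).InnerText
                If Not String.IsNullOrWhiteSpace(text) Then titles.Add(text.Trim())
            End If
        Next
        Return titles
    End Function

    ' Appends one title per line to the given file (the file is created if missing).
    Sub AppendTitles(filePath As String, titles As IEnumerable(Of String))
        File.AppendAllLines(filePath, titles)
    End Sub
End Module
```

From the form, this could be called as AppendTitles("movies.txt", CollectSecondCellText(WebBrowser1.Document.GetElementsByTagName("TR"))), ideally from the WebBrowser's DocumentCompleted handler so that the table rows already exist.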

