Problem description
I'm trying to do a bit of scraping in a C# application.
I am trying to access 4 pieces of information on the following page: https://smstestbed.nist.gov/vds/current
- Creation time
- Availability
- Linear X and Y coordinates
The following function is where I poll a live data feed from a remote machining tool. The problem is that, while I have been able to print 'CreationTime' to a terminal, my XPath use is horrifically clunky, and as far as this link seems to suggest, I should be able to do what I am doing in the 2 lines after my comment
"//This should be a far better way of accessing the data but for some reason the second line fails"
Unfortunately, I am getting an error saying that AvailabilityNode was null.
public static void PollNIST()
{
    string NISTSourceURL = "https://smstestbed.nist.gov/vds/current"; // Gives us a human friendly reference to the HTML
    //-------------------------------- Current (mostly) Working Version---------------------------------------------------------------------------------
    // Retrieve raw HTML
    var NISTTargetURL = NISTSourceURL;
    var NISTHttpClient = new HttpClient();
    var NISTXMLRaw = NISTHttpClient.GetStringAsync(NISTTargetURL); // Task that will hold all of the HTML / XML data as a raw string
    //Console.WriteLine(NISTXMLRaw.Result); // Prints the resulting HTML to a terminal as a debug tool (Works)
    XmlDocument CurNISTXML = new XmlDocument(); // Generate blank XML doc
    CurNISTXML.LoadXml(NISTXMLRaw.Result); // ".Result" blocks until the download finishes and returns the string, which is then parsed into the XML doc
    var elementHeader = CurNISTXML.GetElementsByTagName("Header");
    var curNISTHeader = elementHeader.Item(0);
    var creationTime = curNISTHeader.Attributes[0]; // We actually have the creationTime
    string CurNISTTime = creationTime.InnerText; // //*[@id="mtconnect content"]/ul/li[1]
    //This should be a far better way of accessing the data but for some reason the second line fails
    XmlNode AvailabilityNode = CurNISTXML.SelectSingleNode("/table[1]/tbody/tr[1]"); //*[@id="mtconnect content"]/table[1]/tbody/tr[1]/td[7] // XPath Availability
    var CurNISTStatus = AvailabilityNode.InnerText; // //*[@id="mtconnect content"]/ul/li[1]
    string CurNistX = ""; // //*[@id="mtconnect content"]/table[5]/tbody/tr/td[7]
    string CurNistY = ""; // //*[@id="mtconnect content"]/table[6]/tbody/tr/td[7]
    Console.WriteLine("-------BEGIN NIST DATA PACKET-------");
    Console.WriteLine("NIST Time : " + creationTime.InnerText);
    Console.WriteLine("NIST Status: " + CurNISTStatus);
    Console.WriteLine("NIST X Pos.: " + CurNistX);
    Console.WriteLine("NIST Y Pos.: " + CurNistY);
    Console.WriteLine("--------END NIST DATA PACKET--------");
    //var currentNIST = new NISTDataSet() // Create new instance of NISTdata object
}
Any ideas?
Recommended answer
So it turns out there was nothing wrong with how I was extracting the XML, only with my paths.
public static void PollNIST()
{
    string NISTSourceURL = "https://smstestbed.nist.gov/vds/current"; // Gives us a human friendly reference to the HTML
    // string NistXmlUrl = // Someone on stackexchange is claiming that there is another url for the XML but viewsource says otherwise
    //-------------------------------- Current (mostly) Working Version---------------------------------------------------------------------------------
    var NISTHttpClient = new HttpClient();
    var NISTXMLRaw = NISTHttpClient.GetStringAsync(NISTSourceURL); // Task that will hold all of the HTML / XML data as a raw string
    //Console.WriteLine(NISTXMLRaw.Result); // Prints the resulting HTML to a terminal as a debug tool (Works)
    XmlDocument CurNISTXML = new XmlDocument(); // Generate blank XML doc
    CurNISTXML.LoadXml(NISTXMLRaw.Result); // ".Result" blocks until the download finishes and returns the string, which is then parsed into the XML doc
    // Get CreationTime (WORKING!)
    XmlNodeList elementHeader = CurNISTXML.GetElementsByTagName("Header");
    XmlNode curNISTHeader = elementHeader.Item(0);
    XmlAttribute creationTime = curNISTHeader.Attributes[0]; // We now have the creationTime attribute
    string CurNISTTime = creationTime.InnerText; // //*[@id="mtconnect content"]/ul/li[1]
    // Get availability (WORKING!)
    XmlNodeList nodeAvailability = CurNISTXML.GetElementsByTagName("Availability");
    XmlNode availability = nodeAvailability.Item(0); // I think this is maybe a bit of a hackish / improper way to do this?
    string curNISTStatus = availability.InnerText;
    // Get linear tool X coord.
    XmlNodeList deviceStream = CurNISTXML.GetElementsByTagName("ComponentStream");
    XmlNode linearCompXStream = deviceStream.Item(4);
    string curNISTX = linearCompXStream.InnerText; // We do not need to break down the nodes any further as the value is the only text within
    // Get linear tool Y coord.
    XmlNode linearCompYStream = deviceStream.Item(5);
    string curNISTY = linearCompYStream.InnerText; // We do not need to break down the nodes any further as the value is the only text within
    Console.WriteLine("-------BEGIN NIST DATA PACKET-------");
    Console.WriteLine("NIST Time : " + creationTime.InnerText);
    Console.WriteLine("NIST Status: " + curNISTStatus);
    Console.WriteLine("NIST X Pos.: " + curNISTX);
    Console.WriteLine("NIST Y Pos.: " + curNISTY);
    Console.WriteLine("--------END NIST DATA PACKET--------");
    //var currentNIST = new NISTDataSet() // Create new instance of NISTdata object
}
This works well.
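For anyone who still wants XPath here: the original expression most likely returned null because `/table[1]/tbody/tr[1]` describes the table the browser renders, while the downloaded string is XML in which `Header` and `Availability` are real elements. A second likely obstacle, assuming the feed declares a default XML namespace on its root element, is that unprefixed names in an XPath never match namespaced elements. The sketch below is one possible way around both and is not part of the original answer. The class `NistXPathSketch`, the prefix `m`, and the attribute names `creationTime` and `name` are assumptions for illustration and should be checked against the actual feed.

```csharp
using System;
using System.Xml;

public static class NistXPathSketch
{
    // Maps the hypothetical prefix "m" to whatever default namespace the
    // root element declares. Assumes the root does declare one.
    public static XmlNamespaceManager CreateManager(XmlDocument doc)
    {
        var manager = new XmlNamespaceManager(doc.NameTable);
        manager.AddNamespace("m", doc.DocumentElement.NamespaceURI);
        return manager;
    }

    // Returns the trimmed text (or attribute value) of the first node that
    // matches the XPath, or null when nothing matches.
    public static string ReadValue(XmlDocument doc, XmlNamespaceManager manager, string xpath)
    {
        XmlNode node = doc.SelectSingleNode(xpath, manager);
        return node?.InnerText.Trim();
    }
}
```

With `CurNISTXML` loaded as in the answer, `ReadValue(CurNISTXML, manager, "//m:Header/@creationTime")` and `ReadValue(CurNISTXML, manager, "//m:Availability")` would replace the first two lookups. If each `ComponentStream` carries a `name` attribute, a path such as `//m:ComponentStream[@name='X']` selects the X axis by name instead of relying on `Item(4)`, which breaks if the order of streams changes. A null return means the path did not match, so check for it before using the value.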