How do I get the value of the content attribute from a meta tag?
Problem description
I'm working on getting values from meta tags. So far I've had some success, but I'm stuck at a point where I get a meta tag like the one below:
<meta content="/images/branding/googleg/1x/googleg_standard_color_128dp.png" itemprop="image">
From this tag I'm not able to extract the URL string held in the content attribute.
What I have tried:
Regex meta = new Regex(@"<meta\s*(?:(?:\b(\w|-)+\b\s*(?:=\s*(?:""[^""]*""|'" +
@"[^']*'|[^""'<> ]+)\s*)?)*)/?\s*>");
WebClient web = new WebClient();
web.UseDefaultCredentials = true;
string page = web.DownloadString(url);
WebClient client = new WebClient();
// Add a user agent header in case the
// requested URI contains a query.
client.Headers.Add("user-agent", "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.2; .NET CLR 1.0.3705;)");
Stream data = client.OpenRead(url);
StreamReader reader = new StreamReader(data);
string s = reader.ReadToEnd();
//Console.WriteLine(s);
data.Close();
reader.Close();
MatchCollection mc = meta.Matches(s);
int mIdx = 0;
foreach (Match m in mc)
{
    for (int gIdx = 0; gIdx < m.Groups.Count; gIdx++)
    {
        metadata.Add(m.Groups[gIdx].Value);
    }
    mIdx++;
}
Any solutions?
Recommended answer
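One likely reason the attempt in the question does not yield the URL: its only capture group, (\w|-), sits inside a repeated group, so m.Groups[1].Value holds just the last character it matched, while m.Groups[0] is the entire tag. The content value is never captured on its own. Below is a minimal sketch of one way around this, assuming a two-step match: first find each meta tag, then split its attributes into name/value pairs. The names MetaTagReader and ReadMetaTags and the page address https://www.google.com/ are hypothetical and not from the question. The sketch uses only System.Text.RegularExpressions and System.Uri.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class MetaTagReader
{
    // Matches a whole <meta ...> tag; group 1 holds the raw attribute text.
    static readonly Regex MetaTag = new Regex(
        @"<meta\b([^>]*)>", RegexOptions.IgnoreCase);

    // Matches name="value", name='value' or name=value inside a tag.
    static readonly Regex Attribute = new Regex(
        @"([\w:-]+)\s*=\s*(?:""([^""]*)""|'([^']*)'|([^\s""'<>]+))");

    // Returns one attribute dictionary per <meta> tag found in the HTML.
    public static List<Dictionary<string, string>> ReadMetaTags(string html)
    {
        var tags = new List<Dictionary<string, string>>();
        foreach (Match tag in MetaTag.Matches(html))
        {
            var attributes = new Dictionary<string, string>(
                StringComparer.OrdinalIgnoreCase);
            foreach (Match a in Attribute.Matches(tag.Groups[1].Value))
            {
                // Only one of groups 2-4 takes part in a match;
                // the others yield empty strings.
                attributes[a.Groups[1].Value] =
                    a.Groups[2].Value + a.Groups[3].Value + a.Groups[4].Value;
            }
            tags.Add(attributes);
        }
        return tags;
    }

    static void Main()
    {
        // The tag from the question, used here in place of a downloaded page.
        string html = "<meta content=\"/images/branding/googleg/1x/" +
                      "googleg_standard_color_128dp.png\" itemprop=\"image\">";

        // Assumption: the address the page was downloaded from.
        var pageUri = new Uri("https://www.google.com/");

        foreach (var tag in ReadMetaTags(html))
        {
            if (tag.TryGetValue("itemprop", out string prop) && prop == "image"
                && tag.TryGetValue("content", out string content))
            {
                // Resolve the relative path against the page address.
                Console.WriteLine(new Uri(pageUri, content));
            }
        }
    }
}
```

In the question's code, the string s returned by reader.ReadToEnd() would be passed to ReadMetaTags in place of the hard-coded html, and the url variable would supply the base address. Because content holds a path relative to the site root, it has to be combined with the page address to form a full URL. Regular expressions remain fragile on real-world HTML (for example, a > inside an attribute value breaks the tag pattern, and an unquoted value directly followed by /> keeps the trailing slash), so a dedicated HTML parser such as HtmlAgilityPack is usually the more robust choice.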