Problem description
Say I have a website which displays your marks when you input your roll number. You can also see others' marks the same way by incrementing your own roll number.
I want to create an Excel sheet to find the standard deviation of the marks (college project).
It is physically impossible for me to manually enter all the data, so I am searching for some automation method which can do this work for me and save all fields in a text file, which I can easily convert to a table.
Background details: Link to the website here.
First of all, you will need a Ruby library that can issue a POST request, such as Faraday. Then issue a POST request with a hash of parameters (the equivalent of filling in the form). In your case the name of the parameter is "regno" (look at the HTML source of the page to confirm this yourself) and the value is the roll number whose data you want to extract.
What you will have at this stage is the HTML source of the results page.
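The request step could look roughly like the sketch below. To keep it runnable without installing a gem, it uses Ruby's standard-library Net::HTTP in place of Faraday; the idea (POST a hash containing "regno") is the same. The URL and the helper names `form_fields` and `fetch_result_page` are assumptions made up for illustration, since the real address of the results page is not given here.

```ruby
require "net/http"
require "uri"

# Hypothetical address: replace with the real URL the form submits to.
RESULTS_URL = "http://example.com/results"

# Build the form fields for one roll number.
# "regno" is the parameter name mentioned above; confirm it in the page source.
def form_fields(roll_number)
  { "regno" => roll_number.to_s }
end

# POST the form and return the HTML source of the results page.
def fetch_result_page(roll_number, url = RESULTS_URL)
  response = Net::HTTP.post_form(URI.parse(url), form_fields(roll_number))
  response.body
end

# Usage (not run here because it needs the real URL):
#   html = fetch_result_page(1234567)
#   File.write("1234567.html", html)
```

Calling `fetch_result_page` in a loop over a range of roll numbers would collect one page per student. Pausing briefly between requests is a reasonable courtesy to the server.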
Results are all in roughly the same form:
<tr bgColor="#ffffff">
<td align="middle"><font face="Arial" size=2> 301</font></td>
<td align="left" ><font face="Arial" size=2>ENGLISH CORE</font></td>
<td align="left" ><font face="Arial" size=2>084 </font></td>
<td align="middle"><font face="Arial" size=2>A2</font></td>
</tr>
Only the bgColor of the tr varies, and the data of course. You need to extract all of these blocks, using a regular expression for example. Better still, use the XPath support in Nokogiri, another Ruby library. You will need to look these two up yourself.
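A minimal sketch of the regular-expression route, assuming every result row has exactly four cells laid out like the sample above (code, subject, marks, grade). The column names and the helpers `extract_results` and `to_tsv_line` are illustrative choices, not something the page defines. It relies only on the standard library; an XPath query through Nokogiri would likely cope better if the markup changes.

```ruby
# Captures everything between <tr ...> and </tr>, across line breaks.
ROW_PATTERN = %r{<tr[^>]*>(.*?)</tr>}mi
# Captures the text inside the <font> tag of one <td> cell.
CELL_PATTERN = %r{<td[^>]*>\s*<font[^>]*>(.*?)</font>\s*</td>}mi

# Assumed column meaning, based on the sample row.
Result = Struct.new(:code, :subject, :marks, :grade)

# Return one Result per table row that has exactly four matching cells.
def extract_results(html)
  rows = html.scan(ROW_PATTERN).map do |(row_html)|
    row_html.scan(CELL_PATTERN).map { |(text)| text.strip }
  end
  rows.select { |cells| cells.size == 4 }
      .map { |code, subject, marks, grade| Result.new(code, subject, marks.to_i, grade) }
end

# Format one result as a tab-separated line for a plain text file.
def to_tsv_line(roll_number, result)
  [roll_number, result.code, result.subject, result.marks, result.grade].join("\t")
end
```

Writing each `to_tsv_line` string to a file with `File.open(path, "a")` gives a text file that a spreadsheet can import as a table.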
Once you have all the data, you don't need to create an Excel sheet: Ruby is capable of doing such simple math by itself.
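For instance, the standard deviation can be computed in a few lines of plain Ruby. This sketch computes the population standard deviation (dividing by n); if the marks are treated as a sample, divide by n - 1 instead. The method names are illustrative.

```ruby
# Arithmetic mean of a non-empty list of numbers.
def mean(values)
  raise ArgumentError, "values must not be empty" if values.empty?
  values.sum.to_f / values.size
end

# Population standard deviation: square root of the mean squared deviation.
def standard_deviation(values)
  m = mean(values)
  variance = values.sum { |v| (v - m)**2 } / values.size
  Math.sqrt(variance)
end

marks = [2, 4, 4, 4, 5, 5, 7, 9]
puts standard_deviation(marks) # mean is 5, variance is 4, so this prints 2.0
```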
I recommend going through the examples for the two libraries mentioned and applying the relevant ones to your specific task. Ruby is actually a great choice for this kind of task, as its libraries are mostly good and getting started is painless. Having no programming experience will complicate things along the way, though.