I want to scrape multiple product names from a single page using Scrapy

<!-- body_text //-->

    <td width="601" valign="top">

      <table border="0" width="100%" cellspacing="0" cellpadding="0">

        <tr>

          <td><img src="images/pixel_trans.gif" border="0" alt="" width="100%" height="10"></td>

        </tr>

       <tr>

         <td class="pageHeading">Pool (Pocket Billiards) Table</td>

        </tr>

        <tr>

          <td><img src="images/pixel_trans.gif" border="0" alt="" width="100%" height="10"></td>

        </tr>

        <tr>

          <td class="main">A Victoria table is more than mere wood and slate. By paying attention to the details - the hidden differences - Victoria tables have become known name as masterpieces of original design and craftmanship, and most prestigious name in billiards.<br><br>



          These tables, available in two sizes  9’ X 4.5’ and 8’ X 4’, are made of frames with selected good quality solid wood and finely crafted rose wood legs with Mahagony polish.<br><br>

Slate Beds used are either Indian Bangalore Black Slate or Imported Slate. Slates are covered with worsted wool cloth optionally from Jupiter (China) or Strachan (West of England cloth, U.K.) to have proper speed, accuracy and responsiveness of the table to spin. Chrome nuts and adjusters  are used for leveling. It is surrounded with standard imported vulcanized 'L' shaped or 'V' shaped rubber cushions or Northern Cushions (Made in England) to cause billiard balls to rebound while minimizing the lose of kinetic energy.</td>

        </tr>



            <tr>

              <td><img src="images/pixel_trans.gif" border="0" alt="" width="100%" height="10"></td>

            </tr>

            <tr>

              <td>

                <table cellpadding="4" cellspacing="0" width="100%" border="0" class="product_box">

                  <tr>

                    <td width="50%" valign="top" class="product_name" colspan="2"><strong><a name="vs20b"></a>VS-20B</strong></td>

                  </tr>

                </table>

                <table cellpadding="4" cellspacing="4" width="100%" border="0" >

                  <tr>

                    <td width="60%" valign="top" class="product_text"><ul><li><strong>Size: 9&lsquo; X 4.5&lsquo;</strong></li><li>Rose Wood Legs</li><li>Mahgony Polish</li><li>S.B. Frame</li><li><strong>Bangalore Slate</strong></li><li>Standard Accessories</li></ul></td>

                    <td width="40%" align="center"><a href="javascript:popupWindow('images/products/vs-20bbig.jpg')"><img src="images/products/vs-20b.jpg" alt="VS-20B" border="0" width="250px"></a></td>

                  </tr>

                </table>

              </td>

            </tr>



            <tr>

              <td><img src="images/pixel_trans.gif" border="0" alt="" width="100%" height="10"></td>

            </tr>

            <tr>

              <td>

                <table cellpadding="4" cellspacing="0" width="100%" border="0" class="product_box">

                  <tr>

                    <td width="50%" valign="top" class="product_name" colspan="2"><strong><a name="vs20b"></a>VS-20C</strong></td>

                  </tr>

                </table>

                <table cellpadding="4" cellspacing="4" width="100%" border="0" >

                  <tr>

                    <td width="60%" valign="top" class="product_text"><ul><li><strong>Size: 8&lsquo; X 4&lsquo;</strong></li><li>Rose Wood Legs</li><li>Mahgony Polish</li><li>S.B. Frame</li><li><strong>Bangalore Slate</strong></li><li>Standard Accessories</li></ul></td>

                    <td width="40%" align="center"><a href="javascript:popupWindow('images/products/vs-20cbig.jpg')"><img src="images/products/vs-20c.jpg" alt="VS-20C" border="0" width="250px"></a></td>

                  </tr>

                </table>

              </td>

            </tr>



            <tr>

              <td><img src="images/pixel_trans.gif" border="0" alt="" width="100%" height="10"></td>

            </tr>

            <tr>

              <td>

                <table cellpadding="4" cellspacing="0" width="100%" border="0" class="product_box">

                  <tr>

                    <td width="50%" valign="top" class="product_name" colspan="2"><strong><a name="vs23b"></a>VS-23B</strong></td>

                  </tr>

                </table>

                <table cellpadding="4" cellspacing="4" width="100%" border="0" >

                  <tr>

                    <td width="60%" valign="top" class="product_text"><ul><li><strong>Size: 9&lsquo; X 4.5&lsquo;</strong></li><li>Rose Wood Legs</li><li>Mahgony Polish</li><li>S.A.L. Frame</li><li><strong>Imported Slate</strong></li><li>Standard Accessories</li></ul></td>

                    <td width="40%" align="center"><a href="javascript:popupWindow('images/products/vs-23bbig.jpg')"><img src="images/products/vs-23b.jpg" alt="VS-23B" border="0" width="250px"></a></td>

                  </tr>

                </table>

              </td>

            </tr>



            <tr>

              <td><img src="images/pixel_trans.gif" border="0" alt="" width="100%" height="10"></td>

            </tr>

            <tr>

              <td>

                <table cellpadding="4" cellspacing="0" width="100%" border="0" class="product_box">

                  <tr>

                    <td width="50%" valign="top" class="product_name" colspan="2"><strong><a name="vs23b"></a>VS-23C</strong></td>

                  </tr>

                </table>

                <table cellpadding="4" cellspacing="4" width="100%" border="0" >

                  <tr>

                    <td width="60%" valign="top" class="product_text"><ul><li><strong>Size: 8&lsquo; X 4&lsquo;</strong></li><li>Rose Wood Legs</li><li>Mahgony Polish</li><li>S.A.L. Frame</li><li><strong>Imported Slate</strong></li><li>Standard Accessories</li></ul></td>

                    <td width="40%" align="center"><a href="javascript:popupWindow('images/products/vs-23cbig.jpg')"><img src="images/products/vs-23c.jpg" alt="VS-23C" border="0" width="250px"></a></td>

                  </tr>

                </table>

              </td>

            </tr>



            <tr>

              <td><img src="images/pixel_trans.gif" border="0" alt="" width="100%" height="10"></td>

            </tr>

            <tr>

              <td>

                <table cellpadding="4" cellspacing="0" width="100%" border="0" class="product_box">

                  <tr>

                    <td width="50%" valign="top" class="product_name" colspan="2"><strong><a name="vs9"></a>VS-9</strong></td>

                  </tr>

                </table>

                <table cellpadding="4" cellspacing="4" width="100%" border="0" >

                  <tr>

                    <td width="60%" valign="top" class="product_text"><ul><li><strong>Size: 9&lsquo; X 4.5&lsquo;</strong></li><li>Auto Ball Return System</li><li>Pro Speed Cloth</li><li>American Pocket Size</li><li>Standard Accessories</li></ul></td>

                    <td width="40%" align="center"><a href="javascript:popupWindow('images/products/vs-9big.jpg')"><img src="images/products/vs-9.jpg" alt="VS-9" border="0" width="250px"></a></td>

                  </tr>

                </table>

              </td>

            </tr>



            <tr>

              <td><img src="images/pixel_trans.gif" border="0" alt="" width="100%" height="10"></td>

            </tr>

            <tr>

              <td>

                <table cellpadding="4" cellspacing="0" width="100%" border="0" class="product_box">

                  <tr>

                    <td width="50%" valign="top" class="product_name" colspan="2"><strong><a name="vs7"></a>VS-7</strong></td>

                  </tr>

                </table>

                <table cellpadding="4" cellspacing="4" width="100%" border="0" >

                  <tr>

                    <td width="60%" valign="top" class="product_text"><ul><li><strong>PLAYING AREA : 88" X 44"</strong></li><li><strong>OVERALL SIZE : 98"L X 54" W X 31" H</strong></li><li>Solid oak for top/brand rails, Dark cherry finish</li><li>Rams head solid rubber wood with # 6 leather drop pocket.  Easy assembly</li></ul></td>

                    <td width="40%" align="center"><a href="javascript:popupWindow('images/products/vs-7big.jpg')"><img src="images/products/vs-7.jpg" alt="VS-7" border="0" width="250px"></a></td>

                  </tr>

                </table>

              </td>

            </tr>



            <tr>

              <td><img src="images/pixel_trans.gif" border="0" alt="" width="100%" height="10"></td>

            </tr>

            <tr>

              <td>

                <table cellpadding="4" cellspacing="0" width="100%" border="0" class="product_box">

                  <tr>

                    <td width="50%" valign="top" class="product_name" colspan="2"><strong><a name="vs8"></a>VS-8/Light Oak</strong></td>

                  </tr>

                </table>

                <table cellpadding="4" cellspacing="4" width="100%" border="0" >

                  <tr>

                    <td width="60%" valign="top" class="product_text"><ul><li><strong>POOL TABLE : 8&lsquo; X 4&lsquo;</strong></li><li><strong>PLAYING AREA : 88" X 44"</strong></li><li><strong>OVERALL SIZE : 98" X 54"W X 31"H</strong></li><li>Solid oak for top/brand rails, Light oak finish</li><li>Rams head solid rubber wood with # 6 leather drop pocket, Easy assembly</li></ul></td>

                    <td width="40%" align="center"><a href="javascript:popupWindow('images/products/vs-8big.jpg')"><img src="images/products/vs-8.jpg" alt="VS-8/Light Oak" border="0" width="250px"></a></td>

                  </tr>

                </table>

              </td>

            </tr>



            <tr>

              <td><img src="images/pixel_trans.gif" border="0" alt="" width="100%" height="10"></td>

            </tr>

            <tr>

              <td>

                <table cellpadding="4" cellspacing="0" width="100%" border="0" class="product_box">

                  <tr>

                    <td width="50%" valign="top" class="product_name" colspan="2"><strong><a name="vs12"></a>VS-12</strong></td>

                  </tr>

                </table>

                <table cellpadding="4" cellspacing="4" width="100%" border="0" >

                  <tr>

                    <td width="60%" valign="top" class="product_text"><ul><li><strong>POOL TABLE : 8&lsquo; X 4&lsquo;</strong></li><li><strong>PLAYING AREA : 88" X 44"</strong></li><li><strong>OVERALL SIZE : 99-3/4"L X 55 - 3/4" W X 31" H</strong></li><li>Black laminate, pedestal legs, with drop pocket, Steel frame Easy assembly. Accessories included.</li></ul></td>

                    <td width="40%" align="center"><a href="javascript:popupWindow('images/products/vs-12big.jpg')"><img src="images/products/vs-12.jpg" alt="VS-12" border="0" width="250px"></a></td>

                  </tr>

                </table>

              </td>

            </tr>



            <tr>

              <td><img src="images/pixel_trans.gif" border="0" alt="" width="100%" height="10"></td>

            </tr>

            <tr>

              <td>

                <table cellpadding="4" cellspacing="0" width="100%" border="0" class="product_box">

                  <tr>

                    <td width="50%" valign="top" class="product_name" colspan="2"><strong><a name="vs10"></a>VS-10</strong></td>

                  </tr>

                </table>

                <table cellpadding="4" cellspacing="4" width="100%" border="0" >

                  <tr>

                    <td width="60%" valign="top" class="product_text"><ul><li><strong>POOL TABLE : 8&lsquo; X 4&lsquo;</strong></li><li><strong>PLAYING AREA : 88" X 44"</strong></li><li><strong>OVERALL SIZE : 98" L X 54"W X 31"H</strong></li><li>Solid oak for top/brand rails, oak finish</li><li>Rams head solid rubber wood with # 6 leather drop pocket, Easy assembly</li></ul></td>

                    <td width="40%" align="center"><a href="javascript:popupWindow('images/products/vs-10big.jpg')"><img src="images/products/vs-10.jpg" alt="VS-10" border="0" width="250px"></a></td>

                  </tr>

                </table>

              </td>

            </tr>



            <tr>

              <td><img src="images/pixel_trans.gif" border="0" alt="" width="100%" height="10"></td>

            </tr>

            <tr>

              <td>

                <table cellpadding="4" cellspacing="0" width="100%" border="0" class="product_box">

                  <tr>

                    <td width="50%" valign="top" class="product_name" colspan="2"><strong><a name="vs11"></a>VS-11</strong></td>

                  </tr>

                </table>

                <table cellpadding="4" cellspacing="4" width="100%" border="0" >

                  <tr>

                    <td width="60%" valign="top" class="product_text"><ul><li><strong>POOL TABLE : 8&lsquo; X 4&lsquo;</strong></li><li><strong>PLAYING AREA : 88" X 44"</strong></li><li><strong>OVERALL SIZE : 100" X 56"</strong></li><li>Solid wood for top/brand rails</li><li>Mahogany finish</li><li>Rams head solid rubber with # 6 leather drop pocket</li></ul></td>

                    <td width="40%" align="center"><a href="javascript:popupWindow('images/products/vs-11big.jpg')"><img src="images/products/vs-11.jpg" alt="VS-11" border="0" width="250px"></a></td>

                  </tr>

                </table>

              </td>

            </tr>



            <tr>

              <td><img src="images/pixel_trans.gif" border="0" alt="" width="100%" height="10"></td>

            </tr>

            <tr>

              <td>

                <table cellpadding="4" cellspacing="0" width="100%" border="0" class="product_box">

                  <tr>

                    <td width="50%" valign="top" class="product_name" colspan="2"><strong><a name="vs13"></a>VS-13</strong></td>

                  </tr>

                </table>

                <table cellpadding="4" cellspacing="4" width="100%" border="0" >

                  <tr>

                    <td width="60%" valign="top" class="product_text"><ul><li><strong>POOL TABLE : 8&lsquo; X 4&lsquo;</strong></li><li><strong>PLAYING AREA : 88" X 44"</strong></li><li><strong>OVERALL SIZE : 100" X 56"</strong></li><li>Solid wood for top/brand rails,</li><li>Dark cherry finish</li><li>Rams head solid rubber wood<br />
<br />
with # 6 leather drop pocket</li></ul></td>

                    <td width="40%" align="center"><a href="javascript:popupWindow('images/products/vs-13big.jpg')"><img src="images/products/vs-13.jpg" alt="VS-13" border="0" width="250px"></a></td>

                  </tr>

                </table>

              </td>

            </tr>


            <tr>

          <td><img src="images/pixel_trans.gif" border="0" alt="" width="100%" height="10"></td>

        </tr>

        <tr>

          <td>

            <table cellpadding="4" cellspacing="0" width="100%" border="0">

              <tr>

                <td width="50%" valign="top" class="product_name1" colspan="2"><strong>Standard Accessories for Pool</strong></td>

              </tr>

            </table>

            <table cellpadding="4" cellspacing="4" width="100%" border="0" class="product_box1">

              <tr>

                <td width="50%" valign="top" class="product_text">

                <ul>

                  <li>Aramith Pool Ball 2.1/4" or 2.1/16"</li>

                  <li>Table Brush</li>

                  <li>60" Rest Stick C/W Brass Cross Head Rest</li>

                  <li>Wall Cue Rack</li>

                </ul></td>

                <td width="50%" valign="top" class="product_text">

                <ul>

                  <li>Plastic Triangle</li>

                  <li>Triangle Chalk X 12 Pcs.</li>

                  <li>Pool House Cue X 4 Pcs.</li>

                  <li>Table Cover</li>

                  <li>Round Type Lamp Shade X 2 Pcs.</li>

                </ul></td>

              </tr>

            </table>

          </td>

        </tr>

    </table></td>

<!-- body_text_eof //-->

     <td width="45" valign="top">

      <table border="0" width="45" cellspacing="0" cellpadding="0">

<!-- right_navigation //-->


As the code shows, the field I want to look up with XPath is: td[@class='product_name']/strong/a/@name

I also need to extract the image from this XPath: td[@align='center']/a/img/@src

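I am exporting the data as CSV, and at the moment my scraper stores all of the product names in a single cell. I am trying to get it to store each product name and each image URL in its own cell of the CSV.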

I tried using a loop for this, but could not get it to work.
My code:

  def parse(self, response):
   hxs = HtmlXPathSelector(response)
   titles = hxs.select("//head")
   items = []
   item = item()

   for i in range(0,5):

     item ["productname"] = titles.select("//td[@class='product_name'][i]/strong").extract()
     item ["imgurl"] = titles.select("//td[@align='center'][i]/a/img/@src").extract()


     items.append(item)
     return(items)

Best answer

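# Extract all names and all image URLs as plain strings, in document order.
names = hxs.xpath('//td[@class="product_name"]/strong/text()').extract()
imageurls = hxs.xpath('//tr/td[@align="center"]/a/img/@src').extract()
for name, url in zip(names, imageurls):
    # Create a fresh item per product; reusing one object would leave every
    # yielded item holding the values of the last product.
    item = {}  # assumption: a plain dict; substitute your own Item class if you defined one
    item["productname"] = name
    item["imgurl"] = url
    yield item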


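This is the simplest approach: both queries return their results in document order, so the i-th name and the i-th image URL belong to the same product.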
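To make the pairing and the CSV layout concrete, here is a minimal sketch that uses only the standard library. It assumes the two XPath queries have already returned plain string lists (the values below are copied from the first three products in the sample HTML). The helper names `rows_from_columns` and `to_csv` are hypothetical, not part of Scrapy. Because `zip()` silently stops at the shorter list, the sketch checks the lengths first so a product without an image cannot shift the pairing unnoticed.

```python
import csv
import io


def rows_from_columns(names, image_urls):
    """Pair the i-th name with the i-th image URL, one dict per product."""
    if len(names) != len(image_urls):
        # zip() would silently drop the unmatched tail, so fail loudly instead.
        raise ValueError("names and image URLs differ in length")
    return [
        {"productname": name.strip(), "imgurl": url}
        for name, url in zip(names, image_urls)
    ]


def to_csv(rows):
    """Write one CSV row per product, with one value per cell."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["productname", "imgurl"])
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical input: what the two queries would return for the first
# three products in the sample HTML.
names = ["VS-20B", "VS-20C", "VS-23B"]
image_urls = [
    "images/products/vs-20b.jpg",
    "images/products/vs-20c.jpg",
    "images/products/vs-23b.jpg",
]

rows = rows_from_columns(names, image_urls)
print(to_csv(rows))
```

Each dict corresponds to one yielded item, and each item becomes one CSV row, which is what keeps the product names from collapsing into a single cell.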

For "xpath - Loop to scrape multiple elements on the same page while storing them separately", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/24183258/
