Web crawler crawl error

Question


I am working through the simple scraping tutorial on the official Scrapy website, but I am getting some errors. This is my first time doing this, so it is all new to me. I need to implement a web crawler in my application, and Scrapy seemed to fit my needs, so I started with the tutorial and ended up with the error shown below. Can someone explain what is wrong with the code?

THIS IS MY CODE

from scrapy.spider import Spider

class DmozSpider(Spider):
    name = "dmoz"
    allowed_domains = ["dmoz.org"]
    start_urls = [
        "http://www.dmoz.org/Computers/Programming/Languages/Python/Books/",
        "http://www.dmoz.org/Computers/Programming/Languages/Python/Resources/"
    ]

    def parse(self, response):
   filename = response.url.split("/")[-2]
   open(filename, 'wb').write(response.body)

THIS IS THE ERROR I GET

2014-02-04 10:45:51+0530 [scrapy] DEBUG: Web service listening on 0.0.0.0:6080
2014-02-04 10:45:51+0530 [dmoz] DEBUG: Crawled (200) <GET http://www.dmoz.org/Computers/Programming/Languages/Python/Resources/> (referer: None)

[dmoz] ERROR: Spider error processing <GET http://www.dmoz.org/Computers/Programming/Languages/Python/Resources/>
    Traceback (most recent call last):
      File "/usr/lib/python2.7/dist-packages/twisted/internet/base.py", line 1178, in mainLoop
        self.runUntilCurrent()
      File "/usr/lib/python2.7/dist-packages/twisted/internet/base.py", line 800, in runUntilCurrent
        call.func(*call.args, **call.kw)
      File "/usr/lib/python2.7/dist-packages/twisted/internet/defer.py", line 362, in callback
        self._startRunCallbacks(result)
      File "/usr/lib/python2.7/dist-packages/twisted/internet/defer.py", line 458, in _startRunCallbacks
        self._runCallbacks()
    --- <exception caught here> ---
      File "/usr/lib/python2.7/dist-packages/twisted/internet/defer.py", line 545, in _runCallbacks
        current.result = callback(current.result, *args, **kw)
      File "/usr/local/lib/python2.7/dist-packages/scrapy/spider.py", line 56, in parse
        raise NotImplementedError
    exceptions.NotImplementedError:


python-2.7 web-crawler scrapy

y. dixit · Feb 04 '14 at 5:51

Guy Gavriely · Accepted Answer · 2014-02-04T06:27:25+0000

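Indentation matters in Python: parse must be indented correctly so that it is a method of the DmozSpider class. If it is not, the spider has no parse of its own, and the base Spider.parse is called instead, which raises NotImplementedError.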
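The effect of the indentation can be reproduced without Scrapy installed. The sketch below is only an illustration under stated assumptions: the `Spider` class here is a hypothetical stand-in for Scrapy's base class (not the real one), `BrokenSpider` is an invented name, the response is a fake object carrying only a `url`, and `parse` returns the file name instead of writing a file so the example has no side effects.

```python
from types import SimpleNamespace


class Spider(object):
    """Hypothetical stand-in for Scrapy's base Spider (not the real class)."""

    def parse(self, response):
        # Default behaviour: subclasses are expected to override this.
        raise NotImplementedError


class BrokenSpider(Spider):
    name = "dmoz"


def parse(self, response):
    # Dedented to module level: this is a plain function,
    # NOT a method of BrokenSpider, so nothing is overridden.
    return response.url.split("/")[-2]


class DmozSpider(Spider):
    name = "dmoz"

    def parse(self, response):
        # Indented inside the class body, so it overrides Spider.parse.
        return response.url.split("/")[-2]


# Fake response exposing only the attribute the example reads.
response = SimpleNamespace(
    url="http://www.dmoz.org/Computers/Programming/Languages/Python/Books/"
)

try:
    BrokenSpider().parse(response)
    broken_result = "no error"
except NotImplementedError:
    broken_result = "NotImplementedError"

fixed_result = DmozSpider().parse(response)

print(broken_result)  # NotImplementedError
print(fixed_result)   # Books
```

In the real spider the same rule applies: the two lines that compute `filename` and write `response.body` need to sit one level deeper than `def parse`, and `def parse` itself one level deeper than `class DmozSpider`.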
