python - How do I fix "TypeError: 'WikipediaItem' object does not support item assignment"?


I'm new to Python and Scrapy. I want to scrape some data from Wikipedia, but things didn't work out: every time I run scrapy crawl wiki, I get "TypeError: 'WikipediaItem' object does not support item assignment". How do I fix this so I can scrape the details from Wikipedia?

Anyway, here's my code:

    from scrapy.spider import BaseSpider
    from scrapy.selector import HtmlXPathSelector
    from wikipedia.items import WikipediaItem

    class WikipediaItem(BaseSpider):
        name = "wiki"
        allowed_domains = ["wikipedia.org"]
        start_urls = ["http://en.wikipedia.org/wiki/Main_Page"]

        def parse(self, response):
            hxs = HtmlXPathSelector(response)
            sites = hxs.select('//table[@id="mp-upper"]/tr')
            items = []
            for site in sites:
                item = WikipediaItem()
                item['title'] = site.select('.//a[@class="MainPageBG"]/text()').extract()
                item['link'] = site.select('.//a[@class="MainPageBG"]').extract()
                item['details'] = site.select('.//p/text()').extract()
                items.append(item)
            return items

And here's the result I get:

    2013-04-18 23:56:54+0800 [scrapy] INFO: Scrapy 0.14.4 started (bot: wikipedia)
    2013-04-18 23:56:54+0800 [scrapy] DEBUG: Enabled extensions: LogStats, TelnetConsole, CloseSpider, WebService, CoreStats, MemoryUsage, SpiderState
    2013-04-18 23:56:54+0800 [scrapy] DEBUG: Enabled downloader middlewares: HttpAuthMiddleware, DownloadTimeoutMiddleware, UserAgentMiddleware, RetryMiddleware, DefaultHeadersMiddleware, RedirectMiddleware, CookiesMiddleware, HttpCompressionMiddleware, ChunkedTransferMiddleware, DownloaderStats
    2013-04-18 23:56:54+0800 [scrapy] DEBUG: Enabled spider middlewares: HttpErrorMiddleware, OffsiteMiddleware, RefererMiddleware, UrlLengthMiddleware, DepthMiddleware
    2013-04-18 23:56:54+0800 [scrapy] DEBUG: Enabled item pipelines:
    2013-04-18 23:56:54+0800 [wiki] INFO: Spider opened
    2013-04-18 23:56:54+0800 [wiki] INFO: Crawled 0 pages (at 0 pages/min), scraped 0 items (at 0 items/min)
    2013-04-18 23:56:54+0800 [scrapy] DEBUG: Telnet console listening on 0.0.0.0:6023
    2013-04-18 23:56:54+0800 [scrapy] DEBUG: Web service listening on 0.0.0.0:6080
    2013-04-18 23:56:56+0800 [wiki] DEBUG: Crawled (200) <GET http://en.wikipedia.org/wiki/Main_Page> (referer: None)
    2013-04-18 23:56:56+0800 [wiki] ERROR: Spider error processing <GET http://en.wikipedia.org/wiki/Main_Page>
        Traceback (most recent call last):
          File "/usr/lib/python2.7/dist-packages/twisted/internet/base.py", line 1178, in mainLoop
            self.runUntilCurrent()
          File "/usr/lib/python2.7/dist-packages/twisted/internet/base.py", line 800, in runUntilCurrent
            call.func(*call.args, **call.kw)
          File "/usr/lib/python2.7/dist-packages/twisted/internet/defer.py", line 368, in callback
            self._startRunCallbacks(result)
          File "/usr/lib/python2.7/dist-packages/twisted/internet/defer.py", line 464, in _startRunCallbacks
            self._runCallbacks()
        --- <exception caught here> ---
          File "/usr/lib/python2.7/dist-packages/twisted/internet/defer.py", line 551, in _runCallbacks
            current.result = callback(current.result, *args, **kw)
          File "/home/jean/wiki/wikipedia/spiders/wikipedia_spider.py", line 17, in parse
            item['title'] = row.select('.//a[@class="MainPageBG"]/text()').extract()
        exceptions.TypeError: 'WikipediaItem' object does not support item assignment
    2013-04-18 23:56:56+0800 [wiki] INFO: Closing spider (finished)
    2013-04-18 23:56:56+0800 [wiki] INFO: Dumping spider stats:
        {'downloader/request_bytes': 215,
         'downloader/request_count': 1,
         'downloader/request_method_count/GET': 1,
         'downloader/response_bytes': 17762,
         'downloader/response_count': 1,
         'downloader/response_status_count/200': 1,
         'finish_reason': 'finished',
         'finish_time': datetime.datetime(2013, 4, 18, 15, 56, 56, 244255),
         'scheduler/memory_enqueued': 1,
         'spider_exceptions/TypeError': 1,
         'start_time': datetime.datetime(2013, 4, 18, 15, 56, 54, 592948)}
    2013-04-18 23:56:56+0800 [wiki] INFO: Spider closed (finished)
    2013-04-18 23:56:56+0800 [scrapy] INFO: Dumping global stats:
        {'memusage/max': 28065792, 'memusage/startup': 28065792}

And here's my items.py:

    from scrapy.item import Item, Field

    class WikipediaItem(Item):
        title = Field()
        link = Field()
        details = Field()

You named your spider class the same as the WikipediaItem you imported:

    from wikipedia.items import WikipediaItem

    class WikipediaItem(BaseSpider):
        # ...
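A class statement simply rebinds the name in the module, so the import is silently shadowed. You can reproduce the same effect without Scrapy at all; here's a minimal sketch (Thing is a made-up name for illustration):

    # A minimal, Scrapy-free reproduction of the same shadowing bug.
    from collections import OrderedDict as Thing  # Thing supports item assignment

    class Thing(object):  # this class statement rebinds the name Thing
        pass

    t = Thing()    # instantiates the class above, not OrderedDict
    t['key'] = 1   # TypeError: 'Thing' object does not support item assignment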

So inside parse(), WikipediaItem refers to your BaseSpider subclass, not to the Item class defined in wikipedia.items. You probably want to rename the spider class:

    class WikipediaSpider(BaseSpider):
        # ...
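For completeness, here's your spider with just that rename applied, an untested sketch using the same Scrapy 0.14-era API and XPaths from your question:

    from scrapy.spider import BaseSpider
    from scrapy.selector import HtmlXPathSelector
    from wikipedia.items import WikipediaItem

    class WikipediaSpider(BaseSpider):  # no longer shadows the imported Item
        name = "wiki"
        allowed_domains = ["wikipedia.org"]
        start_urls = ["http://en.wikipedia.org/wiki/Main_Page"]

        def parse(self, response):
            hxs = HtmlXPathSelector(response)
            sites = hxs.select('//table[@id="mp-upper"]/tr')
            items = []
            for site in sites:
                item = WikipediaItem()  # now the Item class from items.py
                item['title'] = site.select('.//a[@class="MainPageBG"]/text()').extract()
                item['link'] = site.select('.//a[@class="MainPageBG"]').extract()
                item['details'] = site.select('.//p/text()').extract()
                items.append(item)
            return items

With the rename, WikipediaItem() builds a dict-like Item instance, so item['title'] = ... works as expected.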
