Unable to fetch some links using list comprehension within scrapy
I've written a script in Python using Scrapy to get the links from the response after making a POST request to a certain URL. The links come through perfectly when I use the following script.
Working one:
import scrapy
from scrapy.crawler import CrawlerProcess


class AftnetSpider(scrapy.Spider):
    name = "aftnet"
    base_url = "http://www.aftnet.be/MyAFT/Clubs/SearchClubs"

    def start_requests(self):
        # Send the search form as a POST request; parse() handles the response.
        yield scrapy.FormRequest(self.base_url, callback=self.parse, formdata={'regions': '1,3,4,6'})

    def parse(self, response):
        for items in response.css("dl.club-item"):
            for item in items.css("dd a[data-toggle='popover']::attr('data-url')").getall():
                # Yield one dict per link, with the URL made absolute.
                yield {"result_url": response.urljoin(item)}


if __name__ == "__main__":
    c = CrawlerProcess({
        'USER_AGENT': 'Mozilla/5.0',
    })
    c.crawl(AftnetSpider)
    c.start()
However, I want to achieve the same result using a list comprehension, but I'm getting an error.
Using list comprehension:
def parse(self, response):
    return [response.urljoin(item) for items in response.css("dl.club-item") for item in items.css("dd a[data-toggle='popover']::attr('data-url')").getall()]
I get the following error:
2019-03-08 12:45:44 [scrapy.core.scraper] ERROR: Spider must return Request, BaseItem, dict or None, got 'str' in <POST http://www.aftnet.be/MyAFT/Clubs/SearchClubs>
How can I get the links using a list comprehension within Scrapy?
python python-3.x web-scraping scrapy
asked Mar 8 at 7:03
MITHU
1 Answer
Your generator with a loop yields a single dict on every iteration:
yield {"result_url": response.urljoin(item)}
But your list comprehension returns a list of strings, and that is what the error message is complaining about. I don't know why you want a list comprehension here: your generator is much easier to understand (as shown by the fact that you got it to work and are having trouble with the list comprehension). But if you insist on doing it, what you need is a list of dicts, not strings, something like:
return [{"result_url": response.urljoin(item)} for items in response.css("dl.club-item") for item in items.css("dd a[data-toggle='popover']::attr('data-url')").getall()]
But please don't do that. Remember that readability counts. Your generator is readable; your one-liner isn't.
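If you want to check the difference without running a crawl, here is a minimal sketch in plain Python. It is not Scrapy code: `PAGE_URL`, `CLUBS` and the two function names are made up for illustration, the nested lists stand in for what the CSS selectors would return, and `urllib.parse.urljoin` stands in for `response.urljoin`. It only shows that the generator and a list comprehension of dicts produce the same items.

```python
from urllib.parse import urljoin

# Hypothetical stand-ins for the Scrapy response: the page URL and, per club
# entry, the relative links that the CSS selector would have extracted.
PAGE_URL = "http://www.aftnet.be/MyAFT/Clubs/SearchClubs"
CLUBS = [["/clubs/a", "/clubs/b"], ["/clubs/c"]]


def parse_with_generator(page_url, clubs):
    """Same shape as the working spider: yield one dict per link."""
    for links in clubs:
        for link in links:
            yield {"result_url": urljoin(page_url, link)}


def parse_with_comprehension(page_url, clubs):
    """Same items built eagerly: a list of dicts, not a list of strings."""
    return [
        {"result_url": urljoin(page_url, link)}
        for links in clubs
        for link in links
    ]


generated = list(parse_with_generator(PAGE_URL, CLUBS))
built = parse_with_comprehension(PAGE_URL, CLUBS)

print(generated == built)
print(all(isinstance(entry, dict) for entry in built))
```

Splitting the comprehension over several lines, as above, at least makes the nesting visible, but the generator version still reads more naturally to me.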
answered Mar 8 at 8:31
BoarGules