Cheerio - Get correct text when selector returns more than one result
https://www.cbf.com.br/futebol-brasileiro/competicoes/campeonato-brasileiro-serie-a/2018/1?ref=botao
I would like to get the marked text from the page above ("Sábado, 14 de Abril de 2018" and "16:00").
I did this in Kotlin with the jsoup library:
val date = select("div.col-sm-8 > span.text-2")[1] // Sábado, 14 de Abril de 2018
val time = select("div.col-sm-8 > span.text-2")[2] // 16:00
The query div.col-sm-8 > span.text-2 returns a list of elements, and I simply pick the right one by index.
Due to other constraints, however, I have to use JavaScript.
I tried to do the same thing with JavaScript and the Cheerio library, but it doesn't seem to work the same way, even though both libraries use jQuery-style selectors:
const scherio = require('cheerio');
const rp = require('request-promise');

/**
 * @type string
 */
const baseurl = "https://www.cbf.com.br/futebol-brasileiro/competicoes/campeonato-brasileiro-serie-a/2018/";
const turn = 190;
let totalGames = 1;
const gamesPerRound = 10;

module.exports =
    class FetchRoundsFromCbf {
        fetchRounds() {
            for (let i = 1; i <= totalGames; i++) {
                let url = baseurl.concat(i.toString());
                rp(url).then(function (html) {
                    const $ = scherio.load(html);
                    let date = $("div.col-sm-8 > span.text-2")[1];
                    let time = $("div.col-sm-8 > span.text-2")[2];
                    console.log(date.text());
                    console.log(time.text());
                });
            }
        }
    };
Gives me:
Unhandled rejection TypeError: date.text is not a function
at /home/alexandre/dev/flutter/brasileiro-parser-js/network/fetchdata/FetchRoundsFromCbf.js:32:39
at tryCatcher (/home/alexandre/dev/flutter/brasileiro-parser-js/node_modules/bluebird/js/release/util.js:16:23)
at Promise._settlePromiseFromHandler (/home/alexandre/dev/flutter/brasileiro-parser-js/node_modules/bluebird/js/release/promise.js:512:31)
at Promise._settlePromise (/home/alexandre/dev/flutter/brasileiro-parser-js/node_modules/bluebird/js/release/promise.js:569:18)
at Promise._settlePromise0 (/home/alexandre/dev/flutter/brasileiro-parser-js/node_modules/bluebird/js/release/promise.js:614:10)
at Promise._settlePromises (/home/alexandre/dev/flutter/brasileiro-parser-js/node_modules/bluebird/js/release/promise.js:694:18)
at _drainQueueStep (/home/alexandre/dev/flutter/brasileiro-parser-js/node_modules/bluebird/js/release/async.js:138:12)
at _drainQueue (/home/alexandre/dev/flutter/brasileiro-parser-js/node_modules/bluebird/js/release/async.js:131:9)
at Async._drainQueues (/home/alexandre/dev/flutter/brasileiro-parser-js/node_modules/bluebird/js/release/async.js:147:5)
at Immediate.Async.drainQueues [as _onImmediate] (/home/alexandre/dev/flutter/brasileiro-parser-js/node_modules/bluebird/js/release/async.js:17:14)
at processImmediate (timers.js:637:19)
Then I printed only the query results:
console.log(date);
console.log(time);
This is what I get:
{ type: 'tag',
  name: 'span',
  namespace: 'http://www.w3.org/1999/xhtml',
  attribs: [Object: null prototype] { class: 'text-2 p-r-20' },
  'x-attribsNamespace': [Object: null prototype] { class: undefined },
  'x-attribsPrefix': [Object: null prototype] { class: undefined },
  children:
   [ { type: 'tag',
       name: 'i',
       namespace: 'http://www.w3.org/1999/xhtml',
       attribs: [Object],
       'x-attribsNamespace': [Object],
       'x-attribsPrefix': [Object],
       children: [],
       parent: [Circular],
       prev: null,
       next: [Object] },
     { type: 'text',
       data: ' Sábado, 14 de Abril de 2018',
       parent: [Circular],
       prev: [Object],
       next: null } ],
  parent:
   { type: 'tag',
     name: 'div',
     namespace: 'http://www.w3.org/1999/xhtml',
     attribs: [Object: null prototype] { class: 'col-sm-8' },
     'x-attribsNamespace': [Object: null prototype] { class: undefined },
     'x-attribsPrefix': [Object: null prototype] { class: undefined },
     children:
      [ [Object],
        [Object],
        [Object],
        [Circular],
        [Object],
        [Object],
        [Object] ],
     parent:
      { type: 'tag',
        name: 'div',
        namespace: 'http://www.w3.org/1999/xhtml',
        attribs: [Object],
        'x-attribsNamespace': [Object],
        'x-attribsPrefix': [Object],
        children: [Array],
        parent: [Object],
        prev: [Object],
        next: [Object] },
     prev:
      { type: 'text',
        data: '\n ',
        parent: [Object],
        prev: null,
        next: [Circular] },
     next:
      { type: 'text',
        data: '\n ',
        parent: [Object],
        prev: [Circular],
        next: [Object] } },
  prev:
   { type: 'text',
     data: '\n ',
     parent:
      { type: 'tag',
        name: 'div',
        namespace: 'http://www.w3.org/1999/xhtml',
        attribs: [Object],
        'x-attribsNamespace': [Object],
        'x-attribsPrefix': [Object],
        children: [Array],
        parent: [Object],
        prev: [Object],
        next: [Object] },
     prev:
      { type: 'tag',
        name: 'span',
        namespace: 'http://www.w3.org/1999/xhtml',
        attribs: [Object],
        'x-attribsNamespace': [Object],
        'x-attribsPrefix': [Object],
        children: [Array],
        parent: [Object],
        prev: [Object],
        next: [Circular] },
     next: [Circular] },
  next:
   { type: 'text',
     data: '\n ',
     parent:
      { type: 'tag',
        name: 'div',
        namespace: 'http://www.w3.org/1999/xhtml',
        attribs: [Object],
        'x-attribsNamespace': [Object],
        'x-attribsPrefix': [Object],
        children: [Array],
        parent: [Object],
        prev: [Object],
        next: [Object] },
     prev: [Circular],
     next:
      { type: 'tag',
        name: 'span',
        namespace: 'http://www.w3.org/1999/xhtml',
        attribs: [Object],
        'x-attribsNamespace': [Object],
        'x-attribsPrefix': [Object],
        children: [Array],
        parent: [Object],
        prev: [Circular],
        next: [Object] } } }
{ type: 'tag',
  name: 'span',
  namespace: 'http://www.w3.org/1999/xhtml',
  attribs: [Object: null prototype] { class: 'text-2 p-r-20' },
  'x-attribsNamespace': [Object: null prototype] { class: undefined },
  'x-attribsPrefix': [Object: null prototype] { class: undefined },
  children:
   [ { type: 'tag',
       name: 'i',
       namespace: 'http://www.w3.org/1999/xhtml',
       attribs: [Object],
       'x-attribsNamespace': [Object],
       'x-attribsPrefix': [Object],
       children: [],
       parent: [Circular],
       prev: null,
       next: [Object] },
     { type: 'text',
       data: ' 16:00',
       parent: [Circular],
       prev: [Object],
       next: null } ],
  parent:
   { type: 'tag',
     name: 'div',
     namespace: 'http://www.w3.org/1999/xhtml',
     attribs: [Object: null prototype] { class: 'col-sm-8' },
     'x-attribsNamespace': [Object: null prototype] { class: undefined },
     'x-attribsPrefix': [Object: null prototype] { class: undefined },
     children:
      [ [Object],
        [Object],
        [Object],
        [Object],
        [Object],
        [Circular],
        [Object] ],
     parent:
      { type: 'tag',
        name: 'div',
        namespace: 'http://www.w3.org/1999/xhtml',
        attribs: [Object],
        'x-attribsNamespace': [Object],
        'x-attribsPrefix': [Object],
        children: [Array],
        parent: [Object],
        prev: [Object],
        next: [Object] },
     prev:
      { type: 'text',
        data: '\n ',
        parent: [Object],
        prev: null,
        next: [Circular] },
     next:
      { type: 'text',
        data: '\n ',
        parent: [Object],
        prev: [Circular],
        next: [Object] } },
  prev:
   { type: 'text',
     data: '\n ',
     parent:
      { type: 'tag',
        name: 'div',
        namespace: 'http://www.w3.org/1999/xhtml',
        attribs: [Object],
        'x-attribsNamespace': [Object],
        'x-attribsPrefix': [Object],
        children: [Array],
        parent: [Object],
        prev: [Object],
        next: [Object] },
     prev:
      { type: 'tag',
        name: 'span',
        namespace: 'http://www.w3.org/1999/xhtml',
        attribs: [Object],
        'x-attribsNamespace': [Object],
        'x-attribsPrefix': [Object],
        children: [Array],
        parent: [Object],
        prev: [Object],
        next: [Circular] },
     next: [Circular] },
  next:
   { type: 'text',
     data: '\n ',
     parent:
      { type: 'tag',
        name: 'div',
        namespace: 'http://www.w3.org/1999/xhtml',
        attribs: [Object],
        'x-attribsNamespace': [Object],
        'x-attribsPrefix': [Object],
        children: [Array],
        parent: [Object],
        prev: [Object],
        next: [Object] },
     prev: [Circular],
     next: null } }
I'm not very experienced with JavaScript. How can I retrieve the information I need?
javascript jquery web-scraping web-crawler cheerio
add a comment |
https://www.cbf.com.br/futebol-brasileiro/competicoes/campeonato-brasileiro-serie-a/2018/1?ref=botao
I would like to get the market text from the page above. ("Sábado, 14 de Abril de 2018" and "16:00").
I did this with kotlin and the jsoup library:
val date = select("div.col-sm-8 > span.text-2")[1] //Sábado, 14 de Abril de 2018
val time = select("div.col-sm-8 > span.text-2")[2] //16:00
This query div.col-sm-8 > span.text-2
returns an array and I simple get the right information using the index.
But due to other issues, I have to use javascript.
I tried to do the same thing using JavaScript and the Cherio library but it seems that it doesn't work the same way, even if both search mode are based in JQuery:
const scherio = require('cheerio');
const rp = require('request-promise');
/**
* @type string
*/
const baseurl = "https://www.cbf.com.br/futebol-brasileiro/competicoes/campeonato-brasileiro-serie-a/2018/";
const turn = 190;
let totalGames = 1;
const gamesPerRound = 10;
module.exports =
class FetchRoundsFromCbf
fetchRounds()
for (let i = 1; i <= totalGames; i++)
let url = baseurl.concat(i.toString());
rp(url).then(function (html)
const $ = scherio.load(html);
let date = $("div.col-sm-8 > span.text-2")[1];
let time = $("div.col-sm-8 > span.text-2")[2];
console.log(date.text());
console.log(time.text());
);
Gives me:
Unhandled rejection TypeError: date.text is not a function
at /home/alexandre/dev/flutter/brasileiro-parser-js/network/fetchdata/FetchRoundsFromCbf.js:32:39
at tryCatcher (/home/alexandre/dev/flutter/brasileiro-parser-js/node_modules/bluebird/js/release/util.js:16:23)
at Promise._settlePromiseFromHandler (/home/alexandre/dev/flutter/brasileiro-parser-js/node_modules/bluebird/js/release/promise.js:512:31)
at Promise._settlePromise (/home/alexandre/dev/flutter/brasileiro-parser-js/node_modules/bluebird/js/release/promise.js:569:18)
at Promise._settlePromise0 (/home/alexandre/dev/flutter/brasileiro-parser-js/node_modules/bluebird/js/release/promise.js:614:10)
at Promise._settlePromises (/home/alexandre/dev/flutter/brasileiro-parser-js/node_modules/bluebird/js/release/promise.js:694:18)
at _drainQueueStep (/home/alexandre/dev/flutter/brasileiro-parser-js/node_modules/bluebird/js/release/async.js:138:12)
at _drainQueue (/home/alexandre/dev/flutter/brasileiro-parser-js/node_modules/bluebird/js/release/async.js:131:9)
at Async._drainQueues (/home/alexandre/dev/flutter/brasileiro-parser-js/node_modules/bluebird/js/release/async.js:147:5)
at Immediate.Async.drainQueues [as _onImmediate] (/home/alexandre/dev/flutter/brasileiro-parser-js/node_modules/bluebird/js/release/async.js:17:14)
at processImmediate (timers.js:637:19)
Then I print only the query result:
console.log(date);
console.log(time);
I receive:
type: 'tag',
name: 'span',
namespace: 'http://www.w3.org/1999/xhtml',
attribs: [Object: null prototype] class: 'text-2 p-r-20' ,
'x-attribsNamespace': [Object: null prototype] class: undefined ,
'x-attribsPrefix': [Object: null prototype] class: undefined ,
children:
[ type: 'tag',
name: 'i',
namespace: 'http://www.w3.org/1999/xhtml',
attribs: [Object],
'x-attribsNamespace': [Object],
'x-attribsPrefix': [Object],
children: [],
parent: [Circular],
prev: null,
next: [Object] ,
type: 'text',
data: ' Sábado, 14 de Abril de 2018',
parent: [Circular],
prev: [Object],
next: null ],
parent:
type: 'tag',
name: 'div',
namespace: 'http://www.w3.org/1999/xhtml',
attribs: [Object: null prototype] class: 'col-sm-8' ,
'x-attribsNamespace': [Object: null prototype] class: undefined ,
'x-attribsPrefix': [Object: null prototype] class: undefined ,
children:
[ [Object],
[Object],
[Object],
[Circular],
[Object],
[Object],
[Object] ],
parent:
type: 'tag',
name: 'div',
namespace: 'http://www.w3.org/1999/xhtml',
attribs: [Object],
'x-attribsNamespace': [Object],
'x-attribsPrefix': [Object],
children: [Array],
parent: [Object],
prev: [Object],
next: [Object] ,
prev:
type: 'text',
data: 'n ',
parent: [Object],
prev: null,
next: [Circular] ,
next:
type: 'text',
data: 'n ',
parent: [Object],
prev: [Circular],
next: [Object] ,
prev:
type: 'text',
data: 'n ',
parent:
type: 'tag',
name: 'div',
namespace: 'http://www.w3.org/1999/xhtml',
attribs: [Object],
'x-attribsNamespace': [Object],
'x-attribsPrefix': [Object],
children: [Array],
parent: [Object],
prev: [Object],
next: [Object] ,
prev:
type: 'tag',
name: 'span',
namespace: 'http://www.w3.org/1999/xhtml',
attribs: [Object],
'x-attribsNamespace': [Object],
'x-attribsPrefix': [Object],
children: [Array],
parent: [Object],
prev: [Object],
next: [Circular] ,
next: [Circular] ,
next:
type: 'text',
data: 'n ',
parent:
type: 'tag',
name: 'div',
namespace: 'http://www.w3.org/1999/xhtml',
attribs: [Object],
'x-attribsNamespace': [Object],
'x-attribsPrefix': [Object],
children: [Array],
parent: [Object],
prev: [Object],
next: [Object] ,
prev: [Circular],
next:
type: 'tag',
name: 'span',
namespace: 'http://www.w3.org/1999/xhtml',
attribs: [Object],
'x-attribsNamespace': [Object],
'x-attribsPrefix': [Object],
children: [Array],
parent: [Object],
prev: [Circular],
next: [Object]
type: 'tag',
name: 'span',
namespace: 'http://www.w3.org/1999/xhtml',
attribs: [Object: null prototype] class: 'text-2 p-r-20' ,
'x-attribsNamespace': [Object: null prototype] class: undefined ,
'x-attribsPrefix': [Object: null prototype] class: undefined ,
children:
[ type: 'tag',
name: 'i',
namespace: 'http://www.w3.org/1999/xhtml',
attribs: [Object],
'x-attribsNamespace': [Object],
'x-attribsPrefix': [Object],
children: [],
parent: [Circular],
prev: null,
next: [Object] ,
type: 'text',
data: ' 16:00',
parent: [Circular],
prev: [Object],
next: null ],
parent:
type: 'tag',
name: 'div',
namespace: 'http://www.w3.org/1999/xhtml',
attribs: [Object: null prototype] class: 'col-sm-8' ,
'x-attribsNamespace': [Object: null prototype] class: undefined ,
'x-attribsPrefix': [Object: null prototype] class: undefined ,
children:
[ [Object],
[Object],
[Object],
[Object],
[Object],
[Circular],
[Object] ],
parent:
type: 'tag',
name: 'div',
namespace: 'http://www.w3.org/1999/xhtml',
attribs: [Object],
'x-attribsNamespace': [Object],
'x-attribsPrefix': [Object],
children: [Array],
parent: [Object],
prev: [Object],
next: [Object] ,
prev:
type: 'text',
data: 'n ',
parent: [Object],
prev: null,
next: [Circular] ,
next:
type: 'text',
data: 'n ',
parent: [Object],
prev: [Circular],
next: [Object] ,
prev:
type: 'text',
data: 'n ',
parent:
type: 'tag',
name: 'div',
namespace: 'http://www.w3.org/1999/xhtml',
attribs: [Object],
'x-attribsNamespace': [Object],
'x-attribsPrefix': [Object],
children: [Array],
parent: [Object],
prev: [Object],
next: [Object] ,
prev:
type: 'tag',
name: 'span',
namespace: 'http://www.w3.org/1999/xhtml',
attribs: [Object],
'x-attribsNamespace': [Object],
'x-attribsPrefix': [Object],
children: [Array],
parent: [Object],
prev: [Object],
next: [Circular] ,
next: [Circular] ,
next:
type: 'text',
data: 'n ',
parent:
type: 'tag',
name: 'div',
namespace: 'http://www.w3.org/1999/xhtml',
attribs: [Object],
'x-attribsNamespace': [Object],
'x-attribsPrefix': [Object],
children: [Array],
parent: [Object],
prev: [Object],
next: [Object] ,
prev: [Circular],
next: null
I'm not very good at javascript, how would I do to retrieve the information I need?
javascript jquery web-scraping web-crawler cheerio
add a comment |
https://www.cbf.com.br/futebol-brasileiro/competicoes/campeonato-brasileiro-serie-a/2018/1?ref=botao
I would like to get the market text from the page above. ("Sábado, 14 de Abril de 2018" and "16:00").
I did this with kotlin and the jsoup library:
val date = select("div.col-sm-8 > span.text-2")[1] //Sábado, 14 de Abril de 2018
val time = select("div.col-sm-8 > span.text-2")[2] //16:00
This query div.col-sm-8 > span.text-2
returns an array and I simple get the right information using the index.
But due to other issues, I have to use javascript.
I tried to do the same thing using JavaScript and the Cherio library but it seems that it doesn't work the same way, even if both search mode are based in JQuery:
const scherio = require('cheerio');
const rp = require('request-promise');
/**
* @type string
*/
const baseurl = "https://www.cbf.com.br/futebol-brasileiro/competicoes/campeonato-brasileiro-serie-a/2018/";
const turn = 190;
let totalGames = 1;
const gamesPerRound = 10;
module.exports =
class FetchRoundsFromCbf
fetchRounds()
for (let i = 1; i <= totalGames; i++)
let url = baseurl.concat(i.toString());
rp(url).then(function (html)
const $ = scherio.load(html);
let date = $("div.col-sm-8 > span.text-2")[1];
let time = $("div.col-sm-8 > span.text-2")[2];
console.log(date.text());
console.log(time.text());
);
Gives me:
Unhandled rejection TypeError: date.text is not a function
at /home/alexandre/dev/flutter/brasileiro-parser-js/network/fetchdata/FetchRoundsFromCbf.js:32:39
at tryCatcher (/home/alexandre/dev/flutter/brasileiro-parser-js/node_modules/bluebird/js/release/util.js:16:23)
at Promise._settlePromiseFromHandler (/home/alexandre/dev/flutter/brasileiro-parser-js/node_modules/bluebird/js/release/promise.js:512:31)
at Promise._settlePromise (/home/alexandre/dev/flutter/brasileiro-parser-js/node_modules/bluebird/js/release/promise.js:569:18)
at Promise._settlePromise0 (/home/alexandre/dev/flutter/brasileiro-parser-js/node_modules/bluebird/js/release/promise.js:614:10)
at Promise._settlePromises (/home/alexandre/dev/flutter/brasileiro-parser-js/node_modules/bluebird/js/release/promise.js:694:18)
at _drainQueueStep (/home/alexandre/dev/flutter/brasileiro-parser-js/node_modules/bluebird/js/release/async.js:138:12)
at _drainQueue (/home/alexandre/dev/flutter/brasileiro-parser-js/node_modules/bluebird/js/release/async.js:131:9)
at Async._drainQueues (/home/alexandre/dev/flutter/brasileiro-parser-js/node_modules/bluebird/js/release/async.js:147:5)
at Immediate.Async.drainQueues [as _onImmediate] (/home/alexandre/dev/flutter/brasileiro-parser-js/node_modules/bluebird/js/release/async.js:17:14)
at processImmediate (timers.js:637:19)
Then I print only the query result:
console.log(date);
console.log(time);
I receive:
type: 'tag',
name: 'span',
namespace: 'http://www.w3.org/1999/xhtml',
attribs: [Object: null prototype] class: 'text-2 p-r-20' ,
'x-attribsNamespace': [Object: null prototype] class: undefined ,
'x-attribsPrefix': [Object: null prototype] class: undefined ,
children:
[ type: 'tag',
name: 'i',
namespace: 'http://www.w3.org/1999/xhtml',
attribs: [Object],
'x-attribsNamespace': [Object],
'x-attribsPrefix': [Object],
children: [],
parent: [Circular],
prev: null,
next: [Object] ,
type: 'text',
data: ' Sábado, 14 de Abril de 2018',
parent: [Circular],
prev: [Object],
next: null ],
parent:
type: 'tag',
name: 'div',
namespace: 'http://www.w3.org/1999/xhtml',
attribs: [Object: null prototype] class: 'col-sm-8' ,
'x-attribsNamespace': [Object: null prototype] class: undefined ,
'x-attribsPrefix': [Object: null prototype] class: undefined ,
children:
[ [Object],
[Object],
[Object],
[Circular],
[Object],
[Object],
[Object] ],
parent:
type: 'tag',
name: 'div',
namespace: 'http://www.w3.org/1999/xhtml',
attribs: [Object],
'x-attribsNamespace': [Object],
'x-attribsPrefix': [Object],
children: [Array],
parent: [Object],
prev: [Object],
next: [Object] ,
prev:
type: 'text',
data: 'n ',
parent: [Object],
prev: null,
next: [Circular] ,
next:
type: 'text',
data: 'n ',
parent: [Object],
prev: [Circular],
next: [Object] ,
prev:
type: 'text',
data: 'n ',
parent:
type: 'tag',
name: 'div',
namespace: 'http://www.w3.org/1999/xhtml',
attribs: [Object],
'x-attribsNamespace': [Object],
'x-attribsPrefix': [Object],
children: [Array],
parent: [Object],
prev: [Object],
next: [Object] ,
prev:
type: 'tag',
name: 'span',
namespace: 'http://www.w3.org/1999/xhtml',
attribs: [Object],
'x-attribsNamespace': [Object],
'x-attribsPrefix': [Object],
children: [Array],
parent: [Object],
prev: [Object],
next: [Circular] ,
next: [Circular] ,
next:
type: 'text',
data: 'n ',
parent:
type: 'tag',
name: 'div',
namespace: 'http://www.w3.org/1999/xhtml',
attribs: [Object],
'x-attribsNamespace': [Object],
'x-attribsPrefix': [Object],
children: [Array],
parent: [Object],
prev: [Object],
next: [Object] ,
prev: [Circular],
next:
type: 'tag',
name: 'span',
namespace: 'http://www.w3.org/1999/xhtml',
attribs: [Object],
'x-attribsNamespace': [Object],
'x-attribsPrefix': [Object],
children: [Array],
parent: [Object],
prev: [Circular],
next: [Object]
type: 'tag',
name: 'span',
namespace: 'http://www.w3.org/1999/xhtml',
attribs: [Object: null prototype] class: 'text-2 p-r-20' ,
'x-attribsNamespace': [Object: null prototype] class: undefined ,
'x-attribsPrefix': [Object: null prototype] class: undefined ,
children:
[ type: 'tag',
name: 'i',
namespace: 'http://www.w3.org/1999/xhtml',
attribs: [Object],
'x-attribsNamespace': [Object],
'x-attribsPrefix': [Object],
children: [],
parent: [Circular],
prev: null,
next: [Object] ,
type: 'text',
data: ' 16:00',
parent: [Circular],
prev: [Object],
next: null ],
parent:
type: 'tag',
name: 'div',
namespace: 'http://www.w3.org/1999/xhtml',
attribs: [Object: null prototype] class: 'col-sm-8' ,
'x-attribsNamespace': [Object: null prototype] class: undefined ,
'x-attribsPrefix': [Object: null prototype] class: undefined ,
children:
[ [Object],
[Object],
[Object],
[Object],
[Object],
[Circular],
[Object] ],
parent:
type: 'tag',
name: 'div',
namespace: 'http://www.w3.org/1999/xhtml',
attribs: [Object],
'x-attribsNamespace': [Object],
'x-attribsPrefix': [Object],
children: [Array],
parent: [Object],
prev: [Object],
next: [Object] ,
prev:
type: 'text',
data: 'n ',
parent: [Object],
prev: null,
next: [Circular] ,
next:
type: 'text',
data: 'n ',
parent: [Object],
prev: [Circular],
next: [Object] ,
prev:
type: 'text',
data: 'n ',
parent:
type: 'tag',
name: 'div',
namespace: 'http://www.w3.org/1999/xhtml',
attribs: [Object],
'x-attribsNamespace': [Object],
'x-attribsPrefix': [Object],
children: [Array],
parent: [Object],
prev: [Object],
next: [Object] ,
prev:
type: 'tag',
name: 'span',
namespace: 'http://www.w3.org/1999/xhtml',
attribs: [Object],
'x-attribsNamespace': [Object],
'x-attribsPrefix': [Object],
children: [Array],
parent: [Object],
prev: [Object],
next: [Circular] ,
next: [Circular] ,
next:
type: 'text',
data: 'n ',
parent:
type: 'tag',
name: 'div',
namespace: 'http://www.w3.org/1999/xhtml',
attribs: [Object],
'x-attribsNamespace': [Object],
'x-attribsPrefix': [Object],
children: [Array],
parent: [Object],
prev: [Object],
next: [Object] ,
prev: [Circular],
next: null
I'm not very good at javascript, how would I do to retrieve the information I need?
javascript jquery web-scraping web-crawler cheerio
https://www.cbf.com.br/futebol-brasileiro/competicoes/campeonato-brasileiro-serie-a/2018/1?ref=botao
I would like to get the market text from the page above. ("Sábado, 14 de Abril de 2018" and "16:00").
I did this with kotlin and the jsoup library:
val date = select("div.col-sm-8 > span.text-2")[1] //Sábado, 14 de Abril de 2018
val time = select("div.col-sm-8 > span.text-2")[2] //16:00
This query div.col-sm-8 > span.text-2
returns an array and I simple get the right information using the index.
But due to other issues, I have to use javascript.
I tried to do the same thing using JavaScript and the Cherio library but it seems that it doesn't work the same way, even if both search mode are based in JQuery:
const scherio = require('cheerio');
const rp = require('request-promise');
/**
* @type string
*/
const baseurl = "https://www.cbf.com.br/futebol-brasileiro/competicoes/campeonato-brasileiro-serie-a/2018/";
const turn = 190;
let totalGames = 1;
const gamesPerRound = 10;
module.exports =
class FetchRoundsFromCbf
fetchRounds()
for (let i = 1; i <= totalGames; i++)
let url = baseurl.concat(i.toString());
rp(url).then(function (html)
const $ = scherio.load(html);
let date = $("div.col-sm-8 > span.text-2")[1];
let time = $("div.col-sm-8 > span.text-2")[2];
console.log(date.text());
console.log(time.text());
);
Gives me:
Unhandled rejection TypeError: date.text is not a function
at /home/alexandre/dev/flutter/brasileiro-parser-js/network/fetchdata/FetchRoundsFromCbf.js:32:39
at tryCatcher (/home/alexandre/dev/flutter/brasileiro-parser-js/node_modules/bluebird/js/release/util.js:16:23)
at Promise._settlePromiseFromHandler (/home/alexandre/dev/flutter/brasileiro-parser-js/node_modules/bluebird/js/release/promise.js:512:31)
at Promise._settlePromise (/home/alexandre/dev/flutter/brasileiro-parser-js/node_modules/bluebird/js/release/promise.js:569:18)
at Promise._settlePromise0 (/home/alexandre/dev/flutter/brasileiro-parser-js/node_modules/bluebird/js/release/promise.js:614:10)
at Promise._settlePromises (/home/alexandre/dev/flutter/brasileiro-parser-js/node_modules/bluebird/js/release/promise.js:694:18)
at _drainQueueStep (/home/alexandre/dev/flutter/brasileiro-parser-js/node_modules/bluebird/js/release/async.js:138:12)
at _drainQueue (/home/alexandre/dev/flutter/brasileiro-parser-js/node_modules/bluebird/js/release/async.js:131:9)
at Async._drainQueues (/home/alexandre/dev/flutter/brasileiro-parser-js/node_modules/bluebird/js/release/async.js:147:5)
at Immediate.Async.drainQueues [as _onImmediate] (/home/alexandre/dev/flutter/brasileiro-parser-js/node_modules/bluebird/js/release/async.js:17:14)
at processImmediate (timers.js:637:19)
Then I print only the query result:
console.log(date);
console.log(time);
I receive:
type: 'tag',
name: 'span',
namespace: 'http://www.w3.org/1999/xhtml',
attribs: [Object: null prototype] class: 'text-2 p-r-20' ,
'x-attribsNamespace': [Object: null prototype] class: undefined ,
'x-attribsPrefix': [Object: null prototype] class: undefined ,
children:
[ type: 'tag',
name: 'i',
namespace: 'http://www.w3.org/1999/xhtml',
attribs: [Object],
'x-attribsNamespace': [Object],
'x-attribsPrefix': [Object],
children: [],
parent: [Circular],
prev: null,
next: [Object] ,
type: 'text',
data: ' Sábado, 14 de Abril de 2018',
parent: [Circular],
prev: [Object],
next: null ],
parent:
type: 'tag',
name: 'div',
namespace: 'http://www.w3.org/1999/xhtml',
attribs: [Object: null prototype] class: 'col-sm-8' ,
'x-attribsNamespace': [Object: null prototype] class: undefined ,
'x-attribsPrefix': [Object: null prototype] class: undefined ,
children:
[ [Object],
[Object],
[Object],
[Circular],
[Object],
[Object],
[Object] ],
parent:
type: 'tag',
name: 'div',
namespace: 'http://www.w3.org/1999/xhtml',
attribs: [Object],
'x-attribsNamespace': [Object],
'x-attribsPrefix': [Object],
children: [Array],
parent: [Object],
prev: [Object],
next: [Object] ,
prev:
type: 'text',
data: 'n ',
parent: [Object],
prev: null,
next: [Circular] ,
next:
type: 'text',
data: 'n ',
parent: [Object],
prev: [Circular],
next: [Object] ,
prev:
type: 'text',
data: 'n ',
parent:
type: 'tag',
name: 'div',
namespace: 'http://www.w3.org/1999/xhtml',
attribs: [Object],
'x-attribsNamespace': [Object],
'x-attribsPrefix': [Object],
children: [Array],
parent: [Object],
prev: [Object],
next: [Object] ,
prev:
type: 'tag',
name: 'span',
namespace: 'http://www.w3.org/1999/xhtml',
attribs: [Object],
'x-attribsNamespace': [Object],
'x-attribsPrefix': [Object],
children: [Array],
parent: [Object],
prev: [Object],
next: [Circular] ,
next: [Circular] ,
next:
type: 'text',
data: 'n ',
parent:
type: 'tag',
name: 'div',
namespace: 'http://www.w3.org/1999/xhtml',
attribs: [Object],
'x-attribsNamespace': [Object],
'x-attribsPrefix': [Object],
children: [Array],
parent: [Object],
prev: [Object],
next: [Object] ,
prev: [Circular],
next:
type: 'tag',
name: 'span',
namespace: 'http://www.w3.org/1999/xhtml',
attribs: [Object],
'x-attribsNamespace': [Object],
'x-attribsPrefix': [Object],
children: [Array],
parent: [Object],
prev: [Circular],
next: [Object]
type: 'tag',
name: 'span',
namespace: 'http://www.w3.org/1999/xhtml',
attribs: [Object: null prototype] class: 'text-2 p-r-20' ,
'x-attribsNamespace': [Object: null prototype] class: undefined ,
'x-attribsPrefix': [Object: null prototype] class: undefined ,
children:
[ type: 'tag',
name: 'i',
namespace: 'http://www.w3.org/1999/xhtml',
attribs: [Object],
'x-attribsNamespace': [Object],
'x-attribsPrefix': [Object],
children: [],
parent: [Circular],
prev: null,
next: [Object] ,
type: 'text',
data: ' 16:00',
parent: [Circular],
prev: [Object],
next: null ],
parent:
type: 'tag',
name: 'div',
namespace: 'http://www.w3.org/1999/xhtml',
attribs: [Object: null prototype] class: 'col-sm-8' ,
'x-attribsNamespace': [Object: null prototype] class: undefined ,
'x-attribsPrefix': [Object: null prototype] class: undefined ,
children:
[ [Object],
[Object],
[Object],
[Object],
[Object],
[Circular],
[Object] ],
parent:
type: 'tag',
name: 'div',
namespace: 'http://www.w3.org/1999/xhtml',
attribs: [Object],
'x-attribsNamespace': [Object],
'x-attribsPrefix': [Object],
children: [Array],
parent: [Object],
prev: [Object],
next: [Object] ,
prev:
type: 'text',
data: 'n ',
parent: [Object],
prev: null,
next: [Circular] ,
next:
type: 'text',
data: 'n ',
parent: [Object],
prev: [Circular],
next: [Object] ,
prev:
type: 'text',
data: 'n ',
parent:
type: 'tag',
name: 'div',
namespace: 'http://www.w3.org/1999/xhtml',
attribs: [Object],
'x-attribsNamespace': [Object],
'x-attribsPrefix': [Object],
children: [Array],
parent: [Object],
prev: [Object],
next: [Object] ,
prev:
type: 'tag',
name: 'span',
namespace: 'http://www.w3.org/1999/xhtml',
attribs: [Object],
'x-attribsNamespace': [Object],
'x-attribsPrefix': [Object],
children: [Array],
parent: [Object],
prev: [Object],
next: [Circular] ,
next: [Circular] ,
next:
type: 'text',
data: 'n ',
parent:
type: 'tag',
name: 'div',
namespace: 'http://www.w3.org/1999/xhtml',
attribs: [Object],
'x-attribsNamespace': [Object],
'x-attribsPrefix': [Object],
children: [Array],
parent: [Object],
prev: [Object],
next: [Object] ,
prev: [Circular],
next: null
I'm not very good at javascript, how would I do to retrieve the information I need?
javascript jquery web-scraping web-crawler cheerio
edited Mar 9 at 0:53 by Guillermo Gutiérrez
asked Mar 8 at 19:17 by alexpfx
2 Answers
You can use eq() to get a Cheerio element by index, the same way as in jQuery.
let date = $("div.col-sm-8 > span.text-2").eq(1);
let time = $("div.col-sm-8 > span.text-2").eq(2);
eq() reduces the set of matched elements to the one at the specified zero-based index.
answered Mar 9 at 0:49 by Guillermo Gutiérrez
I got what I wanted by using slice:
let date = $("div.col-sm-8").find("span").slice(1);
let time = $("div.col-sm-8").find("span").slice(2);
console.log(date.text());
console.log(time.text());
answered Mar 8 at 20:04 by alexpfx
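The two answers differ in how many elements they keep. As in jQuery, eq(i) keeps exactly one element, while slice(start) with a single argument keeps every element from start to the end of the set, so text() on the result concatenates the text of all of them. The sketch below does not run Cheerio; it models the matched spans as a plain array of strings (a stand-in, on the assumption that Cheerio's slice follows Array.prototype.slice semantics as jQuery's does). The span texts other than ' 16:00' and the helper names eq and text are hypothetical.

```javascript
// Stand-in for the matched <span> elements: an array of their text contents.
// Only " 16:00" comes from the question's dump; the other values are made up.
const spans = ["Event", " 2019-03-08", " 16:00"];

// Hypothetical helper modelling eq(i): keep one element, zero-based,
// with negative indexes counting back from the end of the set.
const eq = (items, i) => {
  const j = i < 0 ? items.length + i : i;
  return j in items ? [items[j]] : [];
};

// Hypothetical helper modelling text(): concatenate the text of every kept element.
const text = (items) => items.join("");

const dateViaEq = text(eq(spans, 1));           // one element
const dateViaSlice = text(spans.slice(1));      // elements 1 and 2 together
const dateViaRange = text(spans.slice(1, 2));   // one element, same as eq(1)
const timeViaSlice = text(spans.slice(2));      // last element only

console.log(JSON.stringify(dateViaEq));
console.log(JSON.stringify(dateViaSlice));
console.log(JSON.stringify(dateViaRange));
console.log(JSON.stringify(timeViaSlice));
```

If the same behaviour holds on the real page, slice(2) happens to work for the last span, but slice(1) would also pick up the time; slice(1, 2) or eq(1) would isolate the date.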