An unofficial Node.js wrapper for the ScrapingBee API.
SPB unofficial wrapper is a Node.js module whose sole purpose is to provide a simple way to access ScrapingBee's API from Node.js.
DISCLAIMER: This is not an official ScrapingBee product.
npm install spb-unofficial-wrapper
To use this module you have to first create an account on ScrapingBee to get an API key.
The simplest way to use the module is the following:
const Scraper = require('spb-unofficial-wrapper')
const scraper = new Scraper('YOUR_API_KEY')
scraper.request('https://website.com')
  .get()
  .then(response => {
    console.log(response.data)
    console.log(response.cost)
  })
  .catch(err => {
    console.log(err.message)
    console.log(err.statusCode)
  })
You can configure your scraper by setting a default configuration that will be used in all subsequent requests:
const Scraper = require('spb-unofficial-wrapper')
const configuration = {....} // Please check below for detailed documentation about available settings
const scraper = new Scraper('YOUR_API_KEY', configuration)
The configuration contains the following values by default:
{
  "request": {
    "cookies": [],
    "headers": []
  },
  "block": {
    "ads": true,
    "resources": true
  },
  "settings": {
    "premiumProxy": false,
    "countryCode": ""
  },
  "javascript": {
    "render": true,
    "snippet": "",
    "waitForLoad": 0,
    "responseWithoutRunningJs": false
  },
  "css": {
    "waitForSelector": ""
  }
}
If you provide only one value for a nested object, e.g. only the render flag in the javascript settings, the other values keep their defaults. So:
{
  "javascript": {
    "render": false
  }
}
Becomes:
{
  "javascript": {
    "render": false,
    "snippet": "",
    "waitForLoad": 0,
    "responseWithoutRunningJs": false
  }
}
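The merge behaviour described above can be sketched as a per-section shallow merge. This is only an illustration of the idea, not the module's actual implementation: `mergeConfig` is a hypothetical helper, and only two of the default sections are reproduced to keep the example short.

```javascript
// Two of the default sections, copied from the defaults listed above
const defaults = {
  javascript: {
    render: true,
    snippet: '',
    waitForLoad: 0,
    responseWithoutRunningJs: false
  },
  css: {
    waitForSelector: ''
  }
}

// Hypothetical helper (not part of spb-unofficial-wrapper):
// for every section, values from `overrides` replace the defaults,
// and keys that are not overridden keep their default value
function mergeConfig (base, overrides) {
  const merged = {}
  for (const section of Object.keys(base)) {
    merged[section] = { ...base[section], ...(overrides[section] || {}) }
  }
  return merged
}

const merged = mergeConfig(defaults, { javascript: { render: false } })
console.log(merged.javascript)
```

Only `render` changes in the result; `snippet`, `waitForLoad` and `responseWithoutRunningJs` keep their defaults, and the untouched `css` section is copied as-is.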
You can also use the request builder to set each property of the request. See the following example showing how it works:
const Scraper = require('spb-unofficial-wrapper')
const scraper = new Scraper('YOUR_API_KEY')
scraper.request('https://website.com')
  .setAdsBlocking(true)
  .setResourcesBlocks(false)
  .setPremiumProxy(true)
  .get()
  .then(response => {
    console.log(response.data)
    console.log(response.cost)
  })
  .catch(err => {
    console.log(err.message)
    console.log(err.statusCode)
  })
If you have already passed a default configuration to the scraper, the builder functions override its values for that specific request only.
You can find detailed documentation about the builder functions and how to use them here, under the Methods section of the Builder class.
You can perform a POST request using the following code:
const Scraper = require('spb-unofficial-wrapper')
const scraper = new Scraper('YOUR_API_KEY')
scraper.request('https://website.com')
  .post({
    'email': 'user@example.com',
    'password': 'mysecretpassword'
  })
Data is sent URL-encoded, using content-type: application/x-www-form-urlencoded. All other methods of the request builder are also available for POST requests.
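To see what a URL-encoded form body looks like, here is a small sketch using Node's built-in `URLSearchParams`. It only illustrates the encoding format; whether the module uses `URLSearchParams` internally is not stated here, and the email address is a placeholder.

```javascript
// Placeholder form fields, for illustration only
const form = {
  email: 'user@example.com',
  password: 'mysecretpassword'
}

// URLSearchParams serializes key/value pairs as
// application/x-www-form-urlencoded (e.g. "@" becomes "%40")
const body = new URLSearchParams(form).toString()

console.log(body) // email=user%40example.com&password=mysecretpassword
```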
After calling the get method of a request, the returned promise resolves if the request is successful, and the response object contains the following fields:
data: The data returned from ScrapingBee
headers: The headers returned from the scraper
cost: How much the request cost in credits
statusCode: The status code that was returned to the scraper (it can be different from the one ScrapingBee returned)
resolvedURL: The URL that was scraped (useful in case of redirects)
In case of a failed request, the following fields are returned in the response object:
error: A message explaining what went wrong
cost: How much the request cost in credits (if the website returns a 404 status, the request is still charged)
statusCode: The status returned from the scraper
headers: The headers returned from the scraper
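As a rough illustration of the two shapes listed above, the sketch below formats a one-line summary for either a successful or a failed result. `describeResult` is a hypothetical helper, not part of the module, and the sample objects are hand-written; it assumes only the field names documented above.

```javascript
// Hypothetical helper (not part of spb-unofficial-wrapper).
// It relies only on the documented field names.
function describeResult (result) {
  if (typeof result.error === 'string') {
    // Failure shape: error, cost, statusCode, headers
    return `failed (${result.statusCode}), ${result.cost} credits: ${result.error}`
  }
  // Success shape: data, headers, cost, statusCode, resolvedURL
  return `ok (${result.statusCode}), ${result.cost} credits, final URL ${result.resolvedURL}`
}

// Hand-written sample objects, for illustration only
const success = {
  data: '<html></html>',
  headers: {},
  cost: 5,
  statusCode: 200,
  resolvedURL: 'https://website.com/home'
}
const failure = {
  error: 'Not found',
  cost: 1,
  statusCode: 404,
  headers: {}
}

console.log(describeResult(success))
console.log(describeResult(failure))
```

Note that a failed result still carries a `cost`, so it is worth logging credits in both branches.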
This module can also calculate the cost of a request before performing it. Keep in mind that ScrapingBee's pricing rules can change, so don't rely too much on this feature. To calculate the cost of a request, use the following code:
scraper.request('https://website.com')
  .setJavascriptRendering(true)
  .setPremiumProxy(true)
  .calculateCost() // Returns 100 because we use a premium proxy with javascript rendering
Another example would be:
scraper.request('https://website.com')
  .setJavascriptRendering(false)
  .setPremiumProxy(true)
  .calculateCost() // Returns 10 because we use a premium proxy without javascript rendering
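The two examples above suggest a simple lookup on the proxy and rendering flags. The sketch below shows one way such a calculation could look; it is not the module's actual code. `estimateCost` and `COST_TABLE` are hypothetical names, and the two non-premium values are assumptions that this README does not state.

```javascript
// Credits per request, keyed by "<premiumProxy>:<javascriptRendering>"
const COST_TABLE = {
  'true:true': 100, // stated in the examples above
  'true:false': 10, // stated in the examples above
  'false:true': 5, // ASSUMPTION: not stated in this README
  'false:false': 1 // ASSUMPTION: not stated in this README
}

// Hypothetical helper (not part of spb-unofficial-wrapper)
function estimateCost ({ premiumProxy, javascriptRendering }) {
  const key = `${Boolean(premiumProxy)}:${Boolean(javascriptRendering)}`
  return COST_TABLE[key]
}

console.log(estimateCost({ premiumProxy: true, javascriptRendering: true }))
console.log(estimateCost({ premiumProxy: true, javascriptRendering: false }))
```

Keeping the prices in a single table makes them easy to update if the pricing rules change.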
npm run build
npm run test
npm run docs
👤 George Koniaris
- My Blog: https://gkoniaris.gr
- Twitter: @gkondev
- Github: @gkoniaris
- LinkedIn: @gkon
Give a ⭐️ if this project helped you!
This README was generated with ❤️ by readme-md-generator