AlexKopen/AUR-Package-Scraper

AUR Package Scraper

Aggregates package metadata from the Arch User Repository.

About

This script uses Puppeteer, Puppeteer Cluster, and Cheerio to scrape and parse package details from the Arch User Repository. The data is written to a JSON file that can be loaded into any database or application.

Installation

npm install

Usage

npm start

All scraped values are saved to aur-package-data.json. The level of concurrency is set from the number of CPU cores on the host system. As of October 15, 2019, on a stable internet connection and a 12-core processor, the total runtime was approximately 50 seconds, and the generated JSON file was about 475,683 lines long and 11.9 MiB in size.
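As a rough illustration of tying concurrency to the core count, the sketch below uses only Node's built-in os module. It is not the project's actual code: defaultConcurrency and mapWithConcurrency are hypothetical helpers, and the "one worker per logical core" rule is an assumption about how the core count might be applied.

```javascript
const os = require('os');

// Assumption: one worker per logical core, with a floor of 1 in case
// the platform reports no CPU information.
function defaultConcurrency() {
  return Math.max(1, os.cpus().length);
}

// Hypothetical helper (not from this project): applies an async worker to
// every item while keeping at most `limit` calls in flight. Results are
// returned in the same order as the input items.
async function mapWithConcurrency(items, limit, worker) {
  const results = new Array(items.length);
  let next = 0;

  // Each runner repeatedly claims the next unprocessed index until none remain.
  async function runner() {
    while (next < items.length) {
      const index = next++;
      results[index] = await worker(items[index], index);
    }
  }

  const runnerCount = Math.min(limit, items.length);
  await Promise.all(Array.from({ length: runnerCount }, () => runner()));
  return results;
}

// Usage sketch: the worker here is a stand-in for scraping one listing page.
async function demo() {
  const pageNumbers = [1, 2, 3, 4, 5];
  const doubled = await mapWithConcurrency(
    pageNumbers,
    defaultConcurrency(),
    async (n) => n * 2
  );
  console.log(doubled);
}

demo();
```

With a limit equal to the core count, a 12-core machine would process up to 12 pages at a time under this scheme. For a scraper the real bottleneck is often the network rather than the CPU, so the best value may differ.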

Sample JSON Output

	{
		"name": "yay",
		"version": "9.3.1-1",
		"votes": 793,
		"popularity": 61.36,
		"description": "Yet another yogurt. Pacman wrapper and AUR helper written in go.",
		"maintainer": "jguer"
	}
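A minimal sketch of consuming the output from Node follows. It assumes the file holds a JSON array of objects shaped like the sample above; the top-level structure is not stated here, so check it before relying on this. The helper names are hypothetical, and the second entry in the inline sample is made-up data used only for illustration.

```javascript
const fs = require('fs');

// Assumption: the file contains a JSON array of package objects.
function loadPackages(path) {
  return JSON.parse(fs.readFileSync(path, 'utf8'));
}

// Returns the `count` most popular packages without mutating the input.
function topByPopularity(packages, count) {
  return [...packages]
    .sort((a, b) => b.popularity - a.popularity)
    .slice(0, count);
}

// Builds a name -> package lookup for fast access by package name.
function indexByName(packages) {
  return new Map(packages.map((pkg) => [pkg.name, pkg]));
}

// Inline sample so the sketch runs without the generated file.
const samplePackages = [
  {
    name: 'yay',
    version: '9.3.1-1',
    votes: 793,
    popularity: 61.36,
    description: 'Yet another yogurt. Pacman wrapper and AUR helper written in go.',
    maintainer: 'jguer',
  },
  {
    // Made-up entry for illustration only.
    name: 'example-pkg',
    version: '1.0.0-1',
    votes: 3,
    popularity: 0.12,
    description: 'Hypothetical package.',
    maintainer: 'someone',
  },
];

// Prefer the real file when it exists; otherwise fall back to the sample.
const packages = fs.existsSync('aur-package-data.json')
  ? loadPackages('aur-package-data.json')
  : samplePackages;

console.log(topByPopularity(packages, 1).map((pkg) => pkg.name));
```

Because the file is only about 12 MiB, reading it in one call with JSON.parse is usually fine. A streaming parser would only matter for much larger outputs.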
