view previous download history
drawrowfly committed Mar 30, 2020
1 parent 3d74fa7 commit 48594f2
Showing 9 changed files with 220 additions and 22 deletions.
51 changes: 48 additions & 3 deletions README.md
@@ -22,6 +22,7 @@ This is not an official API support and etc. This is just a scraper that is usin
- Sign URL to make custom request to the TikTok API
- Extract metadata from the User, Hashtag and Single Video pages
- **Save previous progress and download only new videos that weren't downloaded before**. This feature only works from the CLI and only if **download** flag is on.
- **View and manage previously downloaded posts history in the CLI**

## To Do

@@ -31,8 +32,8 @@ This is not an official API support and etc. This is just a scraper that is usin
- [x] Add tests
- [x] Download video without the watermark
- [x] Indicate in the output file(csv/json) if the video was downloaded or not
-- [ ] Scrape metadata and download video from the multiple users/hashtags specified in a source(file or etc)
-- [ ] Scrape users/hashtag
+- [ ] Scrape metadata and download posts from different users/hashtags in batch
+- [ ] Scrape users/hashtags
- [ ] Web interface

## Contribution
@@ -48,7 +49,7 @@ yarn test
yarn build
```

-## JSON/CSV output:
+## Post metadata example:

```javascript
{
@@ -86,8 +87,37 @@ yarn build
}[]
```

## CSV file example

![Demo](https://i.imgur.com/6gIbBzo.png)

## View and manage previously downloaded posts history in the CLI

You can view this history only from the CLI, and only if you used the **-s** flag in previous scraper runs.
**-s** saves the download history so that duplicate posts are not downloaded in the future.
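
As a rough illustration of what a saved download history makes possible, the sketch below filters out posts whose IDs were already recorded. It is not the scraper's actual implementation: `filterNewPosts`, the `id` field, and the shape of the stored history are all assumptions made for this example.

```javascript
// Hypothetical helper, not part of the tiktok-scraper API.
// `seenIds` stands in for whatever the history file stores (assumed: an array of post IDs).
function filterNewPosts(posts, seenIds) {
    const seen = new Set(seenIds);
    const fresh = [];
    for (const post of posts) {
        // Skip posts already in the history, including repeats within this batch.
        if (!seen.has(post.id)) {
            seen.add(post.id);
            fresh.push(post);
        }
    }
    // Return the posts still to download plus the updated history.
    return { fresh, history: [...seen] };
}

const result = filterNewPosts([{ id: 'a1' }, { id: 'b2' }, { id: 'c3' }], ['a1']);
console.log(result.fresh.map(post => post.id)); // [ 'b2', 'c3' ]
console.log(result.history); // [ 'a1', 'b2', 'c3' ]
```

Using a `Set` keeps the membership test cheap even when the history grows large.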

To view the history records:

```sh
tiktok-scraper history
```

To delete a single history record:

```sh
tiktok-scraper history -r TYPE:INPUT
tiktok-scraper history -r user:tiktok
tiktok-scraper history -r hashtag:summer
tiktok-scraper history -r trend
```

To delete all records:

```sh
tiktok-scraper history -r all
```
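
The `TYPE:INPUT` value passed to `-r` can be checked before any record is touched. The following is a minimal sketch modeled on the validation this commit adds to `bin/cli.js`; `parseRemove` is a hypothetical name used only here, and the list of types is copied from `src/constant/index.ts`.

```javascript
// Allowed history types, copied from src/constant/index.ts in this commit.
const HISTORY_TYPES = ['user', 'hashtag', 'trend', 'music'];

// Hypothetical helper that mirrors the --remove branch of the CLI check.
function parseRemove(remove, allowed = HISTORY_TYPES) {
    // "all" and "trend" take no input, so append ":" to split every value the same way.
    const value = remove.includes(':') ? remove : `${remove}:`;
    const [type, input] = value.split(':');

    if (type !== 'all' && !allowed.includes(type)) {
        throw new Error(`--remove, -r list of allowed types: ${allowed}`);
    }
    if (!input && type !== 'trend' && type !== 'all') {
        throw new Error('--remove, -r to remove the specific history record you need to enter "TYPE:INPUT". For example: user:bob');
    }
    return { type, input };
}

console.log(parseRemove('user:bob')); // { type: 'user', input: 'bob' }
console.log(parseRemove('all')); // { type: 'all', input: '' }
```

Calling `parseRemove('user')` throws, because every type other than `trend` and `all` needs an input after the colon.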

![History](https://i.imgur.com/VnDKh72.png)
**Possible errors**

- Unknown. Please report any errors you encounter
Expand Down Expand Up @@ -122,6 +152,7 @@ Commands:
tiktok-scraper hashtag [id] Scrape videos from hashtag. Enter hashtag without #
tiktok-scraper trend Scrape posts from current trends
tiktok-scraper music [id] Scrape posts from a music id number
tiktok-scraper history View previous download history

Options:
--help, -h help [boolean]
@@ -142,12 +173,19 @@ Options:
avoiding duplicates [boolean] [default: false]
--noWaterMark, -w Download video without the watermark. This option will
affect the execution speed [boolean] [default: false]
--remove, -r Delete the history record by entering "TYPE:INPUT" or
"all" to clean all the history. For example: user:bob
[default: ""]

Examples:
tiktok-scraper user USERNAME -d -n 100
tiktok-scraper hashtag HASHTAG_NAME -d -n 100
tiktok-scraper trend -d -n 100
tiktok-scraper music MUSICID -n 100
tiktok-scraper music MUSIC_ID -d -n 50
tiktok-scraper history
tiktok-scraper history -r user:bob
tiktok-scraper history -r all
```

**Example 1:**
@@ -222,6 +260,13 @@ ZIP path: /{CURRENT_PATH}/trend_1552945659138.zip
CSV path: /{CURRENT_PATH}/trend_1552945659138.csv
```

**Example 7:**
View previous download history

```sh
tiktok-scraper history
```

**To make it look better, progress is shown in the terminal while posts are being downloaded**

```sh
43 changes: 40 additions & 3 deletions bin/cli.js
@@ -4,6 +4,7 @@
/* eslint-disable prefer-destructuring */
/* eslint-disable no-param-reassign */

const yargs = require('yargs');
const TikTokScraper = require('../build');
const CONST = require('../build/constant');

@@ -25,16 +26,26 @@ const startScraper = async argv => {
if (scraper.csv) {
console.log(`CSV path: ${scraper.csv}`);
}
if (scraper.message) {
console.log(scraper.message);
}
if (scraper.table) {
console.table(scraper.table);
}
} catch (error) {
console.log(error);
}
};

-require('yargs')
+yargs
.usage('Usage: $0 <command> [options]')
.example(`$0 user USERNAME -d -n 100`)
.example(`$0 trend -d -n 100`)
.example(`$0 hashtag HASHTAG_NAME -d -n 100`)
.example(`$0 music MUSIC_ID -d -n 50`)
.example(`$0 history`)
.example(`$0 history -r user:bob`)
.example(`$0 history -r all`)
.command('user [id]', 'Scrape videos from username. Enter only username', {}, argv => {
startScraper(argv);
})
@@ -47,6 +58,9 @@ require('yargs')
.command('music [id]', 'Scrape videos from music id. Enter only music id', {}, argv => {
startScraper(argv);
})
.command('history', 'View previous post download history', {}, argv => {
startScraper(argv);
})
.options({
help: {
alias: 'h',
@@ -90,6 +104,11 @@ require('yargs')
default: false,
describe: 'Download video without the watermark. This option will affect the execution speed',
},
remove: {
alias: ['r'],
default: '',
describe: 'Delete the history record by entering "TYPE:INPUT" or "all" to clean all the history. For example: user:bob',
},
})
.check(argv => {
if (CONST.scrape.indexOf(argv._[0]) === -1) {
@@ -98,9 +117,27 @@ require('yargs')

if (argv.store) {
if (!argv.download) {
-throw new Error('--store, -s flag only works in combination with the download flag. Add -d to your command');
+throw new Error('--store, -s flag is only working in combination with the download flag. Add -d to your command');
}
}

if (argv.remove) {
if (argv.remove.indexOf(':') === -1) {
argv.remove = `${argv.remove}:`;
}
const split = argv.remove.split(':');
const type = split[0];
const input = split[1];

if (type !== 'all' && CONST.history.indexOf(type) === -1) {
throw new Error(`--remove, -r list of allowed types: ${CONST.history}`);
}
if (!input && type !== 'trend' && type !== 'all') {
throw new Error('--remove, -r to remove the specific history record you need to enter "TYPE:INPUT". For example: user:bob');
}
}

return true;
})
-.demandCommand().argv;
+.demandCommand()
+.help().argv;
3 changes: 2 additions & 1 deletion src/constant/index.ts
@@ -1,4 +1,5 @@
export = {
-scrape: ['user', 'hashtag', 'trend', 'music', 'discover_user', 'discover_hashtag', 'discover_music'],
+scrape: ['user', 'hashtag', 'trend', 'music', 'discover_user', 'discover_hashtag', 'discover_music', 'history'],
userAgent: 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:74.0) Gecko/20100101 Firefox/74.0',
history: ['user', 'hashtag', 'trend', 'music'],
};
22 changes: 18 additions & 4 deletions src/core/TikTok.test.ts
@@ -13,9 +13,11 @@ describe('TikTok Scraper MODULE(promise): user(valid input data)', () => {
instance = new TikTokScraper({
download: false,
asyncDownload: 5,
asyncScraping: 3,
filetype: '',
filepath: '',
input: 'tiktok',
noWaterMark: false,
type: 'user',
userAgent: 'Custom User-Agent',
proxy: '',
@@ -54,6 +56,7 @@ describe('TikTok Scraper MODULE(event): user(valid input data)', () => {
instance = new TikTokScraper({
download: false,
asyncDownload: 5,
asyncScraping: 5,
filetype: '',
filepath: '',
input: 'tiktok',
@@ -115,6 +118,7 @@ describe('TikTok Scraper MODULE(promise): user(invalid input data)', () => {
const instance = new TikTokScraper({
download: false,
asyncDownload: 5,
asyncScraping: 5,
filetype: '',
filepath: '',
input: '',
@@ -130,6 +134,7 @@ describe('TikTok Scraper MODULE(promise): user(invalid input data)', () => {
const instance = new TikTokScraper({
download: false,
asyncDownload: 5,
asyncScraping: 5,
filetype: '',
filepath: '',
input: '',
@@ -147,6 +152,7 @@ describe('TikTok Scraper MODULE(event): user(invalid input data)', () => {
const instance = new TikTokScraper({
download: false,
asyncDownload: 5,
asyncScraping: 5,
filetype: '',
filepath: '',
input: '',
@@ -167,6 +173,7 @@ describe('TikTok Scraper MODULE(event): user(invalid input data)', () => {
const instance = new TikTokScraper({
download: false,
asyncDownload: 5,
asyncScraping: 5,
filetype: '',
filepath: '',
input: '',
@@ -193,6 +200,7 @@ describe('TikTok Scraper MODULE(promise): user(save to a file)', () => {
instance = new TikTokScraper({
download: false,
asyncDownload: 5,
asyncScraping: 5,
filetype: 'all',
filepath: '',
input: 'tiktok',
@@ -225,6 +233,7 @@ describe('TikTok Scraper MODULE(promise): hashtag(valid input data)', () => {
instance = new TikTokScraper({
download: false,
asyncDownload: 5,
asyncScraping: 5,
filetype: '',
filepath: '',
input: 'summer',
@@ -257,6 +266,7 @@ describe('TikTok Scraper MODULE(promise): signUrl', () => {
instance = new TikTokScraper({
download: false,
asyncDownload: 5,
asyncScraping: 5,
filetype: '',
filepath: '',
input: 'https://m.tiktok.com/share/item/list?secUid=&id=355503&type=3&count=30&minCursor=0&maxCursor=0&shareUid=&lang=',
@@ -288,6 +298,7 @@ describe('TikTok Scraper MODULE(promise): getHashtagInfo', () => {
instance = new TikTokScraper({
download: false,
asyncDownload: 5,
asyncScraping: 5,
filetype: '',
filepath: '',
input: hasthagName,
@@ -338,6 +349,7 @@ describe('TikTok Scraper MODULE(promise): getUserProfileInfo', () => {
instance = new TikTokScraper({
download: false,
asyncDownload: 5,
asyncScraping: 5,
filetype: '',
filepath: '',
input: userName,
@@ -399,6 +411,7 @@ describe('TikTok Scraper CLI: user(save progress)', () => {
store_history: true,
test: true,
asyncDownload: 5,
asyncScraping: 5,
filetype: '',
filepath: '',
input: 'tiktok',
@@ -414,12 +427,12 @@ describe('TikTok Scraper CLI: user(save progress)', () => {
jest.restoreAllMocks();
});

-it('fs.readFile should be called once', async () => {
-expect(fs.readFile).toHaveBeenCalledTimes(1);
+it('fs.readFile should be called 2 times', async () => {
+expect(fs.readFile).toHaveBeenCalledTimes(2);
});

-it('fs.writeFile should be called once', async () => {
-expect(fs.writeFile).toHaveBeenCalledTimes(1);
+it('fs.writeFile should be called 2 times', async () => {
+expect(fs.writeFile).toHaveBeenCalledTimes(2);
});

it('result should contain a valid file name for the Zip file', async () => {
@@ -433,6 +446,7 @@ describe('TikTok Scraper MODULE(promise): getVideoMeta', () => {
instance = new TikTokScraper({
download: false,
asyncDownload: 5,
asyncScraping: 5,
filetype: '',
filepath: '',
input: 'https://www.tiktok.com/@tiktok/video/6807491984882765062',