isteven-xu/spark-compactor

Description

A Python script for PySpark that merges small files into larger files sized to the HDFS block size.
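As a rough illustration of what sizing output to the HDFS block size can mean, the sketch below computes how many output files a directory of a given total size would need. This is not taken from `file_merge.py`: the function name `target_file_count` and the 128 MiB default are assumptions (clusters can configure a different block size).

```python
import math

# Assumed block size of 128 MiB; a cluster may be configured differently.
HDFS_BLOCK_SIZE = 128 * 1024 * 1024


def target_file_count(total_bytes, block_size=HDFS_BLOCK_SIZE):
    """Return how many output files are needed so each is at most one block.

    Hypothetical helper for illustration; always returns at least 1.
    """
    if total_bytes <= 0:
        return 1
    return max(1, math.ceil(total_bytes / block_size))


if __name__ == "__main__":
    # 1 GiB of small files would be merged into 8 block-sized files.
    print(target_file_count(1024 * 1024 * 1024))
```

In a PySpark job, a count like this could be passed to `DataFrame.coalesce(n)` or `DataFrame.repartition(n)` before writing the data back out.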

Usage

spark-submit --name MergeFiles file_merge.py tableconfig
# Table config example: table_a:0:-1,table_b:0:-2, where 0:-2 means all partitions except the most recent 2.
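The table config reads like a comma-separated list of `table:start:end` entries. The sketch below shows one way such a string could be parsed and applied, assuming the range behaves like a Python slice over partitions sorted from oldest to newest. The helper names `parse_table_config` and `select_partitions` and the `dt=...` partition names are hypothetical, and the script itself may interpret the range differently.

```python
def parse_table_config(config):
    """Split 'table_a:0:-1,table_b:0:-2' into (table, start, end) tuples."""
    specs = []
    for entry in config.split(","):
        table, start, end = entry.strip().split(":")
        specs.append((table, int(start), int(end)))
    return specs


def select_partitions(partitions, start, end):
    """Apply the range as a Python slice over partitions sorted oldest first.

    Assumption: partition names sort chronologically (e.g. dt=YYYY-MM-DD).
    """
    return sorted(partitions)[start:end]


if __name__ == "__main__":
    partitions = ["dt=2024-01-03", "dt=2024-01-01", "dt=2024-01-02", "dt=2024-01-04"]
    for table, start, end in parse_table_config("table_a:0:-1,table_b:0:-2"):
        print(table, select_partitions(partitions, start, end))
```

Under this reading, `0:-2` leaves the two newest partitions untouched, which suits partitions that may still be receiving writes.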

About

Merges small files in the specified directories.
