cbr2pdf is a bash script will convert all .cbr and .cbz files recursively from a folder to PDF files in a seperate folder with pretty colors and stats. This script mainly uses ImageMagick to convert the images to pdf files and 7zip/p7z to extract the archives.
- Find alternatives to
p7zip
as they seem to operate differently across different distro. E.g macOS version ofp7zip
doesn't have any issues with one particular archive, butp7zip
in Fedora cannot extract the exact same archive. On Xubuntu, the archive can be extracted but the files are corrupted. Interestingly, the command7za
and7z
are different, the later of which is able to extract the archive. - Find a way to use img2pdf instead of ImageMagick
- Rewrite to python? (See above)
- Ensure that this script runs on different distros, mainly BSD and other linux distros like Fedora, CentOS, etc
Recently, I've added a dodgy way of running the script in parallel. To see if parallelisation helps, see the performance page.
TL:DR - Script runs very well when runnning with 2 parallels, slower with 4 parallels and having the spinner enabled slows down the script tremendously.
$ git clone https://github.com/Julian-Heng/cbr2pdf.git
$ cd cbr2pdf
$ ./cbr2pdf.sh
$ curl https://raw.githubusercontent.com/Julian-Heng/cbr2pdf/master/cbr2pdf.sh > cbr2pdf.sh
$ chmod +x cbr2pdf.sh
$ ./cbr2pdf.sh
The main commands used in this script are 7z
and ImageMagick
, but also include commands from the GNU Core Utils like sort
, basename
and printf
. So do keep that in mind. But if you're just running Ubuntu, or Arch Linux or any kind if linux, you should be fine.
The script also relies on bash-4.4 (September 2016) or above.
For MacOS, you'll need homebrew to install ImageMagick and 7zip. It will also install the Xcode Commandline tools, which includes git
. Curl
is also not installed by default.
$ sudo apt install p7zip-full imagemagick
$ sudo pacman -S p7zip imagemagick
$ sudo dnf install p7zip ImageMagick
$ sudo zypper install p7zip ImageMagick
$ sudo pkg install p7zip imagemagick
$ brew install p7zip imagemagick
$ ./cbr2pdf.sh --option --option VALUE
Usage: ./cbr2pdf.sh --option --option VALUE
Options:
[-v|--verbose] Enable verbose output
[-x|--extract] Only extract files
[-h|--help] Displays this message
[-k|--keep] Keep extracted files
[-q|--quiet] Suppress all output
[-p|--parallel "VALUE"] Run in parallel
[-l|--loglevel "VALUE"] Determine level of output details
[-w|--overwrite] Overwrite existing files
[-i|--input "DIRECTORY"] The input path for the files
[-o|--output "DIRECTORY"] The output path for the converted files
[--version] Print version number
[--no-spinner] Disable the spinner
[--no-summary] Disable printing summary (still print failed)
[--no-color] Disable color output
[--no-list] Disable printing file listing
This bash script convert all comic book archives with the
file extension .cbr or .cbz recursively from a folder
to PDF files in a seperate folder. It can also do single
files.
This script mainly uses ImageMagick to convert the images
to pdf files and 7zip/p7z to extract the archives.
Made by Julian Heng
[!] Both folders must already exist before starting this script
┌[julian@Julians-MacBook-Pro]-(~)
└> ./cbr2pdf.sh -i ~/Input -o ~/Output
================================================
[!] File list
================================================
/Users/julian/Input/(2010) The Transformers - Drift [#1-4]/The Transformers - Drift 01 (of 04) (2010).cbz
/Users/julian/Input/(2010) The Transformers - Drift [#1-4]/The Transformers - Drift 02 (of 04) (2010).cbz
/Users/julian/Input/(2010) The Transformers - Drift [#1-4]/The Transformers - Drift 03 (of 04) (2010).cbz
/Users/julian/Input/(2010) The Transformers - Drift [#1-4]/The Transformers - Drift 04 (of 04) (2010).cbz
================================================
[!] File information
================================================
Job Number: 1/4
Output Directory: /Users/julian/Output/Input/(2010) The Transformers - Drift [#1-4]
Source File: /Users/julian/Input/(2010) The Transformers - Drift [#1-4]/The Transformers - Drift 01 (of 04) (2010).cbz
[!] Extracting archive...
[!] No subfolders detected...
[!] Converting to PDF...
[!] Deleting extracted files...
[!] Finish converting "The Transformers - Drift 01 (of 04) (2010).cbz"
================================================
[!] File information
================================================
Job Number: 2/4
Output Directory: /Users/julian/Output/Input/(2010) The Transformers - Drift [#1-4]
Source File: /Users/julian/Input/(2010) The Transformers - Drift [#1-4]/The Transformers - Drift 02 (of 04) (2010).cbz
[!] Extracting archive...
[!] No subfolders detected...
[!] Converting to PDF...
[!] Deleting extracted files...
[!] Finish converting "The Transformers - Drift 02 (of 04) (2010).cbz"
================================================
[!] File information
================================================
Job Number: 3/4
Output Directory: /Users/julian/Output/Input/(2010) The Transformers - Drift [#1-4]
Source File: /Users/julian/Input/(2010) The Transformers - Drift [#1-4]/The Transformers - Drift 03 (of 04) (2010).cbz
[!] Extracting archive...
[!] No subfolders detected...
[!] Converting to PDF...
[!] Deleting extracted files...
[!] Finish converting "The Transformers - Drift 03 (of 04) (2010).cbz"
================================================
[!] File information
================================================
Job Number: 4/4
Output Directory: /Users/julian/Output/Input/(2010) The Transformers - Drift [#1-4]
Source File: /Users/julian/Input/(2010) The Transformers - Drift [#1-4]/The Transformers - Drift 04 (of 04) (2010).cbz
[!] Extracting archive...
[!] No subfolders detected...
[!] Converting to PDF...
[!] Deleting extracted files...
[!] Finish converting "The Transformers - Drift 04 (of 04) (2010).cbz"
================================================
[!] Completed files
================================================
/Users/julian/Input/(2010) The Transformers - Drift [#1-4]/The Transformers - Drift 01 (of 04) (2010).cbz
/Users/julian/Input/(2010) The Transformers - Drift [#1-4]/The Transformers - Drift 02 (of 04) (2010).cbz
/Users/julian/Input/(2010) The Transformers - Drift [#1-4]/The Transformers - Drift 03 (of 04) (2010).cbz
/Users/julian/Input/(2010) The Transformers - Drift [#1-4]/The Transformers - Drift 04 (of 04) (2010).cbz
================================================
[!] Finish converting all files
================================================
┌[julian@Julians-MacBook-Pro]-(~)
└>
- 0 - Finished successfully
- 1 - Script was interrupted
- 2 - Unknown flags or no flags parsed
- 3 - Input/Output directory not valid
- 4 - 7z/unzip or ImageMagick not installed
- 5 - Wrong bash version
Basically there are 6 steps that the script performs
- List all files within the
input
directory - Create the same folder structure as the
input
directory into theoutput
directory - Extract using
7z
orunzip
to theoutput
directory - Convert using ImageMagick from all
.jpg
or.png
to.pdf
- Delete extracted files
- Loop until all files are done
Firstly, the script will go and run all the prechecks before running the main script. This involves getting all arguments, printing verbose and debug information, checking directories and checking applications before continuing.
Using the find
command, we create an array containing all the files to be converted, which is then sorted. That array is then fed into a while loop as a the variable inputFile
. This variable is then seperated into parent
, source_dir
, source_file
, source_filename
, source_ext
, output
.
parent
: Parent directorysource_dir
: Source directorysource_file
: Source filesource_filename
: Source filenamesource_ext
: Source file extensionoutput
: Destination directory
By doing so, it makes it easier to form the folder structure on the output directory, as well as detecting file type for filtering out files that isnt .cbr
or .cbz
.
After setting the variables, we can now extract the files from the input directory into the output directory. 7z
is used for extracting because unzip
is unable to extract rar archives. unzip
however, is used as a fallback if 7z
is not present. The extract function uses the variables inputFile
and output/source_file
and to extract the files into the output directory within a folder with the name as the inputFile
.
Sometimes, the images are within a folder after extraction. This detects if there is a subfolder after extraction and moves the files up one level by using the find
command for any directories within a max depth of 2. If there is a subfolder within the extracted directory, then the find
command will print out 2 lines, first is the output directory and the second is the subfolder itself. Using sed
, we can assign the variable check
to the second line of the command. A simple check is then perform to see if the check
variable is empty. If it is, then there's no subfolder and the script continues. If it isn't empty, then the files inside of the subfolder is brought up one level using mv
.
Due to case-sensitive conditionals in bash, the extracted directory is checked for file extensions which are either all uppercase (JPG
) or camelCase (Jpg
). This ensures that when converting, the script will not choke when encountering file extensions that isn't all lowercase.
Just like in the extract function, we used convert
inside of a function where we pass the arguments output/source_file/*.{jpg,png}
and output/source_filename.pdf
. The first variable is all images inside of the extracted folder and the second variable is the final converted file.
After converting the files, the extracted folder is then deleted and then moved on the the next file.
This project is licensed under the GPL-3.0 License - see the LICENSE file for details