Visit the releases page to download builds for parquet-java/parquet-cli. All releases are built using the `local` profile, producing standalone Uber JAR files that can be executed with `java -jar` without requiring a Hadoop environment (on non-Windows systems).
Download parquet-cli
curl -L -O https://github.com/CBIIT/parquet-cli/releases/download/1.15.0/parquet-cli-1.15.0.jar
Convert input.csv to output.parquet
java -jar parquet-cli-1.15.0.jar convert-csv input.csv -o output.parquet
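If you script the download, the jar name and release URL can be derived from a version number. The following is a minimal sketch, assuming other releases follow the same URL layout as the 1.15.0 link above. `parquet_cli_jar` and `parquet_cli_url` are hypothetical helper names, not part of parquet-cli.

```shell
# Hypothetical helpers; assumes every release uses the same URL layout as 1.15.0.

# Print the jar file name for a given version, e.g. parquet-cli-1.15.0.jar
parquet_cli_jar() {
  printf 'parquet-cli-%s.jar\n' "$1"
}

# Print the download URL for a given version.
parquet_cli_url() {
  printf 'https://github.com/CBIIT/parquet-cli/releases/download/%s/%s\n' \
    "$1" "$(parquet_cli_jar "$1")"
}

# Example usage (uncomment to download and convert):
# version=1.15.0
# curl -L -O "$(parquet_cli_url "$version")"
# java -jar "$(parquet_cli_jar "$version")" convert-csv input.csv -o output.parquet
```

Keeping the version in one variable means the download and the conversion always use the same jar.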
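On recent Java versions you may encounter errors and warnings about the Security Manager being disabled. If so, enable it explicitly with the steps below. Note that the Security Manager is permanently disabled as of Java 24, so this workaround applies only to earlier versions.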
- Create a policy file called parquet.policy
grant {
permission java.util.PropertyPermission "*", "read,write";
permission java.security.AllPermission;
};
- Run the jar with the Security Manager activated, using the specified policy
java -Djava.security.manager -Djava.security.policy=parquet.policy -jar parquet-cli-1.15.0.jar convert-csv input.csv -o output.parquet
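Because the Security Manager flags only make sense on Java versions that still support them, a wrapper script can choose them based on the Java major version. This is a rough sketch: `java_major` and `security_manager_flags` are hypothetical helpers, and the cut-off at version 24 is an assumption based on the note above, so check it against your own JDK.

```shell
# Hypothetical helpers: choose JVM flags based on the Java major version.

# Extract the major version from a version string such as "17.0.2" or "1.8.0_292".
java_major() {
  case "$1" in
    1.*) _v=${1#1.}; printf '%s\n' "${_v%%.*}" ;;  # legacy scheme: 1.8.0_292 -> 8
    *)   printf '%s\n' "${1%%[.+-]*}" ;;           # modern scheme: 17.0.2 -> 17
  esac
}

# Print the Security Manager flags for Java versions that still support them.
# Assumption: from Java 24 onward the flags are not usable, so print nothing.
security_manager_flags() {
  if [ "$1" -lt 24 ]; then
    printf '%s\n' "-Djava.security.manager -Djava.security.policy=parquet.policy"
  fi
}

# Example usage (uncomment to run):
# ver=$(java -version 2>&1 | sed -n 's/.* version "\([^"]*\)".*/\1/p' | head -n 1)
# java $(security_manager_flags "$(java_major "$ver")") \
#   -jar parquet-cli-1.15.0.jar convert-csv input.csv -o output.parquet
```

The command substitution in the usage example is left unquoted on purpose so that the two flags are passed to `java` as separate arguments.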
To run this on Windows, you will need hadoop/winutils. Clone the repository first:
git clone https://github.com/cdarlint/winutils C:\Programs\winutils
- Set the HADOOP_HOME environment variable to the location of the winutils/hadoop-3.3.6 folder
$env:HADOOP_HOME = "C:\Programs\winutils\hadoop-3.3.6"
# or permanently: [Environment]::SetEnvironmentVariable("HADOOP_HOME", "C:\Programs\winutils\hadoop-3.3.6", "User")
- Run the jar
java -jar parquet-cli-1.15.0.jar convert-csv input.csv -o output.parquet