Presto is a distributed SQL query engine for big data.
See the User Manual for end user documentation.
- Mac OS X or Linux
- Java 7, 64-bit
- Maven 3 (for building)
- Python 2.4+ (for running with the launcher script)
Presto is a standard Maven project. Simply run the following command from the project root directory:
mvn clean install
On the first build, Maven will download all the dependencies from the internet and cache them in the local repository (~/.m2/repository
), which can take a considerable amount of time. Subsequent builds will be faster.
Presto has a comprehensive set of unit tests that can take several minutes to run. You can disable the tests when building:
mvn clean install -DskipTests
After building Presto for the first time, you can load the project into your IDE and run the server. We recommend using IntelliJ IDEA. Because Presto is a standard Maven project, you can import it into your IDE using the root pom.xml
file.
Presto comes with sample configuration that should work out-of-the-box for development. Use the following options to create a run configuration:
- Main Class:
com.facebook.presto.server.PrestoServer
- VM Options:
-ea -Xmx2G -Dconfig=etc/config.properties -Dlog.levels-file=etc/log.properties
- Working directory:
$MODULE_DIR$
- Use classpath of module:
presto-server
The working directory should be the presto-server
subdirectory. In IntelliJ, using $MODULE_DIR$
accomplishes this automatically.
Additionally, the Hive plugin needs to be configured with location of your Hive metastore Thrift service. Add the following to the list of VM options, replacing localhost:9083
with the correct host and port:
-Dhive.metastore.uri=thrift://localhost:9083
If your Hive metastore or HDFS cluster is not directly accessible to your local machine, you can use SSH port forwarding to access it. Setup a dynamic SOCKS proxy with SSH listening on local port 1080:
ssh -v -N -D 1080 server
Then add the following to the list of VM options:
-Dhive.metastore.thrift.client.socks-proxy=localhost:1080
If your Hive metastore references files stored on a federated HDFS, you should provide your HDFS config files as a VM option:
-Dhive.config.resources=/etc/hadoop/conf/core-site.xml,/etc/hadoop/conf/hdfs-site.xml
Start the CLI to connect to the server and run SQL queries:
presto-cli/target/presto-cli-*-executable.jar
Run a query to see the nodes in the cluster:
SELECT * FROM sys.node;
In the sample configuration, the Hive connector is mounted in the hive
catalog, so you can run the following queries to show the tables in the Hive database default
:
SHOW TABLES FROM hive.default;