Gremlin server krl (aws#7)
* Initial version of instructions for Gremlin Server config.
krlawrence authored Nov 6, 2020
1 parent 6db1860 commit e372fcd
Showing 4 changed files with 65 additions and 4 deletions.
13 changes: 9 additions & 4 deletions README.md
@@ -7,6 +7,9 @@ Python package integrating jupyter notebooks with various graph-stores including
- Python3.6
- Jupyter Notebooks

## Introduction
The graph-notebook provides a way to interact, from a Jupyter notebook, with any graph database that supports the Gremlin Server or RDF HTTP protocols. These databases can be running locally on your laptop, in a private data center, or in the cloud. This project was initially created as a way to work with Amazon Neptune but is not limited to that database engine. For example, you can connect to a Gremlin Server running on your laptop. The instructions below describe the process for connecting to Amazon Neptune. We encourage others to contribute configurations they find useful; such information lives in the `additional-databases` folder, which already contains instructions for establishing a Gremlin Server connection.

## Installation

```
@@ -81,16 +84,19 @@ echo "{
}" >> ~/graph_notebook_config.json
```

### Connecting to a local graph store
As mentioned in the introduction, it is possible to connect `graph-notebook` to a graph database running on your local machine, such as Gremlin Server. Additional instructions for using local servers are in the `additional-databases` folder.
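
As a rough illustration, the connection settings for a local server could be generated from Python. This is a minimal sketch, not part of the project: the helper names `build_local_config` and `write_config` are hypothetical, and it covers only the `host`, `port`, and `ssl` fields used in the Gremlin Server instructions. The configuration written by the installation steps may contain other fields that are not shown here.

```python
import json
from pathlib import Path


def build_local_config(host="localhost", port=8182, ssl=False):
    """Return connection settings for a local graph store as a JSON string.

    Hypothetical helper: only the host, port and ssl fields are covered.
    """
    return json.dumps({"host": host, "port": port, "ssl": ssl}, indent=2)


def write_config(path, text):
    """Write the JSON text to the given path, expanding a leading '~'."""
    Path(path).expanduser().write_text(text + "\n")
```

For example, `write_config("~/graph_notebook_config.json", build_local_config())` would overwrite that file with the three local settings, so back up any existing configuration first.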

## Authentication

If you are running a SigV4 authenticated endpoint, ensure that the config field `iam_credentials_provider_type` is set
to `ENV` and that you have set the following environment variables:

- AWS_ACCESS_KEY_ID
- AWS_SECRET_ACCESS_KEY
- AWS_REGION
- AWS_SESSION_TOKEN (OPTIONAL. Use if you are using temporary credentials)
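
Before starting Jupyter, it can help to confirm that the required variables are actually set. The following is a minimal sketch; the function name `missing_credentials` is hypothetical and not part of graph-notebook.

```python
import os

# Variable names taken from the list above; the session token is optional.
REQUIRED_VARS = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY", "AWS_REGION")
OPTIONAL_VARS = ("AWS_SESSION_TOKEN",)


def missing_credentials(env=None):
    """Return the names of required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]
```

Calling `missing_credentials()` with no arguments inspects the current process environment; an empty list means all three required variables are present.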


## Security

@@ -99,4 +105,3 @@ See [CONTRIBUTING](https://github.com/aws/graph-notebook/blob/main/CONTRIBUTING.
## License

This project is licensed under the Apache-2.0 License.

2 changes: 2 additions & 0 deletions additional-databases/README.md
@@ -0,0 +1,2 @@
# Additional Databases
Subfolders within this part of the tree contain detailed instructions that help when configuring graph-notebook with different graph database engines.
54 changes: 54 additions & 0 deletions additional-databases/gremlin-server/gremlin-server-config.md
@@ -0,0 +1,54 @@
## Connecting graph-notebook to a Gremlin Server

![Gremlin](https://github.com/aws/graph-notebook/blob/gremlin-server-krl/images/gremlin-notebook.png?raw=true "Picture of Gremlin holding a notebook")

These notes explain how to connect the graph-notebook to a Gremlin Server running locally on the same machine. The same steps should also work for a remote Gremlin Server; in that case, replace `localhost` with the DNS name or IP address of the remote server. It is assumed that the [graph-notebook installation](https://github.com/aws/graph-notebook/blob/main/README.md) has been completed and the Jupyter environment is running before you follow these steps.

### Gremlin Server Configuration
Several of the steps below are optional but please read each step carefully and decide if you want to apply it.
1. Download the Gremlin Server from https://tinkerpop.apache.org/ and unzip it. The remaining steps in this section assume your working directory is the directory where you unzipped it.
2. In `conf/tinkergraph-empty.properties`, change the ID manager from `LONG` to `ANY` to
enable IDs that include text strings.
```
gremlin.tinkergraph.vertexIdManager=ANY
```
3. Optionally add another line doing the same for edge IDs.
```
gremlin.tinkergraph.edgeIdManager=ANY
```
4. To enable HTTP as well as WebSocket connections to the Gremlin Server, edit the file `conf/gremlin-server.yaml` and change
```
channelizer: org.apache.tinkerpop.gremlin.server.channel.WebSocketChannelizer
```
to
```
channelizer: org.apache.tinkerpop.gremlin.server.channel.WsAndHttpChannelizer
```
This will allow you to access the Gremlin Server from Jupyter using commands like `curl` as well as using the `%%gremlin` cell magic. This step is optional if you do not need HTTP connectivity to the server.
5. Start the Gremlin Server: `bin/gremlin-server.sh start`
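
The property edits in steps 2 and 3 can also be scripted. The sketch below is one possible approach, assuming a simple `key=value` properties format; the helper `set_property` is hypothetical and is not shipped with Gremlin Server or graph-notebook.

```python
def set_property(text, key, value):
    """Return properties text with `key` set to `value`.

    An existing, uncommented `key=...` line is replaced in place;
    otherwise a new `key=value` line is appended.
    """
    out, found = [], False
    for line in text.splitlines():
        stripped = line.strip()
        is_comment = stripped.startswith("#")
        if not is_comment and stripped.split("=", 1)[0].strip() == key:
            out.append(f"{key}={value}")
            found = True
        else:
            out.append(line)
    if not found:
        out.append(f"{key}={value}")
    return "\n".join(out) + "\n"
```

To apply it, read `conf/tinkergraph-empty.properties`, call `set_property` once for `gremlin.tinkergraph.vertexIdManager` and optionally once for `gremlin.tinkergraph.edgeIdManager`, each with the value `ANY`, and write the result back to the file.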


### Connecting to a local Gremlin Server from Jupyter
1. In the Jupyter notebook, use `%%graph_notebook_config` to disable SSL and set the host to `localhost`:
```
%%graph_notebook_config
{
"host": "localhost",
"port": 8182,
"ssl": false
}
```
If the Gremlin Server you wish to connect to is remote, replace `localhost` with the IP address or DNS name of the remote server. This assumes that server is reachable from your local machine.
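
If the server was configured with the `WsAndHttpChannelizer`, one way to check the connection outside of the `%%gremlin` magic is a plain HTTP request. This is a minimal sketch using only the Python standard library. The function names are hypothetical, and it assumes the server accepts a POST of `{"gremlin": "<query>"}` at the root path, which may differ between Gremlin Server versions.

```python
import json
import urllib.request


def build_gremlin_request(query, host="localhost", port=8182, ssl=False):
    """Build (but do not send) an HTTP POST carrying a Gremlin query."""
    scheme = "https" if ssl else "http"
    body = json.dumps({"gremlin": query}).encode("utf-8")
    return urllib.request.Request(
        f"{scheme}://{host}:{port}/",  # assumption: endpoint is the root path
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_query(query, **kwargs):
    """Send the query and return the decoded JSON response."""
    request = build_gremlin_request(query, **kwargs)
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server running, `run_query("g.V().count()")` should return the server's JSON response; a connection error usually means the host, port, or channelizer setting needs another look.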

### Using `%seed` with Gremlin Server
The graph-notebook has a `%seed` command that can be used to load sample data. For some data sets to load successfully, the stack size used by the Gremlin Server needs to be increased. If you do not plan to use the `%seed` command to load the `air-routes` data set, this step can be ignored.

1. To load the `air-routes` data set into TinkerGraph via Gremlin Server using the graph-notebook `%seed` command, the size of the JVM thread stack needs to be increased. One way to do that is to edit the `gremlin-server.sh` file and add `-Xss2m` to the `JAVA_OPTIONS` variable. Locate this section of the file and add the `-Xss2m` flag.

```
# Set Java options
if [[ "$JAVA_OPTIONS" = "" ]] ; then
JAVA_OPTIONS="-Xms512m -Xmx4096m -Xss2m"
fi
```
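
If you manage the options string from a script rather than editing it by hand, the same change can be expressed as a small helper. This is only a sketch: `ensure_stack_size` is a hypothetical function that works on the options string itself and does not modify `gremlin-server.sh`.

```python
import re


def ensure_stack_size(options, size="2m"):
    """Return a JVM options string that includes -Xss<size>.

    An existing -Xss flag is replaced; otherwise the flag is appended.
    """
    flag = f"-Xss{size}"
    pattern = r"(^|\s)-Xss\S+"
    if re.search(pattern, options):
        # Keep the leading whitespace (group 1) and swap in the new flag.
        return re.sub(pattern, lambda m: m.group(1) + flag, options)
    return f"{options} {flag}".strip()
```

For instance, passing the default `"-Xms512m -Xmx4096m"` yields the value shown in the snippet above.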
Binary file added images/gremlin-notebook.png