housejester/druid-test-harness
## This is a test harness for Druid.

### Instructions

1. Provision some servers. This can all be one machine if you want. You just need to know up front the hostnames of your MySQL server, ZooKeeper server, and the Druid broker (they can all be the same hostname). I'd recommend giving it a shot on a single server first.
2. Install MySQL on your MySQL host. Create a druid database on the server, and grant all privileges to a druid user with password 'diurd'.
3. Update env-cluster.sh with your cluster details. Again, you'll need to specify the hostnames for MySQL, ZooKeeper, and the Druid broker, but you can leave the other hosts as they are. Also, be sure to specify your AWS credentials and S3 bucket name.
4. Tar up this whole directory (after you've updated env-cluster.sh) and scp it to your servers. If you're running everything on one server, just run the scripts locally.
5. Run the scripts in numerical order on the appropriate hosts. For example, go to your ZooKeeper server and run the 01-start-zookeeper.sh script.

#### Don't leave the firehose running for too long...watch your disk.

To stop everything, run the stop-* scripts on the appropriate hosts. Stop ZooKeeper last (the Druid realtime node refuses to shut down if ZooKeeper is already down). For example, running `./stop-firehose.sh` on the firehose server will stop the firehose.

### The Firehose Sample Data

The example data generated by the firehose uses d8a-conjure. You can see what the template looks like if you just cat firehose/appevents.txt. See http://conjure.d8a.io for info on how the template works. You can see the generated data by cd'ing to firehose and running: `java -jar d8a-conjure-1.0-SNAPSHOT.jar -template appevents.txt`

The sample data will be written to the console.
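Since the firehose keeps producing data until it is stopped, a small disk check can help with the "watch your disk" warning above. The following is only a sketch: the function names `disk_used_pct` and `disk_over_threshold` are hypothetical helpers, and watching `/tmp` is an assumption based on the `/tmp` directories listed in the cleanup steps, so point it at whatever filesystem your services actually write to.

```shell
#!/bin/sh
# Hypothetical helpers, not part of the harness.

# Print the used-space percentage (a bare number, no % sign)
# of the filesystem that holds the path given as $1.
disk_used_pct() {
  df -P "$1" | awk 'NR==2 { sub(/%/, "", $5); print $5 }'
}

# Succeed (exit status 0) when usage of the filesystem holding $1
# is at or above the threshold percentage given as $2.
disk_over_threshold() {
  [ "$(disk_used_pct "$1")" -ge "$2" ]
}

# Example use on the firehose host (the /tmp path and the 80% threshold
# are assumptions; adjust both for your setup):
#   while ! disk_over_threshold /tmp 80; do sleep 60; done
#   ./stop-firehose.sh
```

The commented loop polls once a minute and stops the firehose with the harness's own `./stop-firehose.sh` once the threshold is reached.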
### The Druid Realtime Spec for the Sample Data

The spec used by the Druid realtime node for ingesting that data is in druid/appevents_realime.spec

### The Query

The query for the data is in queryies/event_counts_query.body

### Making changes

You can change the sample data itself by editing appevents.txt (again, see the d8a-conjure site for what you can do with the template). Then you can tweak the spec to account for whatever aggregations you want to do. Finally, you can change event_counts_query.body to run whatever query you want against the data (just be sure you always lowercase ALL names and fieldNames).

Whenever you make changes, you'll need to bounce the firehose and the realtime node. I haven't tested what happens if you change the dimensions while old data with different dimensions still exists.

### Nuking and Starting Over

If you need to just nuke everything and start over:

- Stop everything using the stop-* scripts (stop ZooKeeper last)
- Remove all /tmp/druid* directories
- Remove all /tmp/kafka-* directories
- Remove the /tmp/realtime directory
- Remove the /tmp/zookeeper directory
- Bring everything back up.
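The removal steps in the list above can be sketched as a single shell function. This is a minimal sketch, not part of the harness: `nuke_harness_state` is a hypothetical name, and the optional root argument is an assumption added so the function can be tried against a scratch directory before pointing it at `/tmp`. It does not stop any services, so run the stop-* scripts first (ZooKeeper last).

```shell
#!/bin/sh
# Hypothetical helper: remove the state directories named in the list above.
# Assumes all services were already stopped with the stop-* scripts.
# $1 is an optional root directory (an assumption for safe dry runs);
# it defaults to /tmp, which is where the README says the state lives.
nuke_harness_state() {
  root="${1:-/tmp}"
  # The glob parts are left unquoted on purpose so druid* and kafka-* expand.
  # rm -f ignores patterns that match nothing.
  rm -rf "$root"/druid* "$root"/kafka-* "$root/realtime" "$root/zookeeper"
}
```

Calling `nuke_harness_state` with no argument clears the `/tmp` directories; calling it with a path, for example a copy of the state made for testing, leaves `/tmp` untouched.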