forked from twitter/scala_school
Showing 1 changed file with 129 additions and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,129 @@ | ||
--- | ||
permalink: searchbird.html | ||
title: Searchbird | ||
layout: post | ||
desc: Building a distributed search engine using Finagle | ||
--- | ||
|
||
Start the project: | ||
|
||
We start by exploring a Scala service: the default project implements a simple key-value store. | ||
|
||
<pre> | ||
$ mkdir searchbird; cd searchbird | ||
$ scala-bootstrapper searchbird | ||
$ find . -type f | ||
./Capfile | ||
./config/development.scala | ||
./config/production.scala | ||
./config/staging.scala | ||
./config/test.scala | ||
./Gemfile | ||
./project/build/SearchbirdProject.scala | ||
./project/build.properties | ||
./project/plugins/Plugins.scala | ||
./run | ||
./src/main/scala/com/twitter/searchbird/config/SearchbirdServiceConfig.scala | ||
./src/main/scala/com/twitter/searchbird/Main.scala | ||
./src/main/scala/com/twitter/searchbird/SearchbirdServiceImpl.scala | ||
./src/main/thrift/searchbird.thrift | ||
./src/scripts/console | ||
./src/scripts/searchbird.sh | ||
./src/test/scala/com/twitter/searchbird/AbstractSpec.scala | ||
./src/test/scala/com/twitter/searchbird/SearchbirdServiceSpec.scala | ||
</pre> | ||
|
||
This creates the project skeleton listed above. | ||
|
||
* Change the finagle and util dependencies to their latest versions (1.8.3 and 1.11.2, respectively, at the time of writing) | ||
|
||
h2. Exploring the default bootstrapper project | ||
|
||
Let's first explore the default project that <strong>scala-bootstrapper</strong> creates for us. It is meant as a template: you'll end up replacing most of it, but it serves as a convenient scaffold. It defines a simple (but complete) key-value store, with configuration, a thrift interface, stats export and logging all included. | ||
|
||
Since searchbird is a "thrift":http://thrift.apache.org/ service (like most of our services), its external interface is defined in the thrift IDL. <strong>src/main/thrift/searchbird.thrift</strong>: | ||
|
||
<pre> | ||
service SearchbirdService { | ||
string get(1: string key) throws(1: SearchbirdException ex) | ||
|
||
void put(1: string key, 2: string value) | ||
} | ||
</pre> | ||
|
||
This is pretty straightforward: our service <strong>SearchbirdService</strong> exports two RPC methods, <strong>get</strong> and <strong>put</strong>. Together they form a simple interface to a key-value store. | ||
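To make the shape of this interface concrete, here is a minimal in-memory sketch of the same two operations in plain Scala. It is not the code the bootstrapper generates: the trait name <strong>KeyValueStore</strong>, the class <strong>InMemoryStore</strong> and the <strong>description</strong> field on the exception are assumptions made for illustration, and the methods are synchronous, whereas a real Finagle service would most likely return its results asynchronously.

```scala
import scala.collection.mutable

// Stand-in for the exception named in the thrift IDL.
// The `description` field is an assumption, not taken from the IDL.
class SearchbirdException(val description: String) extends Exception(description)

// Hypothetical synchronous rendering of the two RPC methods.
trait KeyValueStore {
  def get(key: String): String
  def put(key: String, value: String): Unit
}

// Keeps every entry in a hash map guarded by the instance lock.
class InMemoryStore extends KeyValueStore {
  private val database = mutable.HashMap.empty[String, String]

  // Returns the stored value, or throws if the key was never put.
  def get(key: String): String = synchronized {
    database.getOrElse(key, throw new SearchbirdException("No such key: " + key))
  }

  // Inserts the value, overwriting any previous value for the key.
  def put(key: String, value: String): Unit = synchronized {
    database(key) = value
  }
}
```

In this sketch, a <strong>put</strong> followed by a <strong>get</strong> on the same key returns the stored value, and a <strong>get</strong> on an unknown key raises <strong>SearchbirdException</strong>, matching the <strong>throws</strong> clause in the IDL.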
|
||
Now run the default service, and explore it through this interface. | ||
|
||
First build the project and run the service. The service's entry point is also the default "main" method that sbt will run. | ||
|
||
<pre> | ||
$ sbt | ||
… | ||
> update | ||
… | ||
> compile | ||
> run -f config/development.scala | ||
</pre> | ||
|
||
In another window: | ||
|
||
<pre> | ||
$ gem install thrift_client --version '=0.6' | ||
$ src/scripts/console | ||
Hint: the client is in the variable `$client` | ||
No servers specified, using 127.0.0.1:9999 | ||
> $client.put("foo", "bar") | ||
nil | ||
> $client.get("foo") | ||
"bar" | ||
> | ||
</pre> | ||
|
||
The server also exports runtime statistics. These are convenient both for inspecting individual servers and for aggregating into global service statistics. A machine-readable JSON interface is also provided. | ||
|
||
<pre> | ||
$ curl localhost:9900/stats.txt | ||
counters: | ||
Searchbird/connects: 2 | ||
Searchbird/requests: 5 | ||
Searchbird/success: 5 | ||
jvm_gc_ConcurrentMarkSweep_cycles: 2 | ||
jvm_gc_ConcurrentMarkSweep_msec: 102 | ||
jvm_gc_ParNew_cycles: 9 | ||
jvm_gc_ParNew_msec: 210 | ||
jvm_gc_cycles: 11 | ||
jvm_gc_msec: 312 | ||
gauges: | ||
Searchbird/connections: 0 | ||
Searchbird/pending: 0 | ||
jvm_fd_count: 147 | ||
jvm_fd_limit: 10240 | ||
jvm_heap_committed: 588251136 | ||
jvm_heap_max: 3220570112 | ||
jvm_heap_used: 39530208 | ||
jvm_nonheap_committed: 81481728 | ||
jvm_nonheap_max: 1124073472 | ||
jvm_nonheap_used: 69312424 | ||
jvm_num_cpus: 4 | ||
jvm_post_gc_CMS_Old_Gen_used: 5970824 | ||
jvm_post_gc_CMS_Perm_Gen_used: 46407832 | ||
jvm_post_gc_Par_Eden_Space_used: 0 | ||
jvm_post_gc_Par_Survivor_Space_used: 0 | ||
jvm_post_gc_used: 52378656 | ||
jvm_start_time: 1314124442749 | ||
jvm_thread_count: 14 | ||
jvm_thread_daemon_count: 8 | ||
jvm_thread_peak_count: 14 | ||
jvm_uptime: 404221 | ||
labels: | ||
metrics: | ||
Searchbird/connection_duration: (average=25115, count=2, maximum=52068, minimum=142, p25=142, p50=142, p75=52068, p90=52068, p95=52068, p99=52068, p999=52068, p9999=52068, sum=50230) | ||
Searchbird/connection_received_bytes: (average=84, count=2, maximum=142, minimum=29, p25=29, p50=29, p75=142, p90=142, p95=142, p99=142, p999=142, p9999=142, sum=169) | ||
Searchbird/connection_requests: (average=2, count=2, maximum=4, minimum=1, p25=1, p50=1, p75=4, p90=4, p95=4, p99=4, p999=4, p9999=4, sum=5) | ||
Searchbird/connection_sent_bytes: (average=61, count=2, maximum=95, minimum=23, p25=23, p50=23, p75=95, p90=95, p95=95, p99=95, p999=95, p9999=95, sum=123) | ||
Searchbird/request_latency_ms: (average=20, count=5, maximum=95, minimum=1, p25=1, p50=2, p75=8, p90=95, p95=95, p99=95, p999=95, p9999=95, sum=103) | ||
</pre> | ||
|
||
In addition to our own service statistics, we are also given some generic JVM stats that are often useful. | ||
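As a rough illustration of aggregating per-server output into service-wide numbers, the sketch below parses the <strong>counters:</strong> section of a <strong>stats.txt</strong> dump and sums counters by name across several servers. It assumes the text layout shown in the listing (a section header ending in a colon, followed by <strong>name: value</strong> lines); the object <strong>StatsText</strong> and its methods are hypothetical helpers, not part of the bootstrapped project, and in practice the JSON interface is probably the better input for a program.

```scala
import scala.util.Try

// Hypothetical helper for the plain-text stats format shown above.
object StatsText {
  // Section headers ("counters:", "gauges:", ...) end with a colon;
  // entry lines ("name: value") never do.
  private def isSectionHeader(line: String): Boolean = line.endsWith(":")

  // Turns "name: 42" into Some(("name", 42)); anything else becomes None.
  private def parseEntry(line: String): Option[(String, Long)] =
    line.split(": ", 2) match {
      case Array(name, value) => Try(value.trim.toLong).toOption.map(v => (name, v))
      case _                  => None
    }

  // Extracts the counters section of one dump as name -> value.
  def counters(dump: String): Map[String, Long] =
    dump.split("\n").map(_.trim).toList
      .dropWhile(_ != "counters:")
      .drop(1)
      .takeWhile(line => !isSectionHeader(line))
      .flatMap(parseEntry)
      .toMap

  // Sums each counter across the dumps of several servers.
  def aggregate(dumps: Seq[String]): Map[String, Long] =
    dumps.flatMap(d => counters(d).toSeq)
      .groupBy(_._1)
      .map { case (name, entries) => name -> entries.map(_._2).sum }
}
```

Only counters are summed here, since adding them across servers is meaningful. Gauges and the percentile metrics would need different treatment, because percentiles from separate servers cannot simply be added or averaged.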
|