From 6054fc53b685cfeace4fe3d3c681a3982a070478 Mon Sep 17 00:00:00 2001
From: "marius a. eriksen"
Date: Tue, 23 Aug 2011 12:05:51 -0700
Subject: [PATCH] searchbird: beginnings of lesson 12

---
 _posts/2011-05-12-lesson.textile | 129 +++++++++++++++++++++++++++++++
 1 file changed, 129 insertions(+)
 create mode 100644 _posts/2011-05-12-lesson.textile

diff --git a/_posts/2011-05-12-lesson.textile b/_posts/2011-05-12-lesson.textile
new file mode 100644
index 00000000..be5a83af
--- /dev/null
+++ b/_posts/2011-05-12-lesson.textile
@@ -0,0 +1,129 @@
+---
+permalink: searchbird.html
+title: Searchbird
+layout: post
+desc: Building a distributed search engine using Finagle
+---
+
+Start the project:
+
+Exploring a scala service. Creates a simple k/v store.
+
+
+$ mkdir searchbird; cd searchbird
+$ scala-bootstrapper searchbird
+$ find . -type f
+./Capfile
+./config/development.scala
+./config/production.scala
+./config/staging.scala
+./config/test.scala
+./Gemfile
+./project/build/SearchbirdProject.scala
+./project/build.properties
+./project/plugins/Plugins.scala
+./run
+./src/main/scala/com/twitter/searchbird/config/SearchbirdServiceConfig.scala
+./src/main/scala/com/twitter/searchbird/Main.scala
+./src/main/scala/com/twitter/searchbird/SearchbirdServiceImpl.scala
+./src/main/thrift/searchbird.thrift
+./src/scripts/console
+./src/scripts/searchbird.sh
+./src/test/scala/com/twitter/searchbird/AbstractSpec.scala
+./src/test/scala/com/twitter/searchbird/SearchbirdServiceSpec.scala
+
+
+This creates the project skeleton listed above.
+
+* Change Finagle and util to the latest versions (1.8.3 and 1.11.2, respectively, at the time of writing)
+
+h2. Exploring the default bootstrapper project
+
+Let's first explore the default project that scala-bootstrapper creates for us. This is meant as a template. You'll end up substituting most of it, but it serves as a convenient scaffold. It defines a simple (but complete) key-value store. Configuration, a Thrift interface, stats export and logging are all included.
+
+Since searchbird is a "Thrift":http://thrift.apache.org/ service (like most of our services), its external interface is defined in the Thrift IDL, in src/main/thrift/searchbird.thrift:
+
+
+service SearchbirdService {
+  string get(1: string key) throws(1: SearchbirdException ex)
+
+  void put(1: string key, 2: string value)
+}
+
+
+This is pretty straightforward: our service SearchbirdService exports two RPC methods, get and put. Together they form a simple interface to a key-value store.
+
+Now run the default service and explore it through this interface.
+
+First, build the project and run the service (its "main" method is also the default one that sbt will run).
+
+
+$ sbt
+…
+> update
+…
+> compile
+> run -f config/development.scala
+
+
+In another window:
+
+
+$ gem install thrift_client --version '=0.6'
+$ src/scripts/console
+Hint: the client is in the variable `$client`
+No servers specified, using 127.0.0.1:9999
+> $client.put("foo", "bar")
+nil
+> $client.get("foo")
+"bar"
+>
+
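
The put/get round trip in the console session above can also be modeled without a running server. What follows is only a sketch of the contract in searchbird.thrift, not the service's actual implementation (which is Scala): the class @InMemorySearchbird@ is hypothetical, the Ruby @SearchbirdException@ is a stand-in for the exception declared in the IDL, and raising it on a missing key is an assumption, since the IDL only says that get may throw.

```ruby
# Hypothetical in-memory model of the get/put contract from searchbird.thrift.
# The real service is written in Scala; nothing here is generated Thrift code.

# Stand-in for the SearchbirdException declared in the IDL.
class SearchbirdException < StandardError; end

class InMemorySearchbird
  def initialize
    @store = {}
  end

  # Returns the stored value. Raising on a missing key is an assumption;
  # the IDL only says that get may throw SearchbirdException.
  def get(key)
    @store.fetch(key) { raise SearchbirdException, "No such key: #{key}" }
  end

  # Stores the value and returns nil, mirroring the void put in the IDL.
  def put(key, value)
    @store[key] = value
    nil
  end
end

client = InMemorySearchbird.new
client.put("foo", "bar")
puts client.get("foo")            # prints: bar

begin
  client.get("missing")
rescue SearchbirdException => e
  puts "error: #{e.message}"      # prints: error: No such key: missing
end
```

As in the console session, put returns nil and get returns the value stored under the key.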
+
+The server also exports runtime statistics. These are convenient both for inspecting individual servers and for aggregating into global service statistics (a machine-readable JSON interface is also provided).
+
+
+$ curl localhost:9900/stats.txt
+counters:
+  Searchbird/connects: 2
+  Searchbird/requests: 5
+  Searchbird/success: 5
+  jvm_gc_ConcurrentMarkSweep_cycles: 2
+  jvm_gc_ConcurrentMarkSweep_msec: 102
+  jvm_gc_ParNew_cycles: 9
+  jvm_gc_ParNew_msec: 210
+  jvm_gc_cycles: 11
+  jvm_gc_msec: 312
+gauges:
+  Searchbird/connections: 0
+  Searchbird/pending: 0
+  jvm_fd_count: 147
+  jvm_fd_limit: 10240
+  jvm_heap_committed: 588251136
+  jvm_heap_max: 3220570112
+  jvm_heap_used: 39530208
+  jvm_nonheap_committed: 81481728
+  jvm_nonheap_max: 1124073472
+  jvm_nonheap_used: 69312424
+  jvm_num_cpus: 4
+  jvm_post_gc_CMS_Old_Gen_used: 5970824
+  jvm_post_gc_CMS_Perm_Gen_used: 46407832
+  jvm_post_gc_Par_Eden_Space_used: 0
+  jvm_post_gc_Par_Survivor_Space_used: 0
+  jvm_post_gc_used: 52378656
+  jvm_start_time: 1314124442749
+  jvm_thread_count: 14
+  jvm_thread_daemon_count: 8
+  jvm_thread_peak_count: 14
+  jvm_uptime: 404221
+labels:
+metrics:
+  Searchbird/connection_duration: (average=25115, count=2, maximum=52068, minimum=142, p25=142, p50=142, p75=52068, p90=52068, p95=52068, p99=52068, p999=52068, p9999=52068, sum=50230)
+  Searchbird/connection_received_bytes: (average=84, count=2, maximum=142, minimum=29, p25=29, p50=29, p75=142, p90=142, p95=142, p99=142, p999=142, p9999=142, sum=169)
+  Searchbird/connection_requests: (average=2, count=2, maximum=4, minimum=1, p25=1, p50=1, p75=4, p90=4, p95=4, p99=4, p999=4, p9999=4, sum=5)
+  Searchbird/connection_sent_bytes: (average=61, count=2, maximum=95, minimum=23, p25=23, p50=23, p75=95, p90=95, p95=95, p99=95, p999=95, p9999=95, sum=123)
+  Searchbird/request_latency_ms: (average=20, count=5, maximum=95, minimum=1, p25=1, p50=2, p75=8, p90=95, p95=95, p99=95, p999=95, p9999=95, sum=103)
+
+
+In addition to our own service statistics, we get some generic JVM stats that are often useful.
+
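
The stats.txt output shown above is plain text: unindented section names (counters, gauges, labels, metrics) followed by indented @name: value@ lines, so a script can pick it apart easily. Here is a minimal sketch that assumes exactly that layout. @parse_stats@ is a hypothetical helper, not part of searchbird or the bootstrapper, and the sample is abbreviated from the output above.

```ruby
# Hypothetical helper: parse stats.txt-style text into a nested hash, e.g.
# { "counters" => { "Searchbird/requests" => 5, ... }, "gauges" => { ... } }.
# Assumes the layout shown in the curl output: section headers at column 0,
# entries indented beneath them.
def parse_stats(text)
  stats = {}
  section = nil
  text.each_line do |line|
    line = line.chomp
    next if line.strip.empty?
    if line =~ /\A(\S+):\s*\z/
      # An unindented "name:" line starts a new section.
      section = stats[$1] = {}
    elsif section && line =~ /\A\s+(\S+):\s*(.+)\z/
      name, value = $1, $2
      # Plain integers become Integer; anything else (such as the
      # parenthesized metric summaries) is kept as a string.
      section[name] = value =~ /\A-?\d+\z/ ? value.to_i : value
    end
  end
  stats
end

# Abbreviated sample taken from the output above.
SAMPLE = <<~STATS
  counters:
    Searchbird/connects: 2
    Searchbird/requests: 5
    Searchbird/success: 5
  gauges:
    Searchbird/pending: 0
    jvm_thread_count: 14
  labels:
  metrics:
    Searchbird/connection_requests: (average=2, count=2, maximum=4, minimum=1)
STATS

stats = parse_stats(SAMPLE)
puts stats["counters"]["Searchbird/requests"]   # prints: 5
puts stats["gauges"]["jvm_thread_count"]        # prints: 14
```

For anything beyond a quick look, the JSON interface mentioned earlier is likely the better input, since it avoids depending on the text layout.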