Skip to content

Commit

Permalink
Merge branch 'v1.1.0-pending-release' of github.com:logstash/logstash…
Browse files Browse the repository at this point in the history
… into v1.1.0-pending-release
  • Loading branch information
jordansissel committed Jan 31, 2012
2 parents e88e5e6 + c742ba8 commit 0ce9692
Show file tree
Hide file tree
Showing 7 changed files with 211 additions and 113 deletions.
3 changes: 2 additions & 1 deletion Makefile
Original file line number Diff line number Diff line change
Expand Up @@ -200,7 +200,8 @@ build/docs/%: docs/% lib/logstash/version.rb Makefile

build/docs/index.html: $(addprefix build/docs/,$(subst lib/logstash/,,$(subst .rb,.html,$(PLUGIN_FILES))))
build/docs/index.html: docs/generate_index.rb lib/logstash/version.rb docs/index.html.erb Makefile
ruby $< build/docs > $@
@echo "Building documentation index.html"
$(QUIET)ruby $< build/docs > $@
$(QUIET)sed -i -re 's/%VERSION%/$(VERSION)/g' $@
$(QUIET)sed -i -re 's/%ELASTICSEARCH_VERSION%/$(ELASTICSEARCH_VERSION)/g' $@

Expand Down
229 changes: 137 additions & 92 deletions docs/tutorials/just-enough-amqp-for-logstash.md
Original file line number Diff line number Diff line change
Expand Up @@ -3,36 +3,46 @@ title: Just Enough AMQP - logstash
layout: content_right
---

While configuring your AMQP broker is out of scope for logstash, it's important to understand how
logstash uses AMQP. To do that, we need to understand a little about AMQP.
While configuring your AMQP broker is out of scope for logstash, it's important
to understand how logstash uses AMQP. To do that, we need to understand a
little about AMQP.

You should also consider reading [this](http://www.rabbitmq.com/tutorials/amqp-concepts.html) at the RabbitMQ website.
You should also consider reading
[this](http://www.rabbitmq.com/tutorials/amqp-concepts.html) at the RabbitMQ
website.

# Exchanges, queues and bindings; OH MY!

You can get a long way by understanding a few key terms.

## Exchanges
Exchanges are for message **producers**. In Logstash, we map these to **outputs**.
Logstash puts messages on exchanges.
There are many types of exchanges and they are discussed below.

Exchanges are for message **producers**. In Logstash, we map these to
**outputs**. Logstash puts messages on exchanges. There are many types of
exchanges and they are discussed below.

## Queues

Queues are for message **consumers**. In Logstash, we map these to inputs.
Logstash reads messages from queues.
Optionally, queues can consume only a subset of messages. This is done with "routing keys".
Logstash reads messages from queues. Optionally, queues can consume only a
subset of messages. This is done with "routing keys".

## Bindings
Just having a producer and a consumer is not enough. We must `bind` a queue to an exchange.
When we bind a queue to an exchange, we can optionally provide a routing key.
Routing keys are discussed below.

Just having a producer and a consumer is not enough. We must `bind` a queue to
an exchange. When we bind a queue to an exchange, we can optionally provide a
routing key. Routing keys are discussed below.

## Broker
A broker is simply the AMQP server software. There are several brokers but the most common (and arguably popular) is [RabbitMQ](http://www.rabbitmq.com).

A broker is simply the AMQP server software. There are several brokers but the
most common (and arguably popular) is [RabbitMQ](http://www.rabbitmq.com).
Some others are Apache Qpid (and the commercial version - RedHat MRG)

# Routing Keys
Simply put, routing keys are somewhat like tags for messages. In practice, they are hierarchical in nature
with the each level separated by a dot:

Simply put, routing keys are somewhat like tags for messages. In practice, they
are hierarchical in nature with the each level separated by a dot:

- `messages.servers.production`
- `sports.atlanta.baseball`
Expand All @@ -47,111 +57,146 @@ can programatically define the routing key for a given event using the metadata

From a consumer/queue perspective, routing keys also support two types wildcards - `#` and `*`.

- `*` matches any single word.
- `#` matches any number of words and behaves like a traditional wildcard.
- `*` (asterisk) matches any single word.
- `#` (hash) matches any number of words and behaves like a traditional wildcard.

Using the above examples, if you wanted to bind to an exchange and see messages for just production,
you would use the routing key `logs.servers.production.*`. If you wanted to see messages for host1, regardless of environment
you could use `logs.servers.%.host1.#`.
Using the above examples, if you wanted to bind to an exchange and see messages
for just production, you would use the routing key `logs.servers.production.*`.
If you wanted to see messages for host1, regardless of environment you could
use `logs.servers.%.host1.#`.

Wildcards can be a bit confusing but a good general rule to follow is to use `*` in places where you need wildcards for a known element.
Use `#` when you need to match any remaining placeholders. Note that wildcards in routing keys only make sense on the consumer/queue binding,
not in the publishing/exchange side.
Wildcards can be a bit confusing but a good general rule to follow is to use
`*` in places where you need wildcards for a known element. Use `#` when you
need to match any remaining placeholders. Note that wildcards in routing keys
only make sense on the consumer/queue binding, not in the publishing/exchange
side.

We'll get into some of that neat stuff below. For now, it's enough to understand the general idea behind routing keys.
We'll get into some of that neat stuff below. For now, it's enough to
understand the general idea behind routing keys.

# Exchange types

There are three primary types of exchanges that you'll see.

## Direct
A direct exchange is one that is probably most familiar to people. Message comes in and, assuming there is a queue bound, the message is picked up.
You can have multiple queues bound to the same direct exchange. The best way to understand this pattern is pool of workers (queues) that read from a
direct exchange to get units of work. Only one consumer will see a given message in a direct exchange.

You can set routing keys on messages published to a direct exchange. This allows you do have workers that do different tasks read from the same global
A direct exchange is one that is probably most familiar to people. Message
comes in and, assuming there is a queue bound, the message is picked up. You
can have multiple queues bound to the same direct exchange. The best way to
understand this pattern is pool of workers (queues) that read from a direct
exchange to get units of work. Only one consumer will see a given message in a
direct exchange.

You can set routing keys on messages published to a direct exchange. This
allows you do have workers that do different tasks read from the same global
pool of messages yet consume only the ones they know how to handle.

The RabbitMQ concepts guide (linked below) does a good job of describing this visually [here](http://www.rabbitmq.com/img/tutorials/intro/exchange-direct.png)
The RabbitMQ concepts guide (linked below) does a good job of describing this
visually
[here](http://www.rabbitmq.com/img/tutorials/intro/exchange-direct.png)

## Fanout
Fanouts are another type of exchange. Unlike direct exchanges, every queue bound to a fanout exchange will see the same messages.
This is best described as a PUB/SUB pattern. This is helpful when you need broadcast messages to multiple interested parties.

Fanout exchanges do NOT support routing keys. All bound queues see all messages.
Fanouts are another type of exchange. Unlike direct exchanges, every queue
bound to a fanout exchange will see the same messages. This is best described
as a PUB/SUB pattern. This is helpful when you need broadcast messages to
multiple interested parties.

Fanout exchanges do NOT support routing keys. All bound queues see all
messages.

## Topic
Topic exchanges are special type of fanout exchange. Fanout exchanges don't support routing keys. Topic exchanges do support them.
Just like a fanout exchange, all bound queues see all messages with the additional filter of the routing key.

Topic exchanges are special type of fanout exchange. Fanout exchanges don't
support routing keys. Topic exchanges do support them. Just like a fanout
exchange, all bound queues see all messages with the additional filter of the
routing key.

# AMQP in logstash
As stated earlier, in Logstash, Outputs publish to Exchanges. Inputs read from Queues that are bound to Exchanges.
Logstash uses the `bunny` AMQP library for interaction with a broker. Logstash endeavors to expose as much of the configuration for both exchanges and queues.
There are many different tunables that you might be concerned with setting - including things like message durability or persistence of declared queues/exchanges.
See the relevant input and output documentation for AMQP for a full list of tunables.

As stated earlier, in Logstash, Outputs publish to Exchanges. Inputs read from
Queues that are bound to Exchanges. Logstash uses the `bunny` AMQP library for
interaction with a broker. Logstash endeavors to expose as much of the
configuration for both exchanges and queues. There are many different tunables
that you might be concerned with setting - including things like message
durability or persistence of declared queues/exchanges. See the relevant input
and output documentation for AMQP for a full list of tunables.

# Sample configurations, tips, tricks and gotchas
There are several examples in the logstash source directory of AMQP usage, however a few general rules might help eliminate any issues.

There are several examples in the logstash source directory of AMQP usage,
however a few general rules might help eliminate any issues.

## Check your bindings
If logstash is publishing the messages and logstash is consuming the messages, the `exchange` value for the input should match the `name` in the output.

If logstash is publishing the messages and logstash is consuming the messages,
the `exchange` value for the input should match the `name` in the output.

sender agent

```
input { stdin { type = "test" } }
output {
amqp {
name => "test_exchange"
host => "my_amqp_server"
exchange_type => "fanout"
}
}
```
input { stdin { type = "test" } }
output {
amqp {
name => "test_exchange"
host => "my_amqp_server"
exchange_type => "fanout"
}
}

receiver agent

```
input {
amqp {
name => "test_queue"
host => "my_amqp_server"
exchange => "test_exchange" # This matches the exchange declared above
}
}
output { stdout { debug => true }}
```
input {
amqp {
name => "test_queue"
host => "my_amqp_server"
exchange => "test_exchange" # This matches the exchange declared above
}
}
output { stdout { debug => true }}

## Message persistence
By default, logstash will attempt to ensure that you don't lose any messages. This is reflected in the AMQP default settings as well.
However there are cases where you might not want this. A good example is where AMQP is not your primary method of shipping.

In the following example, we use AMQP as a sniffing interface. Our primary destination is the embedded ElasticSearch instance. We have
a secondary AMQP output that we use for duplicating messages. However we disable persistence and durability on this interface so that messages
don't pile up waiting for delivery. We only use AMQP when we want to watch messages in realtime. Additionally, we're going to leverage
routing keys so that we can optionally filter incoming messages to subsets of hosts. The exercise of getting messages to this logstash
agent are left up to the user.

```
input { # some input definition here}
output {
elasticsearch { embedded => true }
amqp {
name => "logtail"
host => "my_amqp_server"
exchange_type => "topic" # We use topic here to enable pub/sub with routing keys
key => "logs.%{host}"
durable => false # If rabbitmq restarts, the exchange disappears.
auto_delete => true # If logstash disconnects, the exchange goes away
persistent => false # Messages are not persisted to disk
}
}
```

Now if you want to stream logs in realtime, you can use the programming language of your choice to bind a queue to the `logtail` exchange.
If you do not specify a routing key, you will see every message that comes in to logstash. However, you can specify a routing key like
`logs.apache1` and see only messages from host `apache1`.

Note that any logstash variable is valid in the key definition. This allows you to create really complex routing key hierarchies for advanced filtering.

Note that RabbitMQ has specific rules about durability and persistence matching on both the queue and exchange. You should read the RabbitMQ documentation
to make sure you don't crash your RabbitMQ server with messages awaiting someone to pick them up.

By default, logstash will attempt to ensure that you don't lose any messages.
This is reflected in the AMQP default settings as well. However there are
cases where you might not want this. A good example is where AMQP is not your
primary method of shipping.

In the following example, we use AMQP as a sniffing interface. Our primary
destination is the embedded ElasticSearch instance. We have a secondary AMQP
output that we use for duplicating messages. However we disable persistence and
durability on this interface so that messages don't pile up waiting for
delivery. We only use AMQP when we want to watch messages in realtime.
Additionally, we're going to leverage routing keys so that we can optionally
filter incoming messages to subsets of hosts. The exercise of getting messages
to this logstash agent are left up to the user.

input {
# some input definition here
}

output {
elasticsearch { embedded => true }
amqp {
name => "logtail"
host => "my_amqp_server"
exchange_type => "topic" # We use topic here to enable pub/sub with routing keys
key => "logs.%{host}"
durable => false # If rabbitmq restarts, the exchange disappears.
auto_delete => true # If logstash disconnects, the exchange goes away
persistent => false # Messages are not persisted to disk
}
}

Now if you want to stream logs in realtime, you can use the programming
language of your choice to bind a queue to the `logtail` exchange. If you do
not specify a routing key, you will see every message that comes in to
logstash. However, you can specify a routing key like `logs.apache1` and see
only messages from host `apache1`.

Note that any logstash variable is valid in the key definition. This allows you
to create really complex routing key hierarchies for advanced filtering.

Note that RabbitMQ has specific rules about durability and persistence matching
on both the queue and exchange. You should read the RabbitMQ documentation to
make sure you don't crash your RabbitMQ server with messages awaiting someone
to pick them up.
21 changes: 13 additions & 8 deletions lib/logstash/filters/grep.rb
Original file line number Diff line number Diff line change
Expand Up @@ -56,12 +56,6 @@ def filter(event)
@logger.debug("Running grep filter", :event => event, :config => config)
matches = 0
@patterns.each do |field, regexes|
if !event[field]
@logger.debug("Skipping match object, field not present",
:field => field, :event => event)
next
end

# For each match object, we have to match everything in order to
# apply any fields/tags.
match_count = 0
Expand All @@ -70,8 +64,19 @@ def filter(event)
match_want += 1

# Events without this field, with negate enabled, count as a match.
if event[field].nil? and @negate == true
match_count += 1
# With negate disabled, we can't possibly match, so skip ahead.
if event[field].nil?
if @negate
msg = "Field not present, but negate is true; marking as a match"
@logger.debug(msg, :field => field, :event => event)
match_count += 1
else
@logger.debug("Skipping match object, field not present",
:field => field, :event => event)
end
# Either way, don't try to process -- may end up with extra unwanted
# +1's to match_count
next
end

(event[field].is_a?(Array) ? event[field] : [event[field]]).each do |value|
Expand Down
12 changes: 6 additions & 6 deletions lib/logstash/inputs/amqp.rb
Original file line number Diff line number Diff line change
Expand Up @@ -27,17 +27,17 @@ class LogStash::Inputs::Amqp < LogStash::Inputs::Base
config :password, :validate => :password, :default => "guest"

# The name of the queue.
config :name, :validate => :string, :default => ''
config :name, :validate => :string, :default => ""

# The name of the exchange to bind the queue. This is analogous to the 'amqp
# output' [config 'name'](../outputs/amqp)
config :exchange, :validate => :string, :required => true

# The routing key to use. This is only valid for direct or fanout exchanges
# This setting is ignored on topic exchanges.
#
# If you don't know what this setting is for, leave it default.
config :key, :validate => :string, :default => '#'
#
# * Routing keys are ignored on topic exchanges.
# * Wildcards are not valid on direct exchanges.
config :key, :validate => :string, :default => "logstash"

# The vhost to use. If you don't know what this is, leave the default.
config :vhost, :validate => :string, :default => "/"
Expand Down Expand Up @@ -98,7 +98,7 @@ def register
@amqpsettings[:ssl] = @ssl if @ssl
@amqpsettings[:verify_ssl] = @verify_ssl if @verify_ssl
@amqpurl = "amqp://"
amqp_credentials = ''
amqp_credentials = ""
amqp_credentials << @user if @user
amqp_credentials << ":#{@password}" if @password
@amqpurl += amqp_credentials unless amqp_credentials.nil?
Expand Down
3 changes: 1 addition & 2 deletions lib/logstash/outputs/amqp.rb
Original file line number Diff line number Diff line change
Expand Up @@ -31,9 +31,8 @@ class LogStash::Outputs::Amqp < LogStash::Outputs::Base
config :name, :validate => :string, :required => true

# Key to route to by default. Defaults to 'logstash'
# This key setting is ignored for topic exchanges.
#
# If you don't know what this setting is for, leave it default.
# * Routing keys are ignored on topic exchanges.
config :key, :validate => :string, :default => "logstash"

# The vhost to use
Expand Down
Loading

0 comments on commit 0ce9692

Please sign in to comment.