Entity-style text parsing. Quesadilla was extracted from Cheddar.
See the Cheddar text guide for more information about how to type entities.
Quesadilla's API is fully documented. Read the online documentation.
Add this line to your application's Gemfile:
gem 'quesadilla'
And then execute:
$ bundle
Or install it yourself as:
$ gem install quesadilla
To extract entities from text, simply call extract:
Quesadilla.extract('Some #awesome text')
# => {
#   display_text: "Some #awesome text",
#   display_html: "Some <a href=\"#hashtag-awesome\" class=\"tag\">#awesome</a> text",
#   entities: [
#     {
#       type: "hashtag",
#       text: "#awesome",
#       display_text: "#awesome",
#       indices: [5, 13],
#       hashtag: "awesome",
#       display_indices: [5, 13]
#     }
#   ]
# }
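The index pairs can be used to slice an entity back out of its string. Here is a minimal sketch, assuming (as the sample output suggests) that each pair is a start offset plus an exclusive end offset. The hash is copied by hand from the example rather than produced by calling the gem, and `entity_slices` is a hypothetical helper name, not part of Quesadilla.

```ruby
# Result hash copied by hand from the example output
# (trimmed to the keys this sketch uses).
EXTRACTION = {
  display_text: 'Some #awesome text',
  entities: [
    {
      type: 'hashtag',
      text: '#awesome',
      display_text: '#awesome',
      indices: [5, 13],
      hashtag: 'awesome',
      display_indices: [5, 13]
    }
  ]
}

# Hypothetical helper: returns [type, substring] pairs by slicing the display
# text with each entity's display_indices. Assumes the end offset is exclusive.
def entity_slices(extraction)
  extraction[:entities].map do |entity|
    start, finish = entity[:display_indices]
    [entity[:type], extraction[:display_text][start...finish]]
  end
end

entity_slices(EXTRACTION).each do |type, slice|
  puts "#{type}: #{slice}"
end
```

For the sample data this prints `hashtag: #awesome`, which matches the entity's `display_text`.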
Quesadilla supports extracting various span-level Markdown features as well as automatically detecting links and GitHub-style named emoji. Here is the list of options you can pass when extracting:
Option | Description | Default |
---|---|---|
:markdown | All Markdown parsing | true |
:markdown_code | Markdown code tags | true |
:markdown_links | Markdown links (including <http://soff.es> style links) | true |
:markdown_triple_emphasis | Markdown bold italic | true |
:markdown_double_emphasis | Markdown bold | true |
:markdown_emphasis | Markdown italic | true |
:markdown_strikethrough | Markdown Extra strikethrough | true |
:hashtags | Hashtags | true |
:hashtags_validator | Callable object to validate hashtags | nil |
:autolinks | Automatically detect links | true |
:emoji | GitHub-style named emoji | true |
:users | User mentions | false |
:user_validator | Callable object to validate usernames | nil |
:html | Generate HTML representations for entities and the entire string | true |
Everything is enabled by default except user mentions. If you don't want to extract Markdown, call the extractor like this:
Quesadilla.extract('Some text', markdown: false)
You can also just disable strikethrough and still extract the rest of the Markdown entities if you want:
Quesadilla.extract('Some text', markdown_strikethrough: false)
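If you want only a few features on, it can be handy to build the options hash in one place. The following is a rough sketch, not part of Quesadilla: `only_features` and `FEATURE_OPTIONS` are hypothetical names, and the keys are the top-level switches from the options table. It assumes the options can be passed as a plain hash, as in the calls above.

```ruby
# Top-level feature switches taken from the options table.
FEATURE_OPTIONS = [:markdown, :hashtags, :autolinks, :emoji, :users].freeze

# Hypothetical helper: builds an options hash that turns on only the named
# features and explicitly turns off the rest.
def only_features(*enabled)
  unknown = enabled - FEATURE_OPTIONS
  raise ArgumentError, "unknown feature(s): #{unknown.inspect}" unless unknown.empty?

  FEATURE_OPTIONS.each_with_object({}) do |feature, options|
    options[feature] = enabled.include?(feature)
  end
end

options = only_features(:hashtags, :emoji)
puts options.inspect

# With the gem loaded, the hash would be passed along like this (assumption:
# extract accepts the options as a trailing hash):
# Quesadilla.extract('Some #awesome text', options)
```

Spelling out every switch keeps the call independent of the defaults, at the cost of a slightly longer hash.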
If you want to change the generated HTML, you can create a custom renderer:
class CustomRenderer < Quesadilla::HTMLRenderer
  def hashtag(display_text, hashtag)
    %Q{<a href="http://example.com/tags/#{hashtag}" class="tag">#{display_text}</a>}
  end
end
extraction = Quesadilla.extract('Some #awesome text', html_renderer: CustomRenderer)
extraction[:display_html] #=> 'Some <a href="http://example.com/tags/awesome" class="tag">#awesome</a> text'
Take a look at Quesadilla::HTMLRenderer for more details on creating a custom renderer.
To enable user mention extraction, pass users: true as an option. You can optionally pass a callable object to validate a username. Here's a quick example:
validator = lambda do |username|
  User.where('LOWER(username) = ?', username.downcase).first.try(:id)
end
extraction = Quesadilla.extract('Real @soffes and fake @nobody', users: true, user_validator: validator)
Assuming there is a user named soffes in your database, it would extract @soffes. Assuming there isn't a user named nobody, that would remain plain text. Feel free to do whatever you want here. Quesadilla makes no assumptions about your user system.
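A validator does not have to talk to a database. Here is a minimal sketch using an in-memory lookup instead, assuming (based on the example above, which returns an id or nil) that a truthy return value marks the username as valid. `KNOWN_USERS` and `IN_MEMORY_VALIDATOR` are hypothetical names used only for illustration.

```ruby
# Hypothetical stand-in for a users table: lowercase username => id.
KNOWN_USERS = { 'soffes' => 1 }.freeze

# Returns the user's id when the username is known and nil otherwise,
# mirroring the id-or-nil shape of the database-backed validator.
IN_MEMORY_VALIDATOR = lambda do |username|
  KNOWN_USERS[username.downcase]
end

puts IN_MEMORY_VALIDATOR.call('Soffes').inspect
puts IN_MEMORY_VALIDATOR.call('nobody').inspect

# With the gem loaded, the validator would be passed like this:
# Quesadilla.extract('Real @soffes and fake @nobody',
#                    users: true, user_validator: IN_MEMORY_VALIDATOR)
```

The lookup is case-insensitive because the key is downcased before the hash is consulted, the same approach the database example takes with `LOWER(username)`.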
User and hashtag detection use the twitter-text gem. It has some limits you may not expect: for example, usernames can't be more than 20 characters and hashtags can't contain certain characters.
Quesadilla is tested under Ruby 1.9.2, 1.9.3, and 2.0.0, JRuby 1.7.2 (1.9 mode), and Rubinius 2.0.0 (1.9 mode).
See the contributing guide.