Text (source code) search engine with indexer and a front end web interface to search. Uses Python 3.

cbess/text-sherlock

Text Sherlock (or Sherlock)

Text Sherlock provides a fast, easy-to-install, and simple-to-use search engine for text, optimized for source code. An alternative, OpenGrok, is more feature-rich but takes much longer to install (though it may be worth it for some). Text Sherlock gives you a much easier setup, a text indexer, and a web app interface for searching, all with very little effort.

Soli Deo Gloria

Basic Setup

Instructions:

  1. Download Sherlock source from GitHub.
  2. Extract/place the Sherlock source code in the desired (install) directory. This will be where Sherlock lives.
  3. Run sh setup/virtualenv-setup.sh to set up an isolated environment and download core packages.
  4. Configure settings. The defaults in settings.py provide documentation for each setting.
    • Copy example.local_settings.yml to local_settings.yml.
    • Override/copy any setting from settings.py to local_settings.yml (change the values as needed). All YAML keys/options must be lowercase.
  5. Run source sherlock_env/bin/activate to enter the virtual environment.
  6. Run python main.py --index update (or python main.py --index rebuild) to index the path specified in the settings. Watch the indexing output.
  7. Run python main.py --runserver to start the web server.
  8. Go to http://localhost:7777 to access the web interface, which uses the Bootstrap toolkit for its UI.
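
Steps 6 and 7 can also be driven from a small script. The following is a minimal sketch, not part of Sherlock itself: the helper names `sherlock_command` and `index_then_serve` are hypothetical, and it assumes the script is run from the install directory with the virtual environment already activated. Only the `--index update`, `--index rebuild`, and `--runserver` options come from the steps above.

```python
import subprocess
import sys


def sherlock_command(*args):
    """Build an argument list that runs main.py with the current interpreter."""
    return [sys.executable, "main.py"] + list(args)


def index_then_serve(rebuild=False, run=subprocess.run):
    """Index the configured path, then start the web server.

    `run` is injectable so the command sequence can be inspected
    without actually executing anything.
    """
    mode = "rebuild" if rebuild else "update"
    # check=True stops the sequence if indexing exits with an error.
    run(sherlock_command("--index", mode), check=True)
    run(sherlock_command("--runserver"), check=True)
```

Calling `index_then_serve()` updates the index and then starts the server; pass `rebuild=True` to rebuild the index first instead.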

You may need to install some packages before an Ubuntu installation will run without error.

  • Install curl: sudo apt-get install curl
  • Install uuid libs: sudo apt-get install uuid-dev
  • Install the Python 3 development headers: sudo apt-get install python3-dev

Includes:

  • Settings/Configuration
  • Setup script (read contents of script for more information)
  • Main controller script
    • Run main.py -h for more information.
  • End-to-end interface
    • Indexing and searching text (source code). Built-in support for whoosh (fast searching) or xapian (much faster searching).
      • Easily extend indexing or searching via custom backends.
    • Front end web app served using werkzeug or cheroot.
      • werkzeug is suited to development and low-traffic use.
      • cheroot is the high-performance, pure-Python HTTP server used by CherryPy.
    • Settings and configuration using Python.

Web Interface

Features:

These options are appended to a document URL.

  • To highlight lines, append to URL: &hl=3,7,12-14,21
  • To jump to a line, append to end of URL: #line-3
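
As an illustration of the two URL options above, here is a small sketch that builds such a URL and expands a highlight spec into individual line numbers. The function names, the example document URL, and its `path` parameter are hypothetical; only the `hl` parameter and the `#line-N` fragment come from the list above. How Sherlock itself parses `hl` is not shown here, so `parse_highlight` is an assumption about the format's meaning.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def parse_highlight(spec):
    """Expand a spec such as "3,7,12-14,21" into a sorted list of line numbers."""
    lines = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            # A range such as "12-14" is treated as inclusive on both ends.
            start, end = (int(n) for n in part.split("-", 1))
            lines.update(range(start, end + 1))
        else:
            lines.add(int(part))
    return sorted(lines)


def with_highlight(url, spec=None, jump_to=None):
    """Return `url` with an hl parameter and/or a #line-N fragment added."""
    scheme, netloc, path, query, fragment = urlsplit(url)
    params = parse_qsl(query, keep_blank_values=True)
    if spec:
        # Replace any existing hl value rather than appending a second one.
        params = [(k, v) for k, v in params if k != "hl"] + [("hl", spec)]
    if jump_to is not None:
        fragment = "line-{}".format(jump_to)
    # Keep commas readable instead of percent-encoding them.
    new_query = urlencode(params, safe=",")
    return urlunsplit((scheme, netloc, path, new_query, fragment))
```

For example, `with_highlight(doc_url, "3,7,12-14,21", jump_to=3)` adds both the highlight parameter and the line anchor to an existing document URL.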

Using other backends

In settings.py:

  • Change the default_indexer and default_searcher values to match the name given to the backend.
    • Possible values:
      • whoosh: the default; no extra work needed.
      • xapian: must be installed separately using the included setup/install-xapian.sh script.

Using other web servers

Text Sherlock has built-in support for two WSGI-compliant servers: werkzeug and cheroot.

In settings.py:

  • Change the server_type value to one of the available server types.
    • Possible values:
      • default: the werkzeug web server (default).
      • cheroot: a production-ready web server.
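
Before editing the backend and server settings, it can help to sanity-check the values you plan to use. The sketch below is not part of Sherlock: `ALLOWED_VALUES` and `check_overrides` are hypothetical names, and the allowed values cover only the built-in options listed in this document. A custom backend would have its own name and would need to be added to the table.

```python
# Built-in values named in this README; custom backends are not covered.
ALLOWED_VALUES = {
    "default_indexer": {"whoosh", "xapian"},
    "default_searcher": {"whoosh", "xapian"},
    "server_type": {"default", "cheroot"},
}


def check_overrides(overrides):
    """Return a list of human-readable problems found in a settings dict."""
    problems = []
    for key, value in overrides.items():
        # Setting keys are expected to be lowercase.
        if key != key.lower():
            problems.append("{}: key must be lowercase".format(key))
        allowed = ALLOWED_VALUES.get(key.lower())
        if allowed is not None and value not in allowed:
            problems.append(
                "{}: {!r} is not one of {}".format(key, value, sorted(allowed))
            )
    return problems
```

An empty list means no problems were found; for instance, `check_overrides({"default_indexer": "xapian", "server_type": "cheroot"})` returns `[]`.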

Core packages

Requires Python 3.5+

Other References

Project Goals

  1. Provide an easy-to-set-up, fast, and adequate text search engine solution.
  2. Be a respectable alternative to OpenGrok.
  3. Influence the OpenGrok contributors to provide a simpler setup process.
    • I successfully set up two installations of OpenGrok, on CentOS and Ubuntu 11.x. Each took more than two hours. Text Sherlock setup takes less than 5 minutes (excluding package download time).

Contributors