This repository has been archived by the owner on Dec 1, 2017. It is now read-only.

AutoTagger

What is it?

AutoTagger is an experimental automatic content tagger for GOV.UK pages. It is built on the Ankusa gem and uses a naive Bayes classifier.

It attempts to determine correct tags for a page by learning from other, manually tagged pages.
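
To illustrate the idea of learning tags from manually tagged pages, here is a minimal naive Bayes sketch in plain Ruby. It is not Ankusa's code or API, and it is not how this project is necessarily wired up. The class name TinyNaiveBayes, the tokenizer, the use of add-one smoothing and the training sentences are all assumptions made for the example.

```ruby
# Illustrative multinomial naive Bayes with add-one (Laplace) smoothing.
# Hypothetical class; NOT the Ankusa gem's implementation or API.
class TinyNaiveBayes
  def initialize
    @word_counts = Hash.new { |h, tag| h[tag] = Hash.new(0) } # tag => word => count
    @doc_counts  = Hash.new(0)                                 # tag => number of training pages
    @vocabulary  = {}                                          # set of all words seen in training
  end

  # Learn from one manually tagged page.
  def train(tag, text)
    @doc_counts[tag] += 1
    tokenize(text).each do |word|
      @word_counts[tag][word] += 1
      @vocabulary[word] = true
    end
  end

  # Return the tag with the highest log-probability for the given text.
  def classify(text)
    words = tokenize(text)
    total_docs = @doc_counts.values.sum.to_f

    scores = @doc_counts.keys.map do |tag|
      total_words = @word_counts[tag].values.sum
      log_prob = Math.log(@doc_counts[tag] / total_docs) # prior P(tag)
      words.each do |word|
        count = @word_counts[tag].fetch(word, 0)
        # Smoothed likelihood P(word | tag); unseen words never give zero.
        log_prob += Math.log((count + 1.0) / (total_words + @vocabulary.size))
      end
      [tag, log_prob]
    end

    best = scores.max_by { |_tag, score| score }
    best && best.first
  end

  private

  # Very simple tokenizer (an assumption): lowercase alphanumeric words.
  def tokenize(text)
    text.downcase.scan(/[a-z0-9']+/)
  end
end

# Made-up training data standing in for manually tagged pages.
classifier = TinyNaiveBayes.new
classifier.train("tax", "Pay your self assessment tax bill online")
classifier.train("tax", "Income tax rates and personal allowances")
classifier.train("driving", "Renew your driving licence online")
classifier.train("driving", "Book your driving theory test")

predicted = classifier.classify("Check your income tax allowances")
puts predicted
```

With this toy data the query shares more words with the pages tagged "tax", so that tag scores highest. A real tagger would need far more training pages per tag to be useful.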

How to use it?

To run the script locally, run ./bin/tag.rb file_name from the command line, where file_name is the path to your input file.

The file you pass to the script should be in CSV format with three columns: URL, tag, and content. For an example, see the sample_content.csv file.
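
As a rough sketch of the three-column layout described above, the snippet below parses some inline CSV with Ruby's standard csv library. The rows and URLs are invented for illustration and are not taken from sample_content.csv. The snippet also assumes there is no header row, which may not match the real file.

```ruby
require "csv"

# Hypothetical rows in the URL, tag, content layout (no header row assumed).
# Content containing a comma must be quoted.
sample = <<~DATA
  https://example.com/page-1,tax,"Pay your tax bill, online"
  https://example.com/page-2,driving,Renew your driving licence
DATA

# Turn each row into a hash keyed by column name.
rows = CSV.parse(sample).map do |url, tag, content|
  { url: url, tag: tag, content: content }
end

rows.each do |row|
  puts "#{row[:tag]}: #{row[:content]}"
end
```

To read from a file instead, CSV.read(path) returns the same array-of-arrays shape as CSV.parse.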

How to run the tests?

Run rspec from the command line. Note that the tests have not been written yet, so this will only be useful once they exist.

License

See the LICENSE file.