Part-of-Speech Tagging using Hidden Markov Models
This project uses the Pomegranate library to build a Hidden Markov Model for part-of-speech tagging with a tagset.
Understanding Hidden Markov Models and the Viterbi Algorithm
This project showcases the Viterbi algorithm, one of the many applications of statistics in computing. For Hidden Markov Models (HMMs), the Viterbi algorithm determines the most likely sequence of parts of speech (hidden states) given the sequence of observed words. It is one of the most basic forms of part-of-speech tagging in Natural Language Processing.
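The decoding step described above can be sketched in plain Python. This is a minimal illustration, not the code in the notebooks and not the Pomegranate API: the function name `viterbi`, the toy tags, and every probability below are made up for the example, and the `floor` parameter is an assumption used to avoid taking the log of zero for unseen words or transitions.

```python
import math

def viterbi(words, tags, start_p, trans_p, emit_p, floor=1e-12):
    """Return the most likely tag sequence for `words` under a toy HMM."""
    if not words:
        return []

    def lg(p):
        # Work in log space; clamp zero probabilities to a small floor.
        return math.log(max(p, floor))

    # best[t][tag]: log-probability of the best path that ends in `tag` at position t.
    best = [{tag: lg(start_p.get(tag, 0.0)) + lg(emit_p[tag].get(words[0], 0.0))
             for tag in tags}]
    back = [{}]  # back[t][tag]: previous tag on that best path

    for t in range(1, len(words)):
        best.append({})
        back.append({})
        for tag in tags:
            prev, score = max(
                ((p, best[t - 1][p] + lg(trans_p[p].get(tag, 0.0))) for p in tags),
                key=lambda item: item[1],
            )
            best[t][tag] = score + lg(emit_p[tag].get(words[t], 0.0))
            back[t][tag] = prev

    # Pick the best final tag, then follow the back-pointers to the start.
    path = [max(tags, key=lambda tag: best[-1][tag])]
    for t in range(len(words) - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return path

# Hypothetical toy model: three tags and hand-picked probabilities.
tags = ["DET", "NOUN", "VERB"]
start_p = {"DET": 0.6, "NOUN": 0.3, "VERB": 0.1}
trans_p = {
    "DET": {"DET": 0.05, "NOUN": 0.9, "VERB": 0.05},
    "NOUN": {"DET": 0.1, "NOUN": 0.3, "VERB": 0.6},
    "VERB": {"DET": 0.5, "NOUN": 0.4, "VERB": 0.1},
}
emit_p = {
    "DET": {"the": 0.9, "a": 0.1},
    "NOUN": {"dog": 0.5, "walk": 0.4, "barks": 0.1},
    "VERB": {"barks": 0.6, "walk": 0.4},
}

print(viterbi(["the", "dog", "barks"], tags, start_p, trans_p, emit_p))
# → ['DET', 'NOUN', 'VERB']
```

Working in log space keeps long sentences from underflowing to zero, which is why the sketch adds log-probabilities instead of multiplying raw probabilities.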
- HMM_warmup.ipynb: Test notebook for practicing both the forward and Viterbi methods, among other things.
- HMM_Tagger.ipynb: Notebook using base-hmm-tagger, emission and transition probabilities to make next part-of-speech predictions using the Viterbi algorithm.
- helpers.py: Script file for additional functions and classes.
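For readers unfamiliar with emission and transition probabilities, the sketch below shows one common way to estimate them: count tag-to-tag and tag-to-word occurrences in a tagged corpus and normalize. It is a plain-Python illustration under stated assumptions, not the notebook's code: `estimate_hmm` and the three-sentence corpus are hypothetical, and the estimates are unsmoothed, so unseen words get no probability at all.

```python
from collections import Counter, defaultdict

def estimate_hmm(tagged_sentences):
    """Estimate start, transition, and emission probabilities by counting."""
    start_counts = Counter()
    trans_counts = defaultdict(Counter)  # trans_counts[prev_tag][next_tag]
    emit_counts = defaultdict(Counter)   # emit_counts[tag][word]

    for sentence in tagged_sentences:
        if not sentence:
            continue
        start_counts[sentence[0][1]] += 1
        for word, tag in sentence:
            emit_counts[tag][word] += 1
        # Adjacent tag pairs within the sentence (tag bigrams).
        for (_, prev_tag), (_, next_tag) in zip(sentence, sentence[1:]):
            trans_counts[prev_tag][next_tag] += 1

    def normalize(counter):
        total = sum(counter.values())
        return {key: count / total for key, count in counter.items()}

    start_p = normalize(start_counts)
    trans_p = {tag: normalize(c) for tag, c in trans_counts.items()}
    emit_p = {tag: normalize(c) for tag, c in emit_counts.items()}
    return start_p, trans_p, emit_p

# Hypothetical three-sentence corpus of (word, tag) pairs.
corpus = [
    [("the", "DET"), ("dog", "NOUN"), ("barks", "VERB")],
    [("the", "DET"), ("cat", "NOUN"), ("sleeps", "VERB")],
    [("dogs", "NOUN"), ("bark", "VERB")],
]

start_p, trans_p, emit_p = estimate_hmm(corpus)
print(trans_p["DET"])  # → {'NOUN': 1.0}
```

A real tagger would typically add smoothing or an unknown-word strategy on top of these raw counts.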
- Clone this repository
- Ensure you have all the required libraries and modules installed
- Follow the steps in the HMM_warmup.ipynb notebook in your preferred environment for a base understanding
- Implement the n-gram models and base-hmm-tagger in the HMM_Tagger.ipynb notebook by rerunning the cells
For a more comprehensive explanation, refer to this post from Medium.
Copyright (c) 2024, Ayo
This project is for personal and educational purposes only. All rights reserved.