go-nlp

Description

Go-nlp is a versatile natural language processing utility.

When I started learning Go, I realised that natural language processing was not well covered in this language.

So I thought it might be interesting to create a utility NLP package for the Gophers.

Installation

To add the go-nlp package to your project, run the following command: go get github.com/mx79/go-nlp

To update the package, use one of these two commands: go get -u github.com/mx79/go-nlp (which also updates its dependencies) or go get github.com/mx79/go-nlp@latest

Usage

package main

import (
	"fmt"
	"github.com/mx79/go-nlp/clean"
)

func main() {
	// Test sentence in English
	sentence := "My name is Max, I will use this sentence as a test. ?!"

	// Stopwords object
	stopwords := clean.NewStopwords("en")
	fmt.Println(stopwords.Stop(sentence))
	// My name Max, I will use sentence test. ?!

	// Stemmer object
	stemmer := clean.NewStemmer("en")
	fmt.Println(stemmer.Stem(sentence))
	// My name is Max, I will use this sentenc as a test. ?!

	// Cleaning functions
	fmt.Println(clean.Lower(sentence))
	// my name is max, i will use this sentence as a test. ?!
	fmt.Println(clean.Tokenize(sentence, true))
	// [My name is Max , I will use this sentence as a test . ? !]
	fmt.Println(clean.RemoveAccent(sentence))
	// My name is Max, I will use this sentence as a test. ?!
	fmt.Println(clean.RemovePunctuation(sentence))
	// My name is Max I will use this sentence as a test

	// Purger object that calls every object and function of the clean package
	p := clean.NewTextPurger("en", true, true, true, true, true)
	fmt.Println(p.Purge(sentence))
	// my name max i will use sentenc test
}
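
To make the idea behind the purge step more concrete, here is a minimal sketch that uses only the Go standard library. It is not go-nlp's implementation: the names simplePurge, removePunctuation and removeStopwords, as well as the four-word stopword list, are assumptions made for illustration, and stemming is left out. It only shows how lowercasing, punctuation removal and stopword removal can be chained.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// stopwords is a tiny, hypothetical English list used only for this sketch;
// it is not the list shipped with go-nlp.
var stopwords = map[string]bool{
	"is": true, "this": true, "as": true, "a": true,
}

// removePunctuation drops every rune that Unicode classifies as punctuation.
func removePunctuation(s string) string {
	return strings.Map(func(r rune) rune {
		if unicode.IsPunct(r) {
			return -1 // a negative value tells strings.Map to drop the rune
		}
		return r
	}, s)
}

// removeStopwords splits on whitespace and keeps the words that are not
// in the stopword list (the lookup is case-insensitive).
func removeStopwords(s string) string {
	kept := []string{}
	for _, w := range strings.Fields(s) {
		if !stopwords[strings.ToLower(w)] {
			kept = append(kept, w)
		}
	}
	return strings.Join(kept, " ")
}

// simplePurge chains lowercasing, punctuation removal and stopword removal.
func simplePurge(s string) string {
	return removeStopwords(removePunctuation(strings.ToLower(s)))
}

func main() {
	sentence := "My name is Max, I will use this sentence as a test. ?!"
	fmt.Println(simplePurge(sentence))
	// my name max i will use sentence test
}
```

Keeping each step as a separate function means a step can be skipped or reordered, which is presumably what the boolean arguments of clean.NewTextPurger control in the real package.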

Support

This repository is maintained. To report a bug or suggest an improvement, open an issue directly on it.

Roadmap

In the future, I would like to add to this package an intent detection object based on the random forest algorithm.

Ideally, I would like to support as many languages as possible with stopword, lemmatization and stemming data.

Contributing

This project is open to contributions.

Acknowledgment

I would like to thank ranks.nl for providing stopwords for over 30 languages. I also thank the contributors to the data in the Python NLTK package, from which I took the stemming data for about ten languages.
