merge malaya-graph
huseinzol05 committed Apr 3, 2023
1 parent 7af8f40 commit 40e4ea4
Showing 28 changed files with 5,010 additions and 105 deletions.
6 changes: 6 additions & 0 deletions README-pypi.rst
@@ -51,9 +51,14 @@ Features
- **Spelling Correction**, using local Malaysia NLP researches hybrid with Transformer-Bahasa to auto-correct any malay words and NeuSpell using T5-Bahasa.
- **Abstractive Summarization**, provide abstractive summarization using T5-Bahasa.
- **Extractive Summarization**, Extractive interface using Transformer-Bahasa and Doc2Vec.
- **Text to Knowledge Graph**, Generate knowledge graph from human sentences.
- **Topic Modeling**, provide Transformer-Bahasa, LDA2Vec, LDA, NMF, LSA interface and easy BERTopic integration.
- **EN-MS Translation**, provide English to standard Malay using T5-Bahasa.
- **IND-MS Translation**, provide Indonesian to standard Malay using T5-Bahasa.
- **JAV-MS Translation**, provide Javanese to standard Malay using T5-Bahasa.
- **MS-EN Translation**, provide standard Malay to English using T5-Bahasa.
- **MS-IND Translation**, provide standard Malay to Indonesian using T5-Bahasa.
- **MS-JAV Translation**, provide standard Malay to Javanese using T5-Bahasa.
- **Zero-shot classification**, provide Zero-shot classification interface using Transformer-Bahasa to recognize texts without any labeled training data.
- **Zero-shot Entity Recognition**, provide Zero-shot entity tagging interface using Transformer-Bahasa to extract entities.
- **Constituency Parsing**, breaking a text into sub-phrases using finetuned Transformer-Bahasa.
@@ -62,6 +67,7 @@ Features
- **Emotion Analysis**, detect and recognize 6 different emotions of texts using finetuned Transformer-Bahasa.
- **Entity Recognition**, seeks to locate and classify named entities mentioned in text using finetuned Transformer-Bahasa.
- **Jawi-to-Rumi**, convert from Jawi to Rumi using Transformer.
- **Knowledge Graph to Text**, Generate human sentences from a knowledge graph.
- **Language Detection**, using Fast-text and Sparse Deep learning Model to classify Malay (formal and social media), Indonesia (formal and social media), Rojak language and Manglish.
- **Language Model**, using KenLM, Masked language model using BERT, ALBERT and RoBERTa, and GPT2 to do text scoring.
- **NSFW Detection**, detect NSFW text using rules based and subwords Naive Bayes.
2 changes: 2 additions & 0 deletions README.rst
@@ -70,6 +70,7 @@ Features
- **Spelling Correction**, using local Malaysia NLP researches hybrid with Transformer-Bahasa to auto-correct any malay words and NeuSpell using T5-Bahasa.
- **Abstractive Summarization**, provide abstractive summarization using T5-Bahasa.
- **Extractive Summarization**, Extractive interface using Transformer-Bahasa and Doc2Vec.
- **Text to Knowledge Graph**, Generate knowledge graph from human sentences.
- **Topic Modeling**, provide Transformer-Bahasa, LDA2Vec, LDA, NMF, LSA interface and easy BERTopic integration.
- **EN-MS Translation**, provide English to standard Malay using T5-Bahasa.
- **IND-MS Translation**, provide Indonesian to standard Malay using T5-Bahasa.
@@ -85,6 +86,7 @@ Features
- **Emotion Analysis**, detect and recognize 6 different emotions of texts using finetuned Transformer-Bahasa.
- **Entity Recognition**, seeks to locate and classify named entities mentioned in text using finetuned Transformer-Bahasa.
- **Jawi-to-Rumi**, convert from Jawi to Rumi using Transformer.
- **Knowledge Graph to Text**, Generate human sentences from a knowledge graph.
- **Language Detection**, using Fast-text and Sparse Deep learning Model to classify Malay (formal and social media), Indonesia (formal and social media), Rojak language and Manglish.
- **Language Model**, using KenLM, Masked language model using BERT, ALBERT and RoBERTa, and GPT2 to do text scoring.
- **NSFW Detection**, detect NSFW text using rules based and subwords Naive Bayes.
19 changes: 19 additions & 0 deletions docs/Api.rst
@@ -155,6 +155,18 @@ malaya.summarization.extractive
.. automodule:: malaya.summarization.extractive
   :members:

malaya.text_to_kg.e2e
---------------------------------

.. automodule:: malaya.text_to_kg.e2e
   :members:

malaya.text_to_kg.parser
---------------------------------

.. automodule:: malaya.text_to_kg.parser
   :members:

malaya.topic_model.decomposition
---------------------------------

@@ -245,6 +257,13 @@ malaya.jawi_rumi
.. automodule:: malaya.jawi_rumi
   :members:

malaya.kg_to_text
-------------------------

.. automodule:: malaya.kg_to_text
   :members:


malaya.language_detection
-------------------------

2 changes: 2 additions & 0 deletions docs/README.rst
@@ -70,6 +70,7 @@ Features
- **Spelling Correction**, using local Malaysia NLP researches hybrid with Transformer-Bahasa to auto-correct any malay words and NeuSpell using T5-Bahasa.
- **Abstractive Summarization**, provide abstractive summarization using T5-Bahasa.
- **Extractive Summarization**, Extractive interface using Transformer-Bahasa and Doc2Vec.
- **Text to Knowledge Graph**, Generate knowledge graph from human sentences.
- **Topic Modeling**, provide Transformer-Bahasa, LDA2Vec, LDA, NMF, LSA interface and easy BERTopic integration.
- **EN-MS Translation**, provide English to standard Malay using T5-Bahasa.
- **IND-MS Translation**, provide Indonesian to standard Malay using T5-Bahasa.
@@ -85,6 +86,7 @@ Features
- **Emotion Analysis**, detect and recognize 6 different emotions of texts using finetuned Transformer-Bahasa.
- **Entity Recognition**, seeks to locate and classify named entities mentioned in text using finetuned Transformer-Bahasa.
- **Jawi-to-Rumi**, convert from Jawi to Rumi using Transformer.
- **Knowledge Graph to Text**, Generate human sentences from a knowledge graph.
- **Language Detection**, using Fast-text and Sparse Deep learning Model to classify Malay (formal and social media), Indonesia (formal and social media), Rojak language and Manglish.
- **Language Model**, using KenLM, Masked language model using BERT, ALBERT and RoBERTa, and GPT2 to do text scoring.
- **NSFW Detection**, detect NSFW text using rules based and subwords Naive Bayes.