added todo items
keiwenhuang committed Sep 12, 2024
1 parent 8ce06bc commit 8112a56
Showing 2 changed files with 35 additions and 357 deletions.
36 changes: 35 additions & 1 deletion README.md
@@ -15,4 +15,38 @@ This project uses the Comprehensive Medical Q&A Dataset from Kaggle:
This project is licensed under the MIT License - see the LICENSE file for details.

## Acknowledgements
We gratefully acknowledge the use of the Comprehensive Medical Q&A Dataset from Kaggle in this project.

## TODO
- [ ] Perform more extensive data exploration
  - [ ] Analyze distribution of question types
  - [ ] Examine question and answer lengths
  - [ ] Identify common medical terms and topics
  - [ ] Visualize key insights from the dataset

- [ ] Enhance data preprocessing
  - [ ] Implement text cleaning (remove special characters, standardize formatting)
  - [ ] Perform named entity recognition for medical terms
  - [ ] Apply advanced tokenization techniques
  - [ ] Explore options for data augmentation

- [ ] Evaluate and improve data quality
  - [ ] Identify and handle potential data inconsistencies
  - [ ] Implement a method to detect and remove duplicate entries
  - [ ] Consider adding additional metadata (e.g., difficulty level, topic categories)

- [ ] Optimize data storage and retrieval
  - [ ] Experiment with different vector embedding techniques
  - [ ] Fine-tune Chroma database parameters for better performance
  - [ ] Implement caching mechanisms for frequently accessed data

- [ ] Enhance model performance and capabilities
  - [ ] Experiment with fine-tuning the language model on medical data
  - [ ] Implement a hybrid retrieval-generation approach
  - [ ] Explore few-shot learning techniques for improved accuracy

- [ ] Improve user experience
  - [ ] Develop a user-friendly web interface or chat application
  - [ ] Implement multi-turn conversations and context tracking
  - [ ] Add support for voice input and text-to-speech output
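
As a starting point for the text cleaning and duplicate detection items above, here is a minimal sketch using only the Python standard library. It assumes the dataset has already been loaded as `(question, answer)` string pairs. The function names `clean_text` and `deduplicate` and the set of punctuation kept are illustrative choices, not part of the existing code.

```python
import re
import unicodedata


def clean_text(text: str) -> str:
    """Normalize Unicode, replace special characters, and collapse whitespace."""
    text = unicodedata.normalize("NFKC", text)
    # Keep word characters, whitespace, and basic punctuation.
    # The allowed set is an assumption; adjust it for the real data.
    text = re.sub(r"[^\w\s.,;:?!%/()'-]", " ", text)
    return re.sub(r"\s+", " ", text).strip()


def deduplicate(pairs):
    """Keep the first occurrence of each cleaned, case-insensitive Q&A pair."""
    seen = set()
    unique = []
    for question, answer in pairs:
        cleaned = (clean_text(question), clean_text(answer))
        key = (cleaned[0].casefold(), cleaned[1].casefold())
        if key not in seen:
            seen.add(key)
            unique.append(cleaned)
    return unique
```

This only catches exact duplicates after normalization. Near-duplicate detection (for example, by embedding similarity) would need a separate approach.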
