This repository compares the performance of foundation models (Bard and ChatGPT) on different tasks and levels of prompt complexity, using visualisation and statistics.
Data:
The data was provided by DataAnnotationTech.
Categories:
1. Adversarial Dishonesty
2. Adversarial Harmfulness
3. Brainstorming
4. Classification
5. Closed QA
6. Creative Writing
7. Coding
8. Extraction
9. Mathematical Reasoning
10. Open QA
11. Poetry
12. Rewriting
13. Summarization
Likert-type rating scale:
- Bard much better
- Bard better
- Bard slightly better
- About the same
- ChatGPT slightly better
- ChatGPT better
- ChatGPT much better
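
This scale is ordinal; as a minimal sketch (the file name `ratings.csv` and the column names are assumptions, not part of the repository), it can be encoded as integers so that negative values favour Bard and positive values favour ChatGPT:

```python
import pandas as pd

# Ordinal encoding of the seven-point scale; negative = Bard preferred,
# positive = ChatGPT preferred. Labels match the scale listed above.
SCALE = {
    "Bard much better": -3,
    "Bard better": -2,
    "Bard slightly better": -1,
    "About the same": 0,
    "ChatGPT slightly better": 1,
    "ChatGPT better": 2,
    "ChatGPT much better": 3,
}

df = pd.read_csv("ratings.csv")            # hypothetical file name
df["rating_ord"] = df["rating"].map(SCALE)
```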
Tools used: pandas, plotly, statsmodels, scipy, and scikit-posthocs.
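
As one possible visualisation (a sketch only; `df` and its columns carry over from the encoding example above), plotly can show the share of each rating within each category:

```python
import plotly.express as px

# Stacked percentage bars: the share of each rating label within
# each task category (barnorm="percent" normalises each bar to 100%).
fig = px.histogram(df, x="category", color="rating",
                   barnorm="percent",
                   title="Rating distribution by category")
fig.show()
```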
Note: The dataset is imbalanced and contains a prime number of prompts (1003). Bard was never rated "Bard much better" in the Poetry category, nor "Bard better" in the Creative Writing category for simple prompts.
Results:
- Chi-square with Monte Carlo iterations: p-value = 0.0001
- Kruskal-Wallis: p-value = 6.96e-7
- Multinomial logistic regression: p-value = 0.00015
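
Below is a minimal sketch of how these three tests might be run, not the repository's actual analysis code. It assumes a DataFrame `df` with columns `category`, `rating` (the Likert label), and `rating_ord` from the encoding sketch above; Dunn's test is included as a plausible use of scikit-posthocs for post-hoc comparisons after Kruskal-Wallis, which is an assumption.

```python
import numpy as np
import pandas as pd
from scipy import stats
import statsmodels.api as sm
import scikit_posthocs as sp

cats = df["category"].to_numpy()
ratings = df["rating"].to_numpy()

# Chi-square on the category x rating contingency table, with a
# Monte Carlo (permutation) p-value to handle sparse cells.
table = pd.crosstab(cats, ratings)
chi2_obs = stats.chi2_contingency(table)[0]
rng = np.random.default_rng(0)
n_iter = 10_000
perm = np.array([
    stats.chi2_contingency(pd.crosstab(cats, rng.permutation(ratings)))[0]
    for _ in range(n_iter)
])
p_mc = (1 + (perm >= chi2_obs).sum()) / (1 + n_iter)

# Kruskal-Wallis across categories on the ordinal ratings.
groups = [g.to_numpy() for _, g in df.groupby("category")["rating_ord"]]
kw_stat, kw_p = stats.kruskal(*groups)

# Dunn's post-hoc test (scikit-posthocs) to locate which category
# pairs differ.
dunn = sp.posthoc_dunn(df, val_col="rating_ord", group_col="category",
                       p_adjust="bonferroni")

# Multinomial logistic regression of rating on category; the
# likelihood-ratio p-value tests the overall category effect.
X = sm.add_constant(pd.get_dummies(df["category"], drop_first=True,
                                   dtype=float))
mnl = sm.MNLogit(df["rating"], X).fit(disp=False)
print(p_mc, kw_p, mnl.llr_pvalue)
```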