Added Gaussian Mixture Models as a toy distribution #28
Thank you for the addition! Can you briefly explain in the docstring what this distribution/dataset looks like (e.g. in 2D), and how it differs from the Hypersphere dataset?
From what I understand, I don't think it's very comparable to the hyperspheres dataset. Here, the user controls the placement of all Gaussian blobs as well as their weights and standard deviations.
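For illustration, a minimal sketch of such a user-specified mixture, assuming isotropic components and using only the Python standard library. The name `sample_gmm` and its parameters are hypothetical and are not the API added in this PR:

```python
import random


def sample_gmm(means, stds, weights, n, seed=0):
    """Draw n points from an isotropic Gaussian mixture (hypothetical helper)."""
    rng = random.Random(seed)
    # Pick a component index for every sample, proportional to the weights.
    components = rng.choices(range(len(means)), weights=weights, k=n)
    # Add independent Gaussian noise with the component's std to each coordinate of its mean.
    return [[rng.gauss(mu, stds[k]) for mu in means[k]] for k in components]


# Two blobs in 2D: the user fixes every mean, std, and weight explicitly.
points = sample_gmm(
    means=[[-2.0, 0.0], [2.0, 0.0]],
    stds=[0.5, 0.5],
    weights=[0.3, 0.7],
    n=1000,
)
```

Because every parameter is given explicitly, two runs with the same arguments describe the same distribution, which is the property discussed in this comment.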
Yes, controlling the datasets via high-level hyperparameters, similar to how we construct models, is the core philosophy of this library. Would you like to add this?
I think random generation will move it much closer to the hyperspheres dataset, depending on how the generation of the means and standard deviations is implemented. It might make the addition obsolete.
In that case, let's stick to the hyperspheres dataset. You are welcome to add more generation modes to the hyperspheres dataset, though. For instance, we could replace it with something like

    class MixtureDataset:
        def __init__(self, mode="spheres"):
            match mode:
                case "cubes": ...
                case "spheres": ...
                case "random": ...

where each
My issue with random generation is the lack of reproducibility. Say I wanted to learn a distribution with two Gaussians and compare the loss values for different network types. The loss will usually differ depending on the overlap and position of the means.
For this, you can either copy the dataset directly or use a seed (see
Hmm, I was thinking of a method that uses its own RNG, so the dataset generation can be seeded without affecting the rest of the process. But maybe that's not necessary.
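A rough sketch of the "own RNG" idea, assuming the standard-library `random.Random` as the private generator. The helper name `random_means` and the value range are made up for this example; the library's actual seeding mechanism may differ:

```python
import random


def random_means(n_components, dim, seed):
    """Generate component means with a private RNG (hypothetical helper)."""
    rng = random.Random(seed)  # independent of the global `random` state
    return [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(n_components)]


# Reference: the first global draw after seeding, with no dataset generation.
random.seed(123)
expected = random.random()

# Same global seed, but the dataset is generated before the draw.
random.seed(123)
means_a = random_means(2, 2, seed=0)
observed = random.random()
means_b = random_means(2, 2, seed=0)

assert observed == expected  # the global stream was left untouched
assert means_a == means_b    # the dataset itself is reproducible
```

The same pattern applies to other frameworks that offer generator objects; the point is only that the dataset's seed and the training seed are decoupled.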
Yes, I can implement a hypercube version. I think I would place the blobs on the corners (which gives an upper bound on the number of centers).
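One possible way to place the blobs on the corners, sketched under the assumption of a cube centered at the origin with corners at ±`scale`. `hypercube_corner_means` is a hypothetical name, and the corner ordering is simply whatever `itertools.product` yields:

```python
from itertools import islice, product


def hypercube_corner_means(dim, n_components, scale=1.0):
    """Return n_components means placed on distinct hypercube corners."""
    # A dim-dimensional hypercube has 2**dim corners, which bounds the number of centers.
    max_components = 2 ** dim
    if n_components > max_components:
        raise ValueError(
            f"a {dim}-dimensional hypercube has only {max_components} corners"
        )
    corners = product((-scale, scale), repeat=dim)
    return [list(corner) for corner in islice(corners, n_components)]


means = hypercube_corner_means(dim=2, n_components=4)
```

In 2D this uses all four corners of a square; requesting a fifth center raises a `ValueError`.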
I would welcome a name change that reflects the implemented changes. However, let's stick to ML jargon, using
OK, the trivial way would be to name it
OK, see the new PR.
I thought it might be useful to have access to Gaussian mixture model toy distributions.