The Question about batch size #46

Open
huzijun1996 opened this issue Feb 13, 2025 · 2 comments

@huzijun1996

Hello. You set the batch size to 256 in both PIP and TransPose, but unlike image training, the segments in the AMASS dataset have different lengths (e.g., 1204, 1064, 3181, 1111, 344, etc.), and the frames within each segment are temporally correlated, so how should we slice the data?
Should we preprocess everything into uniform chunks of length 256 and discard the part of each segment beyond the last multiple of 256? If not, it seems we can only set the batch size to 1 and then apply squeeze().

If we simply use a custom_collate_fn with pad_sequence, then data from different segments gets mixed together in the same padded batch.
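For concreteness, a minimal sketch of the kind of collate function I mean (the name custom_collate_fn is mine, and the (acc, rot) field layout is an assumption, not taken from the repo):

```python
import torch
from torch.nn.utils.rnn import pad_sequence

def custom_collate_fn(batch):
    # batch: list of (acc, rot) pairs with shapes [T_i, 6, 3] and [T_i, 6, 3, 3],
    # where T_i differs per AMASS segment
    accs, rots = zip(*batch)
    lengths = torch.tensor([a.shape[0] for a in accs])
    # pad_sequence zero-pads every segment to the longest one in the batch,
    # so frames from different segments end up side by side in one tensor
    acc_padded = pad_sequence(list(accs), batch_first=True)  # [B, T_max, 6, 3]
    rot_padded = pad_sequence(list(rots), batch_first=True)  # [B, T_max, 6, 3, 3]
    return acc_padded, rot_padded, lengths
```
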
I took a piece of data for training: the batch size changed from 1 to 16, and the number of data segments dropped from 150 to 9.

@Xinyu-Yi (Owner)

The sequence length is set to 200 for training. E.g., for a sequence of 650 frames, we cut it into 4 subsequences of 200, 200, 200, and 50 frames. You can pack a batch of subsequences with pack_padded_sequence in torch.
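A minimal sketch of that idea (the helper names and the pad-then-pack step are illustrative, not the exact training code):

```python
import torch
from torch.nn.utils.rnn import pad_sequence, pack_padded_sequence

SEQ_LEN = 200  # training subsequence length

def split_into_subsequences(seq):
    # Cut one [T, ...] sequence into chunks of at most SEQ_LEN frames,
    # e.g. 650 frames -> 200, 200, 200, 50
    return list(torch.split(seq, SEQ_LEN, dim=0))

def pack_batch(subseqs):
    # subseqs: list of [T_i, ...] tensors with T_i <= SEQ_LEN
    lengths = torch.tensor([s.shape[0] for s in subseqs])
    padded = pad_sequence(subseqs, batch_first=True)
    # A PackedSequence lets the RNN skip the padded frames entirely,
    # so the short last chunk does not pollute the hidden state
    return pack_padded_sequence(padded, lengths, batch_first=True,
                                enforce_sorted=False)
```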

@huzijun1996 (Author) commented Feb 17, 2025

Thank you for your answer. If I understand correctly, each complete sequence is sliced into fragments of 200 frames, the leftover part shorter than 200 frames is handled with pack_padded_sequence, and the batch size is set to 256, so that 256 subsequences of length 200 are drawn at once for training. Is that right?
Or do you split and pad all the data in the dataset, whether a sequence is 650, 1300, or 843 frames, uniformly into [n, 200], then feed everything into pack_padded_sequence and sample 256 of the n subsequences for each training step?
In this way, the data fed into training would have the shapes glb_acc: torch.Size([batch_size, seq_length, 6, 3]) and glb_rot: torch.Size([batch_size, seq_length, 6, 3, 3]). It doesn't seem like data in this format can be fed directly into PIP for training.
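For reference, this is roughly how I imagine flattening those tensors into per-frame feature vectors before the RNN (an assumption about the input layout on my side, not the actual PIP code):

```python
import torch

def make_rnn_input(glb_acc, glb_rot):
    # glb_acc: [B, T, 6, 3], glb_rot: [B, T, 6, 3, 3]
    B, T = glb_acc.shape[:2]
    acc_flat = glb_acc.reshape(B, T, -1)            # [B, T, 18]
    rot_flat = glb_rot.reshape(B, T, -1)            # [B, T, 54]
    # 6 IMUs x (3 acc + 9 rot) = 72 features per frame
    return torch.cat([acc_flat, rot_flat], dim=-1)  # [B, T, 72]
```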
