
Training the model #20

Open
dy1ngs0ul opened this issue Jan 9, 2020 · 10 comments

@dy1ngs0ul

Hello Nvidia AI-IOT team,

First of all, thank you very much for your effort in creating this code. I am Zeyan, currently working on a real-time pose estimation implementation on the Jetson AGX Xavier.
My goal is to use depth images (from an Intel RealSense camera) and check whether the depth information could help improve the performance of pose estimation.

Before I conduct my experiments, I first wish to train the model to serve as a baseline. From your training script it seems a config.json file is required to train the network. As I wish to follow your parameters for this baseline training, it would be great if you could provide your config file so that I can follow your steps and parameters to train your model.

Thanks in advance for your help and support. I look forward to your reply.

Thanks
Dr. Zeyan Oo

@jaybdub
Collaborator

jaybdub commented Feb 4, 2020

Hi dy1ngs0ul,

Thanks for reaching out!

You may find the training configuration files here:

https://github.com/NVIDIA-AI-IOT/trt_pose/blob/master/tasks/human_pose/experiments/resnet18_baseline_att_224x224_A.json

Please let me know if you have any questions.

Best,
John

@dy1ngs0ul
Author

Thanks for your help

@kinglintianxia

kinglintianxia commented Apr 26, 2020

@jaybdub , Thanks for your excellent work!
So far I understand that cmap_channels is the number of keypoints and paf_channels equals 2 × the number of connections. Can you explain what upsample_channels means?

"model": {
        "name": "densenet121_baseline_att",
        "kwargs": {
            "cmap_channels": 18,
            "paf_channels": 42,
            "upsample_channels": 256,
            "num_upsample": 3
        }
    },
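For reference, one hedged reading of these kwargs (not confirmed by the devs): cmap_channels is the number of keypoints (one confidence map each), paf_channels is 2 × the number of skeleton links (an x and y vector field per link), and upsample_channels is the channel width of the upsampling head, applied num_upsample times. A minimal sketch deriving the first two counts from a topology definition like human_pose.json (the helper function and the inline topology dict are hypothetical, not part of trt_pose):

```python
# Sketch: derive cmap/paf channel counts from a trt_pose-style topology.
# The counts mirror human_pose.json (18 keypoints, 21 skeleton links).

def derive_channels(topology):
    cmap_channels = len(topology["keypoints"])    # one confidence map per keypoint
    paf_channels = 2 * len(topology["skeleton"])  # x and y field per skeleton link
    return cmap_channels, paf_channels

human_pose = {
    "keypoints": ["kp%d" % i for i in range(18)],  # 18 keypoints incl. the added neck
    "skeleton": [[0, 0]] * 21,                     # 21 limb connections
}

cmap, paf = derive_channels(human_pose)
print(cmap, paf)  # 18 42
```

With the stock 17-keypoint COCO topology (19 links) the same arithmetic gives 17 and 38, which matches the tensor-size mismatch reported below.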

@NicolaGugole

Hi guys! Have any of you successfully completed any training using the script provided within the repo?
I'm trying to prune the models but I can't seem to be able to proceed with retraining using train.py because of an inconsistency between the PAF tensors' sizes:

Traceback (most recent call last):
  File "provaTrain.py", line 150, in <module>
    paf_mse = torch.mean(mask * (paf_out - paf)**2)
  File "/usr/local/lib/python3.6/dist-packages/apex-0.1-py3.6-linux-x86_64.egg/apex/amp/wrap.py", line 58, in wrapper
    return orig_fn(*new_args, **kwargs)
RuntimeError: The size of tensor a (42) must match the size of tensor b (38) at non-singleton dimension 1

@NicolaGugole

I solved it myself, thank you anyway!

@OliverGuy

OliverGuy commented Jul 1, 2020

Hey all, I'm having a similar error to @NicolaGugole's, using the training dataset downloaded through the provided shell script.
Any tips on how to fix this would be greatly appreciated!
Edit: Never mind, one just has to edit the model attribute of the JSON file referenced earlier to match the tensor sizes.

@NicolaGugole

> Hey all, I'm having a similar error as @NicolaGugole using the training dataset downloaded through the provided shell script.
> Any tips on how to fix this would be greatly appreciated !
> Edit:Nevermind, one just has to edit the model attribute of the json file referenced earlier to match tensor sizes.

In my case I had to change the annotation file, because I noticed a difference between the annotation keypoint count (17 keypoints) and the human_pose.json count (18 keypoints). This difference in tensor sizes is weird in my opinion.
Forcing these sizes to match did not produce a fruitful training in my case, I assume because the annotation files contain values created for 17 keypoints while we modified the model to expect 18.

I noticed that in this config file (https://github.com/NVIDIA-AI-IOT/trt_pose/blob/master/tasks/human_pose/experiments/resnet18_baseline_att_224x224_A.json) the devs use a "modified" version of the annotation json files. I hope in the near future we'll have the opportunity to take a look at these modified files (maybe the devs could upload them to this repo).

So I have a question, @OliverGuy: did you just change the cmap_channels and paf_channels kwargs in the json file referenced earlier? Did that do the job? I tried the same but ended up with other conflicts.

Sorry for bothering you all,
Have a nice day!

@OliverGuy

OliverGuy commented Jul 6, 2020

@NicolaGugole I only modified those in the json, but I'm having issues with cuDNN not finding the convolution algorithm (see #54).

@silent-code

@NicolaGugole

You have to pre-process the COCO annotations. This adds the "neck" keypoint (midpoint of the shoulders) so that you have 18 keypoints. Use the command:

python3 preprocess_coco_person.py annotations/person_keypoints_train2017.json annotations/person_keypoints_train2017_modified.json
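The core idea of that preprocessing, sketched under stated assumptions: COCO stores 17 keypoints per person as a flat [x, y, visibility] list, and the neck is inserted at the midpoint of the two shoulders (standard COCO ordering puts left_shoulder at index 5 and right_shoulder at index 6). The function below is a hypothetical illustration; the real preprocess_coco_person.py may differ in detail.

```python
# Sketch: append a "neck" keypoint to a COCO-format 17-keypoint
# annotation, as the midpoint of the two shoulders.  Returns a new
# 18-keypoint flat list [x1, y1, v1, ..., x18, y18, v18].

def add_neck(keypoints):
    ls = keypoints[5 * 3:5 * 3 + 3]  # left shoulder  (x, y, v)
    rs = keypoints[6 * 3:6 * 3 + 3]  # right shoulder (x, y, v)
    if ls[2] > 0 and rs[2] > 0:      # both shoulders annotated
        neck = [(ls[0] + rs[0]) / 2, (ls[1] + rs[1]) / 2, min(ls[2], rs[2])]
    else:
        neck = [0, 0, 0]             # unlabeled, per COCO's convention
    return keypoints + neck
```

After this step the annotation-derived counts (18 keypoints, and with the extended skeleton 21 links) line up with cmap_channels=18 and paf_channels=42 in the config.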

@sinuku

sinuku commented Aug 26, 2021

> @jaybdub , Thanks for your excellent work!
> So far I think cmap_channels means keypoint numbers, paf_channels equals to 2*connections, Can you explain upsample_channels means?
>
> "model": {
>         "name": "densenet121_baseline_att",
>         "kwargs": {
>             "cmap_channels": 18,
>             "paf_channels": 42,
>             "upsample_channels": 256,
>             "num_upsample": 3
>         }
>     },

Did you figure out what upsample_channels means?
I am struggling with the same issue as you.
