Fix None grad problem during training TOOD by adding SigmoidGeometricMean #7090

Merged: 2 commits, Jan 29, 2022

@Johnson-Wang Johnson-Wang commented Jan 28, 2022

Motivation

The training of TOOD often encounters None gradients during backpropagation, which in turn produce None tensors in the next training step. Some issues in the original repo (fcjian/TOOD#11) might also be due to this error. The problem is caused by the naive implementation of the sigmoid geometric mean, cls_score = (cls_logits.sigmoid() * cls_prob.sigmoid()).sqrt(). The output can be exactly 0 when cls_logits or cls_prob is a large negative value, because the sigmoid underflows. The derivative of sqrt is unbounded at 0, so backpropagation then yields inf or None gradients.
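The failure mode can be reproduced in a few lines. This is a minimal sketch, not code from the PR: the input values are made up, and float32 tensors are assumed. In this sketch the broken gradients show up as non-finite values (inf or NaN), which appears to be what the description above calls None gradients.

```python
import torch

# Made-up inputs: the first logit is negative enough that its sigmoid
# underflows to exactly 0 in float32.
cls_logits = torch.tensor([-200.0, 0.0], requires_grad=True)
cls_prob = torch.tensor([0.0, 0.0], requires_grad=True)

# Naive sigmoid geometric mean, as quoted in the Motivation section.
cls_score = (cls_logits.sigmoid() * cls_prob.sigmoid()).sqrt()
cls_score.sum().backward()

# The first score is 0, and autograd divides by it in the sqrt backward,
# so the gradient at that position is no longer finite.
print(cls_score)
print(cls_logits.grad)
print(torch.isfinite(cls_logits.grad))
```

The second element has moderate inputs and keeps a finite gradient, so a single extreme logit is enough to corrupt a training step.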

Modification

SigmoidGeometricMean is reimplemented as a subclass of torch.autograd.Function. Its backward function is derived analytically, which avoids inf or None gradients during backpropagation.

  • This modification has little influence on the final results (42.3 mAP after modification vs. 42.4 mAP as reported).
  • This modification enables users to train TOOD without the ATSS warmup, at the cost of a small performance drop (~41.8 mAP).
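The idea behind the analytic backward can be sketched as follows. With z = sqrt(sigmoid(x) * sigmoid(y)), the chain rule simplifies to dz/dx = z * (1 - sigmoid(x)) / 2 (and symmetrically for y), so the backward pass never divides by z. The code below is an illustration written under that derivation and is not necessarily the code merged in this PR. The argument names x and y and the test values are assumptions.

```python
import torch
from torch.autograd import Function


class SigmoidGeometricMean(Function):
    """Sketch of z = sqrt(sigmoid(x) * sigmoid(y)) with a hand-written backward."""

    @staticmethod
    def forward(ctx, x, y):
        x_sigmoid = x.sigmoid()
        y_sigmoid = y.sigmoid()
        z = (x_sigmoid * y_sigmoid).sqrt()
        # Keep the intermediate results needed by the analytic gradient.
        ctx.save_for_backward(x_sigmoid, y_sigmoid, z)
        return z

    @staticmethod
    def backward(ctx, grad_output):
        x_sigmoid, y_sigmoid, z = ctx.saved_tensors
        # dz/dx = z * (1 - sigmoid(x)) / 2, and likewise for y.
        # There is no division by z, so z == 0 gives a zero gradient.
        grad_x = grad_output * z * (1 - x_sigmoid) / 2
        grad_y = grad_output * z * (1 - y_sigmoid) / 2
        return grad_x, grad_y


# Made-up inputs, including a logit whose sigmoid underflows to 0.
x = torch.tensor([-200.0, 0.0], requires_grad=True)
y = torch.tensor([0.0, 0.0], requires_grad=True)
z = SigmoidGeometricMean.apply(x, y)
z.sum().backward()
print(x.grad, y.grad)  # all entries are finite
```

For moderate inputs the result should agree with what autograd computes for the naive expression. Only the behaviour at z = 0 differs.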

Checklist

  • Pre-commit or other linting tools are used to fix the potential lint issues.
  • The modification is covered by complete unit tests. If not, please add more unit tests to ensure correctness.
  • This PR does not involve any function interface change.
  • Docstring has been added.

@Johnson-Wang Johnson-Wang changed the title from "Add SigmoidGeometricMean" to "Fix None grad problem during training TOOD by adding SigmoidGeometricMean" on Jan 28, 2022
from torch.nn import functional as F


class SigmoidGeometricMean(Function):

How about we implement an interface named sigmoid_geometric_mean = SigmoidGeometricMean.apply here so that in tood_head we can simply use sigmoid_geometric_mean(xxx)?
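The suggested interface might look like the sketch below. The class body is a compact stand-in written for illustration (see the assumptions in its docstring), and the tensor shapes at the call site are made up. Only the alias line reflects the review comment itself.

```python
import torch
from torch.autograd import Function


class SigmoidGeometricMean(Function):
    """Compact stand-in for the class under review (illustrative only)."""

    @staticmethod
    def forward(ctx, x, y):
        x_sigmoid, y_sigmoid = x.sigmoid(), y.sigmoid()
        z = (x_sigmoid * y_sigmoid).sqrt()
        ctx.save_for_backward(x_sigmoid, y_sigmoid, z)
        return z

    @staticmethod
    def backward(ctx, grad_output):
        x_sigmoid, y_sigmoid, z = ctx.saved_tensors
        return (grad_output * z * (1 - x_sigmoid) / 2,
                grad_output * z * (1 - y_sigmoid) / 2)


# Module-level alias, as proposed in the review comment.
sigmoid_geometric_mean = SigmoidGeometricMean.apply

# Hypothetical call site: shapes and values are assumptions.
cls_logits = torch.randn(2, 80, 4, 4)
cls_prob = torch.randn(2, 80, 4, 4)
cls_score = sigmoid_geometric_mean(cls_logits, cls_prob)
```

With the alias, callers use an ordinary function call and do not need to know that a custom autograd Function sits behind it.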

@jshilong jshilong left a comment

LGTM

codecov bot commented Jan 28, 2022

Codecov Report

Merging #7090 (72e89e3) into dev (4bdb312) will increase coverage by 0.04%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##              dev    #7090      +/-   ##
==========================================
+ Coverage   62.41%   62.46%   +0.04%     
==========================================
  Files         330      330              
  Lines       26199    26216      +17     
  Branches     4436     4437       +1     
==========================================
+ Hits        16353    16375      +22     
+ Misses       8976     8966      -10     
- Partials      870      875       +5     
Flag        Coverage Δ
unittests   62.43% <100.00%> (+0.03%) ⬆️

Flags with carried forward coverage won't be shown.

Impacted Files                                   Coverage Δ
mmdet/models/dense_heads/tood_head.py            83.79% <100.00%> (+0.06%) ⬆️
mmdet/models/utils/__init__.py                   100.00% <100.00%> (ø)
mmdet/models/utils/misc.py                       96.66% <100.00%> (+3.80%) ⬆️
mmdet/utils/misc.py                              95.23% <0.00%> (-4.77%) ⬇️
mmdet/core/bbox/assigners/max_iou_assigner.py    72.36% <0.00%> (-1.32%) ⬇️
mmdet/models/dense_heads/corner_head.py          69.46% <0.00%> (+1.40%) ⬆️
mmdet/models/detectors/cornernet.py              100.00% <0.00%> (+5.12%) ⬆️

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
