Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[gTest] Add more test coverage for Batchnorm Forward Training #3444

Merged
merged 7 commits into from
Jan 14, 2025

Conversation

xinlipn
Copy link
Contributor

@xinlipn xinlipn commented Dec 18, 2024

More test cases added for better coverage. Here's the summary

  1. N C H W: 768 1 14 14
    Covers:
    forward_spatial_single.cpp: variant == 0, variant == 1
    768 1 14 14.txt

  2. N C H W: 768 1 23 23
    Covers:
    forward_spatial_single.cpp: variant == 1 and IsApplicable()
    forward_spatial_multiple.cpp: variant == 2 (1st)
    768 1 23 23.txt

  3. N C H W: 832 1 14 14
    Covers:
    forward_spatial_single.cpp: variant == 1 and IsApplicable()
    forward_spatial_multiple.cpp: variant == 0 then variant == 2 (2nd)
    832 1 14 14.txt

  4. N C H W: 832 1 28 28
    Covers:
    forward_spatial_single.cpp: variant == 1 and IsApplicable()
    forward_spatial_multiple.cpp: variant == 2 (1st) then variant == 2 (2nd)
    832 1 28 28.txt

A few questions/observations:

  1. Should NetworkLarge be used for GPU_BN_FWD_Train_Large_FP64?

    testing::Combine(testing::ValuesIn(NetworkSmall<BNTestCase>()),

  2. This blurb in IsApplicable

    if(!(WORKAROUND_SWDEV_253606 == 0 && n < 3) &&
    !((in_nhw < 33554432 && in_cstride > 1024) ||
    ((n >= 256) && (in_cstride > 60) && bfpmixparm) ||
    ((in_cstride > 512) && bfpmixparm) ||
    in_cstride <= 512))
    return false;

    is the same as in GetSolution
    if( (in_nhw < 33554432 && in_cstride > 1024) ||
    ((n >= 256) && (in_cstride > 60) && (bfpmixparm || bbfpmixparam)) ||
    ((in_cstride > 512) && (bfpmixparm || bbfpmixparam)))

    Which means this snippet will never be run
    else
    {
    variant = 2;
    xlocalsize = 1;
    ylocalsize = 1024;
    auto segment = int(std::ceil(double(in_cstride) / double(ylocalsize)));
    xgridsize = c;
    ygridsize = segment * ylocalsize;
    ldsgcn = ylocalsize / 64;
    ldsnogcn = ylocalsize;
    }

  3. Likewise, these two lines are identical

    if((n > 768) && (in_cstride > 150) && bfp32parm)

    if((n > 768) && (in_cstride > 150) && bfp32parm)

Due to 2 and 3 above, variant will never be set to 2 in forward_spatial_single.cpp

  1. When this condition is met in forward_spatial_single.cpp, IsApplicable returns false
    if(!(WORKAROUND_SWDEV_253606 == 0 && n < 3) &&
    !((in_nhw < 33554432 && in_cstride > 1024) ||
    ((n >= 256) && (in_cstride > 60) && bfpmixparm) ||
    ((in_cstride > 512) && bfpmixparm) ||
    in_cstride <= 512))
    return false;

    And forward_spatial_multiple.cpp will be called. In its GetSolution, this snippet will always be skipped due to the fact it's false as above
    if((in_nhw < 33554432 && in_cstride > 1024) ||
    ((n >= 256) && (in_cstride > 60) && bfpmixparm) ||
    ((in_cstride > 512) && bfpmixparm))
    {
    variant = 1;
    }
    else if(in_cstride <= 512)
    {
    variant = 0;
    }

Therefore, variant will never be 0 or 1 for forward_spatial_multiple.cpp

Please note the cases that do not pass IsApplicable() in forward_spatial_single.cpp will be handled in IsApplicable() in forward_spatial_multiple.cpp

return !BnFwdTrainingSpatialSingle{}.IsApplicable(context, problem);

Comment on lines 145 to +147
INSTANTIATE_TEST_SUITE_P(Smoke,
GPU_BN_FWD_Train_Large_FP64,
testing::Combine(testing::ValuesIn(NetworkSmall<BNTestCase>()),
testing::Combine(testing::ValuesIn(NetworkLarge<BNTestCase>()),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shall it be a new Full instance? Let's keep Smoke tests relatively small, that's the point.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @CAHEK7 could you clarify what you mean by "Full" instance? For each data type, there's 4 sets of test data for NetworkSmall and 32 for NetworkLarge.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not a new Full instance. Its just there was a typo before. It should be calling NetworkLarge for testing AP2. For AP1 we only test small bn network (4 test).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean Full tests and Smoke tests - Smoke is a small and fast subset which checks only a basic functionality and must not have any long-running tests or large configs.
The main point of Smoke tests and a quick sanity check before the long runs.

That's why I'm kind of recommend to move those tests into the Full tests, since it uses large configs.

Comment on lines 145 to +147
INSTANTIATE_TEST_SUITE_P(Smoke,
GPU_BN_FWD_Train_Large_FP64,
testing::Combine(testing::ValuesIn(NetworkSmall<BNTestCase>()),
testing::Combine(testing::ValuesIn(NetworkLarge<BNTestCase>()),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not a new Full instance. Its just there was a typo before. It should be calling NetworkLarge for testing AP2. For AP1 we only test small bn network (4 test).

@xinlipn xinlipn marked this pull request as ready for review January 13, 2025 16:41
@xinlipn xinlipn merged commit d7daf94 into develop Jan 14, 2025
34 of 143 checks passed
@xinlipn xinlipn deleted the sl/bn_fwd_training branch January 14, 2025 04:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants