Fixed a bug in the example code in 5.6. Getting Accurate Results #321

Open · wants to merge 1 commit into base: master

enochjung

@lshqqytiger benchmarked the example code on a ThunderX2 CN9980 processor, and the results did not match our expectations.
Interestingly, Case 2 ran faster than Case 1.

  • Case 1
    int checksum = 0;
    
    const clock_t start = clock();
    
    for (int a = 0; a < 1000; a++)
        for (int b = 0; b < 1000; b++)
            checksum ^= gcd(a, b);
    
    const clock_t end = clock();
  • Case 2
    int a[1000], b[1000];
    
    for (int i = 0; i < 1000; i++)
        a[i] = rand() % 1000, b[i] = rand() % 1000;
    
    int checksum = 0;
    
    const clock_t start = clock();
    
    for (int t = 0; t < 1000; t++)
        for (int i = 0; i < 1000; i++)
            checksum += gcd(a[i], b[i]);
    
    const clock_t end = clock();

Case 1 took 0.104947 seconds, while Case 2 took 0.075065 seconds.

Case 2 ran faster because of its high branch prediction hit rate. It only ever evaluates 1,000 combinations of gcd arguments, repeated 1,000 times, so the branch predictor could memorize the entire sequence of comparisons and predict the branches successfully.

To ensure the example code executes 1 million combinations of gcd arguments as intended, I modified the code as follows:

  • Case 3
    int a[1000], b[1000];
    
    for (int i = 0; i < 1000; i++)
        a[i] = rand() % 1000, b[i] = rand() % 1000;
    
    int checksum = 0;
    
    const clock_t start = clock();
    
    for (int t = 0; t < 1000; t++)
        for (int i = 0; i < 1000; i++)
            checksum += gcd(a[t], b[i]);  // modified from a[i] to a[t]
    
    const clock_t end = clock();

Case 3 took 0.122731 seconds, which is in line with what we expected.
