Removes ZeroFrog's "optimized" memcpy and memcmp functions. #370
LGTM, libc memcpy is probably much faster nowadays. |
As x86_32 is deprecated, the performance doesn't matter much, so LGTM |
Please rebase. |
Looks good to me as well (following the rebase) |
These were only compiled in on Windows and x86_32. They provided "optimized" copies and compares based on block sizes for the AMD Athlon and Duron CPU families. The code was taken from something that AMD provides under an as-is license. Just get rid of this crap.
Maybe we should replace these with optimized x86_64 (or other platform specific ones), or just a placeholder wrapper so they can be marked for improvement in the future. |
First, a word of advice. Assume that the people who wrote your standard Second, yes, there are better alternatives.
Matthew Parlane On 22 May 2014 14:57, shuffle2 [email protected] wrote:
|
That is basically what I said...I expect we have situations which fall into "Or, you can use your superior knowledge of your specific situation.", especially around graphics and other buffers which the game programmers must have made properly aligned for the device to operate correctly. Our memory base is aligned, so "gamecube-aligned" buffers happen to be implicitly aligned from dolphin's view. However, the compiler cannot automagically see this. |
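As an illustration of the point about alignment the compiler cannot see, here is a minimal sketch of passing that knowledge on while still relying on the libc `memcpy`. `CopyAligned16` and `ASSUME_ALIGNED_16` are hypothetical names, not part of Dolphin. The sketch assumes GCC or Clang for the `__builtin_assume_aligned` hint and falls back to a plain call elsewhere.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper macro (not from the Dolphin source): promises the
// compiler that the pointer is 16-byte aligned. On compilers without the
// builtin it expands to the pointer itself, so the code stays correct.
#if defined(__GNUC__) || defined(__clang__)
#define ASSUME_ALIGNED_16(p) __builtin_assume_aligned((p), 16)
#else
#define ASSUME_ALIGNED_16(p) (p)
#endif

// Hypothetical wrapper: the caller knows both buffers are 16-byte aligned
// (for example because the emulated memory base is aligned), and the hint
// lets the compiler pick an aligned copy sequence if it inlines memcpy.
inline void CopyAligned16(void* dst, const void* src, std::size_t size)
{
  // Debug-only checks that the promise made to the compiler actually holds.
  assert(reinterpret_cast<std::uintptr_t>(dst) % 16 == 0);
  assert(reinterpret_cast<std::uintptr_t>(src) % 16 == 0);
  std::memcpy(ASSUME_ALIGNED_16(dst), ASSUME_ALIGNED_16(src), size);
}
```

Whether the hint changes the generated code at all depends on the compiler and the copy size, so it would need profiling before being relied on.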
If someone feels like taking the time to write a "faster" x86_64 specific memcpy and memcmp then do it. That is outside of the scope of this PR. |
I made a Memcpy16 that requires dst and size to be aligned to 16, for vertex buffer upload, but it is not in git yet. I first have to profile it to see if it is worth the effort. |
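A copy routine with those constraints could look roughly like the sketch below. This is not the commenter's code, which was never posted in this thread. It is an assumed SSE2 version that requires `dst` and `size` to be multiples of 16 and places no alignment requirement on `src`. The fallback to `std::memcpy` on targets without SSE2 is also an assumption of the sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

#if defined(__SSE2__) || defined(_M_X64)
#include <emmintrin.h>
#define SKETCH_HAVE_SSE2 1
#endif

// Hypothetical sketch of a "Memcpy16": dst and size must be multiples of 16.
// src may be unaligned, so it is read with unaligned loads.
inline void Memcpy16(void* dst, const void* src, std::size_t size)
{
  assert(reinterpret_cast<std::uintptr_t>(dst) % 16 == 0);
  assert(size % 16 == 0);
#ifdef SKETCH_HAVE_SSE2
  auto* d = static_cast<__m128i*>(dst);
  const auto* s = static_cast<const __m128i*>(src);
  // Copy one 16-byte block per iteration: unaligned load, aligned store.
  for (std::size_t i = 0; i < size / 16; ++i)
    _mm_store_si128(d + i, _mm_loadu_si128(s + i));
#else
  // No SSE2 available: defer to the standard library.
  std::memcpy(dst, src, size);
#endif
}
```

A loop like this is not guaranteed to beat the libc `memcpy`, which is why profiling first, as the comment says, is the sensible order.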
Alright, we will leave the architecture-specific optimizations to future PRs. |
Mine was meant to help VB uploading. nVidia and AMD both recommend aligning everything on 16 to help the driver and keep overhead as low as possible. |
@galop1n OK, those could be nice as well :) Data with alignment requirements on the host may have slightly different alignment/size requirements, so it might be good to have a different specialization for such things (which may take into account host GPU model, CPU model, etc). |