forked from llvm-mirror/llvm
AMDGPU: Properly implement SIRegisterInfo::isFrameOffsetLegal and needsFrameBaseReg

Summary:
Without the fix to isFrameOffsetLegal to consider the instruction's immediate offset, the new test case hits the corresponding assertion in resolveFrameIndex, because the LocalStackSlotAllocation pass re-uses a different base register.

With only the fix to isFrameOffsetLegal, code quality reduces in a bunch of places because frame base registers are added where they're not needed. This is addressed by properly implementing needsFrameBaseReg, which also helps to avoid unnecessary zero frame indices in a bunch of other places.

Fixes piglit glsl-1.50/execution/variable-indexing/gs-output-array-vec4-index-wr.shader_test

Reviewers: arsenm, tstellarAMD

Subscribers: qcolombet, kzhuravl, wdng, yaxunl, tony-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D27344

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@289048 91177308-0d34-0410-b5e6-96231b3b80d8
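The legality rule described in the summary can be modelled in a few lines. The following is only a sketch, not the code from this commit: it assumes the MUBUF immediate offset field is a 12-bit unsigned value, and the function names `is_frame_offset_legal` and `needs_frame_base_reg` are hypothetical Python stand-ins for the C++ hooks, taking plain integers instead of machine instructions.

```python
# Assumption: the MUBUF immediate offset field holds a 12-bit unsigned value.
MUBUF_OFFSET_BITS = 12


def fits_unsigned(value: int, bits: int) -> bool:
    """Return True if value is representable as an unsigned integer of the given width."""
    return 0 <= value < (1 << bits)


def is_frame_offset_legal(instr_imm_offset: int, offset_from_base: int) -> bool:
    """Hypothetical model of the fixed check.

    The offset relative to the candidate base register is added to the
    immediate offset the instruction already carries, and the sum must fit.
    """
    return fits_unsigned(instr_imm_offset + offset_from_base, MUBUF_OFFSET_BITS)


def needs_frame_base_reg(instr_imm_offset: int, frame_offset: int) -> bool:
    """Hypothetical model: a base register is only worthwhile when the
    combined offset cannot be encoded directly in the instruction."""
    return not fits_unsigned(instr_imm_offset + frame_offset, MUBUF_OFFSET_BITS)


if __name__ == "__main__":
    # A check that ignored the instruction's own immediate would only look at 2052.
    print(fits_unsigned(2052, MUBUF_OFFSET_BITS))
    # Including an existing immediate of 2048 pushes the sum out of range.
    print(is_frame_offset_legal(2048, 2052))
    # An access at frame offset 0 needs no base register in this model.
    print(needs_frame_base_reg(0, 0))
```

In this model, ignoring `instr_imm_offset` corresponds to the behaviour the summary describes as broken, and returning `False` from `needs_frame_base_reg` for small offsets corresponds to avoiding unneeded base registers.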
Showing 9 changed files with 145 additions and 103 deletions.
@@ -0,0 +1,35 @@
;RUN: llc < %s -march=amdgcn -mcpu=verde -mattr=+vgpr-spilling -mattr=-promote-alloca -verify-machineinstrs | FileCheck %s -check-prefix=CHECK
;RUN: llc < %s -march=amdgcn -mcpu=tonga -mattr=+vgpr-spilling -mattr=-promote-alloca -verify-machineinstrs | FileCheck %s -check-prefix=CHECK

; Allocate two stack slots of 2052 bytes each requiring a total of 4104 bytes.
; Extracting the last element of each does not fit into the offset field of
; MUBUF instructions, so a new base register is needed. This used to not
; happen, leading to an assertion.

; CHECK-LABEL: {{^}}main:
; CHECK: buffer_store_dword
; CHECK: buffer_store_dword
; CHECK: buffer_load_dword
; CHECK: buffer_load_dword
define amdgpu_gs float @main(float %v1, float %v2, i32 %idx1, i32 %idx2) {
main_body:
  %m1 = alloca [513 x float]
  %m2 = alloca [513 x float]

  %gep1.store = getelementptr [513 x float], [513 x float]* %m1, i32 0, i32 %idx1
  store float %v1, float* %gep1.store

  %gep2.store = getelementptr [513 x float], [513 x float]* %m2, i32 0, i32 %idx2
  store float %v2, float* %gep2.store

; This used to use a base reg equal to 0.
  %gep1.load = getelementptr [513 x float], [513 x float]* %m1, i32 0, i32 0
  %out1 = load float, float* %gep1.load

; This used to attempt to re-use the base reg at 0, generating an out-of-bounds instruction offset.
  %gep2.load = getelementptr [513 x float], [513 x float]* %m2, i32 0, i32 512
  %out2 = load float, float* %gep2.load

  %r = fadd float %out1, %out2
  ret float %r
}
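
The sizes quoted in the test's comments can be checked with some quick arithmetic. This is a sketch under stated assumptions: 4-byte floats, `%m1` placed at frame offset 0 with `%m2` directly after it and no padding (the real frame layout may order the slots differently), and a 12-bit unsigned MUBUF offset field. All names below are illustrative and do not come from LLVM.

```python
FLOAT_SIZE = 4                      # bytes per float
NUM_ELEMENTS = 513                  # from "alloca [513 x float]"
MAX_MUBUF_OFFSET = (1 << 12) - 1    # assumption: 12-bit unsigned offset field

slot_size = NUM_ELEMENTS * FLOAT_SIZE   # size of one stack slot in bytes
total_size = 2 * slot_size              # both slots together

# Assumed layout: m1 first, m2 immediately after it.
m1_base = 0
m2_base = m1_base + slot_size


def element_offset(slot_base: int, index: int) -> int:
    """Byte offset of element `index` within the frame, given the slot's base offset."""
    return slot_base + index * FLOAT_SIZE


first_of_m1 = element_offset(m1_base, 0)                 # the %gep1.load access
last_of_m2 = element_offset(m2_base, NUM_ELEMENTS - 1)   # the %gep2.load access

print(slot_size, total_size)
print(first_of_m1, first_of_m1 <= MAX_MUBUF_OFFSET)
print(last_of_m2, last_of_m2 <= MAX_MUBUF_OFFSET)
```

Under these assumptions the load from `%m1` sits at an offset that can be encoded directly, while the load from element 512 of `%m2` lands past the largest encodable offset when measured from a base at 0, which is the situation the second comment in the test describes.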