Skip to content

Commit

Permalink
[CUDA] Fix emission of constant strings in sections
Browse files Browse the repository at this point in the history
CGM.GetAddrOfConstantCString() sets the adress of the created GlobalValue
to unnamed. When emitting the object file LLVM will mark the surrounding
section as SHF_MERGE iff the string is nul-terminated and contains no
other nuls (see IsNullTerminatedString). This results in problems when
saving temporaries because LLVM doesn't set an EntrySize, so reading in
the serialized assembly file fails.
This never happened for the GPU binaries because they usually contain
a nul-character somewhere. Instead this only affected the module ID
when compiling relocatable device code.

However, this points to a potentially larger problem: If we put a
constant string into a named section, we really want the data to end
up in that section in the object file. To avoid LLVM merging sections
this patch unmarks the GlobalVariable's address as unnamed which also
fixes the problem of invalid serialized assembly files when saving
temporaries.

Differential Revision: https://reviews.llvm.org/D47902

git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@334281 91177308-0d34-0410-b5e6-96231b3b80d8
  • Loading branch information
hahnjo committed Jun 8, 2018
1 parent a90c85a commit dcdd537
Show file tree
Hide file tree
Showing 2 changed files with 8 additions and 4 deletions.
6 changes: 5 additions & 1 deletion lib/CodeGen/CGCUDANV.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -75,8 +75,12 @@ class CGNVCUDARuntime : public CGCUDARuntime {
auto ConstStr = CGM.GetAddrOfConstantCString(Str, Name.c_str());
llvm::GlobalVariable *GV =
cast<llvm::GlobalVariable>(ConstStr.getPointer());
if (!SectionName.empty())
if (!SectionName.empty()) {
GV->setSection(SectionName);
// Mark the address as used which make sure that this section isn't
// merged and we will really have it in the object file.
GV->setUnnamedAddr(llvm::GlobalValue::UnnamedAddr::None);
}
if (Alignment)
GV->setAlignment(Alignment);

Expand Down
6 changes: 3 additions & 3 deletions test/CodeGenCUDA/device-stub.cu
Original file line number Diff line number Diff line change
Expand Up @@ -65,7 +65,7 @@ void use_pointers() {
// ALL: private unnamed_addr constant{{.*}}kernelfunc{{.*}}\00"
// * constant unnamed string with GPU binary
// HIP: @[[FATBIN:__hip_fatbin]] = external constant i8, section ".hip_fatbin"
// CUDA: @[[FATBIN:.*]] = private unnamed_addr constant{{.*GPU binary would be here.*}}\00",
// CUDA: @[[FATBIN:.*]] = private constant{{.*GPU binary would be here.*}}\00",
// CUDANORDC-SAME: section ".nv_fatbin", align 8
// CUDARDC-SAME: section "__nv_relfatbin", align 8
// * constant struct that wraps GPU binary
Expand All @@ -81,7 +81,7 @@ void use_pointers() {
// * variable to save GPU binary handle after initialization
// NORDC: @__[[PREFIX]]_gpubin_handle = internal global i8** null
// * constant unnamed string with NVModuleID
// RDC: [[MODULE_ID_GLOBAL:@.*]] = private unnamed_addr constant
// RDC: [[MODULE_ID_GLOBAL:@.*]] = private constant
// CUDARDC-SAME: c"[[MODULE_ID:.+]]\00", section "__nv_module_id", align 32
// HIPRDC-SAME: c"[[MODULE_ID:.+]]\00", section "__hip_module_id", align 32
// * Make sure our constructor was added to global ctor list.
Expand Down Expand Up @@ -141,7 +141,7 @@ void hostfunc(void) { kernelfunc<<<1, 1>>>(1, 1, 1); }
// There should be no __[[PREFIX]]_register_globals if we have no
// device-side globals, but we still need to register GPU binary.
// Skip GPU binary string first.
// CUDANOGLOBALS: @{{.*}} = private unnamed_addr constant{{.*}}
// CUDANOGLOBALS: @{{.*}} = private constant{{.*}}
// HIPNOGLOBALS: @{{.*}} = external constant{{.*}}
// NOGLOBALS-NOT: define internal void @__{{.*}}_register_globals
// NOGLOBALS: define internal void @__[[PREFIX:cuda|hip]]_module_ctor
Expand Down

0 comments on commit dcdd537

Please sign in to comment.