While playing with the built-in array here.gpus, I discovered that the behaviour of here.gpus[x] when x is out of bounds is somehow not clearly defined. For example, I executed this program on a system with 8 GPUs, but only one enabled (CHPL_RT_NUM_GPUS_PER_LOCALE=1):
config const gpuID = 0;

proc main() {
  writeln(here.gpus);
  writeln(here.gpus.domain);
  writeln(here.gpus[gpuID], "\n");

  var A: [1..10] int;
  on here.gpus[gpuID] {
    var B: [1..10] int;
    @assertOnGpu
    foreach i in B.domain {
      B[i] = i;
    }
    A = B;
  }
  writeln(A);
}
By default (--gpuID 0), this program behaves as expected.
But, for any --gpuID x values greater than 1, we get:
LOCALE0-GPU0
{0..0}
nil
1 2 3 4 5 6 7 8 9 10
While nil is probably expected because only one GPU is enabled, it seems that @assertOnGpu is not triggered. Is an on statement targeting nil even possible?
I also extended the experiment to negative numbers, and the results seem unpredictable as I encountered at least four different outputs:
For -1, here.gpus[-1] returns here.gpus[0] and @assertOnGpu is not triggered.
Of course here.gpus is not expected to be used that way, and these experiments are a bit sadistic, but first I'd like to report this just in case this is not a known behaviour, and then I wonder if there is any interesting explanation behind that.
@Guillaume-Helbecque : I don't think this behavior is intentional and strongly suspect that it's one of the impacts of the following warning printed when doing GPU compilations:
warning: The prototype GPU support implies --no-checks. This may impact debuggability. To suppress this warning, compile with --no-checks explicitly
Specifically, the gpus array is a normal array in Chapel, and accesses to it would normally be bounds-checked; but since GPU compilations use --no-checks, that bounds-checking is disabled. If I compile similar programs for the flat (non-GPU) locale model, I get out-of-bounds errors as expected.
In saying this, I'm only providing a likely explanation, not saying that this is as we'd like things to be. Coming up with a way to enable checks for GPU compilations is definitely something that we'd consider to be important to Chapel's long-term productivity for GPU programming. Sorry for any hassle in the meantime.
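Until checks are available for GPU compilations, one possible workaround is to validate the index by hand before using it. The following is only a minimal sketch of that idea, not an official recommendation: it reuses the gpuID config constant from the program above and assumes that the standard domain.contains method and halt are acceptable ways to test the index and abort. Since halt is an ordinary call rather than a compiler-inserted check, it should not be affected by --no-checks.

```chapel
config const gpuID = 0;

proc main() {
  // Manual bounds check: with --no-checks implied by GPU compilation,
  // the array access itself will not catch an out-of-range index.
  if !here.gpus.domain.contains(gpuID) then
    halt("gpuID ", gpuID, " is outside here.gpus.domain ", here.gpus.domain);

  var A: [1..10] int;
  on here.gpus[gpuID] {
    var B: [1..10] int;
    @assertOnGpu
    foreach i in B.domain {
      B[i] = i;
    }
    A = B;
  }
  writeln(A);
}
```

With this guard in place, running with an index outside here.gpus.domain should stop the program with an explicit message instead of continuing with whatever the unchecked access happens to return.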
Results reported for negative indices:
here.gpus[-1] returns here.gpus[0] and @assertOnGpu is not triggered.
here.gpus[x] returns here.id and @assertOnGpu is triggered.
here.gpus[-4] results in a segfault.
here.gpus[-5] returns nil and @assertOnGpu is triggered.
etc.