[MLU] add index select and index_select_grad kernel #4

Open
wants to merge 804 commits into
base: mlu-r2.4
46f8e88
Add symbolic shape deduction function for unfold, scatter_nd_add, p_n…
weishengying Oct 13, 2022
f856fc8
test=infer-coverage (#46983)
Wangzheee Oct 13, 2022
770501b
add thread name for dataloader (#46990)
zhiqiu Oct 13, 2022
e86dbd6
[Paddle Inference] Add bmm trt convert layer. (#46877)
xiaoxiaohehe001 Oct 13, 2022
910e1b6
logsumexp support fp16 (#45817)
xiaohemaikoo Oct 13, 2022
561fd8c
Fix quantize model deploy bugs when using MKLDNN (#45920)
yeliang2258 Oct 13, 2022
8d797fd
[Phi] Refactor logic of judging whether having a phi kernrel (#46920)
zyfncg Oct 13, 2022
f6ae9fb
[geometric] Add unittest for send_uv (#46948)
DesmonDay Oct 13, 2022
cba0020
add athor (#46994)
zhiqiu Oct 14, 2022
fcdc677
Update distributed_strategy.proto (#46531)
Wong4j Oct 14, 2022
48bb2c0
Add more record event in run program op (#46949)
0x45f Oct 14, 2022
974e98b
[Auto Parallel] Fix the bug for None labels (#46987)
aoyulong Oct 14, 2022
2f9de5f
[inference][trt] fix reshape2 opteller and elementwise min/max trt re…
zhangjun Oct 14, 2022
eb42993
remove BackendType in inference api. (#46942)
jiweibo Oct 14, 2022
eb32746
TRT pool2d adaptive mode bugfix (#46802)
wwbitejotunn Oct 14, 2022
a6a2618
Fix hAPI bug of not compatible with LayerHook (#47001)
parap1uie-s Oct 14, 2022
eee6b3a
speed_up for deformable conv (#46997)
Rayman96 Oct 14, 2022
31a437b
[AutoParallel] adapt for gpt-gen (#46771)
zhaoyinglia Oct 14, 2022
a1b9997
Update imread() read image way error (#47005)
jingsongliujing Oct 14, 2022
2010bdc
Fix collective APIs cannot be recognized when building docs (#46962)
HermitSun Oct 14, 2022
eded601
Simplify conv_mkldnn op registration (#46907)
chenwhql Oct 14, 2022
64b61fc
delete GetExpectedKernelType mkldnn of transpose2 (#46977)
jiahy0825 Oct 15, 2022
166ff39
add common subexpression elimination (#44386)
zzk0 Oct 16, 2022
73196e5
[Custom Device] Add singleton to custom device (#46963)
YanhuiDua Oct 17, 2022
b4a1f43
[Eager] use CastPyArg2Double to parse python float obj (#47029)
veyron95 Oct 17, 2022
328236d
fix typo error in operator.cc (#46995)
jiahy0825 Oct 17, 2022
7493839
add shape info into eager log (#46934)
JiabinYang Oct 17, 2022
f4ea771
fix unittest test_post_training_quantization_lstm_model problem (#47024)
yghstill Oct 17, 2022
6566b8f
fix dygraph new format problem export in QAT (#47023)
yghstill Oct 17, 2022
f9c1cdc
Fix the bug of PHI kernel of reduce_sum in kunlun when using eager mo…
ZibinGuo Oct 17, 2022
198c799
[CodeStyle][py2] remove `compat` module (to_bytes) (#47035)
SigureMo Oct 17, 2022
2e7dc66
skip ReplaceAllReduceOp in GraphtoBlock when nccl_ctxs_ is nullptr (#…
pangyoki Oct 17, 2022
acbda3e
fix for conv_bias_mkldnn_pass (#47037)
jakpiase Oct 17, 2022
f0af270
[Auto Parallel] Fix the bug of completion (#47056)
aoyulong Oct 17, 2022
9e08633
Layernorm shift partition enhance (#46816)
wwbitejotunn Oct 17, 2022
6430790
support __floordiv__ (#47060)
veyron95 Oct 17, 2022
abb3813
[Hackathon 3rd No.22 ] add paddle.incubate.sparse.reshape (#46694)
OccupyMars2025 Oct 17, 2022
f1a9f87
【Hackathon No.8】 add gumbel distribution api (#46255)
PureNatural Oct 17, 2022
1328443
Fix warning message format error (#47045)
RedContritio Oct 17, 2022
d43c972
[hidden trouble] Update test_sparse_transpose_op.py to get rid of a …
OccupyMars2025 Oct 17, 2022
ec74939
[PHI]Modify DataLayout's namespace from paddle::experimental to phi (…
YuanRisheng Oct 17, 2022
776e80a
delete maybe unused code in paddle\phi\infermeta\sparse\unary.h (#46844)
OccupyMars2025 Oct 17, 2022
7c6835c
Revert "add common subexpression elimination (#44386)" (#47062)
phlrain Oct 17, 2022
0b39b24
Support BF16 training for sharding (#46846)
GhostScreaming Oct 17, 2022
b9a2f29
Add enable_partial_send_recv switch in pipeline_configs (#46992)
GhostScreaming Oct 17, 2022
7c92177
[AutoParallel] add callbacks (#47014)
zhaoyinglia Oct 18, 2022
a9c2066
delete GetExpectedKernelType mkldnn of conv_op (#47044)
jiahy0825 Oct 18, 2022
bdd3dde
[code-gen] Support code-gen for opmaker of sparse op (#46993)
zyfncg Oct 18, 2022
55ac9c4
[XPU] update xpu cmake to 1016. test=kunlun (#47041)
houj04 Oct 18, 2022
62c0aba
[Eager, Performance optimization] support pow( ** operator) to sink t…
veyron95 Oct 18, 2022
ad4c773
[CodeStyle][py2] remove `compat` module (to_text) (#47036)
SigureMo Oct 18, 2022
a21a2b5
[Paddle Inference] Add_expand_v2_trt_layer (#47002)
xiaoxiaohehe001 Oct 18, 2022
a89b33f
fix doc of some sparse api (#47020)
zhwesky2010 Oct 18, 2022
1cc482b
reconstruct code for convert_fp16 (#46428)
jiweibo Oct 18, 2022
42e312a
Construct exec and ctx only once in cond op to speed up (#47009)
zh794390558 Oct 18, 2022
30dae6d
remove __future__ import in docstring, test=document_fix (#46890)
SigureMo Oct 18, 2022
da05135
[Auto Parallel] Add cost interface (#47043)
Caozhou1995 Oct 18, 2022
b7a23ad
FC + activation fuse passes (#45183)
Silv3S Oct 18, 2022
5e9f491
Merge layernorm trt fuse (#46320)
wwbitejotunn Oct 18, 2022
e5e3d5c
Add value check & error message for gather_tree (#47051)
FrostML Oct 18, 2022
d68c38e
add embedding range check (#46991)
seemingwang Oct 18, 2022
178d7e5
add strategy group (#47021)
LiYuRio Oct 18, 2022
9cdf30d
[CustomDevice] turn on WITH_CUSTOM_DEVICE when WITH_PYTHON=ON (#47108)
ronny1996 Oct 18, 2022
3108ba1
[Auto Parallel]Add parallel tuner (#46189)
Caozhou1995 Oct 18, 2022
35d5db3
[Zero-Dim] support 0D Tensor for reshape/create_parameters (#47074)
zhwesky2010 Oct 18, 2022
5c0bfc1
[Paddle-TRT]Rewrite strided_slice converter using shape tensor (#46…
zhoutianzi666 Oct 18, 2022
c7d2e82
update audio api examples (#46938)
SmileGoat Oct 18, 2022
75b1678
Fix bugs in the General Plugin Mechanism (#47072)
weishengying Oct 18, 2022
d817d89
fix send for old dygraph mode by passing use_calc_stream to the send …
sljlp Oct 19, 2022
ddf317e
[CodeStyle][py2] fix a decode error caused by 47036 (#47097)
SigureMo Oct 19, 2022
be273ea
fix build warning: [Wsign-compare] on linux (#46644)
Li-fAngyU Oct 19, 2022
1a14d01
Reduce squeeze2_matmul_fuse_pass, flattent tests time (#47098)
zlsh80826 Oct 19, 2022
3c39475
fix old dygraph a vlog bug (#47115)
wanghuancoder Oct 19, 2022
3f40cdf
Loose TRT fp16 tests tolerance (#47100)
zlsh80826 Oct 19, 2022
d53bd8c
Loose TRT half test tolerance to 1e-3 (#47106)
zlsh80826 Oct 19, 2022
e435d69
clean unused code: piece.cc/h (#47103)
zhiqiu Oct 19, 2022
2814d7f
Construct exec and ctx only once in cond op to speed up (#47092)
zh794390558 Oct 19, 2022
b3afac8
[Dy2Stat]Polish @to_static temporary file directory to speed up trans…
Aurelius84 Oct 19, 2022
36ab58f
Loose TRT half test tolerance to 1e-3 (#47101)
zlsh80826 Oct 19, 2022
be3908a
[Dy2Static] Remove GradTransformer (#47063)
2742195759 Oct 19, 2022
1e1c727
slice op supports uint8_t (#47067)
will-jl944 Oct 19, 2022
9413219
[Dy2St]Fix recurrent op eager deletion pass error in dy2st (#47105)
0x45f Oct 19, 2022
af4bded
Support uniform api and sigmoid api in new AD (#46960)
Charles-hit Oct 19, 2022
3bc4b85
Enable to record whether the conv algo is got by exhaustive search to…
Xreki Oct 19, 2022
d00b7d8
Support stream overlap for c_allreduce_sum (#47030)
From00 Oct 19, 2022
ab36997
remove fluid symbol depend in sync bn (#47122)
chenwhql Oct 19, 2022
0ad7f53
[Dy2Static] Support TypeHint for function decorated by @to_static (#4…
2742195759 Oct 19, 2022
85489d3
Rename name of op and op_args in yaml to align python api (#46343)
zyfncg Oct 19, 2022
95ca886
move the logic of mkldnn layout in GetKernelTypeForVar from Activatio…
zyfncg Oct 19, 2022
065608d
[CodeStyle] add more information when codestyle check failed (#47116)
SigureMo Oct 19, 2022
de6e743
add nvtxRangePush/Pop for naive_executor and refine some code (#47139)
yuanlehome Oct 19, 2022
499d2da
[CodeStyle][F403] expand star import (#46946)
SigureMo Oct 19, 2022
3684ad1
[CodeStyle][F403] remove error code E403 in flake8 config (#46947)
SigureMo Oct 19, 2022
e6fb551
[CodeStyle][py2] remove `six` package (part 1) (#46965)
SigureMo Oct 19, 2022
e89d729
fix rpc compile bug (#47026)
Ningsir Oct 19, 2022
89d481d
modify timeout limitation, test=infer-coverage (#46831)
RichardWooSJTU Oct 19, 2022
6f7e768
test=infer-coverage fix A10 test_fc_elementwise_layernorm_fuse_pass (…
carryyu Oct 19, 2022
7a2489e
fix clang warning of [-Wformat] (#47137)
GreatV Oct 19, 2022
b9c8c1b
[Eager] polish general_grad (#47151)
veyron95 Oct 19, 2022
b9e6b94
fix sparse inplace (#47167)
Oct 20, 2022
2246e88
[Wsign-compare] Close Wno-error=sign_compare (#47163)
Li-fAngyU Oct 20, 2022
f040877
[Eager, Performance optimization] support not equal to sink to cpp la…
veyron95 Oct 20, 2022
8d2ce06
add test for stage2 + dp (#47114)
wuhuachaocoding Oct 20, 2022
0e552c0
support qat in sharding stage2 (#47169)
haohongxiang Oct 20, 2022
a343667
[Dy2Static] Remove deprecated code in dy2static (#47148)
2742195759 Oct 20, 2022
c91b1b9
【PaddlePaddle Hackathon 3 No.45 & 46】: make Paddle cumsum and logcumsumexp support …
thunder95 Oct 20, 2022
e1c0461
[CodeStyle][W605] Add escape symbols to some strings (#46752)
caolonghao Oct 20, 2022
dc64db1
[CodeStyle][W605] Update .flake8 config (#46753)
caolonghao Oct 20, 2022
68e27f3
fix gcc54 compile failed (#47172)
chenwhql Oct 20, 2022
979af47
[AutoParallel] fix fp16 for subblock (#47189)
zhaoyinglia Oct 20, 2022
0508b94
Add _get_phi_kernel_name interface (#47032)
JZZ-NOTE Oct 20, 2022
aab11dd
add get ops scripts (#47048)
JZZ-NOTE Oct 20, 2022
af9486f
Add infer prune function (#47046)
JZZ-NOTE Oct 20, 2022
10881b6
fix problem of persistable var saving in QAT (#47178)
yghstill Oct 20, 2022
0e1b614
[Sparse] Fix indices (#47190)
Oct 20, 2022
ad7aeb9
support compiling: with_distribute=on and with_pscore=off (#47192)
Ningsir Oct 20, 2022
acf56fb
A10 trt test trt activation pass (#47175)
wwbitejotunn Oct 20, 2022
d6208aa
log only if > 0 (#47181)
sfraczek Oct 20, 2022
32cb7e2
add -Werror=format for macos (#47216)
GreatV Oct 20, 2022
420d4bc
add pdsa-2022-001, test=document_fix (#47222)
VigiZhang Oct 20, 2022
4dc4d5f
[MKLDNN] Delete mkldnn hard code of fc (#47138)
jiahy0825 Oct 20, 2022
5a2e517
Add FusedMultiTransformer fuse pass for GPT3 (#45907)
heavengate Oct 20, 2022
ec5b27f
add paddle audio dataset && backend (#45939)
SmileGoat Oct 20, 2022
ac66653
fix:use constant_fold for vit pass (#47211)
Oct 20, 2022
1ba592d
opt mkldnn selection judgement (#47217)
jiahy0825 Oct 20, 2022
f61f9e7
clean conv_op useless variable (#47213)
jiahy0825 Oct 20, 2022
1c8ef38
[Sparse] change paddle.incubate.sparse to paddle.sparse (#47152)
zhwesky2010 Oct 20, 2022
340009d
fix nvprof_nvtx_push interface bug (#47232)
yuanlehome Oct 21, 2022
43ad0b1
Fix the bug where the device memory address appears in abs_grad kerne…
ZibinGuo Oct 21, 2022
ab936d8
fix links (#47243)
VigiZhang Oct 21, 2022
f1b8f0e
fix process group init bug (#47224)
Caozhou1995 Oct 21, 2022
a9ac608
fix bug of abs_grad in eager mode for kunlun, test=kunlun (#47164)
zhangyk0314 Oct 21, 2022
9be2b72
Fix virtualpp with mp/recompute bugs (#47242)
FeixLiu Oct 21, 2022
a657465
fix numpy issue in codeblock examples (#47042)
kevinng77 Oct 21, 2022
b438dff
fix paddle.get_default_dtype (#47040)
CodeNTrade2025 Oct 21, 2022
016766c
fix runtime error (#47133)
gglin001 Oct 21, 2022
7097630
[CodeStyle][black] use black instead of yapf (#46014)
SigureMo Oct 23, 2022
a5f556f
[CodeStyle][black] restore changes of PR-CI-Coverage in #46014 (#47266)
SigureMo Oct 23, 2022
31f57f2
Move the header file of conv cudnn and miopen to phi directory. (#47248)
Xreki Oct 24, 2022
c5fe109
[CodeStyle] fix macos inconsistent-missing-override warnings and add …
GreatV Oct 24, 2022
9f66661
c++ support read flags from env (#47223)
jiweibo Oct 24, 2022
aede713
[MKLDNN] Delete mkldnn hard code of mul (#47166)
jiahy0825 Oct 24, 2022
cc753aa
[CodeStyle][F522] Remove unused arguments (#46743)
caolonghao Oct 24, 2022
512cb29
[CodeStyle][black] format dy2static unittests (#47268)
SigureMo Oct 24, 2022
05d1cf1
[CodeStyle][black] update flake8 config (#47270)
SigureMo Oct 24, 2022
28ed27a
[CodeStyle][F522] Update .flake8 config (#47287)
caolonghao Oct 24, 2022
84273aa
fix cumsum compilation error for GPU architecture that does not suppo…
Oct 24, 2022
2f3ad5a
optimize: vit static shape (#47280)
Oct 24, 2022
bc47e7a
Enhance the implementation of some conv functions. (#47281)
Xreki Oct 24, 2022
3a0690e
fix issue (#47250)
firestonelib Oct 24, 2022
2e299ad
add prior_box and box_coder for paddle.vision.ops (#47282)
nemonameless Oct 24, 2022
4021258
Fix compilation bug caused by incorrect log information (#47254)
yeliang2258 Oct 24, 2022
7aa608a
fix for bias caching and scales optimization (#47234)
jakpiase Oct 24, 2022
1431265
[Sparse] fix doc (#46967)
Oct 24, 2022
5b1dd38
[code-gen] Generate static graph code for exp op (#47120)
zyfncg Oct 24, 2022
3f64a2c
Polish slice code in fluid (#45746)
zyfncg Oct 24, 2022
5e97651
fix warning infos of recompute hybrid in eager mode (#47288)
haohongxiang Oct 24, 2022
b420d4d
fix multiprocess error (#47301)
Aurelius84 Oct 25, 2022
a3e8ca4
[geometric] fix english doc (#46485)
DesmonDay Oct 25, 2022
3e7abca
[Add author] Add author (#47259)
jiahy0825 Oct 25, 2022
afd5a96
opt conv_transpose cudnn (#47294)
jiahy0825 Oct 25, 2022
ac3b882
[Zero-Dim] support input 0D Tensor for softmax/log_softmax/gumbel_sof…
zhwesky2010 Oct 25, 2022
13a5f18
[BugFix] while cond receives dict as input (#47299)
2742195759 Oct 25, 2022
9507969
【Hackathon No.6】implement nan_to_num (#42469)
tiancaishaonvjituizi Oct 25, 2022
ea8e87f
[CodeStyle][py2] remove `paddle.compat` (#47269)
SigureMo Oct 25, 2022
6f5e782
[Kernel Selection] Remove hard code of PADDLE_WITH_MKLDNN (Part2 add …
jiahy0825 Oct 25, 2022
d869056
fix braced-scalar-init warnings on macos (#47309)
GreatV Oct 25, 2022
33a7bb8
[CodeStyle][E231] remove unnecessary `,` (#47297)
SigureMo Oct 25, 2022
0d04bfe
[CodeStyle][E231] update flake8 config (#47298)
SigureMo Oct 25, 2022
d5e7d20
minor split optimization (#47314)
jakpiase Oct 25, 2022
98beb5a
export all symbols for phi_function_api (#47208)
engineer1109 Oct 25, 2022
ff07f8a
[CUDNN hardcode] Opt CUDNN hardcode of sequence_softmax (#47319)
jiahy0825 Oct 25, 2022
c1077ae
clean fusion_conv_inception headerfile (#47320)
jiahy0825 Oct 25, 2022
7e06541
[Hackathon No.10] Add unit tests for Normal (#47070)
MayYouBeProsperous Oct 25, 2022
06ef3f0
update phi approval list, test=document_fix (#47332)
chenwhql Oct 25, 2022
0abf756
Added workaround for elementwise oneDNN kernel (#47080)
jakpiase Oct 25, 2022
c0525b8
fix a bug that print log twice (#47336)
sljlp Oct 25, 2022
aab21d1
[Wsign-compare] Close Wno-error of sign-compare (#47252)
Li-fAngyU Oct 26, 2022
0521af4
[Eager, Performance optimization] support equal under cpp (#47315)
veyron95 Oct 26, 2022
14536d0
fix pylayer copy error (#47154)
JiabinYang Oct 26, 2022
cb09cf9
fix dygraph higer node creation (#47231)
Charles-hit Oct 26, 2022
17a0362
fix slice_assign_p (#47324)
Charles-hit Oct 26, 2022
6ef5d34
Refine the memory usage of fused_attention and fused_feedforward ops …
sneaxiy Oct 26, 2022
1cb12ff
Remove the declaration of using LoDTensor in framework/lod_tensor.h (…
chenwhql Oct 26, 2022
c334405
clean useless api tests in phi (#47321)
chenwhql Oct 26, 2022
c98af92
[audio]fix split fold in tess dataset (#47328)
SmileGoat Oct 26, 2022
f7616d7
fix slice bug (#47349)
wanghuancoder Oct 26, 2022
dfe6d8f
Fix inference performance problem caused by selecting cudnn kernel of…
zyfncg Oct 26, 2022
076c41e
fix uninitialized, tautological-constant-out-of-range-compare and lit…
GreatV Oct 26, 2022
40f1595
fix pylayer name crash (#47323)
JiabinYang Oct 26, 2022
4137c46
fix multi_tensor adam/momentum bug (#47352)
sneaxiy Oct 26, 2022
2534ca7
test success on cuda11.7 (#47348)
Oct 26, 2022
d8314ff
[Fix] Fix paddle.pow() Gets Incorrect Result When Broadcasting Is Tri…
Bobholamovic Oct 26, 2022
1f3ff41
Fix dlpack deletion (#47310)
DesmonDay Oct 26, 2022
436115c
clean mkldnn headerfile (#47362)
jiahy0825 Oct 26, 2022
40ce7f4
Wandb callback (#46918)
manangoel99 Oct 26, 2022
d78dd7e
[MKLDNN] Delete mkldnn hard code of prior_box (#47068)
jiahy0825 Oct 26, 2022
c1c2be2
FC/matmul(v2) + scale fuse pass (#47127)
Silv3S Oct 26, 2022
d17d0cd
Preln_Layernorm_Shift_Partition (#47099)
b3602sss Oct 26, 2022
54dd19b
[Docs]fix return_type issue (#47371)
Ligoml Oct 27, 2022
19feba3
Fix compile error of mkldnn and tensorrt (#47388)
chenwhql Oct 27, 2022
cb74666
delete GetKernelTypeForVar mkldnn hardcode (#47360)
jiahy0825 Oct 27, 2022
77dbb31
fix reduce_any kernel data race on sharedMem (#47233)
zhangbopd Oct 27, 2022
493fbfd
Update of PHI transpose_grad (#47311)
jczaja Oct 27, 2022
13181fd
Add launch_bounds (#47285)
Wong4j Oct 27, 2022
8dca988
Fix the symbol missing bug about cinn. (#47347)
wzzju Oct 27, 2022
daf98c1
New precise map (#47389)
risemeup1 Oct 27, 2022
23c9c88
precise_test_logic_update (#47387)
risemeup1 Oct 27, 2022
8607a18
clean abs cudnn (#47374)
jiahy0825 Oct 27, 2022
4d5c8a6
clean angle cudnn (#47375)
jiahy0825 Oct 27, 2022
539f300
clean gelu cudnn (#47378)
jiahy0825 Oct 27, 2022
2096448
make all cpp tests dynamic linked to libpaddle.so [except windows] (#…
zhiqiu Oct 27, 2022
8775545
support prepare_data for selected_rows in c++ api (#47380)
zyfncg Oct 27, 2022
b68c4a1
[Dy2St]Fix abnormal growth of memory in train mode and no_grad for Dy…
0x45f Oct 27, 2022
5429d14
update dygraph PTQ export_model api (#47284)
yghstill Oct 27, 2022
0972d6a
[Paddle Inference] improve convert_to_mixed_precision (#47333)
yuanlehome Oct 27, 2022
b160d09
[JIT] Add Predictor for JITLayer (#47379)
Aurelius84 Oct 27, 2022
800e053
fix pragma-pack warning on macos (#47399)
GreatV Oct 28, 2022
6b77bff
fix default setting of dygraph PTQ (#47413)
yghstill Oct 28, 2022
533f6cb
Revert "Optimiza params sync between CPU and GPU. (#45805)" (#47356)
jiweibo Oct 28, 2022
6baeb2d
Generate static graph code for some activation ops by Yaml (#47382)
zyfncg Oct 28, 2022
57d5ffa
[Dygraph] Fix memory bugs of no sync and SplitTensors in DataParallel…
haohongxiang Oct 28, 2022
e48b6dc
[JITLayer]Enable OneDNN on CPU and Fix zero shape (#47428)
Aurelius84 Oct 28, 2022
0f649b3
remove tcp store barrier (#47184)
LiYuRio Oct 28, 2022
26c419c
[audio]fix audio get_window security error (#47386)
SmileGoat Oct 28, 2022
315ef26
[AutoParallel] fix engine _build and cost method (#47263)
zhaoyinglia Oct 28, 2022
e77c062
[Dygraph] Finish fixing mem bugs of no sync in DataParallel (#47444)
haohongxiang Oct 28, 2022
17fb92b
generate static graph code for some ops by yaml (#47416)
zyfncg Oct 28, 2022
c036c5c
Add fused_allreduce_gradients_with_group for PPFleetX (#47447)
sneaxiy Oct 28, 2022
67ca9d4
[INCUBATE] Add dist save/load for sharding stage2 (#46908)
sljlp Oct 29, 2022
605b3f9
Fix gen cmake (#47457)
sljlp Oct 30, 2022
2b6bccc
Fix the problem of printing log (#47474)
risemeup1 Oct 31, 2022
31b677b
apply new precise_card_test to coverage_ci (#47473)
risemeup1 Oct 31, 2022
1e2a371
repair log bugs that keeps printing warnings (#47467)
risemeup1 Oct 31, 2022
91096ae
remove boost compiler flags in flags.cmake (#47468)
GreatV Oct 31, 2022
d4b68da
[audio] rm kaiser window in audio get_window function && rm audio uti…
SmileGoat Oct 31, 2022
81b93eb
fix python module not found bug (#47438)
zhangbo9674 Oct 31, 2022
c8fc337
[Zero-Dim] support input 0D Tensor for reduce_sum/reduce_mean (#47219)
zhwesky2010 Oct 31, 2022
f5912d0
fix typos for `True` and `False` (#47477)
SigureMo Oct 31, 2022
bb6356e
[MLU] fix compile error & add mlu blacklist function. (#47439)
ShawnNew Oct 31, 2022
3b5e732
fix nlu compilation (#47707)
zhiqiu Nov 7, 2022
41b6463
[MLU] fix softmax_with_cross_entropy failed in 370-X8.
ShawnNew Nov 23, 2022
d7fc2e9
[MLU] fix cncl stuck caused by multiple initializations.
ShawnNew Nov 23, 2022
9eae75d
[MLU] add analysis_config for mlu.
ShawnNew Oct 10, 2022
3805553
[MLU] add paddleinference support.
ShawnNew Oct 12, 2022
29ae4ec
[MLU] fix compile error cause by pdinference.
ShawnNew Dec 12, 2022
1a1e2be
[MLU] fix ce_loss unpack failed in non-static mode.
ShawnNew Jan 4, 2023
4ef39c2
[MLU] fix bn and enable tf32 (#1)
fwenguang Jan 9, 2023
1ba2313
[MLU] fix masked_select (#2)
PeiyuLau Jun 8, 2023
2ee213b
[MLU] add index_select and index_select_grad kernel
PeiyuLau Jun 9, 2023
add thread name for dataloader (PaddlePaddle#46990)
zhiqiu authored Oct 13, 2022
commit 770501b802e9371059fe0b308c7c0c3f1ab70d75
2 changes: 2 additions & 0 deletions paddle/fluid/platform/os_info.cc
@@ -27,6 +27,7 @@ limitations under the License. */
 #else
 #include <unistd.h>
 #endif
+#include "glog/logging.h"
 #include "paddle/fluid/framework/new_executor/workqueue/thread_data_registry.h"
 #include "paddle/fluid/platform/macros.h"  // import DISABLE_COPY_AND_ASSIGN
 
@@ -115,6 +116,7 @@ bool SetCurrentThreadName(const std::string& name) {
     return false;
   }
   instance.SetCurrentThreadData(name);
+  VLOG(4) << __func__ << " " << name;
   return true;
 }
 
2 changes: 1 addition & 1 deletion paddle/fluid/platform/os_info.h
@@ -57,7 +57,7 @@ ThreadId GetCurrentThreadId();
 // create/destory when using it.
 std::unordered_map<uint64_t, ThreadId> GetAllThreadIds();
 
-static constexpr const char* kDefaultThreadName = "unset";
+static constexpr const char* kDefaultThreadName = "unnamed";
 // Returns kDefaultThreadName if SetCurrentThreadName is never called.
 std::string GetCurrentThreadName();
 
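As a rough illustration of the behaviour the two os_info hunks describe (a thread reports the name "unnamed" until SetCurrentThreadName succeeds, and the setter can return false), here is a minimal Python sketch. It is not Paddle's implementation: `threading.local` is only a stand-in for ThreadDataRegistry, the function names are hypothetical, and the rejection rules (name already set, empty name, or the default name) are assumptions, since the hunk only shows that a `return false` path exists.

```python
import threading

# Mirrors kDefaultThreadName after this commit ("unset" became "unnamed").
K_DEFAULT_THREAD_NAME = "unnamed"

# Hypothetical stand-in for ThreadDataRegistry<std::string>: one slot per thread.
_registry = threading.local()


def set_current_thread_name(name: str) -> bool:
    """Name the calling thread; return False if the request is rejected."""
    # ASSUMPTION: a thread can be named only once, and empty or default
    # names are refused. The diff does not show the actual conditions.
    if getattr(_registry, "name", None) is not None:
        return False
    if not name or name == K_DEFAULT_THREAD_NAME:
        return False
    _registry.name = name
    return True


def get_current_thread_name() -> str:
    """Return the stored name, or the default if none was ever set."""
    return getattr(_registry, "name", K_DEFAULT_THREAD_NAME)
```

Because the storage is per thread, a name set in one thread is invisible to every other thread, which is why each reader thread in the Python hunks has to call the setter from inside its own loop function.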
2 changes: 2 additions & 0 deletions paddle/fluid/pybind/pybind.cc
@@ -854,6 +854,8 @@ PYBIND11_MODULE(libpaddle, m) {
 
   m.def("_set_paddle_lib_path", &paddle::platform::dynload::SetPaddleLibPath);
 
+  m.def("set_current_thread_name", &paddle::platform::SetCurrentThreadName);
+
   m.def("_promote_types_if_complex_exists",
         &paddle::framework::PromoteTypesIfComplexExists);
 
2 changes: 2 additions & 0 deletions python/paddle/fluid/dataloader/dataloader_iter.py
@@ -205,6 +205,7 @@ def _thread_loop(self, legacy_expected_place):
         # If we do not set cudaDeviceId in new thread, the default cudaDeviceId will be 0,
         # Which may cost hundreds of MB of GPU memory on CUDAPlace(0) if calling some cuda
         # APIs in this thread.
+        core.set_current_thread_name("Dataloader_" + str(id(self)))
         _set_expected_place(legacy_expected_place)
 
         while not self._thread_done_event.is_set():
@@ -530,6 +531,7 @@ def _thread_loop(self, legacy_expected_place):
         # If we do not set cudaDeviceId in new thread, the default cudaDeviceId will be 0,
         # Which may cost hundreds of MB of GPU memory on CUDAPlace(0) if calling some cuda
         # APIs in this thread.
+        core.set_current_thread_name("Dataloader_" + str(id(self)))
         _set_expected_place(legacy_expected_place)
 
         while not self._thread_done_event.is_set():
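The naming pattern used in these hunks, "Dataloader_" + str(id(self)) set as the first statement of the worker loop, can be tried without Paddle. The sketch below is an assumption-laden stand-in: `LoaderSketch` is a hypothetical class, and it renames the Python thread object through the standard `threading` module instead of calling `core.set_current_thread_name`, which needs a Paddle build.

```python
import threading


class LoaderSketch:
    """Hypothetical stand-in for a DataLoader iterator; not a Paddle class."""

    def __init__(self):
        self.seen_name = None
        self._thread = threading.Thread(target=self._thread_loop)

    def _thread_loop(self):
        # The PR calls core.set_current_thread_name(...) at this point.
        # Here the Python-level thread name is set instead, so the sketch
        # runs with the standard library only.
        threading.current_thread().name = "Dataloader_" + str(id(self))
        self.seen_name = threading.current_thread().name

    def run(self):
        self._thread.start()
        self._thread.join()
        return self.seen_name


a, b = LoaderSketch(), LoaderSketch()
name_a, name_b = a.run(), b.run()
print(name_a, name_b)
```

Two loaders that are alive at the same time get different names, because `id()` is unique among simultaneously existing objects. An id can be reused after an object is garbage collected, so the name identifies a loader only within its lifetime.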
1 change: 1 addition & 0 deletions python/paddle/fluid/layers/io.py
@@ -477,6 +477,7 @@ def start_provide_thread(func):
         def __provider_thread__(legacy_expected_place):
             try:
                 # See _DataLoaderIterSingleProcess._thread_loop() for why set expected place here.
+
                 _set_expected_place(legacy_expected_place)
 
                 for tensors in func():
3 changes: 3 additions & 0 deletions python/paddle/fluid/reader.py
@@ -1126,6 +1126,7 @@ def _exit_thread_unexpectedly(self):
 
     def _reader_thread_loop_for_multiprocess(self, legacy_expected_place):
         # See _DataLoaderIterSingleProcess._thread_loop() for why set expected place here.
+        core.set_current_thread_name("Dataloader_" + str(id(self)))
         _set_expected_place(legacy_expected_place)
 
         while not self._thread_done_event.is_set():
@@ -1169,6 +1170,7 @@ def _reader_thread_loop_for_multiprocess(self, legacy_expected_place):
     def _reader_thread_loop_for_singleprocess(self, legacy_expected_place):
         try:
             # See _DataLoaderIterSingleProcess._thread_loop() for why set expected place here.
+            core.set_current_thread_name("Dataloader_" + str(id(self)))
             _set_expected_place(legacy_expected_place)
 
             for sample in self._batch_reader():
@@ -1419,6 +1421,7 @@ def _start(self):
         def __thread_main__(legacy_expected_place):
             try:
                 # See _DataLoaderIterSingleProcess._thread_loop() for why set expected place here.
+                core.set_current_thread_name("Dataloader_" + str(id(self)))
                 _set_expected_place(legacy_expected_place)
 
                 while not self._queue.wait_for_inited(1):