datafetcher.log
2024-08-01 11:03:00,336 INFO Created a list of tuples with 96352 entries
2024-08-01 11:03:00,343 INFO Loaded 96352 associations
2024-08-01 11:03:00,809 INFO Created a dataframe with 96352 entries and column values ['id' 'subject_id' 'subject_label' 'subject_iri' 'subject_category'
'subject_taxon_id' 'subject_taxon_label' 'object_id' 'object_label'
'object_iri' 'object_category' 'object_taxon_id' 'object_taxon_label'
'relation_id' 'relation_label' 'relation_iri']
2024-08-01 11:03:01,710 INFO Created a dataframe with 9732 entries and column values ['id' 'semantic_groups' 'name']
2024-08-01 11:05:14,003 INFO Created a list of tuples with 228753 entries
2024-08-01 11:05:14,016 INFO Loaded 228753 associations
2024-08-01 11:05:15,047 INFO Created a dataframe with 228753 entries and column values ['id' 'subject_id' 'subject_label' 'subject_iri' 'subject_category'
'subject_taxon_id' 'subject_taxon_label' 'object_id' 'object_label'
'object_iri' 'object_category' 'object_taxon_id' 'object_taxon_label'
'relation_id' 'relation_label' 'relation_iri']
2024-08-01 11:05:17,050 INFO Created a dataframe with 14618 entries and column values ['id' 'semantic_groups' 'name']
2024-08-01 11:05:39,246 INFO Created a list of tuples with 95152 entries
2024-08-01 11:05:39,251 INFO Loaded 95152 associations
2024-08-01 11:05:39,602 INFO Created a dataframe with 95152 entries and column values ['id' 'subject_id' 'subject_label' 'subject_iri' 'subject_category'
'subject_taxon_id' 'subject_taxon_label' 'object_id' 'object_label'
'object_iri' 'object_category' 'object_taxon_id' 'object_taxon_label'
'relation_id' 'relation_label' 'relation_iri']
2024-08-01 11:05:40,349 INFO Created a dataframe with 10740 entries and column values ['id' 'semantic_groups' 'name']
2024-08-01 13:03:07,050 INFO Created a list of tuples with 96352 entries
2024-08-01 13:03:07,055 INFO Loaded 96352 associations
2024-08-01 13:22:23,295 INFO Created a list of tuples with 96352 entries
2024-08-01 13:22:23,298 INFO Loaded 96352 associations
2024-08-01 13:22:49,372 INFO Created a list of tuples with 96352 entries
2024-08-01 13:22:49,375 INFO Loaded 96352 associations
2024-08-01 13:22:49,378 INFO Created a list of tuples with 203 entries
2024-08-01 13:22:49,378 INFO Loaded 203 associations
2024-08-01 13:22:49,380 INFO Created a list of tuples with 29 entries
2024-08-01 13:22:49,380 INFO Loaded 29 associations
2024-08-01 13:22:49,766 INFO The graph contains 7 different semantic groups: {'ANAT', 'GENO', 'GENE', 'VARI', 'PHYS', 'ORTH', 'DISO'}
2024-08-01 13:22:49,766 INFO For the graph, a total of 95906 edges and 9732 nodes have been generated.
2024-08-01 13:22:49,927 INFO The graph contains 8 different semantic groups: {'DRUG', 'ANAT', 'GENO', 'GENE', 'VARI', 'PHYS', 'ORTH', 'DISO'}
2024-08-01 13:22:49,928 INFO For the graph, a total of 96138 edges and 9885 nodes have been generated.
2024-08-01 13:22:49,928 INFO There are 8 semantic groups: ['DISO' 'ORTH' 'GENO' 'PHYS' 'GENE' 'VARI' 'DRUG' 'ANAT']
2024-08-01 13:22:49,953 INFO There are 22 relation labels: relation_label
relation_id
BFO:0000050 is part of
CustomRO:DC is substance that treats
CustomRO:TTD targets
GENO:0000222 has genotype
GENO:0000408 is allele of
GENO:0000418 has affected feature
GENO:0000840 pathogenic for condition
GENO:0000841 likely pathogenic for condition
RO:0002200 has phenotype
RO:0002206 expressed in
RO:0002325 colocalizes with
RO:0002326 contributes to
RO:0002327 enables
RO:0002331 involved in
RO:0002434 interacts with
RO:0003301 has role in modeling
RO:0003303 causes condition
RO:0003304 contributes to condition
RO:0004012 is causal loss of function germline mutation o...
RO:0004013 is causal germline mutation in
RO:HOM0000017 in orthology relationship with
RO:HOM0000020 in 1 to 1 orthology relationship with
2024-08-01 13:22:50,621 INFO Graph of all connections between concepts saved to prev_oi_concepts.png
2024-08-01 13:22:50,647 INFO List of triplets saved to prev_oi_triples.csv
2024-08-01 13:24:25,682 INFO Created a list of tuples with 96352 entries
2024-08-01 13:24:25,686 INFO Loaded 96352 associations
2024-08-01 13:24:25,689 INFO Created a list of tuples with 203 entries
2024-08-01 13:24:25,689 INFO Loaded 203 associations
2024-08-01 13:24:25,691 INFO Created a list of tuples with 29 entries
2024-08-01 13:24:25,691 INFO Loaded 29 associations
2024-08-01 13:24:26,053 INFO The graph contains 7 different semantic groups: {'ANAT', 'GENO', 'PHYS', 'DISO', 'VARI', 'ORTH', 'GENE'}
2024-08-01 13:24:26,053 INFO For the graph, a total of 95906 edges and 9732 nodes have been generated.
2024-08-01 13:24:26,191 INFO The graph contains 8 different semantic groups: {'ANAT', 'GENO', 'PHYS', 'DRUG', 'DISO', 'VARI', 'ORTH', 'GENE'}
2024-08-01 13:24:26,191 INFO For the graph, a total of 96138 edges and 9885 nodes have been generated.
2024-08-01 13:24:26,192 INFO There are 8 semantic groups: ['DISO' 'VARI' 'ORTH' 'GENE' 'DRUG' 'PHYS' 'ANAT' 'GENO']
2024-08-01 13:24:26,207 INFO There are 22 relation labels: relation_label
relation_id
BFO:0000050 is part of
CustomRO:DC is substance that treats
CustomRO:TTD targets
GENO:0000222 has genotype
GENO:0000408 is allele of
GENO:0000418 has affected feature
GENO:0000840 pathogenic for condition
GENO:0000841 likely pathogenic for condition
RO:0002200 has phenotype
RO:0002206 expressed in
RO:0002325 colocalizes with
RO:0002326 contributes to
RO:0002327 enables
RO:0002331 involved in
RO:0002434 interacts with
RO:0003301 has role in modeling
RO:0003303 causes condition
RO:0003304 contributes to condition
RO:0004012 is causal loss of function germline mutation o...
RO:0004013 is causal germline mutation in
RO:HOM0000017 in orthology relationship with
RO:HOM0000020 in 1 to 1 orthology relationship with
2024-08-01 13:24:26,483 INFO Graph of all connections between concepts saved to prev_oi_concepts.png
2024-08-01 13:24:26,500 INFO List of triplets saved to prev_oi_triples.csv
2024-08-01 16:21:31,165 INFO Created a list of tuples with 96352 entries
2024-08-01 16:21:31,170 INFO Loaded 96352 associations
2024-08-01 16:21:31,549 INFO The graph contains 12 different semantic groups: {'homology', 'pathway', 'interaction', 'variant', 'marker', 'function', 'disease', 'phenotype', 'genotype', 'anatomy', 'model', 'gene'}
2024-08-01 16:21:31,550 INFO For the graph, a total of 95906 edges and 9732 nodes have been generated.
2024-08-01 16:21:31,552 INFO Extracted a total of 9732 nodes that belong to at least one of the semantic groups []
2024-08-01 16:21:31,564 INFO A total of 9732 gene IDs has been extracted
2024-08-01 16:21:31,637 INFO Loaded 19378 drug-target interactions:
DRUG_NAME STRUCT_ID TARGET_NAME TARGET_CLASS ... MOA_SOURCE_URL ACTION_TYPE TDL ORGANISM
0 levobupivacaine 4 Potassium voltage-gated channel subfamily H me... Ion channel ... NaN NaN Tclin Homo sapiens
1 levobupivacaine 4 Sodium channel protein type 1 subunit alpha Ion channel ... NaN NaN Tclin Homo sapiens
2 levobupivacaine 4 Sodium channel protein type 4 subunit alpha Ion channel ... https://www.ebi.ac.uk/chembl/compound/inspect/... BLOCKER Tclin Homo sapiens
[3 rows x 20 columns]
2024-08-01 16:21:46,781 INFO For a total of 16141 drug-target interactions, new mapped IDs are found.
2024-08-01 16:21:46,790 INFO Retrieved 208 drug-target interactions with matched gene IDs:
DRUG_NAME STRUCT_ID GENE_ID PROD_ID PROD_NAME
198 aclarubicin 80 HGNC:7166 P08253 72 kDa type IV collagenase
319 aldosterone 111 HGNC:10839 P04278 Sex hormone-binding globulin
475 aminoquinuride 174 HGNC:7176 P14780 Matrix metalloproteinase-9
716 androstenediol 214 HGNC:10839 P04278 Sex hormone-binding globulin
2024-08-01 16:21:46,792 INFO Total of 208 drug-target associations changed to 203 by dropping duplicates.
2024-08-01 16:24:50,581 INFO Created a list of tuples with 96352 entries
2024-08-01 16:24:50,585 INFO Loaded 96352 associations
2024-08-01 16:24:50,970 INFO The graph contains 12 different semantic groups: {'interaction', 'variant', 'disease', 'genotype', 'homology', 'anatomy', 'marker', 'phenotype', 'gene', 'function', 'pathway', 'model'}
2024-08-01 16:24:50,970 INFO For the graph, a total of 95906 edges and 9732 nodes have been generated.
2024-08-01 16:24:50,973 INFO Extracted a total of 9732 nodes that belong to at least one of the semantic groups []
2024-08-01 16:24:50,987 INFO A total of 9732 gene IDs has been extracted
2024-08-01 16:24:51,096 INFO Loaded 19378 drug-target interactions:
DRUG_NAME STRUCT_ID TARGET_NAME TARGET_CLASS ... MOA_SOURCE_URL ACTION_TYPE TDL ORGANISM
0 levobupivacaine 4 Potassium voltage-gated channel subfamily H me... Ion channel ... NaN NaN Tclin Homo sapiens
1 levobupivacaine 4 Sodium channel protein type 1 subunit alpha Ion channel ... NaN NaN Tclin Homo sapiens
2 levobupivacaine 4 Sodium channel protein type 4 subunit alpha Ion channel ... https://www.ebi.ac.uk/chembl/compound/inspect/... BLOCKER Tclin Homo sapiens
[3 rows x 20 columns]
2024-08-01 16:25:06,083 INFO For a total of 16141 drug-target interactions, new mapped IDs are found.
2024-08-01 16:25:06,091 INFO Retrieved 208 drug-target interactions with matched gene IDs:
DRUG_NAME STRUCT_ID GENE_ID PROD_ID PROD_NAME
198 aclarubicin 80 HGNC:7166 P08253 72 kDa type IV collagenase
319 aldosterone 111 HGNC:10839 P04278 Sex hormone-binding globulin
475 aminoquinuride 174 HGNC:7176 P14780 Matrix metalloproteinase-9
716 androstenediol 214 HGNC:10839 P04278 Sex hormone-binding globulin
2024-08-01 16:25:06,093 INFO Total of 208 drug-target associations changed to 203 by dropping duplicates.
2024-08-01 16:25:06,245 INFO All 406 TTD associations are saved into restr_oi_ttd_associations.csv
2024-08-01 16:25:06,246 INFO Created a list of tuples with 406 entries
2024-08-01 16:25:06,251 INFO Extracted a total of 153 nodes that belong to at least one of the semantic groups ['drug']
2024-08-01 16:25:06,254 INFO Extracted a total of 5370 nodes that belong to at least one of the semantic groups ['disease', 'phenotype']
2024-08-01 16:25:06,267 INFO There are 153 unique drug names
2024-08-01 16:25:06,268 INFO There are 5370 unique disease/phenotype IDs
2024-08-01 16:25:06,396 INFO Loaded 28978 drug-disease pairs:
DRUG_ID DRUG_NAME DISEASE_NAME PHASE
0 D00ABE ald-301 ischemia Phase 2
1 D00ABE ald-301 peripheral arterial disease Phase 2
2 D00ABO kw-2449 acute myeloid leukaemia Phase 1
2024-08-01 16:25:06,410 INFO Loaded 699 phenotypes with matching IDs scoring 80:
Name ontologyTermName ontologyTermIRI score validated review DISEASE_ID
0 respiratory failure Respiratory failure http://purl.obolibrary.org/obo/HP_0002878 100.00 False False HP:0002878
1 sexual dysfunction Male sexual dysfunction http://purl.obolibrary.org/obo/HP_0040307 86.49 False False HP:0040307
6 pollakiuria Pollakisuria http://purl.obolibrary.org/obo/HP_0100515 88.00 False False HP:0100515
2024-08-01 16:25:06,444 INFO Total of 14760 disease names mapped to their IDs:
DRUG_ID DRUG_NAME DISEASE_ID DISEASE_NAME
1 D00ABE ald-301 HP:0004950 peripheral arterial disease
2 D00ABO kw-2449 HP:0004808 acute myeloid leukaemia
4 D00ACC nd1251 HP:0000716 depression
5 D00ACH hmr-4004 HP:0100658 bacterial infection
11 D00AHT prame antigen-specific cancer immunotherapeutic HP:0030358 non small cell lung cancer
12 D00AHT prame antigen-specific cancer immunotherapeutic HP:0030358 non small cell lung cancer
14 D00AHT prame antigen-specific cancer immunotherapeutic HP:0002861 melanoma
19 D00AJS aik11 HP:0005978 non insulin dependent diabetes
21 D00AKQ o-desulfated heparin HP:0006510 chronic obstructive pulmonary disease
22 D00AKR amg 479 HP:0003002 breast cancer
2024-08-01 16:25:06,450 INFO A total of 29 are matched with existing drugs and diseases/phenotypes
2024-08-01 16:25:06,452 INFO Total of 29 drug-disease associations changed to 29 by dropping duplicates.
2024-08-01 16:25:06,508 INFO All DrugCentral associations are saved into restr_oi_drugcentral_associations.csv
2024-08-01 16:25:06,508 INFO Created a list of tuples with 29 entries
2024-08-01 16:25:06,659 INFO The graph contains 14 different semantic groups: {'interaction', 'gene product', 'variant', 'disease', 'genotype', 'homology', 'anatomy', 'marker', 'phenotype', 'gene', 'function', 'pathway', 'model', 'drug'}
2024-08-01 16:25:06,660 INFO For the graph, a total of 96157 edges and 9907 nodes have been generated.
2024-08-01 16:25:06,661 INFO There are 14 semantic groups: ['phenotype' 'homology' 'gene' 'variant' 'interaction' 'drug' 'function'
'genotype' 'model' 'disease' 'pathway' 'gene product' 'anatomy' 'marker']
2024-08-01 16:25:06,677 INFO There are 23 relation labels: relation_label
relation_id
BFO:0000050 is part of
CustomRO:DC is substance that treats
CustomRO:TTD1 is product of
CustomRO:TTD2 targets
GENO:0000222 has genotype
GENO:0000408 is allele of
GENO:0000418 has affected feature
GENO:0000840 pathogenic for condition
GENO:0000841 likely pathogenic for condition
RO:0002200 has phenotype
RO:0002206 expressed in
RO:0002325 colocalizes with
RO:0002326 contributes to
RO:0002327 enables
RO:0002331 involved in
RO:0002434 interacts with
RO:0003301 has role in modeling
RO:0003303 causes condition
RO:0003304 contributes to condition
RO:0004012 is causal loss of function germline mutation o...
RO:0004013 is causal germline mutation in
RO:HOM0000017 in orthology relationship with
RO:HOM0000020 in 1 to 1 orthology relationship with
2024-08-01 16:25:07,121 INFO Graph of all connections between concepts saved to all_oi_concepts.png
2024-08-01 16:25:07,139 INFO List of triplets saved to all_oi_triples.csv
2024-08-01 16:33:14,148 INFO The graph contains 11 different semantic groups: {'gene product', 'variant', 'disease', 'genotype', 'phenotype', 'gene', 'biological process', 'taxon', 'biological artifact', 'molecular function', 'drug'}
2024-08-01 16:33:14,148 INFO For the graph, a total of 97068 edges and 9897 nodes have been generated.
2024-08-01 16:33:14,150 INFO There are 11 semantic groups: ['phenotype' 'gene' 'variant' 'drug' 'molecular function' 'taxon'
'genotype' 'disease' 'biological process' 'gene product'
'biological artifact']
2024-08-01 16:33:14,168 INFO There are 21 relation labels: relation_label
relation_id
CustomRO:DC is substance that treats
CustomRO:TTD1 is product of
CustomRO:TTD2 targets
CustomRO:associatedphenotype associated with phenotype
CustomRO:expressesgene expresses gene
CustomRO:foundin found in
CustomRO:isof is of
CustomRO:isvariantin is variant in
CustomRO:likelycauses likely causes condition
GENO:0000222 has genotype
GENO:0000408 is allele of
GENO:0000418 has affected feature
RO:0002325 colocalizes with
RO:0002327 enables
RO:0002331 involved in
RO:0002434 interacts with
RO:0003301 has role in modeling
RO:0003303 causes condition
RO:0003304 contributes to condition
RO:0004012 is causal loss of function germline mutation o...
RO:HOM0000017 in orthology relationship with
2024-08-01 16:33:14,304 INFO Graph of all connections between concepts saved to restr_oi_concepts.png
2024-08-01 16:33:14,328 INFO List of triplets saved to restr_oi_triples.csv
2024-08-01 16:40:42,506 INFO Created a list of tuples with 96352 entries
2024-08-01 16:40:42,509 INFO Loaded 96352 associations
2024-08-01 16:40:42,884 INFO The graph contains 12 different semantic groups: {'interaction', 'function', 'homology', 'genotype', 'gene', 'disease', 'phenotype', 'model', 'marker', 'anatomy', 'variant', 'pathway'}
2024-08-01 16:40:42,884 INFO For the graph, a total of 95906 edges and 9732 nodes have been generated.
2024-08-01 16:40:42,887 INFO Extracted a total of 9732 nodes that belong to at least one of the semantic groups []
2024-08-01 16:40:42,900 INFO A total of 9732 gene IDs has been extracted
2024-08-01 16:40:42,967 INFO Loaded 19378 drug-target interactions:
DRUG_NAME STRUCT_ID TARGET_NAME TARGET_CLASS ... MOA_SOURCE_URL ACTION_TYPE TDL ORGANISM
0 levobupivacaine 4 Potassium voltage-gated channel subfamily H me... Ion channel ... NaN NaN Tclin Homo sapiens
1 levobupivacaine 4 Sodium channel protein type 1 subunit alpha Ion channel ... NaN NaN Tclin Homo sapiens
2 levobupivacaine 4 Sodium channel protein type 4 subunit alpha Ion channel ... https://www.ebi.ac.uk/chembl/compound/inspect/... BLOCKER Tclin Homo sapiens
[3 rows x 20 columns]
2024-08-01 16:40:58,359 INFO For a total of 16141 drug-target interactions, new mapped IDs are found.
2024-08-01 16:40:58,365 INFO Retrieved 208 drug-target interactions with matched gene IDs:
DRUG_NAME STRUCT_ID GENE_ID PROD_ID PROD_NAME
198 aclarubicin 80 HGNC:7166 P08253 72 kDa type IV collagenase
319 aldosterone 111 HGNC:10839 P04278 Sex hormone-binding globulin
475 aminoquinuride 174 HGNC:7176 P14780 Matrix metalloproteinase-9
716 androstenediol 214 HGNC:10839 P04278 Sex hormone-binding globulin
2024-08-01 16:40:58,366 INFO Total of 208 drug-target associations changed to 203 by dropping duplicates.
2024-08-01 16:40:58,526 INFO All 406 TTD associations are saved into restr_oi_ttd_associations.csv
2024-08-01 16:40:58,527 INFO Created a list of tuples with 406 entries
2024-08-01 16:40:58,532 INFO Extracted a total of 153 nodes that belong to at least one of the semantic groups ['drug']
2024-08-01 16:40:58,538 INFO Extracted a total of 5370 nodes that belong to at least one of the semantic groups ['disease', 'phenotype']
2024-08-01 16:40:58,549 INFO There are 153 unique drug names
2024-08-01 16:40:58,550 INFO There are 5370 unique disease/phenotype IDs
2024-08-01 16:40:58,684 INFO Loaded 28978 drug-disease pairs:
DRUG_ID DRUG_NAME DISEASE_NAME PHASE
0 D00ABE ald-301 ischemia Phase 2
1 D00ABE ald-301 peripheral arterial disease Phase 2
2 D00ABO kw-2449 acute myeloid leukaemia Phase 1
2024-08-01 16:40:58,704 INFO Loaded 699 phenotypes with matching IDs scoring 80:
Name ontologyTermName ontologyTermIRI score validated review DISEASE_ID
0 respiratory failure Respiratory failure http://purl.obolibrary.org/obo/HP_0002878 100.00 False False HP:0002878
1 sexual dysfunction Male sexual dysfunction http://purl.obolibrary.org/obo/HP_0040307 86.49 False False HP:0040307
6 pollakiuria Pollakisuria http://purl.obolibrary.org/obo/HP_0100515 88.00 False False HP:0100515
2024-08-01 16:40:58,730 INFO Total of 14760 disease names mapped to their IDs:
DRUG_ID DRUG_NAME DISEASE_ID DISEASE_NAME
1 D00ABE ald-301 HP:0004950 peripheral arterial disease
2 D00ABO kw-2449 HP:0004808 acute myeloid leukaemia
4 D00ACC nd1251 HP:0000716 depression
5 D00ACH hmr-4004 HP:0100658 bacterial infection
11 D00AHT prame antigen-specific cancer immunotherapeutic HP:0030358 non small cell lung cancer
12 D00AHT prame antigen-specific cancer immunotherapeutic HP:0030358 non small cell lung cancer
14 D00AHT prame antigen-specific cancer immunotherapeutic HP:0002861 melanoma
19 D00AJS aik11 HP:0005978 non insulin dependent diabetes
21 D00AKQ o-desulfated heparin HP:0006510 chronic obstructive pulmonary disease
22 D00AKR amg 479 HP:0003002 breast cancer
2024-08-01 16:40:58,735 INFO A total of 29 are matched with existing drugs and diseases/phenotypes
2024-08-01 16:40:58,738 INFO Total of 29 drug-disease associations changed to 29 by dropping duplicates.
2024-08-01 16:40:58,796 INFO All DrugCentral associations are saved into restr_oi_drugcentral_associations.csv
2024-08-01 16:40:58,797 INFO Created a list of tuples with 29 entries
2024-08-01 16:40:58,957 INFO The graph contains 14 different semantic groups: {'interaction', 'function', 'homology', 'genotype', 'gene', 'drug', 'disease', 'phenotype', 'model', 'marker', 'anatomy', 'gene product', 'variant', 'pathway'}
2024-08-01 16:40:58,957 INFO For the graph, a total of 96157 edges and 9907 nodes have been generated.
2024-08-01 16:40:58,958 INFO There are 14 semantic groups: ['phenotype' 'homology' 'gene' 'drug' 'disease' 'function' 'variant'
'gene product' 'pathway' 'interaction' 'anatomy' 'genotype' 'model'
'marker']
2024-08-01 16:40:58,973 INFO There are 23 relation labels: relation_label
relation_id
BFO:0000050 is part of
CustomRO:DC is substance that treats
CustomRO:TTD1 is product of
CustomRO:TTD2 targets
GENO:0000222 has genotype
GENO:0000408 is allele of
GENO:0000418 has affected feature
GENO:0000840 pathogenic for condition
GENO:0000841 likely pathogenic for condition
RO:0002200 has phenotype
RO:0002206 expressed in
RO:0002325 colocalizes with
RO:0002326 contributes to
RO:0002327 enables
RO:0002331 involved in
RO:0002434 interacts with
RO:0003301 has role in modeling
RO:0003303 causes condition
RO:0003304 contributes to condition
RO:0004012 is causal loss of function germline mutation o...
RO:0004013 is causal germline mutation in
RO:HOM0000017 in orthology relationship with
RO:HOM0000020 in 1 to 1 orthology relationship with
2024-08-01 16:40:59,313 INFO Graph of all connections between concepts saved to all_oi_concepts.png
2024-08-01 16:40:59,332 INFO List of triplets saved to all_oi_triples.csv
2024-08-01 16:49:30,518 INFO The graph contains 11 different semantic groups: {'taxon', 'genotype', 'gene', 'drug', 'disease', 'phenotype', 'biological process', 'biological artifact', 'molecular function', 'gene product', 'variant'}
2024-08-01 16:49:30,519 INFO For the graph, a total of 97068 edges and 9897 nodes have been generated.
2024-08-01 16:49:30,520 INFO There are 11 semantic groups: ['phenotype' 'gene' 'drug' 'disease' 'taxon' 'molecular function'
'variant' 'gene product' 'biological process' 'genotype'
'biological artifact']
2024-08-01 16:49:30,537 INFO There are 20 relation labels: relation_label
relation_id
CustomRO:DC is substance that treats
CustomRO:TTD1 is product of
CustomRO:TTD2 targets
CustomRO:associatedphenotype associated with phenotype
CustomRO:expressesgene expresses gene
CustomRO:foundin found in
CustomRO:isof is of
CustomRO:isvariantin is variant in
CustomRO:likelycauses likely causes condition
GENO:0000222 has genotype
GENO:0000408 is allele of
GENO:0000418 has affected feature
RO:0002325 colocalizes with
RO:0002327 enables
RO:0002331 involved in
RO:0002434 interacts with
RO:0003301 has role in modeling
RO:0003303 causes condition
RO:0003304 contributes to condition
RO:HOM0000017 in orthology relationship with
2024-08-01 16:49:30,665 INFO Graph of all connections between concepts saved to restr_oi_concepts.png
2024-08-01 16:49:30,690 INFO List of triplets saved to restr_oi_triples.csv
2024-08-05 10:51:32,890 INFO Created a list of tuples with 96352 entries
2024-08-05 10:51:32,895 INFO Loaded 96352 associations
2024-08-05 10:51:33,493 INFO Created a dataframe with 96352 entries and column values ['id' 'subject_id' 'subject_label' 'subject_iri' 'subject_category'
'subject_taxon_id' 'subject_taxon_label' 'object_id' 'object_label'
'object_iri' 'object_category' 'object_taxon_id' 'object_taxon_label'
'relation_id' 'relation_label' 'relation_iri']
2024-08-05 10:51:34,674 INFO Created a dataframe with 9732 entries and column values ['id' 'semantic_groups' 'name']
2024-08-05 10:52:07,673 INFO Created a list of tuples with 228753 entries
2024-08-05 10:52:07,683 INFO Loaded 228753 associations
2024-08-05 10:52:09,233 INFO Created a dataframe with 228753 entries and column values ['id' 'subject_id' 'subject_label' 'subject_iri' 'subject_category'
'subject_taxon_id' 'subject_taxon_label' 'object_id' 'object_label'
'object_iri' 'object_category' 'object_taxon_id' 'object_taxon_label'
'relation_id' 'relation_label' 'relation_iri']
2024-08-05 10:52:12,586 INFO Created a dataframe with 14618 entries and column values ['id' 'semantic_groups' 'name']
2024-08-05 12:33:35,733 INFO Created a list of tuples with 228753 entries
2024-08-05 12:33:35,743 INFO Loaded 228753 associations
2024-08-05 12:33:35,748 INFO Created a list of tuples with 305 entries
2024-08-05 12:33:35,749 INFO Loaded 305 associations
2024-08-05 12:33:35,754 INFO Created a list of tuples with 39 entries
2024-08-05 12:33:35,754 INFO Loaded 39 associations
2024-08-05 12:33:39,480 INFO The graph contains 7 different semantic groups: {'PHYS', 'GENO', 'ANAT', 'VARI', 'GENE', 'ORTH', 'DISO'}
2024-08-05 12:33:39,481 INFO For the graph, a total of 228655 edges and 14618 nodes have been generated.
2024-08-05 12:33:40,418 INFO The graph contains 8 different semantic groups: {'DRUG', 'PHYS', 'GENO', 'ANAT', 'VARI', 'GENE', 'ORTH', 'DISO'}
2024-08-05 12:33:40,419 INFO For the graph, a total of 228999 edges and 14833 nodes have been generated.
2024-08-05 12:33:40,422 INFO There are 8 semantic groups: ['ORTH' 'DISO' 'GENO' 'DRUG' 'GENE' 'ANAT' 'PHYS' 'VARI']
2024-08-05 12:33:40,484 INFO There are 21 relation labels: relation_label
relation_id
BFO:0000050 is part of
CustomRO:DC is substance that treats
CustomRO:TTD targets
GENO:0000222 has genotype
GENO:0000408 is allele of
GENO:0000418 has affected feature
GENO:0000840 pathogenic for condition
RO:0002200 has phenotype
RO:0002206 expressed in
RO:0002325 colocalizes with
RO:0002326 contributes to
RO:0002327 enables
RO:0002331 involved in
RO:0002434 interacts with
RO:0003301 has role in modeling
RO:0003303 causes condition
RO:0003304 contributes to condition
RO:0004013 is causal germline mutation in
RO:0004016 is causal germline mutation partially giving r...
RO:HOM0000017 in orthology relationship with
RO:HOM0000020 in 1 to 1 orthology relationship with
2024-08-05 12:33:42,106 INFO Graph of all connections between concepts saved to prev_hd_concepts.png
2024-08-05 12:33:42,167 INFO List of triplets saved to prev_hd_triples.csv
2024-08-05 12:40:00,638 INFO Created a list of tuples with 228753 entries
2024-08-05 12:40:00,650 INFO Loaded 228753 associations
2024-08-05 12:40:04,339 INFO The graph contains 12 different semantic groups: {'variant', 'gene', 'interaction', 'model', 'disease', 'phenotype', 'function', 'anatomy', 'pathway', 'marker', 'homology', 'genotype'}
2024-08-05 12:40:04,340 INFO For the graph, a total of 228655 edges and 14618 nodes have been generated.
2024-08-05 12:40:04,355 INFO Extracted a total of 14618 nodes that belong to at least one of the semantic groups []
2024-08-05 12:40:04,413 INFO A total of 14618 gene IDs has been extracted
2024-08-05 12:40:04,532 INFO Loaded 19378 drug-target interactions:
DRUG_NAME STRUCT_ID TARGET_NAME ... ACTION_TYPE TDL ORGANISM
0 levobupivacaine 4 Potassium voltage-gated channel subfamily H me... ... NaN Tclin Homo sapiens
1 levobupivacaine 4 Sodium channel protein type 1 subunit alpha ... NaN Tclin Homo sapiens
2 levobupivacaine 4 Sodium channel protein type 4 subunit alpha ... BLOCKER Tclin Homo sapiens
[3 rows x 20 columns]
2024-08-05 12:40:29,177 INFO For a total of 16141 drug-target interactions, new mapped IDs are found.
2024-08-05 12:40:29,200 INFO Retrieved 316 drug-target interactions with matched gene IDs:
DRUG_NAME STRUCT_ID GENE_ID PROD_ID PROD_NAME
237 adenosine 90 HGNC:4141 P04406 Glyceraldehyde-3-phosphate dehydrogenase liver
257 adenosine triphosphate 91 HGNC:5241 P11142 Heat shock cognate 71 kDa protein
258 adenosine triphosphate 91 HGNC:5232 P0DMV8 Heat shock 70 kDa protein 1A
330 alfentanil 114 HGNC:8156 P35372 Mu-type opioid receptor
2024-08-05 12:40:29,203 INFO Total of 316 drug-target associations changed to 305 by dropping duplicates.
2024-08-05 12:40:30,286 INFO All 610 TTD associations are saved into restr_hd_ttd_associations.csv
2024-08-05 12:40:30,290 INFO Created a list of tuples with 610 entries
2024-08-05 12:40:30,313 INFO Extracted a total of 215 nodes that belong to at least one of the semantic groups ['drug']
2024-08-05 12:40:30,326 INFO Extracted a total of 6440 nodes that belong to at least one of the semantic groups ['disease', 'phenotype']
2024-08-05 12:40:30,353 INFO There are 215 unique drug names
2024-08-05 12:40:30,354 INFO There are 6440 unique disease/phenotype IDs
2024-08-05 12:40:30,735 INFO Loaded 28978 drug-disease pairs:
DRUG_ID DRUG_NAME DISEASE_NAME PHASE
0 D00ABE ald-301 ischemia Phase 2
1 D00ABE ald-301 peripheral arterial disease Phase 2
2 D00ABO kw-2449 acute myeloid leukaemia Phase 1
2024-08-05 12:40:30,772 INFO Loaded 699 phenotypes with matching IDs scoring 80:
Name ontologyTermName ontologyTermIRI score validated review DISEASE_ID
0 respiratory failure Respiratory failure http://purl.obolibrary.org/obo/HP_0002878 100.00 False False HP:0002878
1 sexual dysfunction Male sexual dysfunction http://purl.obolibrary.org/obo/HP_0040307 86.49 False False HP:0040307
6 pollakiuria Pollakisuria http://purl.obolibrary.org/obo/HP_0100515 88.00 False False HP:0100515
2024-08-05 12:40:30,821 INFO Total of 14760 disease names mapped to their IDs:
DRUG_ID DRUG_NAME DISEASE_ID DISEASE_NAME
1 D00ABE ald-301 HP:0004950 peripheral arterial disease
2 D00ABO kw-2449 HP:0004808 acute myeloid leukaemia
4 D00ACC nd1251 HP:0000716 depression
5 D00ACH hmr-4004 HP:0100658 bacterial infection
11 D00AHT prame antigen-specific cancer immunotherapeutic HP:0030358 non small cell lung cancer
12 D00AHT prame antigen-specific cancer immunotherapeutic HP:0030358 non small cell lung cancer
14 D00AHT prame antigen-specific cancer immunotherapeutic HP:0002861 melanoma
19 D00AJS aik11 HP:0005978 non insulin dependent diabetes
21 D00AKQ o-desulfated heparin HP:0006510 chronic obstructive pulmonary disease
22 D00AKR amg 479 HP:0003002 breast cancer
2024-08-05 12:40:30,830 INFO A total of 39 are matched with existing drugs and diseases/phenotypes
2024-08-05 12:40:30,832 INFO Total of 39 drug-disease associations changed to 39 by dropping duplicates.
2024-08-05 12:40:31,006 INFO All DrugCentral associations are saved into restr_hd_drugcentral_associations.csv
2024-08-05 12:40:31,008 INFO Created a list of tuples with 39 entries
2024-08-05 12:40:31,836 INFO The graph contains 14 different semantic groups: {'variant', 'gene', 'interaction', 'model', 'disease', 'phenotype', 'function', 'drug', 'anatomy', 'pathway', 'gene product', 'marker', 'homology', 'genotype'}
2024-08-05 12:40:31,837 INFO For the graph, a total of 229039 edges and 14877 nodes have been generated.
2024-08-05 12:40:31,839 INFO There are 14 semantic groups: ['homology' 'phenotype' 'gene' 'interaction' 'drug' 'function' 'model'
'variant' 'genotype' 'gene product' 'anatomy' 'pathway' 'disease'
'marker']
2024-08-05 12:40:31,884 INFO There are 22 relation labels: relation_label
relation_id
BFO:0000050 is part of
CustomRO:DC is substance that treats
CustomRO:TTD1 is product of
CustomRO:TTD2 targets
GENO:0000222 has genotype
GENO:0000408 is allele of
GENO:0000418 has affected feature
GENO:0000840 pathogenic for condition
RO:0002200 has phenotype
RO:0002206 expressed in
RO:0002325 colocalizes with
RO:0002326 contributes to
RO:0002327 enables
RO:0002331 involved in
RO:0002434 interacts with
RO:0003301 has role in modeling
RO:0003303 causes condition
RO:0003304 contributes to condition
RO:0004013 is causal germline mutation in
RO:0004016 is causal germline mutation partially giving r...
RO:HOM0000017 in orthology relationship with
RO:HOM0000020 in 1 to 1 orthology relationship with
2024-08-05 12:40:32,931 INFO Graph of all connections between concepts saved to all_hd_concepts.png
2024-08-05 12:40:33,002 INFO List of triplets saved to all_hd_triples.csv
2024-08-05 14:04:06,150 INFO The graph contains 11 different semantic groups: {'taxon', 'biological process', 'molecular function', 'gene', 'variant', 'biological artifact', 'disease', 'phenotype', 'drug', 'gene product', 'genotype'}
2024-08-05 14:04:06,151 INFO For the graph, a total of 230228 edges and 14882 nodes have been generated.
2024-08-05 14:04:06,152 INFO There are 11 semantic groups: ['gene' 'phenotype' 'drug' 'molecular function' 'biological artifact'
'taxon' 'variant' 'genotype' 'gene product' 'biological process'
'disease']
2024-08-05 14:04:06,180 INFO There are 19 relation labels: relation_label
relation_id
CustomRO:DC is substance that treats
CustomRO:TTD1 is product of
CustomRO:TTD2 targets
CustomRO:associatedphenotype associated with phenotype
CustomRO:expressesgene expresses gene
CustomRO:foundin found in
CustomRO:isof is of
CustomRO:isvariantin is variant in
GENO:0000222 has genotype
GENO:0000408 is allele of
GENO:0000418 has affected feature
RO:0002325 colocalizes with
RO:0002327 enables
RO:0002331 involved in
RO:0002434 interacts with
RO:0003301 has role in modeling
RO:0003303 causes condition
RO:0003304 contributes to condition
RO:HOM0000017 in orthology relationship with
2024-08-05 14:04:06,547 INFO Graph of all connections between concepts saved to restr_hd_concepts.png
2024-08-05 14:04:06,585 INFO List of triplets saved to restr_hd_triples.csv