Cigar is None #13

Open
martinhaagmans opened this issue Mar 24, 2022 · 2 comments

martinhaagmans commented Mar 24, 2022

<multiprocessing.context.SpawnContext object at 0x7fdd16fc21f0>
Environment set: <multiprocessing.context.SpawnContext object at 0x7fdd16fc21f0>
Using 16 cores.
Filtering reads aligned to unindexed regions with minimap2
Done filtering. Reads filtered:7587
batch nt: 16444094 total_nt: 263105501
27073
26857
27083
27130
27307
27044
26775
26847
26784
26718
26972
27147
27161
26887
27175
27244
Using SLAMEM
Using SLAMEM
Using SLAMEM
Using SLAMEM
Using SLAMEM
Using SLAMEM
Using SLAMEM
Using SLAMEM
Using SLAMEM
Using SLAMEM
Using SLAMEM
Using SLAMEM
Using SLAMEM
Using SLAMEM
Using SLAMEM
Using SLAMEM
Using SLAMEM
Using SLAMEM
Using SLAMEM
Using SLAMEM
Using SLAMEM
Using SLAMEM
Using SLAMEM
Using SLAMEM
Using SLAMEM
Using SLAMEM
Using SLAMEM
Using SLAMEM
Using SLAMEM
Using SLAMEM
Using SLAMEM
Using SLAMEM
Time for slaMEM to find mems:1133.155166387558 seconds.
Starting aligning reads.
Nr reads: 432204 nr batches: 16 [27073, 26857, 27083, 27130, 27307, 27044, 26775, 26847, 26784, 26718, 26972, 27147, 27161, 26887, 27175, 27244]
Processed 5000 reads in batch 1
Processed 5000 reads in batch 0
Processed 5000 reads in batch 5
Processed 5000 reads in batch 4
Processed 5000 reads in batch 15
Processed 5000 reads in batch 10
Processed 5000 reads in batch 14
Processed 10000 reads in batch 0
Processed 10000 reads in batch 15
multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
  File "/home/martin/miniconda3/envs/ultra/lib/python3.8/multiprocessing/pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
  File "/home/martin/miniconda3/envs/ultra/lib/python3.8/multiprocessing/pool.py", line 48, in mapstar
    return list(map(*args))
  File "/home/martin/miniconda3/envs/ultra/lib/python3.8/site-packages/modules/align.py", line 670, in align_single_helper
    return align_single(*arguments)
  File "/home/martin/miniconda3/envs/ultra/lib/python3.8/site-packages/modules/align.py", line 528, in align_single
    non_covered_regions, mam_value, mam_solution = classify_read_with_mams.main(mem_solution, ref_segment_sequences, ref_flank_sequences, parts_to_segments, \
  File "/home/martin/miniconda3/envs/ultra/lib/python3.8/site-packages/modules/classify_read_with_mams.py", line 447, in main
    add_segment_to_mam(read_seq, ref_chr_id, segment_seq, s_start, s_stop, segm_id, mam_instance, min_acc, annot_label = '_full_segment' )
  File "/home/martin/miniconda3/envs/ultra/lib/python3.8/site-packages/modules/classify_read_with_mams.py", line 312, in add_segment_to_mam
    locations, edit_distance, accuracy = edlib_alignment(exon_seq, read_seq, mode="HW", task = 'path', k = 0.4*min(len(read_seq), len(exon_seq)) )
  File "/home/martin/miniconda3/envs/ultra/lib/python3.8/site-packages/modules/classify_read_with_mams.py", line 111, in edlib_alignment
    accuracy = cigar_to_accuracy(cigar_string)
  File "/home/martin/miniconda3/envs/ultra/lib/python3.8/site-packages/modules/classify_read_with_mams.py", line 74, in cigar_to_accuracy
    result = re.split(r'[=DXSMI]+', cigar_string)
  File "/home/martin/miniconda3/envs/ultra/lib/python3.8/re.py", line 231, in split
    return _compile(pattern, flags).split(string, maxsplit)
TypeError: expected string or bytes-like object
"""
The above exception was the direct cause of the following exception:
 
Traceback (most recent call last):
  File "/home/martin/miniconda3/envs/ultra/bin/uLTRA", line 717, in <module>
    align_reads(args)
  File "/home/martin/miniconda3/envs/ultra/bin/uLTRA", line 504, in align_reads
    classifications, alignment_outfiles = align.align_parallel(read_batches, refs_id_lengths, args)
  File "/home/martin/miniconda3/envs/ultra/lib/python3.8/site-packages/modules/align.py", line 688, in align_parallel
    results =res.get(999999999) # Without the timeout this blocking call ignores all signals.
  File "/home/martin/miniconda3/envs/ultra/lib/python3.8/multiprocessing/pool.py", line 771, in get
    raise self._value
TypeError: expected string or bytes-like object

I wrapped this line in a try/except
result = re.split(r'[=DXSMI]+', cigar_string)
to print the CIGAR string, and it is None.
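For illustration, a guard of the following kind would turn the crash into a defined result. This is only a sketch: `cigar_to_accuracy_safe` is a hypothetical name, and the accuracy formula (matched bases divided by total CIGAR length) is an assumption, not necessarily what uLTRA's `cigar_to_accuracy` computes. It also does not explain why the CIGAR is None in the first place.

```python
import re


def cigar_to_accuracy_safe(cigar_string):
    """Hypothetical guarded variant of cigar_to_accuracy.

    Assumption: accuracy = matched bases / total CIGAR length.
    Returns 0.0 when no CIGAR string is available.
    """
    if cigar_string is None:
        # No alignment path was produced, so treat it as zero accuracy
        # instead of letting re.split raise a TypeError.
        return 0.0

    # Pair each run length with its operation,
    # e.g. "8=1X1D" -> [("8", "="), ("1", "X"), ("1", "D")]
    ops = re.findall(r"(\d+)([=DXSMI])", cigar_string)
    total = sum(int(length) for length, _ in ops)
    matches = sum(int(length) for length, op in ops if op == "=")
    return matches / total if total else 0.0
```

Whether returning 0.0 is the right behaviour depends on how the caller uses the value; skipping the segment or logging the read may be more appropriate.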

Any ideas?

Cheers,
Martin

ksahlin (Owner) commented Mar 24, 2022

What is the run command? Do you have a properly formatted GTF? See the related issue caused by an invalid GTF format: #11 (comment)

martinhaagmans (Author) commented

My run command:
uLTRA align /media/ssd1/martin/referentie/hg19.fa rna.sup.fastq ultra_out/ --ont --t 16 --index /media/ssd1/martin/referentie/ultra/

I got the GTF from UCSC https://hgdownload.soe.ucsc.edu/goldenPath/hg19/bigZips/genes/

chr1    refGene transcript      11869   14362   .       +       .       gene_id "LOC102725121"; transcript_id "NR_148357";  gene_name "LOC102725121";
chr1    refGene exon    11869   12227   .       +       .       gene_id "LOC102725121"; transcript_id "NR_148357"; exon_number "1"; exon_id "NR_148357.1"; gene_name "LOC102725121";
chr1    refGene exon    12613   12721   .       +       .       gene_id "LOC102725121"; transcript_id "NR_148357"; exon_number "2"; exon_id "NR_148357.2"; gene_name "LOC102725121";
chr1    refGene exon    13221   14362   .       +       .       gene_id "LOC102725121"; transcript_id "NR_148357"; exon_number "3"; exon_id "NR_148357.3"; gene_name "LOC102725121";
chr1    refGene transcript      11874   14409   .       +       .       gene_id "DDX11L1"; transcript_id "NR_046018";  gene_name "DDX11L1";
chr1    refGene exon    11874   12227   .       +       .       gene_id "DDX11L1"; transcript_id "NR_046018"; exon_number "1"; exon_id "NR_046018.1"; gene_name "DDX11L1";
chr1    refGene exon    12613   12721   .       +       .       gene_id "DDX11L1"; transcript_id "NR_046018"; exon_number "2"; exon_id "NR_046018.2"; gene_name "DDX11L1";
chr1    refGene exon    13221   14409   .       +       .       gene_id "DDX11L1"; transcript_id "NR_046018"; exon_number "3"; exon_id "NR_046018.3"; gene_name "DDX11L1";
chr22   refGene transcript      24666799        24813706        .       +       .       gene_id "SPECC1L"; transcript_id "NM_015330";  gene_name "SPECC1L";
chr22   refGene exon    24666799        24666951        .       +       .       gene_id "SPECC1L"; transcript_id "NM_015330"; exon_number "1"; exon_id "NM_015330.1"; gene_name "SPECC1L";
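To rule out basic structural problems in a GTF like the one above, a quick per-line check could look like the sketch below. It only tests generic GTF conventions (nine tab-separated fields, integer coordinates, a valid strand, `gene_id` and `transcript_id` attributes); what uLTRA itself requires of a GTF is not stated in this thread, and `check_gtf_line` is a hypothetical helper.

```python
def check_gtf_line(line):
    """Return a list of problems found in one GTF record (empty if none).

    Hypothetical helper: checks generic GTF structure only,
    not any uLTRA-specific requirement.
    """
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 9:
        return [f"expected 9 tab-separated fields, found {len(fields)}"]

    _seqname, _source, _feature, start, end, _score, strand, _frame, attributes = fields
    problems = []

    # Coordinates must be integers with start <= end.
    if not (start.isdigit() and end.isdigit()):
        problems.append("start/end must be integers")
    elif int(start) > int(end):
        problems.append("start is greater than end")

    if strand not in {"+", "-", "."}:
        problems.append(f"unexpected strand {strand!r}")

    # GTF records are expected to carry these two attributes.
    for key in ("gene_id", "transcript_id"):
        if f'{key} "' not in attributes:
            problems.append(f"missing {key} attribute")

    return problems
```

Running it over every non-comment line of the annotation file and printing the first few non-empty results is usually enough to spot space-separated columns or missing attributes.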
