ValueError: Need a Nucleotide or Protein alphabet #63

Open
qianxin-kxy opened this issue May 11, 2023 · 1 comment

qianxin-kxy commented May 11, 2023

Below are the command I ran and the resulting error. Has anyone encountered this issue? I am using Biopython 1.77 and PhiSpy 4.2.21.

(PhiSpy) [kxy@zju out]$ PhiSpy.py my_output.gbk -o output_directory
Processing 34 contigs
Making Testing Set...
Start Classification Algorithm...
Using the following metric(s): {'gc_skew', 'at_skew', 'shannon_slope', 'orf_length_med', 'max_direction'}.
Running the random forest classifier with 500 trees and 2 threads
/data/users/kxy/miniconda3/envs/PhiSpy/lib/python3.10/site-packages/sklearn/cluster/_kmeans.py:870: FutureWarning: The default value of n_init will change from 10 to 'auto' in 1.4. Set the value of n_init explicitly to suppress the warning
  warnings.warn(
As the training flag is zero, down-weighting unknown functions
Evaluating...
Checking prophages we might have found
Potential prophages (sorted highest to lowest)
Contig          Start   Stop    Number of potential genes  Status
NODE_25_length  4943    22015   29                         Dropped. No genes were identified as phage genes
NODE_17_length  67767   91585   24                         Kept
NODE_31_length  48      3971    8                          Dropped. No genes were identified as phage genes
NODE_3_length   36724   41861   5                          Dropped. No genes were identified as phage genes
NODE_7_length   40892   42026   2                          Dropped. Region too small (Not enough genes)
NODE_9_length   147704  149860  1                          Dropped. Region too small (Not enough genes)
NODE_9_length   128083  130272  1                          Dropped. Region too small (Not enough genes)
NODE_8_length   31922   32392   1                          Dropped. Region too small (Not enough genes)
NODE_2_length   90215   91141   1                          Dropped. Region too small (Not enough genes)
NODE_27_length  8285    8959    1                          Dropped. Region too small (Not enough genes)
NODE_20_length  9731    10882   1                          Dropped. Region too small (Not enough genes)
NODE_18_length  55595   56515   1                          Dropped. Region too small (Not enough genes)
NODE_17_length  38600   40714   1                          Dropped. Region too small (Not enough genes)
NODE_16_length  41140   41724   1                          Dropped. Region too small (Not enough genes)
NODE_15_length  34985   35749   1                          Dropped. Region too small (Not enough genes)
NODE_11_length  62124   64214   1                          Dropped. Region too small (Not enough genes)
PROPHAGE: 1 Contig: NODE_17_length Start: 67767 Stop: 91585
Creating output files
Writing GenBank output file
Traceback (most recent call last):
  File "/data/users/kxy/miniconda3/envs/PhiSpy/bin/PhiSpy.py", line 10, in <module>
    sys.exit(run())
  File "/data/users/kxy/miniconda3/envs/PhiSpy/lib/python3.10/site-packages/PhiSpyModules/main.py", line 122, in run
    main(sys.argv)
  File "/data/users/kxy/miniconda3/envs/PhiSpy/lib/python3.10/site-packages/PhiSpyModules/main.py", line 114, in main
    PhiSpyModules.write_all_outputs(**vars(args_parser))
  File "/data/users/kxy/miniconda3/envs/PhiSpy/lib/python3.10/site-packages/PhiSpyModules/writers.py", line 401, in write_all_outputs
    write_genbank(self)
  File "/data/users/kxy/miniconda3/envs/PhiSpy/lib/python3.10/site-packages/PhiSpyModules/writers.py", line 98, in write_genbank
    SeqIO.write(self.record, handle, 'genbank')
  File "/data/users/kxy/miniconda3/envs/PhiSpy/lib/python3.10/site-packages/Bio/SeqIO/__init__.py", line 531, in write
    count = writer_class(handle).write_file(sequences)
  File "/data/users/kxy/miniconda3/envs/PhiSpy/lib/python3.10/site-packages/Bio/SeqIO/Interfaces.py", line 235, in write_file
    count = self.write_records(records, maxcount)
  File "/data/users/kxy/miniconda3/envs/PhiSpy/lib/python3.10/site-packages/Bio/SeqIO/Interfaces.py", line 209, in write_records
    self.write_record(record)
  File "/data/users/kxy/miniconda3/envs/PhiSpy/lib/python3.10/site-packages/Bio/SeqIO/InsdcIO.py", line 1005, in write_record
    self._write_the_first_line(record)
  File "/data/users/kxy/miniconda3/envs/PhiSpy/lib/python3.10/site-packages/Bio/SeqIO/InsdcIO.py", line 757, in _write_the_first_line
    raise ValueError("Need a Nucleotide or Protein alphabet")
ValueError: Need a Nucleotide or Protein alphabet
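
The traceback ends in `_write_the_first_line`, the part of Biopython's GenBank writer that builds the LOCUS line. One possible cause, offered as an assumption rather than a confirmed diagnosis, is that a LOCUS line in the input file no longer states a molecule type, so the record is read without a nucleotide or protein alphabet. The stdlib-only sketch below lists such LOCUS lines. The helper name `locus_lines_missing_moltype`, the set of accepted molecule-type tokens, and the sample lines are all illustrative and are not part of PhiSpy or Biopython.

```python
import re

# Molecule-type tokens commonly seen on GenBank LOCUS lines
# (an illustrative set, not an exhaustive one).
MOLTYPE_RE = re.compile(r"\b(?:ss-|ds-|ms-)?(?:DNA|RNA|mRNA|tRNA|rRNA|cRNA)\b|\baa\b")


def locus_lines_missing_moltype(lines):
    """Return (line_number, text) for each LOCUS line with no molecule-type token."""
    missing = []
    for number, line in enumerate(lines, start=1):
        if line.startswith("LOCUS") and not MOLTYPE_RE.search(line):
            missing.append((number, line.rstrip("\n")))
    return missing


# Made-up sample lines: one complete LOCUS line and one truncated LOCUS line.
sample = [
    "LOCUS       NODE_1_length 1200 bp   DNA     linear       01-JAN-1980\n",
    "DEFINITION  example record.\n",
    "LOCUS       NODE_2_length\n",
]

report = locus_lines_missing_moltype(sample)
for number, text in report:
    print(f"line {number}: {text}")
```

To check a real file, pass an open file handle instead of the sample list, for example `locus_lines_missing_moltype(open("my_output.gbk"))`.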

Additionally, because the LOCUS IDs in the GenBank file produced by Prokka are too long, I shortened every LOCUS ID in the file with the sed command below. The original LOCUS line, the command, and the resulting LOCUS line are:

LOCUS NODE_2_length_354722_cov_51.4144354722 bp DNA linear
sed -re 's/(_length)[^=]*$/\1/' 4751.gbk > my_output.gbk
LOCUS NODE_2_length
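
Note that this sed expression removes everything from `_length` to the end of the line whenever no `=` follows it, so on a LOCUS line it also strips the sequence length, `bp`, the molecule type (`DNA`), and the topology. The Python sketch below reproduces that substitution with `re.sub`, then shows one hypothetical alternative, `shorten_locus`, that shortens only the name and keeps the rest of the line. It assumes the fused layout shown above, where the contig name runs straight into the length. The trailing date in the sample line is a made-up placeholder, and whether PhiSpy accepts the rewritten file has not been verified here.

```python
import re

# Sample LOCUS line in the fused layout; the date is a made-up placeholder.
original = (
    "LOCUS       NODE_2_length_354722_cov_51.4144354722 bp"
    "   DNA     linear       01-JAN-1980"
)

# Same pattern as the sed command: drop everything after "_length"
# up to the end of the line when no "=" follows.
sed_like = re.sub(r"(_length)[^=]*$", r"\1", original)
print(sed_like)  # the length, "bp", and the molecule type are no longer present

# Hypothetical safer rewrite: take the length from the contig name
# ("_length_<size>_cov_"), find the same digits fused in front of "bp",
# and keep everything after "bp" unchanged.
LOCUS_RE = re.compile(
    r"^LOCUS\s+(?P<short>\S+?_length)_(?P<size>\d+)_cov_\S*?(?P=size)\s+bp(?P<rest>.*)$"
)


def shorten_locus(line):
    """Shorten the name on a fused LOCUS line; return other lines unchanged."""
    match = LOCUS_RE.match(line)
    if match is None:
        return line
    return f"LOCUS       {match['short']} {match['size']} bp{match['rest']}"


shortened = shorten_locus(original)
print(shortened)
```

Applied line by line to the original file, this leaves every non-LOCUS line untouched, whereas the sed pattern also truncates any other line in which `_length` is not followed by `=`.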

linsalrob (Owner) commented

Can you share the original or modified GenBank file?
