Name		Name	Last commit message	Last commit date
Latest commit History 33 Commits
.gitattributes		.gitattributes
.gitignore		.gitignore
FIND_MOTIF.py		FIND_MOTIF.py
README.md		README.md
athaliana.fasta		athaliana.fasta
find_most_probable.py		find_most_probable.py
hpt_sequences.txt		hpt_sequences.txt
list_proteomes.txt		list_proteomes.txt
spombe.fasta		spombe.fasta

Repository files navigation

motifs

Motif Generator: FIND_MOTIF

Finds a sequence motif from a set of provided peptide sequences. Uses the Gibbs Motif Finding algorithm as outlined in the Bioinformatics Algorithms textbook: https://stepic.org/Bioinformatics-Algorithms-2/

usage: FIND_MOTIF.py inputfile outputfile k motif_file

inputfile: collection of unaligned sequences belongign to the same protein family

outputfile: filename where the motif sequences from each original sequence (from inputfile) will be outputted. This file can then be uploaded to servers such as Weblogo 3 to create motif logos.

k: length of the motif

motif_file: filename where the motif profile matrix will be outputted

Motif Finder from a proteome: find_most_probable.py

From a specified proteome or list of proteomes, rank all proteins in terms of the probability of the sequence best matching a given motif profile being a genuine motif. The boxplot function generates boxplots of the proteomes, with whiskers at 3IQR and outliers indicated. Linees 157-159 (commented out) also sorts and lists the top 25 proteins according to probability.

Usage: find_most_probable.py input proteomes output

input: input motif file. Usually motif_file as outputted by FIND_MOTIF.py

proteomes: a list of the proteomes to be assessed. This should be a text file with the filenames of the fasta formatted proteins on each line. The boxplot uses these filenames for labeling the results.

output: output image file where the boxplot will be saved.

Sample Files

athaliana.fasta: Sample proteome of the plant Arabidopsis thaliana

spombe.fasta: Sample proteome of the fungus Schizosaccharomyces pombe

list_proteomes.txt: Sample input file for find_most_probable.py listing the proteomes to be queried

hpt_sequences.txt: sample input reference set for FIND_MOTIF.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

motifs

Motif Generator: FIND_MOTIF

Motif Finder from a proteome: find_most_probable.py

Sample Files

About

Releases

Packages

Languages

dsurujon/motifs

Folders and files

Latest commit

History

Repository files navigation

motifs

Motif Generator: FIND_MOTIF

Motif Finder from a proteome: find_most_probable.py

Sample Files

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages