Ploidy-seq data analysis

DISCLAIMER: Don't do any of this yet. I haven't debugged everything, so it probably won't work.

First, we need to create an instance on Jetstream Atmosphere and connect to iRODS. Mick Song's blog explains how to do both: https://michaelsongagradstudent.github.io/blog/2017/04/12/Cheat_Sheet_Atmosphere

Once we are in our Atmosphere web shell, we first want to install the package manager conda:

wget https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh
bash Miniconda3-latest-Linux-x86_64.sh

Then we need to re-initialize our web shell so that the new conda installation is picked up:

source ~/.bashrc
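
Before moving on, we may want to confirm that conda is actually reachable from the shell. The following is a minimal sketch; check_tool is a hypothetical helper name used only for illustration and is not part of this repository.

```shell
# check_tool is a hypothetical helper (not part of this repository).
# It reports whether a command can be found on the current PATH.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: found"
    else
        echo "$1: not found"
        return 1
    fi
}

# If conda is missing, running `source ~/.bashrc` again or opening
# a new web shell may help.
check_tool conda || echo "conda is not on the PATH yet"
```

The same helper can be reused later for other tools, for example check_tool snakemake once the ploidy-seq environment is active.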

Next, clone the repository and cd into it:

git clone https://github.com/barneypotter24/ploidy-seq.git
cd ploidy-seq

From there, we can install all the software that we will need for the analysis and migrate our data from the CyVerse data store. All of this can be done by running the command:

bash setup.sh

At several points during the installation, we will be prompted to accept the installation of programs that we will use. Just hit the return key to accept the installation. Once the command is done running we will have a few things:

populated directories for all of our raw and reference data
a new environment called ploidy-seq, inside which all of our programs are installed
empty directories that will store temporary files used during the pipeline as well as our pipeline output

We will activate our new environment by running:

source activate ploidy-seq

This gives us access to all the programs that we need to continue.
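
To double-check that the activation worked, we can look at the CONDA_DEFAULT_ENV variable, which conda normally sets to the name of the active environment. This is a sketch under that assumption; in_env is a hypothetical helper name, not something provided by conda or by this repository.

```shell
# in_env is a hypothetical helper (not part of conda or this repository).
# It succeeds only if the active conda environment is named $1.
# Assumption: conda exports CONDA_DEFAULT_ENV when an environment is active.
in_env() {
    [ "${CONDA_DEFAULT_ENV:-}" = "$1" ]
}

in_env ploidy-seq || echo "ploidy-seq is not active; run: source activate ploidy-seq"
```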

Now we can do a dry run to test that everything is correctly installed and that all of our data is in the correct place:

snakemake -n

If no errors come up, we can start our analysis. Note that the analysis runs on every file listed in config.json, so it may take a long time.

snakemake
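
The two steps above (dry run first, then the real run) can also be chained so that the full analysis only starts when the dry run succeeds. This is a minimal sketch; run_if_dry_run_ok is a hypothetical helper name, and it assumes the command passed in accepts -n for a dry run, as snakemake does.

```shell
# run_if_dry_run_ok is a hypothetical helper (not part of this repository).
# $1: a workflow command that accepts -n for a dry run (e.g. snakemake).
# The real run is started only if the dry run exits successfully.
run_if_dry_run_ok() {
    if "$1" -n; then
        "$1"
    else
        echo "dry run failed; not starting the full run" >&2
        return 1
    fi
}

# Usage (this starts the full analysis if the dry run passes):
#   run_if_dry_run_ok snakemake
```

Because the full run may take a long time, it can be worth starting it inside a terminal multiplexer session so that it survives a dropped web shell connection.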

All output should end up in the htseq and fastqc folders. No intermediate files are kept, to keep disk usage to a minimum.

Finally, copy the output files back to iPlant:

cd htseq/
iput -bf *.txt /iplant/home/jcoate/Arabidopsis/2017/HTSeq/
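
Before uploading, it can help to confirm that the pipeline actually produced some .txt files, since the *.txt pattern above matches nothing in an empty folder. The following is a small sketch; count_txt is a hypothetical helper name, not part of this repository or of the iRODS tools.

```shell
# count_txt is a hypothetical helper (not part of this repository).
# It prints the number of regular *.txt files directly inside directory $1.
# The body runs in a subshell so the counter does not leak into the caller.
count_txt() (
    n=0
    for f in "$1"/*.txt; do
        if [ -f "$f" ]; then
            n=$((n + 1))
        fi
    done
    echo "$n"
)

# Usage, from the repository root:
#   count_txt htseq
```

After the upload, listing the destination with ils /iplant/home/jcoate/Arabidopsis/2017/HTSeq/ should show the same number of files.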
