Experiment reproduction
Kang Hu edited this page Dec 25, 2024 · 36 revisions
- Repbase is a paid database, so you must download the Repbase data yourself from https://www.girinst.org/server/RepBase/index.php before reproducing our experiments. For example, download RepBase26.05.fasta.tar.
- After downloading the data, copy the files athrep.ref, cbrrep.ref, drorep.ref, maize.ref, oryrep.ref, and zebrep.ref to the ${pathToHiTE}/library directory. For instance, HiTE/library/oryrep.ref is required to replicate the rice experiments. Remember to remove non-TE entries, such as satellite sequences, from the oryrep.ref file.
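The non-TE cleanup mentioned above can be sketched as a small FASTA filter. This is only an illustration, not part of HiTE: the function name `filter_non_te` and the default header pattern are assumptions, and the pattern should be adapted after inspecting the headers of your own Repbase files.

```shell
# Sketch (not part of HiTE): drop FASTA records whose header matches a pattern.
# The default pattern is an assumption; check your Repbase headers and adjust it.
filter_non_te() {
    # $1: input FASTA, $2: output FASTA,
    # $3: optional lowercase regex; records whose header matches it are removed
    fnt_pattern="${3:-satellite|simple|rrna|trna|snrna}"
    awk -v pat="$fnt_pattern" '
        /^>/ { keep = (tolower($0) !~ pat) }  # decide per record from its header line
        keep { print }                        # emit header and sequence lines of kept records
    ' "$1" > "$2"
}
```

A possible call would be `filter_non_te oryrep.ref oryrep.filtered.ref`, after which the filtered file can be reviewed and then used as library/oryrep.ref.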
Download the reference genome of rice (the commands below use GCF_001433935.1_IRGSP-1.0_genomic.fna).
cd ${pathTo}/HiTE
git clone https://github.com/oushujun/EDTA.git
python main.py \
--genome ${pathTo/genome}/GCF_001433935.1_IRGSP-1.0_genomic.fna \
--thread ${thread} \
--outdir ${output_dir} \
--plant 1 \
--BM_RM2 1 \
--BM_EDTA 1 \
--EDTA_home ${EDTA_home} \
--BM_HiTE 1 \
--coverage_threshold 0.95 \
--species rice
# --coverage_threshold: switch to 0.99 if you prefer a more stringent threshold.
# --species: one of [dmel, rice, cb, zebrafish, maize, ath]; set --plant 0 if you choose a non-plant species.
# Example command: python main.py \
# --genome /home/hukang/EDTA/krf_test/rice/GCF_001433935.1_IRGSP-1.0_genomic.fna \
# --thread 40 \
# --outdir /homeb/hukang/KmerRepFinder_test/library/rice/ \
# --plant 1 \
# --BM_RM2 1 \
# --BM_EDTA 1 \
# --EDTA_home /home/hukang/HiTE/EDTA \
# --BM_HiTE 1 \
# --coverage_threshold 0.95 \
# --species rice
# The benchmarking results are written to ${output_dir}/BM_RM2.log, ${output_dir}/BM_EDTA.log, and ${output_dir}/BM_HiTE.log.
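Because --plant has to agree with --species, a tiny helper can derive one from the other. The sketch below is an assumption-laden illustration, not part of HiTE: the function name `plant_flag_for` is hypothetical, and the plant/non-plant split reflects a reading of the species abbreviations (rice, maize, and ath as plants; dmel, cb, and zebrafish as non-plants).

```shell
# Sketch (not part of HiTE): map a --species name to the matching --plant value.
plant_flag_for() {
    case "$1" in
        rice|maize|ath)    echo 1 ;;  # plant species
        dmel|cb|zebrafish) echo 0 ;;  # non-plant species
        *) echo "unknown species: $1" >&2; return 1 ;;
    esac
}
```

For example, `--species maize --plant "$(plant_flag_for maize)"` could replace the hard-coded pair in the commands on this page.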
# 1. Run HiTE.
python main.py \
--genome ${pathTo/genome}/GCF_001433935.1_IRGSP-1.0_genomic.fna \
--thread ${thread} \
--outdir ${output_dir} \
--plant 1 # Set --plant 0 if you choose a non-plant species
# Example command: python main.py \
# --genome /home/hukang/EDTA/krf_test/rice/GCF_001433935.1_IRGSP-1.0_genomic.fna \
# --thread 40 \
# --outdir /homeb/hukang/KmerRepFinder_test/library/rice/ \
# --plant 1
# 2. Skip HiTE, and run the benchmarking method of RepeatModeler2 (BM_RM2)
python main.py \
--genome ${pathTo/genome}/GCF_001433935.1_IRGSP-1.0_genomic.fna \
--thread ${thread} \
--outdir ${output_dir} \
--plant 1 \
--skip_HiTE 1 \
--BM_RM2 1 \
--coverage_threshold 0.95 \
--species rice
# --coverage_threshold: switch to 0.99 if you prefer a more stringent threshold.
# --species: one of [dmel, rice, cb, zebrafish, maize, ath]; set --plant 0 if you choose a non-plant species.
# Example command: python main.py \
# --genome /home/hukang/EDTA/krf_test/rice/GCF_001433935.1_IRGSP-1.0_genomic.fna \
# --thread 40 \
# --outdir /homeb/hukang/KmerRepFinder_test/library/rice/ \
# --plant 1 \
# --skip_HiTE 1 \
# --BM_RM2 1 \
# --coverage_threshold 0.95 \
# --species rice
# The benchmarking results can be found at "${output_dir}/BM_RM2.log".
# 3. Skip HiTE and BM_RM2, and run the benchmarking method of EDTA (BM_EDTA)
python main.py \
--genome ${pathTo/genome}/GCF_001433935.1_IRGSP-1.0_genomic.fna \
--thread ${thread} \
--outdir ${output_dir} \
--plant 1 \
--skip_HiTE 1 \
--BM_RM2 0 \
--BM_EDTA 1 \
--EDTA_home ${EDTA_home} \
--species rice #[dmel, rice, cb, zebrafish, maize, ath], set --plant 0 if you choose the non-plant species
# Example command: python main.py \
# --genome /home/hukang/EDTA/krf_test/rice/GCF_001433935.1_IRGSP-1.0_genomic.fna \
# --thread 40 \
# --outdir /homeb/hukang/KmerRepFinder_test/library/rice/ \
# --plant 1 \
# --skip_HiTE 1 \
# --BM_RM2 0 \
# --BM_EDTA 1 \
# --EDTA_home /home/hukang/HiTE/EDTA \
# --species rice
# The benchmarking results can be found at "${output_dir}/BM_EDTA.log".
# 4. Skip HiTE, BM_RM2, and BM_EDTA, and run the benchmarking method of HiTE (BM_HiTE)
python main.py \
--genome ${pathTo/genome}/GCF_001433935.1_IRGSP-1.0_genomic.fna \
--thread ${thread} \
--outdir ${output_dir} \
--plant 1 \
--skip_HiTE 1 \
--BM_RM2 0 \
--BM_EDTA 0 \
--EDTA_home ${EDTA_home} \
--BM_HiTE 1 \
--coverage_threshold 0.95 \
--species rice
# --coverage_threshold: switch to 0.99 if you prefer a more stringent threshold.
# --species: one of [dmel, rice, cb, zebrafish, maize, ath]; set --plant 0 if you choose a non-plant species.
# Example command: python main.py \
# --genome /home/hukang/EDTA/krf_test/rice/GCF_001433935.1_IRGSP-1.0_genomic.fna \
# --thread 40 \
# --outdir /homeb/hukang/KmerRepFinder_test/library/rice/ \
# --plant 1 \
# --skip_HiTE 1 \
# --BM_RM2 0 \
# --BM_EDTA 0 \
# --EDTA_home /home/hukang/HiTE/EDTA \
# --BM_HiTE 1 \
# --coverage_threshold 0.95 \
# --species rice
# The benchmarking results can be found at "${output_dir}/BM_HiTE.log".
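After the runs above finish, it can be handy to check which benchmarking logs were actually produced. The following is a minimal sketch, not part of HiTE: the helper name `report_bm_logs` is hypothetical, while the log file names are the ones given on this page.

```shell
# Sketch (not part of HiTE): report which benchmarking logs exist and are non-empty.
report_bm_logs() {
    # $1: the directory that was passed to --outdir
    for rbl_name in BM_RM2 BM_EDTA BM_HiTE; do
        rbl_log="$1/$rbl_name.log"
        if [ -s "$rbl_log" ]; then
            echo "found: $rbl_log"
        else
            echo "missing: $rbl_log"
        fi
    done
}
```

Calling `report_bm_logs "${output_dir}"` prints one line per expected log, so a skipped or failed benchmarking step shows up as "missing".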
Hunan Provincial Key Lab on Bioinformatics, Central South University.