Using EL and DL methods to predict DNA N4-methylcytosine (4mC) sites.
2024/12/7: Proposed a method called feature_selection_with_lazypredict, which evaluates different feature encoding methods using 3 ML methods tested on datasets from 3 species.
2024/12/11: Created a preliminary CNN pre-training model and a CatBoost model, then carried out further training and evaluation.
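To illustrate what comparing feature encoding methods can look like, here is a minimal sketch of two common DNA sequence encodings: one-hot and normalized k-mer frequency. It is not the repository's feature_selection_with_lazypredict implementation. The function names, the `ENCODERS` registry, and the choice of encodings are assumptions made for illustration only.

```python
from collections import Counter
from itertools import product

NUCLEOTIDES = "ACGT"


def one_hot_encode(seq):
    """Flattened one-hot encoding: 4 values per base.

    Bases outside A/C/G/T (for example N) become four zeros.
    """
    features = []
    for base in seq.upper():
        features.extend(1.0 if base == n else 0.0 for n in NUCLEOTIDES)
    return features


def kmer_frequency_encode(seq, k=2):
    """Normalized frequency of each of the 4**k possible k-mers,
    listed in lexicographic order (AA, AC, AG, ...)."""
    seq = seq.upper()
    kmers = ["".join(p) for p in product(NUCLEOTIDES, repeat=k)]
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    # Only count k-mers made of A/C/G/T so the frequencies sum to 1.
    total = sum(counts[km] for km in kmers)
    return [counts[km] / total if total else 0.0 for km in kmers]


# Hypothetical registry: one entry per encoding method to compare.
ENCODERS = {
    "one_hot": one_hot_encode,
    "kmer_2": kmer_frequency_encode,
}


def encode_dataset(sequences, encoder_name):
    """Encode every sequence with the chosen method."""
    encoder = ENCODERS[encoder_name]
    return [encoder(s) for s in sequences]
```

Each resulting feature matrix could then be split into training and test sets and scored with the same set of classifiers, for example through lazypredict's `LazyClassifier`, so that the encodings are compared on equal terms.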
Python version: 3.7
The packages listed below can be installed with:
pip install -r requirements.txt
Package Version
absl-py 0.15.0
astunparse 1.6.3
bio 1.6.2
biopython 1.81
biothings-client 0.3.1
cachetools 4.2.4
certifi 2020.6.20
charset-normalizer 2.0.12
click 8.1.7
colorama 0.4.5
cycler 0.11.0
Cython 0.29.28
dataclasses 0.6
flatbuffers 1.12
gast 0.3.3
gensim 4.2.0
google-auth 2.17.3
google-auth-oauthlib 0.4.6
google-pasta 0.2.0
gprofiler-official 1.0.0
grpcio 1.32.0
h5py 2.10.0
idna 3.4
importlib-metadata 4.8.3
importlib-resources 5.4.0
joblib 1.1.1
keras 2.11.0
Keras-Preprocessing 1.1.2
kiwisolver 1.3.1
lazypredict 0.2.12
lightgbm 3.3.5
Markdown 3.3.7
matplotlib 3.3.4
mygene 3.2.2
numpy 1.19.5
oauthlib 3.2.2
opt-einsum 3.3.0
packaging 24.0
pandas 1.1.5
Pillow 8.4.0
pip 23.2.1
platformdirs 4.0.0
pooch 1.8.2
protobuf 3.19.6
pyasn1 0.5.0
pyasn1-modules 0.3.0
pyparsing 3.0.9
python-dateutil 2.8.2
pytz 2023.3
requests 2.27.1
requests-oauthlib 1.3.1
rsa 4.9
scikit-learn 0.24.2
scipy 1.5.4
setuptools 68.0.0
six 1.15.0
smart-open 6.3.0
tensorboard 2.10.1
tensorboard-data-server 0.6.1
tensorboard-plugin-wit 1.8.1
tensorflow-estimator 2.4.0
tensorflow-gpu 2.4.1
termcolor 1.1.0
threadpoolctl 3.1.0
tqdm 4.64.1
typing_extensions 4.7.1
urllib3 1.26.15
Werkzeug 2.0.3
wheel 0.41.2
wincertstore 0.2
wrapt 1.12.1
xgboost 1.4.2
zipp 3.6.0
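To confirm that an environment matches the pinned versions, a small check like the following may help. It is a sketch that assumes requirements.txt uses `name==version` pins. The helper names (`parse_pins`, `find_mismatches`, `installed_versions`) are hypothetical and not part of this repository.

```python
import re


def _normalize(name):
    """Treat '-', '_' and '.' as equivalent, and ignore case."""
    return re.sub(r"[-_.]+", "-", name).lower()


def parse_pins(requirements_text):
    """Return {package: version} for every 'name==version' line.

    Comments, blank lines, and unpinned requirements are skipped.
    """
    pins = {}
    for line in requirements_text.splitlines():
        line = line.split("#", 1)[0]   # drop trailing comments
        line = line.split(";", 1)[0]   # drop environment markers
        if "==" not in line:
            continue
        name, version = line.split("==", 1)
        pins[_normalize(name.strip())] = version.strip()
    return pins


def find_mismatches(pins, installed):
    """Return {package: (wanted, found)}; found is None if not installed."""
    return {
        name: (wanted, installed.get(name))
        for name, wanted in pins.items()
        if installed.get(name) != wanted
    }


def installed_versions():
    """Collect installed package versions from the current environment."""
    try:
        from importlib import metadata          # Python 3.8+
    except ImportError:
        import importlib_metadata as metadata   # backport for Python 3.7
    versions = {}
    for dist in metadata.distributions():
        name = dist.metadata["Name"]
        if name:
            versions[_normalize(name)] = dist.version
    return versions
```

A typical use would be `find_mismatches(parse_pins(open("requirements.txt").read()), installed_versions())`; an empty result means every pinned package is installed at the expected version.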