Geologic age prediction model and interactive magmatic flow dashboard

lilynorthcutt/sierraNevadaAge

Flow Through the Ages: Geologic Age Prediction in the Sierra Nevadas

Welcome! This collaborative project aims to advance rock age prediction using machine learning models. Our goal is to fill in gaps between known geologic ages to better understand magma migration patterns in the Sierra Nevadas.

Note: This is an ongoing project! I am currently predicting unknown ages using the trained models. These predictions will then be added to the Shiny app to help researchers understand the overall magmatic flow in the area.

About The Project

The main goals of this project are to accomplish the following:

  • Develop novel machine learning (ML) methods to fill gaps in geologic maps by predicting undated rock ages.
  • Apply the results from the model to predict undated rock ages and obtain a less biased view of magmatic migration.
  • Build interactive visualizations for researchers and curious minds to view and explore the data and predictions.

Getting Started

This project uses Python for data ingestion, processing, and model creation, and R for the interactive Shiny dashboard. All predictions from trained models are stored in Data/output/model_output.xlsx. R/app/global.R reads them, along with the processed data, to build the visuals for the app.

Installation Instructions

To get started, please clone the repo:

git clone https://github.com/lilynorthcutt/sierraNevadaAge.git

Running the Code

If you would like to use the processed data in the Data/processed/ folder, then you are good to go!
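
Whether the processed data is already in place can be checked before choosing a route. The following is a minimal sketch, assuming it is run from the repository root. The helper name has_processed_data is hypothetical and not part of this repo.

```python
from pathlib import Path


def has_processed_data(repo_root="."):
    """Return True if Data/processed/ exists and contains at least one file."""
    processed_dir = Path(repo_root) / "Data" / "processed"
    if not processed_dir.is_dir():
        return False
    # rglob("*") walks subfolders too; count only real files, not directories.
    return any(path.is_file() for path in processed_dir.rglob("*"))


if __name__ == "__main__":
    if has_processed_data():
        print("Processed data found: no extra setup needed.")
    else:
        print("No processed data found: follow the raw-data steps.")
```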

However, if you want to use the raw data and go through the ingestion, cleaning, wrangling, and feature engineering process from scratch, the following change is necessary:

Navigate to the get_url_for_zip() function in the elevation.py file of the pyhigh package (python3.9/pyhigh/elevation.py). The USGS no longer allows downloads of the elevation data referenced in get_url_for_zip(), so the data must be fetched from a different site. You can use the site I use by updating the get_url_for_zip() function as shown here:

def get_url_for_zip(zip_name):
    return f'https://firmware.ardupilot.org/SRTM/North_America/{zip_name}'

You can find more info under this GitHub issue.
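
As an alternative to editing the installed file, the same replacement could in principle be applied at runtime. This is only a sketch and rests on two unconfirmed assumptions: that pyhigh is importable as pyhigh.elevation, and that the package looks up get_url_for_zip through its module namespace at call time. The helper patch_pyhigh and the tile name in the usage lines are hypothetical.

```python
import importlib

MIRROR_BASE = "https://firmware.ardupilot.org/SRTM/North_America"


def get_url_for_zip(zip_name):
    """Build the mirror URL for one elevation zip file."""
    return f"{MIRROR_BASE}/{zip_name}"


def patch_pyhigh():
    """Swap the replacement into pyhigh.elevation; return True on success."""
    try:
        elevation = importlib.import_module("pyhigh.elevation")
    except ImportError:
        return False  # pyhigh is not installed in this environment
    elevation.get_url_for_zip = get_url_for_zip
    return True


if __name__ == "__main__":
    print("patched:", patch_pyhigh())
    # "N37W120.hgt.zip" is a made-up tile name used only for illustration.
    print(get_url_for_zip("N37W120.hgt.zip"))
```

If the patch does not take effect in your environment, editing elevation.py directly as described above remains the documented route.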

Data

This project uses two data types:

  1. CSV File: Around 215 data points were aggregated from 40 sources, including both published and unpublished papers. Most importantly for our analysis, the file contains the dated sample ages and their respective locations.

    ‼️ If you have additional data you think would be relevant to add to this project, or an amendment to any data used, please reach out to me via email: [email protected]

  2. Shapefiles: There are 4 shapefiles in our raw data and 5 in our processed data. The main two shapefiles we focus on are:

    • DetailedMapDataAreaBoundaryLine - Contains a linestring with the boundary of the region of interest
    • CretaceousIntrusionsIndividualPolygons - Contains detailed polygons (approx. 1400), each labeled with its geologic time period: Early Cretaceous (eK), Late Cretaceous (lk), or Cretaceous-Jurassic (KJ)

    Additionally, in the processed data there is one more shapefile:

    • UnionPolygons - Contains the union of all polygons of the same time period. This condenses the ~1400 polygons significantly without losing the shapes or time period labels.
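
The dated samples in the CSV file can be loaded with the Python standard library alone. The following is a minimal sketch. The column names age_ma, latitude, and longitude are hypothetical placeholders, since the actual headers are not listed here, and the demo rows are made up for illustration.

```python
import csv
import io


def load_dated_samples(csv_file, age_col="age_ma", lat_col="latitude", lon_col="longitude"):
    """Read (age, lat, lon) tuples, skipping rows where any of the fields is blank."""
    samples = []
    for row in csv.DictReader(csv_file):
        # DictReader yields None for missing trailing fields, so normalize to "".
        values = [row.get(col, "") or "" for col in (age_col, lat_col, lon_col)]
        if any(v.strip() == "" for v in values):
            continue  # undated or unlocated sample
        age, lat, lon = (float(v) for v in values)
        samples.append((age, lat, lon))
    return samples


if __name__ == "__main__":
    # Made-up rows: one complete sample and one with a missing age.
    demo = io.StringIO("age_ma,latitude,longitude\n95.0,37.5,-119.2\n,37.6,-119.0\n")
    print(load_dated_samples(demo))
```

Rows with a blank age are skipped here rather than filled in, since the undated samples are exactly what the models are meant to predict.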