Description: A framework for processing earthquake arrival times in parallel with Apache Spark. More on GitHub.

Techniques: distributed computing, grid-search earthquake location

Tools: Spark (PySpark), Arrow, AWS (S3, EC2), MySQL, Dash, Python (pandas, NumPy)


This project was completed in three weeks for the Insight Data Engineering program in Seattle, Fall 2019.

A more thorough description can be found on GitHub.

Here is a walkthrough of the web frontend: