
Skills
Python, PySpark, R, SQL, machine learning (supervised and unsupervised learning), advanced statistical analysis, Databricks, Azure, Google Cloud Platform, Azure Machine Learning
For class and personal project code, visit my GitHub.
Work Experience
Allstate, Lead Machine Learning Engineer
August, 2024 – Present
Kimberly-Clark, Lead Data Scientist, AI Engineering
July, 2022 – August, 2024
- Led manufacturing advanced analytics projects that generated $470k in annual savings by identifying tissue breaks in real time and uncovering and remediating their root causes
- Led the data science center of excellence, which provided cross-team Databricks optimization guidance that resulted in over $200k in annual cloud cost savings
- Created scalable data science solution accelerators for the most common ML use cases, which decreased solution run times by 90% and reduced time to value for the teams that used them
- Created a data science best practices wiki to increase adherence to industry standards across the company
- Added functionality to our in-house MLOps platform to flag significant model drift and recommend retraining, which will reduce costs by preventing model degradation and reducing unnecessary retraining
- Provided beginner and advanced Databricks trainings to over 100 employees
Maritz, Director, Decision Sciences
March, 2021 – June, 2022
- Managed a team of 8 data scientists and analysts focusing on modeling, analytics, and dashboarding
- Built Maritz’s data science portfolio from inception to an advanced practice that sold over $1.3 million in new data science contracts and renewals
- Set up team data science processes including version control, code reviews, and coding best practices
- Built models in Python to measure lift and predict ROI of future promotions
- Built a pipeline of machine learning models to predict lifetime value and target high-value customers early
- Created clustering models in PySpark and R used to inform targeted communication and promotion strategies
- Created models in R to predict churn and measure engagement trends, which were used as the basis for personalized lifecycle communication plans for major retail and airline clients
Avanade, Senior Consultant – Data Science
February, 2020 – March, 2021
- Built and deployed scalable end-to-end image processing pipelines in Azure ML with the Python SDK, projected to reduce client costs by $10 million to $20 million annually
- Built sales assets and served as a subject matter expert in successful client sales pursuits, contributing to projects and helping win engagements ranging from $30,000 to over $1 million
- Advised leading beverage company on the best strategy for migrating data science models from AWS to Azure
- Created machine learning models in Python to identify high-value partners to maximize revenue and sales
Whole Foods Market, Data Science Analyst
May, 2019 – February, 2020
- Developed a time series anomaly detection model to forecast and plan for fluctuations in Prime Now sales
- Created a statistical model to optimize product purchases and reduce waste by setting store shrinkage targets
Bayer Crop Science, Data Scientist – Intermediate Contractor
October, 2017 – April, 2019
- Created and validated prediction models in R to reallocate resources and limit risk in the seed production process
  - In one location, the model reduced failure by 70% for products adjusted by the model versus a control group
  - In a second location, the model contributed to a 20% decrease in costs without a significant increase in failure
- Performed power analyses in R and SAS and built interactive Spotfire visualizations to:
  - Distinguish between product performance in all regions of the global corn testing network, allowing for resource reduction and prioritization of harvest based on need
  - Differentiate between the yield of two major corn traits, informing product placement and reducing risk
- Performed simulations comparing different magnitudes of resource reduction or reallocation within the product testing, seed production, and data collection networks to suggest changes to maximize profit and minimize risk
- Analyzed experimental crop data measuring the effectiveness of fungicides and other crop treatments
Washington University in St. Louis, Statistical Data Analyst
June, 2016 – October, 2017
- Measured the impact of healthcare programs on insurance coverage and young adult cancer outcomes
- Helped write a SAS simulation used to estimate cost savings of a major local public health program
- Contributed as an author to published briefs and a journal article assessing the impact of various healthcare policies
Certifications
- Databricks Certified Generative AI Engineer Associate, June, 2024
- Databricks Certified Machine Learning Professional, November, 2023
- Databricks Certified Machine Learning Associate, October, 2023
- Databricks Certified Associate Developer for Apache Spark, October, 2023
- Certified Microsoft Azure Fundamentals, October, 2020
- Certified Azure Data Scientist Associate, September, 2020
Presentations & Leadership
- Databricks Data + AI Summit, 2022
Predicting Repeat Admissions to Substance Abuse Treatment with Machine Learning
- Gateway to Innovation Conference, 2022
Data Science Leadership Panel Sponsored by Google
- Machine Learning, Statistics, & Big Data Programming Instructor, 2019 – 2023
- Delivered classroom-based and live-streamed curricula and built a website to provide materials to students
