I am fairly new to the corporate use of data, so it has been fun to explore and learn about the different problems and needs of businesses. Feel free to use any of the techniques here on your own problems, and if you have any to share, I'm all ears!


Reducing Employee Churn

Retaining the best talent is an essential part of running a great company, but how to do it is not always straightforward. In this project I used data from an anonymized tech company to understand what factors cause employees to leave and to recommend ways to reduce churn. I did this by (1) plotting the distributions of different features to get an idea of what may be driving churn and (2) using a decision tree classifier to find the most predictive features.
Some Findings:
- Employees tend to quit after one year of working at the company.
- Employees with the highest and lowest salaries tend to stay with the company; those with average salaries are the most likely to leave.
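
To give a feel for how a decision tree decides which features are most predictive, here is a minimal sketch of the criterion a tree typically applies at its root: for each feature, find the single threshold that reduces Gini impurity the most, then rank features by that reduction. This is a one-level stand-in written in plain Python, not the classifier used in the project. The column names (`tenure_years`, `salary_k`, `quit`) and the rows are made up for illustration.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def best_split_gain(values, labels):
    """Largest Gini impurity reduction from a single threshold on one feature."""
    n = len(labels)
    parent = gini(labels)
    best = 0.0
    # Try a threshold at every distinct value except the largest.
    for threshold in sorted(set(values))[:-1]:
        left = [y for x, y in zip(values, labels) if x <= threshold]
        right = [y for x, y in zip(values, labels) if x > threshold]
        weighted = (len(left) * gini(left) + len(right) * gini(right)) / n
        best = max(best, parent - weighted)
    return best

def rank_features(rows, label_key):
    """Rank every non-label column by its best single-split gain, highest first."""
    labels = [row[label_key] for row in rows]
    features = [key for key in rows[0] if key != label_key]
    gains = {f: best_split_gain([row[f] for row in rows], labels) for f in features}
    return sorted(gains.items(), key=lambda item: item[1], reverse=True)

# Made-up rows for illustration only, not the project's data.
rows = [
    {"tenure_years": 1, "salary_k": 90, "quit": 1},
    {"tenure_years": 1, "salary_k": 100, "quit": 1},
    {"tenure_years": 1, "salary_k": 50, "quit": 1},
    {"tenure_years": 3, "salary_k": 60, "quit": 0},
    {"tenure_years": 4, "salary_k": 95, "quit": 0},
    {"tenure_years": 5, "salary_k": 200, "quit": 0},
]

ranking = rank_features(rows, "quit")
for feature, gain in ranking:
    print(f"{feature}: {gain:.3f}")
```

One caveat worth keeping in mind: a single threshold cannot capture a pattern like "mid-range salaries leave, both extremes stay". A deeper tree handles that by splitting on the same feature more than once.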


Is it normal for a website to get more user interaction the less it caters to its users? In this A/B test, a company tracked user interaction on two versions of its website: one using a common language template and another using local translators for each region. After the data pointed to the common template performing better, the test was checked for sample bias using t-tests and a decision tree model. Ultimately, sample bias was found, and a decision tree model was built to automate future sample bias detection.
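
As a rough sketch of what a t-test-based bias check can look like: in a properly randomized test, anything known about users before the experiment (country, device, prior visits) should be balanced between the two groups. The function below computes Welch's t statistic for one such attribute. Choosing the Welch variant is my assumption, and the two samples are made-up indicator values (1 if a user comes from some hypothetical country, 0 otherwise), not data from the actual test.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(a, b):
    """Welch's t statistic and approximate degrees of freedom for two samples."""
    # Squared standard error of each sample mean.
    va = variance(a) / len(a)
    vb = variance(b) / len(b)
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df

# Made-up indicator values for illustration only.
control = [1, 0, 1, 0, 1, 0, 1, 0]
test = [1, 1, 1, 1, 1, 0, 1, 1]

t_stat, dof = welch_t(control, test)
print(f"t = {t_stat:.2f}, df = {dof:.1f}")
```

A large absolute t on a pre-experiment attribute suggests the groups were not split evenly on it, so any difference in the outcome metric may come from the mix of users rather than from the change being tested.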



Out-Reach is a project to optimize the way non-profits conduct their outreach efforts. Three years' worth of MTA turnstile data was used to pinpoint the areas of New York with the highest volume of foot traffic by hour of the day, day of the week, and season. Recommendations were presented as visualizations to make the data more accessible.
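
The grouping behind this kind of analysis can be sketched in a few lines. The example assumes the turnstile readings have already been turned into per-interval entry counts, and it maps months to seasons using the meteorological convention (March to May is spring, and so on). Both are my assumptions, and the station names and counts are made up.

```python
from collections import defaultdict
from datetime import datetime

DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]

# Assumed month-to-season mapping (meteorological seasons).
SEASONS = {
    12: "winter", 1: "winter", 2: "winter",
    3: "spring", 4: "spring", 5: "spring",
    6: "summer", 7: "summer", 8: "summer",
    9: "fall", 10: "fall", 11: "fall",
}

def busiest_slots(records, top=3):
    """Sum entries per (station, season, weekday, hour); return the largest totals."""
    totals = defaultdict(int)
    for station, timestamp, entries in records:
        ts = datetime.fromisoformat(timestamp)
        key = (station, SEASONS[ts.month], DAYS[ts.weekday()], ts.hour)
        totals[key] += entries
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)[:top]

# Made-up records for illustration: (station, ISO timestamp, entries in interval).
records = [
    ("Station A", "2016-04-02T12:00:00", 500),
    ("Station A", "2016-04-09T12:00:00", 700),
    ("Station B", "2016-01-04T08:00:00", 300),
]

for slot, total in busiest_slots(records):
    print(slot, total)
```

Keeping the station in the grouping key makes it possible to answer "where" and "when" with the same table, which is what a canvassing schedule needs.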

Some Interesting Insights:

  • Spring is the season with the highest entries to the subway
  • During the weekends the best times to canvass are around the afternoon, from 12 -