What I make

 
 

Data Tools

RENT: MARKET ANALYSIS APP

Rent is an app that automates the way real estate agents conduct market analysis. It is powered by a regression model trained on data scraped from rental and real estate websites, and the model is retrained weekly to keep up with market changes. A Flask app is under construction; it will be deployed via AWS Elastic Beanstalk, with its data stored in AWS DBS.


WORLDLY: NEWS SUMMARY TOOL

Worldly is a D3.js tool that lets users compare what different world regions care about in world events. It shows the volume of articles that focus on a given event and how much attention another country gave the same event. Did you know that India was focused on Brexit before anyone else? Isn't it curious that the U.K. covered Zika as much as the U.S., although it was in no real danger? These are the types of insights that Worldly uncovers!

Data for Worldly was acquired by scraping three years' worth of articles from newspapers across five continents. The data was cleaned and stored in a non-relational database, and topics were then extracted using Non-negative Matrix Factorization (NMF).
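To give a feel for the topic extraction step, here is a minimal pure-Python sketch of NMF using the standard multiplicative update rules. It is not the code behind Worldly: the tiny document-term matrix, the term list, and the function names are made up for illustration, and in practice a library implementation (for example scikit-learn's NMF class) would be the more likely choice.

```python
import random

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(m):
    return [list(col) for col in zip(*m)]

def nmf(v, k, steps=200, seed=0, eps=1e-9):
    """Factor V (docs x terms) into non-negative W (docs x k) and H (k x terms)."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(k)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(k)]
    for _ in range(steps):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)] for i in range(k)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(k)] for i in range(n)]
    return w, h

# Hypothetical term counts: rows are articles, columns are terms.
terms = ["brexit", "vote", "zika", "virus"]
counts = [
    [2, 1, 0, 0],
    [4, 2, 0, 0],
    [0, 0, 3, 3],
    [0, 0, 1, 1],
]

w, h = nmf(counts, k=2)
approx = matmul(w, h)
error = sum((counts[i][j] - approx[i][j]) ** 2 for i in range(4) for j in range(4))

# The highest-weighted terms in each row of H describe one topic.
for topic_id, weights in enumerate(h):
    ranked = sorted(zip(weights, terms), reverse=True)
    print(topic_id, [term for _, term in ranked[:2]])
```

Each row of H can be read as a topic (a weighting over terms), and each row of W says how strongly an article expresses each topic.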


Business Insights

OUT-REACH

Out-Reach is a project to optimize the way non-profits conduct their outreach efforts. Three years' worth of MTA turnstile data was used to pinpoint the areas of New York with the highest volume of foot traffic by hour of the day, day of the week, and season. Recommendations were presented as visualizations to make the data more accessible.

Some Interesting Insights:

  • Spring is the season with the highest entries to the subway
  • During weekends, the best times to canvass are in the afternoon, from 12 to 6 p.m.
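Insights like these come from grouping entries by day of the week and hour. The sketch below shows that grouping on a handful of made-up records. It assumes per-interval entry counts have already been derived (the raw MTA turnstile files report cumulative counters), and the station name, timestamps, and numbers are hypothetical.

```python
from collections import defaultdict
from datetime import datetime

DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]

# Hypothetical records: (timestamp, station, entries during that interval)
records = [
    ("2016-04-02 08:00", "Station A", 300),
    ("2016-04-02 12:00", "Station A", 900),
    ("2016-04-02 16:00", "Station A", 1200),
    ("2016-04-04 08:00", "Station A", 1500),
]

def traffic_by_day_and_hour(rows):
    """Sum entries for every (day of week, hour) pair."""
    totals = defaultdict(int)
    for stamp, _station, entries in rows:
        moment = datetime.strptime(stamp, "%Y-%m-%d %H:%M")
        totals[(DAY_NAMES[moment.weekday()], moment.hour)] += entries
    return dict(totals)

totals = traffic_by_day_and_hour(records)
busiest = max(totals, key=totals.get)
print(busiest, totals[busiest])
```

The same grouping key can be extended with a season or a station to reproduce the other breakdowns.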

A/B TESTING LANGUAGE TRANSLATIONS

 

Is it normal for a website to increase user interactions the less it caters to its users? For this A/B test, a company tracked user interaction on two versions of its website: one using a common language template and another using local translators for each region. After the data pointed to the common template performing better, the test was analyzed for sample bias using t-tests and a decision tree model. Ultimately, sample bias was found, and a decision tree model was built to automate future sample bias detection.
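One way to check for sample bias with a t-test is to compare a user attribute between the two groups: if randomization worked, the groups should look alike. The sketch below computes Welch's t statistic and its degrees of freedom with the standard library only. The attribute (user age) and all values are hypothetical, and the project's actual test setup is not described here. With SciPy installed, scipy.stats.ttest_ind with equal_var=False would also return a p-value.

```python
import math
from statistics import mean, variance

def welch_t(sample_a, sample_b):
    """Return Welch's t statistic and approximate degrees of freedom."""
    na, nb = len(sample_a), len(sample_b)
    va, vb = variance(sample_a), variance(sample_b)  # sample variances
    se2 = va / na + vb / nb                          # squared standard error
    t = (mean(sample_a) - mean(sample_b)) / math.sqrt(se2)
    df = se2 ** 2 / ((va / na) ** 2 / (na - 1) + (vb / nb) ** 2 / (nb - 1))
    return t, df

# Hypothetical user ages in each arm of the experiment.
control = [25, 30, 35, 40, 45]
balanced_test = [26, 31, 34, 41, 44]   # looks like the control group
skewed_test = [50, 55, 60, 65, 70]     # clearly older users

t_balanced, df_balanced = welch_t(control, balanced_test)
t_skewed, df_skewed = welch_t(control, skewed_test)
print(round(t_balanced, 3), round(t_skewed, 3))
```

A t statistic near zero suggests the groups are comparable on that attribute, while a large absolute value is a warning that the split was not random.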



FORECASTING: BLACK FRIDAY

 

This project aims to help retail stores stock the appropriate amount of inventory for Black Friday. Using historical customer and product sales data, a Random Forest regression model was trained to predict the volume of sales anticipated for Black Friday. To prepare for larger data volumes, all of this was done on an AWS EC2 instance using Spark's MLlib package.


Data Journalism


ELECTION 2016: EMOJI SENTIMENT ANALYSIS

Emojis have been called a new form of language, so what better way to get a different angle on the recent election than to analyze how people used them? Using the Twitter Streaming API, 20 million tweets were acquired, cleaned, and stored in a SQL database. The log odds ratio was used to measure how strongly an emoji was associated with a given candidate. Visualizations were made for different times and types of emojis.
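For readers unfamiliar with the log odds ratio, here is a small sketch of how it can score an emoji's association with one candidate versus another. The emoji names, counts, and the 0.5 smoothing constant are assumptions for illustration, not figures or settings from the project.

```python
import math
from collections import Counter

def log_odds_ratio(counts_a, counts_b, smoothing=0.5):
    """Log odds ratio of each item appearing in corpus A versus corpus B.

    Positive scores lean toward A, negative scores lean toward B.
    The smoothing term avoids division by zero for unseen items.
    """
    total_a = sum(counts_a.values())
    total_b = sum(counts_b.values())
    scores = {}
    for item in set(counts_a) | set(counts_b):
        a = counts_a.get(item, 0)
        b = counts_b.get(item, 0)
        odds_a = (a + smoothing) / (total_a - a + smoothing)
        odds_b = (b + smoothing) / (total_b - b + smoothing)
        scores[item] = math.log(odds_a / odds_b)
    return scores

# Hypothetical emoji counts in tweets mentioning each candidate.
candidate_a = Counter({"fire": 30, "flag": 50, "laugh": 20})
candidate_b = Counter({"fire": 10, "flag": 50, "laugh": 40})

scores = log_odds_ratio(candidate_a, candidate_b)
for emoji, score in sorted(scores.items(), key=lambda pair: pair[1], reverse=True):
    print(emoji, round(score, 2))
```

An emoji used equally often for both candidates scores zero, so the measure highlights only the emojis that distinguish one candidate's conversation from the other's.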

Work completed for Prismoji



THIRD PRESIDENTIAL DEBATE: TOPIC ANALYSIS

 

Do the American people really care about what the media focuses on? Using 1 million tweets collected during the third presidential debate, interest in each debate topic was measured by tweet volume. An interactive plot was built using Plotly to visualize the level of interest throughout the debate.
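As a rough illustration of measuring interest by tweet volume, the sketch below counts tweets per topic in five-minute buckets using simple keyword matching. The keyword lists, the tweets, and the bucket size are all hypothetical; how topics were actually assigned in the project is not stated here.

```python
import re
from collections import Counter

# Hypothetical topic keywords.
topic_keywords = {
    "immigration": {"immigration", "border", "wall"},
    "economy": {"economy", "jobs", "taxes"},
}

# Hypothetical tweets: (minutes since the debate started, text)
tweets = [
    (3, "They are talking about the border wall again"),
    (3, "What about jobs?"),
    (4, "Immigration answer was weak"),
    (12, "Taxes and the economy finally"),
]

def topic_volume(rows, keywords, bucket_minutes=5):
    """Count tweets per (time bucket, topic); a tweet counts once per topic."""
    volume = Counter()
    for minute, text in rows:
        words = set(re.findall(r"[a-z']+", text.lower()))
        bucket = (minute // bucket_minutes) * bucket_minutes
        for topic, terms in keywords.items():
            if words & terms:
                volume[(bucket, topic)] += 1
    return volume

volume = topic_volume(tweets, topic_keywords)
for (bucket, topic), count in sorted(volume.items()):
    print(bucket, topic, count)
```

Counts like these, one series per topic, are the kind of input an interactive line chart would plot over the course of the debate.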

Work completed for Prismoji