PROJECTS
Analyze a bowl of M&Ms using computer vision to make sure there are no green ones. A fundamental quality control automation workflow with applications across many industries.
[GitHub Repository] [Video Demo]
Keywords: C#, ASP.NET MVC, AWS Rekognition, Facebook Authentication
Perform text search of YouTube channel transcripts, and locate precise links directly to specific videos and points of interest. A full-stack commercial application that includes JWT authentication for user account management.
[GitHub Repository]
Keywords: MERN Stack, React, JWT, Google Cloud, MongoDB, Heroku, Express
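The "precise links" idea can be sketched in a few lines. This is an illustration in Python, not the application's MERN code, and it assumes each transcript is available as (start time in seconds, text) segments. The function name `find_moments`, the sample segments, and the `VIDEO_ID` placeholder are all hypothetical.

```python
def find_moments(segments, query, video_id):
    """Return watch links that jump to each transcript segment containing the query."""
    needle = query.lower()
    return [
        # YouTube's t=<seconds>s parameter starts playback at that offset.
        f"https://www.youtube.com/watch?v={video_id}&t={int(start)}s"
        for start, text in segments
        if needle in text.lower()
    ]


# Hypothetical transcript segments: (start time in seconds, caption text).
segments = [
    (0.0, "Welcome back"),
    (42.5, "today we cover geofencing"),
    (90.0, "Geofencing in practice"),
]

for link in find_moments(segments, "geofencing", "VIDEO_ID"):
    print(link)
```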
VouchIt Behavioral Geofencing
Use your existing customer data to predict where future clients will come from, and make objective decisions based on solid data science. A full-stack commercial application that includes ecommerce, authentication, custom APIs, AI and more.
[GitHub Repository] [Live Demo]
Keywords: AWS Amplify, React, Stripe, Machine Learning, GraphQL, REST API
Google Tag Manager URI Parser and Tracker
Dozens of Google Analytics goals can be added to GTM with just a single variable. This custom variable for GTM extracts the protocol from the Click URL variable, which can then be exported to parent tags. Currently used by digital marketing agencies.
[GitHub Repository]
Keywords: Google Tag Manager, Scripting
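The core of the variable is a single step: reduce a clicked link to its protocol. The sketch below shows that step in Python for illustration only, since the real variable runs as JavaScript inside GTM and reads the built-in Click URL variable. The function name `click_protocol` and the sample URLs are hypothetical.

```python
from urllib.parse import urlsplit


def click_protocol(click_url: str) -> str:
    """Return the lowercase protocol (scheme) of a clicked link, or '' if it has none."""
    return urlsplit(click_url).scheme


# Hypothetical click targets; one value per link type is enough to tell them apart.
for url in ("https://example.com/pricing", "mailto:sales@example.com", "tel:+1-555-0100", "/contact"):
    print(repr(click_protocol(url)))
```

Because email, phone, and web links each yield a different protocol, one variable can feed many goals.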
Type in any location, and compile a gallery of local images. Add locations to the list, or take a Google StreetView tour of that area.
[GitHub Repository]
Keywords: React, Redux, Heroku, Firebase, CSS, Bootstrap, OAuth, Jest automated testing
Real-time results for the latest Wikipedia article updates. Uses client-side processing to call Wikipedia's API and process the results.
[GitHub]
Keywords: Node, Express, REST API
Using the Kobe Bryant shot history dataset, which has 5000 missing values, I built Random Forest and linear models to solve for the unknowns. In this Kaggle Kernel, I've also done some basic feature creation and exploratory analysis.
Keywords: R, Machine Learning, Random Forest, Statistical Inference, ggplot2
This corpus was created as part of a previous project from 2014. Using ScrapingHub, scraped the content section of over 160,000 blogs and news sites, excluding HTML tags, menu items, advertisements, suggested content, and other irrelevant information.
Keywords: Web Crawling, Web Scraping, Kaggle, ScrapingHub
Kaggle Real-Estate Prediction Contest
Created for a Kaggle contest. Using home sales data from Ames, Iowa, predicted house prices for a test set of homes with missing sale prices. This document outlines the thought process for developing a basic model.
[Contest and source]
Keywords: Statistical Inference, Data Science, Regression, Prediction
HIPAA Breach Regression Analysis
This application draws on public datasets and analyzes data breaches across the USA. It was designed to give bloggers and reporters analysis of privacy and IT security trends.
Keywords: R Shiny, ggplot2, Regression, Predictions, Data Science, Big Data
SEO T-Test and Inference Spreadsheet
This Excel spreadsheet provides objective measurements on the effectiveness of Search Engine Optimization efforts on SERP positions, using T-Tests and other methods similar to those used in clinical trials. It is also effective in detecting tactics that might lead to Google ranking penalties. This is a partial example of something I've been using professionally, although features and keywords have been anonymised for intellectual property and conflict-of-interest reasons.
Keywords: Statistical Inference, T-Testing, Search Engine Optimization
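The kind of comparison the spreadsheet makes can be sketched outside Excel. This is an illustration only, assuming a Welch two-sample t-test on SERP positions recorded before and after a change. The function name `welch_t` and the sample positions are hypothetical, and the spreadsheet's own formulas may differ.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(before, after):
    """Return (t, degrees of freedom) for Welch's two-sample t-test."""
    va = variance(before) / len(before)  # squared standard error of each sample mean
    vb = variance(after) / len(after)
    t = (mean(before) - mean(after)) / sqrt(va + vb)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (va + vb) ** 2 / (va ** 2 / (len(before) - 1) + vb ** 2 / (len(after) - 1))
    return t, df


# Hypothetical SERP positions for one keyword (lower is better).
before = [12, 15, 11, 14, 13, 16]
after = [9, 10, 8, 11, 9, 10]

t, df = welch_t(before, after)
print(f"t = {t:.2f}, df = {df:.1f}")
```

A positive t here means the average position number fell after the change, which is a ranking improvement. The t value is then compared against a t distribution with df degrees of freedom to judge significance.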
NLP Backoff Text Prediction Demo
This was my Coursera Data Science Specialization Capstone project. It reads text corpora from blogs, news and social media, and builds language models to predict the next word you will type. I’ve configured it for English; however, it would probably work just as well in any Latin or Germanic language.
Keywords: R Shiny, ggplot2, Natural Language Processing
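The backoff idea is: look for the longest word context seen in training, and fall back to shorter contexts when it is missing. The sketch below is a simplified illustration in Python, not the R code behind the demo. It assumes plain counts with no discounting or smoothing, and the names `train` and `predict` and the toy sentence are hypothetical.

```python
from collections import Counter, defaultdict


def train(tokens, max_n=3):
    """Count which word follows each context of 0 to max_n - 1 preceding words."""
    model = defaultdict(Counter)
    for i, word in enumerate(tokens):
        for n in range(max_n):
            if i - n >= 0:
                model[tuple(tokens[i - n:i])][word] += 1
    return model


def predict(model, history, max_n=3):
    """Try the longest available context first, then back off to shorter ones."""
    for n in range(min(max_n - 1, len(history)), -1, -1):
        context = tuple(history[len(history) - n:])
        if context in model:
            return model[context].most_common(1)[0][0]
    return None  # only reached when the model is empty


model = train("the cat sat on the mat the cat ran".split())
print(predict(model, ["on", "the"]))   # trigram context found: mat
print(predict(model, ["saw", "the"]))  # backs off to the bigram context: cat
print(predict(model, ["zebra"]))       # backs off to the most frequent word: the
```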