Jobin

PROGRAMMER AND GROWTH HACKER

Analyze a bowl of M&Ms using computer vision to make sure there are no green ones. A fundamental quality control automation workflow with applications across many industries.
[GitHub Repository] [Video Demo]
Keywords: C#, ASP.NET MVC, AWS Rekognition, Facebook Authentication

Zeeph YouTube Search

Perform text search of YouTube channel transcripts, and locate precise links directly to specific videos and points of interest. A full-stack commercial application that includes JWT authentication for user account management.
[GitHub Repository]
Keywords: MERN Stack, React, JWT, Google Cloud, MongoDB, Heroku, Express
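
A minimal sketch of the search-and-link idea, written in Python for illustration rather than taken from the application itself (which runs on the MERN stack). The Segment shape and both function names are assumptions; the only external convention relied on is YouTube's t query parameter for starting playback at a time offset.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """One transcript line (hypothetical shape, for illustration only)."""
    video_id: str
    start_seconds: float
    text: str


def timestamp_link(video_id: str, start_seconds: float) -> str:
    """Build a watch URL that starts playback at the given offset."""
    return f"https://www.youtube.com/watch?v={video_id}&t={int(start_seconds)}s"


def search_transcripts(segments, query):
    """Return (link, text) pairs for segments containing the query, ignoring case."""
    needle = query.lower()
    return [
        (timestamp_link(s.video_id, s.start_seconds), s.text)
        for s in segments
        if needle in s.text.lower()
    ]


if __name__ == "__main__":
    # Invented sample data, not real transcripts.
    demo = [
        Segment("abc123", 12.4, "Welcome back to the channel"),
        Segment("abc123", 95.0, "Today we look at geofencing"),
    ]
    for link, text in search_transcripts(demo, "geofencing"):
        print(link, "-", text)
```

A production version would more likely query a text index in the database than scan segments in memory; the linear scan just keeps the sketch self-contained.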

VouchIt Behavioral Geofencing

Use your existing customer data to predict where future clients will come from, and make objective decisions based on solid data science. A full-stack commercial application that includes ecommerce, authentication, custom APIs, AI and more.
[GitHub Repository] [Live Demo]
Keywords: AWS Amplify, React, Stripe, Machine Learning, GraphQL, REST API

Google Tag Manager URI Parser and Tracker

Dozens of Google Analytics goals can be added to GTM with just a single variable. This custom variable for GTM extracts the protocol from the Click URL variable, which can then be exported to parent tags. Currently used by digital marketing agencies.
[GitHub Repository]

Keywords: Google Tag Manager, Scripting
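
The real variable runs as JavaScript inside GTM, but the protocol-extraction logic can be sketched in Python. This is an illustration under assumptions, not the repository's code: the function names and the goal labels in the mapping are hypothetical.

```python
from urllib.parse import urlsplit


def click_url_protocol(click_url: str) -> str:
    """Return the lowercase scheme of a URL, or "" if it has none."""
    return urlsplit(click_url.strip()).scheme


def goal_for_click(click_url: str) -> str:
    """Map a protocol to a goal label (labels are hypothetical examples)."""
    labels = {"tel": "phone_call", "mailto": "email", "sms": "text_message"}
    return labels.get(click_url_protocol(click_url), "other")


if __name__ == "__main__":
    for url in ("tel:+15551234567", "mailto:sales@example.com", "/contact"):
        print(repr(url), "->", repr(click_url_protocol(url)), goal_for_click(url))
```

Keying the goals off one extracted value is what lets a single variable feed many tags: each tag only has to compare against the protocol string.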

Tourister Virtual Sightseeing

Type in any location, and compile a gallery of local images. Add locations to the list, or take a Google StreetView tour of that area.
[GitHub Repository]

Keywords: React, Redux, Heroku, Firebase, CSS, Bootstrap, OAuth, Jest automated testing

Latest Wikipedia Edits

Real-time results for the latest Wikipedia article updates. Uses client-side processing to call Wikipedia's API and process the results.
[GitHub]
Keywords: Node, Express, REST API

Kobe Bryant Shot Prediction

Using the Kobe Bryant shot history dataset, which has 5000 missing values, I built Random Forest and linear models to predict the unknowns. In this Kaggle Kernel, I've also done some basic feature creation and exploratory analysis.
Keywords: R, Machine Learning, Random Forest, Statistical Inference, ggplot2

Blog Articles Data Corpus

This corpus was created as part of a previous project from 2014. Using ScrapingHub, I scraped the content section of over 160,000 blogs and news sites, excluding HTML tags, menu items, advertisements, suggested content, and other irrelevant information.
Keywords: Web Crawling, Web Scraping, Kaggle, ScrapingHub

Kaggle Real-Estate Prediction Contest

Created for a Kaggle contest. Using home sales data from Ames, Iowa, I predicted house prices for a test set of homes with missing sale prices. This document outlines the thought process for developing a basic model.
[Contest and source]
Keywords: Statistical Inference, Data Science, Regression, Prediction

HIPAA Breach Regression Analysis

This application draws on public datasets and analyzes data breaches across the USA. It was designed to give bloggers and reporters analysis of privacy and IT security trends.
Keywords: R Shiny, ggplot2, Regression, Predictions, Data Science, Big Data

SEO T-Test and Inference Spreadsheet

This Excel spreadsheet provides objective measurements on the effectiveness of Search Engine Optimization efforts on SERP positions, using T-Tests and other methods similar to those used in clinical trials. It is also effective in detecting tactics that might lead to Google ranking penalties. This is a partial example of something I've been using professionally, although features and keywords have been anonymised for intellectual property and conflict-of-interest reasons.
Keywords: Statistical Inference, T-Testing, Search Engine Optimization
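
The spreadsheet itself is built in Excel; as a rough illustration of the kind of test involved, here is a Python sketch of Welch's two-sample t-test. Choosing Welch's variant is an assumption on my part, and the ranking figures in the demo are invented. The function returns the t statistic and degrees of freedom; turning those into a p-value needs a t-distribution table or a statistics library.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(before, after):
    """Return (t statistic, degrees of freedom) for Welch's two-sample t-test."""
    n1, n2 = len(before), len(after)
    # Squared standard error of each sample mean.
    v1, v2 = variance(before) / n1, variance(after) / n2
    t = (mean(before) - mean(after)) / sqrt(v1 + v2)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


if __name__ == "__main__":
    # Invented daily SERP positions for one keyword (lower is better).
    before = [14, 12, 15, 13, 16]
    after = [9, 8, 11, 10, 7]
    t, df = welch_t(before, after)
    print(f"t = {t:.2f}, df = {df:.1f}")
```

With positions ordered this way, a large positive t suggests the average position number fell after the change, meaning the page ranks higher.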

NLP Backoff Text Prediction Demo

This was my Coursera Data Science Specialization Capstone project. It reads data corpuses from blogs, news and social media, and builds language models to predict the next word you will type. I’ve configured it for English; however, it would probably work just as well in any Latin or Germanic language.

Keywords: R Shiny, ggplot2, Natural Language Processing
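
The project is written in R; the backoff idea can be sketched in a few lines of Python. This is a simplified version under assumptions: it uses raw counts with no discounting or smoothing, and the function names and the tiny sample text are made up for illustration.

```python
from collections import Counter, defaultdict


def build_models(tokens, max_order=3):
    """Count which word follows each context of 0..max_order-1 preceding words."""
    models = [defaultdict(Counter) for _ in range(max_order)]
    for i, word in enumerate(tokens):
        for n in range(max_order):
            if i - n >= 0:
                context = tuple(tokens[i - n:i])
                models[n][context][word] += 1
    return models


def predict_next(models, history):
    """Try the longest context first, then back off to shorter ones."""
    for n in range(len(models) - 1, -1, -1):
        context = tuple(history[-n:]) if n else ()
        followers = models[n].get(context)
        if followers:
            return followers.most_common(1)[0][0]
    return None  # Only reached when the models were built from no text.


if __name__ == "__main__":
    tokens = "the cat sat on the mat the cat ran".split()
    models = build_models(tokens)
    print(predict_next(models, ["on", "the"]))
    print(predict_next(models, ["unseen", "the"]))
```

The first call is answered by the trigram counts; the second has no matching two-word context, so it falls back to what most often follows "the" alone.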