Projects

OptiRanker

Programming Language/IDE: Python (Flask), Jupyter Notebook

Description: With the growing need for data-driven strategies in marketing and business intelligence, organizations are increasingly turning to advanced analytical tools. This project introduces an optimized ranking and multi-index analysis framework that leverages machine learning techniques such as K-Means clustering and SHAP (SHapley Additive exPlanations) to derive actionable insights from media data, primarily for Market Prioritization, Audience Selection, and Competitive Whitespace Identification.
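
For illustration, here is a minimal sketch of the clustering and SHAP-weighting steps against a hypothetical media-metrics DataFrame; the feature and target column names are placeholders, not the project's actual schema.

```python
# Minimal sketch of the clustering + SHAP weighting idea.
# feature_cols and target_col are hypothetical placeholders.
import pandas as pd
import shap
from sklearn.cluster import KMeans
from sklearn.ensemble import RandomForestRegressor
from sklearn.preprocessing import StandardScaler

def cluster_and_weight(df, feature_cols, target_col):
    # Scale features so K-Means distances are not dominated by any one metric
    X = StandardScaler().fit_transform(df[feature_cols])

    # Group markets/audiences into segments
    df["segment"] = KMeans(n_clusters=4, n_init=10, random_state=42).fit_predict(X)

    # Fit a tree model to the business target, then use mean |SHAP| values
    # as recommended weights for each metric in the composite ranking
    model = RandomForestRegressor(n_estimators=200, random_state=42)
    model.fit(df[feature_cols], df[target_col])
    shap_values = shap.TreeExplainer(model).shap_values(df[feature_cols])
    weights = pd.Series(abs(shap_values).mean(axis=0), index=feature_cols)
    return df, weights / weights.sum()
```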

The methodology encompasses data preprocessing, feature scaling, cluster analysis, and SHAP-based weight recommendations to ensure precision in decision-making. Metrics such as the Category Index, Brand Index, and Total Index are dynamically computed and visualized using bubble charts, bar plots, and tables, with a final downloadable report for manual analysis.
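
A rough sketch of how the composite indices and bubble chart could be assembled from those weights follows; the split into category versus brand metrics and the 100-based indexing convention are assumptions rather than the project's exact formulas.

```python
# Rough sketch of the index computation and bubble-chart output.
# Column groupings and the 100-based indexing are assumptions.
import matplotlib.pyplot as plt
import pandas as pd

def build_indices(df, weights, category_cols, brand_cols):
    # Index each metric against its mean (100 = average market)
    indexed = df[weights.index] / df[weights.index].mean() * 100
    df["Category Index"] = (indexed[category_cols] * weights[category_cols]).sum(axis=1) / weights[category_cols].sum()
    df["Brand Index"] = (indexed[brand_cols] * weights[brand_cols]).sum(axis=1) / weights[brand_cols].sum()
    df["Total Index"] = (indexed * weights).sum(axis=1) / weights.sum()
    return df

def bubble_chart(df, size_col):
    # Markets plotted against a 100 baseline on both axes
    plt.scatter(df["Category Index"], df["Brand Index"], s=df[size_col], alpha=0.6)
    plt.axhline(100, ls="--")
    plt.axvline(100, ls="--")
    plt.xlabel("Category Index")
    plt.ylabel("Brand Index")
    plt.savefig("bubble_chart.png")
```

Plotting both indices against the 100 baseline is one way to surface the competitive whitespace the description mentions (markets where category demand indexes high but the brand does not).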

Note: These numerical outputs are mockups derived from a real project and are meant to highlight the structure and potential applications of this tool.

Keywords: Optimized Ranking, Market Prioritization, SHAP, K-Means Clustering, Audience Selection, Competitive Analysis, Multi-Index Analysis, Business Intelligence, Data Visualization

MetrixIQ

Programming Language/IDE: Python (Flask), Jupyter Notebook

Description: TBD

Note: These numerical outputs are mockups derived from a real project and are meant to highlight the structure and potential applications of this tool.

Granger Causality and Impulse Response Function on Customer Sentiment Scores

Programming Language/IDE: R, RStudio (RMarkdown)

Description: Using customer sentiment scores, I examined whether each metric (trust, interest, pride, happiness) could aid in forecasting XCompany’s revenue. By pairing the results of these statistical tests with impulse response functions for each metric that demonstrated Granger causality, I was able to assess the direction, magnitude, and duration of the relationship’s impact. The findings are directional and can help shape future profit-maximization strategies and overarching business decisions.
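
A condensed Python sketch of the equivalent workflow (the original report was written in R with RMarkdown); the file name and column names are placeholders for the sentiment and revenue series.

```python
# Sketch of the Granger-causality + impulse-response workflow with statsmodels.
# The CSV path and column names are placeholders.
import pandas as pd
from statsmodels.tsa.api import VAR
from statsmodels.tsa.stattools import grangercausalitytests

df = pd.read_csv("sentiment_revenue.csv", parse_dates=["date"], index_col="date")

# Test whether each sentiment metric Granger-causes revenue
# (columns are ordered [effect, candidate cause])
for metric in ["trust", "interest", "pride", "happiness"]:
    results = grangercausalitytests(df[["revenue", metric]], maxlag=4)
    p_values = [round(results[lag][0]["ssr_ftest"][1], 4) for lag in results]
    print(metric, "min p-value across lags:", min(p_values))

# For metrics showing Granger causality, trace direction, magnitude, and
# duration of the effect with an impulse response function from a VAR model
var_model = VAR(df[["revenue", "trust"]]).fit(maxlags=4, ic="aic")
irf = var_model.irf(periods=10)
irf.plot(impulse="trust", response="revenue")  # shock to trust, response of revenue
```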

Note: These numerical outputs are mockups derived from a real project and are meant to highlight the structure and potential applications of this report.

A Comparison Analysis of Accuracy and Reliability Rates in Stock Market Forecasting Models

Programming Language/IDE: Python, Jupyter Notebook

Description: With interest in artificial intelligence and machine learning growing over the past decade, the financial sector has turned its attention to stock market forecasting for its predictive qualities and potential for maximizing returns. Investors often rely on methods such as ARIMA, KNN, and LSTM to get a sense of the direction of the market so they can determine their next course of action. However, because these methods are relatively new, they may not be as accurate or reliable as this particular financial activity requires. This is due to many factors, including the market being affected by external and unpredictable forces, usually as a result of human behavior. Therefore, by evaluating these predictive models for accuracy and reliability, we can address the flaws in their forecasts and create improved methods for predicting stock prices. After a detailed structural analysis, including preprocessing, training, testing, modeling and fitting, forecasting, and calculating error values, we conclude that ARIMA stands out as the most accurate method, KNN as the most reliable, and LSTM as the least accurate and reliable of the three. LSTM's limitations are well documented, however: because the model incorporates a neural network, it needs more data and more training cycles to learn and cannot rely on recent observations alone to attain greater accuracy and reliability.
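
A simplified sketch of how two of the three models (ARIMA and KNN) could be compared on held-out data; the price file, ARIMA order, lag window, and 30-day test horizon are assumptions, not the study's actual configuration. An LSTM baseline would follow the same fit/forecast/score pattern.

```python
# Simplified accuracy comparison for ARIMA vs. KNN on a held-out window.
# File name, ARIMA order, lag window, and horizon are placeholders.
import numpy as np
import pandas as pd
from sklearn.metrics import mean_absolute_percentage_error
from sklearn.neighbors import KNeighborsRegressor
from statsmodels.tsa.arima.model import ARIMA

prices = pd.read_csv("stock_prices.csv")["Close"]          # hypothetical daily closes
train, test = prices.iloc[:-30], prices.iloc[-30:]         # hold out the last 30 days

# ARIMA: fit on the training window, forecast the test horizon
arima_fc = ARIMA(train, order=(5, 1, 0)).fit().forecast(steps=len(test))

# KNN: predict each day from the previous `lag` closes
lag = 5
X = np.column_stack([prices.shift(i) for i in range(1, lag + 1)])[lag:]
y = prices.iloc[lag:].to_numpy()
split = len(y) - 30
knn = KNeighborsRegressor(n_neighbors=5).fit(X[:split], y[:split])
knn_fc = knn.predict(X[split:])

print("ARIMA MAPE:", mean_absolute_percentage_error(test, arima_fc))
print("KNN   MAPE:", mean_absolute_percentage_error(test, knn_fc))
```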

Keywords: Machine learning, ARIMA, LSTM, K-Nearest Neighbors (KNN), Stock Market, Time Series, Stock Market Forecasting, Predictive Modeling

Hospital UI Web Scraper for Accessibility Compliance Verification

Programming Language: Python

Description: Created a web-harvesting script to extract data from over 10,000 hospital user interfaces, compiling comprehensive lists of phone information, addresses, TTY/TDD support, large-print options, and chatbot capabilities. Subsequently, developed Python algorithms to automate analysis of these accessibility elements on healthcare websites for hearing- and visually-impaired users. Stored the data in an MS Azure data warehouse to later inform legislation proposals on accessibility compliance.
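
A minimal sketch of the per-page accessibility check; the keyword list and URL are illustrative, and the real project iterated over 10,000+ pages and persisted the results to Azure.

```python
# Minimal sketch of an accessibility-feature check for a single hospital page.
# The keyword list and URL are illustrative placeholders.
import requests
from bs4 import BeautifulSoup

ACCESSIBILITY_TERMS = ["tty", "tdd", "large print", "relay service", "chat"]

def scan_hospital_page(url):
    resp = requests.get(url, timeout=10)
    resp.raise_for_status()
    text = BeautifulSoup(resp.text, "html.parser").get_text(separator=" ").lower()
    # Flag which accessibility cues appear anywhere in the page text
    return {term: term in text for term in ACCESSIBILITY_TERMS}

# Example usage (placeholder URL):
# print(scan_hospital_page("https://www.example-hospital.org/contact"))
```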

Note: The project image displays the first iteration of this application and does not reflect the final, scalable version.

Causal Impact Analysis on Customer Sentiment Scores

Programming Language/IDE: R, RStudio (RMarkdown)

Description: Employed causal impact analysis to assess how a specific event (e.g., a legal scandal, product launch, service outage, or marketing campaign) influenced customer sentiment scores. After collecting historical data from various sources and cleaning and preprocessing all metrics, I used the CausalImpact package in R to isolate the effect of the event on customer sentiment, accounting for other influencing factors. By comparing sentiment scores before and after the intervention period, I was able to quantify the causal impact of the event, providing valuable insights for strategic decision-making and future marketing planning.
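
A brief sketch of the same analysis using a Python port of the R CausalImpact package (e.g., tfcausalimpact); the dates, file, and column layout are placeholders and the original report was built in R.

```python
# Sketch of the intervention analysis via a Python port of CausalImpact.
# Dates, file name, and columns are placeholders.
import pandas as pd
from causalimpact import CausalImpact

# Response (sentiment) in the first column, control series as covariates
data = pd.read_csv("sentiment_daily.csv", parse_dates=["date"], index_col="date")

pre_period = [pd.to_datetime("2023-01-01"), pd.to_datetime("2023-05-31")]   # before the event
post_period = [pd.to_datetime("2023-06-01"), pd.to_datetime("2023-08-31")]  # after the event

ci = CausalImpact(data, pre_period, post_period)
print(ci.summary())   # estimated absolute/relative effect with credible intervals
ci.plot()             # observed vs. counterfactual, pointwise and cumulative effects
```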

Note: These numerical outputs are mockups derived from a real project and are meant to highlight the structure and potential applications of this report.

Sales Document Generator for Security Systems Company

Programming Language: VBA

Description: Maintained and updated VBA code for a Sales Cost Sheet containing product and price information as well as project inputs, which automatically generates a BOM (Bill of Materials), Bid Sheet, Purchase Order, and Master Packing List once filled out. Contributed an additional search button via macros for ease of product/part search, as the parts list extends over thousands of rows.

1P Customer Data ETL Pipeline Creation in Snowflake using Snowpipe

Programming Language: SQL

Description: In the Snowflake UI, set up a storage integration to extract 15MM rows of data from an Amazon S3 bucket and load them into a stage. Transformed pipe-delimited CSV customer data into a comma-separated format, with error handling for corrupted files containing invalid encoding or extra characters that deviate from the delimiter structure. Created and automated a Snowpipe to feed incremental data into a Snowflake table at a daily cadence.
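
A rough Python sketch of the stage and Snowpipe setup issued through the Snowflake connector; every object name and the bucket path are placeholders, and the AUTO_INGEST setting stands in for the daily scheduling described above.

```python
# Sketch of the stage + Snowpipe DDL issued via the Snowflake Python connector.
# Account, credentials, object names, and the S3 path are placeholders.
import snowflake.connector

conn = snowflake.connector.connect(
    account="my_account", user="etl_user", password="***",
    warehouse="etl_wh", database="customers", schema="raw",
)
cur = conn.cursor()

# External stage over the S3 prefix, reading pipe-delimited CSV files
cur.execute("""
    CREATE OR REPLACE STAGE customer_stage
      URL = 's3://my-bucket/customer/'
      STORAGE_INTEGRATION = s3_int
      FILE_FORMAT = (TYPE = CSV FIELD_DELIMITER = '|' SKIP_HEADER = 1)
""")

# Snowpipe that copies newly landed files into the target table;
# ON_ERROR = 'CONTINUE' skips malformed rows instead of failing the load
cur.execute("""
    CREATE OR REPLACE PIPE customer_pipe AUTO_INGEST = TRUE AS
      COPY INTO customer_data
      FROM @customer_stage
      ON_ERROR = 'CONTINUE'
""")
```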

Bootstrapping Applications: Simple Linear Regression Model to Estimate Deaths Caused by Interpersonal Violence

Programming Language/IDE: R, RStudio (Quarto)

Description: This project uses the "Causes of Death - Our World in Data" dataset merged with World Bank population data to calculate death rates and evaluate the effectiveness of the Bootstrap method. By generating 1,000 new samples, it estimates the population mean and standard deviation and constructs confidence intervals to compare with the true values, revealing a close fit. In a subsequent analysis, bootstrapped simple linear regression models predict deaths from interpersonal violence as a function of country population, showing higher fidelity to the sample data than the full-population model and illustrating the method's utility in addressing variability and enhancing model performance.
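
A compact Python sketch of the resampling procedure (the project itself was written in R with Quarto); the file and column names are placeholders for the merged OWID and World Bank data.

```python
# Compact sketch of the bootstrap procedure; file and columns are placeholders.
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)
df = pd.read_csv("deaths_population.csv")   # columns: country, population, violence_deaths
rates = (df["violence_deaths"] / df["population"]).to_numpy()

# 1,000 bootstrap resamples of the death-rate mean
boot_means = np.array([rng.choice(rates, size=len(rates), replace=True).mean()
                       for _ in range(1000)])
ci_low, ci_high = np.percentile(boot_means, [2.5, 97.5])
print(f"Bootstrap mean: {boot_means.mean():.6f}, 95% CI: ({ci_low:.6f}, {ci_high:.6f})")

# Bootstrap the simple linear regression of violence deaths on population
slopes = []
for _ in range(1000):
    sample = df.sample(n=len(df), replace=True)
    slope, intercept = np.polyfit(sample["population"], sample["violence_deaths"], deg=1)
    slopes.append(slope)
print("Slope 95% CI:", np.percentile(slopes, [2.5, 97.5]))
```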