Here's what I have done recently

Heart Disease Classification

The K-Nearest Neighbors (KNN) algorithm was used to classify patients with heart disease into distinct groups based on their demographic and clinical characteristics. Python libraries used: NumPy, Pandas, Matplotlib, KNN.
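As a rough illustration of the idea (not the project's actual code), a KNN classifier can be sketched in plain Python. The function name knn_predict, the two features (age and resting blood pressure), and the toy data are assumptions made up for this example.

```python
import math
from collections import Counter

def knn_predict(train_X, train_y, query, k=3):
    """Return the majority label among the k training rows closest to query."""
    # Pair each training row's Euclidean distance to the query with its label
    distances = sorted(
        (math.dist(row, query), label) for row, label in zip(train_X, train_y)
    )
    # Keep the labels of the k nearest rows and take a majority vote
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]

# Hypothetical toy data: (age, resting blood pressure); 1 = heart disease, 0 = none
train_X = [(63, 145), (67, 160), (58, 150), (37, 120), (41, 118), (45, 125)]
train_y = [1, 1, 1, 0, 0, 0]

prediction = knn_predict(train_X, train_y, (60, 148), k=3)
```

In practice the features would usually be scaled first, since KNN relies on raw distances and a feature with a large range can dominate the vote.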

Digital Inclusion Analysis

The K-Means clustering algorithm was used to group and rank selected African countries, to identify countries lacking digital inclusion and those suited for World digital aid. Logistic regression was then used to determine which countries were most likely to utilize the aid adequately. R libraries used: Tidyverse, ggplot2, NbClust.
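The project itself was done in R, but the clustering step can be sketched in plain Python for illustration. The function name kmeans, the two indicators (internet users per 100 people and mobile subscriptions per 100 people), and the toy values are assumptions, not the project's data. The first k points are used as starting centroids to keep the example deterministic.

```python
import math

def kmeans(points, k, iterations=20):
    """Minimal K-Means: alternate between assigning points and updating centroids."""
    # Simplification: start from the first k points instead of random initialization
    centroids = [tuple(p) for p in points[:k]]
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins the cluster of its nearest centroid
        assignments = [
            min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points
        ]
        # Update step: move each centroid to the mean of its members
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return assignments, centroids

# Hypothetical indicators per country: (internet users per 100, mobile subscriptions per 100)
countries = [(10, 40), (70, 110), (12, 45), (68, 105), (15, 38), (72, 115)]
assignments, centroids = kmeans(countries, k=2)
```

The resulting cluster labels could then be ranked by their centroid values, for example treating the cluster with the lower centroid as the less digitally included group.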

Analysis of Nursing Homes Quality and Utilization in the US

Nursing homes, also known as skilled nursing facilities (SNFs) or long-term care facilities, are residential facilities that provide comprehensive healthcare and assistance to individuals who require significant support with daily activities and medical care. They primarily cater to older adults with chronic illnesses or disabilities, or to those recovering from surgery or a hospital stay. Research shows that there are about 26,514 nursing homes in the US, and an estimated 70% of people who reach the age of 65 will need long-term care at some point in their lives. R libraries used: Tidyverse, Janitor, bbc.

BBC Goodfood Recipes Dashboard

This project explores the recipe classes from the BBC Goodfood website. We scraped recipe data for a variety of food types from the BBC Goodfood website, e.g. Quick & Easy, Vegan, and Vegetarian. The data was cleaned and structured using the tidyverse package. We analyzed recipe preparation time, average rating, average servings, recipe type, and nutritional content. Using the tidymodels package, we developed linear regression and random forest regression models to predict average preparation time based on user-selected inputs. These inputs include recipe type, recipe subtype, servings, and the amounts of sugar, carbs, and protein. R libraries used: Rvest, Shiny, Shinycssloaders, Tidyverse, Tidymodels, ggdist, bs4Dash, tidyquant.

News Headlines Topic Suggestion

This project uses a BERTopic model to suggest a topic based on news headlines. It takes data from a popular Indian newspaper, preprocesses it, and adapts a BERTopic model to predict a topic for a given headline. Python packages used: Pandas, BERTopic, Flask, Json, Joblib.

HR Dashboard (Tableau)

This visualization explores Human Resources data generated using ChatGPT and the Faker library. It covers employees' current status, salary, demographics, and termination status.