Posts

Kaggle Account

Hello, I have completed a Decision Tree Classifier model for the famous Titanic data set on Kaggle.

Background: Several attributes of the passengers on the Titanic were collected, such as the class of their cabin, where they embarked from, and whether or not they had siblings aboard the ship.

Task: Choose which attributes contributed most to survival and use those attributes to build a model that predicts whether or not a passenger would survive.

Click here to view it

Steps:
1. View the data.
2. Clean the data (for numeric columns I filled null values with the median, and I assigned dummy variables to categorical columns).
3. Split the data into a training set and a test set.
4. Train a Decision Tree Classifier on the training set.
5. Test the model's accuracy on the test set and get the score.
6. If the score is unsatisfactory, repeat step 2 until the data is more refined.
7. Run the model on the sample data provided and upload the results to Kaggle.

https:/...
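The steps above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the rows are made up and stand in for Kaggle's training file, the column names are assumed to follow the Kaggle Titanic layout, and the helper `clean` and the `max_depth=3` setting are hypothetical choices.

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Step 1: a tiny made-up sample standing in for the Kaggle training data.
raw = pd.DataFrame({
    "Pclass":   [1, 3, 3, 1, 2, 3, 1, 2, 3, 3, 2, 1],
    "Sex":      ["female", "male", "female", "female", "male", "male",
                 "male", "female", "male", "female", "male", "female"],
    "Age":      [38, 22, None, 35, 54, None, 40, 27, 2, 14, 30, 58],
    "SibSp":    [1, 1, 0, 1, 0, 0, 0, 0, 3, 1, 0, 0],
    "Embarked": ["C", "S", "S", "S", "S", "Q", "S", "S", "S", "C", None, "S"],
    "Survived": [1, 0, 1, 1, 0, 0, 0, 1, 0, 1, 0, 1],
})


def clean(df):
    """Step 2: median-fill numeric nulls, dummy-encode categorical columns."""
    df = df.copy()
    num_cols = df.select_dtypes(include="number").columns
    cat_cols = [c for c in df.columns if c not in num_cols]
    df[num_cols] = df[num_cols].fillna(df[num_cols].median())
    return pd.get_dummies(df, columns=cat_cols, dtype=int)


X = clean(raw.drop(columns="Survived"))
y = raw["Survived"]

# Step 3: hold out a quarter of the rows as a test set.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Step 4: fit the decision tree on the training set.
model = DecisionTreeClassifier(max_depth=3, random_state=0)
model.fit(X_train, y_train)

# Step 5: accuracy on the held-out rows.
accuracy = model.score(X_test, y_test)

# Which attributes the fitted tree relied on most.
importances = pd.Series(model.feature_importances_, index=X.columns)
importances = importances.sort_values(ascending=False)
print(f"accuracy: {accuracy:.2f}")
print(importances)
```

With only a dozen invented rows the score itself means nothing; the point is the order of operations. For step 7, the same `clean` function would be applied to Kaggle's test file before calling `model.predict`.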

Machine Learning Example Code

Business Problem: How to improve customer retention and predict a return customer.

Dataset: Hotel_bookings.csv

Note: This code exemplifies my pre-processing with Python as well as the use of logistic regression.

Hotel_Predictor

In [1]: import pandas as pd
        import numpy as np
        import matplotlib.pyplot as plt
        import seaborn as sns

In [2]: df = pd.read_csv('hotel_bookings.csv')

In [3]: df.head()

Out[3]: hotel is_canceled lead_time arrival_date_year arrival_date_month arrival_date_week_number arrival_date_day_of_month stays_in_weekend_nights stays_in_week_nights adults ... deposit_type agent company days_in_waiting_list customer_type ...
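The excerpt cuts off before the modelling step, so here is a rough sketch of how pre-processing plus logistic regression could look for the return-customer problem. It is an assumption-heavy illustration, not the post's code: the rows are invented, the target column `is_repeated_guest` does not appear in the truncated output above and is assumed, and the choice of features and of a scaling pipeline is hypothetical.

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Made-up rows; only lead_time, customer_type and is_canceled are column
# names visible in the excerpt. is_repeated_guest is an assumed target.
bookings = pd.DataFrame({
    "lead_time": [342, 7, 13, 0, 85, 2, 120, 1, 45, 3],
    "customer_type": ["Transient", "Contract", "Transient", "Group",
                      "Transient", "Contract", "Transient", "Group",
                      "Transient", "Contract"],
    "is_canceled": [1, 0, 0, 0, 1, 0, 1, 0, 0, 0],
    "is_repeated_guest": [0, 1, 0, 1, 0, 1, 0, 1, 0, 1],
})

# Pre-processing: dummy-encode the categorical column.
features = pd.get_dummies(
    bookings[["lead_time", "customer_type", "is_canceled"]],
    columns=["customer_type"],
    dtype=int,
)
target = bookings["is_repeated_guest"]

# Scale the inputs, then fit a logistic regression.
clf = make_pipeline(StandardScaler(), LogisticRegression())
clf.fit(features, target)

# Estimated probability that each booking belongs to a returning customer.
return_probability = clf.predict_proba(features)[:, 1]
print(return_probability.round(2))
```

The model here is fit and scored on the same ten rows purely to show the mechanics; a real analysis would hold out a test set before judging the predictions.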