RENTAL BIKES PROJECT

The data for this project comes from a bike sharing system that allows people to rent and return bikes automatically. The data set tracks the weather, the date, the number of bikes rented each day, and much more. In this project, I use Python to find out which variables affect the number of bikes rented each day.

Link for Data: https://www.kaggle.com/imakash3011/rental-bike-sharing

Introduction

Correlation using Python

Link to Code: https://github.com/tracylam1/Portfolio/blob/main/Rental%20Bikes%20Correlation.ipynb

I started the project by cleaning the data: checking for nulls, duplicates, and outliers, as shown in the notebook linked above. After cleaning, I created a correlation matrix to see which variables correlate most strongly with the total number of bikes rented each day. The matrix shows that 'temp', 'atemp', and 'year' have the highest correlation with the total number of rented bicycles. 'Instant' was not counted since it is the index. 'casual' and 'registered' were also not counted since they are the two values that add up to the 'count' variable.
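
The cleaning and correlation steps described above could look roughly like the sketch below. It is not the code from the linked notebook. It assumes the data has been loaded into a pandas DataFrame and that the columns are named `instant`, `casual`, `registered`, and `cnt`; the function name `clean_and_correlate` is made up for illustration, and the outlier check is left out.

```python
import pandas as pd


def clean_and_correlate(df: pd.DataFrame, target: str = "cnt") -> pd.Series:
    """Drop duplicate rows, then rank numeric columns by correlation with `target`.

    Column names here are assumptions about the data set, not taken
    from the original notebook.
    """
    # Report any columns that contain missing values.
    null_counts = df.isnull().sum()
    print("Nulls per column:\n", null_counts[null_counts > 0])

    # Remove exact duplicate rows.
    df = df.drop_duplicates()

    # Leave out the row index and the two columns that add up to the target.
    excluded = ["instant", "casual", "registered"]
    numeric = df.select_dtypes("number").drop(columns=excluded, errors="ignore")

    # Pearson correlation of every remaining column with the target.
    corr = numeric.corr()[target].drop(target)
    return corr.sort_values(ascending=False)
```

Calling `clean_and_correlate(df)` returns one correlation value per remaining column, sorted from most positive to most negative, which is the same information the 'count' row of the correlation matrix shows.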

[Figure: correlation matrix of the rental bike variables]

I also created scatterplots of the 'temp' and 'atemp' variables to show the positive relationship they have with the total number of bikes rented each day. The scatterplot below shows temp vs. cnt.
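
A scatterplot like this can be drawn with matplotlib. The sketch below is one way to do it, not necessarily how the original plot was made; the helper name `plot_vs_count` and the column names `temp` and `cnt` are assumptions.

```python
import matplotlib.pyplot as plt
import pandas as pd


def plot_vs_count(df: pd.DataFrame, x: str = "temp", y: str = "cnt"):
    """Scatter one variable against the daily rental count and return the Axes."""
    fig, ax = plt.subplots(figsize=(6, 4))
    # One point per day; some transparency makes dense areas easier to see.
    ax.scatter(df[x], df[y], alpha=0.5)
    ax.set_xlabel(x)
    ax.set_ylabel(y)
    ax.set_title(f"{x} vs. {y}")
    return ax
```

Passing `x="atemp"` gives the matching plot for the 'atemp' variable.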

[Figure: scatterplot of temp vs. cnt]

©2021 by Tracy Lam. Created with Wix.com