Project - Bike Rental Forecasting

End to End Project - Bikes Assessment - Description

Task

The dataset (location: /cxldata/datasets/project/bikes.csv) contains hourly rental bike demand data. The goal of the model is to estimate future bike demand from the parameters observed in the past.

We will be following this example step-by-step in this assessment.

Input data available:

  1. instant: record index

  2. dteday: date

  3. season: season (1: spring, 2: summer, 3: fall, 4: winter)

  4. yr: year (0: 2011, 1: 2012)

  5. mnth: month (1 to 12)

  6. hr: hour (0 to 23)

  7. holiday: whether the day is a holiday or not (extracted from [Web Link])

  8. weekday: day of the week

  9. workingday: 1 if the day is neither a weekend nor a holiday, otherwise 0

  10. weathersit:

    • 1: Clear, Few clouds, Partly cloudy

    • 2: Mist + Cloudy, Mist + Broken clouds, Mist + Few clouds, Mist

    • 3: Light Snow, Light Rain + Thunderstorm + Scattered clouds, Light Rain + Scattered clouds

    • 4: Heavy Rain + Ice Pellets + Thunderstorm + Mist, Snow + Fog

  11. temp: Normalized temperature in Celsius. The values are derived via (t - t_min) / (t_max - t_min), with t_min = -8 and t_max = +39 (only in hourly scale)

  12. atemp: Normalized feeling temperature in Celsius. The values are derived via (t - t_min) / (t_max - t_min), with t_min = -16 and t_max = +50 (only in hourly scale)

  13. hum: Normalized humidity. The values are divided by 100 (max)

  14. windspeed: Normalized wind speed. The values are divided by 67 (max)

  15. casual: count of casual users

  16. registered: count of registered users

  17. cnt: count of total rental bikes including both casual and registered
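
The normalization described for temp, atemp, hum and windspeed can be inverted to recover approximate physical values. The snippet below is a minimal sketch of that arithmetic; the helper names normalize and denormalize are hypothetical, and the sample value 0.5 is made up for illustration rather than taken from bikes.csv.

```python
# Bounds as stated in the column descriptions above.
T_MIN, T_MAX = -8.0, 39.0      # temp, degrees Celsius
AT_MIN, AT_MAX = -16.0, 50.0   # atemp, degrees Celsius
HUM_MAX = 100.0                # hum is divided by 100
WIND_MAX = 67.0                # windspeed is divided by 67


def normalize(raw, lower, upper):
    """Min-max scaling: (t - t_min) / (t_max - t_min)."""
    return (raw - lower) / (upper - lower)


def denormalize(value, lower, upper):
    """Inverse of normalize: map a scaled value back to its original scale."""
    return value * (upper - lower) + lower


# Illustrative (made-up) normalized readings.
temp_c = denormalize(0.5, T_MIN, T_MAX)     # midpoint of [-8, 39]
atemp_c = denormalize(0.5, AT_MIN, AT_MAX)  # midpoint of [-16, 50]
hum_pct = 0.5 * HUM_MAX
wind_raw = 0.5 * WIND_MAX

print(temp_c, atemp_c, hum_pct, wind_raw)
```

Undoing the scaling is mostly useful for plots and sanity checks; the models themselves can work with the normalized columns directly.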

Steps that will be followed:

  1. Importing the libraries

  2. Defining some utility functions

  3. Loading the data

  4. Cleaning the data

  5. Adding derived features

    • isWorking: 1 if the day is a working day and not a holiday, otherwise 0

    • monthCount: count of the number of months from the beginning of the dataset

    • xformHr: transform by shifting the hours by 5 hrs

    • dayCnt: count of the days from the beginning of the dataset

    • xformWorkHr: transform the hours so that non-working days have hours from 25 to 48

    • cntDeTrended: De-trended count values

  6. Analyzing the dataset

  7. Dividing the dataset into training and test dataset

    • using train_test_split in the ratio 70:30

  8. Training several models and analyzing their performance

  9. Selecting a model and evaluating using test dataset

  10. Improving the model by finding the best hyper-parameters and features

  11. Analyzing the residuals
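
The derived features listed in step 5 can be sketched with pandas as follows. This is one possible reading of the descriptions, not the assessment's reference solution: the function name add_derived_features, the direction of the 5-hour shift, the month and day counting formulas, the +1 offset that yields the 25 to 48 range, and the linear de-trending are all assumptions, and the four rows of toy data are made up for illustration.

```python
import numpy as np
import pandas as pd


def add_derived_features(bikes: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of `bikes` with the six derived columns added."""
    out = bikes.copy()
    out["dteday"] = pd.to_datetime(out["dteday"])

    # Assumption: a working day that is also not flagged as a holiday.
    out["isWorking"] = ((out["workingday"] == 1) & (out["holiday"] == 0)).astype(int)

    # Assumption: months counted from 1 (Jan 2011) using yr (0 or 1) and mnth.
    out["monthCount"] = out["yr"] * 12 + out["mnth"]

    # Assumption: shift the clock back by 5 hours, wrapping around midnight.
    out["xformHr"] = (out["hr"] - 5) % 24

    # Assumption: whole days elapsed since the earliest date in the data.
    out["dayCnt"] = (out["dteday"] - out["dteday"].min()).dt.days

    # Working days map to 1..24, non-working days to 25..48.
    out["xformWorkHr"] = (1 - out["isWorking"]) * 24 + out["xformHr"] + 1

    # Assumption: remove a straight-line trend of cnt over dayCnt.
    x = out["dayCnt"].to_numpy(dtype=float)
    y = out["cnt"].to_numpy(dtype=float)
    slope, intercept = np.polyfit(x, y, deg=1)
    out["cntDeTrended"] = y - (slope * x + intercept)
    return out


# Made-up rows using the column names from the input data list.
toy = pd.DataFrame({
    "dteday": ["2011-01-01", "2011-01-01", "2011-01-03", "2012-02-01"],
    "yr": [0, 0, 0, 1],
    "mnth": [1, 1, 1, 2],
    "hr": [0, 5, 8, 23],
    "holiday": [0, 0, 0, 0],
    "workingday": [0, 0, 1, 1],
    "cnt": [16, 1, 120, 40],
})
features = add_derived_features(toy)
print(features[["isWorking", "monthCount", "xformHr", "dayCnt", "xformWorkHr"]])
```

Keeping the transformation in a single function makes it easy to apply the same steps to the full dataset before it is split.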

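The 70:30 split in step 7 can be sketched with scikit-learn's train_test_split. The feature matrix and target below are random placeholders standing in for the prepared bike data, and the random_state value is an arbitrary choice made only so that the split is reproducible.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Placeholder data: 100 rows, 3 features, and a made-up count target.
rng = np.random.default_rng(0)
X = rng.random((100, 3))
y = rng.integers(0, 500, size=100)

# test_size=0.3 holds out 30% of the rows, giving the 70:30 ratio.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

print(X_train.shape, X_test.shape)
```

The test portion is meant to stay untouched until a model has been selected, so that the final evaluation in step 9 reflects unseen data.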
