Project - Bike Rental Demand Forecasting

Objectives:

  1. Using historical usage patterns and weather data, forecast (predict) the bike rental demand (the number of bike users, 'cnt') on an hourly basis.
  2. Use the “Bikes Rental” data set to predict the bike demand (bike user count, 'cnt') with several suitable models (ML algorithms), and report the values of the performance measures for each model.
  3. Apply dimensionality reduction to the data set before using it to train the models.
  4. Report the model that performs best, fine-tune that model using one of the model fine-tuning techniques, and report the best combination of hyperparameters found for it.
  5. Use the selected model to make the final predictions and report the values of the performance measures for those predictions.
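
The objectives above can be sketched end to end with scikit-learn. This is a minimal sketch, not a reference solution. It assumes scikit-learn is the library in use, PCA as the dimensionality reduction step, three candidate regressors, cross-validated RMSE as the performance measure, and grid search as the fine-tuning technique. It also assumes the random forest turns out to be the best model. The function names, the candidate list, and the parameter grid are illustrative; X and y stand for the prepared feature matrix and the 'cnt' target.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import GridSearchCV, cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeRegressor


def make_pipeline_for(model, n_components=0.95):
    """Scale, reduce dimensionality with PCA, then fit the given regressor."""
    return Pipeline([
        ("scale", StandardScaler()),
        # A float keeps enough components to explain that share of variance.
        ("pca", PCA(n_components=n_components)),
        ("model", model),
    ])


def compare_models(X, y, cv=3):
    """Return {model name: mean cross-validated RMSE}; lower is better."""
    candidates = {  # illustrative choice of models, not prescribed by the project
        "linear": LinearRegression(),
        "tree": DecisionTreeRegressor(random_state=42),
        "forest": RandomForestRegressor(n_estimators=30, random_state=42),
    }
    scores = {}
    for name, model in candidates.items():
        neg_mse = cross_val_score(make_pipeline_for(model), X, y,
                                  scoring="neg_mean_squared_error", cv=cv)
        scores[name] = float(np.sqrt(-neg_mse).mean())
    return scores


def fine_tune_forest(X, y, cv=3):
    """Grid-search a few random forest hyperparameters (illustrative grid)."""
    grid = {
        "model__n_estimators": [30, 60],
        "model__max_depth": [None, 10],
    }
    search = GridSearchCV(
        make_pipeline_for(RandomForestRegressor(random_state=42)),
        grid, scoring="neg_mean_squared_error", cv=cv)
    search.fit(X, y)
    best_rmse = float(np.sqrt(-search.best_score_))
    return search.best_params_, best_rmse
```

Calling compare_models(X, y) gives one RMSE per candidate to report for objective 2, and fine_tune_forest(X, y) returns the best hyperparameter combination together with its cross-validated RMSE for objective 4. Final predictions should be scored on a held-out test set that was not used during tuning.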

filePath = '/cxldata/datasets/project/bikes.csv'
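
A minimal sketch of loading the file, assuming pandas is available and that the CSV has a header row with the column names listed below. The helper name load_bikes is hypothetical.

```python
import pandas as pd

filePath = '/cxldata/datasets/project/bikes.csv'


def load_bikes(path=filePath):
    """Read the bike rental CSV into a DataFrame, parsing 'dteday' as a date."""
    bikes = pd.read_csv(path)
    if "dteday" in bikes.columns:
        bikes["dteday"] = pd.to_datetime(bikes["dteday"])
    return bikes
```

After bikes = load_bikes(), calling bikes.info() and bikes.describe() is a quick way to check column types and value ranges before modelling.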

The dataset contains the following parameters:

  1. instant: record index
  2. dteday : date
  3. season : season (1:spring, 2:summer, 3:fall, 4:winter)
  4. yr : year (0: 2011, 1:2012)
  5. mnth : month ( 1 to 12)
  6. hr : hour (0 to 23)
  7. holiday : whether the day is a holiday or not (extracted from [Web Link])
  8. weekday : day of the week (0 to 6; 0 - Sunday, 6 - Saturday)
  9. workingday : 1 if the day is neither a weekend nor a holiday, otherwise 0.
  10. weathersit : weather situation
        1: Clear, Few clouds, Partly cloudy
        2: Mist + Cloudy, Mist + Broken clouds, Mist + Few clouds, Mist
        3: Light Snow, Light Rain + Thunderstorm + Scattered clouds, Light Rain + Scattered clouds
        4: Heavy Rain + Ice Pellets + Thunderstorm + Mist, Snow + Fog
  11. temp : Normalized temperature in Celsius. The values are derived via (t - t_min)/(t_max - t_min), with t_min = -8 and t_max = +39 (only in hourly scale)
  12. atemp: Normalized feeling temperature in Celsius. The values are derived via (t - t_min)/(t_max - t_min), with t_min = -16 and t_max = +50 (only in hourly scale)
  13. hum: Normalized humidity. The values are divided by 100 (max)
  14. windspeed: Normalized wind speed. The values are divided by 67 (max)
  15. casual: count of casual users
  16. registered: count of registered users
  17. cnt: count of total rental bikes including both casual and registered

Note: the "target" data set ('y') should have only one label, i.e. 'cnt'.
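
The temperature normalization and the single-label target described above can be sketched as follows. This assumes the data is already in a pandas DataFrame with the column names listed above. Dropping 'instant' and 'dteday' (a record index and a raw date) and 'casual' and 'registered' (which add up to 'cnt' and would leak the target) is an assumption, not a requirement stated in the project. The helper names are illustrative.

```python
import pandas as pd


def to_celsius(normalized, t_min, t_max):
    """Invert (t - t_min) / (t_max - t_min) to recover degrees Celsius."""
    return normalized * (t_max - t_min) + t_min


def split_features_target(bikes, target="cnt"):
    """Return (X, y) where y holds only the 'cnt' label."""
    # Assumed exclusions: identifiers plus the two counts that sum to the target.
    drop_cols = ["instant", "dteday", "casual", "registered", target]
    X = bikes.drop(columns=[c for c in drop_cols if c in bikes.columns])
    y = bikes[target]
    return X, y
```

For example, to_celsius(bikes["temp"], t_min=-8, t_max=39) recovers the temperature in Celsius, and to_celsius(bikes["atemp"], t_min=-16, t_max=50) does the same for the feeling temperature.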

Acknowledgements

CloudxLab uses this “Bike Sharing Demand” problem so that its machine learning learners can learn and practice. The dataset was provided by Hadi Fanaee Tork using data from Capital Bikeshare. We also thank the UCI Machine Learning Repository for hosting the dataset. Fanaee-T, Hadi, and Gama, Joao, "Event labeling combining ensemble detectors and background knowledge", Progress in Artificial Intelligence (2013): pp. 1-15, Springer Berlin Heidelberg.


