
Now that we have cleaned the `bikesData` data set, let us split it into training and test data sets in a 70:30 ratio using scikit-learn's `train_test_split()` function.

Because `train_test_split()` uses random sampling, the rows of the resulting `train_set` and `test_set` data sets are no longer in time order, so both have to be sorted by `dayCount`.
Random sampling may not be the best way to split this data. What other sampling methods can you think of?
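One possible answer, sketched here under assumptions rather than given as the expected solution, is a chronological split: keep the earliest 70% of rows for training and the latest 30% for testing, which avoids training on observations that come after the test period. Stratified sampling on a binned feature is another common option. The `toy` frame and its `cnt` values below are hypothetical stand-ins for `bikesData`.

```python
import pandas as pd

# Hypothetical stand-in for bikesData: 10 hourly rows with made-up counts.
toy = pd.DataFrame({"cnt": [16, 40, 32, 13, 1, 1, 2, 3, 8, 14]})
toy["dayCount"] = pd.Series(range(toy.shape[0])) / 24

# Chronological split: the first 70% of rows train, the remaining 30% test.
# Integer arithmetic avoids floating-point rounding in the cut position.
split_at = len(toy) * 7 // 10
chrono_train = toy.iloc[:split_at]
chrono_test = toy.iloc[split_at:]

print(len(chrono_train), len(chrono_test))
```

With this split no sorting step is needed, because the rows never leave their original order.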

We will also define a utility function named `display_scores`. This function prints basic statistics (the scores themselves, their mean, and their standard deviation) for the scores obtained from cross-validating a model. Please copy this function into your code; we will use it often in this project.

Set the NumPy random seed to 42 using the code below so that the results of the exercise are repeatable.

`np.random.seed(42)`
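As a small illustration of why the seed matters: re-seeding NumPy's global generator replays the same random sequence. To my understanding, `train_test_split()` falls back on this global generator when no `random_state` argument is passed, so seeding it first should make the split repeatable. The variable names below are illustrative only.

```python
import numpy as np

# Seed, then draw a random permutation of 0..9.
np.random.seed(42)
first = np.random.permutation(10)

# Re-seed with the same value and draw again.
np.random.seed(42)
second = np.random.permutation(10)

# The two draws are identical because the generator was reset.
print(np.array_equal(first, second))
```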

Import the `train_test_split` function from scikit-learn's `model_selection` module.

Add a new feature (column) `dayCount` to the `bikesData` data set using the code below:

`bikesData['dayCount'] = pd.Series(range(bikesData.shape[0]))/24`
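To see what this column holds, here is a minimal sketch on a toy frame, assuming (as the division by 24 suggests) one row per hour and a default integer index. The `toy` frame and its `hr` column are hypothetical stand-ins for `bikesData`.

```python
import pandas as pd

# Hypothetical stand-in for bikesData: two days of hourly rows.
toy = pd.DataFrame({"hr": list(range(24)) * 2})

# Row position divided by 24 gives elapsed days since the first row.
toy["dayCount"] = pd.Series(range(toy.shape[0])) / 24

# Inspect the first row, midday of day one, and the start of day two.
print(toy["dayCount"].iloc[[0, 12, 24]].tolist())
```

Each row's `dayCount` is its elapsed time in days, so the value grows by one every 24 rows and can later be used to restore time order.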

Split the `bikesData` data set into a training set `train_set` and a test set `test_set` in a 70:30 ratio using scikit-learn's `train_test_split()` function.

Sort the `train_set` and `test_set` values by `dayCount` using the code below:

`train_set.sort_values('dayCount', axis=0, inplace=True)`

`test_set.sort_values('dayCount', axis=0, inplace=True)`
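The split and sort steps can be sketched end to end as follows. This is one way to do it under assumptions, not the exercise's reference answer: the `toy` frame is a hypothetical stand-in for `bikesData`, and the sort is written with reassignment instead of `inplace=True`, which gives the same ordering and sidesteps pandas copy warnings.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

np.random.seed(42)  # make the random split repeatable

# Hypothetical stand-in for bikesData: 100 hourly rows.
toy = pd.DataFrame({"cnt": np.arange(100)})
toy["dayCount"] = pd.Series(range(toy.shape[0])) / 24

# test_size=0.3 holds out 30% of the rows, giving a 70:30 split.
train_set, test_set = train_test_split(toy, test_size=0.3)

# Random sampling shuffles the rows, so restore time order in each set.
train_set = train_set.sort_values("dayCount", axis=0)
test_set = test_set.sort_values("dayCount", axis=0)

# Number of instances in each set.
print(len(train_set), "train +", len(test_set), "test")
```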

Now print the number of instances in the `train_set` and `test_set` data sets.

Finally, create the function `display_scores` as shown below:

    def display_scores(scores):
        print("Scores:", scores)
        print("Mean:", scores.mean())
        print("Standard deviation:", scores.std())
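A hedged usage sketch for `display_scores`: the function expects a NumPy array of scores, such as the one returned by scikit-learn's `cross_val_score()`. The synthetic data, the choice of `LinearRegression`, and the RMSE conversion below are assumptions made for illustration and are not taken from this project.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

def display_scores(scores):
    print("Scores:", scores)
    print("Mean:", scores.mean())
    print("Standard deviation:", scores.std())

# Synthetic regression data (illustrative only, not the bikes data).
rng = np.random.RandomState(42)
X = rng.rand(100, 3)
y = X @ np.array([1.0, 2.0, 3.0]) + rng.normal(scale=0.1, size=100)

# 5-fold cross-validation; scikit-learn reports negated MSE for this scorer.
neg_mse = cross_val_score(LinearRegression(), X, y,
                          scoring="neg_mean_squared_error", cv=5)

# Convert to RMSE per fold, then summarise.
rmse_scores = np.sqrt(-neg_mse)
display_scores(rmse_scores)
```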
