Now that we have cleaned the bikesData data set, let us split it into Training and Test data sets in a 70:30 ratio using scikit-learn's train_test_split() function.
train_test_split() uses 'Random Sampling', so the resulting train_set and test_set data sets have to be sorted by dayCount.
Random Sampling may not be the best way to split this data. What other sampling methods can you think of?
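One possible answer, offered only as a sketch and not as the expected solution: when rows are recorded in time order, a chronological split keeps the earliest 70% of rows for training and the latest 30% for testing, so the model is always evaluated on data that comes after what it was trained on. The `demo` frame and its `cnt` column below are hypothetical stand-ins, not the real bikesData.

```python
import pandas as pd

# Hypothetical stand-in for a data set whose rows are already in time order.
demo = pd.DataFrame({"cnt": range(100)})

# Integer arithmetic avoids any floating-point rounding in the cut-off index.
split_at = len(demo) * 7 // 10

# Earliest 70% of rows for training, latest 30% for testing.
train_part = demo.iloc[:split_at]
test_part = demo.iloc[split_at:]

print(len(train_part), len(test_part))
```

Unlike random sampling, this split needs no sorting afterwards, because the original row order is preserved in both parts.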
Import the train_test_split() function from scikit-learn.
Please add a new feature (column) dayCount to the bikesData data set using the code below:
bikesData['dayCount'] = pd.Series(range(bikesData.shape[0]))/24
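To see what this feature contains, here is a minimal sketch on a hypothetical 48-row frame (`demo`, standing in for bikesData with one row per hour). It assumes the frame has the default 0..n-1 index, since pandas aligns the new Series to the frame's index when assigning.

```python
import pandas as pd

# Hypothetical stand-in for bikesData: 48 hourly rows (two days of data).
demo = pd.DataFrame({"hr": list(range(24)) * 2})

# shape is a (rows, columns) tuple, so shape[0] is the number of rows.
demo["dayCount"] = pd.Series(range(demo.shape[0])) / 24

# dayCount is the row number expressed in days: 0.0 for the first hour,
# 1.0 exactly 24 rows later.
first_day_count = demo["dayCount"].iloc[0]
start_of_second_day = demo["dayCount"].iloc[24]
print(first_day_count, start_of_second_day)
```

Because dayCount grows steadily with the row number, it can later serve as a sort key that restores time order after a random split.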
Split the bikesData data set into Training and Test data sets (train_set and test_set) in a 70:30 ratio using scikit-learn's train_test_split() function.
Sort the train_set and test_set values by dayCount using the code below:
train_set.sort_values('dayCount', axis= 0, inplace=True)
test_set.sort_values('dayCount', axis= 0, inplace=True)
Now print the 'number of instances' for the train_set and test_set data sets.
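The steps above can be sketched end to end as follows. This is an illustration under stated assumptions, not the graded solution: `demo` is a hypothetical 100-row stand-in for the cleaned bikesData, and `random_state=42` is an assumed seed chosen only to make the random split repeatable.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical stand-in for the cleaned bikesData: 100 hourly rows.
demo = pd.DataFrame({"cnt": range(100)})
demo["dayCount"] = pd.Series(range(demo.shape[0])) / 24

# test_size=0.3 gives the 70:30 ratio; random_state is an assumed seed.
train_set, test_set = train_test_split(demo, test_size=0.3, random_state=42)

# Reassigning the sorted result (instead of inplace=True) sidesteps the
# copy-related warning that some pandas versions may raise here.
train_set = train_set.sort_values("dayCount", axis=0)
test_set = test_set.sort_values("dayCount", axis=0)

# Number of instances in each set.
print(len(train_set), "train,", len(test_set), "test")
```

With the real data, replace `demo` with bikesData; the two counts should add up to the total number of rows.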