Problems on Data Cleaning and Processing For Machine Learning


Operations on time-series data

We have time-series data kept in the file /cxldata/pet_mle/time_series_data.csv

This is stock price data recorded over a period of time. It is stored chronologically: the first record is older than the second, the second is older than the third, and so on. Do not shuffle it.

We want to create a dataset on which we can train a model to predict the stock price given the previous 5 values. So, we have to convert the series into a dataset in which each row of features holds 5 consecutive values and the label is the value that immediately follows them.

If our input dataset is: t1, t2, t3, t4, t5, t6, t7, t8, t9, t10

Our expected X is:

[
    [t1, t2, t3, t4, t5] ,
    [t2, t3, t4, t5, t6] ,
    [t3, t4, t5, t6, t7] ,
    [t4, t5, t6, t7, t8] ,
    [t5, t6, t7, t8, t9] ,
]

Our expected y is: [t6, t7, t8, t9, t10]
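The windowing described above can be sketched with plain NumPy. This is a minimal illustration, not the reference answer: the helper name `make_windows` and its `n_lags` parameter are hypothetical, and the integers 1 to 10 stand in for t1 to t10. Each window needs a following value to serve as its label, so a series of length n yields n - 5 rows.

```python
import numpy as np

def make_windows(values, n_lags=5):
    """Split a 1-D chronological series into (X, y).

    Each row of X holds n_lags consecutive values; the matching
    entry of y is the value that immediately follows that row.
    """
    values = np.asarray(values)
    n_rows = len(values) - n_lags
    if n_rows <= 0:
        raise ValueError("series must be longer than n_lags")
    # Slice in order; nothing is shuffled or reordered.
    X = np.array([values[i:i + n_lags] for i in range(n_rows)])
    y = values[n_lags:]
    return X, y

series = np.arange(1, 11)  # stands in for t1..t10
X, y = make_windows(series)
print(X.shape, y.shape)
print(X[0], y[0])
```

The first row of `X` is paired with the first entry of `y`, the second row with the second entry, and so on, so the chronological order of the input is preserved.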

INSTRUCTIONS
  1. Note that this is time-series data, so it is very important to preserve the order of data points. Hence don't attempt to shuffle or manipulate the order of the data.
  2. Please import pandas as pd and import numpy as np
  3. Create NumPy arrays X and y as described above.
  4. X should be a two-dimensional and y a one-dimensional NumPy array.
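One possible way to follow these steps is sketched below. The layout of the real CSV is not shown in this exercise, so the column name `price` and the inline sample data are assumptions made purely for illustration; with the real file, the `pd.read_csv` call would take the path given above and the column name would need to match the file. The sketch uses `sliding_window_view`, which requires NumPy 1.20 or newer.

```python
import io

import numpy as np
import pandas as pd
from numpy.lib.stride_tricks import sliding_window_view

# Hypothetical stand-in for the real file; its actual columns are not known here.
csv_text = "date,price\n" + "\n".join(
    f"2020-01-{d:02d},{100 + d}" for d in range(1, 11)
)
# With the real data: pd.read_csv("/cxldata/pet_mle/time_series_data.csv")
df = pd.read_csv(io.StringIO(csv_text))

prices = df["price"].to_numpy()  # "price" is an assumed column name

# Each window has 6 values in file order: 5 features followed by 1 label.
windows = sliding_window_view(prices, window_shape=6)
X = windows[:, :5].copy()  # copy() turns the read-only views into ordinary arrays
y = windows[:, 5].copy()

# Checks matching the instructions: X is 2-D, y is 1-D, one label per row.
assert X.ndim == 2 and y.ndim == 1
assert len(X) == len(y) == len(prices) - 5
```

Because the windows are taken directly from the array in file order, no shuffling or sorting step is involved at any point.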