Project - Introduction to Transfer Learning (Cat vs Non-cats Project)

Cats vs Non-cats using Transfer Learning - Loading the data

About the Dataset

The dataset is located at /cxldata/datasets/project/cat-non-cat

The dataset is stored in .h5 (HDF5) files, a file format that stores data, along with its metadata, in a hierarchy. Import h5py to interact with a dataset that is stored in an H5 file. The dataset contains:

  • train_catvnoncat.h5 - a training set of images labeled as cat (y=1) or non-cat (y=0)
  • test_catvnoncat.h5 - a test set of images labeled as cat or non-cat
  • Each image is of shape (num_px, num_px, 3), where 3 is the number of channels (RGB). Thus, each image is square (height = num_px, width = num_px).

Now, let us load the dataset into our working session.

  • Load the data from /cxldata/datasets/project/cat-non-cat. To do that, first open the HDF5 files of the train and test sets using the h5py.File function.

    train_dataset = h5py.File('/cxldata/datasets/project/cat-non-cat/train_catvnoncat.h5', "r")
    test_dataset = << your code comes here >>('/cxldata/datasets/project/cat-non-cat/test_catvnoncat.h5', "r")
    print("File format of train_dataset:",train_dataset)
    print("File format of test_dataset:",<< your code comes here >>)
  • The train_dataset and test_dataset are HDF5 file objects that store the data in a hierarchical format. Let us access the data and store it in the form of NumPy arrays as follows:

    train_set_x_orig = np.array(train_dataset["train_set_x"][:]) # train set features
    train_set_y_orig = np.array(train_dataset["train_set_y"][:]) # train set labels
    test_set_x_orig = np.array(<< your code comes here >>["test_set_x"][:]) # test set features
    test_set_y_orig = np.array(<< your code comes here >>["test_set_y"][:]) # test set labels
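
To see the same pattern end to end without depending on the course files, here is a minimal sketch that writes a tiny synthetic HDF5 file and reads it back. The file name demo_catvnoncat.h5, the sizes (4 images of 8 x 8 pixels), and the variable names demo_x and demo_y are assumptions made up for illustration; only the key names train_set_x and train_set_y come from the steps above.

```python
import os
import tempfile

import h5py
import numpy as np

# Synthetic stand-in for train_catvnoncat.h5 (sizes are made up for illustration).
num_images, num_px = 4, 8
rng = np.random.default_rng(0)
demo_path = os.path.join(tempfile.mkdtemp(), "demo_catvnoncat.h5")

# Write random uint8 "images" and binary labels under the same key names.
with h5py.File(demo_path, "w") as f:
    f.create_dataset(
        "train_set_x",
        data=rng.integers(0, 256, size=(num_images, num_px, num_px, 3), dtype=np.uint8),
    )
    f.create_dataset("train_set_y", data=np.array([1, 0, 0, 1], dtype=np.int64))

# Reopen read-only, list the top-level keys, and copy each dataset into memory.
with h5py.File(demo_path, "r") as f:
    keys = sorted(f.keys())
    demo_x = np.array(f["train_set_x"][:])  # features
    demo_y = np.array(f["train_set_y"][:])  # labels

print("Keys:", keys)
print("Features shape:", demo_x.shape)  # (num_images, num_px, num_px, 3)
print("Labels shape:", demo_y.shape)    # (num_images,)
```

Listing the keys of a file object is a quick way to discover which datasets it holds before indexing into it. Because the [:] slice copies the data into NumPy arrays, demo_x and demo_y remain usable after the with block closes the file.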