Project - Introduction to Neural Style Transfer using Deep Learning & TensorFlow 2 (Art Generation Project)

Load the Images

Let us first load and visualize the content and style images we want to work with.

We shall do that by:

  • defining the function load_img to load an image and limit its maximum dimension to 512 pixels.

  • creating a simple function imshow to display an image.

We shall create the function load_img in this slide, and create the imshow function in the next slide.

Note:

  • tf.io.read_file(path_to_img) reads the entire contents of the file at the given path.

    It returns a scalar Tensor of type string which holds the raw, still-encoded bytes of the image file.

  • tf.image.decode_image(img, channels=3) detects whether an image is a BMP, GIF, JPEG, or PNG, and applies the matching decoder to convert the input byte string into a Tensor of type uint8 with 3 color channels. For uint8, the minimum value is 0 and the maximum value is 255.

  • tf.image.convert_image_dtype(img, tf.float32) converts the image tensor to the specified dtype, scaling the values appropriately before casting.

    Images that are represented using floating-point values are expected to have values in the range [0,1).

    Image data stored in integer data types are expected to have values in the range [0,MAX], where MAX is the largest positive representable number for the data type.

    So a uint8 image with values in [0,255] becomes a float32 image whose values are scaled down to lie between 0 and 1.

  • tf.shape(input_tensor) returns the shape of a tensor.

  • tf.cast casts a tensor to a new type.

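  • tf.cast(tf.shape(img)[:-1], tf.float32) takes every dimension of img except the last one (that is, the height and width, leaving out the channels) and casts these values to the float32 data type.

  • tf.image.resize resizes an image to the specified size, given as (height, width).

  • tf.newaxis adds a new dimension of size 1 to a tensor. Here, img[tf.newaxis, :] adds a leading batch dimension, turning an image of shape (height, width, channels) into one of shape (1, height, width, channels).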
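
The effect of these operations can be checked on a tiny tensor before working with real files. The following is only a minimal sketch: the 2x3 synthetic image and the variable names img_uint8, img_float, hw, resized and batched are made up for illustration and are not part of the course code.

```python
import tensorflow as tf

# Synthetic 2x3 RGB image in uint8 (values 0..255); it stands in for a decoded file.
img_uint8 = tf.constant([[[0, 128, 255]] * 3] * 2, dtype=tf.uint8)

# uint8 -> float32 rescales the values from [0, 255] into [0, 1].
img_float = tf.image.convert_image_dtype(img_uint8, tf.float32)

# Height and width only (the channel dimension is dropped), as floats.
hw = tf.cast(tf.shape(img_float)[:-1], tf.float32)

# Resize to twice the height and width; the channel count is unchanged.
resized = tf.image.resize(img_float, tf.cast(hw * 2, tf.int32))

# Add a leading batch dimension of size 1.
batched = resized[tf.newaxis, :]

print(img_uint8.dtype, img_float.dtype)
print(hw.numpy())
print(batched.shape)
```

The printed shape should have four dimensions (batch, height, width, channels), which is the layout load_img returns.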

INSTRUCTIONS
  • Store the content-image path "/cxldata/dlcourse/dog.jpg" in the variable content_path.

    << your code comes here >> = "/cxldata/dlcourse/dog.jpg"
    
  • Store the style-image path "/cxldata/dlcourse/moon.jpg" in the variable style_path.

    << your code comes here >> = "/cxldata/dlcourse/moon.jpg"
    
  • We shall now define the function load_img as follows:

    • Set max_dim to 512, the target size in pixels for the longer side of the image.

    • Read the image file from the given path using tf.io.read_file and decode it using tf.image.decode_image.

    • Convert the image pixels to float32 using tf.image.convert_image_dtype.

    • Get the longer dimension long_dim from the shape (height and width) of the input image. Dividing max_dim by long_dim gives scale, the factor by which the image is resized.

    • Multiply the shape by scale, cast the result to integers, and pass it to tf.image.resize to resize the image.

    • Add a leading batch dimension to the resized image using tf.newaxis.

    The above steps are implemented in the code below. Use it to load the image.

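    import tensorflow as tf

    def load_img(path_to_img):
        # Target size in pixels for the longer side of the image
        max_dim = 512

        # Read the raw file bytes and decode them into a uint8 tensor with 3 channels
        img = tf.io.read_file(path_to_img)
        img = tf.image.decode_image(img, channels=3)

        # Convert to float32, scaling the pixel values into the range [0, 1]
        img = tf.image.convert_image_dtype(img, tf.float32)

        # Height and width (all dimensions except the channels) as floats
        shape = tf.cast(tf.shape(img)[:-1], tf.float32)

        # Scale factor that maps the longer side to max_dim
        long_dim = max(shape)
        scale = max_dim / long_dim

        new_shape = tf.cast(shape * scale, tf.int32)

        # Resize, then add a leading batch dimension
        img = tf.image.resize(img, new_shape)
        img = img[tf.newaxis, :]

        return img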
    
  • Now call the load_img function to load the content and style images, passing content_path and style_path respectively as the input argument.

    content_image = << your code comes here >>(content_path)
    style_image = << your code comes here >>(style_path)
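
To get a feel for the resizing arithmetic inside load_img, the scale computation can be reproduced in plain Python without TensorFlow. This is only an illustrative sketch: scaled_shape is a hypothetical helper, not part of the course code, and it uses Python floats where load_img uses float32 tensors, so results for sizes that do not divide evenly might differ by a pixel.

```python
def scaled_shape(height, width, max_dim=512):
    """Return (new_height, new_width) with the longer side scaled to max_dim."""
    long_dim = max(height, width)
    scale = max_dim / long_dim
    # int() truncates toward zero, as tf.cast(..., tf.int32) does for positive values
    return int(height * scale), int(width * scale)

print(scaled_shape(1024, 768))   # (512, 384): large image is shrunk
print(scaled_shape(2048, 512))   # (512, 128): aspect ratio is kept
print(scaled_shape(256, 128))    # (512, 256): small image is enlarged
```

Note from the last case that, with this arithmetic, an image whose longer side is below 512 pixels is scaled up, so the longer side ends up at max_dim either way.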
    
