Let us first load and visualize the content and style images we want to work with.
We shall do that by:
defining the function load_img to load an image and limit its maximum dimension to 512 pixels;
creating a simple function imshow to display an image.
We shall create the function load_img in this slide, and the imshow function in the next slide.
Note:
tf.io.read_file(path_to_img) reads the entire contents of the given file and returns a Tensor of type string that holds the file's raw, still-encoded bytes.
tf.image.decode_image(img, channels=3) detects whether an image is a BMP, GIF, JPEG, or PNG and applies the matching decoder to convert the byte string into a Tensor of type uint8. For uint8, the minimum value is 0 and the maximum value is 255.
tf.image.convert_image_dtype(img, tf.float32) converts the image tensor to the specified dtype, scaling its values appropriately before casting. Images represented with floating-point values are expected to have values in the range [0,1). Image data stored in integer data types is expected to have values in the range [0,MAX], where MAX is the largest positive representable number for the data type.
tf.shape(input_tensor) returns the shape of a tensor.
tf.cast casts a tensor to a new type.
tf.cast(tf.shape(img)[:-1], tf.float32) takes the shape of img, drops its last entry (the channel count), and casts the remaining height and width to float32.
tf.image.resize resizes an image to a specified size.
tf.newaxis, used when indexing a tensor, inserts a new axis of length 1. Here it adds a leading batch dimension to the image.
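The value scaling and shape handling described in the note can be followed by hand. The sketch below is plain Python rather than TensorFlow, and it is only an illustration of the rules stated above, not TensorFlow's implementation. The helper name uint8_to_unit_float and the 422 x 512 image size are hypothetical.

```python
def uint8_to_unit_float(value):
    """Scale an 8-bit pixel value by dividing by the uint8 maximum (255)."""
    uint8_max = 255
    return value / uint8_max

# Pixel values before and after scaling: 0 stays 0.0 and 255 becomes 1.0.
for v in (0, 128, 255):
    print(v, "->", uint8_to_unit_float(v))

# Shape bookkeeping for a hypothetical 422 x 512 RGB image.
shape = (422, 512, 3)      # (height, width, channels)
spatial = shape[:-1]       # drop channels, as tf.shape(img)[:-1] does
batched = (1,) + shape     # leading axis of length 1, as img[tf.newaxis, :] adds
print(spatial)             # (422, 512)
print(batched)             # (1, 422, 512, 3)
```

The leading axis of length 1 is the batch dimension that models typically expect, even when only one image is passed.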
Store the content-image path "/cxldata/dlcourse/dog.jpg" in the variable content_path.
<< your code comes here >> = "/cxldata/dlcourse/dog.jpg"
Store the style-image path "/cxldata/dlcourse/moon.jpg" in the variable style_path.
<< your code comes here >> = "/cxldata/dlcourse/moon.jpg"
We shall now define the function load_img as follows:
Set max_dim to 512, the target length of the image's longer side.
Read the file at the given path using tf.io.read_file and decode it using tf.image.decode_image.
Convert the image pixels to float32 using tf.image.convert_image_dtype.
Get the longer side long_dim from the shape of the input image. Divide max_dim by long_dim to get scale, the factor by which the image is resized.
Multiply the shape by scale, cast the result to integers, and pass it to tf.image.resize to resize the image.
Add a leading batch dimension using tf.newaxis.
The steps above are implemented in Python as follows. Use the code below to load the image.
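import tensorflow as tf  # harmless if already imported earlier in the notebook

def load_img(path_to_img):
    max_dim = 512  # target length of the longer side, in pixels
    # Read the raw file bytes and decode them into a uint8 image tensor.
    img = tf.io.read_file(path_to_img)
    img = tf.image.decode_image(img, channels=3)
    # Convert the pixel values to float32, scaling them as needed.
    img = tf.image.convert_image_dtype(img, tf.float32)
    # Height and width as floats (the channel dimension is dropped).
    shape = tf.cast(tf.shape(img)[:-1], tf.float32)
    long_dim = max(shape)
    scale = max_dim / long_dim
    # New height and width, preserving the aspect ratio.
    new_shape = tf.cast(shape * scale, tf.int32)
    img = tf.image.resize(img, new_shape)
    # Add a leading batch dimension.
    img = img[tf.newaxis, :]
    return img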
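The resizing arithmetic in load_img can be checked in isolation. The following is a minimal plain-Python sketch of the same calculation, assuming that casting a positive float to an integer truncates it. The function name scaled_size and the example image sizes are hypothetical.

```python
def scaled_size(height, width, max_dim=512):
    """Return (new_height, new_width) so that the longer side equals max_dim."""
    long_dim = max(height, width)
    scale = max_dim / long_dim
    # int() truncates toward zero, which is what this sketch assumes for the cast.
    return int(height * scale), int(width * scale)

# A 768 x 1024 image is halved: scale = 512 / 1024 = 0.5.
print(scaled_size(768, 1024))   # (384, 512)

# A 256 x 128 image is doubled: scale = 512 / 256 = 2.0.
print(scaled_size(256, 128))    # (512, 256)
```

Because the scale is always max_dim divided by the longer side, images smaller than 512 pixels are enlarged as well as larger ones being reduced. The aspect ratio is preserved in both cases.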
Now call the load_img function to load the content and style images, passing the path of each image as the input argument.
content_image = << your code comes here >>(content_path)
style_image = << your code comes here >>(style_path)