Computers can only process numbers, not words, so we need to convert the words in
truncated_vocabulary into numbers.
To do this, we add a preprocessing step that replaces each word with its ID (i.e., its index in the
truncated_vocabulary). We will create a lookup table for this, using 1,000 out-of-vocabulary (oov) buckets.
We shall create the lookup table such that the most frequently occurring words have lower indices than less frequently occurring words.
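To illustrate the frequency ordering, here is a minimal sketch in plain Python. This step does not show how truncated_vocabulary was actually built, so the toy reviews list, the Counter-based counting, and the word_to_id dictionary are assumptions made for illustration only.

```python
from collections import Counter

# Toy corpus (hypothetical); the real course data is a set of movie reviews.
reviews = ["the movie was great", "the movie was bad", "the movie", "the"]

# Count how often each word occurs across all reviews.
counts = Counter(word for review in reviews for word in review.split())

# Keep only the most frequent words, most frequent first.
vocab_size = 3
truncated_vocabulary = [word for word, _ in counts.most_common(vocab_size)]

# A word's ID is its index, so frequent words get the lowest IDs.
word_to_id = {word: i for i, word in enumerate(truncated_vocabulary)}
print(truncated_vocabulary, word_to_id)
```

Because the list is sorted by frequency before indexing, the most common word receives ID 0, the next receives ID 1, and so on.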
tf.lookup.KeyValueTensorInitializer: Table initializer given keys and values tensors.
tf.lookup.StaticVocabularyTable: String to ID table wrapper that assigns out-of-vocabulary keys to buckets.
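Out-of-vocabulary terms are mapped as <other term> -> bucket_id, where bucket_id will be between vocab_size and vocab_size + num_oov_buckets - 1 (between 3 and 3 + num_oov_buckets - 1 for a 3-word vocabulary), calculated by:
hash(<term>) % num_oov_buckets + vocab_size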
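The bucket assignment for out-of-vocabulary words can be sketched in plain Python: hash the term, take the result modulo num_oov_buckets, then add vocab_size. The helper name oov_bucket_id and the MD5-based hash are assumptions for illustration; TensorFlow uses its own internal hash, so the bucket it picks for a given word will differ from the one computed here.

```python
import hashlib


def oov_bucket_id(term: str, vocab_size: int, num_oov_buckets: int) -> int:
    """Map an out-of-vocabulary term to an ID just past the vocabulary range."""
    # Stand-in hash (assumption): take the first 8 bytes of an MD5 digest.
    # This is NOT the hash that tf.lookup.StaticVocabularyTable uses.
    digest = hashlib.md5(term.encode("utf-8")).digest()
    term_hash = int.from_bytes(digest[:8], "big")
    # The modulo picks one of the buckets; the offset shifts the ID so it
    # cannot collide with the IDs of in-vocabulary words.
    return term_hash % num_oov_buckets + vocab_size


vocab_size = 10_000
num_oov_buckets = 1_000
bucket = oov_bucket_id("faaaaaantastic", vocab_size, num_oov_buckets)
print(bucket)
```

Any unknown word therefore lands in the range from vocab_size to vocab_size + num_oov_buckets - 1, and the same word always lands in the same bucket.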
table.lookup : Looks up keys in the table, outputs the corresponding values.
Create a tensor words containing the words of truncated_vocabulary:
<< your code comes here >> = tf.constant(truncated_vocabulary)
Create a tensor word_ids using the corresponding indices of words in truncated_vocabulary:
word_ids = tf.range(len(truncated_vocabulary), dtype=tf.int64)
Create the table initializer vocab_init using tf.lookup.KeyValueTensorInitializer, given the keys (here words) and the values (here word_ids):
vocab_init = << your code comes here >>(words, word_ids)
Set num_oov_buckets = 1000 and create the lookup table table using tf.lookup.StaticVocabularyTable. Observe that we pass vocab_init and num_oov_buckets as input arguments:
num_oov_buckets = 1000
table = << your code comes here >>(vocab_init, num_oov_buckets)
Let's use the above table to look up the IDs of a few words:
table.lookup(tf.constant([b"This movie was faaaaaantastic".split()]))
Note: The words “this,” “movie,” and “was” were found in the table, so their IDs are lower than 10,000, while the word “faaaaaantastic” was not found, so it was mapped to one of the oov buckets, with an ID greater than or equal to 10,000.
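The same behavior can be reproduced on a small scale. The following is a minimal sketch, assuming TensorFlow 2.x is installed and using a hypothetical four-word vocabulary in place of the real truncated_vocabulary, so the IDs differ from those in the course data.

```python
import tensorflow as tf

# Hypothetical stand-in for the real 10,000-word truncated_vocabulary.
truncated_vocabulary = ["the", "movie", "was", "this"]

# Keys are the words; values are their indices in the vocabulary.
words = tf.constant(truncated_vocabulary)
word_ids = tf.range(len(truncated_vocabulary), dtype=tf.int64)
vocab_init = tf.lookup.KeyValueTensorInitializer(words, word_ids)

# Unknown words are hashed into one of 1,000 extra buckets.
num_oov_buckets = 1000
table = tf.lookup.StaticVocabularyTable(vocab_init, num_oov_buckets)

# Look up three known words and one unknown word.
ids = table.lookup(tf.constant([b"this movie was faaaaaantastic".split()]))
ids_list = ids.numpy().tolist()
print(ids_list)
```

With this toy vocabulary, the three known words map to their indices (3, 1, and 2), while the unknown word maps to an ID of at least 4, the vocabulary size.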