
So far we have only dealt with numerical attributes, but now let's look at text attributes. In this dataset there is just one: the `ocean_proximity` attribute. A Machine Learning model cannot work with categorical values directly, so we will turn this attribute into numerical values using *one-hot encoding*.

*One-hot encoding* creates one binary attribute per category: one attribute equal to `1` when the category is `<1H OCEAN` (and `0` otherwise), another attribute equal to `1` when the category is `INLAND` (and `0` otherwise), and so on.
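To make the idea concrete, here is a minimal plain-Python sketch of one-hot encoding (the category list below is a hypothetical subset of the `ocean_proximity` values):

```python
# Hypothetical subset of ocean_proximity categories
categories = ["<1H OCEAN", "INLAND", "NEAR OCEAN"]

def one_hot(value, categories):
    """Return a list with a 1 at the position of `value` and 0s elsewhere."""
    return [1 if value == c else 0 for c in categories]

print(one_hot("INLAND", categories))     # → [0, 1, 0]
print(one_hot("<1H OCEAN", categories))  # → [1, 0, 0]
```

Each category gets its own column, and exactly one column is `1` per row — which is why the result is mostly zeros.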

Notice that the output is a `SciPy` sparse matrix instead of a `NumPy` array. This is very useful when a categorical attribute has thousands of categories: after one-hot encoding we get a matrix with thousands of columns, full of `0`s except for a single `1` per row. Using up tons of memory mostly to store zeros would be very wasteful, so a sparse matrix stores only the locations of the nonzero elements.
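As a rough sketch of why sparse storage helps, consider keeping only the `(row, column)` locations of the `1`s instead of the full matrix (a plain-Python illustration of the idea, not SciPy's actual internal format):

```python
# Dense one-hot matrix: a single 1 per row, everything else 0
dense = [
    [0, 1, 0],
    [1, 0, 0],
    [0, 1, 0],
]

# Sparse representation: store only where the 1s are
sparse = {(i, row.index(1)) for i, row in enumerate(dense)}
print(sorted(sparse))  # → [(0, 1), (1, 0), (2, 1)]

# With n rows and k categories, the dense form stores n*k numbers,
# while this sparse form stores only n locations.
```

With thousands of categories, that difference (`n*k` values versus `n` locations) is what makes the sparse matrix practical.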

Let's see how it is done.

First, we will store the categorical feature in a new variable called `housing_cat`:

`<<your code goes here>> = housing[["ocean_proximity"]]`

Let's see what it looks like using the `head` method:

`housing_cat.<<your code goes here>>(10)`

Now let's import `OneHotEncoder` from `sklearn`:

`from sklearn.preprocessing import <<your code goes here>>`

Now we will `fit_transform` our categorical data:

`cat_encoder = OneHotEncoder()`

`housing_cat_1hot = cat_encoder.<<your code goes here>>(housing_cat)`

`housing_cat_1hot`

Finally, we will convert it to a dense `NumPy` array using the `toarray` method:

`housing_cat_1hot.<<your code goes here>>()`
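Putting the steps together, a complete run might look like the following sketch. It assumes `housing` is a pandas `DataFrame`; here we build a tiny stand-in with made-up `ocean_proximity` values purely for illustration:

```python
import pandas as pd
from sklearn.preprocessing import OneHotEncoder

# Tiny stand-in for the housing DataFrame (illustrative values only)
housing = pd.DataFrame({"ocean_proximity": ["INLAND", "<1H OCEAN", "INLAND"]})

# Keep the double brackets so housing_cat stays a DataFrame, not a Series
housing_cat = housing[["ocean_proximity"]]

cat_encoder = OneHotEncoder()
housing_cat_1hot = cat_encoder.fit_transform(housing_cat)  # SciPy sparse matrix

print(cat_encoder.categories_)     # categories learned during fitting
print(housing_cat_1hot.toarray())  # dense NumPy array: one binary column per category
```

Note that `OneHotEncoder` sorts the categories it finds, so the column order follows the sorted category list exposed in `cat_encoder.categories_`.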


**Note -** Having trouble with the assessment engine? Follow the steps listed here.
