{"id":1587,"date":"2019-03-14T11:27:47","date_gmt":"2019-03-14T11:27:47","guid":{"rendered":"https:\/\/cloudxlab.com\/blog\/?p=1587"},"modified":"2019-04-16T10:55:40","modified_gmt":"2019-04-16T10:55:40","slug":"one-on-one-discussion-with-sandeep-on-gradient-descent","status":"publish","type":"post","link":"https:\/\/cloudxlab.com\/blog\/one-on-one-discussion-with-sandeep-on-gradient-descent\/","title":{"rendered":"One-on-one discussion on Gradient Descent"},"content":{"rendered":"\n<p>Usually, the learners from our classes schedule 1-on-1 discussions with the mentors to clarify their doubts. So, we thought of sharing the video of one of these 1-on-1 discussions, which one of our CloudxLab learners &#8211; <strong>Leo<\/strong> &#8211; had <strong>with Sandeep<\/strong> last week.<\/p>\n\n\n\n<p>Below are the questions from the same discussion.<br><\/p>\n\n\n\n<p>You can go through the detailed discussion around these questions in the attached video below.<\/p>\n\n\n\n<figure class=\"wp-block-embed-youtube wp-block-embed is-type-video is-provider-youtube\"><div class=\"wp-block-embed__wrapper\">\n<div style=\"max-width: 1778px;\"><div style=\"left: 0; width: 100%; height: 0; position: relative; padding-bottom: 56.25%;\"><iframe title=\"(1-1 with Leo) Understanding Machine Learning Concepts - Gradient Descent (Discussion)\" src=\"https:\/\/www.youtube.com\/embed\/Gp7gXle_Zzo?rel=0\" style=\"border: 0; top: 0; left: 0; width: 100%; height: 100%; position: absolute;\" allowfullscreen scrolling=\"no\" allow=\"encrypted-media; accelerometer; clipboard-write; gyroscope; picture-in-picture\"><\/iframe><\/div><\/div><script type=\"text\/javascript\">window.addEventListener(\"message\",function(e){\n                window.parent.postMessage(e.data,\"*\");\n            },false);<\/script>\n<\/div><figcaption>One-on-one discussion with Sandeep on Gradient Descent<\/figcaption><\/figure>\n\n\n\n<!--more-->\n\n\n\n<p><strong>Q.1. 
<\/strong>In the <em>Life Cycle of Node Value<\/em> chapter, please explain the line of code below. What are the zz_v, z_v and y_v values?<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>zz_v, z_v, y_v = s.run([zz, z, y])<\/code><\/pre>\n\n\n\n<p><strong>Ans.1.<\/strong> The complete code looks like below:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>w = tf.constant(3)\nx = w + 2\ny = x + 5\nz = x + 3\nzz = tf.square(y + z)\nwith tf.Session() as s:\n    # evaluate zz, z and y in a single run\n    zz_v, z_v, y_v = s.run([zz, z, y])\nprint(zz_v)<\/code><\/pre>\n\n\n\n<p>The s.run([zz, z, y]) expression above evaluates zz, z and y, and returns their evaluated values, which we store in the variables zz_v, z_v and y_v respectively. Basically, the <em>run<\/em> function of the <a href=\"https:\/\/www.tensorflow.org\/api_docs\/python\/tf\/Session\">session<\/a> object returns the same data type as is passed in its first argument. Here, the run method returns a list of values.<\/p>\n\n\n\n<p>TensorFlow is a Python library, and in Python a function can return multiple values in the form of a <em>tuple<\/em>, which can then be unpacked into separate variables. Here is a simple example:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>x, y, z = [2, 3, 4]\nprint(x)   # 2\nprint(y)   # 3\nprint(z)   # 4<\/code><\/pre>\n\n\n\n<p>The same thing is happening here: s.run() returns multiple values, which are stored in the variables zz_v, z_v and y_v respectively.<br><\/p>\n\n\n\n<p>So, the evaluated value of zz is stored in the variable zz_v, the evaluated value of z in z_v, and the evaluated value of y in y_v.<br><\/p>\n\n\n\n<p><strong>Q.2.<\/strong> In the Linear Regression chapter, we use the housing price dataset. How do we know that the model for the housing price dataset is a linear equation? 
Is it just an assumption?<br><\/p>\n\n\n\n<p><strong>Ans.2.<\/strong> Yes, it is an assumption that the model for the housing price dataset is a linear equation.<br><\/p>\n\n\n\n<p>We can use a linear equation for a non-linear problem as well.<br><\/p>\n\n\n\n<p>We convert most non-linear problems into linear problems by adding polynomial features.<\/p>\n\n\n\n<p>Even a Polynomial Regression problem is solved using Linear Regression, by converting the non-linear problem into a linear one through polynomial features.<\/p>\n\n\n\n<p>Suppose your equation is<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>y = \u03f40 + \u03f41 x + \u03f42 x1 + \u03f43 x2 + ...<\/code><\/pre>\n\n\n\n<p>where x1 and x2 are the polynomial features<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>x1 = square(x)\nx2 = cube(x)<\/code><\/pre>\n\n\n\n<p>\u03f40, \u03f41, \u03f42, &#8230; etc. are weights, also called coefficients.<br><\/p>\n\n\n\n<p>In Linear Regression, when Gradient Descent is applied to this equation (say, on data that is actually quadratic in x), the weights \u03f41 and \u03f43 will go down to 0 (zero) and the weight \u03f42 will become bigger. Hence, at the end of Gradient Descent, our equation will look like below, i.e. 
we get a non-linear equation<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>y = \u03f40 + \u03f42 x1\nor, y = \u03f40 + \u03f42 square(x)<\/code><\/pre>\n\n\n\n<p><br><\/p>\n\n\n\n<p><strong>Q.3.<\/strong> I don\u2019t understand the equations for the Gradient and for Gradient Descent.<\/p>\n\n\n\n<figure class=\"wp-block-image is-resized\"><img src=\"https:\/\/blog.cloudxlab.com\/wp-content\/uploads\/2019\/03\/Screenshot-2019-03-07-at-13.30.25.png\" alt=\"Equation for Gradient for Linear Regression\" class=\"wp-image-1746\" width=\"424\" height=\"143\" srcset=\"https:\/\/cloudxlab.com\/blog\/wp-content\/uploads\/2019\/03\/Screenshot-2019-03-07-at-13.30.25.png 664w, https:\/\/cloudxlab.com\/blog\/wp-content\/uploads\/2019\/03\/Screenshot-2019-03-07-at-13.30.25-300x102.png 300w\" sizes=\"(max-width: 424px) 85vw, 424px\" \/><\/figure>\n\n\n\n<p><strong>Ans.3. <\/strong><br><\/p>\n\n\n\n<p>The equation below is for calculating the Gradient:<br><\/p>\n\n\n\n<figure class=\"wp-block-image is-resized\"><img src=\"https:\/\/lh5.googleusercontent.com\/5tinUdHu4FAe-aRT4eG7NAirTGpUGsc-67qvuxmT4YJ4Pyd2qunXem4QUKK1vBCIobgpSs0jqwhmBkP31rfTm4uZeVvIAM1MNoX4mk5tCr83zqnlrK132SkeS9Ay5g5mujDXK_39\" alt=\"Equation for Gradient for Linear Regression\" width=\"423\" height=\"143\"\/><\/figure>\n\n\n\n<p>MSE is &#8216;Mean Squared Error&#8217;.<\/p>\n\n\n\n<p>m is the total number of instances.<br><\/p>\n\n\n\n<p>X is the dataset, a matrix with \u2018n\u2019 columns (features) and \u2018m\u2019 rows (instances). 
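<\/p>\n\n\n\n<p>To make this concrete, here is a minimal pure-Python sketch of the Gradient equation above and the Gradient Descent update step (this is an illustration, not code from the discussion; the dataset, learning rate and variable names are made up):<\/p>\n\n\n\n

```python
# Batch Gradient Descent sketch for linear regression with one feature.
# Model: y^ = theta0 + theta1 * x
# Gradient of MSE, written out per component of 2/m X^T (X.theta - y).

x = [1.0, 2.0, 3.0, 4.0]
y = [3.0, 5.0, 7.0, 9.0]   # generated from y = 1 + 2x, so the optimum is theta = (1, 2)
m = len(x)

theta0, theta1 = 0.0, 0.0  # initial weights
lr = 0.05                  # learning rate (eta)

for _ in range(5000):
    # prediction errors (X.theta - y) for the current weights
    errors = [theta0 + theta1 * xi - yi for xi, yi in zip(x, y)]
    # the two components of the gradient vector
    grad0 = (2 / m) * sum(errors)
    grad1 = (2 / m) * sum(e * xi for e, xi in zip(errors, x))
    # update step: theta := theta - eta * gradient
    theta0 -= lr * grad0
    theta1 -= lr * grad1

print(round(theta0, 3), round(theta1, 3))
```

\n\n\n\n<p>With more features, x becomes a matrix and the per-component sums collapse into a single matrix expression, which is exactly the vectorised Gradient equation shown in the figure above.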
<\/p>\n\n\n\n<p>y is a vector (containing the actual values of the label) with \u2018m\u2019 rows and 1 column.<\/p>\n\n\n\n<p>y^ is also a vector (containing the predicted values of the label) with &#8216;m&#8217; rows and 1 column.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>y^ = \u03f40 + \u03f41 x1 + \u03f42 x2\nMSE (E) = sum(square(y^ - y)) \/ m\nMSE (E) = sum(square(\u03f40 + \u03f41 x1 + \u03f42 x2 - y)) \/ m\n\u2202(MSE)\/\u2202\u03f40 = 2 sum(\u03f40 + \u03f41 x1 + \u03f42 x2 - y) \/ m\n\u2202(MSE)\/\u2202\u03f41 = 2 sum((\u03f40 + \u03f41 x1 + \u03f42 x2 - y) x1) \/ m\nStacking these partial derivatives into one vector gives\ngradient = 2\/m X^T (X.\u03f4 - y)<\/code><\/pre>\n\n\n\n<p>Therefore, we get:<\/p>\n\n\n\n<figure class=\"wp-block-image is-resized\"><img src=\"https:\/\/lh5.googleusercontent.com\/5tinUdHu4FAe-aRT4eG7NAirTGpUGsc-67qvuxmT4YJ4Pyd2qunXem4QUKK1vBCIobgpSs0jqwhmBkP31rfTm4uZeVvIAM1MNoX4mk5tCr83zqnlrK132SkeS9Ay5g5mujDXK_39\" alt=\"Equation for Gradient for Linear Regression\" width=\"434\" height=\"146\"\/><\/figure>\n\n\n\n<p>The equation below is the Gradient Descent update step:<\/p>\n\n\n\n<figure class=\"wp-block-image is-resized\"><img src=\"https:\/\/lh5.googleusercontent.com\/gRqbG76esTx9knvuW9yNIAL0Qbqv0E5Q4XtwPpmgmYlRmXrofHj2H6G_zXQ_fNr-I7VOFfUtBtMqvgeWBmM0O4iN_R5Y_O7QV2gAVHfyCWWUbhRgSRHMyM6v2slyuczUEjSpCKZa\" alt=\"Gradient Descent Equation for Linear Regression\" width=\"380\" height=\"40\"\/><\/figure>\n\n\n\n<p>\u03b7 is the learning rate here.<br><\/p>\n\n\n\n<p>\u03f4 is a vector of theta values.<br><\/p>\n\n\n\n<figure class=\"wp-block-image is-resized\"><img src=\"https:\/\/lh6.googleusercontent.com\/Zqr5I-U7z3Je-a3dw_RGpnURveNwUmzPzBT3bVuD-8NHVt8SpHSnVO1HhwRLaiQtNedo2z5XtNUSFsQoZP2qoTM9MjlxvCDqC5egj2LXj38vqjPgWcSyBxFV9IL867vYetYOs0O1\" alt=\"\" width=\"97\" height=\"24\"\/><\/figure>\n\n\n\n<p>is also a vector of values, and is called the <em>Gradient<\/em>, or 
the rate of change of error (E).<br><\/p>\n\n\n\n<p>If the <em>Gradient<\/em> is positive, the update decreases \u03f4, and if the <em>Gradient<\/em> is negative, the update increases \u03f4. Eventually, we move towards the point where the <em>Gradient<\/em> is 0 (zero) or nearly 0.<br><br><\/p>\n\n\n\n<p><strong>Q.4.<\/strong> In Gradient Descent, what can we do to avoid getting stuck in <em>local minima<\/em>?<\/p>\n\n\n\n<p><strong>Ans.4.<\/strong> You can use Stochastic Gradient Descent to reduce the chance of getting stuck in <em>local minima<\/em>. Because it computes each gradient from randomly picked instances, its updates are noisy, which helps the parameters jump out of local minima.<\/p>\n\n\n\n<p>You can find more details about this in our Machine Learning course.<\/p>\n\n\n\n<p>For the complete course on Machine Learning, please visit&nbsp;<a href=\"https:\/\/cloudxlab.com\/course\/specialization\/1\/machine-learning-specialization\">Specialization Course on Machine Learning &amp; Deep Learning<\/a><br><\/p>\n\n\n\n<p><br><br><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Usually, the learners from our classes schedule 1-on-1 discussions with the mentors to clarify their doubts. So, we thought of sharing the video of one of these 1-on-1 discussions, which one of our CloudxLab learners &#8211; Leo &#8211; had with Sandeep last week. Below are the questions from the same discussion. 
You can go through the &hellip; <a href=\"https:\/\/cloudxlab.com\/blog\/one-on-one-discussion-with-sandeep-on-gradient-descent\/\" class=\"more-link\">Continue reading<span class=\"screen-reader-text\"> &#8220;One-on-one discussion on Gradient Descent&#8221;<\/span><\/a><\/p>\n","protected":false},"author":21,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":[],"categories":[29,15,28,30],"tags":[19,17,47,16],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v16.2 - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>One-on-one discussion on Gradient Descent | CloudxLab Blog<\/title>\n<meta name=\"description\" content=\"Gradient Descent for Machine Learning and Deep Learning\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/cloudxlab.com\/blog\/one-on-one-discussion-with-sandeep-on-gradient-descent\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"One-on-one discussion on Gradient Descent | CloudxLab Blog\" \/>\n<meta property=\"og:description\" content=\"Gradient Descent for Machine Learning and Deep Learning\" \/>\n<meta property=\"og:url\" content=\"https:\/\/cloudxlab.com\/blog\/one-on-one-discussion-with-sandeep-on-gradient-descent\/\" \/>\n<meta property=\"og:site_name\" content=\"CloudxLab Blog\" \/>\n<meta property=\"article:publisher\" content=\"https:\/\/www.facebook.com\/cloudxlab\" \/>\n<meta property=\"article:published_time\" content=\"2019-03-14T11:27:47+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2019-04-16T10:55:40+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/blog.cloudxlab.com\/wp-content\/uploads\/2019\/03\/Screenshot-2019-03-07-at-13.30.25.png\" \/>\n<meta name=\"twitter:card\" 
content=\"summary_large_image\" \/>\n<meta name=\"twitter:creator\" content=\"@CloudxLab\" \/>\n<meta name=\"twitter:site\" content=\"@CloudxLab\" \/>\n<meta name=\"twitter:label1\" content=\"Est. reading time\">\n\t<meta name=\"twitter:data1\" content=\"4 minutes\">\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"WebSite\",\"@id\":\"https:\/\/cloudxlab.com\/blog\/#website\",\"url\":\"https:\/\/cloudxlab.com\/blog\/\",\"name\":\"CloudxLab Blog\",\"description\":\"Learn AI, Machine Learning, Deep Learning, Devops &amp; Big Data\",\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":\"https:\/\/cloudxlab.com\/blog\/?s={search_term_string}\",\"query-input\":\"required name=search_term_string\"}],\"inLanguage\":\"en-US\"},{\"@type\":\"ImageObject\",\"@id\":\"https:\/\/cloudxlab.com\/blog\/one-on-one-discussion-with-sandeep-on-gradient-descent\/#primaryimage\",\"inLanguage\":\"en-US\",\"url\":\"https:\/\/blog.cloudxlab.com\/wp-content\/uploads\/2019\/03\/Screenshot-2019-03-07-at-13.30.25.png\",\"contentUrl\":\"https:\/\/blog.cloudxlab.com\/wp-content\/uploads\/2019\/03\/Screenshot-2019-03-07-at-13.30.25.png\"},{\"@type\":\"WebPage\",\"@id\":\"https:\/\/cloudxlab.com\/blog\/one-on-one-discussion-with-sandeep-on-gradient-descent\/#webpage\",\"url\":\"https:\/\/cloudxlab.com\/blog\/one-on-one-discussion-with-sandeep-on-gradient-descent\/\",\"name\":\"One-on-one discussion on Gradient Descent | CloudxLab Blog\",\"isPartOf\":{\"@id\":\"https:\/\/cloudxlab.com\/blog\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\/\/cloudxlab.com\/blog\/one-on-one-discussion-with-sandeep-on-gradient-descent\/#primaryimage\"},\"datePublished\":\"2019-03-14T11:27:47+00:00\",\"dateModified\":\"2019-04-16T10:55:40+00:00\",\"author\":{\"@id\":\"https:\/\/cloudxlab.com\/blog\/#\/schema\/person\/cb0503c4e740565d85cd28a1b167f48b\"},\"description\":\"Gradient Descent for Machine Learning and Deep 
Learning\",\"breadcrumb\":{\"@id\":\"https:\/\/cloudxlab.com\/blog\/one-on-one-discussion-with-sandeep-on-gradient-descent\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/cloudxlab.com\/blog\/one-on-one-discussion-with-sandeep-on-gradient-descent\/\"]}]},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/cloudxlab.com\/blog\/one-on-one-discussion-with-sandeep-on-gradient-descent\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"item\":{\"@type\":\"WebPage\",\"@id\":\"https:\/\/cloudxlab.com\/blog\/\",\"url\":\"https:\/\/cloudxlab.com\/blog\/\",\"name\":\"Home\"}},{\"@type\":\"ListItem\",\"position\":2,\"item\":{\"@id\":\"https:\/\/cloudxlab.com\/blog\/one-on-one-discussion-with-sandeep-on-gradient-descent\/#webpage\"}}]},{\"@type\":\"Person\",\"@id\":\"https:\/\/cloudxlab.com\/blog\/#\/schema\/person\/cb0503c4e740565d85cd28a1b167f48b\",\"name\":\"Deepak Singh\",\"image\":{\"@type\":\"ImageObject\",\"@id\":\"https:\/\/cloudxlab.com\/blog\/#personlogo\",\"inLanguage\":\"en-US\",\"url\":\"https:\/\/secure.gravatar.com\/avatar\/93f963a3a3600e7852f1a3677966d5c4?s=96&d=mm&r=g\",\"contentUrl\":\"https:\/\/secure.gravatar.com\/avatar\/93f963a3a3600e7852f1a3677966d5c4?s=96&d=mm&r=g\",\"caption\":\"Deepak Singh\"}}]}<\/script>\n<!-- \/ Yoast SEO plugin. 
-->","_links":{"self":[{"href":"https:\/\/cloudxlab.com\/blog\/wp-json\/wp\/v2\/posts\/1587"}],"collection":[{"href":"https:\/\/cloudxlab.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/cloudxlab.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/cloudxlab.com\/blog\/wp-json\/wp\/v2\/users\/21"}],"replies":[{"embeddable":true,"href":"https:\/\/cloudxlab.com\/blog\/wp-json\/wp\/v2\/comments?post=1587"}],"version-history":[{"count":154,"href":"https:\/\/cloudxlab.com\/blog\/wp-json\/wp\/v2\/posts\/1587\/revisions"}],"predecessor-version":[{"id":2095,"href":"https:\/\/cloudxlab.com\/blog\/wp-json\/wp\/v2\/posts\/1587\/revisions\/2095"}],"wp:attachment":[{"href":"https:\/\/cloudxlab.com\/blog\/wp-json\/wp\/v2\/media?parent=1587"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/cloudxlab.com\/blog\/wp-json\/wp\/v2\/categories?post=1587"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/cloudxlab.com\/blog\/wp-json\/wp\/v2\/tags?post=1587"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}