<![CDATA[Swayam's Blog]]>https://swayam-blog.hashnode.devRSS for NodeMon, 14 Oct 2024 08:35:19 GMT60<![CDATA[Project-1: Cats and Dogs Classification using Logistic Regression]]>https://swayam-blog.hashnode.dev/project-1-cats-and-dogs-classification-using-logistic-regressionhttps://swayam-blog.hashnode.dev/project-1-cats-and-dogs-classification-using-logistic-regressionWed, 29 Dec 2021 21:51:03 GMT<![CDATA[<h2 id="heading-overview">Overview</h2><p>Welcome to the 4<sup>th</sup> article of the <a target="_blank" href="https://swayam-blog.hashnode.dev/series/demystifying-ml"><strong>Demystifying Machine Learning</strong></a> series. In this article, we're going to build our first project: <strong><em>predicting cats or dogs from their respective images</em></strong> using Logistic Regression. We are going to use the <a target="_blank" href="https://scikit-learn.org/stable/">scikit-learn</a> library for this project, but I already covered <a target="_blank" href="https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html">Logistic Regression</a> in great depth in the 3<sup>rd</sup> article of this series. </p><blockquote><p> <strong><em> The main focus of this sample project is on data collection and data preprocessing rather than on training a Logistic Regression model. I already showed how to set different hyperparameters to achieve satisfying results in the <a target="_blank" href="https://swayam-blog.hashnode.dev/logistic-regression">3<sup>rd</sup> article</a> of this series, and when working on a project, roughly 80% of your time is spent on data preprocessing.</em></strong></p></blockquote><p>The project is going to be a <a target="_blank" href="https://github.com/practice404/demystifying_machine_learning/blob/master/project-1/project_1.ipynb"><strong>Jupyter Notebook</strong></a>. 
You can grab the notebook from <a target="_blank" href="https://github.com/practice404/demystifying_machine_learning/blob/master/project-1/project_1.ipynb">here</a>. Everything you need, from code to explanation, is provided in that <a target="_blank" href="https://github.com/practice404/demystifying_machine_learning/blob/master/project-1/project_1.ipynb">notebook</a>. This article covers a few points about the dataset and some techniques we are going to use.</p><h2 id="heading-about-dataset">About Dataset</h2><p>The dataset we are going to use comes from <a target="_blank" href="http://kaggle.com">Kaggle</a>; you can view it on Kaggle <a target="_blank" href="https://www.kaggle.com/c/dogs-vs-cats">here</a>. It contains only images of cats and dogs, already split into train and test sets for model training.</p><p>In the <a target="_blank" href="https://github.com/practice404/demystifying_machine_learning/blob/master/project-1/project_1.ipynb">notebook</a> we'll train our model on the images of cats and dogs inside the <code>train</code> folder and validate it using the images inside the <code>test</code> folder. </p><p>In total, the train folder contains 11250 images of cats and 11250 images of dogs, and the test folder contains 1250 images of each. There is no labels file, so we have to create one ourselves to let our model know which image is of a cat and which is of a dog.</p><h2 id="heading-how-to-open">How to open?</h2><p>As mentioned earlier, this project is basically a Jupyter notebook, and you can get it from GitHub via this <a target="_blank" href="https://github.com/practice404/demystifying_machine_learning/blob/master/project-1/project_1.ipynb">link</a>. 
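</p><p>Since there is no labels file, the labels have to come from the file names; the Kaggle images encode their class in the name (e.g. <code>cat.0.jpg</code>, <code>dog.0.jpg</code>). Here's a minimal sketch of the idea — the function name and folder layout are illustrative, not the notebook's exact code:</p>

```python
import os
import numpy as np

def make_labels(folder):
    """Build a label array from filenames: 0 for cats, 1 for dogs.

    Assumes the Kaggle naming convention, e.g. 'cat.0.jpg' / 'dog.0.jpg'.
    """
    files = sorted(os.listdir(folder))
    return np.array([1 if name.startswith("dog") else 0 for name in files])
```

<p>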
You don't need to download the dataset manually: inside the notebook you'll find a function that downloads the dataset automatically and saves it in your current working directory, and another function that extracts it into a folder.</p><p>Now you can either upload this notebook to <a target="_blank" href="http://colab.research.google.com">Google Colab</a> (or any other cloud service) and run it there, or download it to your system, create a virtual environment, install the required dependencies, and run it locally. Either way it will work fine.</p><p>Using a cloud service can be easier because you don't have to manage virtual environments or install dependencies, but keep in mind that these services provide a limited amount of RAM, so keep a keen eye on your memory usage; if you exceed the provided limit, reduce the number of samples in the training set inside the <a target="_blank" href="https://github.com/practice404/demystifying_machine_learning/blob/master/project-1/project_1.ipynb">notebook</a>.</p><blockquote><p><strong><em>Note: All the code samples are in the notebook, with a proper explanation of each cell.</em></strong></p></blockquote><h2 id="heading-conclusion">Conclusion</h2><p>That's it for this supplementary article for our first project. Now head to the <a target="_blank" href="https://github.com/practice404/demystifying_machine_learning/tree/master/project-1">repository for this project</a> and grab the notebook. 
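</p><p>To make the download-and-extract step concrete, here's a rough sketch of what such helper functions can look like. The URL below is a placeholder (the real one is inside the notebook), so treat this as an illustration rather than the notebook's exact code:</p>

```python
import zipfile
import urllib.request
from pathlib import Path

DATASET_URL = "https://example.com/cats_and_dogs.zip"  # placeholder, not the real dataset URL

def download_dataset(url=DATASET_URL, dest="dataset.zip"):
    """Download the dataset archive into the current working directory (skipped if present)."""
    if not Path(dest).exists():
        urllib.request.urlretrieve(url, dest)
    return dest

def extract_dataset(archive="dataset.zip", out_dir="dataset"):
    """Extract the downloaded archive into a folder."""
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(out_dir)
    return out_dir
```

<p>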
If you just want to take a look at the notebook and it's not opening on GitHub, then click on this <a target="_blank" href="https://nbviewer.org/github/practice404/demystifying_machine_learning/blob/master/project-1/project_1.ipynb">link</a> to explore the entire notebook on <a target="_blank" href="https://nbviewer.org">nbviewer</a>.</p><p>I hope you have learnt something new. For more updates on upcoming articles, get connected with me through <a target="_blank" href="https://twitter.com/_s_w_a_y_a_m_">Twitter</a> and stay tuned for more. </p><p><strong><em>Wish you all a very HAPPY NEW YEAR ..!!</em></strong></p>]]>https://cdn.hashnode.com/res/hashnode/image/upload/v1640800385356/5Ofm8NDFO.jpeg<![CDATA[Logistic Regression]]>https://swayam-blog.hashnode.dev/logistic-regressionhttps://swayam-blog.hashnode.dev/logistic-regressionSat, 18 Dec 2021 05:24:03 GMT<![CDATA[<h2 id="heading-overview">Overview</h2><p>Welcome to the 3<sup>rd</sup> article of the <a target="_blank" href="https://swayam-blog.hashnode.dev/series/demystifying-ml"><strong>Demystifying Machine Learning</strong></a> series. In this article, we are going to discuss the supervised classification algorithm <strong>Logistic Regression</strong>: how it works, why it's important, the mathematics behind the scenes, linear and non-linear separation, and regularization.</p><p>A good grasp of Linear Regression is needed to understand this algorithm, and luckily we have already covered it; for reference, you can read it <a target="_blank" href="https://swayam-blog.hashnode.dev/linear-regression-using-gradient-descent">here</a>. 
<strong>Logistic Regression</strong> builds the base of <strong>Neural Networks</strong>, so it is important to understand the terms and the working of this algorithm.</p><blockquote><p><strong>Logistic Regression</strong> is not a regression algorithm; its name contains the word "Regression", but it is a classification algorithm.</p></blockquote><h2 id="heading-what-is-logistic-regression">What is Logistic Regression?</h2><p><strong>Logistic Regression</strong> <strong><em>(closely related to the Perceptron algorithm) is a supervised classification algorithm, i.e. we teach our hypothesis with categorically labelled data and it predicts the classes (or categories) with a certain probability</em></strong>. This algorithm is called Logistic Regression because its working is quite similar to that of Linear Regression, and the term "Logistic" refers to the logistic function the algorithm uses (<em>we'll see it later</em>). The difference is that with Linear Regression we intend to predict continuous values, while with Logistic Regression we want to predict categorical values: whether a student gets enrolled in a university or not, whether a picture contains a cat or not, etc.</p><p>Here's a representation of how Logistic Regression classifies the data points.</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1639752060348/CzQh-pR6T.png" alt="1.png" /></p><p>We can see that the <em>blue points</em> are separated from the <em>orange points</em> by a line, and we call this line a <strong>decision boundary</strong>. With Logistic Regression we separate the classes (or categories) with the help of decision boundaries, which can be linear or non-linear. 
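</p><p>To make the decision-boundary idea concrete, here's a tiny sketch with made-up parameter values: a linear boundary is just the line where the linear combination of the features equals zero, and a point's class depends on which side of that line it falls:</p>

```python
import numpy as np

# Hypothetical learned parameters for the boundary theta0 + theta1*x1 + theta2*x2 = 0,
# i.e. the line x1 + x2 = 4 (values made up for illustration).
theta = np.array([-4.0, 1.0, 1.0])

def side_of_boundary(points, theta):
    """Classify points as 1 or 0 depending on which side of the linear boundary they fall."""
    X = np.concatenate([np.ones((points.shape[0], 1)), points], axis=1)  # add bias column
    return (X @ theta > 0).astype(int)

points = np.array([[3.0, 3.0],   # above the line -> class 1
                   [1.0, 1.0]])  # below the line -> class 0
print(side_of_boundary(points, theta))  # [1 0]
```

<p>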
</p><h2 id="heading-working-of-logistic-regression">Working of Logistic Regression</h2><p>Let's revise what we learnt in <a target="_blank" href="https://swayam-blog.hashnode.dev/linear-regression-using-gradient-descent">Linear Regression</a>: we initialise the parameters with all 0s, then with the help of Gradient Descent find the optimal parameters by reducing the cost function, and lastly draw the hypothesis to predict the continuous-valued target. </p><p>But here we don't need continuous values; instead, we want to output a probability, i.e. a value between 0 and 1. So the question arises: <strong>how can we get a probability, a number between 0 and 1, from the continuous values of Linear Regression?</strong></p><h3 id="heading-sigmoid-function">Sigmoid function</h3><p>The <strong>sigmoid function</strong> is a type of <strong><em>logistic function</em></strong> which takes a real number as input and gives a real number between 0 and 1 as output. </p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1639756326745/bI7AUi4gX.png" alt="1.png" /></p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1639752080151/XSs-lCUma.png" alt="2.png" /></p><p>So we'll generate the continuous values using Linear Regression and convert them into a probability, i.e. a value between 0 and 1, by passing them through the <strong>sigmoid function</strong>.</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1639756341261/wJM47_tPB.png" alt="2.png" /></p><p>So in the end our final hypothesis will look like this:</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1639756529024/Hsdngqks9.png" alt="3.png" /></p><p>This hypothesis is different from the hypothesis of Linear Regression. 
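</p><p>In code, the sigmoid and the resulting hypothesis are only a couple of lines. A minimal sketch (the names here are mine, not the notebook's):</p>

```python
import numpy as np

def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def hypothesis(theta, X):
    """h(x) = sigmoid(theta^T x); X is assumed to already include the bias column of 1s."""
    return sigmoid(X @ theta)
```

<p>For example, <code>sigmoid(0)</code> is exactly 0.5, large positive inputs approach 1, and large negative inputs approach 0.</p><p>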
Let me give you a visualisation of how Logistic Regression works overall.</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1639752100559/4t7OW4zk5.png" alt="13.png" /></p><blockquote><p><strong><em>Note: X<sub>0</sub> is basically 1; we'll see later why.</em></strong></p></blockquote><p>So it's very clear from the above representation that the part before the <code>sigmoid</code> function is very similar to Linear Regression. Let's now move ahead and define the cost function for Logistic Regression.</p><h3 id="heading-cost-function">Cost function</h3><p>Our hypothesis is different from that of Linear Regression, so we need to define a new cost function. We already learnt in the <a target="_blank" href="https://swayam-blog.hashnode.dev/linear-regression-using-gradient-descent">2nd article</a> what a cost function is and how to define one for our algorithm. Let's use those concepts and define one for Logistic Regression.</p><p>For simplicity, let's consider the case of binary classification, which means our target value will be either 1 (True) or 0 (False). For example, the image contains a cat (1, True) or the image does not contain a cat (0, False). This means that our predicted values will also be either 0 or 1.</p><p>Let me first show you the cost function for Logistic Regression, and then we'll try to understand its derivation.</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1639756571823/yAX5pAFqL.png" alt="4.png" /></p><p>Combining both conditions and taking their mean over <strong>m</strong> samples in a dataset:</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1639756596078/c2xN6n22Z.png" alt="5.png" /></p><p>The equation shown above is the cost function for Logistic Regression, and it looks very different from that of Linear Regression. Let's break it down and understand how we came up with it. 
Get ready, probability class is about to begin...</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1639756622846/xcBIW7ru-.png" alt="6.png" /></p><blockquote><p>There's a negative sign in the cost function because when training the algorithm we want the probabilities to be large, but we frame the problem as minimising a cost.</p><p><strong><em>minimise loss => maximise log probability</em></strong></p></blockquote><p>Okay, that's a lot of maths, but it's all basics. Focus on the general form in the above equation, and that's more than enough to understand how we came up with such a complex-looking cost function. Let's see how to calculate the cost for a dataset of <strong>m</strong> examples:</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1639756647909/EFS95vPVG.png" alt="7.png" /></p><h3 id="heading-gradient-descent">Gradient Descent</h3><p>We already covered the working of gradient descent in the <a target="_blank" href="https://swayam-blog.hashnode.dev/linear-regression-using-gradient-descent">2nd article</a>; you can refer to it for revision. 
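</p><p>Before moving on to the gradients, the cost defined above can be sanity-checked numerically. A minimal sketch, assuming <code>y_hat</code> holds the sigmoid outputs:</p>

```python
import numpy as np

def cross_entropy_cost(y, y_hat):
    """Mean logistic (cross-entropy) cost over m samples:
    J = -(1/m) * sum( y*log(y_hat) + (1-y)*log(1-y_hat) )
    """
    return -np.mean(y * np.log(y_hat) + (1 - y) * np.log(1 - y_hat))

y = np.array([1.0, 0.0])
print(cross_entropy_cost(y, np.array([0.9, 0.1])))  # confident and correct: small cost
print(cross_entropy_cost(y, np.array([0.1, 0.9])))  # confident but wrong: large cost
```

<p>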
In this section, we'll look at the formulas for the gradients and for updating the parameters.</p><p>The gradient of the cost is a vector of the same length as θ, where the j<sup>th</sup> element (for j = 0, 1, ..., n) is defined as follows:</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1639756666029/ZE6goQy68.png" alt="8.png" /></p><blockquote><p>Calculation of gradients from the cost function is demonstrated in the <a target="_blank" href="https://swayam-blog.hashnode.dev/linear-regression-using-gradient-descent">2nd Article</a>.</p></blockquote><p>As we can see, the formula for calculating the gradients is pretty similar to that of Linear Regression, but note that the values of h<sub>θ</sub>(x<sup>(i)</sup>) are different due to the use of the <code>sigmoid</code> function.</p><p>After calculating the gradients we can simultaneously update our parameters as:</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1639756688657/HRoDBUBKs.png" alt="9.png" /></p><p>Great, now we have all the ingredients for writing our own Logistic Regression from scratch. Let's get started with it in the next section; till then, take a 15-minute break, because you've just studied a lot of maths.</p><h2 id="heading-code-implementation">Code Implementation</h2><p>In this section, we'll write our <code>LogisticRegression</code> class using Python.</p><blockquote><p><strong><em>Note: You can find all the code for this article <a target="_blank" href="https://swayam-blog.hashnode.dev/linear-regression-using-gradient-descent">here</a>. It's highly recommended to follow the Jupyter notebook while going through this section.</em></strong></p></blockquote><p>Let's begin 🙂</p><p>Let me give you an overview of how this class works. 
Firstly, it takes your <em>feature</em> and <em>target</em> arrays as input; then it normalizes the features around the mean (if you want it to) and adds an extra column of all 1s to your <em>feature</em> array for the bias term. As we know from Linear Regression, <code>y=wx+b</code>, so this <code>b</code> gets handled by the extra column of 1s during the matrix multiplication of the <em>feature</em> and <em>parameter</em> arrays.</p><p>For example:</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1639756741748/e70XrL7kZ.png" alt="10.png" /></p><p>Then it initializes the parameter array with all 0s; after that, the training loop runs up to the epoch count, calculating the cost and gradients for the current parameters and simultaneously updating the parameters with a certain learning rate.</p><pre><code class="lang-python">class LogisticRegression:
    def __init__(self) -> None:
        self.X = None
        self.Y = None
        self.parameters = None
        self.cost_history = []
        self.mu = None
        self.sigma = None

    def sigmoid(self, x):
        z = np.array(x)
        g = 1 / (1 + np.exp(-z))
        return g

    def calculate_cost(self):
        """
        Returns the cost and gradients.

        parameters: None
        Returns:
            cost: calculated loss (scalar)
            gradients: array containing the gradients w.r.t. each parameter
        """
        m = self.X.shape[0]
        z = np.dot(self.X, self.parameters)
        z = z.reshape(-1)
        z = z.astype(np.float128, copy=False)  # extra precision to avoid overflow in exp
        y_hat = self.sigmoid(z)
        cost = -1 * np.mean(self.Y * np.log(y_hat) + (1 - self.Y) * np.log(1 - y_hat))
        gradients = np.zeros(self.X.shape[1])
        for n in range(len(self.parameters)):
            gradients[n] = np.mean((y_hat - self.Y) * self.X[:, n])
        # Vectorized form:
        # gradients = np.dot(self.X.T, y_hat - self.Y) / m
        return cost, gradients

    def init_parameters(self):
        """Initialize the parameters as an array of 0s."""
        self.parameters = np.zeros((self.X.shape[1], 1))

    def feature_normalize(self, X):
        """
        Normalize the samples.

        parameters:
            X: input/feature matrix
        Returns:
            X_norm: normalized X
        """
        X_norm = X.copy()
        mu = np.mean(X, axis=0)
        sigma = np.std(X, axis=0)
        self.mu = mu
        self.sigma = sigma
        for n in range(X.shape[1]):
            X_norm[:, n] = (X_norm[:, n] - mu[n]) / sigma[n]
        return X_norm

    def fit(self, x, y, learning_rate=0.01, epochs=500, is_normalize=True, verbose=0):
        """
        Iterates and finds the optimal parameters for the input dataset.

        parameters:
            x: input/feature matrix
            y: target matrix
            learning_rate: between 0 and 1 (default 0.01)
            epochs: number of iterations (default 500)
            is_normalize: boolean, for normalizing features (default True)
            verbose: print the cost every `verbose` iterations
        Returns:
            parameters: array of optimal values of the weights
        """
        self.X = x
        self.Y = y
        self.cost_history = []
        if self.X.ndim == 1:  # adding an extra dimension if X is a 1-D array
            self.X = self.X.reshape(-1, 1)
            is_normalize = False
        if is_normalize:
            self.X = self.feature_normalize(self.X)
        self.X = np.concatenate([np.ones((self.X.shape[0], 1)), self.X], axis=1)
        self.init_parameters()
        for i in range(epochs):
            cost, gradients = self.calculate_cost()
            self.cost_history.append(cost)
            self.parameters -= learning_rate * gradients.reshape(-1, 1)
            if verbose:
                if not (i % verbose):
                    print(f"Cost after {i} epochs: {cost}")
        return self.parameters

    def predict(self, x, is_normalize=True, threshold=0.5):
        """
        Returns the predictions after fitting.

        parameters:
            x: input/feature matrix
        Returns:
            predictions: array of predicted target values
        """
        x = np.array(x, dtype=np.float64)  # converting list to numpy array
        if x.ndim == 1:
            x = x.reshape(1, -1)
        if is_normalize:
            for n in range(x.shape[1]):
                x[:, n] = (x[:, n] - self.mu[n]) / self.sigma[n]
        x = np.concatenate([np.ones((x.shape[0], 1)), x], axis=1)
        return [1 if i > threshold else 0 for i in self.sigmoid(np.dot(x, self.parameters))]</code></pre><p>This code looks pretty similar to that of Linear Regression using Gradient Descent; if you are following this series, you'll be pretty familiar with this implementation. Still, I'd like to point out a few methods of this class:</p><ul><li><strong><code>sigmoid</code></strong>: We added this new method to calculate the sigmoid of the continuous values generated from the linear hypothesis, i.e. from θ<sup>T</sup>X, to get the probabilities.</li><li><strong><code>calculate_cost</code></strong>: We changed the definition of this method because our cost function changed too; if you are well aware of the formulas above and the <code>numpy</code> library, it won't be difficult for you to understand.</li><li><strong><code>predict</code></strong>: This method takes the input and returns an array of predicted values, 0 and 1. 
There's an extra parameter, <code>threshold</code>, which has a default value of 0.5: if the predicted probability is greater than the threshold, it predicts 1, otherwise 0. You can change this <code>threshold</code> according to your confidence level.</li></ul><h3 id="heading-trying-it-out-on-a-dataset">Trying it out on a dataset</h3><p>In this sub-section, we will use our class on a dataset to check how it works.</p><blockquote><p>All the code and implementations are provided in this <a target="_blank" href="https://github.com/practice404/demystifying_machine_learning/blob/master/logistic_regression/notebook.ipynb">jupyter notebook</a>; follow it for a better understanding of this section.</p></blockquote><p>For the dataset, we have records of students' marks in some Exam1 and Exam2, and the target column represents whether they got admitted into the university or not. Let's visualize it using a plot:</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1639752142703/GNP0R62nW.png" alt="3.png" /></p><p>So what we basically want from Logistic Regression is to tell us whether a certain student with some scores in Exam1 and Exam2 is admitted or not. 
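</p><p>Under the hood, fitting and predicting boil down to just a few lines. Here's a compact standalone sketch of the same training loop on a tiny synthetic dataset (two made-up score clusters standing in for the exam data, not the actual dataset):</p>

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Tiny stand-in for the exam-scores data: low scores -> not admitted (0), high -> admitted (1)
X = np.array([[30.0, 40.0], [35.0, 30.0], [80.0, 85.0], [90.0, 75.0]])
y = np.array([0.0, 0.0, 1.0, 1.0])

# Normalize, add the bias column of 1s, then run plain gradient descent.
Xn = (X - X.mean(axis=0)) / X.std(axis=0)
Xb = np.concatenate([np.ones((Xn.shape[0], 1)), Xn], axis=1)
theta = np.zeros(Xb.shape[1])
for _ in range(500):
    y_hat = sigmoid(Xb @ theta)
    theta -= 0.1 * (Xb.T @ (y_hat - y)) / len(y)

# Threshold the probabilities at 0.5, exactly like predict() does.
preds = (sigmoid(Xb @ theta) > 0.5).astype(int)
print(preds)  # [0 0 1 1]
```

<p>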
Let's create an instance of the <code>LogisticRegression</code> class and try it out.</p><p>First, I'm going to find the optimal parameters for this dataset, and I'm going to show you two ways of doing it.</p><ul><li>Using our custom class</li></ul><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1639755806863/-skmJvfk3.png" alt="4.png" /></p><ul><li><p>Using Scipy's optimize module</p><p>Sometimes gradient descent takes a lot of time, so to save time I'll show you how you can easily find the parameters by just using the <code>scipy.optimize.minimize</code> function, passing the cost function into it.</p></li></ul><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1639755824131/E1q3mdC80.png" alt="5.png" /></p><p>First I appended an extra column of 1s for the bias term, then passed my <code>costFunction</code>, <code>initial_theta</code> (initially all 0s) and my <code>X</code> and <code>Y</code> as arguments. It calculated the optimal parameters in about 0.3 seconds, much faster than gradient descent, which took about 6.5 seconds.</p><blockquote><p><strong><em>Note: <code>costFunction</code> is similar to our class method <code>calculate_cost</code>; I just put it outside the class to show the working of the <code>scipy.optimize.minimize</code> function.</em></strong></p></blockquote><p>Great, now let's see how well it performed by printing out its accuracy on the training set.</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1639755839852/VjGQjf-gg.png" alt="6.png" /></p><p>Hmmm, around 89%; it seems good, although there are a few algorithms we'll cover in the future that can perform much better than this. 
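</p><p>In case you want to see the scipy route in code, here's a hedged sketch on a toy dataset. The data and names are made up for illustration; the key point is that the cost function returns both the cost and its gradient, which is what <code>jac=True</code> tells <code>minimize</code> to expect:</p>

```python
import numpy as np
from scipy.optimize import minimize

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cost_function(theta, X, y):
    """Return the logistic cost and its gradient, the form scipy expects with jac=True."""
    y_hat = np.clip(sigmoid(X @ theta), 1e-12, 1 - 1e-12)  # guard against log(0)
    cost = -np.mean(y * np.log(y_hat) + (1 - y) * np.log(1 - y_hat))
    grad = X.T @ (y_hat - y) / len(y)
    return cost, grad

# Toy data; the first column of 1s is the bias term.
X = np.array([[1.0, 0.5], [1.0, 1.5], [1.0, 3.0], [1.0, 4.0]])
y = np.array([0.0, 0.0, 1.0, 1.0])
initial_theta = np.zeros(X.shape[1])

res = minimize(cost_function, initial_theta, args=(X, y), jac=True, method="TNC")
print(res.x)  # the optimised parameters
```

<p>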
Now let me show you its decision boundary. As we didn't perform any polynomial transformation (for more, refer to <a target="_blank" href="https://swayam-blog.hashnode.dev/linear-regression-using-gradient-descent">article 2</a>) on our input features, the decision boundary is going to be a straight line.</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1639755907860/N8MYuSOL9.png" alt="7.png" /></p><p>Great, we just implemented our <code>LogisticRegression</code> class on the students' dataset. Let's move ahead and understand the problem of overfitting in the next section. Till then, have a short 5-minute break.</p><h2 id="heading-problem-of-overfitting">Problem of Overfitting</h2><p>In order to understand overfitting in Logistic Regression, I'll show you an implementation of this algorithm on another dataset where we need to fit a non-linear decision boundary. Let's visualize our 2nd dataset on the graph:</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1639755934992/5-pXUnqBv.png" alt="8.png" /></p><p>As we can see, the data is not linearly separable, so we need to fit a non-linear decision boundary. If you went through the <a target="_blank" href="https://swayam-blog.hashnode.dev/linear-regression-using-gradient-descent">2nd article</a> of this series, you probably know how we do this, but in brief: we take our original features and apply polynomial transformations to them, like squaring, cubing or multiplying them with each other, to obtain new features; training our algorithm on those new features then results in a non-linear classification boundary.</p><p>In the <a target="_blank" href="https://github.com/practice404/demystifying_machine_learning/blob/master/logistic_regression/notebook.ipynb">notebook</a> you'll find a function <code>mapFeature</code> that takes the individual features as input and returns new transformed features. 
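</p><p>The idea behind <code>mapFeature</code> can be sketched like this (the degree and the term ordering are assumptions; the notebook's version may differ in details):</p>

```python
import numpy as np

def map_feature(x1, x2, degree=6):
    """Map two features to all polynomial terms of x1, x2 up to the given degree:
    1, x1, x2, x1^2, x1*x2, x2^2, ..., x2^degree.
    """
    out = [np.ones_like(x1)]  # the bias term
    for i in range(1, degree + 1):
        for j in range(i + 1):
            out.append((x1 ** (i - j)) * (x2 ** j))
    return np.stack(out, axis=-1)

X_poly = map_feature(np.array([2.0]), np.array([3.0]), degree=2)
print(X_poly)  # [[1. 2. 3. 4. 6. 9.]]
```

<p>With the default degree of 6, two features expand into 28 columns, which is what makes the non-linear (and easily overfit) boundary possible.</p><p>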
</p><blockquote><p>If you want to know how it works, consider referring to the <a target="_blank" href="https://github.com/practice404/demystifying_machine_learning/blob/master/logistic_regression/notebook.ipynb">notebook</a>; it's recommended to follow it while reading this article.</p></blockquote><p>After getting the new transformed features and following the exact steps from the section above, you'll be able to print out a decision boundary that looks something like this:</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1639755949600/JSXj2eNSn.png" alt="9.png" /></p><p>After seeing it, you may say, "WOW! It performed so well, it classified almost all the training points". Well, it does seem good, but in reality, it's the worst case. Our hypothesis fit the training set so well that it lost generality, which means that if we provide new points that are not in the training set, our hypothesis will not be able to classify them correctly.</p><p>In short, it's necessary to maintain generality in our hypothesis so that it can perform well on data it has never seen. <strong>Regularization</strong> is the way to achieve this. Let's see how to maintain generality using Regularization in the next section.</p><h2 id="heading-regularization">Regularization</h2><p>In this section, we'll be discussing how to implement regularization. <strong><em>Overfitting occurs when the algorithm assigns heavy parameters to some features based on the training dataset and hyperparameters.
This makes those features dominant in the overall hypothesis and leads to a nice fit on the training set but not so good a fit on samples outside the training set.</em></strong></p><p><strong><em>The plan is to add the squares of the parameters, multiplied by some big number (λ), to the cost function. Because the algorithm's main motive is to decrease the cost function, it will end up choosing small parameters just to cancel out the effect of this addition of parameters multiplied by the large number (λ).</em></strong> So our final cost function gets modified to:</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1639756780412/aW2HU1RZv.png" alt="11.png" /></p><p><strong><em>Note: We denote the bias term as θ<sub>0</sub>, and it's not needed to regularize the bias term; that's why we are only considering the parameters θ<sub>1</sub> to θ<sub>n</sub>.</em></strong></p><p>Since our cost function changed, the formulas for the gradients are also affected. The new formulas for the gradients are:</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1639756795932/Yh788isRU.png" alt="12.png" /></p><p>The new formulas for the gradients can be easily derived by partially differentiating the new cost function J(θ) w.r.t. some θ<sub>j</sub>. </p><blockquote><p>Calculation of gradients from the cost function is demonstrated in the <a target="_blank" href="https://swayam-blog.hashnode.dev/linear-regression-using-gradient-descent">2nd Article</a>.</p></blockquote><p>λ is known as the regularization parameter and it should be greater than 0.
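</p><p>Numerically, the regularized cost and gradients above can be sketched as follows (tiny made-up numbers; note the bias term is left unregularized):</p>

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def reg_cost(theta, X, y, lambda_):
    # X includes the bias column; theta[0] (the bias) is not regularized
    m = len(y)
    y_hat = sigmoid(X @ theta)
    cost = -np.mean(y * np.log(y_hat) + (1 - y) * np.log(1 - y_hat))
    cost += lambda_ * np.sum(theta[1:] ** 2) / (2 * m)
    grad = X.T @ (y_hat - y) / m          # same as the unregularized gradient...
    grad[1:] += lambda_ * theta[1:] / m   # ...plus the penalty term for j >= 1
    return cost, grad

X = np.array([[1.0, 0.5], [1.0, -1.5], [1.0, 2.0]])  # bias column + one feature
y = np.array([1.0, 0.0, 1.0])
theta = np.array([0.1, 0.8])

cost_plain, _ = reg_cost(theta, X, y, lambda_=0.0)
cost_reg, grad_reg = reg_cost(theta, X, y, lambda_=10.0)  # the penalty raises the cost
```

<p>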
<strong><em>A large value of λ leads to underfitting and a very small value leads to overfitting</em></strong>, so you need to pick the right one for your dataset by iterating over some sample values.</p><h3 id="heading-implementing-regularization-on-logisticregression-class">Implementing Regularization on <code>LogisticRegression</code> class</h3><p>We only need to modify the <code>calculate_cost</code> method because only this method is responsible for calculating both the cost and the gradients. The modified version is shown below:</p><pre><code class="lang-python"><span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">RegLogisticRegression</span>:</span> <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">__init__</span>(<span class="hljs-params">self</span>) -> <span class="hljs-keyword">None</span>:</span> self.X = <span class="hljs-literal">None</span> self.Y = <span class="hljs-literal">None</span> self.parameters = <span class="hljs-literal">None</span> self.cost_history = [] self.mu = <span class="hljs-literal">None</span> self.sigma = <span class="hljs-literal">None</span> <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">sigmoid</span>(<span class="hljs-params">self, x</span>):</span> z = np.array(x) g = np.zeros(z.shape) g = <span class="hljs-number">1</span>/(<span class="hljs-number">1</span> + np.exp(-z) ) <span class="hljs-keyword">return</span> g <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">sigmoid_derivative</span>(<span class="hljs-params">self, x</span>):</span> derivative = self.sigmoid(x) * (<span class="hljs-number">1</span> - self.sigmoid(x)) <span class="hljs-keyword">return</span> derivative <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">calculate_cost</span>(<span class="hljs-params">self, lambda_</span>):</span> <span class="hljs-string">"""
Returns the cost and gradients. parameters: None Returns: cost : Calculated loss (scalar). gradients: array containing the gradients w.r.t each parameter """</span> m = self.X.shape[<span class="hljs-number">0</span>] z = np.dot(self.X, self.parameters) z = z.reshape(<span class="hljs-number">-1</span>) z = z.astype(np.float128, copy=<span class="hljs-literal">False</span>) y_hat = self.sigmoid(z) cost = <span class="hljs-number">-1</span> * np.mean(self.Y*(np.log(y_hat)) + (<span class="hljs-number">1</span>-self.Y)*(np.log(<span class="hljs-number">1</span>-y_hat))) + lambda_ * (np.sum(self.parameters[<span class="hljs-number">1</span>:]**<span class="hljs-number">2</span>))/(<span class="hljs-number">2</span>*m) gradients = np.zeros(self.X.shape[<span class="hljs-number">1</span>]) <span class="hljs-keyword">for</span> n <span class="hljs-keyword">in</span> range(len(self.parameters)): <span class="hljs-keyword">if</span> n == <span class="hljs-number">0</span>: temp = np.mean((y_hat-self.Y)*self.X[:,n]) <span class="hljs-keyword">else</span>: temp = np.mean((y_hat-self.Y)*self.X[:,n]) + lambda_*self.parameters[n]/m gradients[n] = temp <span class="hljs-comment"># gradients = np.dot(self.X.T, error)/m</span> <span class="hljs-keyword">return</span> cost, gradients <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">init_parameters</span>(<span class="hljs-params">self</span>):</span> <span class="hljs-string">""" Initialize the parameters as array of 0s parameters: None Returns:None """</span> self.parameters = np.zeros((self.X.shape[<span class="hljs-number">1</span>],<span class="hljs-number">1</span>)) <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">feature_normalize</span>(<span class="hljs-params">self, X</span>):</span> <span class="hljs-string">""" Normalize the samples. parameters: X : input/feature matrix Returns: X_norm : Normalized X.
"""</span> X_norm = X.copy() mu = np.mean(X, axis=<span class="hljs-number">0</span>) sigma = np.std(X, axis=<span class="hljs-number">0</span>) self.mu = mu self.sigma = sigma <span class="hljs-keyword">for</span> n <span class="hljs-keyword">in</span> range(X.shape[<span class="hljs-number">1</span>]): X_norm[:,n] = (X_norm[:,n] - mu[n]) / sigma[n] <span class="hljs-keyword">return</span> X_norm <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">fit</span>(<span class="hljs-params">self, x, y, learning_rate=<span class="hljs-number">0.01</span>, epochs=<span class="hljs-number">500</span>, lambda_=<span class="hljs-number">0</span>,is_normalize=True, verbose=<span class="hljs-number">0</span></span>):</span> <span class="hljs-string">""" Iterates and find the optimal parameters for input dataset parameters: x : input/feature matrix y : target matrix learning_rate: between 0 and 1 (default is 0.01) epochs: number of iterations (default is 500) is_normalize: boolean, for normalizing features (default is True) verbose: iterations after to print cost Returns: parameters : Array of optimal value of weights. 
"""</span> self.X = x self.Y = y self.cost_history = [] <span class="hljs-keyword">if</span> self.X.ndim == <span class="hljs-number">1</span>: <span class="hljs-comment"># adding extra dimension, if X is a 1-D array</span> self.X = self.X.reshape(<span class="hljs-number">-1</span>,<span class="hljs-number">1</span>) is_normalize = <span class="hljs-literal">False</span> <span class="hljs-keyword">if</span> is_normalize: self.X = self.feature_normalize(self.X) self.X = np.concatenate([np.ones((self.X.shape[<span class="hljs-number">0</span>],<span class="hljs-number">1</span>)), self.X], axis=<span class="hljs-number">1</span>) self.init_parameters() <span class="hljs-keyword">for</span> i <span class="hljs-keyword">in</span> range(epochs): cost, gradients = self.calculate_cost(lambda_=lambda_) self.cost_history.append(cost) self.parameters -= learning_rate * gradients.reshape(<span class="hljs-number">-1</span>,<span class="hljs-number">1</span>) <span class="hljs-keyword">if</span> verbose: <span class="hljs-keyword">if</span> <span class="hljs-keyword">not</span> (i % verbose): print(<span class="hljs-string">f"Cost after <span class="hljs-subst">{i}</span> epochs: <span class="hljs-subst">{cost}</span>"</span>) <span class="hljs-keyword">return</span> self.parameters <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">predict</span>(<span class="hljs-params">self,x, is_normalize=True, threshold=<span class="hljs-number">0.5</span></span>):</span> <span class="hljs-string">""" Returns the predictions after fitting. parameters: x : input/feature matrix Returns: predictions : Array of predicted target values. 
"""</span> x = np.array(x, dtype=np.float64) <span class="hljs-comment"># converting list to numpy array</span> <span class="hljs-keyword">if</span> x.ndim == <span class="hljs-number">1</span>: x = x.reshape(<span class="hljs-number">1</span>,<span class="hljs-number">-1</span>) <span class="hljs-keyword">if</span> is_normalize: <span class="hljs-keyword">for</span> n <span class="hljs-keyword">in</span> range(x.shape[<span class="hljs-number">1</span>]): x[:,n] = (x[:,n] - self.mu[n]) / self.sigma[n] x = np.concatenate([np.ones((x.shape[<span class="hljs-number">0</span>],<span class="hljs-number">1</span>)), x], axis=<span class="hljs-number">1</span>) <span class="hljs-keyword">return</span> [<span class="hljs-number">1</span> <span class="hljs-keyword">if</span> i > threshold <span class="hljs-keyword">else</span> <span class="hljs-number">0</span> <span class="hljs-keyword">for</span> i <span class="hljs-keyword">in</span> self.sigmoid(np.dot(x,self.parameters))]</code></pre><p>Now we have our regularized version of <code>RegLogisticRegression</code> class. Let's address the previous problem of overfitting on polynomial regression by using a set of values for to pick the right one.</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1639755979522/0MXdt0T4B.png" alt="10.png" /></p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1639756009035/JPqLNcv2r.png" alt="11.png" /></p><p>I can say that =1 and =10 looks pretty good and they both are able to maintain the generality in hypothesis, the curve is more smooth and less wiggling type. But we can see that as we keep increasing the value if the more our hypothesis starts to <strong>underfit</strong> the data. It basically means that it starts to perform even worst on the training set. 
Let's visualise the underfitting by plotting the cost for each λ:</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1639756025955/gX8nPqCOC.png" alt="12.png" /></p><p>We can see that as λ increases, the cost also increases. So it's advised to select the value of λ carefully according to your custom dataset.</p><h2 id="heading-conclusion">Conclusion</h2><p>Great work everyone, we successfully learnt and implemented Logistic Regression. Most people don't write their Machine Learning algorithms from scratch; instead, they use libraries like Scikit-Learn. Scikit-Learn contains wrappers for many Machine Learning algorithms and it's really flexible and easy to use. But it doesn't hurt to know about the algorithm you're going to use, and the best way of doing that is to understand the underlying mathematics and implement it from scratch.</p><p>So in the next article, we'll be making a classification project using the Scikit-Learn library and you'll see how easy it is to use for making some really nice creative projects.</p><p>I hope you have learnt something new; for updates on upcoming articles, get connected with me through <a target="_blank" href="https://twitter.com/_s_w_a_y_a_m_">Twitter</a> and stay tuned for more. Till then enjoy your day and keep learning.</p>]]><![CDATA[<h2 id="heading-overview">Overview</h2><p>Welcome to the 3<sup>rd</sup> article of the <a target="_blank" href="https://swayam-blog.hashnode.dev/series/demystifying-ml"><strong>Demystifying Machine Learning</strong></a> series.
In this article, we are going to discuss a supervised classification algorithm, <strong>Logistic Regression</strong>: how it works, why it's important, the mathematics behind the scenes, linear and non-linear separation, and Regularization.</p><p>A good grasp of Linear Regression is needed to understand this algorithm, and luckily we already covered it; for reference, you can read it from <a target="_blank" href="https://swayam-blog.hashnode.dev/linear-regression-using-gradient-descent">here</a>. <strong>Logistic Regression</strong> builds the base of <strong>Neural Networks</strong>, so it's very important to understand the terms and workings of this algorithm.</p><blockquote><p><strong>Logistic Regression</strong> is not a regression algorithm; its name does contain the word "Regression", but it's a classification algorithm.</p></blockquote><h2 id="heading-what-is-logistic-regression">What is Logistic Regression?</h2><p><strong>Logistic Regression</strong> <strong><em>is a supervised classification algorithm closely related to the Perceptron algorithm, i.e. we teach our hypothesis with categorically labelled data and it predicts the classes (or categories) with a certain probability</em></strong>. The reason this algorithm is called Logistic Regression is perhaps that its working is pretty similar to that of Linear Regression, and the term "Logistic" comes from the logistic function we use in this algorithm (<em>we'll see it later</em>). The difference is that with Linear Regression we intend to predict continuous values, but with Logistic Regression we want to predict a categorical value.
Like whether the student gets enrolled in a university or not, whether the picture contains a cat or not, etc.</p><p>Here's a representation of how Logistic Regression classifies the data points.</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1639752060348/CzQh-pR6T.png" alt="1.png" /></p><p>We can see that the <em>blue points</em> are separated from the <em>orange points</em> by a line, and we call this line a <strong>decision boundary</strong>. Basically, with Logistic Regression we separate the classes (or categories) with the help of decision boundaries, which can be linear or non-linear. </p><h2 id="heading-working-of-logistic-regression">Working of Logistic Regression</h2><p>Let's revise what we learnt in <a target="_blank" href="https://swayam-blog.hashnode.dev/linear-regression-using-gradient-descent">Linear Regression</a>: we initialise the parameters with all 0s, then with the help of Gradient Descent calculate the optimal parameters by reducing the cost function, and lastly draw the hypothesis to predict the continuous-valued target. </p><p>But here we don't need continuous values; instead, we want to output a probability that lies between [0,1]. So the question arises: <strong>how can we get a probability, a number between [0,1], from the continuous values of Linear Regression?</strong></p><h3 id="heading-sigmoid-function">Sigmoid function</h3><p>The <strong>Sigmoid function</strong> is a type of <strong><em>logistic function</em></strong> which takes a real number as input and gives out a real number between [0,1] as output. </p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1639756326745/bI7AUi4gX.png" alt="1.png" /></p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1639752080151/XSs-lCUma.png" alt="2.png" /></p><p>So basically we'll generate the continuous values using Linear Regression and convert those continuous values into a probability, i.e.
between [0,1], by passing them through the <strong>sigmoid function</strong>.</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1639756341261/wJM47_tPB.png" alt="2.png" /></p><p>So in the end our final hypothesis will look like this:</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1639756529024/Hsdngqks9.png" alt="3.png" /></p><p>This hypothesis is different from the hypothesis of Linear Regression. Looks fair enough; let me give you a visualisation of how Logistic Regression works overall.</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1639752100559/4t7OW4zk5.png" alt="13.png" /></p><blockquote><p><strong><em>Note: X<sub>0</sub> is basically 1; we will see why later.</em></strong></p></blockquote><p>So it's very clear from the above representation that the part before the <code>sigmoid</code> function is very similar to that of Linear Regression. Let's now move ahead and define the cost function for Logistic Regression.</p><h3 id="heading-cost-function">Cost function</h3><p>Our hypothesis is different from that of Linear Regression, so we need to define a new cost function. We already learnt in our <a target="_blank" href="https://swayam-blog.hashnode.dev/linear-regression-using-gradient-descent">2nd article</a> what a cost function is and how to define one for our algorithm. Let's use those concepts and define one for Logistic Regression.</p><p>For simplicity, let's consider the case of binary classification, which means our target value will be either 1 (True) or 0 (False). For example, the image contains a cat (1, True) or the image does not contain a cat (0, False).
This means that our predicted values will also be either 0 or 1.</p><p>Let me first show you what the cost function for Logistic Regression is, and then we'll try to understand its derivation.</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1639756571823/yAX5pAFqL.png" alt="4.png" /></p><p>Combining both the conditions and taking their mean for <strong>m</strong> samples in a dataset:</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1639756596078/c2xN6n22Z.png" alt="5.png" /></p><p>The equation shown above is the cost function for Logistic Regression, and it looks very different from that of Linear Regression. Let's break it down and understand how we came up with it. Get ready, probability class is about to begin...</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1639756622846/xcBIW7ru-.png" alt="6.png" /></p><blockquote><p>There's a negative sign in the original cost function because when training the algorithm we want the probabilities to be large, but we express this as minimizing the cost.</p><p><strong><em>minimise loss => max log probability</em></strong></p></blockquote><p>Okay, that's a lot of maths, but it's all basics. Focus on the general form in the above equation; that's more than enough to understand how we came up with such a complex-looking cost function. Let's see how to calculate the cost for <strong>m</strong> examples of some dataset:</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1639756647909/EFS95vPVG.png" alt="7.png" /></p><h3 id="heading-gradient-descent">Gradient Descent</h3><p>We already covered the working of gradient descent in our <a target="_blank" href="https://swayam-blog.hashnode.dev/linear-regression-using-gradient-descent">2nd article</a>; you can refer to it for revision.
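</p><p>This behaviour of the cost is easy to sanity-check numerically: a confident correct prediction is penalized very little, while a confident wrong one is penalized heavily. A minimal sketch:</p>

```python
import numpy as np

def sample_cost(y, y_hat):
    # per-sample loss: -[y*log(y_hat) + (1 - y)*log(1 - y_hat)]
    return -(y * np.log(y_hat) + (1 - y) * np.log(1 - y_hat))

low = sample_cost(1, 0.9)   # y_hat close to the true label 1 -> small loss
high = sample_cost(1, 0.1)  # y_hat far from the true label 1 -> large loss
```

<p>Here <code>sample_cost(1, 0.9)</code> is just −log(0.9) ≈ 0.105, while <code>sample_cost(1, 0.1)</code> is −log(0.1) ≈ 2.303.</p><p>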
In this section, we'll be looking at the formulas for the gradients and at updating the parameters.</p><p>The gradient of the cost is a vector of the same length as θ, where the j<sup>th</sup> element (for j = 0, 1, …, n) is defined as follows:</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1639756666029/ZE6goQy68.png" alt="8.png" /></p><blockquote><p>Calculation of gradients from the cost function is demonstrated in the <a target="_blank" href="https://swayam-blog.hashnode.dev/linear-regression-using-gradient-descent">2nd Article</a>.</p></blockquote><p>As we can see, the formula for calculating the gradients is pretty similar to that of Linear Regression, but note that the values of h<sub>θ</sub>(x<sup>(i)</sup>) are different due to the use of the <code>sigmoid</code> function.</p><p>After calculating the gradients we can simultaneously update our parameters as:</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1639756688657/HRoDBUBKs.png" alt="9.png" /></p><p>Great, now we have all the ingredients to write our own Logistic Regression from scratch. Let's get started with it in the next section. Till then, have a 15-minute break, because you just studied a hell of a lot of maths.</p><h2 id="heading-code-implementation">Code Implementation</h2><p>In this section, we'll be writing our <code>LogisticRegression</code> class using Python.</p><blockquote><p><strong><em>Note: You can find all the codes for this article from <a target="_blank" href="https://swayam-blog.hashnode.dev/linear-regression-using-gradient-descent">here</a>. It's highly recommended to follow the Jupyter notebook while going through this section.</em></strong></p></blockquote><p>Let's begin 🙂</p><p>Let me give you a basic overall working of this class.
First, it'll take your <em>feature</em> and <em>target</em> arrays as input; then it'll normalize the features around the mean (if you want to) and add an extra column of all 1s to your <em>feature</em> array for the bias term, as we know from Linear Regression that <code>y=wx+b</code>. This <code>b</code> gets handled by the extra column of 1s in the matrix multiplication of the <em>features</em> and <em>parameters</em> arrays.</p><p>For example:</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1639756741748/e70XrL7kZ.png" alt="10.png" /></p><p>Then it initializes the parameter array with all 0s; after that, the training loop runs for the given number of epochs, calculating the cost and gradients for the current parameters and simultaneously updating the parameters with a certain learning rate. </p><pre><code class="lang-python"><span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">LogisticRegression</span>:</span> <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">__init__</span>(<span class="hljs-params">self</span>) -> <span class="hljs-keyword">None</span>:</span> self.X = <span class="hljs-literal">None</span> self.Y = <span class="hljs-literal">None</span> self.parameters = <span class="hljs-literal">None</span> self.cost_history = [] self.mu = <span class="hljs-literal">None</span> self.sigma = <span class="hljs-literal">None</span> <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">sigmoid</span>(<span class="hljs-params">self, x</span>):</span> z = np.array(x) g = np.zeros(z.shape) g = <span class="hljs-number">1</span>/(<span class="hljs-number">1</span> + np.exp(-z) ) <span class="hljs-keyword">return</span> g <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">calculate_cost</span>(<span class="hljs-params">self</span>):</span> <span class="hljs-string">""" Returns the cost and gradients.
parameters: None Returns: cost : Calculated loss (scalar). gradients: array containing the gradients w.r.t each parameter """</span> m = self.X.shape[<span class="hljs-number">0</span>] z = np.dot(self.X, self.parameters) z = z.reshape(<span class="hljs-number">-1</span>) z = z.astype(np.float128, copy=<span class="hljs-literal">False</span>) y_hat = self.sigmoid(z) cost = <span class="hljs-number">-1</span> * np.mean(self.Y*(np.log(y_hat)) + (<span class="hljs-number">1</span>-self.Y)*(np.log(<span class="hljs-number">1</span>-y_hat))) gradients = np.zeros(self.X.shape[<span class="hljs-number">1</span>]) <span class="hljs-keyword">for</span> n <span class="hljs-keyword">in</span> range(len(self.parameters)): temp = np.mean((y_hat-self.Y)*self.X[:,n]) gradients[n] = temp <span class="hljs-comment"># Vectorized form</span> <span class="hljs-comment"># gradients = np.dot(self.X.T, error)/m </span> <span class="hljs-keyword">return</span> cost, gradients <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">init_parameters</span>(<span class="hljs-params">self</span>):</span> <span class="hljs-string">""" Initialize the parameters as array of 0s parameters: None Returns: None """</span> self.parameters = np.zeros((self.X.shape[<span class="hljs-number">1</span>],<span class="hljs-number">1</span>)) <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">feature_normalize</span>(<span class="hljs-params">self, X</span>):</span> <span class="hljs-string">""" Normalize the samples. parameters: X : input/feature matrix Returns: X_norm : Normalized X.
"""</span> X_norm = X.copy() mu = np.mean(X, axis=<span class="hljs-number">0</span>) sigma = np.std(X, axis=<span class="hljs-number">0</span>) self.mu = mu self.sigma = sigma <span class="hljs-keyword">for</span> n <span class="hljs-keyword">in</span> range(X.shape[<span class="hljs-number">1</span>]): X_norm[:,n] = (X_norm[:,n] - mu[n]) / sigma[n] <span class="hljs-keyword">return</span> X_norm <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">fit</span>(<span class="hljs-params">self, x, y, learning_rate=<span class="hljs-number">0.01</span>, epochs=<span class="hljs-number">500</span>, is_normalize=True, verbose=<span class="hljs-number">0</span></span>):</span> <span class="hljs-string">""" Iterates and find the optimal parameters for input dataset parameters: x : input/feature matrix y : target matrix learning_rate: between 0 and 1 (default is 0.01) epochs: number of iterations (default is 500) is_normalize: boolean, for normalizing features (default is True) verbose: iterations after to print cost Returns: parameters : Array of optimal value of weights. 
"""</span> self.X = x self.Y = y self.cost_history = [] <span class="hljs-keyword">if</span> self.X.ndim == <span class="hljs-number">1</span>: <span class="hljs-comment"># adding extra dimension, if X is a 1-D array</span> self.X = self.X.reshape(<span class="hljs-number">-1</span>,<span class="hljs-number">1</span>) is_normalize = <span class="hljs-literal">False</span> <span class="hljs-keyword">if</span> is_normalize: self.X = self.feature_normalize(self.X) self.X = np.concatenate([np.ones((self.X.shape[<span class="hljs-number">0</span>],<span class="hljs-number">1</span>)), self.X], axis=<span class="hljs-number">1</span>) self.init_parameters() <span class="hljs-keyword">for</span> i <span class="hljs-keyword">in</span> range(epochs): cost, gradients = self.calculate_cost() self.cost_history.append(cost) self.parameters -= learning_rate * gradients.reshape(<span class="hljs-number">-1</span>,<span class="hljs-number">1</span>) <span class="hljs-keyword">if</span> verbose: <span class="hljs-keyword">if</span> <span class="hljs-keyword">not</span> (i % verbose): print(<span class="hljs-string">f"Cost after <span class="hljs-subst">{i}</span> epochs: <span class="hljs-subst">{cost}</span>"</span>) <span class="hljs-keyword">return</span> self.parameters <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">predict</span>(<span class="hljs-params">self,x, is_normalize=True, threshold=<span class="hljs-number">0.5</span></span>):</span> <span class="hljs-string">""" Returns the predictions after fitting. parameters: x : input/feature matrix Returns: predictions: Array of predicted target values. 
"""</span> x = np.array(x, dtype=np.float64) <span class="hljs-comment"># converting list to numpy array</span> <span class="hljs-keyword">if</span> x.ndim == <span class="hljs-number">1</span>: x = x.reshape(<span class="hljs-number">1</span>,<span class="hljs-number">-1</span>) <span class="hljs-keyword">if</span> is_normalize: <span class="hljs-keyword">for</span> n <span class="hljs-keyword">in</span> range(x.shape[<span class="hljs-number">1</span>]): x[:,n] = (x[:,n] - self.mu[n]) / self.sigma[n] x = np.concatenate([np.ones((x.shape[<span class="hljs-number">0</span>],<span class="hljs-number">1</span>)), x], axis=<span class="hljs-number">1</span>) <span class="hljs-keyword">return</span> [<span class="hljs-number">1</span> <span class="hljs-keyword">if</span> i > threshold <span class="hljs-keyword">else</span> <span class="hljs-number">0</span> <span class="hljs-keyword">for</span> i <span class="hljs-keyword">in</span> self.sigmoid(np.dot(x,self.parameters))]</code></pre><p>This code looks pretty similar to that of Linear Regression using Gradient Descent. If you are following this series you'll be pretty familiar with this implementation. Still, I like to point out a few methods of this class:-</p><ul><li><strong><code>sigmoid</code></strong>: We added this new method to calculate the sigmoid of the continuous values generated from the linear hypothesis i.e. from <sup>T</sup>X to get the probabilities.</li><li><strong><code>calculate_cost</code></strong>: We change the definition of this function because our cost function is changed too, it's not confusing if you are well aware of the formulas I gave and the <code>numpy</code> library then it won't be difficult for you to understand.</li><li><strong><code>predict</code></strong>: This function takes the input and returns the array of predicted values 0 and 1. 
There's an extra parameter <code>threshold</code>, which has a default value of 0.5; if the predicted probability is greater than the threshold it predicts 1, otherwise 0. You can change this <code>threshold</code> according to your confidence level.</li></ul><h3 id="heading-trying-it-out-on-a-dataset">Trying it out on a dataset</h3><p>In this sub-section, we will use our class on a dataset to check how it works.</p><blockquote><p>All the codes and implementations are provided in this <a target="_blank" href="https://github.com/practice404/demystifying_machine_learning/blob/master/logistic_regression/notebook.ipynb">jupyter notebook</a>; follow it for a better understanding of this section.</p></blockquote><p>For the dataset, we have records of students' marks for some Exam1 and Exam2, and the target column represents whether they got admitted into the university or not. Let's visualize it using a plot:</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1639752142703/GNP0R62nW.png" alt="3.png" /></p><p>So what we basically want from Logistic Regression is to tell us whether a certain student with some scores in Exam1 and Exam2 is admitted or not.
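</p><p>Condensed, the fit/predict loop of the class above boils down to a few lines. Here it is on synthetic stand-in data (the notebook uses the actual exam scores):</p>

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def add_bias(X):
    # prepend the column of 1s that absorbs the bias term
    return np.concatenate([np.ones((X.shape[0], 1)), X], axis=1)

def fit(X, y, lr=0.1, epochs=1000):
    Xb = add_bias(X)
    theta = np.zeros(Xb.shape[1])            # parameters start at 0
    for _ in range(epochs):
        y_hat = sigmoid(Xb @ theta)
        grad = Xb.T @ (y_hat - y) / len(y)   # vectorized gradient
        theta -= lr * grad                   # simultaneous update
    return theta

def predict(X, theta, threshold=0.5):
    return (sigmoid(add_bias(X) @ theta) > threshold).astype(int)

rng = np.random.default_rng(1)
scores = rng.normal(size=(300, 2))           # stand-in for Exam1/Exam2 marks
admitted = (scores[:, 0] - scores[:, 1] > 0).astype(int)

theta = fit(scores, admitted)
train_accuracy = (predict(scores, theta) == admitted).mean()
```

<p>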
Let's create an instance of the <code>LogisticRegression</code> class and try it out.</p><p>Firstly I'm going to find the optimal parameters for this dataset, and I'm going to show you two ways of doing it.</p><ul><li>Using our custom class</li></ul><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1639755806863/-skmJvfk3.png" alt="4.png" /></p><ul><li><p>Using Scipy's optimize module</p><p>Sometimes using gradient descent takes a lot of time, so for time-saving, I'll show you how you can easily find the parameters by just using the <code>scipy.optimize.minimize</code> function, passing the cost function into it.</p></li></ul><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1639755824131/E1q3mdC80.png" alt="5.png" /></p><p> First I appended an extra column of 1s for the bias term, then passed my <code>costFunction</code>, <code>initial_theta</code> (initially 0s) and my <code>X</code> and <code>Y</code> as arguments. It easily calculated the optimal parameters in 0.3 seconds, much faster than gradient descent, which took about 6.5 seconds.</p><blockquote><p><strong><em>Note: <code>costFunction</code> is similar to what we have in our class as the <code>calculate_cost</code> method; I just put it outside to show you the working of the <code>scipy.optimize.minimize</code> function.</em></strong></p></blockquote><p>Great, now let's see how well it performed by printing out its accuracy on the training set.</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1639755839852/VjGQjf-gg.png" alt="6.png" /></p><p>Hmmm, around 89%. It seems good, although there are a few algorithms that we'll be covering in the future that can perform way better than this.
Now let me show you its decision boundary. Since we didn't perform any polynomial transformation (for more, refer to <a target="_blank" href="https://swayam-blog.hashnode.dev/linear-regression-using-gradient-descent">article 2</a>) on our input features, the decision boundary is going to be a straight line.</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1639755907860/N8MYuSOL9.png" alt="7.png" /></p><p>That's great, we just implemented our <code>LogisticRegression</code> class on the students' dataset. Let's move ahead and understand the problem of overfitting in the next section. Till then, have a short 5-minute break.</p><h2 id="heading-problem-of-overfitting">Problem of Overfitting</h2><p>In order to understand overfitting in Logistic Regression, I'll show you an implementation of this algorithm on another dataset where we need to fit a non-linear decision boundary. Let's visualize our 2nd dataset on a graph:</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1639755934992/5-pXUnqBv.png" alt="8.png" /></p><p>As we can see, the data is not linearly separable, so we need to fit a non-linear decision boundary. If you went through the <a target="_blank" href="https://swayam-blog.hashnode.dev/linear-regression-using-gradient-descent">2nd article</a> of this series then you probably know how we do this, but in brief: we take our original features and apply polynomial transformations to them, like squaring, cubing or multiplying them with each other, to obtain new features; training our algorithm on those new features results in a non-linear classifier.</p><p>In the <a target="_blank" href="https://github.com/practice404/demystifying_machine_learning/blob/master/logistic_regression/notebook.ipynb">notebook</a> you'll find a function <code>mapFeature</code> that takes individual features as input and returns new transformed features. 
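</p><p>As a rough sketch of what such a feature-mapping helper can look like (an illustrative re-implementation, not the notebook's exact <code>mapFeature</code>), assuming two input features and all polynomial terms up to a chosen degree:</p>

```python
import numpy as np

def map_feature(x1, x2, degree=2):
    # Build every polynomial term x1^(total-j) * x2^j for total = 1 .. degree
    x1 = np.asarray(x1).reshape(-1)
    x2 = np.asarray(x2).reshape(-1)
    columns = []
    for total in range(1, degree + 1):
        for j in range(total + 1):
            columns.append((x1 ** (total - j)) * (x2 ** j))
    return np.column_stack(columns)

# With degree=2 the columns are [x1, x2, x1^2, x1*x2, x2^2]
print(map_feature([2], [3], degree=2))  # [[2 3 4 6 9]]
```

<p>Higher degrees give the classifier more flexibility, which is exactly what invites overfitting.</p><p>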
</p><blockquote><p>If you want to know how it works, consider referring to the <a target="_blank" href="https://github.com/practice404/demystifying_machine_learning/blob/master/logistic_regression/notebook.ipynb">notebook</a>; it's recommended to follow it while reading this article.</p></blockquote><p>After getting the new transformed features and following the exact steps from the section above, you'll be able to print out the decision boundary, which will look something like this:</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1639755949600/JSXj2eNSn.png" alt="9.png" /></p><p>After seeing it, you may say "WOW! It performed so well, it classified almost all the training points". Well, it does look good, but in reality it's quite bad. Our hypothesis fits the training set so well that it loses generality, which means that if we provide a new set of points that are not in the training set, our hypothesis will not be able to classify them well.</p><p>In short, it's necessary to maintain generality in our hypothesis so that it can perform well on data it has never seen. <strong>Regularization</strong> is the way to achieve it. Let's see how to maintain generality using Regularization in the next section.</p><h2 id="heading-regularization">Regularization</h2><p>In this section, we'll be discussing how to implement regularization. <strong><em>Overfitting occurs when the algorithm assigns heavy parameters to some features according to the training dataset and hyperparameters. 
This makes those features dominant in the overall hypothesis, leading to a nice fit on the training set but a poor one on samples outside the training set.</em></strong></p><p><strong><em>The plan is to add the squares of the parameters, multiplied by some big number (&lambda;), to the cost function. Because our algorithm's main motive is to decrease the cost function, it will end up choosing small parameters just to cancel the effect of adding the parameters multiplied by a large number (&lambda;).</em></strong> So our final cost function gets modified to:</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1639756780412/aW2HU1RZv.png" alt="11.png" /></p><p><strong><em>Note: We denote the bias term as &theta;<sub>0</sub>, and there's no need to regularize the bias term; that's why we are only considering the parameters &theta;<sub>1</sub> to &theta;<sub>n</sub>.</em></strong></p><p>Since our cost function has changed, the formulas for the gradients are also affected. The new formulas for the gradients are:</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1639756795932/Yh788isRU.png" alt="12.png" /></p><p>The new formulas for the gradients can be easily derived by partially differentiating the new cost function J(&theta;) w.r.t. some &theta;<sub>j</sub>. </p><blockquote><p>The calculation of gradients from the cost function is demonstrated in the <a target="_blank" href="https://swayam-blog.hashnode.dev/linear-regression-using-gradient-descent">2nd Article</a>.</p></blockquote><p>&lambda; is known as the regularization parameter and it should be greater than 0. 
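</p><p>A minimal stand-alone sketch of these two formulas (illustrative helper names; X is assumed to already carry a leading column of 1s for the bias):</p>

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def reg_cost_and_gradients(theta, X, y, lambda_):
    # Regularized cross-entropy cost; theta[0] (the bias) is left out of the penalty
    m = X.shape[0]
    h = sigmoid(X @ theta)
    cost = (-np.mean(y * np.log(h) + (1 - y) * np.log(1 - h))
            + lambda_ * np.sum(theta[1:] ** 2) / (2 * m))
    gradients = X.T @ (h - y) / m
    gradients[1:] += lambda_ * theta[1:] / m  # no regularization term for the bias
    return cost, gradients
```

<p>With <code>lambda_ = 0</code> this reduces to the unregularized cost and gradients, so it's easy to check that the penalty only grows the cost when the non-bias parameters are non-zero.</p><p>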
<strong><em>Large values of &lambda; lead to underfitting and very small values lead to overfitting</em></strong>, so you need to pick the right one for your dataset by iterating over some sample values.</p><h3 id="heading-implementing-regularization-on-logisticregression-class">Implementing Regularization on <code>LogisticRegression</code> class</h3><p>We only need to modify the <code>calculate_cost</code> method, because only this method is responsible for calculating both the cost and the gradients. The modified version is shown below:</p><pre><code class="lang-python"><span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">RegLogisticRegression</span>:</span> <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">__init__</span>(<span class="hljs-params">self</span>) -> <span class="hljs-keyword">None</span>:</span> self.X = <span class="hljs-literal">None</span> self.Y = <span class="hljs-literal">None</span> self.parameters = <span class="hljs-literal">None</span> self.cost_history = [] self.mu = <span class="hljs-literal">None</span> self.sigma = <span class="hljs-literal">None</span> <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">sigmoid</span>(<span class="hljs-params">self, x</span>):</span> z = np.array(x) g = np.zeros(z.shape) g = <span class="hljs-number">1</span>/(<span class="hljs-number">1</span> + np.exp(-z) ) <span class="hljs-keyword">return</span> g <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">sigmoid_derivative</span>(<span class="hljs-params">self, x</span>):</span> derivative = self.sigmoid(x) * (<span class="hljs-number">1</span> - self.sigmoid(x)) <span class="hljs-keyword">return</span> derivative <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">calculate_cost</span>(<span class="hljs-params">self, lambda_</span>):</span> <span class="hljs-string">""" 
Returns the cost and gradients. parameters: None Returns: cost : Calculated loss (scalar). gradients: array containing the gradients w.r.t. each parameter """</span> m = self.X.shape[<span class="hljs-number">0</span>] z = np.dot(self.X, self.parameters) z = z.reshape(<span class="hljs-number">-1</span>) z = z.astype(np.float128, copy=<span class="hljs-literal">False</span>) y_hat = self.sigmoid(z) cost = <span class="hljs-number">-1</span> * np.mean(self.Y*(np.log(y_hat)) + (<span class="hljs-number">1</span>-self.Y)*(np.log(<span class="hljs-number">1</span>-y_hat))) + lambda_ * (np.sum(self.parameters[<span class="hljs-number">1</span>:]**<span class="hljs-number">2</span>))/(<span class="hljs-number">2</span>*m) gradients = np.zeros(self.X.shape[<span class="hljs-number">1</span>]) <span class="hljs-keyword">for</span> n <span class="hljs-keyword">in</span> range(len(self.parameters)): <span class="hljs-keyword">if</span> n == <span class="hljs-number">0</span>: temp = np.mean((y_hat-self.Y)*self.X[:,n]) <span class="hljs-keyword">else</span>: temp = np.mean((y_hat-self.Y)*self.X[:,n]) + lambda_*self.parameters[n]/m gradients[n] = temp <span class="hljs-comment"># gradients = np.dot(self.X.T, error)/m</span> <span class="hljs-keyword">return</span> cost, gradients <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">init_parameters</span>(<span class="hljs-params">self</span>):</span> <span class="hljs-string">""" Initialize the parameters as an array of 0s parameters: None Returns: None """</span> self.parameters = np.zeros((self.X.shape[<span class="hljs-number">1</span>],<span class="hljs-number">1</span>)) <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">feature_normalize</span>(<span class="hljs-params">self, X</span>):</span> <span class="hljs-string">""" Normalize the samples. parameters: X : input/feature matrix Returns: X_norm : Normalized X. 
"""</span> X_norm = X.copy() mu = np.mean(X, axis=<span class="hljs-number">0</span>) sigma = np.std(X, axis=<span class="hljs-number">0</span>) self.mu = mu self.sigma = sigma <span class="hljs-keyword">for</span> n <span class="hljs-keyword">in</span> range(X.shape[<span class="hljs-number">1</span>]): X_norm[:,n] = (X_norm[:,n] - mu[n]) / sigma[n] <span class="hljs-keyword">return</span> X_norm <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">fit</span>(<span class="hljs-params">self, x, y, learning_rate=<span class="hljs-number">0.01</span>, epochs=<span class="hljs-number">500</span>, lambda_=<span class="hljs-number">0</span>,is_normalize=True, verbose=<span class="hljs-number">0</span></span>):</span> <span class="hljs-string">""" Iterates and find the optimal parameters for input dataset parameters: x : input/feature matrix y : target matrix learning_rate: between 0 and 1 (default is 0.01) epochs: number of iterations (default is 500) is_normalize: boolean, for normalizing features (default is True) verbose: iterations after to print cost Returns: parameters : Array of optimal value of weights. 
"""</span> self.X = x self.Y = y self.cost_history = [] <span class="hljs-keyword">if</span> self.X.ndim == <span class="hljs-number">1</span>: <span class="hljs-comment"># adding extra dimension, if X is a 1-D array</span> self.X = self.X.reshape(<span class="hljs-number">-1</span>,<span class="hljs-number">1</span>) is_normalize = <span class="hljs-literal">False</span> <span class="hljs-keyword">if</span> is_normalize: self.X = self.feature_normalize(self.X) self.X = np.concatenate([np.ones((self.X.shape[<span class="hljs-number">0</span>],<span class="hljs-number">1</span>)), self.X], axis=<span class="hljs-number">1</span>) self.init_parameters() <span class="hljs-keyword">for</span> i <span class="hljs-keyword">in</span> range(epochs): cost, gradients = self.calculate_cost(lambda_=lambda_) self.cost_history.append(cost) self.parameters -= learning_rate * gradients.reshape(<span class="hljs-number">-1</span>,<span class="hljs-number">1</span>) <span class="hljs-keyword">if</span> verbose: <span class="hljs-keyword">if</span> <span class="hljs-keyword">not</span> (i % verbose): print(<span class="hljs-string">f"Cost after <span class="hljs-subst">{i}</span> epochs: <span class="hljs-subst">{cost}</span>"</span>) <span class="hljs-keyword">return</span> self.parameters <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">predict</span>(<span class="hljs-params">self,x, is_normalize=True, threshold=<span class="hljs-number">0.5</span></span>):</span> <span class="hljs-string">""" Returns the predictions after fitting. parameters: x : input/feature matrix Returns: predictions : Array of predicted target values. 
"""</span> x = np.array(x, dtype=np.float64) <span class="hljs-comment"># converting list to numpy array</span> <span class="hljs-keyword">if</span> x.ndim == <span class="hljs-number">1</span>: x = x.reshape(<span class="hljs-number">1</span>,<span class="hljs-number">-1</span>) <span class="hljs-keyword">if</span> is_normalize: <span class="hljs-keyword">for</span> n <span class="hljs-keyword">in</span> range(x.shape[<span class="hljs-number">1</span>]): x[:,n] = (x[:,n] - self.mu[n]) / self.sigma[n] x = np.concatenate([np.ones((x.shape[<span class="hljs-number">0</span>],<span class="hljs-number">1</span>)), x], axis=<span class="hljs-number">1</span>) <span class="hljs-keyword">return</span> [<span class="hljs-number">1</span> <span class="hljs-keyword">if</span> i > threshold <span class="hljs-keyword">else</span> <span class="hljs-number">0</span> <span class="hljs-keyword">for</span> i <span class="hljs-keyword">in</span> self.sigmoid(np.dot(x,self.parameters))]</code></pre><p>Now we have our regularized version of <code>RegLogisticRegression</code> class. Let's address the previous problem of overfitting on polynomial regression by using a set of values for to pick the right one.</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1639755979522/0MXdt0T4B.png" alt="10.png" /></p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1639756009035/JPqLNcv2r.png" alt="11.png" /></p><p>I can say that =1 and =10 looks pretty good and they both are able to maintain the generality in hypothesis, the curve is more smooth and less wiggling type. But we can see that as we keep increasing the value if the more our hypothesis starts to <strong>underfit</strong> the data. It basically means that it starts to perform even worst on the training set. 
Let's visualise the underfitting by plotting the cost for each &lambda;:</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1639756025955/gX8nPqCOC.png" alt="12.png" /></p><p>We can see that as &lambda; increases, the cost also increases. So it's advised to select the value of &lambda; carefully according to your dataset.</p><h2 id="heading-conclusion">Conclusion</h2><p>Great work everyone, we successfully learnt and implemented Logistic Regression. Most people don't write their Machine Learning algorithms from scratch; instead, they use libraries like Scikit-Learn. Scikit-Learn contains implementations of many Machine Learning algorithms and it's really flexible and easy to use. But it never hurts to know about the algorithm you're going to use, and the best way of doing that is to understand the underlying mathematics and implement it from scratch.</p><p>So in the next article, we'll be making a classification project using the Scikit-Learn library, and you'll see how easy it is to use for making some really nice, creative projects.</p><p>I hope you have learnt something new. For more updates on upcoming articles, get connected with me through <a target="_blank" href="https://twitter.com/_s_w_a_y_a_m_">Twitter</a> and stay tuned for more. 
Till then, enjoy your day and keep learning.</p>]]>https://cdn.hashnode.com/res/hashnode/image/upload/v1639746003675/KrOp7t_9u.jpeg<![CDATA[Linear Regression using Gradient Descent]]>https://swayam-blog.hashnode.dev/linear-regression-using-gradient-descenthttps://swayam-blog.hashnode.dev/linear-regression-using-gradient-descentFri, 03 Dec 2021 10:02:45 GMT<![CDATA[<h2 id="heading-overview">Overview</h2><p>This is the second article of the <strong>Demystifying Machine Learning</strong> series; frankly, it is basically the <strong><em>sequel</em></strong> of our previous article, where we explained <a target="_blank" href="https://swayam-blog.hashnode.dev/linear-regression-using-normal-equation"><strong>Linear Regression using Normal equation</strong></a>. In this article we'll be exploring another optimization algorithm known as <strong>Gradient Descent</strong>: how it works, what a cost function is, the mathematics behind gradient descent, a Python implementation, Regularization, and some extra topics like polynomial regression and regularized polynomial regression.</p><h2 id="heading-how-gradient-descent-works-intuition">How Gradient Descent works (Intuition)</h2><p>Gradient descent is basically an iterative optimization algorithm, i.e. we can use it to find the minimum of a differentiable function. Intuitively, we can think of a situation where you're standing somewhere on a mountain and you want to get to the foot of that mountain as fast as possible. Since we're at a random position on the mountain, one way is to move along the steepest direction while taking small steps <em>(taking large steps in the steepest direction may get you injured)</em>, and we'll see later that taking large steps is not good for the algorithm either. </p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1638466863825/77yNKPIN1.png" alt="im1.png" /></p><p>Now, that's something similar to what we are going to do to optimize our hypothesis to give the least error. 
As we learnt in the <a target="_blank" href="https://swayam-blog.hashnode.dev/linear-regression-using-normal-equation">previous article</a>, we need to find the optimal parameters &theta; that help us calculate the best possible hyperplane to fit the data. In that article, we used the Normal equation to directly find those parameters, but that's not going to happen here. </p><p>We are going to pick the parameters randomly and then calculate the cost function, which tells us how much error those random parameters give; then we use gradient descent to find the minimum of that cost function and turn those random parameters into the optimal ones.</p><h2 id="heading-cost-function">Cost Function</h2><p>A cost function is basically a continuous and differentiable function that tells how well an algorithm is performing by returning the amount of error as output. The smaller the error, the better the algorithm is doing; that's why we randomly generate the parameters and then keep changing them in order to reach the minimum of that cost function.</p><p>Now let's define the cost function for Linear Regression. First, we need to think about how we can calculate the error in linear regression. <strong><em>Mathematically, error is the difference between the original value and the calculated value.</em></strong> Luckily we can use this definition here. Our original values are the <strong>target</strong> matrix itself and the calculated values are the <strong>predictions</strong> from our hypothesis. </p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1638466995178/JyM-eWoK2.jpeg" alt="im2.jpg" /></p><p>We simply subtract the original target value from the predicted target value and take the square of the difference as the error for a single sample. Ultimately we need to find the <strong>squared error</strong> for all the samples in the dataset and take their <strong>mean</strong> as the final cost for a certain hypothesis. 
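</p><p>As a quick numeric sketch of this cost (hypothetical toy numbers; X carries a leading column of 1s for the bias term, and the conventional 1/(2m) scaling is used, matching the implementation later in this article):</p>

```python
import numpy as np

def mse_cost(theta, X, y):
    # J(theta) = (1 / 2m) * sum of squared errors over all m samples
    m = X.shape[0]
    error = X @ theta - y
    return (error @ error) / (2 * m)

X = np.array([[1.0, 1.0], [1.0, 2.0]])  # bias column + one feature
y = np.array([2.0, 3.0])

print(mse_cost(np.array([1.0, 1.0]), X, y))  # 0.0  (perfect fit: y = 1 + x)
print(mse_cost(np.array([0.0, 1.0]), X, y))  # 0.5  (every prediction is off by 1)
```

<p>The better the parameters, the smaller this number, which is exactly what gradient descent will exploit.</p><p>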
Squaring the difference helps avoid the situation where negative and positive errors nullify each other in the final hypothesis's cost.</p><p>This error function is also known as the <strong><em>Mean Squared Error (MSE).</em></strong></p><p>So mathematically, let's say we have <strong><em>m</em></strong> samples in the dataset; then:</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1638468533053/6zmgK-BmP.png" alt="1cost_intro.png" /></p><p>It's important to notice that our cost function <strong><em>J(&theta;)</em></strong> depends upon the parameters &theta;, because our <strong>y</strong> (target) and <strong>X</strong> are fixed; the only varying quantities are the parameters. This makes sense, because that's how gradient descent will help us find the appropriate parameters for a minimum of the cost function.</p><h2 id="heading-mathematics-of-gradient-descent">Mathematics of Gradient Descent</h2><blockquote><p>Time to talk Calculus.</p></blockquote><p>Before diving into the algorithm, let's first talk about what a gradient is. </p><p><strong><em>The gradient of a differentiable function is a vector field whose value at a certain point is a vector whose components are the partial derivatives of that function at that same point.</em></strong> Alright, so many big words; let's break them down and try to understand what it really means.</p><p>Mathematically, suppose you have a function <strong><em>f(x,y,z)</em></strong>; then the gradient at some point will be the vector whose components are the partial derivatives of <strong><em>f(x,y,z)</em></strong> w.r.t. <strong>x, y</strong> and <strong>z</strong> at that point.</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1638468619748/ST8KUxBJU.png" alt="2gradient_intro.png" /></p><blockquote><p><strong>Property</strong>: <strong><em>At a certain point, the gradient vector always points towards the direction of the greatest increase of that function</em></strong>. 
</p><p><strong><em>Since we need to go in the direction of greatest decrease, we follow the direction of the negative of the gradient vector.</em></strong></p><p><strong><em>The gradient vector is always perpendicular to the contour lines of the graph of a function</em></strong> <em>(we'll be dealing with contour plots later)</em></p></blockquote><p>Let's visualize the gradient concept using graphs. Say we have a function <em>f(x,y)</em>:</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1638468696039/LY3Pqd50l.png" alt="3gradient_demo.png" /></p><p>If we plot the above function, it'll look something like this:</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1638467127850/YLkZ2Ej-A.png" alt="im3.png" /></p><p>If you're aware of vector calculus, then you probably know that contour plots are very useful for working with 3D curves. A contour plot is basically a 2D graph that is the sliced version of a 3D plot along the z-axis at regular intervals, so if we graph the contour plot of the above function it'll look something like:</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1638467150493/PdsqUNH_B.png" alt="im4.png" /></p><p>Now, this graph makes it really clear that the gradient always points in the direction of the greatest increase of the function: the black arrows represent the direction of the gradient, and the red arrow represents the direction in which we need to move on our cost function to reach the minimum.</p><p>Great, now we know that in order to reach the minimum we need to move in the direction opposite to the gradient, that is, in the <strong>-&nabla;f(&theta;)</strong> direction, and keep updating our initial random parameters accordingly.</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1638468715741/4S-q1EM4z.png" alt="4parameter_update.png" /></p><blockquote><ul><li>&theta; is the matrix of all the parameters &theta;<sub>j</sub></li><li>&theta;<sub>j</sub> is the parameter for the j<sup>th</sup> 
feature</li><li>J(&theta;) is the cost function</li><li>&alpha; is the learning rate</li></ul></blockquote><p>Everything seems obvious except this symbol &alpha;. It's known as the learning rate; remember we discussed that we need to take small steps, and &alpha; makes sure that our algorithm takes small steps while reaching the minimum. The learning rate is always less than 1.</p><p>But what if we keep a large learning rate?</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1638467194140/4upLMkJeQ.jpeg" alt="im5.jpg" /></p><p>As we see in the above figure, our cost function will not be able to reach a minimum if we take large learning rates, which results in the loss increasing instead of decreasing, as represented below.</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1638467207240/7MwUZNTyz.png" alt="im6.png" /></p><h2 id="heading-applying-gradient-descent-to-cost-function">Applying Gradient descent to cost function</h2><p>In this section, we'll be deriving the formulas for the gradients so that we can use those formulas directly in the Python implementation. 
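</p><p>To preview where this is heading, here is a toy sketch of the update rule in action, using the vectorized form of the MSE gradient (the data here is illustrative):</p>

```python
import numpy as np

# Toy data: bias column of 1s plus one feature; the true relation is y = 2x
X = np.array([[1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
y = np.array([2.0, 4.0, 6.0])

theta = np.zeros(2)  # start from all-zero parameters
alpha = 0.1          # learning rate, kept well below 1

for _ in range(2000):
    error = X @ theta - y             # h(x) - y for every sample
    gradients = X.T @ error / len(y)  # partial derivatives w.r.t. each parameter
    theta -= alpha * gradients        # step opposite to the gradient

# theta converges close to [0, 2]
```

<p>Each pass computes all the gradients from the current error and moves every parameter a small step against them; the full class in the next section packages exactly this loop.</p><p>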
Since we already have our cost function:</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1638468737096/6wgjcLkv0v.png" alt="5only_cost.png" /></p><p>expanding X<sup>i</sup> into its individual <strong><em>n</em></strong> features as [X<sup>i</sup><sub>1</sub>, X<sup>i</sup><sub>2</sub>, X<sup>i</sup><sub>3</sub>, ....., X<sup>i</sup><sub>n</sub> ], we get:</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1638468758810/aMz8aSzT6.png" alt="6cost_expand.png" /></p><p>This form makes it easier to understand the calculation of the gradients; let's compute them for each &theta;<sub>j</sub>.</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1638468800716/vWC1EDSXe.png" alt="7deriving_grad.png" /></p><p>So basically we can write the partial derivative of the cost function w.r.t. any &theta;<sub>j</sub> as:</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1638468828933/yAnd3z0Wj.png" alt="8grad_derived.png" /></p><p>Now we can loop over each &theta;<sub>j</sub> from 0 to <strong><em>n</em></strong> and update them as:</p><blockquote><p><strong>Note</strong>: <em>&theta;<sub>0</sub> represents the bias term</em></p></blockquote><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1638468849004/dufhB3ei4.png" alt="9grad_formula.png" /></p><p>That's great, now we have all the tools we need; let's jump straight into the code and implement this algorithm in Python.</p><h2 id="heading-python-implementation">Python Implementation</h2><p>In this section, we'll be using Python and the formulas we derived in the previous section to create a Python class that can perform Linear Regression on a dataset using Gradient Descent as the optimizing algorithm.</p><blockquote><p><strong>Note</strong>: <em>All the code files can be found on Github through <a target="_blank" href="https://github.com/practice404/demystifying_machine_learning/tree/master/linear_regression_gradient_descent">this link</a>.</em></p><p><strong><em>And 
it's highly recommended to follow the notebook along with this section for a better understanding.</em></strong></p></blockquote><p>Before we dive into writing code, one important thing to keep in mind is that before using gradient descent, <strong><em>it's always helpful to normalize the features around their means</em></strong>. The reason is that the dataset can have many independent features with very different scales: for example, the average <em>number of bedrooms</em> can be 3-4, but the <em>area of the house</em> feature can have much larger values. Normalizing makes the values of different features lie in a comparable range, and it also makes it easier for the algorithm to identify the patterns.</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1638468868160/_QTaKkaEQs.png" alt="10normalize.png" /></p><pre><code class="lang-python"><span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">LinearRegression</span>:</span> <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">__init__</span>(<span class="hljs-params">self</span>) -> <span class="hljs-keyword">None</span>:</span> self.X = <span class="hljs-literal">None</span> self.Y = <span class="hljs-literal">None</span> self.parameters = <span class="hljs-literal">None</span> self.cost_history = [] self.mu = <span class="hljs-literal">None</span> self.sigma = <span class="hljs-literal">None</span> <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">calculate_cost</span>(<span class="hljs-params">self</span>):</span> <span class="hljs-string">""" Returns the cost and gradients. parameters: None Returns: cost : Calculated loss (scalar). 
gradients: array containing the gradients w.r.t each parameter """</span> m = self.X.shape[<span class="hljs-number">0</span>] y_hat = np.dot(self.X, self.parameters) y_hat = y_hat.reshape(<span class="hljs-number">-1</span>) error = y_hat - self.Y cost = np.dot(error.T, error)/(<span class="hljs-number">2</span>*m) <span class="hljs-comment"># Modified way to calculate cost</span> gradients = np.zeros(self.X.shape[<span class="hljs-number">1</span>]) <span class="hljs-keyword">for</span> i <span class="hljs-keyword">in</span> range(self.X.shape[<span class="hljs-number">1</span>]): gradients[i] = np.mean(error * self.X[:,i]) <span class="hljs-keyword">return</span> cost, gradients <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">init_parameters</span>(<span class="hljs-params">self</span>):</span> <span class="hljs-string">""" Initialize the parameters as array of 0s parameters: None Returns:None """</span> self.parameters = np.zeros((self.X.shape[<span class="hljs-number">1</span>],<span class="hljs-number">1</span>)) <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">feature_normalize</span>(<span class="hljs-params">self, X</span>):</span> <span class="hljs-string">""" Normalize the samples. parameters: X : input/feature matrix Returns: X_norm : Normalized X. 
"""</span> X_norm = X.copy() mu = np.mean(X, axis=<span class="hljs-number">0</span>) sigma = np.std(X, axis=<span class="hljs-number">0</span>) self.mu = mu self.sigma = sigma <span class="hljs-keyword">for</span> n <span class="hljs-keyword">in</span> range(X.shape[<span class="hljs-number">1</span>]): X_norm[:,n] = (X_norm[:,n] - mu[n]) / sigma[n] <span class="hljs-keyword">return</span> X_norm <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">fit</span>(<span class="hljs-params">self, x, y, learning_rate=<span class="hljs-number">0.01</span>, epochs=<span class="hljs-number">500</span>, is_normalize=True, verbose=<span class="hljs-number">0</span></span>):</span> <span class="hljs-string">""" Iterates and find the optimal parameters for input dataset parameters: x : input/feature matrix y : target matrix learning_rate: between 0 and 1 (default is 0.01) epochs: number of iterations (default is 500) is_normalize: boolean, for normalizing features (default is True) verbose: iterations after to print cost Returns: parameters : Array of optimal value of weights. 
"""</span> self.X = x self.Y = y self.cost_history = [] <span class="hljs-keyword">if</span> self.X.ndim == <span class="hljs-number">1</span>: <span class="hljs-comment"># adding extra dimension, if X is a 1-D array</span> self.X = self.X.reshape(<span class="hljs-number">-1</span>,<span class="hljs-number">1</span>) is_normalize = <span class="hljs-literal">False</span> <span class="hljs-keyword">if</span> is_normalize: self.X = self.feature_normalize(self.X) self.X = np.concatenate([np.ones((self.X.shape[<span class="hljs-number">0</span>],<span class="hljs-number">1</span>)), self.X], axis=<span class="hljs-number">1</span>) self.init_parameters() <span class="hljs-keyword">for</span> i <span class="hljs-keyword">in</span> range(epochs): cost, gradients = self.calculate_cost() self.cost_history.append(cost) self.parameters -= learning_rate * gradients.reshape(<span class="hljs-number">-1</span>,<span class="hljs-number">1</span>) <span class="hljs-keyword">if</span> verbose: <span class="hljs-keyword">if</span> <span class="hljs-keyword">not</span> (i % verbose): print(<span class="hljs-string">f"Cost after <span class="hljs-subst">{i}</span> epochs: <span class="hljs-subst">{cost}</span>"</span>) <span class="hljs-keyword">return</span> self.parameters <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">predict</span>(<span class="hljs-params">self,x, is_normalize=True</span>):</span> <span class="hljs-string">""" Returns the predictions after fitting. parameters: x : input/feature matrix Returns: predictions : Array of predicted target values. 
"""</span> x = np.array(x, dtype=np.float64) <span class="hljs-comment"># converting list to numpy array</span> <span class="hljs-keyword">if</span> x.ndim == <span class="hljs-number">1</span>: x = x.reshape(<span class="hljs-number">1</span>,<span class="hljs-number">-1</span>) <span class="hljs-keyword">if</span> is_normalize: <span class="hljs-keyword">for</span> n <span class="hljs-keyword">in</span> range(x.shape[<span class="hljs-number">1</span>]): x[:,n] = (x[:,n] - self.mu[n]) / self.sigma[n] x = np.concatenate([np.ones((x.shape[<span class="hljs-number">0</span>],<span class="hljs-number">1</span>)), x], axis=<span class="hljs-number">1</span>) <span class="hljs-keyword">return</span> np.dot(x,self.parameters)</code></pre><p>The class and its methods are pretty obvious, try to go one by one and you'll understand what each method is doing and how it's connected to others.</p><p>Still, I would like to put your focus on 3 main methods:</p><ul><li><strong><code>calculate_cost</code></strong> : This method actually uses the formulas we derived in the previous section to calculate the cost according to certain parameters. If you carefully go through the method you may find a weird thing that initially we mentioned cost as:</li></ul><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1638468964836/JhGwq6QlG.png" alt="5only_cost.png" /></p><p> but in code we are calculating cost as:</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1638468984986/_XdkowoLE.png" alt="11wierd_error.png" /></p><p> No need to be puzzled, they both are the same thing, the second equation is the vectorized form of the first one. If you're aware of Linear Algebra operations you can prove to yourself that they both are the same equations. We preferred the second one because often vectorized operations are faster and efficient instead of using loops.</p><ul><li><p><strong><code>fit</code></strong>: This is the method where the actual magic happens. 
It first normalizes the features, then adds an extra feature of all 1s for the bias term, and lastly it keeps iterating to calculate the cost and gradients and update each parameter simultaneously.</p><blockquote><p><strong>*Note:</strong> We first normalize the features and then add the extra feature of 1s for the bias term because it doesn't make any sense to normalize that extra feature that contains all 1s*</p></blockquote></li><li><p><strong><code>predict</code></strong>: This method first normalizes the input then uses the optimal parameters calculated by the <code>fit</code> method to return the predicted target values.</p><blockquote><p><strong>*Note</strong>: the <code>predict</code> method uses the same μ and σ that we calculated from the training set during the training loop to normalize the input*.</p></blockquote></li></ul><p>Great, now we have our class, it's time to test it on the datasets.</p><h3 id="heading-testing-on-datasets">Testing on datasets</h3><p>In this sub-section, we'll be using Sklearn's generated dataset for linear regression to see how our Linear Regression class is performing. Let's visualize it on a graph:</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1638467397014/Hvi4JBKFI.png" alt="im7.png" /></p><p>Let's create an instance of the <code>LinearRegression</code> class and fit this data on it for 500 epochs to get the optimal parameters for our hypothesis.</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1638467415391/BobtltDs_.png" alt="im8.png" /></p><p>Okay, let's see how this hypothesis looks:</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1638467427082/IMvDPzlum.png" alt="im9.png" /></p><p>It fits nicely, but plotting the <strong>cost</strong> is a great way to ensure that everything is working fine, let's do that.
<code>LinearRegression</code> class has a property <code>cost_history</code> that stores the cost after each iteration, let's plot it:</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1638467458230/fD0nMGPvYm.png" alt="im10.png" /></p><p>We can see that our cost function is always decreasing, which is a good sign that our model is working pretty well.</p><p>Before moving on to the next section and discussing Regularization, I want to demonstrate how we can also fit a curve instead of a straight line to a dataset, let's see it in the next sub-section.</p><h3 id="heading-polynomial-regression">Polynomial Regression</h3><p>We are basically going to take the generated dataset for linear regression from sklearn and apply some transformation to it to make it non-linear.</p><p><strong><em>Note: For detailed code implementation I recommend you to go through the notebook from <a target="_blank" href="https://github.com/practice404/demystifying_machine_learning/blob/master/linear_regression_gradient_descent/notebook.ipynb">here</a> since for the sake of learning I'm only showing a few code cells for verification</em></strong></p><p>So what we did is generate the data from Sklearn's <code>make_regression</code> with 1 feature and a target column, then apply the following transformation on it to make that data non-linear:</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1638469011681/B_g1nUhwZ.png" alt="12transformation.png" /></p><p>and after applying it to the dataset, it looks like:</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1638467692506/kDWHE3kiG.png" alt="im15.png" /></p><p>Looks good, we are able to introduce non-linearity, though it would be even better if it also contained some noise samples; anyway, let's start working on this non-linear dataset.</p><p>To make our linear regression predict a non-linear hypothesis we need to create more features (since we have only 1 here) from the features we already have.
A popular way to create more features is to apply some polynomial functions to the original features one by one. For this example, we are going to make 6 different features from the original one:</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1638469032162/WkTsNGVO6.png" alt="13add_new_feat.png" /></p><p>We will be stacking these X1, X2, ..., X6 as features to make our final input/feature matrix <strong>X_</strong>. Now let's use this <strong>X_</strong> matrix to predict the optimal curve.</p><blockquote><p><em>The process of fitting and predicting is the same as shown in the previous section, or you can also refer to the <a target="_blank" href="https://github.com/practice404/demystifying_machine_learning/blob/master/linear_regression_gradient_descent/notebook.ipynb">notebook</a> for better clarity.</em></p></blockquote><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1638467741753/ELUJqstMU.png" alt="im16.png" /></p><p>It looks great, our algorithm is able to predict a fine non-linear boundary and it fits our training set very precisely. But there's a problem: since our algorithm performs so well on the training set, it's quite possible that it won't work well on data outside the training set. This is known as <strong>Overfitting</strong> and it leads to a lack of generality in the hypothesis. </p><p>We are going to address this problem using Regularization in the next section.</p><h2 id="heading-regularization">Regularization</h2><p>With the help of Regularization, we can prevent our algorithm from overfitting. Overfitting occurs when the algorithm assigns heavy parameters to some features according to the training dataset and hyperparameters.
This makes those features dominant in the overall hypothesis and leads to a nice fit on the training set but not such a good one on samples outside the training set.</p><p>The plan is to add the squares of the parameters, multiplied by some big number (λ), to the cost function. Since our algorithm's main motive is to decrease the cost function, it will end up choosing small parameters just to cancel out the effect of adding the parameters multiplied by a large number. So our final cost function gets modified to:</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1638469065654/QYnc7E1kM.png" alt="14regularized_cost.png" /></p><p><strong><em>Note: We denote the bias term as θ<sub>0</sub> and there is no need to regularize the bias term, which is why we are only considering the parameters θ<sub>1</sub> to θ<sub>n</sub>.</em></strong></p><p>Since our cost function has changed, our formulas for the gradients are also affected. The new formulas for the gradients are:</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1638469094284/VvUR7q0Lt.png" alt="15regularized_grad.png" /></p><p>λ is known as the regularization parameter and it should be greater than 0. A large value of λ leads to underfitting and a very small value leads to overfitting, so you need to pick the right one for your dataset by iterating over some sample values.</p><p>Let's implement Regularization by modifying our <code>LinearRegression</code> class. We only need to modify the <code>calculate_cost</code> method because only this method is responsible for calculating both cost and gradients.
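To make the modification concrete before looking at the full class, here is a minimal standalone sketch (toy numbers, not the article's class) of the regularized cost and gradients for a bias-augmented feature matrix; following the note above, this sketch excludes the bias parameter θ0 from the penalty:

```python
import numpy as np

# Toy bias-augmented design matrix (first column of 1s), targets, parameters
X = np.array([[1.0, 0.5], [1.0, 1.5], [1.0, 2.0]])
y = np.array([1.0, 2.0, 3.0])
theta = np.array([0.1, 0.2])
lambda_ = 10.0
m = X.shape[0]

error = X @ theta - y
# Regularized cost: (error . error + lambda * sum(theta_1..n ^ 2)) / (2m)
# NOTE: theta_0 (the bias) is excluded from the penalty term
cost = (error @ error + lambda_ * np.sum(theta[1:] ** 2)) / (2 * m)

# Regularized gradients: the bias gradient gets no penalty term
gradients = (X.T @ error) / m
gradients[1:] += (lambda_ / m) * theta[1:]
```

With λ = 0 this reduces to the unregularized cost and gradients from the earlier section.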
The modified version is shown below:</p><pre><code class="lang-python"><span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">LinearRegression</span>:</span> <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">__init__</span>(<span class="hljs-params">self</span>) -> <span class="hljs-keyword">None</span>:</span> self.X = <span class="hljs-literal">None</span> self.Y = <span class="hljs-literal">None</span> self.parameters = <span class="hljs-literal">None</span> self.cost_history = [] self.mu = <span class="hljs-literal">None</span> self.sigma = <span class="hljs-literal">None</span> <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">calculate_cost</span>(<span class="hljs-params">self, lambda_=<span class="hljs-number">0</span></span>):</span> <span class="hljs-string">""" Returns the cost and gradients. parameters: lambda_ : value of regularization parameter (default is 0) Returns: cost : Caculated loss (scalar). 
gradients: array containing the gradients w.r.t each parameter """</span> m = self.X.shape[<span class="hljs-number">0</span>] y_hat = np.dot(self.X, self.parameters) y_hat = y_hat.reshape(<span class="hljs-number">-1</span>) error = y_hat - self.Y cost = (np.dot(error.T, error) + lambda_*np.sum((self.parameters)**<span class="hljs-number">2</span>))/(<span class="hljs-number">2</span>*m) gradients = np.zeros(self.X.shape[<span class="hljs-number">1</span>]) <span class="hljs-keyword">for</span> i <span class="hljs-keyword">in</span> range(self.X.shape[<span class="hljs-number">1</span>]): gradients[i] = (np.mean(error * self.X[:,i]) + (lambda_*self.parameters[i])/m) <span class="hljs-keyword">return</span> cost, gradients <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">init_parameters</span>(<span class="hljs-params">self</span>):</span> <span class="hljs-string">""" Initialize the parameters as array of 0s parameters: None Returns:None """</span> self.parameters = np.zeros((self.X.shape[<span class="hljs-number">1</span>],<span class="hljs-number">1</span>)) <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">feature_normalize</span>(<span class="hljs-params">self</span>):</span> <span class="hljs-string">""" Normalize the samples. parameters: X : input/feature matrix Returns: X_norm : Normalized X. 
"""</span> X_norm = self.X.copy() mu = np.mean(self.X, axis=<span class="hljs-number">0</span>) sigma = np.std(self.X, axis=<span class="hljs-number">0</span>) self.mu = mu self.sigma = sigma <span class="hljs-keyword">for</span> n <span class="hljs-keyword">in</span> range(self.X.shape[<span class="hljs-number">1</span>]): X_norm[:,n] = (X_norm[:,n] - mu[n]) / sigma[n] <span class="hljs-keyword">return</span> X_norm <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">fit</span>(<span class="hljs-params">self, x, y, learning_rate=<span class="hljs-number">0.01</span>, epochs=<span class="hljs-number">500</span>, lambda_=<span class="hljs-number">0</span>, is_normalize=True, verbose=<span class="hljs-number">0</span></span>):</span> <span class="hljs-string">""" Iterates and find the optimal parameters for input dataset parameters: x : input/feature matrix y : target matrix learning_rate: between 0 and 1 (default is 0.01) epochs: number of iterations (default is 500) is_normalize: boolean, for normalizing features (default is True) verbose: iterations after to print cost Returns: parameters : Array of optimal value of weights. 
"""</span> self.X = x self.Y = y self.cost_history = [] <span class="hljs-keyword">if</span> self.X.ndim == <span class="hljs-number">1</span>: <span class="hljs-comment"># adding extra dimension, if X is a 1-D array</span> self.X = self.X.reshape(<span class="hljs-number">-1</span>,<span class="hljs-number">1</span>) is_normalize = <span class="hljs-literal">False</span> <span class="hljs-keyword">if</span> is_normalize: self.X = self.feature_normalize() self.X = np.concatenate([np.ones((self.X.shape[<span class="hljs-number">0</span>],<span class="hljs-number">1</span>)), self.X], axis=<span class="hljs-number">1</span>) self.init_parameters() <span class="hljs-keyword">for</span> i <span class="hljs-keyword">in</span> range(epochs): cost, gradients = self.calculate_cost(lambda_=lambda_) self.cost_history.append(cost) self.parameters -= learning_rate * gradients.reshape(<span class="hljs-number">-1</span>,<span class="hljs-number">1</span>) <span class="hljs-keyword">if</span> verbose: <span class="hljs-keyword">if</span> <span class="hljs-keyword">not</span> (i % verbose): print(<span class="hljs-string">f"Cost after <span class="hljs-subst">{i}</span> epochs: <span class="hljs-subst">{cost}</span>"</span>) <span class="hljs-keyword">return</span> self.parameters <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">predict</span>(<span class="hljs-params">self,x, is_normalize=True</span>):</span> <span class="hljs-string">""" Returns the predictions after fitting. parameters: x : input/feature matrix Returns: predictions : Array of predicted target values. 
"""</span> x = np.array(x, dtype=np.float64) <span class="hljs-comment"># converting list to numpy array</span> <span class="hljs-keyword">if</span> x.ndim == <span class="hljs-number">1</span>: x = x.reshape(<span class="hljs-number">1</span>,<span class="hljs-number">-1</span>) <span class="hljs-keyword">if</span> is_normalize: <span class="hljs-keyword">for</span> n <span class="hljs-keyword">in</span> range(x.shape[<span class="hljs-number">1</span>]): x[:,n] = (x[:,n] - self.mu[n]) / self.sigma[n] x = np.concatenate([np.ones((x.shape[<span class="hljs-number">0</span>],<span class="hljs-number">1</span>)), x], axis=<span class="hljs-number">1</span>) <span class="hljs-keyword">return</span> np.dot(x,self.parameters)</code></pre><p>Now we have our regularized version of the <code>LinearRegression</code> class. Let's address the previous problem of overfitting on polynomial regression by using a set of values for to pick the right one.</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1638467863656/kXTH7851h.png" alt="im17.png" /></p><p>From the plots, I think that = 10 and = 20 looks good. We can see that as we increase the values of , our algorithm starts to perform even worst on the training set and leads to <strong>Underfitting</strong>. So it gets really important to select the right value of for our dataset.</p><h2 id="heading-conclusion">Conclusion</h2><p>Great work everyone, we successfully learnt and implemented Linear Regression using Gradient Descent. 
One thing to keep in mind is that this optimizing algorithm requires more hyperparameters than the Normal equation we learnt about in the previous article, but irrespective of that, gradient descent works efficiently on large datasets, covering the main drawback of the Normal equation method.</p><p>In the next article we'll be learning our first supervised classification algorithm, known as <strong>Logistic Regression</strong>, and we're going to understand how Regularization prevents overfitting there. </p><p>I hope you have learnt something new, for more updates on upcoming articles get connected with me through <a target="_blank" href="https://twitter.com/_s_w_a_y_a_m_">Twitter</a> and stay tuned for more. Till then enjoy your day and keep learning.</p>]]><![CDATA[<h2 id="heading-overview">Overview</h2><p>This is the second article of the <strong>Demystifying Machine Learning</strong> series; frankly, it is basically the <strong><em>sequel</em></strong> of our previous article where we explained <a target="_blank" href="https://swayam-blog.hashnode.dev/linear-regression-using-normal-equation"><strong>Linear Regression using Normal equation</strong></a>. In this article we'll be exploring another optimizing algorithm known as <strong>Gradient Descent</strong>: how it works, what a cost function is, the mathematics behind gradient descent, its Python implementation, Regularization, and some extra topics like polynomial regression and regularized polynomial regression.</p><h2 id="heading-how-gradient-descent-works-intuition">How Gradient Descent works (Intuition)</h2><p>Gradient descent is basically an iterative optimizing algorithm, i.e. we can use it to find the minimum of a differentiable function. Intuitively we can think of a situation where you're standing somewhere on a mountain and you want to go to the foot of that mountain as fast as possible.
Since we're in a random position on the mountain, one way is to move along the steepest direction while taking small steps <em>(taking large steps towards the steepest direction may get you injured)</em>, and we'll see later that taking large steps is not good for the algorithm either. </p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1638466863825/77yNKPIN1.png" alt="im1.png" /></p><p>That's something similar to what we are going to do for optimizing our hypothesis to give the least error. As we learnt in the <a target="_blank" href="https://swayam-blog.hashnode.dev/linear-regression-using-normal-equation">previous article</a>, we need to find the optimal parameters θ that help us to calculate the best possible hyperplane to fit the data. In that article, we used the Normal equation to directly find those parameters but here that's not gonna happen. </p><p>We are going to randomly pick the parameters and then calculate the cost function, which tells us how much error those random parameters are giving; then we use gradient descent to find the minimum of that cost function and optimize those random parameters into the optimal ones.</p><h2 id="heading-cost-function">Cost Function</h2><p>A cost function is basically a continuous and differentiable function that tells how good an algorithm is performing by returning the amount of error as output. The lesser the error, the better the algorithm is doing; that's why we randomly generate the parameters and then keep changing them in order to reach the minimum of that cost function.</p><p>Now let's define the cost function for Linear regression. First, we need to think about how we can calculate the error in linear regression. <strong><em>Mathematically, error is the difference between the original value and the calculated value.</em></strong> Luckily we can use this definition here.
Our original values are the <strong>target</strong> matrix itself and the calculated values are the <strong>predictions</strong> from our hypothesis. </p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1638466995178/JyM-eWoK2.jpeg" alt="im2.jpg" /></p><p>We simply subtract the original target value from the predicted target value and take the square of the difference as the error for a single sample. Ultimately we need to find the <strong>squared error</strong> for all the samples in the dataset and take their <strong>mean</strong> as our final cost for a certain hypothesis. Squaring the difference helps avoid the situation where the negative and positive errors nullify each other in the final hypothesis's cost.</p><p>This error function is also known as <strong><em> Mean Square Error (MSE).</em></strong></p><p>So mathematically, let's say we have <strong><em>m</em></strong> samples in the dataset, then:</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1638468533053/6zmgK-BmP.png" alt="1cost_intro.png" /></p><p>It's important to notice that our cost function <strong><em>J(θ)</em></strong> depends upon the parameters θ because our <strong>y</strong> (target) and <strong>X</strong> are fixed; the only varying quantities are the parameters, and it makes sense because that's how gradient descent will help us in finding the appropriate parameters for a minimum of the cost function.</p><h2 id="heading-mathematics-of-gradient-descent">Mathematics of Gradient Descent</h2><blockquote><p>Time to talk Calculus.</p></blockquote><p>Before diving into the algorithm let's first talk about what a Gradient is. 
</p><p><strong><em>Gradient of a differentiable function is a vector field whose value at a certain point is a vector whose components are the partial derivatives of that function at that same point.</em></strong> Alright so many big words let's break them down and try to understand what it really is?</p><p>Mathematically suppose you have a function <strong><em>f(x,y,z)</em></strong> then the gradient at some point will be the vector whose components are going to be the partial derivatives of <strong><em>f(x,y,z)</em></strong> w.r.t to <strong>x,y</strong> and <strong>z</strong> at that point.</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1638468619748/ST8KUxBJU.png" alt="2gradient_intro.png" /></p><blockquote><p><strong>Property</strong>: <strong><em>At a certain point, the gradient vector always points towards the direction of the greatest increase of that function</em></strong>. </p><p><strong><em>Since we need to go in the direction of greatest decrease that's why we follow the direction of negative of the gradient vector.</em></strong></p><p><strong><em>Gradient vector is always perpendicular to the contour lines of the graph of a function</em></strong> <em>(we'll be dealing Contour graphs later)</em></p></blockquote><p>Let's visualize the gradient concept using graphs. Say a function <em>f(x,y)</em> as:</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1638468696039/LY3Pqd50l.png" alt="3gradient_demo.png" /></p><p>If we plot the above graph, it'll look something like this:</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1638467127850/YLkZ2Ej-A.png" alt="im3.png" /></p><p>If you're aware of vector calculus, then you probably know that Contour plots are very useful for working with 3D curves. 
A contour plot is basically a 2D graph that is the sliced version of a 3D plot along the z-axis at regular intervals, so if we graph the contour plot of the above function it'll look something like:</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1638467150493/PdsqUNH_B.png" alt="im4.png" /></p><p>Now, this graph makes it really clear that the gradient always points in the direction of the greatest increase of the function; we can see that the black arrows represent the direction of the gradient and the red arrow represents the direction where we need to move in our cost function to reach the minimum.</p><p>Great, now we know that in order to reach the minimum we need to move in the opposite direction of the gradient, that is in the <strong>-∇f(θ)</strong> direction, and keep updating our initial random parameters accordingly.</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1638468715741/4S-q1EM4z.png" alt="4parameter_update.png" /></p><blockquote><ul><li>θ is the matrix of all parameters θ<sub>j</sub></li><li>θ<sub>j</sub> is the parameter for the j<sup>th</sup> feature</li><li>J(θ) is the cost function</li><li>α is the learning rate</li></ul></blockquote><p>Everything seems obvious except this symbol α. It's known as the learning rate; remember we discussed that we need to take small steps, and α makes sure that our algorithm takes small steps while reaching for the minimum.
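The update rule can be sketched numerically on a one-parameter example; this is just an illustration, assuming the simple function f(θ) = θ², whose gradient is 2θ, with the minimum at 0:

```python
# Gradient descent on f(theta) = theta**2, whose gradient is 2*theta
alpha = 0.1            # learning rate, kept well below 1
theta = 5.0            # arbitrary starting point
history = [theta]
for _ in range(50):
    grad = 2 * theta                 # gradient at the current point
    theta = theta - alpha * grad     # step against the gradient
    history.append(theta)

# theta shrinks by a factor of (1 - 2*alpha) = 0.8 per step, toward 0

# With a too-large step the same loop diverges instead of converging
bad = 5.0
for _ in range(50):
    bad = bad - 1.1 * (2 * bad)      # alpha = 1.1 overshoots every step
```

Each iteration of the first loop multiplies θ by 0.8, so it steadily approaches the minimum; the second loop multiplies by -1.2 and blows up, which is exactly the large-learning-rate failure described next.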
The learning rate is always less than 1.</p><p>But what if we keep a large learning rate?</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1638467194140/4upLMkJeQ.jpeg" alt="im5.jpg" /></p><p>As we see in the above figure, our cost function will not be able to reach a minimum if we take large learning rates; it results in the loss increasing instead of decreasing, as represented below.</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1638467207240/7MwUZNTyz.png" alt="im6.png" /></p><h2 id="heading-applying-gradient-descent-to-cost-function">Applying Gradient descent to cost function</h2><p>In this section, we'll be deriving the formulas for the gradients so that we can use those formulas directly in the Python implementation. Since we already have our cost function as:</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1638468737096/6wgjcLkv0v.png" alt="5only_cost.png" /></p><p>expanding X<sup>i</sup> into individual <strong><em>n</em></strong> features as [X<sup>i</sup><sub>1</sub>, X<sup>i</sup><sub>2</sub>, X<sup>i</sup><sub>3</sub>, ....., X<sup>i</sup><sub>n</sub> ] then:</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1638468758810/aMz8aSzT6.png" alt="6cost_expand.png" /></p><p>This form makes it easier to understand the calculation of the gradients, so let's compute them for each θ<sub>j</sub>.</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1638468800716/vWC1EDSXe.png" alt="7deriving_grad.png" /></p><p>so basically we can write the partial derivative of the cost function w.r.t. any θ<sub>j</sub> as:</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1638468828933/yAnd3z0Wj.png" alt="8grad_derived.png" /></p><p>Now we can loop over each θ<sub>j</sub> from 0 to <strong><em>n</em></strong> and update them as:</p><blockquote><p><strong>Note</strong>: <em>θ<sub>0</sub> represents the bias term</em></p></blockquote><p><img 
src="https://cdn.hashnode.com/res/hashnode/image/upload/v1638468849004/dufhB3ei4.png" alt="9grad_formula.png" /></p><p>That's great, now we have all the tools we need, so let's jump straight into the code and implement this algorithm in Python.</p><h2 id="heading-python-implementation">Python Implementation</h2><p>In this section, we'll be using Python and the formulas we derived in the previous section to create a Python class that is able to perform Linear Regression using Gradient Descent as the optimizing algorithm on a dataset.</p><blockquote><p><strong>*Note</strong>: All the code files can be found on Github through <a target="_blank" href="https://github.com/practice404/demystifying_machine_learning/tree/master/linear_regression_gradient_descent">this link</a>.*</p><p><strong><em>And it's highly recommended to follow the notebook along with this section for better understanding.</em></strong></p></blockquote><p>Before we dive into writing code, one important observation to keep in mind is that before using gradient descent, <strong><em>it's always helpful to normalize the features around their means</em></strong>. The reason is that the dataset can initially have many independent features with very different value ranges; for example, on average the <em>number of bedrooms</em> can be 3-4 but the <em>area of the house</em> feature can have much larger values.
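As a quick standalone sketch of this kind of mean normalization (the toy bedroom/area values below are purely illustrative):

```python
import numpy as np

# Two features on wildly different scales: bedrooms and area (toy values)
X = np.array([[3.0, 1500.0],
              [4.0, 2300.0],
              [2.0,  900.0],
              [3.0, 1800.0]])

mu = X.mean(axis=0)      # per-feature mean
sigma = X.std(axis=0)    # per-feature standard deviation
X_norm = (X - mu) / sigma

# After normalization, every feature has mean 0 and standard deviation 1,
# so both columns now lie in a comparable range
```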
Normalizing makes all the values of different features lie on a comparable range and it also makes it easier for the algorithm to identify the patterns.</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1638468868160/_QTaKkaEQs.png" alt="10normalize.png" /></p><pre><code class="lang-python"><span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">LinearRegression</span>:</span> <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">__init__</span>(<span class="hljs-params">self</span>) -> <span class="hljs-keyword">None</span>:</span> self.X = <span class="hljs-literal">None</span> self.Y = <span class="hljs-literal">None</span> self.parameters = <span class="hljs-literal">None</span> self.cost_history = [] self.mu = <span class="hljs-literal">None</span> self.sigma = <span class="hljs-literal">None</span> <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">calculate_cost</span>(<span class="hljs-params">self</span>):</span> <span class="hljs-string">""" Returns the cost and gradients. parameters: None Returns: cost : Caculated loss (scalar). 
gradients: array containing the gradients w.r.t each parameter """</span> m = self.X.shape[<span class="hljs-number">0</span>] y_hat = np.dot(self.X, self.parameters) y_hat = y_hat.reshape(<span class="hljs-number">-1</span>) error = y_hat - self.Y cost = np.dot(error.T, error)/(<span class="hljs-number">2</span>*m) <span class="hljs-comment"># Modified way to calculate cost</span> gradients = np.zeros(self.X.shape[<span class="hljs-number">1</span>]) <span class="hljs-keyword">for</span> i <span class="hljs-keyword">in</span> range(self.X.shape[<span class="hljs-number">1</span>]): gradients[i] = np.mean(error * self.X[:,i]) <span class="hljs-keyword">return</span> cost, gradients <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">init_parameters</span>(<span class="hljs-params">self</span>):</span> <span class="hljs-string">""" Initialize the parameters as array of 0s parameters: None Returns:None """</span> self.parameters = np.zeros((self.X.shape[<span class="hljs-number">1</span>],<span class="hljs-number">1</span>)) <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">feature_normalize</span>(<span class="hljs-params">self, X</span>):</span> <span class="hljs-string">""" Normalize the samples. parameters: X : input/feature matrix Returns: X_norm : Normalized X. 
"""</span> X_norm = X.copy() mu = np.mean(X, axis=<span class="hljs-number">0</span>) sigma = np.std(X, axis=<span class="hljs-number">0</span>) self.mu = mu self.sigma = sigma <span class="hljs-keyword">for</span> n <span class="hljs-keyword">in</span> range(X.shape[<span class="hljs-number">1</span>]): X_norm[:,n] = (X_norm[:,n] - mu[n]) / sigma[n] <span class="hljs-keyword">return</span> X_norm <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">fit</span>(<span class="hljs-params">self, x, y, learning_rate=<span class="hljs-number">0.01</span>, epochs=<span class="hljs-number">500</span>, is_normalize=True, verbose=<span class="hljs-number">0</span></span>):</span> <span class="hljs-string">""" Iterates and find the optimal parameters for input dataset parameters: x : input/feature matrix y : target matrix learning_rate: between 0 and 1 (default is 0.01) epochs: number of iterations (default is 500) is_normalize: boolean, for normalizing features (default is True) verbose: iterations after to print cost Returns: parameters : Array of optimal value of weights. 
"""</span> self.X = x self.Y = y self.cost_history = [] <span class="hljs-keyword">if</span> self.X.ndim == <span class="hljs-number">1</span>: <span class="hljs-comment"># adding extra dimension, if X is a 1-D array</span> self.X = self.X.reshape(<span class="hljs-number">-1</span>,<span class="hljs-number">1</span>) is_normalize = <span class="hljs-literal">False</span> <span class="hljs-keyword">if</span> is_normalize: self.X = self.feature_normalize(self.X) self.X = np.concatenate([np.ones((self.X.shape[<span class="hljs-number">0</span>],<span class="hljs-number">1</span>)), self.X], axis=<span class="hljs-number">1</span>) self.init_parameters() <span class="hljs-keyword">for</span> i <span class="hljs-keyword">in</span> range(epochs): cost, gradients = self.calculate_cost() self.cost_history.append(cost) self.parameters -= learning_rate * gradients.reshape(<span class="hljs-number">-1</span>,<span class="hljs-number">1</span>) <span class="hljs-keyword">if</span> verbose: <span class="hljs-keyword">if</span> <span class="hljs-keyword">not</span> (i % verbose): print(<span class="hljs-string">f"Cost after <span class="hljs-subst">{i}</span> epochs: <span class="hljs-subst">{cost}</span>"</span>) <span class="hljs-keyword">return</span> self.parameters <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">predict</span>(<span class="hljs-params">self,x, is_normalize=True</span>):</span> <span class="hljs-string">""" Returns the predictions after fitting. parameters: x : input/feature matrix Returns: predictions : Array of predicted target values. 
"""</span> x = np.array(x, dtype=np.float64) <span class="hljs-comment"># converting list to numpy array</span> <span class="hljs-keyword">if</span> x.ndim == <span class="hljs-number">1</span>: x = x.reshape(<span class="hljs-number">1</span>,<span class="hljs-number">-1</span>) <span class="hljs-keyword">if</span> is_normalize: <span class="hljs-keyword">for</span> n <span class="hljs-keyword">in</span> range(x.shape[<span class="hljs-number">1</span>]): x[:,n] = (x[:,n] - self.mu[n]) / self.sigma[n] x = np.concatenate([np.ones((x.shape[<span class="hljs-number">0</span>],<span class="hljs-number">1</span>)), x], axis=<span class="hljs-number">1</span>) <span class="hljs-keyword">return</span> np.dot(x,self.parameters)</code></pre><p>The class and its methods are pretty obvious, try to go one by one and you'll understand what each method is doing and how it's connected to others.</p><p>Still, I would like to put your focus on 3 main methods:</p><ul><li><strong><code>calculate_cost</code></strong> : This method actually uses the formulas we derived in the previous section to calculate the cost according to certain parameters. If you carefully go through the method you may find a weird thing that initially we mentioned cost as:</li></ul><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1638468964836/JhGwq6QlG.png" alt="5only_cost.png" /></p><p> but in code we are calculating cost as:</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1638468984986/_XdkowoLE.png" alt="11wierd_error.png" /></p><p> No need to be puzzled, they both are the same thing, the second equation is the vectorized form of the first one. If you're aware of Linear Algebra operations you can prove to yourself that they both are the same equations. We preferred the second one because often vectorized operations are faster and efficient instead of using loops.</p><ul><li><p><strong><code>fit</code></strong>: This is the method where the actual magic happens. 
It first normalizes the features, then adds an extra feature of all 1s for the bias term, and finally keeps iterating to calculate the cost and gradients and update each parameter simultaneously.</p><blockquote><p><strong>Note:</strong> We first normalize the features and then add the extra feature of 1s for the bias term, because it doesn't make any sense to normalize a feature that contains only 1s.</p></blockquote></li><li><p><strong><code>predict</code></strong>: This method first normalizes the input, then uses the optimal parameters calculated by the <code>fit</code> method to return the predicted target values.</p><blockquote><p><strong>Note:</strong> The <code>predict</code> method uses the same μ and σ that we calculated from the training set during the training loop to normalize the input.</p></blockquote></li></ul><p>Great, now we have our class; it's time to test it on the datasets.</p><h3 id="heading-testing-on-datasets">Testing on datasets</h3><p>In this sub-section, we'll be using Sklearn's generated dataset for linear regression to see how our Linear Regression class performs. Let's visualize it on a graph:</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1638467397014/Hvi4JBKFI.png" alt="im7.png" /></p><p>Let's create an instance of the <code>LinearRegression</code> class and fit this data on it for 500 epochs to get the optimal parameters for our hypothesis.</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1638467415391/BobtltDs_.png" alt="im8.png" /></p><p>Okay, let's see how this hypothesis looks:</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1638467427082/IMvDPzlum.png" alt="im9.png" /></p><p>It fits nicely, but plotting the <strong>cost</strong> is a great way to confirm that everything is working correctly, so let's do that. 
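</p><p>As a quick standalone sanity check of what that plot should look like, here is a minimal sketch of batch gradient descent on a hypothetical noise-free line y = 2x + 1 (independent of the class above), recording the cost after every epoch:</p>

```python
import numpy as np

# Minimal sketch (hypothetical toy data): batch gradient descent on the
# noise-free line y = 2x + 1, recording the cost after every epoch.
X = np.array([[1.0, 1], [1, 2], [1, 3], [1, 4]])  # leading column of 1s = bias term
y = np.array([3.0, 5, 7, 9])
theta = np.zeros(2)
costs = []
for _ in range(2000):
    error = X @ theta - y
    costs.append((error @ error) / (2 * len(y)))
    theta -= 0.1 * (X.T @ error) / len(y)

# on this convex problem the cost should never go up between epochs
assert all(b <= a + 1e-12 for a, b in zip(costs, costs[1:]))
```

<p>On this convex problem the recorded cost shrinks at every epoch, which is exactly the shape we want to see in the plot. 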
The <code>LinearRegression</code> class has a <code>cost_history</code> property that stores the cost after each iteration; let's plot it:</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1638467458230/fD0nMGPvYm.png" alt="im10.png" /></p><p>We can see that our cost function is always decreasing, which is a good sign that our model is working well.</p><p>Before moving on to the next section and discussing Regularization, I want to demonstrate how we can also fit a curve instead of a straight line to a dataset; let's see it in the next sub-section.</p><h3 id="heading-polynomial-regression">Polynomial Regression</h3><p>We are basically going to take the generated dataset for linear regression from sklearn and apply some transformation to it to make it non-linear.</p><p><strong><em>Note: For the detailed code implementation I recommend going through the notebook from <a target="_blank" href="https://github.com/practice404/demystifying_machine_learning/blob/master/linear_regression_gradient_descent/notebook.ipynb">here</a>, since for the sake of learning I'm only showing a few code cells for verification.</em></strong></p><p>So what we did is generate data with 1 feature and a target column using Sklearn's <code>make_regression</code> and then apply the following transformation to make that data non-linear:</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1638469011681/B_g1nUhwZ.png" alt="12transformation.png" /></p><p>After applying it to the dataset, it looks like:</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1638467692506/kDWHE3kiG.png" alt="im15.png" /></p><p>Looks good, we were able to introduce non-linearity; it would be even better if the data also contained some noise samples, but anyway, let's start working on this non-linear dataset.</p><p>To make our linear regression predict a non-linear hypothesis we need to create more features (since we have only 1 here) from the features we already have. 
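</p><p>This feature-creation step can be sketched with plain numpy; here we assume the new features are simply the powers x¹ through x⁶ of the original feature (the exact transformation used in the notebook may differ):</p>

```python
import numpy as np

# Sketch: build a degree-6 polynomial feature matrix from a single feature x.
# (Assumes the new features are the powers x^1 ... x^6; the notebook may
# use a different transformation.)
x = np.linspace(-2, 2, 50)
X_ = np.column_stack([x**d for d in range(1, 7)])  # columns X1 ... X6

assert X_.shape == (50, 6)
```

<p>Each column of <code>X_</code> is then treated as an ordinary feature by the same <code>LinearRegression</code> class. 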
A popular way to create more features is to apply some polynomial functions to the original features one by one. For this example, we are going to create 6 different features from the original one:</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1638469032162/WkTsNGVO6.png" alt="13add_new_feat.png" /></p><p>We will be stacking these X1, X2, ..., X6 as features to make our final input/feature matrix <strong>X_</strong>. Now let's use this <strong>X_</strong> matrix to predict the optimal curve.</p><blockquote><p><em>The process of fitting and predicting is the same as shown in the previous section, or you can also refer to the <a target="_blank" href="https://github.com/practice404/demystifying_machine_learning/blob/master/linear_regression_gradient_descent/notebook.ipynb">notebook</a> for better clarity.</em></p></blockquote><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1638467741753/ELUJqstMU.png" alt="im16.png" /></p><p>It looks great, our algorithm is able to predict a fine non-linear boundary that fits our training set very precisely. But there's a problem: precisely because our algorithm performs so well on the training set, it may not work well on data outside the training set. This is known as <strong>Overfitting</strong> and it leads to a lack of generality in the hypothesis. </p><p>We are going to address this problem using Regularization in the next section.</p><h2 id="heading-regularization">Regularization</h2><p>With the help of Regularization, we can prevent the problem of overfitting in our algorithm. Overfitting occurs when the algorithm assigns heavy parameters to some features according to the training dataset and hyperparameters. 
This makes those features dominant in the overall hypothesis and leads to a nice fit on the training set but not so good a fit on samples outside the training set.</p><p>The plan is to add the squares of the parameters, multiplied by some number (λ), to the cost function. Since our algorithm's main objective is to decrease the cost function, it will end up learning small parameters just to cancel out the effect of this penalty term. So our final cost function gets modified to:</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1638469065654/QYnc7E1kM.png" alt="14regularized_cost.png" /></p><p><strong><em>Note: We denote the bias term as θ<sub>0</sub>, and the bias term doesn't need to be regularized; that's why we only consider the parameters θ<sub>1</sub> to θ<sub>n</sub>.</em></strong></p><p>Since our cost function changed, the formulas for the gradients are also affected. The new formulas for the gradients are:</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1638469094284/VvUR7q0Lt.png" alt="15regularized_grad.png" /></p><p>λ is known as the regularization parameter and it should be greater than 0. A large value of λ leads to underfitting and a very small value leads to overfitting, so you need to pick the right one for your dataset by iterating over some sample values.</p><p>Let's implement Regularization by modifying our <code>LinearRegression</code> class. We only need to modify the <code>calculate_cost</code> method, because only this method is responsible for calculating both the cost and the gradients. 
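</p><p>Before touching the class, the regularized cost and gradients can be sketched in isolation (a standalone example on a hypothetical 4-sample dataset; X is assumed to already carry the leading column of 1s, and the bias θ<sub>0</sub> is left unpenalized):</p>

```python
import numpy as np

def regularized_cost_and_grads(X, y, theta, lambda_):
    """Sketch of the regularized cost/gradient formulas. Assumes X already
    includes a leading column of 1s and theta[0] is the (unregularized) bias."""
    m = X.shape[0]
    error = X @ theta - y
    reg = theta.copy()
    reg[0] = 0.0  # do not regularize the bias term
    cost = (error @ error + lambda_ * np.sum(reg**2)) / (2 * m)
    grads = (X.T @ error + lambda_ * reg) / m
    return cost, grads

# hypothetical toy data: 4 samples, one real feature plus the bias column
X = np.column_stack([np.ones(4), np.arange(4.0)])
y = np.array([1.0, 3, 5, 7])
theta = np.array([0.5, 1.5])
c0, g0 = regularized_cost_and_grads(X, y, theta, 0.0)
c1, g1 = regularized_cost_and_grads(X, y, theta, 10.0)
assert c1 > c0         # the penalty raises the cost ...
assert g1[0] == g0[0]  # ... but leaves the bias gradient untouched
```

<p>The assertions confirm that the λ penalty raises the cost while leaving the bias gradient alone, matching the note about not regularizing θ<sub>0</sub>. 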
The modified version is shown below:</p><pre><code class="lang-python"><span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">LinearRegression</span>:</span> <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">__init__</span>(<span class="hljs-params">self</span>) -> <span class="hljs-keyword">None</span>:</span> self.X = <span class="hljs-literal">None</span> self.Y = <span class="hljs-literal">None</span> self.parameters = <span class="hljs-literal">None</span> self.cost_history = [] self.mu = <span class="hljs-literal">None</span> self.sigma = <span class="hljs-literal">None</span> <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">calculate_cost</span>(<span class="hljs-params">self, lambda_=<span class="hljs-number">0</span></span>):</span> <span class="hljs-string">""" Returns the cost and gradients. parameters: lambda_ : value of regularization parameter (default is 0) Returns: cost : Calculated loss (scalar). 
gradients: array containing the gradients w.r.t each parameter """</span> m = self.X.shape[<span class="hljs-number">0</span>] y_hat = np.dot(self.X, self.parameters) y_hat = y_hat.reshape(<span class="hljs-number">-1</span>) error = y_hat - self.Y cost = (np.dot(error.T, error) + lambda_*np.sum((self.parameters)**<span class="hljs-number">2</span>))/(<span class="hljs-number">2</span>*m) gradients = np.zeros(self.X.shape[<span class="hljs-number">1</span>]) <span class="hljs-keyword">for</span> i <span class="hljs-keyword">in</span> range(self.X.shape[<span class="hljs-number">1</span>]): gradients[i] = (np.mean(error * self.X[:,i]) + (lambda_*self.parameters[i])/m) <span class="hljs-keyword">return</span> cost, gradients <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">init_parameters</span>(<span class="hljs-params">self</span>):</span> <span class="hljs-string">""" Initialize the parameters as array of 0s parameters: None Returns:None """</span> self.parameters = np.zeros((self.X.shape[<span class="hljs-number">1</span>],<span class="hljs-number">1</span>)) <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">feature_normalize</span>(<span class="hljs-params">self</span>):</span> <span class="hljs-string">""" Normalize the samples. parameters: X : input/feature matrix Returns: X_norm : Normalized X. 
"""</span> X_norm = self.X.copy() mu = np.mean(self.X, axis=<span class="hljs-number">0</span>) sigma = np.std(self.X, axis=<span class="hljs-number">0</span>) self.mu = mu self.sigma = sigma <span class="hljs-keyword">for</span> n <span class="hljs-keyword">in</span> range(self.X.shape[<span class="hljs-number">1</span>]): X_norm[:,n] = (X_norm[:,n] - mu[n]) / sigma[n] <span class="hljs-keyword">return</span> X_norm <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">fit</span>(<span class="hljs-params">self, x, y, learning_rate=<span class="hljs-number">0.01</span>, epochs=<span class="hljs-number">500</span>, lambda_=<span class="hljs-number">0</span>, is_normalize=True, verbose=<span class="hljs-number">0</span></span>):</span> <span class="hljs-string">""" Iterates and find the optimal parameters for input dataset parameters: x : input/feature matrix y : target matrix learning_rate: between 0 and 1 (default is 0.01) epochs: number of iterations (default is 500) is_normalize: boolean, for normalizing features (default is True) verbose: iterations after to print cost Returns: parameters : Array of optimal value of weights. 
"""</span> self.X = x self.Y = y self.cost_history = [] <span class="hljs-keyword">if</span> self.X.ndim == <span class="hljs-number">1</span>: <span class="hljs-comment"># adding extra dimension, if X is a 1-D array</span> self.X = self.X.reshape(<span class="hljs-number">-1</span>,<span class="hljs-number">1</span>) is_normalize = <span class="hljs-literal">False</span> <span class="hljs-keyword">if</span> is_normalize: self.X = self.feature_normalize() self.X = np.concatenate([np.ones((self.X.shape[<span class="hljs-number">0</span>],<span class="hljs-number">1</span>)), self.X], axis=<span class="hljs-number">1</span>) self.init_parameters() <span class="hljs-keyword">for</span> i <span class="hljs-keyword">in</span> range(epochs): cost, gradients = self.calculate_cost(lambda_=lambda_) self.cost_history.append(cost) self.parameters -= learning_rate * gradients.reshape(<span class="hljs-number">-1</span>,<span class="hljs-number">1</span>) <span class="hljs-keyword">if</span> verbose: <span class="hljs-keyword">if</span> <span class="hljs-keyword">not</span> (i % verbose): print(<span class="hljs-string">f"Cost after <span class="hljs-subst">{i}</span> epochs: <span class="hljs-subst">{cost}</span>"</span>) <span class="hljs-keyword">return</span> self.parameters <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">predict</span>(<span class="hljs-params">self,x, is_normalize=True</span>):</span> <span class="hljs-string">""" Returns the predictions after fitting. parameters: x : input/feature matrix Returns: predictions : Array of predicted target values. 
"""</span> x = np.array(x, dtype=np.float64) <span class="hljs-comment"># converting list to numpy array</span> <span class="hljs-keyword">if</span> x.ndim == <span class="hljs-number">1</span>: x = x.reshape(<span class="hljs-number">1</span>,<span class="hljs-number">-1</span>) <span class="hljs-keyword">if</span> is_normalize: <span class="hljs-keyword">for</span> n <span class="hljs-keyword">in</span> range(x.shape[<span class="hljs-number">1</span>]): x[:,n] = (x[:,n] - self.mu[n]) / self.sigma[n] x = np.concatenate([np.ones((x.shape[<span class="hljs-number">0</span>],<span class="hljs-number">1</span>)), x], axis=<span class="hljs-number">1</span>) <span class="hljs-keyword">return</span> np.dot(x,self.parameters)</code></pre><p>Now we have our regularized version of the <code>LinearRegression</code> class. Let's address the previous problem of overfitting on polynomial regression by using a set of values for to pick the right one.</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1638467863656/kXTH7851h.png" alt="im17.png" /></p><p>From the plots, I think that = 10 and = 20 looks good. We can see that as we increase the values of , our algorithm starts to perform even worst on the training set and leads to <strong>Underfitting</strong>. So it gets really important to select the right value of for our dataset.</p><h2 id="heading-conclusion">Conclusion</h2><p>Great work everyone, we successfully learnt and implemented Linear Regression using Gradient Descent. 
One thing to keep in mind is that this optimization algorithm requires more hyperparameters than the Normal equation method we learnt in the previous article; but irrespective of that, gradient descent works efficiently on large datasets, covering the Normal equation's main drawback.</p><p>In the next article we'll learn our first supervised classification algorithm, known as <strong>Logistic Regression</strong>, and we'll see how Regularization prevents overfitting there. </p><p>I hope you have learnt something new. For updates on upcoming articles, get connected with me through <a target="_blank" href="https://twitter.com/_s_w_a_y_a_m_">Twitter</a> and stay tuned for more. Till then enjoy your day and keep learning.</p>]]>https://cdn.hashnode.com/res/hashnode/image/upload/v1638466497113/pN9wKsCH7.jpeg<![CDATA[Linear Regression using Normal Equation]]>https://swayam-blog.hashnode.dev/linear-regression-using-normal-equationhttps://swayam-blog.hashnode.dev/linear-regression-using-normal-equationFri, 26 Nov 2021 19:47:04 GMT<![CDATA[<h2 id="heading-overview">Overview</h2><p>Linear Regression is the first algorithm in the <strong>Demystifying Machine Learning</strong> series, and in this article we'll be discussing Linear Regression using the Normal equation. This article covers what Linear Regression is, how it works, the maths behind the normal equation method, fixing some edge cases, handling overfitting and code implementation.</p><h2 id="heading-what-is-linear-regression">What is Linear Regression?</h2><p>Linear Regression in simple terms is fitting the best possible linear hypothesis <em>(a line or a hyperplane)</em> on data having a linear relationship so that we can predict new unknown data points with the least possible error. The data doesn't have to have a linear relationship, but having one leads to approximately close predictions. 
For reference, take a look at the representation below.</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1637922660881/L_Jxcc_92.png" alt="intro_linear_reg.png" /></p><p>In the above picture, only 1 feature <em>(along the x-axis)</em> and the target <em>(along the y-axis)</em> are displayed just for the sake of simplicity, and we can see that the red line fits the data very nicely, covering most of the variance. </p><p>One thing to note is that although we call it Linear Regression, it's not always fitting a line; we call the fitted object a hypothesis or hyperplane. If we have N-dimensional data <em>(data having N features)</em> then we can fit a hyperplane of at most N dimensions.</p><h2 id="heading-mathematics-behind-the-scenes">Mathematics behind the scenes</h2><p>Let's take a very simple problem and dataset to derive and mimic the algorithm we are going to use in Linear Regression. </p><p>Assume we have a dataset with only 1 feature, say <strong><em>x</em></strong>, and target <strong><em>y</em></strong>, such that <strong><code>x = [1,2,3]</code></strong> and <strong><code>y = [1,2,2]</code></strong>, and we are going to fit the best possible line on this dataset.</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1637922695043/JrJqCdQvo.png" alt="plot1.png" /></p><p>In the above plot, we can see that the feature is displayed on the <em>x-axis</em>, the target is along the <em>y-axis</em>, and the blue line is the best possible hypothesis through the points. <strong>Now all we need to understand is how to come up with this best possible hypothesis, i.e. the above blue line in our case</strong>.</p><p>The equation of any line is in the format <strong><code>y = wx + b</code></strong> where <strong>w</strong> is the slope and <strong>b</strong> is the intercept. In machine learning lingo we call <strong>w</strong> the <strong>weight</strong> and <strong>b</strong> the <strong>bias</strong>. 
For the above line it came out to be <strong>w = 0.5</strong> and <strong>b = 0.667 </strong> <em>(Don't worry! we'll see how).</em></p><p>So ultimately we can say that we need to calculate the <strong>weight</strong> and <strong>bias</strong> terms for the hypothesis <em>(since 'x' is already known to us)</em>, and then we can use them to get the line's equation.</p><h3 id="heading-linear-algebra">Linear Algebra</h3><p>We are going to use some Linear Algebra concepts to find the right <strong>weight</strong> and <strong>bias</strong> terms for our data. </p><p>Let's say the weight and bias terms are <strong>w</strong> and <strong>b</strong> respectively. So we can write each data point (x, y) as: </p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1637923298112/ZrWBdVRT1.png" alt="Screenshot 2021-11-26 at 4.11.19 PM.png" /></p><p>We can write the above equations as a system of equations using matrices as <strong>XΘ = Y</strong>, where <strong>X</strong> is the input/feature matrix, <strong>Θ</strong> is the matrix of unknowns and <strong>Y</strong> is the target matrix: </p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1637923333788/PsWHA4D9v.png" alt="Screenshot 2021-11-26 at 4.12.21 PM.png" /></p><p>Great, now all we need to do is solve this system of equations and get the <strong>w</strong> and <strong>b</strong> terms.</p><p>Wait, there's a problem. We can't solve the above system of equations, because the target matrix <strong>Y</strong> does not lie in the column space of the input matrix <strong>X</strong>. In simple terms, if we look at the previous graph again, we can notice that our data points are not collinear, i.e. they don't all lie on one line, and that's why we can't find <strong>w</strong> and <strong>b</strong> for the above system of equations.</p><p>And if we think about it for a moment, this sounds right, because in Linear Regression we fit a hypothesis to predict the target for some input with the least possible error. 
We do not intend to predict the exact target.</p><p>So what can we do here? We can't solve the above system of equations because <strong>Y</strong> is not in the column space of <strong>X</strong>. So instead we can project <strong>Y</strong> onto the column space of <strong>X</strong>. It is exactly equivalent to projecting one vector onto another.</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1637922751260/n1hwkT-KI.png" alt="Projection_and_rejection.png" /></p><p>In the above representation, <strong>a</strong> and <strong>b</strong> are two vectors and <strong>a<sub>1</sub></strong> is the projection of vector <strong>a</strong> onto <strong>b</strong>. With this, we can see that we now have the component of vector <strong>a</strong> that lies in the vector space of <strong>b</strong>. </p><p>We can obtain the component of <strong>Y</strong> that lies in the column space of <strong>X</strong> by taking the inner product <em>(also known as the dot product)</em>. </p><blockquote><p>The inner product of two vectors <strong>a</strong> and <strong>b</strong> can be found by calculating <strong>a<sup>T</sup>b</strong> </p></blockquote><p>Now we can re-write our system of equations as:$$X \Theta = Y$$</p><center>Multiplying both sides by X<sup>T</sup>:</center><p>$$X^{T}X\Theta = X^{T}Y$$</p><center><b>Assuming (X<sup>T</sup>X) is invertible</b></center><p>$$\Theta = (X^{T}X)^{-1}X^{T}Y$$</p><p>The above equation is known as the <strong>Normal equation</strong>. Now we have the formula to find our matrix Θ, so let's use it and calculate <strong>w</strong> and <strong>b</strong>. 
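</p><p>Before working through it by hand, the formula can be checked numerically on the toy dataset x = [1, 2, 3], y = [1, 2, 2] (a small numpy sketch):</p>

```python
import numpy as np

# Normal equation on the toy dataset; X carries a column of 1s for the bias b.
X = np.array([[1.0, 1], [1, 2], [1, 3]])   # columns: [1, x]
y = np.array([1.0, 2, 2])
theta = np.linalg.pinv(X.T @ X) @ X.T @ y  # Θ = (X^T X)^{-1} X^T Y

b, w = theta
assert abs(w - 0.5) < 1e-9 and abs(b - 2/3) < 1e-9
```

<p>This agrees with the w = 0.5 and b = 2/3 worked out by hand. 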
</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1637923424277/FV3gjG7gsX.png" alt="Screenshot 2021-11-26 at 4.13.45 PM.png" /></p><p>from the above last equation we have our <strong>w</strong> = <strong>0.5</strong> and <strong>b</strong> = <strong>2/3</strong> <em>(0.6667)</em> and we can check from the equation of blue line that our <strong>w</strong> and <strong>b</strong> are exactly correct. That's how we can get the weights and bias terms for our perfect hypothesis using the <strong><em>Normal equation</em></strong>.</p><h2 id="heading-python-implementation">Python Implementation</h2><p>Now it's time to get our hands dirty with code implementation and make our algorithm work for real datasets. In this section we're going to create a class <strong><code>NormalLinearRegression</code></strong> to handle all the computations for us and return the optimal values of <strong>weights</strong>.</p><blockquote><p>Note: All the code files can be found on Github through <a target="_blank" href="https://github.com/practice404/demystifying_machine_learning/tree/master/linear_regression">this link</a>.</p><p><strong><em>And it's highly recommended to follow the notebook along with this section for better understanding.</em></strong></p></blockquote><pre><code class="lang-python"><span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">NormalLinearRegression</span>:</span> <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">__init__</span>(<span class="hljs-params">self</span>) -> <span class="hljs-keyword">None</span>:</span> self.X = <span class="hljs-literal">None</span> self.Y = <span class="hljs-literal">None</span> self.theta = <span class="hljs-literal">None</span> <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">fit</span>(<span class="hljs-params">self,x,y</span>):</span> <span class="hljs-string">""" Returns the optimal weights. 
parameters: x : input/feature matrix y : target matrix Returns: theta: Array of the optimal value of weights. """</span> self.X = x <span class="hljs-keyword">if</span> self.X.ndim == <span class="hljs-number">1</span>: <span class="hljs-comment"># adding extra dimension, if X is a 1-D array</span> self.X = self.X.reshape(<span class="hljs-number">-1</span>,<span class="hljs-number">1</span>) <span class="hljs-comment"># adding extra column of 1s for the bias term</span> self.X = np.concatenate([np.ones((self.X.shape[<span class="hljs-number">0</span>], <span class="hljs-number">1</span>)), self.X], axis=<span class="hljs-number">1</span>) self.Y = y self.theta = np.zeros((self.X.shape[<span class="hljs-number">1</span>],<span class="hljs-number">1</span>)) self.theta = self.calculate_theta() self.theta = self.theta.reshape(<span class="hljs-number">-1</span>,<span class="hljs-number">1</span>) <span class="hljs-keyword">return</span> self.theta <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">predict</span>(<span class="hljs-params">self, x</span>):</span> <span class="hljs-string">""" Returns the predicted target. parameters: x : test input/feature matrix Returns: y: predicted target value. 
"""</span> x = np.array(x) <span class="hljs-comment"># converting list to numpy array</span> <span class="hljs-keyword">if</span> x.ndim == <span class="hljs-number">1</span>: x = x.reshape(<span class="hljs-number">1</span>,<span class="hljs-number">-1</span>) <span class="hljs-comment"># adding extra dimension in front</span> x = np.concatenate([np.ones((x.shape[<span class="hljs-number">0</span>],<span class="hljs-number">1</span>)), x], axis=<span class="hljs-number">1</span>) <span class="hljs-keyword">return</span> np.dot(x,self.theta) <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">calculate_theta</span>(<span class="hljs-params">self</span>):</span> <span class="hljs-string">""" Calculate the optimal weights. parameters: None Returns: theta_temp: Array containing the calculated value of weights """</span> y_projection = np.dot(self.X.T, self.Y) cov = np.dot(self.X.T, self.X) cov_inv = np.linalg.pinv(cov) theta_temp = np.dot(cov_inv, y_projection) <span class="hljs-keyword">return</span> theta_temp</code></pre><p>Code is pretty self-explanatory and I added comments wherever I found necessary but still, I want to point out a few things. We are adding an extra column of 1s to <strong>X</strong> for the <strong>bias</strong> term. Since our <strong><code>theta</code></strong> matrix had both <strong>weight</strong> and <strong>bias</strong> terms so <strong><em>we just added an extra column of 1s so that matrix multiplication handles the addition of bias term</em></strong>.</p><p>We are going to test our hypothesis on two datasets. Our first dataset contains 2 columns in which the first one <em>(only feature)</em> is the population of a city (in 10,000s) and the second column is the profit of a food truck in that city (in $10,000s). A negative value for profit indicates a loss. 
Let's visualize it on a graph.</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1637922789942/TJep5PtnG.png" alt="plot2.png" /></p><p>Now let's use our <strong><code>NormalLinearRegression</code></strong> class to find the best hypothesis to fit on our data.</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1637922799172/yZJFn1iX2.png" alt="plot3.png" /></p><p>Great, now let's find the predictions using the <code>params</code>. The <strong><code>NormalLinearRegression</code></strong> class has a <strong><code>predict</code></strong> method that we can use to get the predictions, and after that we can use them to draw the hypothesis as shown below.</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1637922812953/q967vG0BaE.png" alt="plot4.png" /></p><p>Okay, our hypothesis looks pretty nice. Now let's take a dataset that has multiple features; for the sake of graphical representation, our next dataset contains a training set of housing prices in Portland, Oregon. The first column is the size of the house (in square feet), the second column is the number of bedrooms, and the third column is the price of the house. So in this dataset, we have 2 features and 1 target. 
Let's visualize it on a graph.</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1637922990293/V5UjxLgho.png" alt="plot5.png" /></p><blockquote><p><strong><em>If you're wondering how I plot them, just visit the repo for this algorithm through <a target="_blank" href="https://github.com/practice404/demystifying_machine_learning/tree/master/linear_regression">this link</a> and you'll find the notebook where all the implementations are already done for you.</em></strong> </p></blockquote><p>Now let's find the weights of the best hypothesis for this dataset.</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1637923012695/9bN-O3FZb.png" alt="plot6.png" /></p><p>Awesome, now let's find the predictions and plot the hypothesis for this dataset using the <code>predict</code> method of our class.</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1637923027083/wTvqinUK5.png" alt="plot7.png" /></p><p>Yeah I know, it looks messy, but we can make it uniform:</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1637923062520/TK7bjvXGj.png" alt="plot8.png" /></p><p>Now the thing to note is that it's not a straight line; plotting a 3D graph on a 2D surface may give the impression that it's a line, but it's not. It's an (N-1)-dimensional hyperplane, and in our case it's a 2D plane.</p><p>Great work people, so far we have designed our own Linear Regression algorithm using the Normal equation and tested it on 2 datasets with single and multiple features; you can even try it on your own custom dataset to see how it works. So far so good.</p><p>But there's still one edge case left; let's handle it in the next section.</p><h2 id="heading-what-if-xlesssupgreatertlesssupgreaterx-is-non-invertible">What if (X<sup>T</sup>X) is non-invertible?</h2><p>When deriving the Normal equation, we assumed (X<sup>T</sup>X) to be invertible and then calculated its inverse to find the matrix Θ. 
But what if it's not invertible?</p><p>Let's discuss the cases in which it could fail to be invertible:</p><ul> <li>(X<sup>T</sup>X) is not a square matrix</li> <li>The columns or rows of (X<sup>T</sup>X) are not independent</li></ul><p>The 1<sup>st</sup> case can never happen; let's see why.</p><p>Suppose the dimensions of X are (m,n); then the dimensions of X<sup>T</sup> will be (n,m). So after performing the matrix multiplication, the dimensions of (X<sup>T</sup>X) will be (n,n), and hence it's a square matrix.</p><p>But the 2<sup>nd</sup> case can happen; let's see how.</p><p>Suppose you have a dataset in which the features are not linearly independent. For example, let's say there's a feature labelled <strong>weight in Kg</strong> and another feature labelled <strong>weight in pounds</strong>; these features are linearly dependent, i.e. we can get one feature by applying a linear transformation to the other, and this can make (X<sup>T</sup>X) non-invertible. </p><p>In our Python implementation, though, we used the <strong><code>np.linalg.pinv()</code></strong> function to calculate the inverse, and it uses <a target="_blank" href="https://youtu.be/rYz83XPxiZo">Singular Value Decomposition</a> to return the pseudo-inverse of the matrix if it's non-invertible.</p><p>Another way to remove such ambiguity is to identify those features and remove them manually. Or we can use Regularization and make it invertible. Let's see how we can use Regularization to achieve this. 
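</p><p>The "weight in Kg vs weight in pounds" situation can be reproduced directly (a hypothetical toy dataset; 2.20462 is the usual kg-to-pounds conversion factor):</p>

```python
import numpy as np

# Hypothetical dataset: the third column is just the second converted to
# pounds, so the columns are linearly dependent and X^T X is singular.
kg = np.array([50.0, 60, 70, 80])
X = np.column_stack([np.ones(4), kg, kg * 2.20462])
gram = X.T @ X

assert np.linalg.matrix_rank(gram) < gram.shape[0]  # rank-deficient, non-invertible
theta = np.linalg.pinv(gram) @ X.T @ np.array([1.0, 2, 3, 4])  # pinv still works
assert np.all(np.isfinite(theta))
```

<p>Even though (X<sup>T</sup>X) is rank-deficient here, <code>pinv</code> still returns a finite pseudo-inverse solution, which is why the implementation keeps working on such data. 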
</p><h3 id="heading-regularized-normal-equation">Regularized Normal Equation</h3><p>In Regularization, we add an extra Identity matrix of dimensions (n+1, n+1) to (X<sup>T</sup>X) where n is the number of features and adding extra 1 denotes the extra column of 1s for bias term.</p><p>Regularized Normal equation can be written as:</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1637923507020/OivEeQSrk.png" alt="Screenshot 2021-11-26 at 4.15.07 PM.png" /></p><p>Now let's understand the above equation, look at the 1<sup>st</sup> column it has only 0s no 1s because the 1st column in X is the column of 1s for the bias term and we do not regularize that column. Mathematically it can be proven that (X<sup>T</sup>X + M) is always invertible.</p><p> is called the <strong>regularization parameter</strong>. You need to set it according to your dataset by choosing from a <strong>set of values that should be greater than 0</strong> and select the one which gives the least <strong>root mean square error</strong> on your training set other than = 0. </p><p>Let's implement this in code and see how this method works. We only need to change the <strong><code>calculate_theta</code></strong> method of our class which is reponsible for the calculation of <strong>(X<sup>T</sup>X)<sup>-1</sup>X<sup>T</sup>Y</strong> .</p><p>The modified <strong><code>calculate_theta</code></strong> method should look something like this:</p><pre><code class="lang-python"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">calculate_theta</span>(<span class="hljs-params">self, lambda_</span>):</span> <span class="hljs-string">""" Calculate the optimal weights. 
parameters: lambda_ : regularization parameter Returns: theta_temp: Array containing the calculated value of weights """</span> y_projection = np.dot(self.X.T, self.Y) <span class="hljs-comment"># Creating matrix M (identity matrix with first element 0)</span> M = np.identity(self.X.shape[<span class="hljs-number">1</span>]) M[<span class="hljs-number">0</span>,<span class="hljs-number">0</span>] = <span class="hljs-number">0</span> cov = np.dot(self.X.T, self.X) + lambda_*M <span class="hljs-comment"># adding lambda_ times M to X.T@X</span> cov_inv = np.linalg.pinv(cov) theta_temp = np.dot(cov_inv, y_projection) <span class="hljs-keyword">return</span> theta_temp</code></pre><p>We don't need to change anything else in our class; now let's try a few values for λ and pick the best one.</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1637923087755/GWeZph9vW.png" alt="plot9.png" /></p><p>Intuitively we can see that the plots for λ = 0 and λ = 10 are quite good. Just to be sure, we stored the root mean squared error in a dictionary <strong><code>errors</code></strong>; let's print it out and see which value got the least error.</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1637923101471/9m3TgvwXF.png" alt="plot10.png" /></p><p>We can see that λ = 0 got the least error and λ = 10 is only slightly greater, but just to be on the safer side we will pick λ = 10 for our hypothesis.</p><h2 id="heading-conclusion">Conclusion</h2><p>Great work everyone, we have successfully learnt and implemented our first machine learning algorithm, Linear Regression using the Normal equation, on 2 datasets. There are a few things that we need to keep in mind while using the Normal equation method.
<strong><em>This method is not efficient with respect to computation time on large datasets (samples > 10,000), and the reason is the operation of calculating the inverse of a matrix, which has a time complexity of around O(n<sup>3</sup>).</em></strong> </p><p>In that case the <strong><em>Gradient Descent</em></strong> method comes in very handy, and we'll learn it in our next article. Also, we'll be exploring Regularization in great depth too. </p><p>I hope you have learnt something new. For more updates on upcoming articles get connected with me through <a target="_blank" href="https://twitter.com/_s_w_a_y_a_m_">Twitter</a> and stay tuned for more. Till then enjoy your day and keep learning.</p>]]><![CDATA[<h2 id="heading-overview">Overview</h2><p>Linear Regression is the first algorithm in the <strong>Demystifying Machine Learning</strong> series, and in this article we'll be discussing Linear Regression using the Normal equation. This article covers what Linear Regression is, how it works, the maths behind the normal equation method, fixing some edge cases, handling overfitting and the code implementation.</p><h2 id="heading-what-is-linear-regression">What is Linear Regression?</h2><p>Linear Regression in simple terms is fitting the best possible linear hypothesis <em>(a line or a hyperplane)</em> on data having a linear relationship so that we can predict a new unknown data point with the least possible error. The data doesn't have to have a perfectly linear relationship, but when it does, the predictions come out approximately close. For reference, take a look at the representation below.</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1637922660881/L_Jxcc_92.png" alt="intro_linear_reg.png" /></p><p>In the above picture, only 1 feature <em>(along the x-axis)</em> and the target <em>(along the y-axis)</em> are displayed just for the sake of simplicity, and we can see that the red line fits the data very nicely, covering most of the variance.
</p><p>One thing to note is that although we call it Linear Regression, it doesn't always fit a line; we call the fitted object a hypothesis or hyperplane. If we have N-dimensional data <em>(data having N features)</em> then we can fit a hyperplane of at most N dimensions.</p><h2 id="heading-mathematics-behind-the-scenes">Mathematics behind the scenes</h2><p>Let's take a very simple problem and dataset to derive and mimic the algorithm we are going to use in Linear Regression. </p><p>Assume we have a dataset in which we have only 1 feature say <strong><em>x</em></strong> and target as <strong><em>y</em></strong> such that <strong><code>x = [1,2,3]</code></strong> and <strong><code>y = [1,2,2]</code></strong>, and we are going to fit the best possible line on this dataset.</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1637922695043/JrJqCdQvo.png" alt="plot1.png" /></p><p>In the above plot, we can see that the feature is displayed on the <em>x-axis</em> and the target is along the <em>y-axis</em>, and the blue line is the best possible hypothesis through the points. <strong>Now all we need to understand is how to come up with this best possible hypothesis, which is the above blue line in our case</strong>.</p><p>The equation of any line is in the format of <strong><code>y = wx + b</code></strong> where <strong>w</strong> is the slope and <strong>b</strong> is the intercept. In machine learning lingo we call <strong>w</strong> the <strong>weight</strong> and <strong>b</strong> the <strong>bias</strong>. For the above line it came out to be <strong>w = 0.5</strong> and <strong>b = 0.667 </strong> <em>(Don't worry!
we'll see how).</em></p><p>Now ultimately we can say that somehow we need to calculate the <strong>weight</strong> and <strong>bias</strong> term/terms <em>(since 'x' is already known to us)</em> for the hypothesis, and then we could use them to get the line's equation.</p><h3 id="heading-linear-algebra">Linear Algebra</h3><p>We are going to use some Linear Algebra concepts for finding the right <strong>weight</strong> and <strong>bias</strong> terms for our data. </p><p>Let's say the weight and bias terms are <strong>w</strong> and <strong>b</strong> respectively. So we can write each data point (x, y) as : </p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1637923298112/ZrWBdVRT1.png" alt="Screenshot 2021-11-26 at 4.11.19 PM.png" /></p><p>We can write the above equations as a system of equations using matrices as <strong>XΘ = Y</strong>, where <strong>X</strong> is the input/feature matrix, <strong>Θ</strong> is the matrix of unknowns and <strong>Y</strong> is the target matrix: </p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1637923333788/PsWHA4D9v.png" alt="Screenshot 2021-11-26 at 4.12.21 PM.png" /></p><p>Great, now all we need to do is solve this system of equations and get the <strong>w</strong> and <strong>b</strong> terms.</p><p>Wait, there's a problem. We can't solve the above system of equations because the target matrix <strong>Y</strong> does not lie in the column space of the input matrix <strong>X</strong>. In simple terms, if we look at the previous graph again, we can notice that our data points are not collinear, i.e. they don't lie on a single line, and that's why we can't find the <strong>w</strong> and <strong>b</strong> for the above system of equations.</p><p>And if we think for a moment, this sounds right, because in Linear Regression we fit a hypothesis to predict the target for some input with the least possible error. We do not intend to predict the exact target.</p><p>So what can we do here?
We can't solve the above system of equations because <strong>Y</strong> is not in the column space of <strong>X</strong>. So instead we can project <strong>Y</strong> onto the column space of <strong>X</strong>. It is exactly equivalent to projecting one vector onto another.</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1637922751260/n1hwkT-KI.png" alt="Projection_and_rejection.png" /></p><p>In the above representation, <strong>a</strong> and <strong>b</strong> are two vectors and <strong>a<sub>1</sub></strong> is the projection of vector <strong>a</strong> onto <strong>b</strong>. With this, we can see that now we have the component of vector <strong>a</strong> that lies in the vector space of <strong>b</strong>. </p><p>We can obtain the component of <strong>Y</strong> that lies in the column space of <strong>X</strong> by taking the inner product <em>(also known as the dot product)</em>. </p><blockquote><p>The inner product of two vectors <strong>a</strong> and <strong>b</strong> can be found by calculating <strong>a<sup>T</sup>b</strong> </p></blockquote><p>Now we can re-write our system of equations as:$$X \Theta = Y$$</p><center>multiplying both sides by X<sup>T</sup>.</center><p>$$X^{T}X\Theta = X^{T}Y$$</p><center><b>Assuming (X<sup>T</sup>X) to be invertible</b></center><p>$$\Theta = (X^{T}X)^{-1}X^{T}Y$$</p><p>The above equation is known as the <strong>Normal equation</strong>. Now we have the formula to find our matrix Θ; let's use it and calculate the <strong>w</strong> and <strong>b</strong>. </p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1637923424277/FV3gjG7gsX.png" alt="Screenshot 2021-11-26 at 4.13.45 PM.png" /></p><p>From the last equation above we have <strong>w</strong> = <strong>0.5</strong> and <strong>b</strong> = <strong>2/3</strong> <em>(0.6667)</em>, and we can check from the equation of the blue line that our <strong>w</strong> and <strong>b</strong> are exactly correct.
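</p><p>As a quick sanity check, the same arithmetic can be reproduced in a few lines of NumPy. This is a standalone sketch of the normal-equation computation for the toy dataset x = [1,2,3], y = [1,2,2], separate from the class we build below:</p>

```python
import numpy as np

# Toy dataset; the first column of 1s accounts for the bias term b.
X = np.array([[1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])
Y = np.array([1.0, 2.0, 2.0])

# Normal equation: theta = (X^T X)^-1 X^T Y
theta = np.linalg.pinv(X.T @ X) @ (X.T @ Y)
b, w = theta  # theta[0] is the bias, theta[1] is the weight
print(w, b)   # w = 0.5, b = 0.666...
```

<p>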
That's how we can get the weights and bias terms for our perfect hypothesis using the <strong><em>Normal equation</em></strong>.</p><h2 id="heading-python-implementation">Python Implementation</h2><p>Now it's time to get our hands dirty with code implementation and make our algorithm work for real datasets. In this section we're going to create a class <strong><code>NormalLinearRegression</code></strong> to handle all the computations for us and return the optimal values of <strong>weights</strong>.</p><blockquote><p>Note: All the code files can be found on Github through <a target="_blank" href="https://github.com/practice404/demystifying_machine_learning/tree/master/linear_regression">this link</a>.</p><p><strong><em>And it's highly recommended to follow the notebook along with this section for better understanding.</em></strong></p></blockquote><pre><code class="lang-python"><span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">NormalLinearRegression</span>:</span> <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">__init__</span>(<span class="hljs-params">self</span>) -> <span class="hljs-keyword">None</span>:</span> self.X = <span class="hljs-literal">None</span> self.Y = <span class="hljs-literal">None</span> self.theta = <span class="hljs-literal">None</span> <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">fit</span>(<span class="hljs-params">self,x,y</span>):</span> <span class="hljs-string">""" Returns the optimal weights. parameters: x : input/feature matrix y : target matrix Returns: theta: Array of the optimal value of weights. 
"""</span> self.X = x <span class="hljs-keyword">if</span> self.X.ndim == <span class="hljs-number">1</span>: <span class="hljs-comment"># adding extra dimension, if X is a 1-D array</span> self.X = self.X.reshape(<span class="hljs-number">-1</span>,<span class="hljs-number">1</span>) <span class="hljs-comment"># adding extra column of 1s for the bias term</span> self.X = np.concatenate([np.ones((self.X.shape[<span class="hljs-number">0</span>], <span class="hljs-number">1</span>)), self.X], axis=<span class="hljs-number">1</span>) self.Y = y self.theta = np.zeros((self.X.shape[<span class="hljs-number">1</span>],<span class="hljs-number">1</span>)) self.theta = self.calculate_theta() self.theta = self.theta.reshape(<span class="hljs-number">-1</span>,<span class="hljs-number">1</span>) <span class="hljs-keyword">return</span> self.theta <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">predict</span>(<span class="hljs-params">self, x</span>):</span> <span class="hljs-string">""" Returns the predicted target. parameters: x : test input/feature matrix Returns: y: predicted target value. """</span> x = np.array(x) <span class="hljs-comment"># converting list to numpy array</span> <span class="hljs-keyword">if</span> x.ndim == <span class="hljs-number">1</span>: x = x.reshape(<span class="hljs-number">1</span>,<span class="hljs-number">-1</span>) <span class="hljs-comment"># adding extra dimension in front</span> x = np.concatenate([np.ones((x.shape[<span class="hljs-number">0</span>],<span class="hljs-number">1</span>)), x], axis=<span class="hljs-number">1</span>) <span class="hljs-keyword">return</span> np.dot(x,self.theta) <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">calculate_theta</span>(<span class="hljs-params">self</span>):</span> <span class="hljs-string">""" Calculate the optimal weights. 
parameters: None Returns: theta_temp: Array containing the calculated value of weights """</span> y_projection = np.dot(self.X.T, self.Y) cov = np.dot(self.X.T, self.X) cov_inv = np.linalg.pinv(cov) theta_temp = np.dot(cov_inv, y_projection) <span class="hljs-keyword">return</span> theta_temp</code></pre><p>The code is pretty self-explanatory and I added comments wherever I found necessary, but still I want to point out a few things. We are adding an extra column of 1s to <strong>X</strong> for the <strong>bias</strong> term. Since our <strong><code>theta</code></strong> matrix has both the <strong>weight</strong> and <strong>bias</strong> terms, <strong><em>we just added an extra column of 1s so that the matrix multiplication handles the addition of the bias term</em></strong>.</p><p>We are going to test our hypothesis on two datasets. Our first dataset contains 2 columns in which the first one <em>(the only feature)</em> is the population of a city (in 10,000s) and the second column is the profit of a food truck in that city (in $10,000s). A negative value for profit indicates a loss. Let's visualize it on the graph.</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1637922789942/TJep5PtnG.png" alt="plot2.png" /></p><p>Now let's use our <strong><code>NormalLinearRegression</code></strong> class to find the best hypothesis to fit on our data.</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1637922799172/yZJFn1iX2.png" alt="plot3.png" /></p><p>Great, now let's find the predictions using the <code>params</code>. The class <strong><code>NormalLinearRegression</code></strong> has a method <strong><code>predict</code></strong> that we can use to get the predictions, and after that we can use them to draw the hypothesis as shown below.</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1637922812953/q967vG0BaE.png" alt="plot4.png" /></p><p>Okay, our hypothesis looks pretty nice.
Now let's take a dataset that has multiple features. For the sake of graphical representation, our next dataset contains a training set of housing prices in Portland, Oregon. The first column is the size of the house (in square feet), the second column is the number of bedrooms, and the third column is the price of the house. So in this dataset, we have 2 features and 1 target. Let's visualize it on the graph.</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1637922990293/V5UjxLgho.png" alt="plot5.png" /></p><blockquote><p><strong><em>If you're wondering how I plot them, just visit the repo for this algorithm through <a target="_blank" href="https://github.com/practice404/demystifying_machine_learning/tree/master/linear_regression">this link</a> and you'll find the notebook where all the implementations are already done for you.</em></strong> </p></blockquote><p>Now let's find the weights of the best hypothesis for this dataset.</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1637923012695/9bN-O3FZb.png" alt="plot6.png" /></p><p>Awesome, now let's find the predictions and plot the hypothesis for this dataset using the <code>predict</code> method of our class.</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1637923027083/wTvqinUK5.png" alt="plot7.png" /></p><p>Yeah I know, it looks messy, but we can make it uniform as:</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1637923062520/TK7bjvXGj.png" alt="plot8.png" /></p><p>Now the thing to note is that it's not a straight line; plotting a 3D graph on a 2D surface may give the feel that it's a line, but it's not. It's an (N-1)-dimensional hyperplane, and in our case it's a 2D plane.</p><p>Great work people, so far we have designed our own Linear Regression algorithm using the Normal equation and tested it on 2 datasets with single and multiple features; you can even try it on your own dataset to see how it works.
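</p><p>The multi-feature case follows the exact same code path: prepend the column of 1s and solve the same Normal equation. Below is a self-contained sketch with made-up numbers standing in for (size, bedrooms) and price; the values are illustrative, not the actual Portland dataset:</p>

```python
import numpy as np

# Made-up rows of (size in sq. ft, bedrooms) -> price; illustrative only.
X = np.array([[2100.0, 3.0],
              [1600.0, 3.0],
              [2400.0, 4.0],
              [1400.0, 2.0]])
y = np.array([400.0, 330.0, 369.0, 232.0])

# Prepend the column of 1s for the bias term.
Xb = np.c_[np.ones(X.shape[0]), X]

# theta = (X^T X)^-1 X^T y
theta = np.linalg.pinv(Xb.T @ Xb) @ (Xb.T @ y)

# Predict the price of a new house: bias input 1, then the two features.
price = np.array([1.0, 2000.0, 3.0]) @ theta
```

<p>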
So far so good.</p><p>BUT still, there's an edge case left; let's handle it in the next section.</p><h2 id="heading-what-if-xlesssupgreatertlesssupgreaterx-is-non-invertible">What if (X<sup>T</sup>X) is non-invertible?</h2><p>While deriving the Normal equation, we assumed (X<sup>T</sup>X) to be invertible and then calculated its inverse to find the matrix Θ. But what if it's not invertible?</p><p>Let's discuss the cases in which it could fail to be invertible:</p><ul> <li>(X<sup>T</sup>X) is not a square matrix</li> <li>The columns or rows of (X<sup>T</sup>X) are not independent</li></ul><p>The 1<sup>st</sup> case can never actually happen, let's see why.</p><p>Suppose the dimensions of X are (m,n), then the dimensions of X<sup>T</sup> will be (n,m). So after performing the matrix multiplication, the dimensions of (X<sup>T</sup>X) will be (n,n) and hence it's always a square matrix.</p><p>But the 2<sup>nd</sup> case can be true, let's see how.</p><p>Suppose you have a dataset in which the features are not linearly independent. For example, let's say there's a feature labelled <strong>weight in Kg</strong> and another feature labelled <strong>weight in pounds</strong>; the two features are linearly dependent, i.e. we can get one feature by performing a linear transformation on the other, and this makes (X<sup>T</sup>X) non-invertible. </p><p>In our Python implementation, though, we used the <strong><code>np.linalg.pinv()</code></strong> function to calculate the inverse, and it uses <a target="_blank" href="https://youtu.be/rYz83XPxiZo">Singular Value Decomposition</a> to return the pseudo-inverse of the matrix if it's non-invertible.</p><p>Another way to remove such ambiguity is to identify those features and remove them manually. Or we can use Regularization and make it invertible. Let's see how we can use Regularization to achieve this.
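</p><p>Before moving on to Regularization, the dependent-features case is easy to reproduce. The sketch below (with illustrative numbers) duplicates a weight column in kg and in pounds, shows that X<sup>T</sup>X loses rank, and shows that the SVD-based pseudo-inverse still returns a usable least-squares solution:</p>

```python
import numpy as np

kg = np.array([50.0, 60.0, 70.0, 80.0])
# Column of 1s, weight in kg, weight in pounds: the last two columns are linearly dependent.
X = np.c_[np.ones(4), kg, kg * 2.20462]
y = np.array([1.0, 2.0, 3.0, 4.0])

G = X.T @ X
print(np.linalg.matrix_rank(G))  # 2, not 3: G is singular

# A plain inverse of G would be unusable here, but the SVD-based
# pseudo-inverse still yields a valid least-squares solution.
theta = np.linalg.pinv(G) @ (X.T @ y)
```

<p>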
</p><h3 id="heading-regularized-normal-equation">Regularized Normal Equation</h3><p>In Regularization, we add λ times a modified identity matrix M of dimensions (n+1, n+1) to (X<sup>T</sup>X), where n is the number of features and the extra 1 accounts for the extra column of 1s for the bias term.</p><p>The Regularized Normal equation can be written as:</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1637923507020/OivEeQSrk.png" alt="Screenshot 2021-11-26 at 4.15.07 PM.png" /></p><p>Now let's understand the above equation: look at the 1<sup>st</sup> column of M, it has only 0s and no 1 because the 1st column in X is the column of 1s for the bias term, and we do not regularize that column. Mathematically it can be proven that (X<sup>T</sup>X + λM) is always invertible.</p><p>λ is called the <strong>regularization parameter</strong>. You need to set it according to your dataset by choosing from a <strong>set of candidate values greater than 0</strong> and selecting the one which gives the least <strong>root mean square error</strong> on your training set. </p><p>Let's implement this in code and see how this method works. We only need to change the <strong><code>calculate_theta</code></strong> method of our class, which is responsible for the calculation of <strong>(X<sup>T</sup>X)<sup>-1</sup>X<sup>T</sup>Y</strong>.</p><p>The modified <strong><code>calculate_theta</code></strong> method should look something like this:</p><pre><code class="lang-python"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">calculate_theta</span>(<span class="hljs-params">self, lambda_</span>):</span> <span class="hljs-string">""" Calculate the optimal weights.
parameters: lambda_ : regularization parameter Returns: theta_temp: Array containing the calculated value of weights """</span> y_projection = np.dot(self.X.T, self.Y) <span class="hljs-comment"># Creating matrix M (identity matrix with first element 0)</span> M = np.identity(self.X.shape[<span class="hljs-number">1</span>]) M[<span class="hljs-number">0</span>,<span class="hljs-number">0</span>] = <span class="hljs-number">0</span> cov = np.dot(self.X.T, self.X) + lambda_*M <span class="hljs-comment"># adding lambda_ times M to X.T@X</span> cov_inv = np.linalg.pinv(cov) theta_temp = np.dot(cov_inv, y_projection) <span class="hljs-keyword">return</span> theta_temp</code></pre><p>We don't need to change anything else in our class; now let's try a few values for λ and pick the best one.</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1637923087755/GWeZph9vW.png" alt="plot9.png" /></p><p>Intuitively we can see that the plots for λ = 0 and λ = 10 are quite good. Just to be sure, we stored the root mean squared error in a dictionary <strong><code>errors</code></strong>; let's print it out and see which value got the least error.</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1637923101471/9m3TgvwXF.png" alt="plot10.png" /></p><p>We can see that λ = 0 got the least error and λ = 10 is only slightly greater, but just to be on the safer side we will pick λ = 10 for our hypothesis.</p><h2 id="heading-conclusion">Conclusion</h2><p>Great work everyone, we have successfully learnt and implemented our first machine learning algorithm, Linear Regression using the Normal equation, on 2 datasets. There are a few things that we need to keep in mind while using the Normal equation method.
<strong><em>This method is not efficient with respect to computation time on large datasets (samples > 10,000), and the reason is the operation of calculating the inverse of a matrix, which has a time complexity of around O(n<sup>3</sup>).</em></strong> </p><p>In that case the <strong><em>Gradient Descent</em></strong> method comes in very handy, and we'll learn it in our next article. Also, we'll be exploring Regularization in great depth too. </p><p>I hope you have learnt something new. For more updates on upcoming articles get connected with me through <a target="_blank" href="https://twitter.com/_s_w_a_y_a_m_">Twitter</a> and stay tuned for more. Till then enjoy your day and keep learning.</p>]]>https://cdn.hashnode.com/res/hashnode/image/upload/v1637922619542/auJ1namo7.jpeg<![CDATA[Demystifying Machine Learning]]>https://swayam-blog.hashnode.dev/demystifying-machine-learninghttps://swayam-blog.hashnode.dev/demystifying-machine-learningFri, 19 Nov 2021 19:38:11 GMT<![CDATA[<h2 id="heading-overview">Overview</h2><p><strong>Demystifying Machine Learning</strong> refers to the series of blogs in which I am going to explain the different algorithms of Machine Learning with the behind-the-scenes mathematics and code implementation from scratch. This is an introductory article for the entire series about what the audience can expect, what algorithms we are going to cover and what prerequisites you need to fulfil in order to get the most out of this series. </p><h2 id="heading-prerequisites">Prerequisites</h2><p><img src="https://media.giphy.com/media/dAuumiq85i5evf5UVY/giphy.gif" alt /></p><p>There are a few things you need to be comfortable with so that you can get the best out of the articles.</p><ul><li>Working knowledge of Python and OOP concepts.
It'll be great if you have already worked with NumPy before.</li><li>Basic mathematical concepts like <strong><em>Linear Algebra</em></strong>, <strong><em>Calculus</em></strong> and <strong><em>probability</em></strong> will be needed.</li></ul><p>Don't worry if you're not a pro in maths. It's going to be a beginner-friendly introduction to all the algorithms and mathematics, explaining every concept in brief.</p><h2 id="heading-algorithms-to-cover">Algorithms to cover</h2><p>Below is the list of all the algorithms we are going to cover in this series. The main focus of this series is going to be on Supervised and Unsupervised algorithms.</p><p><img src="https://media.giphy.com/media/3o7TKT0XDElcmlCeNa/giphy.gif" alt /></p><blockquote><p>Deep Learning is a subset of Machine Learning, so we will also be dealing with Neural Networks from scratch, but just to keep the content consistent, advanced stuff like CNNs, sequence models, etc. is kept out of the scope of this series.</p></blockquote><ol><li><strong>Linear Regression :</strong> <ol><li>Linear Regression using <em>normal equation</em> and regularization</li><li>Linear Regression using <em>gradient descent</em> and regularization</li></ol></li><li><strong>Logistic Regression with regularization</strong></li><li><strong>Support Vector Machines (SVM)</strong></li><li><strong>Neural Networks :</strong><ol><li><em>Shallow</em> Neural Networks</li><li><em>Deep</em> Neural Networks</li></ol></li><li><strong>K-Means Clustering</strong></li><li><strong>Dimensionality Reduction (P.C.A)</strong></li><li><strong>Anomaly Detection</strong></li></ol><p>And the best part: we're not just going to understand the algorithms and implement them with Python. At the end, or along the way, we'll make our own Python package including classes for the algorithms we implemented and later publish it on <code>PIP</code> so that you can <code>pip install <package></code> it whenever you want.
(Just for fun 😉)</p><p><img src="https://media.giphy.com/media/NISzyl7TRgmImBdUOZ/giphy.gif" /></p><h2 id="heading-the-best-route">The best route</h2><p>Articles are going to be long and comprehensive with loads of mathematical notation and code implementations, so don't hurry just to finish an article within a specific reading time. Take your time, do a Google search whenever required, make hand notes and understand what the code is actually doing.</p><p>If you're feeling overwhelmed, adopt the policy of "1 algorithm a week". This series is designed in such a way that you get to know your algorithm by asking how it's working and why it's working. A good practice is <strong>NOT</strong> to <em>skip lines and jump to some random section</em>. If you get stuck somewhere or have any other doubts, feel free to contact me via my socials.</p><p>All code implementations will be pushed to a GitHub repository whose link will be shared in the respective algorithm's article, so that you can also contribute to it if you find something interesting about that code implementation.</p><h2 id="heading-conclusion">Conclusion</h2><p>That's all for this introductory article. Nothing better than ending your week by learning a new ML algorithm 😁. So make sure to follow me on <a target="_blank" href="https://twitter.com/_s_w_a_y_a_m_">Twitter</a> for updates on newly published articles. See you next week with Linear Regression, till then enjoy and have a nice day 🍁</p>]]><![CDATA[<h2 id="heading-overview">Overview</h2><p><strong>Demystifying Machine Learning</strong> refers to the series of blogs in which I am going to explain the different algorithms of Machine Learning with the behind-the-scenes mathematics and code implementation from scratch. This is an introductory article for the entire series about what the audience can expect, what algorithms we are going to cover and what prerequisites you need to fulfil in order to get the most out of this series.
</p><h2 id="heading-prerequisites">Prerequisites</h2><p><img src="https://media.giphy.com/media/dAuumiq85i5evf5UVY/giphy.gif" alt /></p><p>There are a few things you need to be comfortable with so that you can get the best out of the articles.</p><ul><li>Working knowledge of Python and OOP concepts. It'll be great if you have already worked with NumPy before.</li><li>Basic mathematical concepts like <strong><em>Linear Algebra</em></strong>, <strong><em>Calculus</em></strong> and <strong><em>probability</em></strong> will be needed.</li></ul><p>Don't worry if you're not a pro in maths. It's going to be a beginner-friendly introduction to all the algorithms and mathematics, explaining every concept in brief.</p><h2 id="heading-algorithms-to-cover">Algorithms to cover</h2><p>Below is the list of all the algorithms we are going to cover in this series. The main focus of this series is going to be on Supervised and Unsupervised algorithms.</p><p><img src="https://media.giphy.com/media/3o7TKT0XDElcmlCeNa/giphy.gif" alt /></p><blockquote><p>Deep Learning is a subset of Machine Learning, so we will also be dealing with Neural Networks from scratch, but just to keep the content consistent, advanced stuff like CNNs, sequence models, etc. is kept out of the scope of this series.</p></blockquote><ol><li><strong>Linear Regression :</strong> <ol><li>Linear Regression using <em>normal equation</em> and regularization</li><li>Linear Regression using <em>gradient descent</em> and regularization</li></ol></li><li><strong>Logistic Regression with regularization</strong></li><li><strong>Support Vector Machines (SVM)</strong></li><li><strong>Neural Networks :</strong><ol><li><em>Shallow</em> Neural Networks</li><li><em>Deep</em> Neural Networks</li></ol></li><li><strong>K-Means Clustering</strong></li><li><strong>Dimensionality Reduction (P.C.A)</strong></li><li><strong>Anomaly Detection</strong></li></ol><p>And the best part: we're not just going to understand the algorithms and implement them
with Python. At the end, or along the way, we'll make our own Python package including classes for the algorithms we implemented and later publish it on <code>PIP</code> so that you can <code>pip install <package></code> it whenever you want. (Just for fun 😉)</p><p><img src="https://media.giphy.com/media/NISzyl7TRgmImBdUOZ/giphy.gif" /></p><h2 id="heading-the-best-route">The best route</h2><p>Articles are going to be long and comprehensive with loads of mathematical notation and code implementations, so don't hurry just to finish an article within a specific reading time. Take your time, do a Google search whenever required, make hand notes and understand what the code is actually doing.</p><p>If you're feeling overwhelmed, adopt the policy of "1 algorithm a week". This series is designed in such a way that you get to know your algorithm by asking how it's working and why it's working. A good practice is <strong>NOT</strong> to <em>skip lines and jump to some random section</em>. If you get stuck somewhere or have any other doubts, feel free to contact me via my socials.</p><p>All code implementations will be pushed to a GitHub repository whose link will be shared in the respective algorithm's article, so that you can also contribute to it if you find something interesting about that code implementation.</p><h2 id="heading-conclusion">Conclusion</h2><p>That's all for this introductory article. Nothing better than ending your week by learning a new ML algorithm 😁. So make sure to follow me on <a target="_blank" href="https://twitter.com/_s_w_a_y_a_m_">Twitter</a> for updates on newly published articles.
See you next week with Linear Regression, till then enjoy and have a nice day 🍁</p>]]>https://cdn.hashnode.com/res/hashnode/image/upload/v1637350461543/FJ0uG5XCY.jpeg<![CDATA[How to Set Up Firebase Authentication in React from Scratch]]>https://swayam-blog.hashnode.dev/how-to-set-up-firebase-authentication-in-react-from-scratch-363b123484b7https://swayam-blog.hashnode.dev/how-to-set-up-firebase-authentication-in-react-from-scratch-363b123484b7Tue, 25 May 2021 09:44:18 GMT<![CDATA[<h2 id="overview">Overview:</h2><p>User authentication is a must if you are building a platform that stores private user data, like a social media app. At the same time, it's kind of tricky to implement. In this article, we will be discussing how we can use Firebase to create fully functional and secure user authentication.</p><h2 id="agenda">Agenda:</h2><p>Following is the list of features we will build later on in this article.</p><ul><li><p>Sign Up</p></li><li><p>Log In</p></li><li><p>Dashboard</p></li><li><p>Log Out</p></li><li><p>Forget Password</p></li><li><p>Protected Routes</p></li></ul><h2 id="prerequisites">Prerequisites:</h2><ul><li><p>Familiarity with the React environment.</p></li><li><p>Basic knowledge of the <strong><em>Context API.</em></strong></p></li><li><p>Basic knowledge of routing in React.</p></li></ul><h2 id="lets-go-with-the-flow">Let's go with the flow</h2><p>So firstly, we need to create a React app. Navigate into the <code>Desktop</code> folder inside the terminal and type <code>npx create-react-app &lt;give any name&gt;</code>.
Inside the <code>src</code> folder, just keep <code>index.js</code> and <code>App.js</code>; delete the rest, we don't need them.</p><h3 id="setting-up-firebase">Setting up Firebase:</h3><p>Okay, so now, visit <a target="_blank" href="https://firebase.google.com/">Firebase</a> and click on <strong><em>Go to console</em></strong>; there, click on <strong><em>Add Project</em></strong> and give it any name you want.</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1622039535195/-HUE6byJa.png" alt /></p><p>Click on <strong><em>continue</em></strong> and disable <strong><em>Google Analytics</em></strong>, then click on <strong><em>continue</em></strong> again. It'll take some time to process, and when it's done, our Firebase app is ready.</p><p>Now, we need to integrate it with our React web app. Click on the <strong><em>web icon</em></strong>. Then, it'll ask you to enter another name for the project for integration. Don't worry, it can be any name you want.</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1622039537714/yiyyfUFux.png" alt /></p><p>Click on <strong><em>Register app</em></strong>. Now, Firebase will give you some keys and other configuration settings so that you can connect your React app with Firebase services.</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1622039540377/78t5sc13f.png" alt /></p><blockquote><p>The configuration file will be different for you. Also, don't use the above keys. <strong><em>This demo app will be deleted after this article.</em></strong></p></blockquote><p>We will use them later. Now, let's set up authentication. Navigate to the home page of your Firebase app and click on <strong><em>Authentication</em></strong>. Then, click on <strong><em>Get Started</em></strong>. It'll show you some methods that you can use to create user authentication. In this article, we will be using the <strong>Email/Password</strong> method. So, click on it. 
Then hit <strong><em>enable</em></strong> only for the first option and click <strong><em>save</em></strong>.</p><p>Great! Now we have a method for authentication and a config file to connect the app. Let's go to our favourite code editor and start coding!</p><h2 id="danger-code-ahead">Danger! Code ahead</h2><p>First, we are going to create a <code>.env.local</code> file for storing our Firebase configuration details. Putting them in plain sight inside your code will make it easy for attackers to access your Firebase app. Inside our React app, create a <code>.env.local</code> file and store only the keys and values that are inside the variable <strong><em>firebaseConfig</em></strong>, as shown below:</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1622039543224/hjRwnes9E.png" alt /></p><blockquote><p>In React, names of <strong><em>environment variables</em></strong> must start with <strong><em>REACT_APP</em></strong></p></blockquote><p>Great! Now, it's time to connect it. For that, we are going to use the <strong>firebase</strong> module. Below is the list of all the modules we'll be using for this entire project.</p><ul><li><p><code>react-router-dom</code> for working with different routes.</p></li><li><p><code>bootstrap</code> for styling</p></li><li><p><code>react-bootstrap</code> for pre-built styled components</p></li><li><p><code>firebase</code> for working with Firebase</p></li></ul><p>So, go ahead and install them all at once using the command: <code>npm i react-router-dom bootstrap react-bootstrap firebase</code>.</p><p>Inside the React app, create a file <code>firebase.js</code> for making the connection with Firebase.</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1622039545881/pi7vVGxsU.png" alt /></p><p>If you've been observant, you'll notice this file is very similar to the file that Firebase gave us for creating a connection. Yeah, we just used the <code>firebase</code> module instead of an external JavaScript script. 
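</p><p>Since the connection file appears above only as a screenshot, here is a rough sketch of what <code>firebase.js</code> can look like, assuming the v8-style <code>firebase</code> API that the article's <code>app.auth()</code> call implies; the environment-variable names are illustrative and should match whatever you put in your <code>.env.local</code>:</p>

```javascript
// firebase.js — a sketch: wires the React app to Firebase using .env.local values
import firebase from "firebase/app";
import "firebase/auth";

const app = firebase.initializeApp({
  apiKey: process.env.REACT_APP_FIREBASE_API_KEY,
  authDomain: process.env.REACT_APP_FIREBASE_AUTH_DOMAIN,
  projectId: process.env.REACT_APP_FIREBASE_PROJECT_ID,
  // ...the remaining keys copied from your firebaseConfig
});

// auth is what the context file will use to call signup, login, logout, etc.
export const auth = app.auth();
export default app;
```

<p>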
We initialised the app with the environment variables stored in the <code>.env.local</code> file, and <code>app.auth()</code> is stored inside <code>auth</code>, which will be responsible for calling several methods like <strong>login, signup, logout, etc.</strong></p><p>Very well. Now, it's time to set up the <strong>Context API</strong> so that we can define our authentication methods in one file and access them in the relevant components. Inside the <code>src</code> folder, create another folder with the name <strong>context</strong> and inside it, create a file <code>authContext.js</code> as shown below.</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1622039548839/1hvTM-ory.png" alt /></p><p>Basic knowledge of the <strong>Context API</strong> is mentioned in the prerequisites, but I'm explaining it anyway. Above, you can see the basic structure of the context JavaScript file. First, we create a context using <code>React.createContext()</code>. It gives us two things: a <strong>Provider and a Consumer</strong>. For now, we are only concerned with the <strong>Provider</strong>, which enables us to pass values into it and use them in any component.</p><p>The <code>AuthProvider</code> component returns the <code>&lt;AuthContext.Provider&gt;</code> component with a <code>value</code> prop that contains the values we want to pass; <code>{children}</code> refers to the root component which will be wrapped by the <strong>Provider</strong>. In the end, we created a custom hook <code>useAuth()</code> which directly gives you all of the passed values.</p><p>Now, let's start creating our authentication methods inside <strong>authContext.js</strong> and pass them to the <strong>Provider. 
</strong>Replace the comment in the above code with the following lines.</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1622039551115/_n5-pSrdr.png" alt /></p><p>You can see that first we create two states: <code>currentUser</code>, which holds info about the logged-in user, and <code>loading</code>, so that <code>{children}</code> is displayed only after it is set to false. So, instead of simply rendering <code>{children}</code> inside <code>&lt;AuthContext.Provider&gt;</code>, change it to <code>{!loading && children}</code>.</p><p>As I said earlier, all the authentication methods live inside <code>auth</code> from <strong>firebase.js</strong>. So, we used <code>auth</code> to call the different methods and wrapped them inside respective functions so that we can call them when needed. <code>useEffect()</code> contains an event handler that continuously listens to the authentication state, like when the user logs in and when they sign out. Based on that, it sets the <code>currentUser</code> state to the logged-in user or <code>undefined</code>.</p><p><code>auth.onAuthStateChanged()</code> is an event handler. Every time the component loads, <code>useEffect()</code> would register that handler again, which may cause a memory leak and make your app slow. To deal with this, <code>auth.onAuthStateChanged()</code> returns an <code>unsubscribe</code> method that unsubscribes you from the event handler as the component unloads.</p><p>After that, we just pass all the methods and states to the <code>value</code> prop inside the <strong>Provider.</strong></p><p>Now, we need to wrap our root component with the <code>AuthProvider</code> component. In this app, we'll be wrapping our <code>App</code> component. 
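</p><p>Putting the pieces from the last few paragraphs together, <code>authContext.js</code> takes roughly this shape. This is a condensed sketch, and the screenshots above are the source of truth; the function bodies here just forward to the v8 Firebase auth methods:</p>

```javascript
// authContext.js — sketch of the auth context described above
import React, { useContext, useState, useEffect } from "react";
import { auth } from "../firebase";

const AuthContext = React.createContext();

// Custom hook that hands any component all of the passed values
export function useAuth() {
  return useContext(AuthContext);
}

export function AuthProvider({ children }) {
  const [currentUser, setCurrentUser] = useState();
  const [loading, setLoading] = useState(true);

  // Each function simply wraps the corresponding firebase auth method
  const signup = (email, password) => auth.createUserWithEmailAndPassword(email, password);
  const login = (email, password) => auth.signInWithEmailAndPassword(email, password);
  const logout = () => auth.signOut();
  const resetPassword = (email) => auth.sendPasswordResetEmail(email);

  useEffect(() => {
    // Keep currentUser in sync with Firebase; unsubscribe on unmount
    const unsubscribe = auth.onAuthStateChanged(user => {
      setCurrentUser(user);
      setLoading(false);
    });
    return unsubscribe;
  }, []);

  const value = { currentUser, signup, login, logout, resetPassword };

  return (
    <AuthContext.Provider value={value}>
      {!loading && children}
    </AuthContext.Provider>
  );
}
```

<p>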
So, open <strong>index.js</strong> and do the following:</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1622039553368/2HT7GCkPK.png" alt /></p><p>Now our <code>&lt;App /&gt;</code> will be <code>{children}</code> for <code>&lt;AuthProvider&gt;</code>. You can see I also imported a Bootstrap CSS file for giving style to each of our components. But you can create your own custom style sheet if you want; in this article, we are just focusing on functionality.</p><p>Till now, we have been dealing with the functionality setup. So, it's time to create components for each method. Create a folder inside the <code>src</code> folder with the name <code>components</code>.</p><h2 id="sign-up-component">Sign-up component:</h2><p>First, we are going to deal with the sign-up component. So, inside the <code>components</code> folder, create a file with the name <strong>signup.js</strong>.</p><p>We will create a form with three fields: <strong>email, password, confirm-password</strong>, and check <strong>if confirm-password matches the password</strong>. Only then are we going to call the signup method from <strong>authContext.js</strong>. If the signup succeeds, we redirect our user to the <strong><em>dashboard</em></strong> component (going to create this later).</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1622039555712/YWuSdq_gy.png" alt /></p><p>As I mentioned earlier in the article, our custom hook <code>useAuth()</code> will be used to receive the values passed to the context, and in this component, we are using the <strong>signup</strong> method created inside <strong>authContext.js</strong>. <code>Link</code> is used to take the user to the <strong>login</strong> page if they already have an account, and the <code>useHistory</code> hook is used for redirecting the user after successfully registering.</p><p><code>emailRef</code>, <code>passwordRef</code> and <code>passwordConfirmRef</code> are used as references for the respective input fields, and later, we destructure the <strong>signup</strong> method from the <code>useAuth</code> hook.</p><p>Now, take a look at the function <code>handleSubmit</code>. It's an <code>async</code> function because authentication methods from Firebase return a <strong>promise</strong>, so we are using <code>async / await</code> to handle it. Here, first we check if the password and confirmPassword are the same. Then, inside the <code>try / catch</code> block, we call the <strong>signup</strong> function, passing the <strong>email</strong> and <strong>password</strong> entered by the user.</p><p>That is our functionality for <strong>signup</strong>. So now, inside the return, let's create the form and other UI. We are using <strong>Bootstrap cards</strong> and a <strong>form</strong> for styling purposes.</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1622039558266/SaccRLdou.png" alt /></p><p>Just go line by line. The code is very simple. All we are doing is using <strong>Bootstrap</strong> components and creating the form. In the end, we created a link to the <strong>login</strong> component in case the user already has an account. That's it. Our <strong>signup</strong> component is ready.</p><h2 id="login-component">Login Component:</h2><p>Create a file inside the <strong>components</strong> folder and name it <strong>login.js</strong>.</p><p>This component will be pretty much the same as the <strong>signup</strong> component. The only difference is we are calling the <strong>login</strong> function instead of <strong>signup</strong>, and we don't need the confirm-password field here. 
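</p><p>The heart of these submit handlers is plain JavaScript. Here is a framework-free sketch of the flow both components follow; the helper name <code>submitSignup</code> and the error strings are illustrative, and <code>signup</code> stands in for the context method:</p>

```javascript
// Sketch of the submit flow: validate first, then await the auth method.
// `signup` is passed in so the logic can be exercised without Firebase.
async function submitSignup(email, password, passwordConfirm, signup) {
  if (password !== passwordConfirm) {
    return { error: "Passwords do not match" };
  }
  try {
    await signup(email, password);
    return { error: null };
  } catch (err) {
    // Firebase rejections (weak password, email already in use, ...) land here
    return { error: "Failed to create an account" };
  }
}
```

<p>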
The rest of the code in both components will be the same.</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1622039560901/Widt1Ii8A.png" alt /></p><p>What we are returning is also very similar to <strong>signup</strong>, except instead of creating the link to <strong>login</strong>, we ask if they are not registered yet and take them to the <strong>signup</strong> component.</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1622039563451/B7ntLknPB.png" alt /></p><p>The extra thing we are giving users is an option to reset their password, by creating a link to the <strong>forgetPassword</strong> component (going to create it later).</p><h2 id="forget-password-component">Forget Password Component:</h2><p>The way Firebase password reset works is that when the user requests it, Firebase sends an email to the registered email address with further instructions and a link to reset their password.</p><p>The cool thing is, again, the code will be pretty similar to the above component; here we are calling the <code>resetPassword</code> method. Go ahead and create a file with the name <strong>forgetpassword.js</strong> and take a look below.</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1622039566668/0_UV3X4Va.png" alt /></p><p>As you can see, we are destructuring <code>resetPassword</code> from the custom <code>useAuth()</code> hook and have an extra state for storing a message like <strong><em>check your inbox</em></strong> after successfully calling the <code>resetPassword</code> function.</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1622039568977/wFIzGmY-W.png" alt /></p><p>Nothing new. 
We are already familiar with this kind of code, and that's it: our <strong>reset password</strong> component is ready.</p><h2 id="dashboard">Dashboard:</h2><p>For now, our dashboard just shows the email of the <code>currentUser</code> and also contains a <code>logout</code> button for logging out the user. You can add more functionality according to your own project.</p><p>Go ahead and create a file with the name <strong>dashboard.js</strong> inside the <strong>components</strong> folder.</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1622039571405/kFZaEoxQN.png" alt /></p><p>The thing to notice is that here we are destructuring <code>logout</code> and <code>currentUser</code> for handling signing out and showing the email of the logged-in user, respectively. Once the user successfully logs out, we redirect them to the <strong>login</strong> component using the <code>useHistory()</code> hook. The <strong>Dashboard</strong> component is ready.</p><h2 id="setting-up-routes-for-components">Setting up Routes for components:</h2><p>We are done with all of our components. Now let's set up each of their routes inside <strong>App.js</strong> using <code>react-router-dom</code>. Open <strong>App.js</strong> and do the following.</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1622039574001/9VH76dG61.png" alt /></p><p>We imported <code>BrowserRouter</code> and <code>Route</code> for setting up routes for each of the different components, and <code>Switch</code> for loading a single component at a time.</p><p>Now, if you start the app by running <code>npm start</code>, you'll see a blank screen because the home page is empty right now. To see the different components, go to their respective URLs. 
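</p><p>The <strong>App.js</strong> in the screenshot can be sketched roughly like this, assuming react-router-dom v5 syntax (which matches the <code>Switch</code> and <code>useHistory</code> usage in this article); the exact paths are illustrative:</p>

```javascript
// App.js — sketch: one route per component, Switch renders the first match
import React from "react";
import { BrowserRouter as Router, Switch, Route } from "react-router-dom";
import Signup from "./components/signup";
import Login from "./components/login";
import ForgetPassword from "./components/forgetpassword";
import Dashboard from "./components/dashboard";

function App() {
  return (
    <Router>
      <Switch>
        <Route exact path="/signup" component={Signup} />
        <Route exact path="/login" component={Login} />
        <Route exact path="/forget-password" component={ForgetPassword} />
        <Route exact path="/dashboard" component={Dashboard} />
      </Switch>
    </Router>
  );
}

export default App;
```

<p>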
For example, <a target="_blank" href="http://localhost:3000/signup">http://localhost:3000/signup</a> will take you to:</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1622039576309/jeT6bec8C.png" alt /></p><p>And as you enter your details and click on the <strong><em>signup</em></strong> button, you'll be redirected to the <strong>dashboard</strong> component.</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1622039578359/RMNq2pgEh.png" alt /></p><p>Now, one by one, you can check out the other components too. We have successfully implemented authentication. You can see the list of registered users in your Firebase console under the authentication section. <strong><em>Something is still missing...</em></strong></p><h2 id="securing-sensitive-routes">Securing sensitive routes:</h2><p>Our app is working perfectly and authenticating users very well, but we still forgot something. Can you guess what?</p><p>Well, if you log out the user and try to access the <strong>dashboard</strong>: boom! You can still access it, and that's not good. We don't want any guest user to easily access our dashboard. Although it does not contain anything right now, that might not be true in your project's case. We still need to secure it, so that only registered users can access their dashboard.</p><p>The way we can do so is to create another component: basically a modified version of the <code>Route</code> component from <code>react-router-dom</code>, and it'll check whether someone is logged in or not. 
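</p><p>A sketch of such a component, in react-router-dom v5 style; the redirect target <code>/login</code> is an assumption based on the login route used in this article:</p>

```javascript
// privateRoute.js — sketch: renders the component only for logged-in users
import React from "react";
import { Route, Redirect } from "react-router-dom";
import { useAuth } from "../context/authContext";

export default function PrivateRoute({ component: Component, ...rest }) {
  const { currentUser } = useAuth();

  return (
    <Route
      {...rest}
      render={props =>
        // Logged in → render the protected component; otherwise bounce to login
        currentUser ? <Component {...props} /> : <Redirect to="/login" />
      }
    />
  );
}
```

<p>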
If it's true, then it'll render the <strong>dashboard</strong>; otherwise, it'll just redirect to the <strong>login</strong> component.</p><p>Create a file with the name <strong>privateRoute.js</strong> inside the <strong>components</strong> folder and take a look at what it's going to contain.</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1622039580724/pPgA8vjZh.png" alt /></p><p>Let's understand what we are doing. As I said, <strong>PrivateRoute</strong> is a modified version of the <code>Route</code> component which checks for <code>currentUser</code> before setting up any route. Here, first we got hold of the <code>currentUser</code> from <code>useAuth()</code>.</p><p>Then, we return the <code>&lt;Route&gt;</code> component with all the props required for a normal <code>&lt;Route&gt;</code> component, like <code>path</code>, <code>exact</code>, etc. Then it checks for <code>currentUser</code>. If true, it renders the <strong>dashboard</strong>; otherwise, it redirects to <strong>login</strong>. Here, instead of using <code>useHistory()</code> for redirecting, we'll be using the <code>&lt;Redirect&gt;</code> component from <code>react-router-dom</code> because we need to render something if <code>currentUser</code> is not logged in.</p><p>Now, inside <strong>App.js</strong>, import this component and replace the line where you set up the route for the <strong>dashboard</strong> with the following:</p><p><code>&lt;PrivateRoute exact path="/dashboard" component={Dashboard} /&gt;</code></p><p>We are done. Now if you try to access the <strong>dashboard</strong> as a guest user, you'll be redirected to the <strong>login</strong> component.</p><h2 id="conclusion">Conclusion:</h2><p>That's it. We successfully created all the basic features required for user authentication. You can upgrade it in any way you want, or instead of using <strong>context</strong>, you can go for <strong>Redux. 
</strong>Firebase has got so many cool features like <strong>Firestore</strong>, a real-time database, and much more. It really comes in handy while designing big, heavy projects. We'll be discussing <strong>Firestore</strong> more in future articles. Till then, stay healthy and keep coding.</p><blockquote><p><strong>If you want the complete code files for your project, grab them from <a target="_blank" href="https://github.com/practice404/react-components/tree/main/firebase-react-auth">here</a>.</strong></p></blockquote>]]><![CDATA[<h2 id="overview">Overview:</h2><p>User authentication is a must if you are building a platform that stores users' private data, like a social media app. At the same time, it's kind of tricky to implement. In this article, we will be discussing how we can use Firebase to create fully functional and secure user authentication.</p><h2 id="agenda">Agenda:</h2><p>Following is the list of features we will build later on in this article.</p><ul><li><p>Sign Up</p></li><li><p>Log In</p></li><li><p>Dashboard</p></li><li><p>Log Out</p></li><li><p>Forget Password</p></li><li><p>Protected Routes</p></li></ul><h2 id="prerequisites">Prerequisites:</h2><ul><li><p>Familiarity with the React environment.</p></li><li><p>Basic knowledge of the <strong><em>Context API.</em></strong></p></li><li><p>Basic knowledge of routing in React.</p></li></ul><h2 id="lets-go-with-the-flow">Let's go with the flow</h2><p>So first, we need to create a React app. Navigate into the <code>Desktop</code> folder inside the terminal and type <code>npx create-react-app &lt;give any name&gt;</code>. 
Inside the <code>src</code> folder, just keep <code>index.js</code> and <code>App.js</code>; delete the rest, we don't need them.</p><h3 id="setting-up-firebase">Setting up Firebase:</h3><p>Okay, so now, visit <a target="_blank" href="https://firebase.google.com/">Firebase</a> and click on <strong><em>Go to console</em></strong>; there, click on <strong><em>Add Project</em></strong> and give it any name you want.</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1622039535195/-HUE6byJa.png" alt /></p><p>Click on <strong><em>continue</em></strong> and disable <strong><em>Google Analytics</em></strong>, then click on <strong><em>continue</em></strong> again. It'll take some time to process, and when it's done, our Firebase app is ready.</p><p>Now, we need to integrate it with our React web app. Click on the <strong><em>web icon</em></strong>. Then, it'll ask you to enter another name for the project for integration. Don't worry, it can be any name you want.</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1622039537714/yiyyfUFux.png" alt /></p><p>Click on <strong><em>Register app</em></strong>. Now, Firebase will give you some keys and other configuration settings so that you can connect your React app with Firebase services.</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1622039540377/78t5sc13f.png" alt /></p><blockquote><p>The configuration file will be different for you. Also, don't use the above keys. <strong><em>This demo app will be deleted after this article.</em></strong></p></blockquote><p>We will use them later. Now, let's set up authentication. Navigate to the home page of your Firebase app and click on <strong><em>Authentication</em></strong>. Then, click on <strong><em>Get Started</em></strong>. It'll show you some methods that you can use to create user authentication. In this article, we will be using the <strong>Email/Password</strong> method. So, click on it. 
Then hit <strong><em>enable</em></strong> only for the first option and click <strong><em>save</em></strong>.</p><p>Great! Now we have a method for authentication and a config file to connect the app. Let's go to our favourite code editor and start coding!</p><h2 id="danger-code-ahead">Danger! Code ahead</h2><p>First, we are going to create a <code>.env.local</code> file for storing our Firebase configuration details. Putting them in plain sight inside your code will make it easy for attackers to access your Firebase app. Inside our React app, create a <code>.env.local</code> file and store only the keys and values that are inside the variable <strong><em>firebaseConfig</em></strong>, as shown below:</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1622039543224/hjRwnes9E.png" alt /></p><blockquote><p>In React, names of <strong><em>environment variables</em></strong> must start with <strong><em>REACT_APP</em></strong></p></blockquote><p>Great! Now, it's time to connect it. For that, we are going to use the <strong>firebase</strong> module. Below is the list of all the modules we'll be using for this entire project.</p><ul><li><p><code>react-router-dom</code> for working with different routes.</p></li><li><p><code>bootstrap</code> for styling</p></li><li><p><code>react-bootstrap</code> for pre-built styled components</p></li><li><p><code>firebase</code> for working with Firebase</p></li></ul><p>So, go ahead and install them all at once using the command: <code>npm i react-router-dom bootstrap react-bootstrap firebase</code>.</p><p>Inside the React app, create a file <code>firebase.js</code> for making the connection with Firebase.</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1622039545881/pi7vVGxsU.png" alt /></p><p>If you've been observant, you'll notice this file is very similar to the file that Firebase gave us for creating a connection. Yeah, we just used the <code>firebase</code> module instead of an external JavaScript script. 
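</p><p>Since the connection file appears above only as a screenshot, here is a rough sketch of what <code>firebase.js</code> can look like, assuming the v8-style <code>firebase</code> API that the article's <code>app.auth()</code> call implies; the environment-variable names are illustrative and should match whatever you put in your <code>.env.local</code>:</p>

```javascript
// firebase.js — a sketch: wires the React app to Firebase using .env.local values
import firebase from "firebase/app";
import "firebase/auth";

const app = firebase.initializeApp({
  apiKey: process.env.REACT_APP_FIREBASE_API_KEY,
  authDomain: process.env.REACT_APP_FIREBASE_AUTH_DOMAIN,
  projectId: process.env.REACT_APP_FIREBASE_PROJECT_ID,
  // ...the remaining keys copied from your firebaseConfig
});

// auth is what the context file will use to call signup, login, logout, etc.
export const auth = app.auth();
export default app;
```

<p>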
We initialised the app with the environment variables stored in the <code>.env.local</code> file, and <code>app.auth()</code> is stored inside <code>auth</code>, which will be responsible for calling several methods like <strong>login, signup, logout, etc.</strong></p><p>Very well. Now, it's time to set up the <strong>Context API</strong> so that we can define our authentication methods in one file and access them in the relevant components. Inside the <code>src</code> folder, create another folder with the name <strong>context</strong> and inside it, create a file <code>authContext.js</code> as shown below.</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1622039548839/1hvTM-ory.png" alt /></p><p>Basic knowledge of the <strong>Context API</strong> is mentioned in the prerequisites, but I'm explaining it anyway. Above, you can see the basic structure of the context JavaScript file. First, we create a context using <code>React.createContext()</code>. It gives us two things: a <strong>Provider and a Consumer</strong>. For now, we are only concerned with the <strong>Provider</strong>, which enables us to pass values into it and use them in any component.</p><p>The <code>AuthProvider</code> component returns the <code>&lt;AuthContext.Provider&gt;</code> component with a <code>value</code> prop that contains the values we want to pass; <code>{children}</code> refers to the root component which will be wrapped by the <strong>Provider</strong>. In the end, we created a custom hook <code>useAuth()</code> which directly gives you all of the passed values.</p><p>Now, let's start creating our authentication methods inside <strong>authContext.js</strong> and pass them to the <strong>Provider. 
</strong>Replace the comment in the above code with the following lines.</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1622039551115/_n5-pSrdr.png" alt /></p><p>You can see that first we create two states: <code>currentUser</code>, which holds info about the logged-in user, and <code>loading</code>, so that <code>{children}</code> is displayed only after it is set to false. So, instead of simply rendering <code>{children}</code> inside <code>&lt;AuthContext.Provider&gt;</code>, change it to <code>{!loading && children}</code>.</p><p>As I said earlier, all the authentication methods live inside <code>auth</code> from <strong>firebase.js</strong>. So, we used <code>auth</code> to call the different methods and wrapped them inside respective functions so that we can call them when needed. <code>useEffect()</code> contains an event handler that continuously listens to the authentication state, like when the user logs in and when they sign out. Based on that, it sets the <code>currentUser</code> state to the logged-in user or <code>undefined</code>.</p><p><code>auth.onAuthStateChanged()</code> is an event handler. Every time the component loads, <code>useEffect()</code> would register that handler again, which may cause a memory leak and make your app slow. To deal with this, <code>auth.onAuthStateChanged()</code> returns an <code>unsubscribe</code> method that unsubscribes you from the event handler as the component unloads.</p><p>After that, we just pass all the methods and states to the <code>value</code> prop inside the <strong>Provider.</strong></p><p>Now, we need to wrap our root component with the <code>AuthProvider</code> component. In this app, we'll be wrapping our <code>App</code> component. 
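</p><p>Putting the pieces from the last few paragraphs together, <code>authContext.js</code> takes roughly this shape. This is a condensed sketch, and the screenshots above are the source of truth; the function bodies here just forward to the v8 Firebase auth methods:</p>

```javascript
// authContext.js — sketch of the auth context described above
import React, { useContext, useState, useEffect } from "react";
import { auth } from "../firebase";

const AuthContext = React.createContext();

// Custom hook that hands any component all of the passed values
export function useAuth() {
  return useContext(AuthContext);
}

export function AuthProvider({ children }) {
  const [currentUser, setCurrentUser] = useState();
  const [loading, setLoading] = useState(true);

  // Each function simply wraps the corresponding firebase auth method
  const signup = (email, password) => auth.createUserWithEmailAndPassword(email, password);
  const login = (email, password) => auth.signInWithEmailAndPassword(email, password);
  const logout = () => auth.signOut();
  const resetPassword = (email) => auth.sendPasswordResetEmail(email);

  useEffect(() => {
    // Keep currentUser in sync with Firebase; unsubscribe on unmount
    const unsubscribe = auth.onAuthStateChanged(user => {
      setCurrentUser(user);
      setLoading(false);
    });
    return unsubscribe;
  }, []);

  const value = { currentUser, signup, login, logout, resetPassword };

  return (
    <AuthContext.Provider value={value}>
      {!loading && children}
    </AuthContext.Provider>
  );
}
```

<p>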
So, open <strong>index.js</strong> and do the following:</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1622039553368/2HT7GCkPK.png" alt /></p><p>Now our <code>&lt;App /&gt;</code> will be <code>{children}</code> for <code>&lt;AuthProvider&gt;</code>. You can see I also imported a Bootstrap CSS file for giving style to each of our components. But you can create your own custom style sheet if you want; in this article, we are just focusing on functionality.</p><p>Till now, we have been dealing with the functionality setup. So, it's time to create components for each method. Create a folder inside the <code>src</code> folder with the name <code>components</code>.</p><h2 id="sign-up-component">Sign-up component:</h2><p>First, we are going to deal with the sign-up component. So, inside the <code>components</code> folder, create a file with the name <strong>signup.js</strong>.</p><p>We will create a form with three fields: <strong>email, password, confirm-password</strong>, and check <strong>if confirm-password matches the password</strong>. Only then are we going to call the signup method from <strong>authContext.js</strong>. If the signup succeeds, we redirect our user to the <strong><em>dashboard</em></strong> component (going to create this later).</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1622039555712/YWuSdq_gy.png" alt /></p><p>As I mentioned earlier in the article, our custom hook <code>useAuth()</code> will be used to receive the values passed to the context, and in this component, we are using the <strong>signup</strong> method created inside <strong>authContext.js</strong>. <code>Link</code> is used to take the user to the <strong>login</strong> page if they already have an account, and the <code>useHistory</code> hook is used for redirecting the user after successfully registering.</p><p><code>emailRef</code>, <code>passwordRef</code> and <code>passwordConfirmRef</code> are used as references for the respective input fields, and later, we destructure the <strong>signup</strong> method from the <code>useAuth</code> hook.</p><p>Now, take a look at the function <code>handleSubmit</code>. It's an <code>async</code> function because authentication methods from Firebase return a <strong>promise</strong>, so we are using <code>async / await</code> to handle it. Here, first we check if the password and confirmPassword are the same. Then, inside the <code>try / catch</code> block, we call the <strong>signup</strong> function, passing the <strong>email</strong> and <strong>password</strong> entered by the user.</p><p>That is our functionality for <strong>signup</strong>. So now, inside the return, let's create the form and other UI. We are using <strong>Bootstrap cards</strong> and a <strong>form</strong> for styling purposes.</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1622039558266/SaccRLdou.png" alt /></p><p>Just go line by line. The code is very simple. All we are doing is using <strong>Bootstrap</strong> components and creating the form. In the end, we created a link to the <strong>login</strong> component in case the user already has an account. That's it. Our <strong>signup</strong> component is ready.</p><h2 id="login-component">Login Component:</h2><p>Create a file inside the <strong>components</strong> folder and name it <strong>login.js</strong>.</p><p>This component will be pretty much the same as the <strong>signup</strong> component. The only difference is we are calling the <strong>login</strong> function instead of <strong>signup</strong>, and we don't need the confirm-password field here. 
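</p><p>The heart of these submit handlers is plain JavaScript. Here is a framework-free sketch of the flow both components follow; the helper name <code>submitSignup</code> and the error strings are illustrative, and <code>signup</code> stands in for the context method:</p>

```javascript
// Sketch of the submit flow: validate first, then await the auth method.
// `signup` is passed in so the logic can be exercised without Firebase.
async function submitSignup(email, password, passwordConfirm, signup) {
  if (password !== passwordConfirm) {
    return { error: "Passwords do not match" };
  }
  try {
    await signup(email, password);
    return { error: null };
  } catch (err) {
    // Firebase rejections (weak password, email already in use, ...) land here
    return { error: "Failed to create an account" };
  }
}
```

<p>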
The rest of the code in both components will be the same.</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1622039560901/Widt1Ii8A.png" alt /></p><p>What we are returning is also very similar to <strong>signup</strong>, except instead of creating the link to <strong>login</strong>, we ask if they are not registered yet and take them to the <strong>signup</strong> component.</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1622039563451/B7ntLknPB.png" alt /></p><p>The extra thing we are giving users is an option to reset their password, by creating a link to the <strong>forgetPassword</strong> component (going to create it later).</p><h2 id="forget-password-component">Forget Password Component:</h2><p>The way Firebase password reset works is that when the user requests it, Firebase sends an email to the registered email address with further instructions and a link to reset their password.</p><p>The cool thing is, again, the code will be pretty similar to the above component; here we are calling the <code>resetPassword</code> method. Go ahead and create a file with the name <strong>forgetpassword.js</strong> and take a look below.</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1622039566668/0_UV3X4Va.png" alt /></p><p>As you can see, we are destructuring <code>resetPassword</code> from the custom <code>useAuth()</code> hook and have an extra state for storing a message like <strong><em>check your inbox</em></strong> after successfully calling the <code>resetPassword</code> function.</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1622039568977/wFIzGmY-W.png" alt /></p><p>Nothing new. 
We are already familiar with this type of code, and that's it: our <strong>reset password</strong> component is ready.</p><h2 id="dashboard">Dashboard:</h2><p>For now, our dashboard just shows the email of the <code>currentUser</code> and also contains a <code>logout</code> button for logging out the user. You can add more functionality according to your own project.</p><p>Go ahead and create a file with the name <strong>dashboard.js</strong> inside the <strong>components</strong> folder.</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1622039571405/kFZaEoxQN.png" alt /></p><p>The thing to notice is that here we destructure <code>logout</code> and <code>currentUser</code> for handling signing out and showing the email of the logged-in user, respectively. Once the user successfully logs out, we redirect him/her to the <strong>login</strong> component using the <code>useHistory()</code> hook. The <strong>Dashboard</strong> component is ready.</p><h2 id="setting-up-routes-for-components">Setting up Routes for components:</h2><p>We are done with all of our components. Now let's set up each of their routes inside <strong>App.js</strong> using <code>react-router-dom</code>. Open <strong>App.js</strong> and do the following.</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1622039574001/9VH76dG61.png" alt /></p><p>We imported <code>BrowserRouter</code> and <code>Route</code> for setting up routes for each of the different components, and <code>Switch</code> for loading a single component at a time.</p><p>Now, if you start the app by running <code>npm start</code>, you'll see a blank screen because the home page is empty right now. To see the different components, go to their respective URLs. 
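</p><p>A quick aside on <code>Switch</code>: the idea of "loading a single component at a time" can be sketched without any framework. This is a loose illustration with made-up names, not the real <code>react-router-dom</code> implementation (which also handles partial matches):</p>

```javascript
// Loose sketch of the idea behind Switch: among all configured routes,
// only the first one whose path matches the current URL is rendered.
// (Illustrative only: the real Switch also handles partial matches.)
function pickRoute(routes, currentPath) {
  for (const route of routes) {
    if (route.path === currentPath) {
      return route.component; // first match wins, the rest are ignored
    }
  }
  return null; // nothing matched, nothing is rendered
}
```

<p>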
For example, <a target="_blank" href="http://localhost:3000/signup">http://localhost:3000/signup</a> will take you to:</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1622039576309/jeT6bec8C.png" alt /></p><p>As you enter your details and click on the <strong><em>signup</em></strong> button, you'll be redirected to the <strong>dashboard</strong> component.</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1622039578359/RMNq2pgEh.png" alt /></p><p>Now, one by one, you can check out the other components too. We have successfully implemented authentication, and you can see the list of registered users in your Firebase console under the authentication section. <strong><em>Something is still missing...</em></strong></p><h2 id="securing-sensitive-routes">Securing sensitive routes:</h2><p>Our app is working perfectly and authenticating users very well, but still, something is missing. Can you guess what?</p><p>Well, if you log out the user and try to access the <strong>dashboard</strong>: boom! You can still access it, and that's not good. We don't want any guest user to access our dashboard. Although it does not contain anything right now, that might not be true in your project's case. We still need to secure it, so that only registered users can access their dashboard.</p><p>The way we can do so is to create another component: basically a modified version of the <code>Route</code> component from <code>react-router-dom</code> that'll check whether someone is logged in or not. 
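</p><p>Stripped of the React specifics, that check boils down to a one-line decision. The sketch below is framework-free and uses illustrative names, not the real <code>react-router-dom</code> API:</p>

```javascript
// Framework-free sketch of the guard: render the protected component only
// when someone is logged in; otherwise describe a redirect to /login.
function privateRouteDecision(currentUser, renderComponent) {
  if (currentUser) {
    return renderComponent(); // logged in: render the protected page
  }
  return { redirectTo: "/login" }; // guest: send them to login instead
}
```

<p>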
If it's true, then it'll render the <strong>dashboard</strong>; otherwise, it'll just redirect to the <strong>login</strong> component.</p><p>Create a file with the name <strong>privateRoute.js</strong> inside the <strong>components</strong> folder and take a look at what it's going to contain.</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1622039580724/pPgA8vjZh.png" alt /></p><p>Let's understand what we are doing. As I said, <strong>PrivateRoute</strong> is a modified version of the <code>Route</code> component which checks for <code>currentUser</code> before setting up any route. Here, first we get hold of the <code>currentUser</code> from <code>useAuth()</code>.</p><p>Then, we return the <code>&lt;Route&gt;</code> component with all the props required for a normal <code>&lt;Route&gt;</code> component, like <code>path</code>, <code>exact</code>, etc. It then checks for <code>currentUser</code>. If true, it renders the <strong>dashboard</strong>; otherwise, it redirects to <strong>login</strong>. Here, instead of using <code>useHistory()</code> for redirecting, we'll be using the <code>&lt;Redirect&gt;</code> component from <code>react-router-dom</code>, because we need to render something if no <code>currentUser</code> is logged in.</p><p>Now, inside <strong>App.js</strong>, import this component and replace the line where you set up the route for the <strong>dashboard</strong> with the following:</p><p><code>&lt;PrivateRoute exact path="/dashboard" component={Dashboard} /&gt;</code></p><p>We are done. Now if you try to access the <strong>dashboard</strong> as a guest user, you'll be redirected to the <strong>login</strong> component.</p><h2 id="conclusion">Conclusion:</h2><p>That's it. We successfully created all the basic features required for user authentication. You can upgrade it in the way you want, or instead of using <strong>context</strong>, you can go for <strong>Redux. 
</strong>Firebase has got so many cool features, like <strong>Firestore</strong>, a real-time database, and much more. It really comes in handy while designing big, heavy projects. We'll be discussing more about <strong>Firestore</strong> in future articles. Till then, stay healthy and keep coding.</p><blockquote><p><strong>If you want the complete code files for your project, grab them from <a target="_blank" href="https://github.com/practice404/react-components/tree/main/firebase-react-auth">here</a>.</strong></p></blockquote>]]>https://cdn.hashnode.com/res/hashnode/image/upload/v1622039582776/2CPkDeP5X.jpeg<![CDATA[How to convert PDF file into an Audiobook]]>https://swayam-blog.hashnode.dev/how-to-convert-pdf-file-into-an-audiobookhttps://swayam-blog.hashnode.dev/how-to-convert-pdf-file-into-an-audiobookSat, 15 May 2021 12:50:17 GMT<![CDATA[<h2 id="agenda">Agenda:</h2><p>Hey, welcome back! Today we are going to automate our reading task using Python. We are going to build a GUI program to select PDF files and then play them inside our software. Exactly: no more eye reading. It'll read for you, and all you need to do is sit back and enjoy.</p><h2 id="prerequisites">Prerequisites:</h2><ul><li>A happy relationship with basic Python and <code>tkinter</code>.</li></ul><div class="embed-wrapper"><div class="embed-loading"><div class="loadingRow"></div><div class="loadingRow"></div></div><a class="embed-card" href="https://media.giphy.com/media/3o7btNa0RUYa5E7iiQ/giphy.gif">https://media.giphy.com/media/3o7btNa0RUYa5E7iiQ/giphy.gif</a></div><p>Yeah, that's it. I'll be explaining the rest 😉.</p><h2 id="analysing">Analysing:</h2><p>Now that we know what we are going to do, let's break it down into smaller chunks and focus on each of them individually.</p><p>First of all, we are going to create a window and a dialog box to open the desired PDF file. 
We also create a text box for displaying the PDF's text content and a <code>play</code> button to start playing it as audio.</p><h2 id="modules-used">Modules used:</h2><ul><li><code>tkinter</code> (for dealing with the GUI)</li><li><code>gTTS</code> (for converting text into speech)</li><li><code>playsound</code> (for playing the audio file)</li><li><code>PyMuPDF</code> (for reading PDF files)</li></ul><p>Before moving ahead, I want to mention something: in most online tutorials you'll find people using <code>PyPDF2</code> for working with PDF files, but we are not using it because it does not always work. As of the date I'm writing this post, <code>PyPDF2</code> is not able to read text from PDFs generated by Google, for example from <strong><em>Google Docs</em></strong>.</p><p><code>gTTS</code> stands for <strong><em>Google Text To Speech</em></strong>; it's a Python library as well as a CLI tool for converting text into speech.</p><p><code>playsound</code> is also a Python library, for playing audio files like <code>.mp3</code> or <code>.wav</code>.</p><blockquote><p>We are using <code>playsound</code> just for playing the audio file created with <code>gTTS</code>. You can use any Python library for that, like <code>pydub</code>, or use the <code>os</code> module to play it in the native audio player from the terminal, but I guess this only works on <strong><em>Mac OS X</em></strong> and <strong><em>Linux</em></strong>.</p></blockquote><h2 id="lets-dive-into-the-code-now">Let's dive into the code now </h2><div class="embed-wrapper"><div class="embed-loading"><div class="loadingRow"></div><div class="loadingRow"></div></div><a class="embed-card" href="https://media.giphy.com/media/3oAt1TznOzEcx3MssU/giphy.gif">https://media.giphy.com/media/3oAt1TznOzEcx3MssU/giphy.gif</a></div><h3 id="step-1">Step 1 :</h3><p>In this step we'll be creating our GUI, so open up your favourite code editor and create a file named 
<code>main.py</code> and import <code>tkinter</code>.</p><blockquote><p><code>tkinter</code> comes preinstalled with Python, so there is no need to install it from <code>pip</code>.</p></blockquote><pre><code class="lang-python">from tkinter import *
from tkinter import filedialog

# creating main window instance from Tk class defined in tkinter
window = Tk()
window.title("convert pdf to audiobook")
window.geometry("500x500")  # setting default size of the window

# creating text box for displaying pdf content
text_box = Text(window, height=30, width=60)
text_box.pack(pady=10)

# creating menu instance from Menu class
menu = Menu(window)
window.config(menu=menu)

# adding `File` tab into menu defined above
file_menu = Menu(menu, tearoff=False)
menu.add_cascade(label="File", menu=file_menu)

# adding drop-downs to `file_menu`
file_menu.add_command(label="Open")
file_menu.add_command(label="clear")
file_menu.add_separator()
file_menu.add_command(label="Exit")

# adding play button for playing audio
play_btn = Button(text="Play")
play_btn.pack(pady=20)

# keeping the window open till we close it manually
window.mainloop()</code></pre><p>Now if you run it, you'll see something like this:</p><p><img 
src="https://cdn.hashnode.com/res/hashnode/image/upload/v1621007157100/sHJ13IDjO.png" alt="Screenshot 2021-05-14 at 9.14.35 PM.png" /></p><h3 id="step-2">Step 2 :</h3><p>In this step we will create the function <code>open_pdf</code>. This function will open a dialogue box for selecting a PDF file, read all of its text and show it inside the <strong><em>text box</em></strong> created earlier, and then use <code>gTTS</code> to create an audio file of all the text from the <strong><em>text box</em></strong>.</p><pre><code class="lang-python">import fitz  # fitz is actually PyMuPDF
from gtts import gTTS

def open_pdf():
    # creating dialogue box
    open_file = filedialog.askopenfilename(
        initialdir="/Users/swayam/Downloads/",
        title="Open PDF file",
        filetypes=(
            ("PDF Files", "*.pdf"),
            ("All Files", "*.*")
        )
    )
    if open_file:
        # reading pdf file and creating instance of Document class from fitz
        doc = fitz.Document(open_file)
        # getting total number of pages
        total_pages = doc.page_count
        # looping through all the pages, collecting text from each page
        # and showing it on the text box
        for n in range(total_pages):
            page = doc.load_page(n)
            page_content = page.get_textpage()
            content = page_content.extractText()
            text_box.insert(END, content)
        # after the whole pdf content is stored, retrieving it from the
        # text box and storing it inside a variable
        text = text_box.get(1.0, END)
        # using gTTS to convert that text into audio and saving it as audio.mp3
        tts = gTTS(text, lang='en')
        tts.save("audio.mp3")</code></pre><blockquote><p>You need to install <code>gTTS</code> and <code>PyMuPDF</code>, so inside your terminal run <code>pip install PyMuPDF</code> and <code>pip install gTTS</code>.</p></blockquote><p>The code above is mostly self-explanatory, but I still want to highlight some points. First, look at the line that says <code>text_box.insert(END, content)</code>. <code>END</code> is defined inside <code>tkinter</code> and refers to the last index, i.e. the end of the text; similarly, <code>1.0</code> means the beginning of the text.</p><p>So when we store the first page's data inside the text box, the starting index, the last index and <code>END</code> all coincide; after that, we keep inserting text at the end of the previously stored text.</p><h3 id="step-3">Step 3 :</h3><p>Now that we have the function, it's time to give each widget its own functionality, so that pressing a button or clicking a menu item really performs something.</p><p>Go to the code and add the <code>command</code> attribute to all the <code>file_menu</code> drop-downs and to <code>play_btn</code>, as shown below:</p><pre><code class="lang-python">from playsound import playsound

file_menu.add_command(label="Open", command=open_pdf)
file_menu.add_command(label="clear", command=lambda: text_box.delete(1.0, END))
file_menu.add_command(label="Exit", command=window.quit)
play_btn = Button(text="Play", command=lambda: playsound("audio.mp3"))</code></pre><blockquote><p><code>playsound</code> requires <code>pyobjc</code> as a dependency, so you need to install it with <code>pip install pyobjc</code>.</p></blockquote><p>The function provided in <code>command</code> executes when you click on the widget. For short functions like <code>clear</code> or <code>exit</code> we used <code>lambda</code> functions.</p><p><code>window.quit</code> will close the window, and <code>clear</code> is self-explanatory. Once <strong><em>audio.mp3</em></strong> is saved, <code>playsound("audio.mp3")</code> will play it after you click the button.</p><p>So if you followed along, in the end your final code will look somewhat like this: </p><pre><code class="lang-python">from tkinter import *
from tkinter import filedialog
import fitz
from gtts import gTTS
from playsound import playsound

window = Tk()
window.title("convert pdf to audiobook")
window.geometry("500x500")

def open_pdf():
    open_file = filedialog.askopenfilename(
        initialdir="/Users/swayam/Downloads/",
        title="Open PDF file",
        filetypes=(
            ("PDF Files", "*.pdf"),
            ("All Files", "*.*")
        )
    )
    if open_file:
        doc = fitz.Document(open_file)
        total_pages = doc.page_count
        for n in range(total_pages):
            page = doc.load_page(n)
            page_content = page.get_textpage()
            content = page_content.extractText()
            text_box.insert(END, content)
        text = text_box.get(1.0, END)
        tts = gTTS(text, lang='en')
        tts.save("audio.mp3")

text_box = Text(window, height=30, width=60)
text_box.pack(pady=10)

menu = Menu(window)
window.config(menu=menu)

file_menu = Menu(menu, tearoff=False)
menu.add_cascade(label="File", menu=file_menu)
file_menu.add_command(label="Open", command=open_pdf)
file_menu.add_command(label="clear", command=lambda: text_box.delete(1.0, END))
file_menu.add_separator()
file_menu.add_command(label="Exit", command=window.quit)

play_btn = Button(text="Play", command=lambda: playsound("audio.mp3"))
play_btn.pack(pady=20)

window.mainloop()</code></pre><h2 id="lets-test-it">Let's test it</h2><p>Now it's time to run our code and check whether everything works: take a sample PDF file with some text and open it.</p><div class="embed-wrapper"><div class="embed-loading"><div class="loadingRow"></div><div class="loadingRow"></div></div><a class="embed-card" href="https://youtu.be/nfDdssirelM">https://youtu.be/nfDdssirelM</a></div><p><strong><em>YAYYYYYY...... 
🎉 🥳, WE DID IT GUYS</em></strong></p><p>we just created our own pdf to audio book convertor, now if you want to go some steps further I will recommend you to read <a target="_blank" href="https://gtts.readthedocs.io/en/latest/">gTTS</a> official documentation also if someone wants then you can convert this python script into <code>exe</code> file and share it with your friends so they can have fun too </p><div class="embed-wrapper"><div class="embed-loading"><div class="loadingRow"></div><div class="loadingRow"></div></div><a class="embed-card" href="https://media.giphy.com/media/ENagATV1Gr9eg/giphy.gif">https://media.giphy.com/media/ENagATV1Gr9eg/giphy.gif</a></div><blockquote><p>let's keep <code>converting python scripts to .exe files</code> for next tutorial 😅.</p></blockquote><h2 id="whats-next">What's next !</h2><p>If you are still reading, make sure to follow me on <a target="_blank" href="https://twitter.com/_s_w_a_y_a_m_">Twitter</a> as I share some cool projects and updates there and yeah don't forget I have some exciting stuff coming up every weekend. See Y'all next time and stay safe ^^ 🌻</p>]]><![CDATA[<h2 id="agenda">Agenda:</h2><p>Hey welcome back, so today we are going to do something to automate our reading task using Python. 
We are going to build a GUI program to select pdf files and then play them inside our software, exactly no more eye reading it'll read it for you and all you need to do is to sit back and enjoy.</p><h2 id="prerequisites">Prerequisites:</h2><ul><li>Happy relationship with basic Python and <code>tkinter</code>.</li></ul><div class="embed-wrapper"><div class="embed-loading"><div class="loadingRow"></div><div class="loadingRow"></div></div><a class="embed-card" href="https://media.giphy.com/media/3o7btNa0RUYa5E7iiQ/giphy.gif">https://media.giphy.com/media/3o7btNa0RUYa5E7iiQ/giphy.gif</a></div><p>yeah that's it, I'll be explaining the rest 😉.</p><h2 id="analysing">Analysing:</h2><p>So now we know what we are going to do so let's break it down into smaller chunks and focus on each of them individually.</p><p>First of all we are going to create a window and a dialog box to open the desired pdf file. We also create a text box for displaying the pdf text content and a button <code>play</code> for start playing it as audio.</p><h2 id="modules-used">Modules used:</h2><ul><li><code>tkinter</code> (for dealing with GUI)</li><li><code>gTTS</code> (for converting text into speech)</li><li><code>playsound</code> (for playing the audio file)</li><li><code>PyMuPDF</code> (for reading pdf files)</li></ul><p>before moving ahead I want to tell you something that in most of the online tutorials you'll find people using <code>PyPDF2</code> for working with pdf files but the reason we are not using it is because it does not always work, like till the date I'm writing this post if you use <code>PyPDF2</code> for reading pdf generated by google like from <strong><em>google docs</em></strong>, it's not able to read text from it.</p><p><code>gTTS</code> stands for <strong><em>Google Text To Speech</em></strong> it's a Python library as well as a CLI tool for converting text into speech.</p><p><code>playsound</code> is also a Python library for playing audio files like <code>.mp3</code> or 
<code>.wav</code>.</p><blockquote><p>we are using <code>playsound</code> just for playing the audio file that will be created using <code>gTTS</code>, you can use any Python library for that like <code>pydub</code> or use <code>os</code> module to play on native audio player installed on terminal, but I guess this only works on <strong><em>Mac OS X</em></strong> and <strong><em>linux</em></strong></p></blockquote><h2 id="lets-dive-into-the-code-now">Let's dive into the code now </h2><div class="embed-wrapper"><div class="embed-loading"><div class="loadingRow"></div><div class="loadingRow"></div></div><a class="embed-card" href="https://media.giphy.com/media/3oAt1TznOzEcx3MssU/giphy.gif">https://media.giphy.com/media/3oAt1TznOzEcx3MssU/giphy.gif</a></div><h3 id="step-1">Step 1 :</h3><p>In this step we'll be creating our GUI so open up your favourite code editor and create a file as <code>main.py</code> and import <code>tkinter</code>.</p><blockquote><p><code>tkinter</code> comes preinstalled with Python so no need to install it from <code>pip</code>.</p></blockquote><pre><code class="lang-python"><span class="hljs-keyword">from</span> tkinter <span class="hljs-keyword">import</span> *<span class="hljs-keyword">from</span> tkinter <span class="hljs-keyword">import</span> filedialog<span class="hljs-comment"># creating main window instance from Tk class defined in tkinter </span>window = Tk()window.title(<span class="hljs-string">"convert pdf to audiobook"</span>)window.geometry(<span class="hljs-string">"500x500"</span>) <span class="hljs-comment"># setting default size of the window </span><span class="hljs-comment"># creating text box for displaying pdf content</span>text_box = Text(window, height=<span class="hljs-number">30</span>, width=<span class="hljs-number">60</span>)text_box.pack(pady=<span class="hljs-number">10</span>)<span class="hljs-comment"># creating menu instance from Menu class</span>menu = Menu(window)window.config(menu=menu)<span 
class="hljs-comment"># adding `File` tab into menu defined above</span>file_menu = Menu(menu, tearoff=<span class="hljs-literal">False</span>)menu.add_cascade(label=<span class="hljs-string">"File"</span>, menu=file_menu)<span class="hljs-comment"># adding drop-downs to `file_menu`</span>file_menu.add_command(label=<span class="hljs-string">"Open"</span>)file_menu.add_command(label=<span class="hljs-string">"clear"</span>)file_menu.add_separator()file_menu.add_command(label=<span class="hljs-string">"Exit"</span>)<span class="hljs-comment"># adding play button for playing audio</span>play_btn = Button(text=<span class="hljs-string">"Play"</span>)play_btn.pack(pady=<span class="hljs-number">20</span>)<span class="hljs-comment"># for keeping window open till we don't close it manually</span>window.mainloop()</code></pre><p>Now if you run it, you'll see something like this,</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1621007157100/sHJ13IDjO.png" alt="Screenshot 2021-05-14 at 9.14.35 PM.png" /></p><h3 id="step-2">Step 2 :</h3><p>In this step we will create function <code>open_pdf</code> this function will create a dialogue box for selecting pdf file and then reading all of it text and showing inside the <strong><em>text box</em></strong> created earlier, then it'll use <code>gTTS</code> for creating audio file of all the text from <strong><em>text box</em></strong>.</p><pre><code class="lang-python"><span class="hljs-keyword">import</span> fitz <span class="hljs-comment"># fitz is actually PyMuPDF</span><span class="hljs-keyword">from</span> gtts <span class="hljs-keyword">import</span> gTTS<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">open_pdf</span>():</span> <span class="hljs-comment"># creating dialogue box</span> open_file = filedialog.askopenfilename( initialdir=<span class="hljs-string">"/Users/swayam/Downloads/"</span>, title=<span class="hljs-string">"Open PDF file"</span>, filetypes=( (<span 
class="hljs-string">"PDF Files"</span>, <span class="hljs-string">"*.pdf"</span>), (<span class="hljs-string">"All Files"</span>, <span class="hljs-string">"*.*"</span>) ) ) <span class="hljs-keyword">if</span> open_file: <span class="hljs-comment">#reading pdf file and creating instance of Document class from fitz</span> doc = fitz.Document(open_file) <span class="hljs-comment"># getting total number of pages</span> total_pages = doc.page_count <span class="hljs-comment"># looping through all the pages, collecting text from each page and showing it on text box</span> <span class="hljs-keyword">for</span> n <span class="hljs-keyword">in</span> range(total_pages): page = doc.load_page(n) page_content = page.get_textpage() content = page_content.extractText() text_box.insert(END, content) <span class="hljs-comment"># after whole pdf content is stored then retrieving it from textbox and storing it inside variable </span> text = text_box.get(<span class="hljs-number">1.0</span>, END) <span class="hljs-comment"># using gTTS to convert that text into audio and storing it inside file named as audio.mp3</span> tts = gTTS(text, lang=<span class="hljs-string">'en'</span>) tts.save(<span class="hljs-string">"audio.mp3"</span>)</code></pre><blockquote><p>You need to install <code>gTTS</code> and <code>PyMuPDF</code>, so inside your terminal run <code>pip install PyMuPDF</code> and <code>pip install gTTS</code> for installing them. </p></blockquote><p>As you can see above code is self explanatory but still I want to highlight some points. 
First look at the line that says <code>text_box.insert(END, content)</code> basically <code>END</code> is defined inside <code>tkinter</code> and it returns the last index that means where is the end of file, similarly <code>1.0</code> means the beginning of the text.</p><p>So basically when we store the first page data inside text box then <code>starting index == last index == END</code> after that we'll keep inserting text at the end of the previous stored text.</p><h3 id="step-3">Step 3 :</h3><p>Now we have the function so it's time to provide each widget it's own functionality like pressing button and clicking on menu really perform something.</p><p>Go to the code and add <code>command</code> attribute to all the <code>file_menu</code> drop-downs and <code>play_btn</code> as show below</p><pre><code class="lang-python"><span class="hljs-keyword">from</span> playsound <span class="hljs-keyword">import</span> playsound file_menu.add_command(label=<span class="hljs-string">"Open"</span>, command=open_pdf)file_menu.add_command(label=<span class="hljs-string">"clear"</span>, command=<span class="hljs-keyword">lambda</span>: text_box.delete(<span class="hljs-number">1.0</span>, END))file_menu.add_command(label=<span class="hljs-string">"Exit"</span>, command=window.quit)play_btn = Button(text=<span class="hljs-string">"Play"</span>, command=<span class="hljs-keyword">lambda</span>: playsound(<span class="hljs-string">"audio.mp3"</span>))</code></pre><blockquote><p><code>playsound</code> requires <code>pyobjc</code> as dependency for working so you need to install it by <code>pip install pyobjc</code></p></blockquote><p>Basically function provided in <code>command</code> will execute as you click on the widget. For short function like <code>clear</code> or <code>exit</code> we used <code>lambda</code> functions.</p><p><code>window.quit</code> will close the window and <code>clear</code> is self explanatory. 
As the <strong><em>audio.mp3</em></strong> gets saved <code>playsound("audio.mp3")</code> will play it after you click the button.</p><p>So if you followed well then in the end your final code will somewhat look like: </p><pre><code class="lang-python"><span class="hljs-keyword">from</span> tkinter <span class="hljs-keyword">import</span> *<span class="hljs-keyword">from</span> tkinter <span class="hljs-keyword">import</span> filedialog<span class="hljs-keyword">import</span> fitz<span class="hljs-keyword">from</span> gtts <span class="hljs-keyword">import</span> gTTS<span class="hljs-keyword">from</span> playsound <span class="hljs-keyword">import</span> playsound window = Tk()window.title(<span class="hljs-string">"convert pdf to audiobook"</span>)window.geometry(<span class="hljs-string">"500x500"</span>)<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">open_pdf</span>():</span> open_file = filedialog.askopenfilename( initialdir=<span class="hljs-string">"/Users/swayam/Downloads/"</span>, title=<span class="hljs-string">"Open PDF file"</span>, filetypes=( (<span class="hljs-string">"PDF Files"</span>, <span class="hljs-string">"*.pdf"</span>), (<span class="hljs-string">"All Files"</span>, <span class="hljs-string">"*.*"</span>) ) ) <span class="hljs-keyword">if</span> open_file: doc = fitz.Document(open_file) total_pages = doc.page_count <span class="hljs-keyword">for</span> n <span class="hljs-keyword">in</span> range(total_pages): page = doc.load_page(n) page_content = page.get_textpage() content = page_content.extractText() text_box.insert(END, content) text = text_box.get(<span class="hljs-number">1.0</span>, END) tts = gTTS(text, lang=<span class="hljs-string">'en'</span>) tts.save(<span class="hljs-string">"audio.mp3"</span>)text_box = Text(window, height=<span class="hljs-number">30</span>, width=<span class="hljs-number">60</span>)text_box.pack(pady=<span class="hljs-number">10</span>)menu = 
Menu(window)window.config(menu=menu)file_menu = Menu(menu, tearoff=<span class="hljs-literal">False</span>)menu.add_cascade(label=<span class="hljs-string">"File"</span>, menu=file_menu)file_menu.add_command(label=<span class="hljs-string">"Open"</span>, command=open_pdf)file_menu.add_command(label=<span class="hljs-string">"clear"</span>, command=<span class="hljs-keyword">lambda</span>: text_box.delete(<span class="hljs-number">1.0</span>, END))file_menu.add_separator()file_menu.add_command(label=<span class="hljs-string">"Exit"</span>, command=window.quit)play_btn = Button(text=<span class="hljs-string">"Play"</span>, command=<span class="hljs-keyword">lambda</span>: playsound(<span class="hljs-string">"audio.mp3"</span>))play_btn.pack(pady=<span class="hljs-number">20</span>)window.mainloop()</code></pre><h2 id="lets-test-it">Let's test it</h2><p>Now it's time to run our code and check if everything is working or not, take a sample pdf file with some text and open it.</p><div class="embed-wrapper"><div class="embed-loading"><div class="loadingRow"></div><div class="loadingRow"></div></div><a class="embed-card" href="https://youtu.be/nfDdssirelM">https://youtu.be/nfDdssirelM</a></div><p><strong><em>YAYYYYYY...... 
🎉 🥳, WE DID IT GUYS</em></strong></p><p>We just created our own PDF to audiobook converter. If you want to go a few steps further, I recommend reading the official <a target="_blank" href="https://gtts.readthedocs.io/en/latest/">gTTS</a> documentation, and if you want, you can convert this Python script into an <code>exe</code> file and share it with your friends so they can have fun too </p><div class="embed-wrapper"><div class="embed-loading"><div class="loadingRow"></div><div class="loadingRow"></div></div><a class="embed-card" href="https://media.giphy.com/media/ENagATV1Gr9eg/giphy.gif">https://media.giphy.com/media/ENagATV1Gr9eg/giphy.gif</a></div><blockquote><p>let's keep <code>converting python scripts to .exe files</code> for the next tutorial 😅.</p></blockquote><h2 id="whats-next">What's next !</h2><p>If you are still reading, make sure to follow me on <a target="_blank" href="https://twitter.com/_s_w_a_y_a_m_">Twitter</a> as I share some cool projects and updates there, and don't forget I have some exciting stuff coming up every weekend. See Y'all next time and stay safe ^^ 🌻</p>]]>https://cdn.hashnode.com/res/hashnode/image/upload/v1621015729150/O6kXJNLae.jpeg<![CDATA[How to integrate React frontend with Node backend]]>https://swayam-blog.hashnode.dev/how-to-integrate-react-frontend-with-node-backendhttps://swayam-blog.hashnode.dev/how-to-integrate-react-frontend-with-node-backendSun, 02 May 2021 03:59:12 GMT<![CDATA[<h2 id="overview">Overview :</h2><p>The React framework is great for creating awesome web apps and UIs, but sometimes we need extra functionality like integrating a database or performing authentication. Such things need to be done on the backend; you don't want anyone to see your secret keys or hashing logic on the frontend.</p><p>That's why in this article we are going to learn how to connect your React frontend with an Express backend. 
Sometimes this task gets really overwhelming and you might get stuck on a CORS issue; we will handle all of that in this one article.</p><p>Our goal is to create a server that hosts an API, then make a GET request from the React frontend and display the data on screen.</p><h2 id="prerequisites">Prerequisites :</h2><div class="embed-wrapper"><div class="embed-loading"><div class="loadingRow"></div><div class="loadingRow"></div></div><a class="embed-card" href="https://media.giphy.com/media/TGvHZanK0Y8poe2lnA/giphy.gif">https://media.giphy.com/media/TGvHZanK0Y8poe2lnA/giphy.gif</a></div><ul><li>Basic React knowledge and comfort with the environment.</li><li>Basic Express knowledge.</li></ul><h2 id="lets-get-going">Let's get going 👍</h2><p>First of all, create a folder and call it anything; we are naming it <code>app</code>. This folder will contain all our frontend + backend code.</p><p>We will start by creating the backend first.</p><h2 id="setting-up-server">Setting up Server</h2><p>Open the terminal and navigate to your <code>app</code> folder. Inside this folder we will create a <code>server.js</code> file. This file will contain the code responsible for running the server and hosting the API.</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1619868185288/Roiq4kTz0.png" alt="Screenshot 2021-05-01 at 8.26.58 AM.png" /></p><p>Now we are going to initialize <code>npm</code> in this folder to handle external packages and dependencies. In the terminal type <code>npm init -y</code>; it will initialize npm with default values.</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1619868607424/Kt0s6rWhe.png" alt="Screenshot 2021-05-01 at 8.28.16 AM.png" /></p><p>As a result you will see a file named <code>package.json</code> get created automatically.</p><p>From this step on we are going to handle the rest of the work in your favourite code editor. 
I am using VSCode in this article. Open the <code>app</code> folder in VSCode.</p><p>Now we have to install some packages, these are</p><ul><li><code>express</code></li><li><code>cors</code></li></ul><p>I hope you are familiar with <code>express</code>; it's a widely used module for building backends. Now what is the use of the <code>cors</code> library? To answer that, we first need to understand what CORS really is.</p><div class="embed-wrapper"><div class="embed-loading"><div class="loadingRow"></div><div class="loadingRow"></div></div><a class="embed-card" href="https://media.giphy.com/media/SvRuvlSEa67wNNHuHy/giphy.gif">https://media.giphy.com/media/SvRuvlSEa67wNNHuHy/giphy.gif</a></div><p>CORS is shorthand for Cross-Origin Resource Sharing. It is a mechanism to allow or restrict requested resources on a web server depending on where the HTTP request was initiated. Whenever we make a request to a server we send a header called <code>origin</code>. It contains information about where the request originated, and using this header a web server can restrict or allow resource sharing between client and server.</p><blockquote><p>By default, requests from any other origin will be restricted by the browser.</p></blockquote><p>If you want to read more about CORS, here's a link you can refer to: <a target="_blank" href="https://stackabuse.com/handling-cors-with-node-js/">More on CORS</a> </p><p>Now with the <code>cors</code> middleware we can allow CORS for all routes or only for specific routes. In this article we will allow it for all routes, but if you want to read more, refer to the <a target="_blank" href="https://www.npmjs.com/package/cors">cors documentation</a>.</p><p>To install the above packages, open the terminal in VSCode and type <code>npm install express cors</code>.</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1619869832446/x3vqaVeP4.png" alt="Screenshot 2021-05-01 at 8.34.42 AM.png" /></p><p>Now all that's left is to set up our 
backend server, <strong><em>coding time</em></strong> 🥳</p><p>Let's start by creating a file <code>data.js</code>. This file will contain the API data that we are going to host; we then <code>export</code> the data so that we can use it inside our <code>server.js</code>.</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1619870377567/p-Uc9Xomv.png" alt="Screenshot 2021-05-01 at 8.50.32 AM.png" /></p><p>Okay, now let's set up our server: open <code>server.js</code> and follow the image below.</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1619870409477/IALSsR5zp.png" alt="Screenshot 2021-05-01 at 8.50.07 AM.png" /></p><p>As you can see, the code is self-explanatory, but I still want to highlight one point: the <strong><em>port number on which our server is listening</em></strong></p><p>Take any free port number you want <em>except</em> <code>3000</code>. Why? Because <code>port 3000</code> is used by the React frontend, and if you use the same port for your backend the app may crash. Here you can see I used <code>port 5000</code>.</p><p>Now let's test whether our server is working. Open the terminal again and run <code>node server.js</code>; in the console you'll see it print <code>server is running on port 5000</code>.</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1619870764377/ZSKBoWnuSl.png" alt="Screenshot 2021-05-01 at 8.51.02 AM.png" /></p><p>Open your browser and go to the following URL: <code>http://localhost:5000/api</code></p><p>You can see your API data there in JSON format. 
For better visualization you can use an extension like <strong><em>JSON viewer pro</em></strong>; here's the link to <a target="_blank" href="https://chrome.google.com/webstore/detail/json-viewer-pro/eifflpmocdbdmepbjaopkkhbfmdgijcc">download</a> it.</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1619870961195/iSlHycNd7.png" alt="Screenshot 2021-05-01 at 8.53.35 AM.png" /></p><p>YAYYYY 🎉... Our server is up and running.</p><p>Time to move to the frontend and start building it.</p><h2 id="setting-up-react-frontend">Setting up React frontend</h2><p>First of all make a folder <code>client</code>; this will contain our frontend stuff.</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1619871668711/3ufxPI7br.png" alt="1.png" /></p><p>Now go inside the <code>client</code> folder and type the following in the terminal: <code>npx create-react-app my_app</code></p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1619872012141/nGJSw6BK_.png" alt="2.png" /></p><p>It'll take some time to process, and when it's done you'll see a folder named <code>my_app</code> created, see below</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1619872220355/9CIFW2sIm.png" alt="3.png" /></p><p>Inside VSCode you can see that <code>client/my_app</code> contains some pre-built files; they are the initial setup for the React frontend. You can modify them any way you want, but in this article we only modify <code>package.json</code> and <code>App.js</code> to our needs.</p><p>Now open <code>package.json</code> from the <code>client/my_app</code> folder in VSCode and add the following property below the <code>private: true</code> property. 
</p><p><code>proxy: "http://localhost:5000"</code></p><p>see below for reference</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1619872462547/rZa9r4Bah.png" alt="5.png" /></p><p>Adding this property makes React use <code>http://localhost:5000</code> as the default URL for making requests.</p><p>It's time to set up <code>App.js</code> to make a request to our server and render the data on the client's screen. Open <code>App.js</code> in VSCode, and in the <code>App</code> function delete everything inside the <code>div having class 'App'</code> and do the following.</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1619872833443/w4OlDZQhx.png" alt="6.png" /></p><p>As you can see the code is self-explanatory, but again I want to highlight one point: take a look at <code>fetch("/api")</code>. Notice we are not providing the complete endpoint like <code>http://localhost:5000/api</code>, and the reason is <strong><em>we don't need to</em></strong>: the <code>proxy</code> property we set earlier takes care of it. Now we can create as many routes as we want on the server and access them from React in the same manner.</p><p>Now open the terminal inside VSCode, start a new <code>zsh</code> or <code>bash</code> session, whichever you prefer, and make sure you are inside the <code>my_app</code> folder.</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1619873125531/sl8dVP0Th.png" alt="7.png" /></p><p>and when you are in, type the following in the terminal to start the React frontend server: <code>npm start</code></p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1619873533809/quItnVM8I.png" alt="9.png" /></p><blockquote><p>ignore all the warnings for now</p></blockquote><p>This command will compile your React code and start the dev server on <code>port 3000</code>. 
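</p><p>For reference, the <code>App.js</code> from the screenshots follows this pattern. This is a sketch reconstructed from the description above, not the author's exact code; the <code>id</code> and <code>name</code> fields on each item are assumptions about the shape of the API data:</p>

```jsx
import { useEffect, useState } from "react";

function App() {
  const [data, setData] = useState([]);

  useEffect(() => {
    // "/api" resolves against the proxy we set in package.json,
    // i.e. http://localhost:5000/api during development
    fetch("/api")
      .then((res) => res.json())
      .then((json) => setData(json));
  }, []);

  return (
    <div className="App">
      {data.map((item) => (
        <p key={item.id}>{item.name}</p>
      ))}
    </div>
  );
}

export default App;
```

<p>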
It will also automatically open your web browser and navigate to <code>http://localhost:3000</code>, where you'll see a big <strong>" Hello World " </strong> on screen.</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1619873614437/FtfxvMoNZ.png" alt="10.png" /></p><p>Open the <code>Developer tools</code> in your browser, and in the console you can see that our API data is logged there successfully.</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1619873706248/TxmWBHA7p.png" alt="11.png" /></p><p>Now we are sure that our frontend is working properly and the data is fetched without any problem, so it's time to render the data on screen. Open <code>App.js</code> in VSCode and replace the already written code with the highlighted part of the code.</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1619878646161/rsjtTAwfT.png" alt="Screenshot 2021-05-01 at 7.46.46 PM.png" /></p><p>As you can notice, we just replaced the <strong><em>console logging</em></strong> and <strong><em>Hello World</em></strong> with code that sets the <code>state</code> value to the <strong><em>data</em></strong> array and renders it on screen with some styling. </p><p>Now just save it and open your browser again to see the final result.</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1619878726691/3zjGdH6HF.png" alt="Screenshot 2021-05-01 at 7.48.38 PM.png" /></p><p>If you want to re-check, just modify the data inside <code>data.js</code> on the backend and save it; the modified result will also be displayed on your screen.</p><p><strong><em>YAYYYYYY...... 
🎉 🥳</em></strong> our backend and frontend are now perfectly connected; now you can use your backend for integrating a database or authentication without any worry of exposing private data on the frontend.</p><div class="embed-wrapper"><div class="embed-loading"><div class="loadingRow"></div><div class="loadingRow"></div></div><a class="embed-card" href="https://media.giphy.com/media/WueGYCBT6OPsjUMj1J/giphy.gif">https://media.giphy.com/media/WueGYCBT6OPsjUMj1J/giphy.gif</a></div><h2 id="whats-next">What's next !</h2><p><strong>If you are still reading, make sure to follow me on <a target="_blank" href="https://twitter.com/_s_w_a_y_a_m_">Twitter</a> and subscribe to my newsletter for more as I have some exciting stuff coming up every weekend. See Y'all next time and stay safe ^^ 🌻</strong> </p>]]>https://cdn.hashnode.com/res/hashnode/image/upload/v1619842841465/91aSM9LOM.jpeg<![CDATA[Automating birthday wishes using Python]]>https://swayam-blog.hashnode.dev/automating-birthday-wishes-using-pythonhttps://swayam-blog.hashnode.dev/automating-birthday-wishes-using-pythonSat, 24 Apr 2021 12:19:33 GMT<![CDATA[<h1 id="overview">Overview:</h1><p>Remembering dates is kinda hard, but we are programmers, and making hard things easier is our only job, so instead of remembering dates why not automate the task? In this article we are going to automate birthday wishes: our program will check if there's any birthday today and then mail your friend a beautiful wish 😁.</p><blockquote><p>Note: I still highly recommend you remember the dates, because friends get offended if they find out.</p></blockquote><h2 id="prerequisites">Prerequisites :</h2><ul><li>comfortable with Python</li><li>some basic knowledge about the <strong><em>pandas</em></strong> module</li></ul><h2 id="lets-get-going">Let's get going 🌱</h2><h3 id="1-setup">1. 
Setup:</h3><p>First of all, before we write any code, we need to create a <strong><em>csv</em></strong> file to store information about our dearest friends: their email address, name and birthdate. Name that file <code>birthdays.csv</code>.</p><blockquote><p>csv stands for "comma separated values"; it's a type of file in which you store data separated by commas, and the first row denotes the heading of each column. Just like in a spreadsheet, the first row holds the headings, and below each heading we write its values, separated by commas.</p></blockquote><p>Here's an example of our csv file.</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1619260399003/LjBHkUr2R.png" alt="Screenshot 2021-04-24 at 4.03.40 PM.png" /></p><p>Now we have a file which contains all the required data about our friends; it's time to create some really creative wishes. We are going to create <code>.txt</code> files which store the wishes for our friends.</p><p>Here's an example of what we are doing...<img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1619260639729/gbkGesmQO.png" alt="Screenshot 2021-04-24 at 4.07.41 PM.png" /></p><blockquote><p>The Python code will replace <code>[NAME]</code> with the birthday boy's/girl's name.</p><p>I recommend you create multiple files (at least 3) and write a different good wish in each.</p></blockquote><p>Rename your birthday letters as <code>letter_1.txt</code>, <code>letter_2.txt</code>, etc., and save them inside a folder named <code>letters</code>.</p><p> Now we have a <code>letters</code> folder that contains our nice wishes, and a csv file. It's time to write our Python code.</p><h3 id="2-real-hustle-begins">2. 
Real hustle begins</h3><div class="embed-wrapper"><div class="embed-loading"><div class="loadingRow"></div><div class="loadingRow"></div></div><a class="embed-card" href="https://media.giphy.com/media/l4FATJpd4LWgeruTK/giphy.gif">https://media.giphy.com/media/l4FATJpd4LWgeruTK/giphy.gif</a></div><p>Now open your favourite code editor, create a <code>main.py</code> file and start coding...</p><blockquote><p>as mentioned in prerequisites I'm assuming you worked with python in past and comfortable with the syntax.</p></blockquote><p>below is the list of modules that we are going to used in this project</p><ul><li><code>datetime</code> { for finding today's date and match it with the record }</li><li><code>pandas</code> { for managing and filtering data from our csv file }</li><li><code>random</code> { for selecting a random letter from letters }</li><li><code>smtplib</code> { for sending mails to friends }</li></ul><p>now let's import all the modules to our <code>main.py</code> file</p><pre><code>from datetime <span class="hljs-keyword">import</span> datetime # importing datetime class from datetime module<span class="hljs-keyword">import</span> pandas<span class="hljs-keyword">import</span> random<span class="hljs-keyword">import</span> smtplibmy_email = <span class="hljs-string">"email@gmail.com"</span>passw = <span class="hljs-string">"your_password"</span></code></pre><p>since our program will send mails to friends so it also needs mail address of sender.</p><blockquote><p>use your email address, you don't want your friend to receive birthday wishes by someone else's mail.</p></blockquote><p>So moving ahead now we need to get hold of today's date so that we can compare it with data stored in csv file.</p><p><code>today = datetime.now()</code></p><blockquote><p>here we are calling <code>now()</code> method from <code>datetime</code> class, it'll return today's date time and we are storing it in <code>today</code> variable.</p></blockquote><p>Now we are going to use 
<code>pandas</code> to read our csv file and convert it into a dataframe.</p><pre><code># reading the csv file and making its dataframe
data = pandas.read_csv("birthdays.csv")

# filtering the data to check if any record's birthdate matches today's date
bday = data[(data.month == today.month) & (data.day == today.day)]

# storing the names and emails of friends having a birthday today; stays empty otherwise
name = bday["name"].tolist()
email = bday["email"].tolist()

# making a list of all the friends having birthdays today
friends = []
for n in range(len(name)):
    friends.append({"name": name[n], "email": email[n]})
</code></pre><p>Now it's time to select a random letter from the letters we created to send wishes. First we check that our <code>friends</code> list is not empty, then we loop over each of its items and generate a letter for each friend.</p><pre><code># selecting a random integer as the letter number; I assume you have 3 letters
if not friends:
    print("no birthday")
else:
    for friend in friends:
        num = random.randint(1, 3)
        with open(f"letters/letter_{num}.txt") as letter:
            lines = letter.readlines()
            lines[0] = lines[0].replace("[NAME]", friend["name"])  # replacing [NAME] with the friend's name
            message = "".join(lines)
</code></pre><p>Now the only part remaining is to send a mail with the selected random wish to our friend. Here's how we gonna do this, <em>in that same loop</em>:</p><pre><code>        # connecting to Gmail's service (port 587 for STARTTLS)
        with smtplib.SMTP("smtp.gmail.com", 587) as connection:
            connection.starttls()
            # login with our email and password
            connection.login(user=my_email, password=passw)
            # sending the mail to the friend's email address
            connection.sendmail(from_addr=my_email,
                                to_addrs=friend["email"],
                                msg=f"Subject: HAPPY BIRTHDAY\n\n{message}")
        print(f"message sent to {friend['name']}")
</code></pre><p>And that's it! If you followed along, your code will look something like this in the end:</p><pre><code>import datetime as dt
import pandas
import random
import smtplib

my_email = "your_email@gmail.com"
passw = "your_password"

data = pandas.read_csv("birthdays.csv")
today = dt.datetime.now()
bday = data[(data.month == today.month) & (data.day == today.day)]
name = bday["name"].tolist()
email = bday["email"].tolist()

friends = []
for n in range(len(name)):
    friends.append({"name": name[n], "email": email[n]})

if not friends:
    print("no birthday")
else:
    for friend in friends:
        num = random.randint(1, 3)
        with open(f"letters/letter_{num}.txt") as letter:
            lines = letter.readlines()
            lines[0] = lines[0].replace("[NAME]", friend["name"])
            message = "".join(lines)
        with smtplib.SMTP("smtp.gmail.com", 587) as connection:
            connection.starttls()
            connection.login(user=my_email, password=passw)
            connection.sendmail(from_addr=my_email,
                                to_addrs=friend["email"],
                                msg=f"Subject: HAPPY BIRTHDAY\n\n{message}")
        print(f"message sent to {friend['name']}")
</code></pre><div class="embed-wrapper"><div class="embed-loading"><div class="loadingRow"></div><div class="loadingRow"></div></div><a 
class="embed-card" href="https://media.giphy.com/media/bg1MQ6IUVoVOM/giphy.gif">https://media.giphy.com/media/bg1MQ6IUVoVOM/giphy.gif</a></div><p>No, not yet! It's time to check whether it's working.</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1619579977542/XfR0d49tK.png" alt="Screenshot 2021-04-28 at 8.48.30 AM.png" /></p><p>Since there are no birthdays today in the csv, you can see the <code>no birthday</code> message in the console.</p><p>Now if I change the csv, set any birth date to today's date, save it, and run the program again:</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1619580063464/KKR7bOkmM.png" alt="Screenshot 2021-04-28 at 8.51.26 AM.png" /></p><p>Now it states <code>message sent to {whatever name}</code>. You can also check the mail to confirm it.</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1619265376411/oMVyP-SJn.png" alt="1KtTySoI1.png" /></p><div class="embed-wrapper"><div class="embed-loading"><div class="loadingRow"></div><div class="loadingRow"></div></div><a class="embed-card" href="https://media.giphy.com/media/LpQrsRA3zOuJNk7KYt/giphy.gif">https://media.giphy.com/media/LpQrsRA3zOuJNk7KYt/giphy.gif</a></div><p>That's all, we did it, YAYYYY 🥳</p><h3 id="whats-next">What's next!</h3><p><strong><em>If you are still reading, make sure to follow me and subscribe to my newsletter for more as I have some exciting stuff coming up every weekend. 
See ya next time!</em></strong> 🌻 </p><div class="embed-wrapper"><div class="embed-loading"><div class="loadingRow"></div><div class="loadingRow"></div></div><a class="embed-card" href="https://media.giphy.com/media/3oEdv5XT0tYl7B77yM/giphy.gif">https://media.giphy.com/media/3oEdv5XT0tYl7B77yM/giphy.gif</a></div>]]><![CDATA[<h1 id="overview">Overview:</h1><p>Remembering dates is kinda hard, but we are programmers, and making hard things easier is our job. So instead of remembering dates ourselves, why not automate the task? In this article we're going to automate birthday wishes: our program will check whether there's a birthday today and, if so, mail your friend a beautiful wish 😁.</p><blockquote><p>Note: I still highly recommend remembering the dates yourself, because friends get offended if they find out.</p></blockquote><h2 id="prerequisites">Prerequisites:</h2><ul><li>comfortable with Python</li><li>some basic knowledge about the <strong><em>pandas</em></strong> module</li></ul><h2 id="lets-get-going">Let's get going 🌱</h2><h3 id="1-setup">1. Setup:</h3><p>First of all, before we start writing code, we need to create a <strong><em>csv</em></strong> file to store information about our dearest friends: their email address, name and birthdate. 
Name that file <code>birthdays.csv</code>.</p><blockquote><p>csv stands for "comma separated values". It's a type of file in which you store data separated by commas, and the first row denotes the heading of each column. Just like in a spreadsheet, the first row denotes the headings, then below each heading we write its values, separated by commas.</p></blockquote><p>Here's an example of our csv file.</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1619260399003/LjBHkUr2R.png" alt="Screenshot 2021-04-24 at 4.03.40 PM.png" /></p><p>Now that we have a file which contains all the required data of our friends, it's time to create some really creative wishes. We are going to create <code>.txt</code> files which store the wishes for our friends.</p><p>Here's an example of what we are doing...<img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1619260639729/gbkGesmQO.png" alt="Screenshot 2021-04-24 at 4.07.41 PM.png" /></p><blockquote><p>Python code will replace <code>[NAME]</code> with the birthday boy's/girl's name.</p><p>I recommend you create multiple files (at least 3) and write a different good wish in each of them.</p></blockquote><p>Rename each of your birthday letters <code>letter_1.txt</code>, <code>letter_2.txt</code>, etc. and save the files inside a folder named <code>letters</code>.</p><p> Now we have a <code>letters</code> folder that contains our nice wishes, and a csv file. It's time to write our Python code.</p><h3 id="2-real-hustle-begins">2. 
Real hustle begins</h3><div class="embed-wrapper"><div class="embed-loading"><div class="loadingRow"></div><div class="loadingRow"></div></div><a class="embed-card" href="https://media.giphy.com/media/l4FATJpd4LWgeruTK/giphy.gif">https://media.giphy.com/media/l4FATJpd4LWgeruTK/giphy.gif</a></div><p>Now open your favourite code editor, create a <code>main.py</code> file and start coding...</p><blockquote><p>As mentioned in the prerequisites, I'm assuming you have worked with Python in the past and are comfortable with the syntax.</p></blockquote><p>Below is the list of modules that we are going to use in this project:</p><ul><li><code>datetime</code> { for finding today's date and matching it with the records }</li><li><code>pandas</code> { for managing and filtering data from our csv file }</li><li><code>random</code> { for selecting a random letter from our letters }</li><li><code>smtplib</code> { for sending mails to friends }</li></ul><p>Now let's import all the modules in our <code>main.py</code> file:</p><pre><code>from datetime import datetime  # importing the datetime class from the datetime module
import pandas
import random
import smtplib

my_email = "email@gmail.com"
passw = "your_password"
</code></pre><p>Since our program will send mails to friends, it also needs the sender's mail address.</p><blockquote><p>Use your own email address; you don't want your friend to receive birthday wishes from someone else's mail. Note that for Gmail you may need an <em>app password</em> instead of your regular account password.</p></blockquote><p>Moving ahead, we now need to get hold of today's date so that we can compare it with the data stored in the csv file.</p><p><code>today = datetime.now()</code></p><blockquote><p>Here we are calling the <code>now()</code> method of the <code>datetime</code> class; it returns today's date and time, which we store in the <code>today</code> variable.</p></blockquote><p>Now we are going to use 
<code>pandas</code> to read our csv file and convert it into a dataframe.</p><pre><code># reading the csv file and making its dataframe
data = pandas.read_csv("birthdays.csv")

# filtering the data to check if any record's birthdate matches today's date
bday = data[(data.month == today.month) & (data.day == today.day)]

# storing the names and emails of friends having a birthday today; stays empty otherwise
name = bday["name"].tolist()
email = bday["email"].tolist()

# making a list of all the friends having birthdays today
friends = []
for n in range(len(name)):
    friends.append({"name": name[n], "email": email[n]})
</code></pre><p>Now it's time to select a random letter from the letters we created to send wishes. First we check that our <code>friends</code> list is not empty, then we loop over each of its items and generate a letter for each friend.</p><pre><code># selecting a random integer as the letter number; I assume you have 3 letters
if not friends:
    print("no birthday")
else:
    for friend in friends:
        num = random.randint(1, 3)
        with open(f"letters/letter_{num}.txt") as letter:
            lines = letter.readlines()
            lines[0] = lines[0].replace("[NAME]", friend["name"])  # replacing [NAME] with the friend's name
            message = "".join(lines)
</code></pre><p>Now the only part remaining is to send a mail with the selected random wish to our friend. Here's how we gonna do this, <em>in that same loop</em>:</p><pre><code>        # connecting to Gmail's service (port 587 for STARTTLS)
        with smtplib.SMTP("smtp.gmail.com", 587) as connection:
            connection.starttls()
            # login with our email and password
            connection.login(user=my_email, password=passw)
            # sending the mail to the friend's email address
            connection.sendmail(from_addr=my_email,
                                to_addrs=friend["email"],
                                msg=f"Subject: HAPPY BIRTHDAY\n\n{message}")
        print(f"message sent to {friend['name']}")
</code></pre><p>And that's it! If you followed along, your code will look something like this in the end:</p><pre><code>import datetime as dt
import pandas
import random
import smtplib

my_email = "your_email@gmail.com"
passw = "your_password"

data = pandas.read_csv("birthdays.csv")
today = dt.datetime.now()
bday = data[(data.month == today.month) & (data.day == today.day)]
name = bday["name"].tolist()
email = bday["email"].tolist()

friends = []
for n in range(len(name)):
    friends.append({"name": name[n], "email": email[n]})

if not friends:
    print("no birthday")
else:
    for friend in friends:
        num = random.randint(1, 3)
        with open(f"letters/letter_{num}.txt") as letter:
            lines = letter.readlines()
            lines[0] = lines[0].replace("[NAME]", friend["name"])
            message = "".join(lines)
        with smtplib.SMTP("smtp.gmail.com", 587) as connection:
            connection.starttls()
            connection.login(user=my_email, password=passw)
            connection.sendmail(from_addr=my_email,
                                to_addrs=friend["email"],
                                msg=f"Subject: HAPPY BIRTHDAY\n\n{message}")
        print(f"message sent to {friend['name']}")
</code></pre><div class="embed-wrapper"><div class="embed-loading"><div class="loadingRow"></div><div class="loadingRow"></div></div><a 
class="embed-card" href="https://media.giphy.com/media/bg1MQ6IUVoVOM/giphy.gif">https://media.giphy.com/media/bg1MQ6IUVoVOM/giphy.gif</a></div><p>No, not yet! It's time to check whether it's working.</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1619579977542/XfR0d49tK.png" alt="Screenshot 2021-04-28 at 8.48.30 AM.png" /></p><p>Since there are no birthdays today in the csv, you can see the <code>no birthday</code> message in the console.</p><p>Now if I change the csv, set any birth date to today's date, save it, and run the program again:</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1619580063464/KKR7bOkmM.png" alt="Screenshot 2021-04-28 at 8.51.26 AM.png" /></p><p>Now it states <code>message sent to {whatever name}</code>. You can also check the mail to confirm it.</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1619265376411/oMVyP-SJn.png" alt="1KtTySoI1.png" /></p><div class="embed-wrapper"><div class="embed-loading"><div class="loadingRow"></div><div class="loadingRow"></div></div><a class="embed-card" href="https://media.giphy.com/media/LpQrsRA3zOuJNk7KYt/giphy.gif">https://media.giphy.com/media/LpQrsRA3zOuJNk7KYt/giphy.gif</a></div><p>That's all, we did it, YAYYYY 🥳</p><h3 id="whats-next">What's next!</h3><p><strong><em>If you are still reading, make sure to follow me and subscribe to my newsletter for more as I have some exciting stuff coming up every weekend. 
See ya next time!</em></strong> 🌻 </p><div class="embed-wrapper"><div class="embed-loading"><div class="loadingRow"></div><div class="loadingRow"></div></div><a class="embed-card" href="https://media.giphy.com/media/3oEdv5XT0tYl7B77yM/giphy.gif">https://media.giphy.com/media/3oEdv5XT0tYl7B77yM/giphy.gif</a></div>]]>https://cdn.hashnode.com/res/hashnode/image/upload/v1619259213930/EOU5cWU-3.jpeg<![CDATA[How to handle big integers in C ++]]>https://swayam-blog.hashnode.dev/how-to-handle-big-integers-in-chttps://swayam-blog.hashnode.dev/how-to-handle-big-integers-in-cSun, 18 Apr 2021 13:11:12 GMT<![CDATA[<h3 id="why-we-need-to-understand-this-topic"><strong> Why do we need to understand this topic?</strong></h3><p>If we take a look at the limits of C++'s integer data types, you'll find something like: </p><ul><li><em>int</em> : approx <em>10<sup>9</sup></em></li><li><em>long int</em> : approx <em>10<sup>12</sup></em></li><li><em>long long int</em> : approx <em>10<sup>18</sup></em></li></ul><p>That means we can store a maximum of roughly <em>10<sup>18</sup></em>, i.e. a number of up to 19 digits. What if we have to deal with numbers longer than 19 digits?</p><p>Using the standard data types for such problems will lead to <em>overflow</em>. That's why languages like Python, Java, Ruby, etc. 
have libraries like <strong>Big integers</strong>, or variables that can handle such large numbers out of the box.</p><p>In this article we'll learn how to store such large numbers in C++ and perform some fundamental operations on them.</p><h3 id="prerequisites"><strong> Prerequisites </strong></h3><ul><li>Basic STL (vectors) || you can also use <em>arrays</em></li><li>Functions</li><li>Loops</li></ul><h3 id="how-to-store-such-large-integers"><strong> How to store such large integers ?</strong></h3><p>As we saw, with the standard integer data types we can only store numbers of up to 19 digits, so in order to store more than 19 digits we will use a <em>vector</em> for storing the number and a <em>string</em> for taking input. As we know, items in vectors / arrays are stored in contiguous memory locations, and we can create a vector of up to 10<sup>7</sup> items; so if we store each digit of our number in its own memory location, we will be able to store 10<sup>7</sup> digits. OMG, that is much, much greater than just 19 digits.</p><blockquote><p>Remember, we are storing each digit of our number in its own memory location, not the entire number in one memory location.</p></blockquote><p>Below is a code sample to explain this</p><pre><code>#include<bits/stdc++.h>
using namespace std;

int main()
{
    string num = "123499999999999999999999999999999999"; /* taking input as a string */
    vector<int> number;
    for(int i = 0; i < num.size(); i++)
        number.push_back(num[i] - '0'); /* converting from character to int (ASCII conversion) */

    for(auto i : number) /* looping on each item in the vector */
        cout << i;
}
/* output: 123499999999999999999999999999999999 */
</code></pre><p>So now we know how to store such large numbers, but that's not the end of the story. As we saw, we need contiguous memory locations to store such large numbers in C++; that's why operating on them gets complicated compared to handling simple numbers of the <em>int</em> data type. For the operations we need to build our own logic for how to <em>add</em> or <em>subtract</em> such numbers.</p><h3 id="lets-dive-into-how-to-add-such-numbers"><strong>Let's dive into how to Add such numbers</strong></h3><div class="embed-wrapper"><div class="embed-loading"><div class="loadingRow"></div><div class="loadingRow"></div></div><a class="embed-card" href="https://media.giphy.com/media/S5yqNNTQlEZfqQ7InC/giphy.gif">https://media.giphy.com/media/S5yqNNTQlEZfqQ7InC/giphy.gif</a></div><blockquote><p>Before going ahead, I recommend you take a pen and a paper and add 12345 + 6789 step by step, like we used to do in our childhood.</p></blockquote><h4 id="algorithm">Algorithm:</h4><ul><li>take input and store both numbers into two different vectors / arrays.</li><li>reverse the vectors (because we add from right to left).</li><li>initiate a variable to store the carry.</li><li>store the sizes of the vectors as <em>min_len</em> and <em>max_len</em>.</li><li>first loop till <em>min_len</em>, add the respective digits and store them into another vector (answer).</li><li>loop from <em>min_len</em> to <em>max_len</em> and store the sum of the remaining digits + carry into the answer vector.</li><li>after addition 
of all the digits, if a carry still remains then push it into the answer vector.</li><li>again reverse the answer vector (because we previously reversed our two vectors).</li><li>return the answer vector.</li></ul><p><em>here's the code, with comments for a better understanding of the logic</em></p><pre><code>#include<bits/stdc++.h>
using namespace std;

vector<int> add(vector<int> x, vector<int> y) /* function to add two vectors */
{
    reverse(x.begin(), x.end()); /* reversing since we add from right to left */
    reverse(y.begin(), y.end());

    vector<int> ans; /* vector to store the answer */
    int min_len = min(x.size(), y.size()); /* calculating min_len and max_len from the sizes of both vectors */
    int max_len = max(x.size(), y.size());
    int carry = 0; /* initially the carry will be 0 */

    for(int i = 0; i < min_len; i++) /* looping till the minimum digits of both numbers and adding */
    {
        int digit_sum = x[i] + y[i] + carry;
        carry = digit_sum / 10; /* the tens digit of digit_sum becomes the carry */
        ans.push_back(digit_sum % 10); /* we store the ones digit as the answer */
    }

    /* handling the remaining digits */
    if(x.size() > y.size())
    {
        for(int i = min_len; i < x.size(); i++)
        {
            int digit_sum = x[i] + carry;
            carry = digit_sum / 10;
            ans.push_back(digit_sum % 10);
        }
    }
    if(x.size() < y.size())
    {
        for(int i = min_len; i < y.size(); i++)
        {
            int digit_sum = y[i] + carry;
            carry = digit_sum / 10;
            ans.push_back(digit_sum % 10);
        }
    }

    /* handling the remaining carry */
    while(carry)
    {
        ans.push_back(carry % 10);
        carry = carry / 10;
    }

    reverse(ans.begin(), ans.end()); /* reversing the answer vector to get the correct order */
    return ans;
}

int main()
{
    string a, b;
    cin >> a >> b;
    vector<int> a2, b2;
    for(int i = 0; i < a.size(); i++) /* storing characters as integers into the vectors */
        a2.push_back(a[i] - '0');
    for(int i = 0; i < b.size(); i++)
        b2.push_back(b[i] - '0');
    vector<int> result = add(a2, b2); /* passing both vectors to add */
    for(auto i : result) /* printing the output */
        cout << i;
}
</code></pre><p>And we are done.</p><div class="embed-wrapper"><div class="embed-loading"><div class="loadingRow"></div><div class="loadingRow"></div></div><a class="embed-card" href="https://media.giphy.com/media/11sBLVxNs7v6WA/giphy.gif">https://media.giphy.com/media/11sBLVxNs7v6WA/giphy.gif</a></div><p>Yeah, exactly. I also never thought addition could be so complex, but if we take a closer look, all we did is the fundamental addition we learned in childhood; we just never considered so many cases before.</p><p>So let's head forward and understand another important concept. I bet when you started learning C++ you must have made a program to calculate the factorial of a number. The flaw of that program is that it can only calculate factorials up to 20; on inputting 21 and onwards you'll face <em>overflow</em>.</p><p><em>Now we are going to remake that same program, but this time it'll be able to calculate factorials of much bigger numbers.</em></p><div class="embed-wrapper"><div class="embed-loading"><div class="loadingRow"></div><div class="loadingRow"></div></div><a class="embed-card" href="https://media.giphy.com/media/CjmvTCZf2U3p09Cn0h/giphy.gif">https://media.giphy.com/media/CjmvTCZf2U3p09Cn0h/giphy.gif</a></div><h3 id="computing-factorials-of-large-integers"><strong> Computing factorials of large integers</strong></h3><blockquote><p>The idea is the same as before: we take a number and store its factorial in a vector, but this time instead of addition we'll go for 
multiplication.</p><h4 id="algorithm">Algorithm:</h4><ul><li>input the number.</li><li>initiate the answer vector and put 1 in it (because multiplying any number by 1 leaves it unchanged).</li><li>loop from <em>i</em> = 2 to the number, and each time multiply the answer vector by <em>i</em>.</li><li>reverse the answer vector for the correct answer.</li><li>print the output.</li></ul><p><em>here's the code for the above problem</em></p><pre><code>#include<bits/stdc++.h>
using namespace std;

void factorial(vector<int> &v, int x) /* passing by reference so that changes also occur in the original vector */
{
    int carry = 0; /* initially the carry will be 0 */
    /* doing the fundamental way of multiplication */
    for(int i = 0; i < v.size(); i++)
    {
        int digit_product = v[i] * x + carry;
        carry = digit_product / 10;
        v[i] = digit_product % 10;
    }
    /* handling the carry if it remains */
    while(carry)
    {
        v.push_back(carry % 10); /* push_back() adds digits at the end of the vector, but in multiplication the carry goes in front, so we reverse the vector later to get the correct answer */
        carry /= 10;
    }
}

int main()
{
    int n;
    cin >> n;
    vector<int> ans; /* vector to store the answer */
    ans.push_back(1);
    for(int i = 2; i <= n; i++) /* looping from 2 to n */
    {
        factorial(ans, i); /* passing the vector and i to multiply the number in the vector by i */
    }
    reverse(ans.begin(), ans.end()); /* reversing the vector to get the correct answer */
    for(auto i : ans)
        cout << i;
}
</code></pre><p>Try this code on your own and grasp the concept behind it. I made it as clear as I could with proper comments, but you need to visualize the flow with pen and paper, doing the fundamental multiplication of the digits inside the vector by a number. </p><h2 id="whats-next">What's next!</h2><p>If you are ready now and feeling confident with this, I have some homework for you.</p><p><em>Try subtraction of two big integers on your own</em>.</p><p>And yeah, that was all for now! </p><p><strong>If you are still reading, make sure to follow me for more as I have some exciting stuff coming up every weekend. See ya next time! 
</strong></p>]]><![CDATA[<h3 id="why-we-need-to-understand-this-topic"><strong> why we need to understand this topic ?</strong></h3><p>If we take a look on the number limits of integer data type of C++, you'll find something like: </p><ul><li><em>int</em> : approx <em>10<sup>9</sup></em></li><li><em>long int</em> : approx <em>10<sup>12</sup></em></li><li><em>long long int</em> : approx <em>10<sup>18</sup></em></li></ul><p>that means we can only store a maximum of <em>10<sup>18</sup></em> integer i.e. only a number upto 19 digits. What if we have to deal with numbers greater than 19 digits ?</p><p>well using standard data types for such problems will lead to <em>overflow</em>. That's why languages like Python, Java, Ruby, etc. have libraries like <strong>Big integers</strong> or their variables are able to handle such large numbers.</p><p>In this article we'll learn how to store such large numbers in C++ and perform some fundamental operations on them.</p><h3 id="prerequisites"><strong> Prerequisites </strong></h3><ul><li>Basic STL (vectors) || you can also use <em>arrays</em></li><li>Functions</li><li>Loops</li></ul><h3 id="how-to-store-such-large-integers"><strong> How to store such large integers ?</strong></h3><p>As we saw in standard data types of integer we can only store number upto 19 digits so in order to store greater than 19 digits we will use <em>vectors</em> and <em>string</em> for storing and taking input respectively. As we know in vectors / arrays items are stored in contiguous memory locations and we can create a vector upto 10<sup>7</sup> items, so if we store each digit of our number into each of the memory locations then we will able to store 10<sup>7</sup> digits. 
That is much, much greater than just 19 digits.</p><blockquote><p>Remember, we are storing each digit of our number in its own memory location, not the entire number in one location.</p></blockquote><p>Below is a code sample to illustrate this:</p><pre><code><span class="hljs-meta">#<span class="hljs-meta-keyword">include</span><span class="hljs-meta-string"><bits/stdc++.h></span></span><span class="hljs-keyword">using</span> <span class="hljs-keyword">namespace</span> <span class="hljs-built_in">std</span>;<span class="hljs-function"><span class="hljs-keyword">int</span> <span class="hljs-title">main</span><span class="hljs-params">()</span></span>{ <span class="hljs-built_in">string</span> num = <span class="hljs-string">"123499999999999999999999999999999999"</span>; <span class="hljs-comment">/* taking input as a string */</span> <span class="hljs-built_in">vector</span> <<span class="hljs-keyword">int</span>> number; <span class="hljs-keyword">for</span>(<span class="hljs-keyword">int</span> i = <span class="hljs-number">0</span>; i < num.size(); i++) number.push_back(num[i] - <span class="hljs-string">'0'</span>); <span class="hljs-comment">/* converting from character to int (ASCII conversion) */</span> <span class="hljs-keyword">for</span>(<span class="hljs-keyword">auto</span> i : number) <span class="hljs-comment">/* looping on each item in vector */</span> <span class="hljs-built_in">cout</span><<i;}<span class="hljs-comment">/* output: 123499999999999999999999999999999999 */</span></code></pre><p>So now we know how to store such large numbers, but that's not the end of the story. Because the number is spread across contiguous memory locations, operating on it is more involved than handling a plain <em>int</em>. For each operation we need to build our own logic, for example how to <em>add</em> or <em>subtract</em> such numbers.</p><h3 id="lets-dive-into-how-to-add-such-numbers"><strong>Let's dive into how to Add
such numbers</strong></h3><div class="embed-wrapper"><div class="embed-loading"><div class="loadingRow"></div><div class="loadingRow"></div></div><a class="embed-card" href="https://media.giphy.com/media/S5yqNNTQlEZfqQ7InC/giphy.gif">https://media.giphy.com/media/S5yqNNTQlEZfqQ7InC/giphy.gif</a></div><blockquote><p>Before going ahead, I recommend you take a pen and paper and add 12345 + 6789 step by step, like we used to do in our childhood.</p></blockquote><h4 id="algorithm">Algorithm:</h4><ul><li>Take the input and store both numbers in two different vectors / arrays.</li><li>Reverse both vectors (because we add from right to left).</li><li>Initialize a variable to store the carry.</li><li>Store the sizes of the vectors as <em>min_len</em> and <em>max_len</em>.</li><li>First loop up to <em>min_len</em>, add the corresponding digits and store them in another vector (the answer).</li><li>Then loop from <em>min_len</em> to <em>max_len</em> and store the remaining digits plus the carry in the answer vector.</li><li>If a carry still remains after adding all the digits, store it in the answer vector too.</li><li>Reverse the answer vector again (because we reversed the two input vectors earlier).</li><li>Return the answer vector.</li></ul><p><em>Here's the code, with comments for a better understanding of the logic:</em></p><pre><code><span class="hljs-meta">#<span class="hljs-meta-keyword">include</span><span class="hljs-meta-string"><bits/stdc++.h></span></span><span class="hljs-keyword">using</span> <span class="hljs-keyword">namespace</span> <span class="hljs-built_in">std</span>;<span class="hljs-built_in">vector</span> <<span class="hljs-keyword">int</span>> add(<span class="hljs-built_in">vector</span><<span class="hljs-keyword">int</span>> x, <span class="hljs-built_in">vector</span><<span class="hljs-keyword">int</span>> y) <span class="hljs-comment">/* function to add two vectors */</span>{ reverse(x.begin(),x.end()); <span class="hljs-comment">/* reversing since we add from right to left
*/</span> reverse(y.begin(), y.end()); <span class="hljs-built_in">vector</span><<span class="hljs-keyword">int</span>> ans; <span class="hljs-comment">/* vector to store the answer */</span> <span class="hljs-keyword">int</span> min_len = min(x.size(), y.size()); <span class="hljs-comment">/* calculating min_len and max_len from sizes of both vectors */</span> <span class="hljs-keyword">int</span> max_len = max(x.size(), y.size()); <span class="hljs-keyword">int</span> carry = <span class="hljs-number">0</span>; <span class="hljs-comment">/* initially carry will be 0 */</span> <span class="hljs-keyword">for</span>(<span class="hljs-keyword">int</span> i=<span class="hljs-number">0</span>;i<min_len;i++) <span class="hljs-comment">/* looping till minimum digits of both the numbers and doing addition */</span> { <span class="hljs-keyword">int</span> digit_sum = x[i] + y[i] + carry; carry = digit_sum / <span class="hljs-number">10</span>; <span class="hljs-comment">/* tens digit of digit_sum will be carry */</span> ans.push_back(digit_sum % <span class="hljs-number">10</span>); <span class="hljs-comment">/* we store the ones digit as answer */</span> } <span class="hljs-comment">/* handling remaining digits */</span> <span class="hljs-keyword">if</span>(x.size() > y.size()) { <span class="hljs-keyword">for</span>(<span class="hljs-keyword">int</span> i=min_len; i < x.size(); i++) { <span class="hljs-keyword">int</span> digit_sum = x[i] + carry; carry = digit_sum / <span class="hljs-number">10</span>; ans.push_back(digit_sum % <span class="hljs-number">10</span>); } } <span class="hljs-keyword">if</span>(x.size() < y.size()) { <span class="hljs-keyword">for</span>(<span class="hljs-keyword">int</span> i=min_len; i < y.size(); i++) { <span class="hljs-keyword">int</span> digit_sum = y[i] + carry; carry = digit_sum / <span class="hljs-number">10</span>; ans.push_back(digit_sum % <span class="hljs-number">10</span>); } } <span class="hljs-comment">/* handling remaining
carry */</span> <span class="hljs-keyword">while</span>(carry) { ans.push_back(carry % <span class="hljs-number">10</span>); carry = carry / <span class="hljs-number">10</span>; } reverse(ans.begin(), ans.end()); <span class="hljs-comment">/* reversing the answer vector to get correct answer vector */</span> <span class="hljs-keyword">return</span> ans;}<span class="hljs-function"><span class="hljs-keyword">int</span> <span class="hljs-title">main</span><span class="hljs-params">()</span></span>{ <span class="hljs-built_in">string</span> a, b; <span class="hljs-built_in">cin</span> >> a >> b; <span class="hljs-built_in">vector</span><<span class="hljs-keyword">int</span>> a2,b2; <span class="hljs-keyword">for</span>(<span class="hljs-keyword">int</span> i=<span class="hljs-number">0</span>;i<a.size();i++) <span class="hljs-comment">/* storing characters as integers into vector */</span> a2.push_back(a[i] - <span class="hljs-string">'0'</span>); <span class="hljs-keyword">for</span>(<span class="hljs-keyword">int</span> i=<span class="hljs-number">0</span>; i<b.size(); i++) b2.push_back(b[i] - <span class="hljs-string">'0'</span>); <span class="hljs-built_in">vector</span> <<span class="hljs-keyword">int</span>> result = add(a2, b2); <span class="hljs-comment">/* calling and passing both vectors to add */</span> <span class="hljs-keyword">for</span>(<span class="hljs-keyword">auto</span> i: result) <span class="hljs-comment">/* printing output */</span> <span class="hljs-built_in">cout</span><<i;}</code></pre><p>And we are done.</p><div class="embed-wrapper"><div class="embed-loading"><div class="loadingRow"></div><div class="loadingRow"></div></div><a class="embed-card" href="https://media.giphy.com/media/11sBLVxNs7v6WA/giphy.gif">https://media.giphy.com/media/11sBLVxNs7v6WA/giphy.gif</a></div><p>Yeah, I also never thought addition could be so complex, but if we take a closer look, all we did is the fundamental addition from our childhood; we just never
considered so many cases.</p><p>So let's head forward and understand another important concept. I bet that when you started learning C++ you made a program to calculate the factorial of a number; the flaw of that program is that it can only calculate factorials up to 20, and from 21 onwards you'll face <em>overflow</em>.</p><p><em>Now we are going to remake that same program, but this time it'll be able to calculate factorials of much bigger numbers.</em></p><div class="embed-wrapper"><div class="embed-loading"><div class="loadingRow"></div><div class="loadingRow"></div></div><a class="embed-card" href="https://media.giphy.com/media/CjmvTCZf2U3p09Cn0h/giphy.gif">https://media.giphy.com/media/CjmvTCZf2U3p09Cn0h/giphy.gif</a></div><h3 id="computing-factorials-of-large-integers"><strong>Computing factorials of large integers</strong></h3><blockquote><p>The idea is the same as before: we take a number and store its factorial in a vector, but this time instead of addition we'll perform multiplication.</p></blockquote><h4 id="algorithm">Algorithm:</h4><ul><li>Input the number.</li><li>Initialize the answer vector and put 1 in it.
(multiplying any number by 1 gives the same number)</li><li>Loop from <em>i</em> = 2 to the number, and each time multiply the answer vector by <em>i</em>.</li><li>Reverse the answer vector to get the correct answer.</li><li>Print the output.</li></ul><p><em>Here's the code for the above problem:</em></p><pre><code><span class="hljs-meta">#<span class="hljs-meta-keyword">include</span><span class="hljs-meta-string"><bits/stdc++.h></span></span><span class="hljs-keyword">using</span> <span class="hljs-keyword">namespace</span> <span class="hljs-built_in">std</span>;<span class="hljs-function"><span class="hljs-keyword">void</span> <span class="hljs-title">factorial</span><span class="hljs-params">(<span class="hljs-built_in">vector</span><<span class="hljs-keyword">int</span>> &v, <span class="hljs-keyword">int</span> x)</span> <span class="hljs-comment">/* passing by reference so that changes also occur in the original vector */</span></span>{ <span class="hljs-keyword">int</span> carry = <span class="hljs-number">0</span>; <span class="hljs-comment">/* initially carry will be 0 */</span> <span class="hljs-comment">/* doing the fundamental way of multiplication */</span> <span class="hljs-keyword">for</span>(<span class="hljs-keyword">int</span> i = <span class="hljs-number">0</span>; i < v.size(); i++) { <span class="hljs-keyword">int</span> digit_product = v[i]*x + carry; carry = digit_product / <span class="hljs-number">10</span>; v[i] = (digit_product % <span class="hljs-number">10</span>); } <span class="hljs-comment">/* handling if carry remains */</span> <span class="hljs-keyword">while</span>(carry) { v.push_back(carry % <span class="hljs-number">10</span>); <span class="hljs-comment">/* push_back() adds the digit at the end of the vector, but in multiplication the carry goes in front, so we need to reverse the vector to get the correct answer */</span> carry /= <span class="hljs-number">10</span>; }}<span class="hljs-function"><span class="hljs-keyword">int</span> <span
class="hljs-title">main</span><span class="hljs-params">()</span></span>{ <span class="hljs-keyword">int</span> n; <span class="hljs-built_in">cin</span> >> n; <span class="hljs-built_in">vector</span> <<span class="hljs-keyword">int</span>> ans; <span class="hljs-comment">/* vector to store answer */</span> ans.push_back(<span class="hljs-number">1</span>); <span class="hljs-keyword">for</span>(<span class="hljs-keyword">int</span> i = <span class="hljs-number">2</span>; i <= n; i++) <span class="hljs-comment">/* looping from 2 to n */</span> { factorial(ans, i); <span class="hljs-comment">/* passing the vector and i to multiply the number in the vector by i */</span> } reverse(ans.begin(), ans.end()); <span class="hljs-comment">/* reversing the vector to get the correct answer */</span> <span class="hljs-keyword">for</span>(<span class="hljs-keyword">auto</span> i:ans) <span class="hljs-built_in">cout</span><<i;}</code></pre><p>Try this code on your own and grasp the concept behind it. I've made it as clear as I can with proper comments, but you should visualize the flow with pen and paper: the fundamental multiplication of each digit inside the vector by a number.</p><h2 id="whats-next">What's next!</h2><p>If you're feeling confident with this, I have some homework for you.</p><p><em>Try the subtraction of two big integers on your own.</em></p><p>And yeah, that's all for now!</p><p><strong>If you are still reading, make sure to follow me for more, as I have some exciting stuff coming up every weekend. See ya next time!</strong></p>]]>https://cdn.hashnode.com/res/hashnode/image/upload/v1618721834862/K-mRQeBYi.jpeg