Machine Learning: Regression Analysis #1
Introduction
If you aspire to become a data scientist, regression will probably be one of the first algorithms you learn. It is one of the best-known and easiest algorithms in machine learning. Learning regression analysis will help you not only to clear job interviews but also to solve real-world problems.
This article focuses mainly on Linear Regression, which is one form of regression.
What is Regression?
Regression is a statistical technique used to predict a continuous dependent variable from a set of independent variables. Regression analysis needs certain criteria to be met in order to work properly (we will discuss them in another post). When those criteria are met, the algorithm can give very good results.
Mathematically, linear regression is a function of the form,
y = a0 + a1x1 + a2x2 + ... + aNxN
where y is the dependent variable,
x1, x2, x3, ..., xN are the independent variables, or features, with which we will predict y,
a0, a1, a2, a3, ..., aN are the parameters, also called coefficients or weights.
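To make the formula concrete, here is a minimal sketch in Python. The function name predict and the numbers in the example are made up for illustration; they are not from any library or dataset.

```python
def predict(coefficients, features):
    """Return y = a0 + a1*x1 + ... + aN*xN.

    coefficients: [a0, a1, ..., aN], where a0 is the intercept
    features:     [x1, x2, ..., xN]
    """
    if len(coefficients) != len(features) + 1:
        raise ValueError("expected one coefficient per feature plus an intercept")
    a0, weights = coefficients[0], coefficients[1:]
    # Multiply each feature by its weight, add everything up, then add a0.
    return a0 + sum(a * x for a, x in zip(weights, features))


# Illustrative numbers only: a0 = 1.0, a1 = 2.0, a2 = 0.5, x1 = 3.0, x2 = 4.0
print(predict([1.0, 2.0, 0.5], [3.0, 4.0]))  # 1.0 + 2.0*3.0 + 0.5*4.0 = 9.0
```

With a single feature, the same function reduces to the equation of a straight line.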
A simple linear regression model, with only one feature, looks like this,
y = a0 + a1x1
which is the equation of a straight line with a0 as the intercept and a1 as the slope or gradient.
A simple linear regression may look like this:
[Figure: a simple linear regression fit, with the regression line drawn in blue]
The blue line is the regression line. It is used to predict the values of the dependent variable y.
Uses of Regression
Suppose you want to estimate the price of a house from the number of bedrooms or the land area. Since the price of a house is a continuous value, regression is a good fit for predicting it. There are many other problems of this kind where you can use regression.
How to find the regression line?
Regression analysis is a supervised learning algorithm, i.e. we train it on examples for which the value of the dependent variable is already known. Once the model is trained, we can predict the value for new, unseen examples.
Ordinary Least Squares Method
The most common way to find the regression line is the Ordinary Least Squares (OLS) method. We compute the sum of squared errors of our regression model and try to minimize it. The error is defined as,
error = (h(x1) - y1)^2 + (h(x2) - y2)^2 + ... + (h(xm) - ym)^2
where h(x) is the hypothesis function or regression function,
h(xi) is the predicted value for xi,
yi is the actual value for xi,
m is the total number of training examples.
Some texts also divide this sum by m or 2m. That scaling does not change which coefficients minimize it.
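Here is a small sketch of the sum of squared errors for a one-feature model, assuming the plain sum with no scaling constant. The function names and the data points are illustrative only.

```python
def hypothesis(a0, a1, x):
    """h(x) = a0 + a1*x for a single feature."""
    return a0 + a1 * x


def sum_squared_error(a0, a1, xs, ys):
    """Add up (h(xi) - yi)^2 over all m training examples."""
    return sum((hypothesis(a0, a1, x) - y) ** 2 for x, y in zip(xs, ys))


# Made-up training examples.
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 7.0]

# Two candidate lines: the first fits these points better, so its error is lower.
print(sum_squared_error(0.0, 2.0, xs, ys))  # errors 0, 0, -1  -> 1.0
print(sum_squared_error(1.0, 1.0, xs, ys))  # errors 0, -1, -3 -> 10.0
```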
Our objective is to find the hypothesis, or regression function, h(x) for which this error (the cost function) is smallest. OLS treats the data as a matrix and uses linear algebra operations to solve for the optimal coefficients directly. This means that all of the data must be available at once, and you must have enough memory to hold it and perform the matrix operations.
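For a single feature, the matrix equation behind OLS (the normal equations) shrinks to a 2x2 linear system that can be solved by hand. The sketch below does exactly that in plain Python. It is a simplified illustration for the one-feature case, not a general implementation, and fit_ols and the data are hypothetical.

```python
def fit_ols(xs, ys):
    """Solve the one-feature normal equations for (a0, a1).

    The system is:
        m*a0      + sum(x)*a1   = sum(y)
        sum(x)*a0 + sum(x^2)*a1 = sum(x*y)
    """
    m = len(xs)
    if m != len(ys) or m < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xx = sum(x * x for x in xs)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    det = m * sum_xx - sum_x * sum_x
    if det == 0:
        raise ValueError("all x values are identical; the line is not unique")
    # Cramer's rule on the 2x2 system.
    a0 = (sum_xx * sum_y - sum_x * sum_xy) / det
    a1 = (m * sum_xy - sum_x * sum_y) / det
    return a0, a1


# Made-up data lying exactly on y = 1 + 2x.
a0, a1 = fit_ols([1.0, 2.0, 3.0, 4.0], [3.0, 5.0, 7.0, 9.0])
print(a0, a1)  # 1.0 2.0
```

Note that the function needs every data point before it can compute the sums, which is the memory requirement mentioned above.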
Gradient Descent
Another way to find the regression line is through Gradient Descent. It involves knowing the cost function as well as the derivative of the cost function. You can think of gradient descent as going downhill from a random point until you find a minimum. The algorithm involves a learning rate alpha which is analogous to stride length while going downhill.
At each step, we update every coefficient as follows,
ai = ai - alpha * (derivative of the cost function with respect to ai)
where ai are the coefficients.
The above step is repeated until the error becomes less than a threshold value.
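The update rule can be sketched for a one-feature model as follows. A few things here are assumptions rather than part of the description above: the cost is taken to be the sum of squared errors divided by 2m, the coefficients start at zero instead of a random point so the run is repeatable, and a max_steps cap is added so the loop always ends. The function names and the data are illustrative.

```python
def cost(a0, a1, xs, ys):
    """Assumed cost: sum of squared errors divided by 2m."""
    m = len(xs)
    return sum((a0 + a1 * x - y) ** 2 for x, y in zip(xs, ys)) / (2 * m)


def gradient_descent(xs, ys, alpha=0.05, threshold=1e-6, max_steps=100_000):
    """Fit y = a0 + a1*x by repeatedly stepping downhill on the cost."""
    a0, a1 = 0.0, 0.0  # fixed starting point (assumption; could be random)
    m = len(xs)
    for _ in range(max_steps):
        # Stop once the error is below the threshold.
        if cost(a0, a1, xs, ys) < threshold:
            break
        errors = [a0 + a1 * x - y for x, y in zip(xs, ys)]
        # Derivatives of the assumed cost with respect to a0 and a1.
        grad_a0 = sum(errors) / m
        grad_a1 = sum(e * x for e, x in zip(errors, xs)) / m
        # ai = ai - alpha * derivative, applied to both coefficients at once.
        a0, a1 = a0 - alpha * grad_a0, a1 - alpha * grad_a1
    return a0, a1


# Made-up data lying exactly on y = 1 + 2x.
a0, a1 = gradient_descent([1.0, 2.0, 3.0, 4.0], [3.0, 5.0, 7.0, 9.0])
print(a0, a1)  # close to 1 and 2
```

If alpha is too large, the steps overshoot the minimum and the error grows instead of shrinking. If it is too small, the descent takes many more steps.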
These are two of the most commonly used ways to find the regression line. In the next post, we will go deeper into both algorithms.