
The Significance of Cost Function Selection in Model Training


In my last blog, I explained Linear Regression as follows:-


Y = h(X)


We also noted that, in the case of Linear Regression, both h(X) and Y are known for the training data, and we train an algorithm so that we can accurately predict Y for a new sample of X drawn from our test set.


"Cost Function" also called the "Error Function" is Linear Regression is the measure of the error incurred between the value we predicted of the object via the training algorithm and the actual value of the sample. Hence the term cost comes into place. In simple words, it is described as the difference.


(Predicted Value of Test Sample - Actual Value of Test Sample)
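As a quick illustration with made-up numbers, here is how that per-sample error (and its square, which the cost function uses) might be computed:

```python
# Hypothetical numbers for a single test sample
predicted_value = 112.0   # what our trained hypothesis h(x) outputs
actual_value = 100.0      # the true value Y for that sample

error = predicted_value - actual_value   # the raw difference
squared_error = error ** 2               # squaring removes the sign and penalizes large misses
print(error, squared_error)              # 12.0 144.0
```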


To explain this concept further, we consider a set of training data represented by the following notations:-


m : Number of training examples

x's : Input variables

y's : Output variables

(x, y) : One training example (a single instance out of the m examples)

(x^(i), y^(i)) : The i-th training example

Problem:- We want to predict the sales of a product based on the previous sales pattern. Some of the parameters we can consider for this problem would be:-


  1. Past Sales.

  2. Economic Trends.

  3. Inflation.

  4. Competitor rates.

  5. Sales pattern based on ad campaigns and other modes of publicity.


All the above will form the different variables of our hypothesis.
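To make this concrete, below is a minimal sketch of what such a multi-variable hypothesis could look like in Python. The feature names, their values, and the theta parameters are all made up for illustration:

```python
# Hypothetical feature values for one month, scaled to comparable ranges
features = {
    "past_sales": 0.8,
    "economic_trend": 0.5,
    "inflation": 0.3,
    "competitor_rates": 0.6,
    "ad_campaign_effect": 0.7,
}

# Hypothetical parameters (thetas); in practice these are learned by training
theta_0 = 10.0  # intercept term
thetas = {
    "past_sales": 4.0,
    "economic_trend": 1.5,
    "inflation": -2.0,
    "competitor_rates": -1.0,
    "ad_campaign_effect": 2.5,
}

# h(x) = theta_0 + theta_1 * x_1 + ... + theta_n * x_n
predicted_sales = theta_0 + sum(thetas[name] * value for name, value in features.items())
print(predicted_sales)  # 14.5 for these made-up numbers
```

In practice, the thetas would be learned by the training algorithm rather than set by hand.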


Formally, the cost function is the average of the squared differences between the predicted values and the actual values over the entire training set:


J(theta_0, theta_1) = (1 / 2m) * sum from i = 1 to m of ( h(x^(i)) - y^(i) )^2


The above equation represents what is known as the "Cost Function" or the "Squared Error Cost Function". (The extra factor of 1/2 is a convention that makes the derivative cleaner later on.)
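A direct translation of this equation into Python, assuming plain lists of numbers for the inputs and outputs (all figures made up):

```python
def squared_error_cost(hypothesis, xs, ys):
    """J = (1 / 2m) * sum over i of (hypothesis(x_i) - y_i) ** 2."""
    m = len(xs)
    return sum((hypothesis(x) - y) ** 2 for x, y in zip(xs, ys)) / (2 * m)

# Toy training set (made-up numbers) and a made-up hypothesis h(x) = 2x
xs = [1.0, 2.0, 3.0]
ys = [2.1, 3.9, 6.2]
print(squared_error_cost(lambda x: 2 * x, xs, ys))  # ~0.01
```

A perfect hypothesis would give J = 0; the worse the fit, the larger J grows.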


For simplicity, let's consider this hypothesis to be a single-variable one. This would be called "Univariate Linear Regression". Hence our hypothesis will look like:-


h(x) = theta_0 + theta_1 * x

A simplified version of the above equation arises if we set theta-0 = 0: the hypothesis becomes h(x) = theta_1 * x, a straight line passing through the origin, as in the figure below. This forms the "simplified version of the cost function (Intuition-1)".



[Figure: Univariate Linear Regression hypothesis, a straight line through the origin]


We have to choose the values of the thetas in a way that the quantity below is minimized (in Intuition-1, with theta-0 = 0, only theta-1 varies):


minimize over theta_0, theta_1 : J(theta_0, theta_1) = (1 / 2m) * sum from i = 1 to m of ( h(x^(i)) - y^(i) )^2
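Using a toy dataset (made-up numbers), a quick sweep over a few values of theta-1 with theta-0 fixed at 0 shows how the cost changes with the slope of the line:

```python
# Toy training set (made-up numbers). With theta_0 fixed at 0,
# the hypothesis is h(x) = theta_1 * x, a line through the origin.
xs = [1.0, 2.0, 3.0]
ys = [1.0, 2.0, 3.0]
m = len(xs)

for theta_1 in [0.0, 0.5, 1.0, 1.5]:
    j = sum((theta_1 * x - y) ** 2 for x, y in zip(xs, ys)) / (2 * m)
    print(f"theta_1 = {theta_1:<4}  J = {j:.3f}")
# J bottoms out at theta_1 = 1.0, the slope that fits this data exactly.
```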

The second case is when we keep both parameters, theta-0 and theta-1, non-zero. This forms the "simplified version of the cost function (Intuition-2)".


Below is a graphical representation of how the cost function J varies across different sets of values for theta-0 and theta-1. Plotting J against both parameters produces a bowl-shaped surface (often drawn as a contour plot) similar to the one below. It is an approximation for representation.


[Figure: cost function surface for Univariate Linear Regression]
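For the curious, here is a minimal sketch of how such a surface could be plotted for a toy dataset, assuming numpy and matplotlib are available; all numbers are made up:

```python
import numpy as np
import matplotlib.pyplot as plt

# Toy training set (made-up numbers)
xs = np.array([1.0, 2.0, 3.0])
ys = np.array([1.5, 2.5, 3.5])
m = len(xs)

# Evaluate J(theta_0, theta_1) on a grid of parameter values
theta_0_vals = np.linspace(-2, 4, 100)
theta_1_vals = np.linspace(-1, 3, 100)
T0, T1 = np.meshgrid(theta_0_vals, theta_1_vals)

# For each grid point, sum the squared errors over the training set
J = sum((T0 + T1 * x - y) ** 2 for x, y in zip(xs, ys)) / (2 * m)

plt.contour(T0, T1, J, levels=30)  # the bowl appears as nested ellipses
plt.xlabel("theta_0")
plt.ylabel("theta_1")
plt.title("Cost function J(theta_0, theta_1)")
plt.show()
```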

The Cost Function is at the heart of one of the most significant algorithms in Linear Regression, the Gradient Descent algorithm. The goal of Gradient Descent is to keep changing the values of theta-0 and theta-1 so as to minimize the cost function. Understanding the core concept of the cost function forms the basis of the very first Machine Learning algorithm that we will learn about in the next blog.
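As a small preview of the next blog, here is a minimal sketch of the gradient descent updates for theta-0 and theta-1 on a toy dataset; the learning rate, iteration count, and data are all made up for illustration:

```python
# Toy training set where y = 2x exactly (made-up numbers)
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]
m = len(xs)
theta_0, theta_1 = 0.0, 0.0
alpha = 0.1  # learning rate, chosen arbitrarily for illustration

for step in range(1000):
    errors = [(theta_0 + theta_1 * x) - y for x, y in zip(xs, ys)]
    # Partial derivatives of J with respect to theta_0 and theta_1
    grad_0 = sum(errors) / m
    grad_1 = sum(e * x for e, x in zip(errors, xs)) / m
    # Update both parameters simultaneously
    theta_0, theta_1 = theta_0 - alpha * grad_0, theta_1 - alpha * grad_1

print(theta_0, theta_1)  # converges towards theta_0 = 0.0, theta_1 = 2.0
```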


Till then, keep visiting and let me know your thoughts on what I can improve and explain better so it helps you. You can connect with me on Priyadarshani Pandey | LinkedIn or email me at priyadarshani.pandey@gmail.com

