Batch Gradient Ascent and Logistic Regression

Photo by Giuseppe Famiani on Unsplash

For this blog, I will explain logistic regression and how the beta values can be calculated through gradient ascent of the maximum log-likelihood.

Logistic regression is widely used for binary classifications where the dependant variables are limited to be either 0 or 1. Therefore, our predicted values should also range between 0 and 1 with a 50% threshold. This is to say, if the model returns a value greater than or equal to 50% then it will be classified as 1, and if less than 50, classified as 0.

Logistic regression is very similar to linear regression as they are both additive models. However, the values output values should range from [0, 1]. A common way to return this value is to utilize a sigmoid function.

The probability of y to be 1 given x variables parameterized by the theta(feature) values will then equal h_{\theta}(x). And because y can only be 0 or 1 the probability of y to be 0 given x variables parameterized by theta will equal 1 — h_{\theta}(x).

The likelihood of the theta(all parameters, variables) is the product of the probability of the data.

The objective is to now choose theta values that will maximize the log-likelihood. Very similarly to gradient descent, I will update each beta values simultaneously by taking the partial derivative. However, since I am now trying to climb to the top I will add the partial derivative of the log-likelihood instead of subtracting.

I hope this was helpful in understanding gradient ascent and its application to logistic regression.

Hello! My name is Albert Um.

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store