### Scikit-Learn *Logistic Regression* Implementation

*Scikit-Learn* has a *Logistic Regression* implementation that fits a model to a set of training data and can classify new or test data points into their respective classes. The important parameters can be specified, such as the norm used for penalization and the solver used for optimization.
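As a minimal sketch of this workflow, the snippet below fits `LogisticRegression` on synthetic data and predicts classes and probabilities for held-out points. The dataset, the split, and the specific parameter values are assumptions chosen for illustration; how the penalty norm is selected can differ between scikit-learn releases, so check the documentation for the installed version.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic two-class data, used here only for illustration.
X, y = make_classification(n_samples=200, n_features=4, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# penalty selects the regularization norm, solver the optimization routine,
# and C the inverse regularization strength (values here are illustrative).
model = LogisticRegression(penalty="l2", solver="lbfgs", C=1.0, max_iter=1000)
model.fit(X_train, y_train)

predicted_classes = model.predict(X_test)              # hard class labels
predicted_probabilities = model.predict_proba(X_test)  # one column per class
```

Not every solver supports every penalty, so the two parameters usually have to be chosen together.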

*Logistic Regression* sigmoid function

*Logistic Regression* models use the *sigmoid function* to map the *log-odds* of a data point to the range (0, 1), providing a probability for the classification decision. The *sigmoid function* is widely used in machine learning classification problems because its output can be interpreted as a probability and its derivative is easy to calculate.
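The sketch below shows one way to write the sigmoid and its derivative in plain Python. The function names are illustrative, and the two-branch form is just one common way to avoid overflow for inputs of large magnitude.

```python
import math


def sigmoid(z: float) -> float:
    """Map a log-odds value z to a probability."""
    # Two algebraically equal branches keep math.exp from overflowing.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    exp_z = math.exp(z)
    return exp_z / (1.0 + exp_z)


def sigmoid_derivative(z: float) -> float:
    """The derivative can be written using the sigmoid itself: s * (1 - s)."""
    s = sigmoid(z)
    return s * (1.0 - s)
```

A log-odds of 0 maps to a probability of 0.5, and the derivative reaches its maximum of 0.25 at that point.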

*Classification Threshold* definition

A *Classification Threshold* determines the cutoff at which the probabilistic output of a machine learning algorithm classifies data samples as belonging to the positive or negative class. A *Classification Threshold* of 0.5 is a common default, but a particular classification problem may need a tuned threshold to improve the metric that matters for it, such as accuracy, precision, or recall.
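To illustrate how the threshold changes the predicted labels, here is a small sketch with made-up probabilities. The helper name `classify` and the convention that a probability equal to the threshold counts as positive are assumptions for this example.

```python
def classify(probabilities, threshold=0.5):
    """Label a sample 1 (positive) when its probability reaches the threshold."""
    return [1 if p >= threshold else 0 for p in probabilities]


# Made-up predicted probabilities of the positive class.
probabilities = [0.10, 0.35, 0.50, 0.65, 0.90]

default_labels = classify(probabilities)                 # threshold 0.5
strict_labels = classify(probabilities, threshold=0.7)   # fewer positives
lenient_labels = classify(probabilities, threshold=0.3)  # more positives
```

Raising the threshold trades false positives for false negatives, and lowering it does the opposite.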

*Logistic Regression* interpretability

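*Logistic Regression* models have high interpretability compared to most classification algorithms because each feature has a fitted coefficient. A coefficient can be thought of as a measure of how sensitive the prediction is to that feature's value: it is the change in the log-odds of the positive class for a one-unit increase in the feature, with the other features held fixed.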
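One common way to read coefficients is to exponentiate them into odds ratios. The sketch below uses hypothetical feature names and coefficient values that do not come from a fitted model.

```python
import math

# Hypothetical coefficients keyed by made-up feature names.
coefficients = {"hours_studied": 0.8, "hours_slept": 0.0, "missed_classes": -0.5}

# exp(coefficient) is the factor by which the odds of the positive class
# are multiplied when the feature increases by one unit.
odds_ratios = {name: math.exp(c) for name, c in coefficients.items()}
```

An odds ratio above 1 means the feature pushes toward the positive class, below 1 toward the negative class, and exactly 1 means no effect.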

*Log-Odds* calculation

The sum of the products of the feature coefficients and feature values, plus the intercept, in a *Logistic Regression* model is the *Log-Odds* of a data sample belonging to the positive class. Log-odds can take any real value and are an indirect way to express probabilities.
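A short worked sketch of the calculation, using made-up coefficients, intercept, and feature values (none of these numbers come from a fitted model):

```python
import math


def log_odds(coefficients, intercept, features):
    """Intercept plus the sum of coefficient * feature value."""
    return intercept + sum(c * x for c, x in zip(coefficients, features))


def probability_from_log_odds(z):
    """Convert log-odds back to a probability with the sigmoid."""
    return 1.0 / (1.0 + math.exp(-z))


coefficients = [0.8, -0.5]  # illustrative values
intercept = -1.0
features = [2.0, 1.0]

z = log_odds(coefficients, intercept, features)  # -1.0 + 1.6 - 0.5
p = probability_from_log_odds(z)
```

Going the other way, `math.log(p / (1 - p))` recovers the log-odds from the probability.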

### Logistic Regression Classifier

*Logistic Regression* is a supervised binary classification algorithm used to predict binary response variables that may indicate the presence or absence of some state. It is possible to extend *Logistic Regression* to multi-class classification problems by creating several one-vs-all binary classifiers. In a one-vs-all scheme, n - 1 classes are grouped as one and a classifier learns to discriminate the remaining class from the grouped classes.
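The relabeling step of a one-vs-all scheme can be sketched as follows. The helper name and the example labels are hypothetical; a binary classifier would then be trained on each relabeled problem.

```python
def one_vs_rest_labels(labels):
    """Return {class: binary labels}, where 1 marks that class and 0 all others."""
    classes = sorted(set(labels))
    return {c: [1 if label == c else 0 for label in labels] for c in classes}


labels = ["cat", "dog", "bird", "dog", "cat"]  # made-up multi-class labels
binary_problems = one_vs_rest_labels(labels)
```

At prediction time, a typical choice is to assign the class whose binary classifier reports the highest probability.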

*Logistic Regression* prediction

*Logistic Regression* models predict the probability of an n-dimensional data point belonging to a specific class by constructing a linear decision boundary. This decision boundary splits the n-dimensional space into two half-spaces. In the prediction stage, the point is classified according to the half-space it falls in, which corresponds to the class with the higher predicted probability.

*Logistic Regression* cost function

The cost function measuring the inaccuracy of a *Logistic Regression* model across all samples is *Log Loss*. The lower this value, the closer the predicted probabilities are to the true labels, which generally goes with greater overall classification accuracy. *Log Loss* is also known as *Cross Entropy* loss.
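A minimal sketch of binary log loss in plain Python follows. The function name `binary_log_loss`, the clipping constant, and the sample values are assumptions for illustration.

```python
import math


def binary_log_loss(y_true, y_prob, eps=1e-15):
    """Average negative log-likelihood of the true labels (0 or 1)."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)  # clip so log never receives 0
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return -total / len(y_true)


y_true = [1, 0, 1, 0]
confident_and_right = binary_log_loss(y_true, [0.9, 0.1, 0.8, 0.2])
confident_and_wrong = binary_log_loss(y_true, [0.1, 0.9, 0.2, 0.8])
```

Confident wrong predictions are penalized much more heavily than confident correct ones are rewarded, so `confident_and_wrong` is far larger than `confident_and_right`.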

### Statistical Dependence

In statistics, two events are *dependent* if the occurrence of one of the events changes the probability of the other event occurring.

### Bayes Theorem

*Bayes Theorem* calculates the probability of `A` given `B` as the probability of `B` given `A` multiplied by the probability of `A` divided by the probability of `B`:

$$P(A \mid B) = \frac{P(B \mid A) \, P(A)}{P(B)}$$

This theorem describes the probability of an event (`A`), based on prior knowledge of conditions (`P(B|A)`) that might be related to the event.
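As a worked sketch, the snippet below applies the theorem to a hypothetical screening test. All the probabilities are made-up numbers, and `P(B)` is obtained from the law of total probability.

```python
def bayes_posterior(p_b_given_a, p_a, p_b):
    """P(A|B) = P(B|A) * P(A) / P(B)."""
    return p_b_given_a * p_a / p_b


# Hypothetical numbers: A = has the condition, B = test is positive.
p_a = 0.01              # P(A): prior probability of the condition
p_b_given_a = 0.95      # P(B|A): positive test when the condition is present
p_b_given_not_a = 0.05  # P(B|not A): false positive rate

# Law of total probability gives the overall chance of a positive test.
p_b = p_b_given_a * p_a + p_b_given_not_a * (1 - p_a)

p_a_given_b = bayes_posterior(p_b_given_a, p_a, p_b)
```

With these numbers the posterior stays well below 0.5 even after a positive test, because the condition is rare to begin with.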

### Statistical Independence

In statistics, two events are *independent* if the occurrence of one event does not affect the probability of the second event occurring.
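Independence can be checked numerically with the product rule: two events are independent exactly when `P(A and B) = P(A) * P(B)`. The sketch below tests this on two fair dice; the event definitions and helper names are chosen for illustration.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes of rolling two fair dice.
outcomes = list(product(range(1, 7), repeat=2))


def probability(event):
    """Exact probability of an event given as a predicate on an outcome."""
    return Fraction(sum(1 for o in outcomes if event(o)), len(outcomes))


def are_independent(event_a, event_b):
    """Product rule: P(A and B) == P(A) * P(B)."""
    p_both = probability(lambda o: event_a(o) and event_b(o))
    return p_both == probability(event_a) * probability(event_b)


def first_is_six(o):
    return o[0] == 6


def second_is_even(o):
    return o[1] % 2 == 0


def sum_is_twelve(o):
    return sum(o) == 12


independent_pair = are_independent(first_is_six, second_is_even)
dependent_pair = are_independent(first_is_six, sum_is_twelve)
```

The first pair passes the check because one die tells nothing about the other. The second pair fails it, since a sum of twelve is only possible when the first die shows six, so those events are dependent.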