Feature Scaling
Feature scaling has two main techniques:
- Standardization
- Normalization
What are standardization and normalization?
Why are they used?
The terms normalization and standardization are sometimes used interchangeably,
but they usually refer to different things.
Standardization rescales data to have a mean (𝜇) of 0 and standard deviation (𝜎) of 1 (unit variance).
The formula for standardization is
$X_{changed} = \frac{X - \mu}{\sigma} $
For most applications standardization is recommended.
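As an illustration (not part of the original notes), here is a minimal sketch of standardization using scikit-learn's StandardScaler; the array values are made up for the example.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical numeric feature matrix (values are made up for illustration)
X = np.array([[25.0, 40000.0],
              [32.0, 52000.0],
              [47.0, 61000.0],
              [51.0, 58000.0]])

scaler = StandardScaler()          # rescales each column to mean 0, std 1
X_std = scaler.fit_transform(X)    # learns mu and sigma per column, then applies (X - mu) / sigma

print(X_std.mean(axis=0))  # ~0 for every column
print(X_std.std(axis=0))   # ~1 for every column
```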
Normalization, in contrast, rescales a variable so that its values fall in the range [0, 1].
This can be useful when all features need to be on the same positive scale.
However, min-max scaling is sensitive to outliers: a single extreme value compresses all the other values into a narrow part of the [0, 1] range.
The formula for normalization is
$ X_{changed} = \frac{X - X_{min}}{X_{max}-X_{min}} $
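A similar sketch for normalization, using scikit-learn's MinMaxScaler (again, the values are made up for illustration):

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Same hypothetical feature matrix as above (values are made up)
X = np.array([[25.0, 40000.0],
              [32.0, 52000.0],
              [47.0, 61000.0],
              [51.0, 58000.0]])

scaler = MinMaxScaler()            # rescales each column to [0, 1]
X_norm = scaler.fit_transform(X)   # applies (X - X_min) / (X_max - X_min) per column

print(X_norm.min(axis=0))  # 0 for every column
print(X_norm.max(axis=0))  # 1 for every column
```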
We must scale the data only after the train/test split, and the scaler should be fitted only on x_train. If we fitted it on the whole dataset, it would learn the mean and standard deviation of x_test, which is information that should remain hidden from the model (data leakage). So we fit the scaler on x_train, and then use that same fitted scaler to transform x_test.
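A minimal sketch of this fit-on-train, transform-test pattern (the variable names x_train and x_test follow the notes; the data itself is made up):

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Hypothetical numeric data; x and y values are random, for illustration only
x = np.random.rand(100, 3)
y = np.random.randint(0, 2, size=100)

x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.2, random_state=42)

sc = StandardScaler()
x_train = sc.fit_transform(x_train)  # fit on the training set only (learn mu and sigma from x_train)
x_test = sc.transform(x_test)        # reuse the training statistics; never fit on x_test
```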
One very important question is:
Do we have to apply standardization to the dummy variables in the matrix of features?
The answer is no.
The goal of standardization is to transform the features so that most values fall roughly in the range (-3, +3), but the dummy variables are already 0s and 1s after encoding with ColumnTransformer, OneHotEncoder and LabelEncoder, so nothing extra needs to be done to them.
Moreover, standardizing them would replace the interpretable 0s and 1s with arbitrary-looking numerical values, which only makes the data harder to understand.
Feature scaling on the numerical features improves the model, but applying it to the dummy variables makes the data unreadable, and we can no longer relate, for example, the country to the salary.
So feature scaling should be applied to the dataset but not to the dummy variables, as they were already encoded using ColumnTransformer, OneHotEncoder and LabelEncoder.
Therefore we should apply feature scaling only to the non-dummy columns, i.e., the numerical features, as sketched below.
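A sketch of scaling only the numerical columns while leaving the one-hot encoded dummy columns untouched. The column layout here is an assumption for illustration: the first three columns hold the encoded country dummies and the last two hold Age and Salary, with made-up values.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical matrix of features after encoding:
# columns 0-2 are the country dummy variables, columns 3-4 are Age and Salary (made-up values)
x_train = np.array([[1.0, 0.0, 0.0, 44.0, 72000.0],
                    [0.0, 1.0, 0.0, 27.0, 48000.0],
                    [0.0, 0.0, 1.0, 30.0, 54000.0],
                    [1.0, 0.0, 0.0, 38.0, 61000.0]])
x_test = np.array([[0.0, 1.0, 0.0, 35.0, 58000.0]])

sc = StandardScaler()
# Scale only the numerical columns (index 3 onwards); leave the dummy columns as 0s and 1s
x_train[:, 3:] = sc.fit_transform(x_train[:, 3:])
x_test[:, 3:] = sc.transform(x_test[:, 3:])
```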