Decision Tree Regression
A decision tree is a flowchart-like model. Each internal node represents a test on an attribute (e.g. whether a die roll comes up 1 or 6), each branch represents an outcome of that test, and each leaf node represents the decision reached after all the tests along the path have been evaluated. In a classification tree the leaf holds a class label; in a regression tree, as used here, it holds a numeric value (the predicted salary).
Here is what a decision tree looks like:
[Figure: example decision tree]
In this notebook we use a decision tree to predict the salary of an employee with a position level of 7.5.
In [1]:
# Importing the libraries
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
In [2]:
df = pd.DataFrame({
    'Position': ['Business Analyst', 'Junior Consultant', 'Senior Consultant', 'Manager', 'Country Manager', 'Region Manager', 'Partner', 'Senior Partner', 'C-level', 'CEO'],
    'Level': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    'Salary': [45000, 50000, 60000, 80000, 110000, 150000, 200000, 300000, 500000, 1000000],
})
x = df.iloc[:, 1:2].values  # feature matrix (Level), shape (10, 1)
y = df.iloc[:, 2:3].values  # target (Salary), shape (10, 1)
print(df)
In [4]:
print(x.T, '\n',y.T)
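To make the idea of a "test" at an internal node more concrete, here is a minimal sketch of how a regression tree might choose its first split on this data: try every midpoint between consecutive levels as a threshold and keep the one that gives the smallest total squared error on the two sides. The helpers `sse` and `best_split` are hypothetical names written for this illustration, and the sketch is not scikit-learn's actual implementation.

```python
# Levels and salaries from the DataFrame above
levels = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
salaries = [45000, 50000, 60000, 80000, 110000,
            150000, 200000, 300000, 500000, 1000000]

def sse(values):
    """Sum of squared errors around the mean (0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)

def best_split(xs, ys):
    """Return (threshold, total_sse) for the best single split 'x <= threshold'."""
    pairs = sorted(zip(xs, ys))
    best = None
    for i in range(1, len(pairs)):
        # Candidate threshold: midpoint between two consecutive x values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [y for x, y in pairs if x <= threshold]
        right = [y for x, y in pairs if x > threshold]
        total = sse(left) + sse(right)
        if best is None or total < best[1]:
            best = (threshold, total)
    return best

threshold, total = best_split(levels, salaries)
print("first split: Level <=", threshold)
```

A full tree repeats the same search inside each side of the split until a stopping rule is met.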
In [7]:
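from sklearn.tree import DecisionTreeRegressor
# Fit a regression tree; random_state makes the fit reproducible
regressor = DecisionTreeRegressor(random_state = 0)
regressor.fit(x, y)
# Predicting a new result for position level 7.5
y_pred = regressor.predict([[7.5]])
print(y_pred)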
In [9]:
# Dense grid of levels so the plot shows the steps in the tree's predictions
X_bar = np.arange(x.min(), x.max(), 0.01)
X_bar = X_bar.reshape(-1, 1)
plt.scatter(x, y, color = 'red')
plt.plot(X_bar, regressor.predict(X_bar), color = 'blue')
plt.title('Truth or Bluff (Decision Tree Regression)')
plt.xlabel('Position level')
plt.ylabel('Salary')
plt.show()
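The step-shaped curve in the plot follows from how a tree predicts: every input that falls into the same leaf gets the same value. Below is a minimal sketch of that behaviour. It assumes the tree is grown until each leaf holds a single training point, that thresholds sit at the midpoints between consecutive levels, and that ties go to the lower leaf. The `thresholds` list and the `predict` helper are hypothetical and not part of scikit-learn.

```python
from bisect import bisect_left

# Levels and salaries from the DataFrame above
levels = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
salaries = [45000, 50000, 60000, 80000, 110000,
            150000, 200000, 300000, 500000, 1000000]

# Assumed split points: midpoints between consecutive levels (1.5, 2.5, ..., 9.5)
thresholds = [(a + b) / 2 for a, b in zip(levels, levels[1:])]

def predict(level):
    """Piecewise-constant prediction: return the salary of the leaf the level falls into."""
    # bisect_left sends a level equal to a threshold to the lower leaf (x <= threshold)
    return salaries[bisect_left(thresholds, level)]

print(predict(7.5))   # 200000: same leaf as level 7
print(predict(7.6))   # 300000: same leaf as level 8
```

Under these assumptions the prediction only changes when the input crosses a threshold, which is why the curve is flat between jumps instead of a smooth line.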