
🌲🌳 Decision Tree 🌲🌳


The Decision Tree algorithm comes under supervised learning; it is used for both

regression and classification.

Important Terminology related to Decision tree:

  1. Root Node: Represents the entire population or sample; it is further divided into two or more homogeneous sets.
  2. Splitting: The process of dividing a node into two or more sub-nodes.
  3. Decision Node: When a sub-node splits into further sub-nodes, it is called a decision node.
  4. Leaf / Terminal Node: Nodes that do not split are called leaf or terminal nodes.
  5. Pruning: Removing the sub-nodes of a decision node; you can think of it as the opposite of splitting.
  6. Branch / Sub-Tree: A subsection of the entire tree is called a branch or sub-tree.
  7. Parent and Child Node: A node that is divided into sub-nodes is called the parent node of those sub-nodes, and the sub-nodes are its children.

Before starting with decision trees, let's understand these three topics:
a) Entropy
b) Information Gain
c) Gini impurity

I will explain all of these in simple words, don't worry!

Entropy:- Entropy helps us measure the purity of a split.
> Entropy controls how a decision tree decides to split the data.
> Suppose we have three input features f1, f2, f3. Out of these three, which feature should we select first to start the tree?
> Selecting the best feature saves time and memory, improves model performance, and reaches the leaf nodes earlier.
> For binary classification, the entropy value ranges from 0 to 1.
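The entropy of a node can be computed directly from its class labels. A minimal sketch using only the standard library (the function name `entropy` is my own choice, not from any particular library):

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy of a list of class labels (base-2 logarithm,
    so a 50/50 binary split gives the maximum value of 1)."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

# A pure node has entropy 0; a perfectly mixed binary node has entropy 1.
assert entropy(["yes", "yes", "yes", "yes"]) == 0.0
assert entropy(["yes", "yes", "no", "no"]) == 1.0
```

A node where all samples belong to one class is "pure" (entropy 0), and the tree prefers splits that push nodes toward purity.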


Information Gain:- Information gain is the amount of information gained about a random variable from observing another random variable. In a decision tree, it is the reduction in entropy achieved by splitting on an attribute.
  • The attribute with the highest information gain is tested/split first.
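The idea above can be sketched in a few lines: information gain is the parent's entropy minus the weighted average entropy of the children produced by a candidate split (function names here are my own, for illustration only):

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (base 2) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(parent, children):
    """Entropy of the parent node minus the weighted average
    entropy of the child nodes produced by a candidate split."""
    n = len(parent)
    weighted = sum(len(ch) / n * entropy(ch) for ch in children)
    return entropy(parent) - weighted

parent = ["yes", "yes", "no", "no"]
# A split that separates the classes perfectly gains the full 1 bit:
assert information_gain(parent, [["yes", "yes"], ["no", "no"]]) == 1.0
# A useless split (each child as mixed as the parent) gains nothing:
assert information_gain(parent, [["yes", "no"], ["yes", "no"]]) == 0.0
```

The tree evaluates this quantity for every candidate feature and splits on the one with the highest gain.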


Gini impurity:-
Gini impurity is similar to entropy, with a few small differences.
> Both are used for selecting the best feature for the best split (so the question is: which one should we choose?)
> Both work the same way; one difference is the range: entropy goes from 0 to 1, while Gini impurity goes from 0 to 0.5.
> Entropy takes more computation time because it involves a logarithm.
> Gini impurity takes less computation time than entropy because it only needs the squares of the class probabilities.

So it is often better to choose Gini impurity.
It saves computation time!
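For comparison, here is a minimal sketch of the Gini formula, 1 minus the sum of squared class probabilities (again, the function name is my own):

```python
from collections import Counter

def gini(labels):
    """Gini impurity: 1 - sum(p_i^2). No logarithm is needed,
    which is why it is slightly cheaper to compute than entropy."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

assert gini(["yes", "yes", "yes", "yes"]) == 0.0  # pure node
assert gini(["yes", "yes", "no", "no"]) == 0.5    # maximally mixed (binary)
```

Note that scikit-learn's DecisionTreeClassifier uses Gini as its default criterion for the same reason.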

For the code part, please check my GitHub.





