
WEKA Tutorial #1.3 – How to Build a Data Mining Model from Scratch

November 7, 2019


In this third and final part of episode 2, we’re going to continue where we left off. In the two previous episodes, I showed you everything from installing the WEKA software to pre-processing the data set and constructing a model with the C4.5 decision tree algorithm. In this video, I will show you how to interpret the decision rules obtained from the decision tree model. So, without further ado, let’s get started.

So let’s have a look at the tree. What does it actually look like? You can right-click on this entry in the result list here and then choose Visualize tree, and this is the decision tree created by J48, which is WEKA’s implementation of the C4.5 algorithm. The oval at the top represents the root node, and the rectangles represent the leaf nodes. The ovals in between represent the subsequent branching of the variables.

So let’s start from the root node here. The first variable is petal width. If the petal width has a value less than or equal to -0.784457, then we can classify the flower as Iris setosa, and the number in parentheses tells us that 50 instances fall under this rule. If the petal width has a value greater than 0.656917, then we can say that it is an Iris virginica; 46 instances reach this leaf, and 1 of them has been misclassified.

We can do the same with the other branches as well. To be classified as Iris versicolor here, the petal width first needs to be between -0.784457 and 0.656917, and then the second variable, petal length, needs to have a value less than or equal to 0.64. If we move on to the subsequent branch, where the petal length has a value greater than 0.64, and the petal width has a value less than or equal to 0.39, then we can say that it is an Iris virginica. However, if the petal width has a value greater than 0.39, then we can say that it is an Iris versicolor.

So this tree visualization lets us read off the if-then rules of the decision tree that has been created.
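
To make these rules concrete, here is a minimal Python sketch of the same if-then logic. It is only an illustration and not something WEKA outputs: the function name classify_iris is hypothetical, the inputs are assumed to be the pre-processed (rescaled) petal values the tree was built on rather than raw measurements, the 0.64 and 0.39 thresholds are the rounded values mentioned in the video, and a value exactly on a threshold is assumed to go down the lower branch, as in J48's usual <= / > splits.

```python
def classify_iris(petal_length: float, petal_width: float) -> str:
    """Follow the decision tree described in the video, branch by branch.

    Hypothetical helper for illustration only. Inputs are assumed to be the
    pre-processed (rescaled) values, not raw measurements.
    """
    # Root node: very small petal width means Iris setosa.
    if petal_width <= -0.784457:
        return "Iris setosa"

    # Root node, other side: large petal width means Iris virginica.
    if petal_width > 0.656917:
        return "Iris virginica"

    # From here on, petal width lies between -0.784457 and 0.656917.
    # Second variable: petal length (0.64 is the rounded value from the video).
    if petal_length <= 0.64:
        return "Iris versicolor"

    # Petal length above 0.64: petal width decides again
    # (0.39 is the rounded value from the video).
    if petal_width <= 0.39:
        return "Iris virginica"
    return "Iris versicolor"
```

For example, classify_iris(petal_length=0.7, petal_width=0.2) follows the last branch (petal length above 0.64, petal width not above 0.39) and returns "Iris virginica".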
And we can see that the tree achieved 96% accuracy. So, very useful, and that’s about it.

Congratulations, you have just built your first prediction model! In future videos, we’re going to cover some more algorithms and other interesting data mining software as well. So until next time, I’m Chanin Nantasenamat on the Data Professor channel. If you haven’t subscribed yet, please consider subscribing and clicking on the notification bell so that you will be notified of the next video. I’ll see you in the next one!

1 Comment

  • jefferson Jones, September 7, 2019 at 7:29 pm

    Cool stuff 👍🏻👍🏻
