What’s shown below is abstracted from Andrew Moore’s lab tutorials, thanks to him!
- Give one attribute(e.g. wealth), try to predict the value of new people’s wealths by means of some of the other available attributes.
- Applies to categorical outputs.
- Categorical attributes: an attribute which takes on two or more discrete values, also known as a symbolic attribute.
- Real attribute: a column of real numbers.
- A decision tree is a tree-structured plan of a set of attributes test in order to predict the output.
- To decide which attribute should be tested first, simply find the one with the highest information gain.
- Don’t split a node if all matching records have the same output value.
- Don’t split a node if none of the attributes can create multiple non-empty children.
- If all records in current data subset have the same output then don’t recurse.
- If all records have exactly the same set of input attributes then don’t recurse.