Mini course: Introduction to Supervised Learning

Prof. Marcelo de Souza Lauretto

Slides:

1. Bibliography

2. Introduction

3. Basic Methods

4. Performance Evaluation

5. Basics on R

Datasets

1. Weather data (nominal)

2. Credit Scoring

Exercises:

1. Implement a function called Zero_Rule(X, y) that receives training data, represented by X (a data frame containing the attribute values) and y (a factor vector containing the class labels), and returns the majority class.

2. Implement a function called One_Rule(X, y) that receives training data, represented by X (a data frame containing the attribute values) and y (a factor vector containing the class labels), and returns the optimal one-rule decision rule. The output shall be a data frame containing the columns:

a. Value: value of the category (for nominal attributes) or maximum value of the interval (for numeric attributes)

b. Class

3. Extend the function One_Rule so that, besides the resubstitution error, it also accepts the Gini index, information gain and gain ratio as score functions. The function must receive an additional argument, type, which determines which score function is used:
type="ER": resubstitution error
type="GI": Gini index
type="IG": information gain
type="GR": gain ratio
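
As a starting point for exercise 1, the majority class can be read off a frequency table of y. This is only a minimal sketch: the tie-breaking behaviour (first factor level wins) and the toy data are assumptions made for illustration, not part of the exercise statement.

```r
# Sketch of a zero-rule learner: ignores X and predicts the most frequent class.
Zero_Rule <- function(X, y) {
  y <- as.factor(y)
  counts <- table(y)                 # class frequencies
  # which.max() returns the first maximum, so ties go to the first factor level
  names(counts)[which.max(counts)]
}

# Toy data, made up for illustration
X <- data.frame(outlook = c("sunny", "rainy", "overcast", "sunny", "rainy"))
y <- factor(c("yes", "no", "yes", "yes", "no"))
Zero_Rule(X, y)
```

The sketch returns the class label as a character string; wrapping the result in factor(..., levels = levels(y)) would keep it as a factor if that is preferred.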
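
For exercise 2, one possible approach is to build, for each attribute, a table of attribute values against classes, predict the majority class for each value, and keep the attribute with the lowest resubstitution error. The sketch below covers nominal attributes only (numeric attributes would first need to be cut into intervals) and stores the name of the chosen attribute in a hypothetical "attribute" attribute of the result, since the exercise only specifies the Value and Class columns.

```r
# Sketch of a one-rule learner for nominal attributes, scored by resubstitution error.
One_Rule <- function(X, y) {
  y <- as.factor(y)
  best <- NULL
  best_err <- Inf
  for (a in names(X)) {
    tab <- table(X[[a]], y)                            # rows: values, columns: classes
    pred <- colnames(tab)[apply(tab, 1, which.max)]    # majority class per value
    err <- 1 - sum(apply(tab, 1, max)) / length(y)     # resubstitution error
    if (err < best_err) {                              # strict "<": first attribute wins ties
      best_err <- err
      best <- data.frame(Value = rownames(tab), Class = pred,
                         stringsAsFactors = FALSE)
      attr(best, "attribute") <- a                     # assumption: not required by the exercise
      attr(best, "error") <- err
    }
  }
  best
}

# Toy data, made up for illustration
X <- data.frame(
  outlook = c("sunny", "sunny", "overcast", "rainy", "rainy", "overcast"),
  windy   = c("no", "yes", "no", "no", "yes", "yes")
)
y <- factor(c("no", "no", "yes", "yes", "no", "yes"))
rule <- One_Rule(X, y)
rule
attr(rule, "attribute")
```

On this toy data the rule on outlook misclassifies one example out of six, while the rule on windy misclassifies two, so outlook is selected.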
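
For exercise 3, the four score functions can all be computed from the same table of attribute values against classes. The helper below, split_score, is a hypothetical name, and the definitions it uses are the common textbook ones (Gini index as the weighted Gini impurity of the partitions, entropy in bits); they should be checked against the definitions given in the course slides.

```r
# Sketch: score of splitting on a nominal attribute x with respect to classes y.
# Lower is better for "ER" and "GI"; higher is better for "IG" and "GR".
split_score <- function(x, y, type = c("ER", "GI", "IG", "GR")) {
  type <- match.arg(type)
  tab <- table(x, y)
  tab <- tab[rowSums(tab) > 0, , drop = FALSE]   # drop unused attribute values
  n <- sum(tab)
  w <- rowSums(tab) / n                          # fraction of examples per value

  entropy <- function(p) {                       # entropy in bits, with 0 * log(0) = 0
    p <- p[p > 0]
    -sum(p * log2(p))
  }
  # Information gain: class entropy minus weighted entropy within each value
  gain <- entropy(colSums(tab) / n) -
    sum(w * apply(tab, 1, function(r) entropy(r / sum(r))))

  switch(type,
    ER = 1 - sum(apply(tab, 1, max)) / n,
    GI = sum(w * apply(tab, 1, function(r) 1 - sum((r / sum(r))^2))),
    IG = gain,
    GR = gain / entropy(w)   # undefined (division by zero) if x has a single value
  )
}

# Toy data, made up for illustration
outlook <- c("sunny", "sunny", "overcast", "rainy", "rainy", "overcast")
y <- factor(c("no", "no", "yes", "yes", "no", "yes"))
sapply(c("ER", "GI", "IG", "GR"), function(t) split_score(outlook, y, t))
```

Inside One_Rule, a score of this kind would replace the resubstitution error when comparing attributes, with the direction of the comparison reversed for "IG" and "GR".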