Sunday, October 18, 2015

Machine learning definitions and terminology

Example: 
An object or instance of the data used for learning or evaluation.

Features:
The set of attributes, often represented as a vector, associated with an example, e.g., height and weight for gender prediction.

Labels:
  • In classification, the category associated with an object, e.g., positive or negative in binary classification.
  • In regression, a real-valued number associated with an object.
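
The three terms above can be made concrete with a small sketch. The data values and the variable names (examples, class_labels, regression_labels) are invented for illustration only; they are not taken from any real dataset.

```python
# Hypothetical toy data: each example is one person.
# The feature vector is (height in cm, weight in kg); all values are made up.
examples = [
    (180.0, 80.0),
    (165.0, 55.0),
    (172.0, 68.0),
]

# Classification: each example gets a category as its label.
class_labels = ["positive", "negative", "positive"]

# Regression: each example gets a real-valued number as its label.
regression_labels = [1.2, -0.4, 0.7]

# An example and its label are paired by position.
for features, label in zip(examples, class_labels):
    print(features, "->", label)
```

The only structural requirement shown here is that there is exactly one label per example and that every feature vector has the same length.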

Training data:
Data used to train a learning algorithm.

Test data: 
Data used exclusively to evaluate the learning algorithm; it is never used during training.
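
A minimal sketch of how a dataset might be divided into training and test data, using only the Python standard library. The helper name train_test_split, the 25% test fraction, and the toy data are assumptions made for this illustration, not something prescribed by the definitions above.

```python
import random

def train_test_split(data, test_fraction=0.25, seed=0):
    """Shuffle a copy of `data` and split it into disjoint train and test lists."""
    shuffled = list(data)                  # copy, so the caller's list is untouched
    random.Random(seed).shuffle(shuffled)  # seeded shuffle, reproducible per seed
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

# Hypothetical (feature, label) pairs, invented for illustration.
data = [(i, i % 2) for i in range(20)]
train, test = train_test_split(data)
print(len(train), "training examples,", len(test), "test examples")
```

The key property is that the two parts do not overlap: an example that lands in the test data is never seen during training.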

Some standard learning scenarios:
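  • Supervised learning: the learner receives labeled training data.
  • Unsupervised learning: the learner receives only unlabeled data.
  • Semi-supervised learning: the learner receives labeled training data + unlabeled data.
  • Transductive learning: the learner receives labeled training data + unlabeled test data, and predicts labels only for those test points.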
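
The scenarios listed above differ only in what data the learner is given. The sketch below spells that out; the toy points and the names labeled, unlabeled, test_points, and scenarios are hypothetical and exist only for this illustration.

```python
# Hypothetical toy data, invented for illustration.
labeled = [((1.0, 2.0), "positive"), ((0.5, 0.1), "negative")]  # (features, label)
unlabeled = [(0.9, 1.8), (0.4, 0.3)]                            # features only
test_points = [(1.1, 2.1), (0.3, 0.2)]                          # features only

# What the learner receives in each scenario.
scenarios = {
    "supervised":      {"labeled": labeled, "unlabeled": []},
    "unsupervised":    {"labeled": [],      "unlabeled": unlabeled},
    "semi-supervised": {"labeled": labeled, "unlabeled": unlabeled},
    # Transductive: the unlabeled data are the test points themselves.
    "transductive":    {"labeled": labeled, "unlabeled": test_points},
}

for name, given in scenarios.items():
    print(f"{name}: {len(given['labeled'])} labeled, "
          f"{len(given['unlabeled'])} unlabeled")
```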