Ralf Herbrich - Learning and Generalization: Theoretical Bounds (2001)

Created: June 30, 2017 / Updated: November 2, 2024 / Status: in progress / 2 min read (~250 words)
Machine learning

  • The fundamental difference between a system that learns and one that merely memorizes is that the learning system generalizes to unseen examples

  • An unknown distribution $P_{XY}$ over an input space $X$ and an outcome space $Y$
  • We are only given a sample $z \in (X \times Y)^m = Z^m$ which is assumed to be drawn iid from $P_{XY} = P_Z$
  • In an attempt to discover the unknown relation $P_{Y|X=x}$ between inputs and outputs, a learning algorithm $A$ chooses a deterministic hypothesis $h: X \to Y$ solely based on a given training sample $z \in Z^m$
  • The performance of the learning algorithm is judged according to a loss function $l: Y \times Y \to \mathbb{R}^+$ which measures the cost of the prediction $\hat{y}$ if $y$ is the correct output
  • The learning problem is to find a hypothesis $h: X \to Y$ such that the expected risk, $R[h] = \mathbf{E}_{XY}[l(h(X), Y)]$, is minimized
  • If we knew PZ, the solution of the learning problem would be straightforward

    $h_{\mathrm{opt}}(x) = \operatorname{argmin}_{y \in Y} \, \mathbf{E}_{Y|X=x}[l(y, Y)]$
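The optimal hypothesis above can be made concrete with a small sketch (not from the paper): when $P_{Y|X=x}$ is known and $Y$ is a small discrete set, $h_{\mathrm{opt}}(x)$ is just the label minimizing the conditional expected loss. The distribution `p_y_given_x` and inputs `x1`, `x2` below are made-up illustrations.

```python
# Bayes-optimal hypothesis h_opt(x) = argmin_y E_{Y|X=x}[ l(y, Y) ]
# for a known, discrete conditional distribution P(Y|X=x).

Y = [0, 1]  # outcome space

# Hypothetical conditional distribution P(Y = y | X = x) for two inputs.
p_y_given_x = {
    "x1": {0: 0.8, 1: 0.2},
    "x2": {0: 0.3, 1: 0.7},
}

def zero_one_loss(y_hat, y):
    """l(y_hat, y): cost 1 for a wrong prediction, 0 otherwise."""
    return 0.0 if y_hat == y else 1.0

def h_opt(x, loss=zero_one_loss):
    """Pick the label with minimal conditional expected loss."""
    cond = p_y_given_x[x]
    return min(Y, key=lambda y_hat: sum(cond[y] * loss(y_hat, y) for y in Y))

print(h_opt("x1"))  # 0: expected loss 0.2 beats 0.8
print(h_opt("x2"))  # 1: expected loss 0.3 beats 0.7
```

Under 0-1 loss this reduces to predicting the most probable label, which is why in practice the hard part is not the argmin but that $P_Z$ is unknown and must be inferred from the sample.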

  • Herbrich, Ralf, and Robert C. Williamson. "Learning and generalization: Theoretical bounds." Handbook of Brain Theory and Neural Networks (2002): 3140-3150.