On Autistic Interpretations of Occam's Razor
José Hernández-Orallo and Ismael García-Varea
Abstract
Recently, the MDL principle has been overhyped in some fields of
Artificial Intelligence, especially machine learning, neural networks, ILP
and many others, supported by recent proofs that the shorter a hypothesis is, the more
likely it is. In this paper we discuss in a critical and sometimes informal way
these justifications because they are proved for very ideal, infinite and,
in our view, artificial situations. Using the same information-theoretic
approach, we study the case for finite and short data and we arrive at a
slightly different result: MDL is a good principle but not the best one for
finite strings and perfect hypotheses. The argument is based upon recently introduced
variants and definitions around the idea of Intensional Complexity, which
penalise or 'simply' do not allow exceptions, regarding these as extensional
descriptions. Intensional considerations change the statement that "optimal
compression (Minimal Description Length (MDL)) gives you the best hypothesis
provided the data are random with respect to the hypothesis, the data are
not completely perfect and the data grow to infinity" into the following one
"the intensionality criterion gives you a more feasible hypothesis when the
data are perfect, ensuring rather than supposing that the data are random with respect to the
hypothesis." It also does not require that the data grow to infinity, so it
can be used to "understand" finite real problems. More importantly, our
definitions are free from the "MDL principle" paradox, since the
shortest hypothesis is never random to the data. In one of the two formalisations we
present, the connection with learning and Levin's "Universal Search
Problems" is made explicit.
In the end, and very far from the
classical notion of 'identification', we propose a different notion of
learning: the more a system learns, the more intensional the description is
with respect to the data. Consequently, the blurry notions of underfitting and
overfitting may be better understood.
© 1996-1997 José Hernández Orallo.