TR98-19

An entropic estimator for structure discovery


    •  Matthew Brand, "An entropic estimator for structure discovery", Tech. Rep. TR98-19, Mitsubishi Electric Research Laboratories, Cambridge, MA, September 1998.
      BibTeX Download PDF
      • @techreport{MERL_TR98-19,
      • author = {Matthew Brand},
      • title = {An entropic estimator for structure discovery},
      • institution = {MERL - Mitsubishi Electric Research Laboratories},
      • address = {Cambridge, MA 02139},
      • number = {TR98-19},
      • month = sep,
      • year = 1998,
      • url = {http://www.merl.com/publications/TR98-19/}
      • }
  • MERL Contact:
  • Research Area:

    Algorithms


We introduce a novel framework for simultaneous structure and parameter learning in hidden-variable conditional probability models, based on an entropic prior and a solution for its maximum a posteriori (MAP) estimator. The MAP estimate minimizes uncertainty in all respects: cross-entropy between model and data; entropy of the model; entropy of the data's descriptive statistics. Iterative estimation extinguishes weakly supported parameters, compressing and sparsifying the model. Trimming operators accelerate this process by removing excess parameters and, unlike most pruning schemes, guarantee an increase in posterior probability. Entropic estimation takes a overcomplete random model and simplifies it, inducing the structure of relations between hidden and observed variables. Applied to hidden Markov models (HMMs), it finds a concise finite-state machine representing the hidden structure of a signal. We entropically model music, handwriting, and video time-series, and show that the resulting models are highly concise, structured, predictive, and interpretable: Surviving states tend to be highly correlated with meaningful partitions of the data, while surviving transitions provide a low-perplexity model of the signal dynamics.