Research topics
The following is a list of our current and past research domains. Recently,
our focus has mainly been on symbolic-statistical modeling and structural
learning (inductive logic programming and grammar induction).
For a full list of international publications,
please visit here.
 Symbolic-Statistical Modeling / Statistical Abduction
 Knowledge Acquisition / Structural Learning
 Inductive Logic Programming
 Quantified decision trees
 Knowledge Acquisition by Evolutionary Approach (genetic algorithm/programming)
 Grammar Induction
 Learning Automata
 Logic Programming
 First-Order Compiler
 Program Transformation
 Reinforcement Learning
 Multi-Agent Reinforcement Learning
 Relational Reinforcement Learning
Programming Language for Symbolic-Statistical Modeling
PRISM is a logic-based programming language for
symbolic-statistical modeling. It is designed to be a general-purpose
device for statistical abduction, which integrates abductive reasoning
and statistical inference seamlessly. Thanks to PRISM's first-order
expressive power, we can easily mix symbolic (structural) knowledge
and statistical knowledge, and it is easy to describe
various extensions of, and models that go beyond, hidden Markov models,
probabilistic context-free grammars, and Bayesian networks.
To support various statistical inference
tasks (e.g. probability computation or parameter estimation via the
EM algorithm), the programming system provides efficient built-in
routines. Currently, we are concentrating on statistical language
modeling and user modeling.
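As a rough illustration of the statistical-abduction idea (this is not PRISM code; the switches, probabilities, and observation below are invented for this sketch), the probability of an observation can be computed by enumerating its explanations, i.e. conjunctions of probabilistic choices that entail it, and summing the product of choice probabilities over each explanation:

```python
# Parameters of probabilistic switches (invented for illustration):
# first pick one of two coins, then flip the picked coin.
params = {
    ("pick", "a"): 0.6, ("pick", "b"): 0.4,      # which coin is picked
    ("flip_a", "h"): 0.9, ("flip_a", "t"): 0.1,  # coin a is biased
    ("flip_b", "h"): 0.5, ("flip_b", "t"): 0.5,  # coin b is fair
}

def explanations(outcome):
    """Enumerate the explanations of observing `outcome`: each is a
    conjunction of switch choices that together entail the observation."""
    for coin in ("a", "b"):
        yield [("pick", coin), ("flip_" + coin, outcome)]

def prob(outcome):
    """Probability of the observation: sum over explanations of the
    product of the probabilities of the choices in each explanation."""
    total = 0.0
    for expl in explanations(outcome):
        p = 1.0
        for choice in expl:
            p *= params[choice]
        total += p
    return total

print(prob("h"))  # 0.6*0.9 + 0.4*0.5 = 0.74
```

In PRISM itself, probabilistic choices are made with the built-in msw switch predicate and the system carries out this sum-of-products (and EM learning over it) automatically; the sketch only shows the underlying computation.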
Related publications

Sato, T.:
Inside-outside probability computation for belief propagation.
Proceedings of the 20th International Joint Conference on
Artificial Intelligence
(IJCAI-2007),
pp.2605-2610, 2007.
PDF

Sato, T.:
A Generic Approach to EM Learning for Symbolic-Statistical Models.
Proceedings of the 4th Learning Language in Logic Workshop
(LLL05), 2005.
PDF

Sato, T., Kameya, Y. and Zhou, N.F.:
Generative modeling with failure in PRISM.
Proceedings of the 19th International Joint Conference on
Artificial Intelligence
(IJCAI-2005), pp.847-852, 2005.
PDF

Sato, T. and Kameya, Y.:
A dynamic programming approach to parameter learning of
generative models with failure.
Proceedings of ICML Workshop on Statistical Relational Learning
and its Connection to the Other Fields
(SRL2004), 2004.
PS,
PDF

Sato, T. and Kameya, Y.:
Parameter Learning of Logic Programs for Symbolic-Statistical Modeling.
Journal of Artificial Intelligence Research
(JAIR),
Vol.15,
pp.391-454, 2001.
PS,
PS + compress(.Z),
PDF

Sato, T. and Kameya, Y.:
PRISM: A symbolic-statistical modeling language.
Proceedings of the 15th International Joint Conference on Artificial
Intelligence (IJCAI-97),
pp.1330-1335, 1997.
PS,
PS + gz,
PDF

Sato, T.:
A statistical learning method for logic programs with distribution
semantics.
Proceedings of the 12th International Conference on Logic Programming
(ICLP95), Tokyo,
pp.715-729, 1995.
extended version:
PS,
PS + gz,
PDF
Statistical Language Modeling
In recent years, statistical language models have been widely used for
analyzing sequences of symbols in statistical natural language
processing and bioinformatics. In particular, hidden Markov models
(HMMs), probabilistic context-free grammars (PCFGs) and their
extensions (history-based models) are well known. In our laboratory,
the following studies have been done.
 We have found that, for parameter estimation of moderately
ambiguous PCFGs, the combination of a CFG parser and the
graphical EM algorithm (proposed during our research on PRISM) runs faster than the
Inside-Outside algorithm by one or more orders of magnitude. The
implementation based on the Earley parser, called Gemini, is
downloadable from here^{*}.
 We have proposed a simple EM learning method for hierarchical
hidden Markov models.
 We have applied variational Bayesian learning to PCFGs and
derived an efficient learning algorithm. We have also found
empirically that the algorithm achieves better prediction than
the Inside-Outside algorithm.
* The documentation of Gemini is available only in Japanese.
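As background for the items above, the quantity at the heart of both the Inside-Outside algorithm and EM for PCFGs is the inside probability, computable by dynamic programming over spans. Below is a minimal CKY-style sketch for a toy grammar in Chomsky normal form (the grammar and its probabilities are invented for illustration; Gemini's Earley-based computation is more sophisticated):

```python
from collections import defaultdict

# A tiny PCFG in Chomsky normal form (rules and probabilities invented):
#   S -> S S  (0.3)      S -> "a"  (0.7)
binary_rules = [("S", ("S", "S"), 0.3)]   # (lhs, (left, right), prob)
lexical_rules = [("S", "a", 0.7)]         # (lhs, terminal, prob)

def inside(sentence):
    """beta[(i, j, A)] is the probability that nonterminal A
    derives the span sentence[i:j], filled in bottom up."""
    n = len(sentence)
    beta = defaultdict(float)
    for i, word in enumerate(sentence):          # spans of length 1
        for lhs, term, p in lexical_rules:
            if term == word:
                beta[(i, i + 1, lhs)] += p
    for length in range(2, n + 1):               # longer spans
        for i in range(n - length + 1):
            j = i + length
            for k in range(i + 1, j):            # split point
                for lhs, (l, r), p in binary_rules:
                    beta[(i, j, lhs)] += p * beta[(i, k, l)] * beta[(k, j, r)]
    return beta[(0, n, "S")]                     # sentence probability

print(inside(["a", "a"]))  # 0.3 * 0.7 * 0.7 = 0.147
```

The triple loop over spans and split points gives the familiar cubic cost in sentence length; the graphical EM algorithm runs EM over a compact graph of such shared subcomputations.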
Related publications

Kurihara, K. and Sato, T.:
An Application of the Variational Bayesian Approach to Probabilistic
Context-Free Grammars.
IJCNLP-04 Workshop Beyond Shallow Analyses, 2004.
PDF

Sato, T. and Zhou, N.F.:
A New Perspective of PRISM Relational Modeling.
Proceedings of the IJCAI-03 Workshop on Learning Statistical Models
from Relational Data
(SRL2003),
pp.133139,
2003.
PS,
PS + gz,
PDF

Ueda, N. and Sato, T.:
Simplified training algorithms for hierarchical hidden Markov models.
Proceedings of the 4th International Conference on Discovery Science
(DS2001),
LNCS Vol.2226, pp.401-415, Springer, 2001.

Sato, T., Abe, S., Kameya, Y., and Shirai, K.:
A separate-and-learn approach to EM learning of PCFGs.
Proceedings of the 6th Natural Language Processing Pacific Rim
Symposium
(NLPRS2001),
pp.255-262, 2001.
Revised version:
PS,
PS + gz,
PDF

Kameya, Y., Mori, T. and Sato, T.:
Efficient EM learning of probabilistic CFGs and their extensions
by using WFSTs.
Journal of Natural Language Processing,
Vol.8, No.1, pp.49-84, 2001 (in Japanese).
PS,
PS + gz,
PDF

Ueda, N., Kameya, Y., and Sato, T.:
A parameter updating of stochastic context-free grammars in
linear time on the number of productions.
Proceedings of the 1st IMC workshop, 1999.
Inductive Logic Programming
Decision trees and association rules are quite powerful tools
for data mining, but they have the limitation that they can
only represent propositional knowledge. In contrast,
with inductive logic programming, we can extract first-order
rules from data. As a natural first-order extension of decision
trees and their divide-and-conquer induction strategy, we proposed
a learning method for quantified decision trees (QDTs), in which
existential and universal quantifiers are treated mathematically.
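To illustrate the kind of node test that quantifiers add (the data, predicate, and representation here are invented for this sketch; actual QDT induction is more involved), each example can relate to a set of objects rather than a fixed attribute vector, and a tree node may test whether some or every related object satisfies a predicate:

```python
# Each example carries a SET of related objects, so propositional
# tests of the form "attribute = value" no longer suffice.
# (Data and predicate invented for illustration.)
examples = [
    {"label": "pos", "children": [3, 7]},
    {"label": "pos", "children": [2]},
    {"label": "neg", "children": [8, 9]},
    {"label": "neg", "children": []},
]

def exists_test(ex, pred):
    """Existential split: does SOME related object satisfy pred?"""
    return any(pred(c) for c in ex["children"])

def forall_test(ex, pred):
    """Universal split: do ALL related objects satisfy pred?
    (Vacuously true on an empty set.)"""
    return all(pred(c) for c in ex["children"])

young = lambda age: age < 5  # invented predicate on related objects

# The existential test "exists a child with age < 5" cleanly
# separates positives from negatives in this toy data.
split = [exists_test(ex, young) for ex in examples]
print(split)  # [True, True, False, False]
```

A divide-and-conquer inducer can then score such quantified tests just like propositional ones (e.g. by information gain) when growing the tree.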
Related publications

Sato, T.:
Program extraction from quantified decision trees,
Proc. of Machine Intelligence 17,
Bury St Edmunds, pp.78-80, 2000.
PS,
PS + gz,
PDF
Last Update: Jan. 23, 2007