Geometry and Statistics in
NNs
We are very glad to announce
a special session,
"Geometry and Statistics in
Neural Network Learning Theory,"
at the
International Conference KES'2001, which will be held
in Osaka and Nara, Japan, September 6-8, 2001.
In our session, we study the statistical problem caused by
non-identifiability of layered learning machines.
Information :
* Date: Saturday, September 8, 2001, 14:40-16:45.
* Place: Nara New Public Hall, Nara City, Japan.
* Schedule: The time for each presentation is 25 minutes.
(Remark)
* Before this session, Professor Amari will give an invited talk, 13:40-14:40.
* You can see
all special sessions in the conference.
The authors and papers:
You can read the papers which will appear in
the special session.
When you refer to these papers, please cite them as "to appear in Proceedings of the
5th International Conference on Knowledge-Based Information Engineering
Systems and Allied Technologies," September 2001, Osaka and Nara.
(1) Shun-Ichi Amari,
T. Ozeki, and H. Park (RIKEN Brain Science Institute, Japan)
"Singularities in Learning Models:
Gaussian Random Field Approach."
(2)
Kenji Fukumizu (Institute of Statistical Mathematics, Japan)
"Asymptotic Theory of Locally Conic Models and its Application to Multilayer Neural Networks."
Many parts of this paper appear in full in
"Likelihood Ratio of Unidentifiable Models and Multilayer
Neural Networks."
(3)
Katsuyuki Hagiwara (Mie University, Japan)
"On the training error and generalization error of neural network regression without identifiability."
(4)
Taichi Hayasaka, M. Kitahara, K. Hagiwara, N. Toda, and S. Usui (Toyohashi University of Technology, Japan)
"On the Asymptotic Distribution of the Least Squares Estimators for Non-identifiable Models."
(5)
Sumio Watanabe (Tokyo Institute of Technology, Japan)
"Bayes and Gibbs Estimations, Empirical Processes, and Resolution of Singularities."
A Short Introduction:
[ Non-identifiability ]
A parametric model in statistics is called identifiable if the mapping from
the parameter to the probability distribution is one-to-one.
Many learning machines used in information processing, such as
artificial neural networks, normal mixtures,
and Boltzmann machines, are not identifiable.
We do not yet have a mathematical and statistical foundation on which
such models can be studied.
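
As a minimal sketch of what non-identifiability means for a layered network, the Python snippet below evaluates a small tanh network at several different parameter values that all define the same function. The network form, the helper name `network`, and the numerical values are illustrative assumptions, not taken from the session papers.

```python
import numpy as np

def network(x, weights_in, weights_out):
    """Hypothetical three-layer network: f(x) = sum_k weights_out[k] * tanh(weights_in[k] * x)."""
    return np.tanh(np.outer(x, weights_in)) @ weights_out

x = np.linspace(-3.0, 3.0, 50)
w_in = np.array([0.5, -1.2])
w_out = np.array([1.0, 0.7])
base = network(x, w_in, w_out)

# Swapping the two hidden units changes the parameter but not the function.
permuted = network(x, w_in[::-1], w_out[::-1])

# tanh is odd, so flipping the signs of both weight vectors also gives the same function.
flipped = network(x, -w_in, -w_out)

# If an output weight is zero, the matching input weight is completely free:
# a whole line of parameters maps to one and the same function.
redundant_a = network(x, np.array([0.5, 3.0]), np.array([1.0, 0.0]))
redundant_b = network(x, np.array([0.5, -7.0]), np.array([1.0, 0.0]))

same_under_permutation = np.allclose(base, permuted)
same_under_sign_flip = np.allclose(base, flipped)
same_on_redundant_line = np.allclose(redundant_a, redundant_b)
print(same_under_permutation, same_under_sign_flip, same_on_redundant_line)
```

The first two cases are discrete symmetries. The third case, a continuous family of equivalent parameters, is the one that arises when the model is redundant compared with the true distribution.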
[ Singularities and Asymptotics ]
If a non-identifiable model is redundant compared with
the true distribution, then the set of true parameters is an analytic
set with complex singularities, and the rank of the
Fisher information matrix depends on the parameter.
The behavior of the training and generalization errors of
layered learning machines
is quite different from that of regular statistical models.
It should be emphasized that we cannot apply the standard asymptotic methods
constructed by Fisher, Cramer, and Rao to these models.
Nor can we use AIC, MDL, or BIC in statistical model selection
for the design of artificial neural networks.
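
The dependence of the Fisher information rank on the parameter can be checked numerically on a toy case. The sketch below assumes a one-hidden-unit regression model y = a * tanh(b * x) + noise with unit-variance Gaussian noise and a fixed grid of inputs; this model, the function names, and the tolerance are illustrative assumptions, not the setting of any particular paper in the session.

```python
import numpy as np

def fisher_information(a, b, inputs):
    """Fisher information of y = a * tanh(b * x) + N(0, 1) noise, averaged over fixed inputs.

    For unit-variance Gaussian noise it is the average outer product of the
    gradient of the regression function with respect to (a, b).
    """
    grad_a = np.tanh(b * inputs)
    grad_b = a * inputs / np.cosh(b * inputs) ** 2
    grads = np.stack([grad_a, grad_b], axis=1)  # shape (number of inputs, 2)
    return grads.T @ grads / len(inputs)

inputs = np.linspace(-2.0, 2.0, 201)

def fisher_rank(a, b):
    # The tolerance is an assumption chosen for this small example.
    return int(np.linalg.matrix_rank(fisher_information(a, b, inputs), tol=1e-10))

for a, b in [(1.0, 1.0), (0.0, 1.0), (1.0, 0.0), (0.0, 0.0)]:
    print(f"a={a}, b={b}: rank {fisher_rank(a, b)}")
```

In this toy model the matrix has full rank at a generic point, but the rank drops on the set where a = 0 or b = 0, which is exactly the set of parameters representing the zero function. The classical asymptotic theory assumes a non-degenerate Fisher information matrix at the true parameter, so it does not cover that set.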
[ Geometry and Statistics ]
The purpose of this special session is to
study and discuss the geometrical and statistical methodology by which
non-identifiable learning machines can be analyzed. Note that
conic singularities are given by blowing-downs, and
normal crossing singularities are found by blowing-ups.
These algebraic geometrical methods lead us to the statistical concepts of
the order statistic and
the empirical process. We find that
they open a new perspective in geometry and statistics.
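
As a small worked illustration of how a blow-up produces a normal crossing form, consider the function K(a, b) = a^2 + b^2 near the origin. This is a standard textbook example chosen here for illustration, not one taken from the session papers; the chart coordinates (u, v) are likewise notation introduced only for this sketch.

```latex
% Blow-up of the origin in the (a, b) plane, written in its two coordinate charts.
\begin{align*}
\text{Chart 1: } a = u,\; b = uv
  &\;\Longrightarrow\; K = u^{2} + u^{2}v^{2} = u^{2}\,(1 + v^{2}), \\
\text{Chart 2: } a = uv,\; b = v
  &\;\Longrightarrow\; K = u^{2}v^{2} + v^{2} = v^{2}\,(1 + u^{2}).
\end{align*}
% In each chart K is a monomial (u^2 or v^2) times a strictly positive factor,
% which is the normal crossing form.
```

In each chart the function becomes a monomial multiplied by a factor that never vanishes, so its zero set is a coordinate hyperplane rather than an isolated singular point.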
[ Results which will be reported ]
(1) Professor Amari et al. clarify the generalization and training errors of
learning models with conic singularities
in both the maximum likelihood method and the Bayesian method
using the Gaussian random field approach.
(2) Dr. Fukumizu proves that a three-layer neural network can be
understood as a locally conic model, and that the asymptotic likelihood ratio
is proportional to log n, where n is the number of training samples.
(3) Dr. Hagiwara shows that the training and generalization errors of
a radial basis function network with Gaussian units are proportional to
log n, under the assumption that the inputs are fixed.
(4) Dr. Hayasaka et al. claim that the asymptotic normality of
estimators does not hold for simple non-identifiable models, and that
their asymptotic distribution is closely related to distributional
results for order statistics.
(5) Lastly, Dr. Watanabe studies Bayes and Gibbs estimation for
statistical models with normal crossing singularities, and
shows that all general cases reduce to this case by the resolution theorem.
We expect that mathematicians, statisticians, information scientists,
and theoretical physicists will be interested in this topic.
Thank you very much for your interest in this special session.
For questions or comments, please send an e-mail to
Dr. Sumio Watanabe,
P&I Lab., Tokyo Institute of Technology.
E-mail: swatanab@pi.titech.ac.jp
http://watanabe-www.pi.titech.ac.jp/~swatanab/index.html
[Postal Mail] 4259 Nagatsuta, Midori-ku, Yokohama, 226-8503 Japan.