foto (c) M.Iwanowski

Call for papers
Social Programme



ICANNGA'07 tutorial on

Learning from Data with Generalization as an Inverse Problem

Vĕra KýrkovŠ

Institute of Computer Science
Academy of Sciences of the Czech Republic

Prague, Czech Republic

Generalization capability in learning from data can be investigated in terms of regularization, which has been used in many branches of applied mathematics to obtain stable solutions of inverse problems, i.e., problems of finding unknown causes (such as shapes of functions) of known consequences (such as measured data). It will be shown that supervised learning modeled as minimization of error functionals, the expected and the empirical one, can be reformulated in terms of inverse problems with solutions in spaces of functions defined by kernels. Mathematical results from theory of inverse problems applied to construction of optimal solutions of learning tasks can be used to design learning algorithms based on solutions of systems of linear equations.


  • Learning from data: minimization of the empirical error functional defined by a sample of data, minimization of the expected error functional defined by a probability distribution, optimization of error functionals as best approximation, tools from approximation theory.

  • Generalization: philosophical concept of generalization, generalization in learning as a stability of solutions with respect to small changes of data, penalization of solutions with high-frequency oscillation.

  • Inverse problems: well and ill-posed problems, well and ill-conditioned problems, Moore-Penrose pseudosolution, measures of stability, regularization as improvement of stability, properties of optimal and regularized solutions.

  • Representation of learning as an inverse problem: typical operators defining inverse problems, tomography and Radon transform, operators defining inverse problems modeling learning, characterization of optimal and regularized solutions (the Representer Theorem).

  • Three reasons for using kernels in machine learning: kernels define a class of hypothesis spaces, where theory of inverse problems can be applied, kernels define stabilizers penalizing various types of high-frequency oscillations, kernels define transformations of input space geometries allowing more types of data to be separated linearly

  • Learning algorithms based on the Representer Theorem: neural network learning as a solution of a system of linear equations, approximate optimization for complexity reduction, comparison with algorithms operating on networks with smaller number of units than the size of the sample of data.


Prof. Vĕra KýrkovŠ, scientist, Institute of Computer Science, Academy of Sciences of the Czech Republic, Praguesince 2002 Head of the Department of Theoretical Computer Scienceresearch in mathematical theory of neurocomputing and learning, nonlinear approximation and optimization published many research papers, several chapters in books and articles in encyclopedias, coedited a book and a proceeding published by Springer-Verlag, member of the Editorial Board of Neural Processing Letters (Kluwer)


Selected recent publications:

  • V. Kurkova: Supervised learning with generalization as an inverse problem. Logic Journal of IGPL 13: 551--559, 2005.

  • P. C. Kainen, V. Kurkova, M. Sanguineti: Rates of approximate minimization of error functionals over Boolean variable-basis functions. Journal of Mathematical Modelling and Algorithms 4: 355 -- 368, 2005.

  • V. Kurkova, M. Sanguineti: Learning with generalization capability by kernel methods of bounded complexity. Journal of Complexity 21: 350--367, 2005.

  • V. Kurkova, M. Sanguineti: Error estimates for approximate optimization by the extended Ritz method. SIAM Journal on Optimization 15: 461--487, 2005.

  • V. Kurkova: High-dimensional approximation and optimization by neural networks. Chapter 4 in Advances in Learning Theory: Methods, Models and Applications. (Eds. J. Suykens et al.) (pp. 69-88). Amsterdam: IOS Press, 2003.

  • P. C. Kainen, V. Kurkova, A. Vogt: Best approximation by linear combinations of characteristic functions of half-spaces. Journal of Approx. Theory 122: 151--159, 2003.

  • P. C. Kainen, V. Kurkova, M. Sanguineti: Minimization of error functionals over neural networks. SIAM Journal on Optimization 14: 732--742, 2003.
    V. Kurkova, M. Sanguineti: Comparison of worst-case errors in linear and neural network approximation. IEEE Transactions on Information Theory: 48, 264--275, 2002.

  • P. C. Kainen, V. Kurkova, A. Vogt: Continuity of approximation by neural networks in -spaces. Annals of Operational Research 101: 143--147, 2001.
    V. Kurkova, M. Sanguineti: Bounds on rates of variable-basis and neural network approximation. IEEE Transactions on Information Theory 47: 2659--2665, 2001.



web-site:            e-mail:
(c) 2005,06,07  Institute of Control and Industrial Electronics, Warsaw University of Technology, last modification: 16.02.2007