Research in Machine Learning

I. Robustness of deep neural networks. In collaboration with E. Pauwels and V. Magron, we want to analyze the robustness of deep neural networks with respect to noise in the input. Recent works have considered semidefinite relaxations to address this robustness issue for ReLU neural networks. In fact, this relaxation is the first level of the Moment-SOS hierarchy that we have developed, and we claim that the analysis can be improved by solving higher-level semidefinite relaxations of the hierarchy. Indeed, the scalability issue of the Moment-SOS hierarchy can be overcome for certain types of neural nets (e.g. ReLU, but not only) by implementing an adequate "sparse" version of the hierarchy. The same methodology can also be used to provide certified upper bounds on the Lipschitz constant of deep neural nets. See e.g.

Chen T., Lasserre J.B., Pauwels E., Magron V. Semi-algebraic optimization for Lipschitz constants of ReLU networks. Conference on Neural Information Processing Systems, Vancouver, December 2020.

II. The Christoffel function for ML. This research is done mainly in collaboration with Edouard Pauwels (IMT, Toulouse) and, more recently, also with Mihai Putinar (University of California, Santa Barbara). The ultimate goal is to promote the Christoffel function as a new and promising tool for some important applications in Machine Learning (ML) and Data Analysis (e.g. data representation and encoding, outlier detection, density estimation). Our foundational approach is summarized in the following three papers:

Lasserre J.B., Pauwels E. The empirical Christoffel function with applications in data analysis, Adv. Comp. Math. 45 (2019), pp. 1439--1468
Pauwels E., Putinar M., Lasserre J.B. Data analysis from empirical moments and the Christoffel function, Found. Comp. Math. 21 (2021), pp. 243--273
Lasserre J.B., Pauwels E. Sorting out typicality with the inverse moment matrix SOS polynomial, Proceedings of NIPS 2016, Barcelona 2016, arXiv:1606.03858

We started from a simple and striking observation. First we draw a cloud of 2-D points and build the empirical moment matrix M_d (with moments up to order "2d") associated with the points of the cloud. Then we build the SOS polynomial x -> c_d(x) of degree "2d" whose associated Gram matrix is the inverse of M_d. One readily observes that the level sets of c_d capture the shape of the cloud quite well, even for small "d"! When µ is a measure with compact support K and with a density f w.r.t. the Lebesgue measure on K, the reciprocal of c_d is the Christoffel function, well known in approximation theory. For instance, when K has a simple geometry (a box, ellipsoid, simplex), it is well known that 1/c_d(x), suitably scaled, converges pointwise to the density f "times" an equilibrium density, intrinsic to K, for x in K, and converges to zero for x outside K.
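The construction just described can be sketched in a few lines of NumPy. This is only a minimal illustration, assuming a monomial basis and a uniformly drawn 2-D cloud; the function names (monomial_exponents, christoffel_polynomial, ...) are hypothetical and not taken from the papers cited above.

```python
import itertools
import numpy as np

def monomial_exponents(n_vars, degree):
    """All exponent tuples alpha with |alpha| <= degree."""
    return [a for a in itertools.product(range(degree + 1), repeat=n_vars)
            if sum(a) <= degree]

def monomial_features(points, exponents):
    """Row i holds the vector v_d(x_i) of monomials evaluated at point x_i."""
    points = np.atleast_2d(points)
    return np.stack([np.prod(points ** np.array(a), axis=1) for a in exponents],
                    axis=1)

def christoffel_polynomial(points, degree):
    """Return x -> c_d(x) = v_d(x)^T M_d^{-1} v_d(x) for the empirical measure."""
    exps = monomial_exponents(points.shape[1], degree)
    V = monomial_features(points, exps)
    M = V.T @ V / len(points)          # empirical moment matrix M_d
    M_inv = np.linalg.inv(M)           # no optimization, just one inversion

    def c_d(x):
        Vx = monomial_features(x, exps)
        return np.einsum("ij,jk,ik->i", Vx, M_inv, Vx)

    return c_d

# Assumed toy data: 2000 points drawn uniformly from the square [-1, 1]^2.
rng = np.random.default_rng(0)
cloud = rng.uniform(-1.0, 1.0, size=(2000, 2))
c_d = christoffel_polynomial(cloud, degree=4)

inside = c_d(np.array([[0.0, 0.0]]))    # a point well inside the cloud
outside = c_d(np.array([[3.0, 3.0]]))   # a point far outside the cloud
mean_on_cloud = c_d(cloud).mean()       # equals the number of monomials (15 here)
```

In this sketch c_d is small inside the cloud and grows quickly outside it, so a sublevel set of c_d can serve as a rough description of the shape of the cloud. The average of c_d over the sample equals the size of M_d, which gives a natural scale for choosing a threshold.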

Therefore, somehow the Christoffel function identifies the support K of µ. What is striking is how well this happens (even for small "d") for the Christoffel function associated with an empirical measure on a cloud of points drawn from an unknown distribution µ. It is also worth noticing that the empirical moment matrix M_d is quite easy to construct and to invert (modulo the dimension), with no optimization process involved! This should make c_d(x) an appealing and easy-to-use tool in some important applications of Machine Learning (ML) and statistics. For instance, we have already shown that it is remarkably efficient in e.g. outlier detection and density estimation. In addition, if the cloud of points lies on a manifold, then the kernel of M_d contains a lot of information that can be used, e.g., to learn the dimension of the manifold when it has an algebraic boundary, with relatively little effort compared to more sophisticated methods in the literature.
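As a small illustration of the information carried by the kernel of M_d, the sketch below samples points on the unit circle (an assumed toy example, not one taken from the papers above) and looks at the rank and the kernel of the empirical moment matrix. The helper name moment_matrix and the rank tolerance are illustrative choices.

```python
import itertools
import numpy as np

def moment_matrix(points, degree):
    """Empirical moment matrix in the monomial basis, with its exponent list."""
    exps = [a for a in itertools.product(range(degree + 1), repeat=points.shape[1])
            if sum(a) <= degree]
    V = np.stack([np.prod(points ** np.array(a), axis=1) for a in exps], axis=1)
    return V.T @ V / len(points), exps

# Assumed toy data: 500 points on the unit circle x^2 + y^2 = 1.
rng = np.random.default_rng(1)
theta = rng.uniform(0.0, 2.0 * np.pi, 500)
circle = np.column_stack([np.cos(theta), np.sin(theta)])

# The rank of M_d grows linearly in d (2d + 1) instead of quadratically,
# which is the signature of a one-dimensional algebraic set in the plane.
ranks = []
for d in range(1, 5):
    M, _ = moment_matrix(circle, d)
    ranks.append(int(np.linalg.matrix_rank(M, tol=1e-8)))

# For d = 2 the kernel is one-dimensional and encodes the circle equation.
M2, exps2 = moment_matrix(circle, 2)
eigvals, eigvecs = np.linalg.eigh(M2)        # eigenvalues in ascending order
kernel = eigvecs[:, 0]                       # eigenvector of the (near) zero eigenvalue
kernel = kernel / -kernel[exps2.index((0, 0))]   # normalize constant term to -1
coeffs = {a: round(float(c), 6) for a, c in zip(exps2, kernel)}
```

Here coeffs maps each exponent (i, j) to the coefficient of x^i y^j in the kernel polynomial, which is x^2 + y^2 - 1 up to rounding. The linear growth of the rank in d is a heuristic indication that the cloud is one-dimensional.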

All these potential applications in Machine Learning and Data Analysis are described in our forthcoming book

"The Christoffel-Darboux Kernel for Data Analysis" by J.B. Lasserre, E. Pauwels, M. Putinar, Cambridge University Press, 2021

III. New applications and connections of the Christoffel function.

It turns out that a non-standard application of the Christoffel function allows one to recover a function accurately from a sample of its values, in several cases even without the Gibbs phenomenon when the function is discontinuous. To illustrate the basic idea, consider a function f: [0,1] -> [0,1]. We build (i) the moment matrix of the empirical measure µ on [0,1]×[0,1] supported on the graph of the function at the points where it is evaluated, and (ii) the degree-n Christoffel function C_n(x,y) associated with this measure. Then, for each fixed x in [0,1], we approximate f(x) by f_n(x) := argmin_y C_n(x,y), which can be done efficiently as y -> C_n(x,y) is a univariate SOS polynomial. This strategy is described in
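The idea can be illustrated with the following minimal sketch. It makes several assumptions that are not in the text above: a step function as test case, a small ridge term reg added to the moment matrix (which is singular when the samples lie on an algebraic curve), and a plain grid search over y in place of an exact univariate polynomial minimization. Here C_n denotes the SOS polynomial v^T M^{-1} v, and all names are hypothetical.

```python
import numpy as np

def features(x, y, degree):
    """Bivariate monomials x^i y^j with i + j <= degree, one row per point."""
    x, y = np.broadcast_arrays(np.asarray(x, dtype=float), np.asarray(y, dtype=float))
    x, y = x.ravel(), y.ravel()
    return np.stack([x**i * y**j
                     for i in range(degree + 1)
                     for j in range(degree + 1 - i)], axis=1)

def fit(xs, ys, degree, reg=1e-8):
    """Inverse of the (regularized) moment matrix of the samples (x_k, f(x_k))."""
    V = features(xs, ys, degree)
    M = V.T @ V / len(xs)
    # Assumption: ridge term so that the inverse exists on degenerate supports.
    return np.linalg.inv(M + reg * np.eye(M.shape[0]))

def predict(x, M_inv, degree, y_grid):
    """Approximate f(x) by the grid point y minimizing C_n(x, y)."""
    V = features(np.full_like(y_grid, x), y_grid, degree)
    values = np.einsum("ij,jk,ik->i", V, M_inv, V)
    return float(y_grid[np.argmin(values)])

# Assumed test case: a discontinuous step function sampled at 400 points.
def f(x):
    return np.where(x < 0.5, 0.2, 0.8)

xs = np.linspace(0.0, 1.0, 400)
degree = 6
M_inv = fit(xs, f(xs), degree)
y_grid = np.linspace(0.0, 1.0, 201)

pred_left = predict(0.25, M_inv, degree, y_grid)    # left of the jump
pred_right = predict(0.75, M_inv, degree, y_grid)   # right of the jump
```

In this toy setting the minimizer over y sits on the correct branch of the step on each side of the jump, which is the mechanism behind recovery without Gibbs oscillations. The values of reg, degree and the grid resolution are tuning choices made only for this illustration.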

Marx S., Pauwels E., Weisser T., Henrion D., Lasserre J.B. (2021) Semi-Algebraic approximation using Christoffel-Darboux kernel, Constr. Approx. 54, pp. 391--429.

We have also shown the potential of the Christoffel function as a simple tool for supervised classification in data analysis (of moderate dimension) in

Lasserre J.B. (2022) On the Christoffel function and classification in data analysis. Comptes Rendus Mathématique 360 (8), pp. 919--928

We have also shown that the Christoffel function of a measure µ on a Cartesian product X×Y disintegrates as the product of the Christoffel function of the marginal of µ on X and the Christoffel function of a measure with the flavor of the conditional probability on Y given x in X, exactly as for Borel measures on X×Y. See

Lasserre J.B. (2022) A disintegration of the Christoffel function. Comptes Rendus Mathématique 360 (9), pp. 1071--1079

We have also revealed some (in the author's opinion surprising) connections with other disciplines, involving the polynomial Pell's equation, certificates of positivity, equilibrium measures of compact sets, and a duality result of Nesterov for some cones of positive polynomials. See e.g.

Lasserre J.B. (2023) Pell's equation, sum-of-squares and equilibrium measures of compact sets. Comptes Rendus Mathématique 361 (5), pp. 935--952
Lasserre J.B., Xu Y. (2024) A generalized Pell's equation for a class of multivariate orthogonal polynomials, Trans. Amer. Math. Soc.
Lasserre J.B. (2024) The Christoffel function: Applications, connections and extensions, Num. Alg. Control Optim.

We have also provided a variant of the Christoffel function (CF) with the same computational complexity and the same asymptotic dichotomy behavior, and with the advantage that the (usually unknown) equilibrium measure of the support disappears in the limit. See

Lasserre J.B. (2023) A modified Christoffel function and its asymptotic properties, J. Approximation Theory 295.
