Graphs are ubiquitous tools to represent networks, whether these networks model data from neuroscience, sociology, molecular biology, chemistry, etc. A cornerstone of graph analysis is the Laplacian matrix L, which encodes the graph's structure. From a linear algebra point of view, the analysis of L offers fundamental insights into key properties of the graph: the speed at which information or a disease diffuses over a network, the vulnerability of a network to targeted or random attacks, the redundancy of certain parts of the network, and the organization of the network into more or less independent modules are all examples of characteristics one may extract from the Laplacian matrix. Unfortunately, from a computational standpoint, the cost of this analysis is often prohibitive in practice. For instance, computing the pseudo-inverse or the SVD of L requires, in the worst case, a time cubic in the size n of the graph, and rapidly becomes burdensome as n increases.

Among the strategies developed in the state of the art to overcome this challenge, the class of stochastic methods called RandNLA (Randomized Numerical Linear Algebra) has attracted great interest over the last decade. The gist of these approaches is to compute an approximation of the solution in three steps: randomly sample and/or project the matrix L, solve the algebra problem in this reduced dimension, and extrapolate the solution back to the original dimension. Beyond its simplicity, a reason for this approach's success is that, provided a few conditions are met, the approximation error is theoretically very well controlled. In practice, this means that by tolerating a small error on the result, RandNLA methods can reduce computation costs by several orders of magnitude in some cases.

The first step of RandNLA methods usually relies on iid (independent and identically distributed) sampling, which makes it possible to invoke powerful concentration theorems and thus to control the approximation error precisely. However, iid sampling is not optimal, in the sense that it may produce very redundant samples: an ideal sampling scheme should incorporate some form of repulsion, in order to obtain samples that are as representative as possible of the matrix at hand. Determinantal point processes (DPPs) are such a class of repulsive processes, with the added benefit of being tractable (for instance, their marginal probabilities at any order are known), which makes them natural candidates to improve the current performance of RandNLA algorithms. Sampling from a DPP, however, is necessarily more expensive than iid sampling: in general, it requires a time cubic in the size of the set to sample from. Thankfully, some specific DPPs are much more efficient to sample. This is the case, for instance, of DPPs defined over graphs, such as random spanning forests, which sample edges and nodes of a graph in a time quasi-linear in the number of edges. This type of acceleration is regularly observed in linear algebra problems involving graphs: a classical example is the inversion problem Mx = b, whose worst-case complexity is cubic in the size of M, but for which a good approximation may be obtained in a time quasi-linear in the number of non-zero entries of M, provided that M is the Laplacian matrix of a graph.
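To make the three-step paradigm concrete, here is a minimal, self-contained Python sketch (not taken from the project) of an iid-sampling RandNLA routine: a Nyström-type approximation of a graph Laplacian, where a few columns are sampled uniformly at random, a small problem is solved on the sampled block, and the result is extrapolated back to the full dimension. The Erdős–Rényi graph, the sizes n and k, and all variable names are illustrative assumptions.

# Nystrom-type RandNLA sketch on a graph Laplacian (illustrative only;
# the project targets non-iid, DPP-based sampling instead of iid sampling).
import numpy as np

rng = np.random.default_rng(0)

# Build the Laplacian of a small random graph (assumption: Erdos-Renyi).
n, p = 200, 0.05
A = (rng.random((n, n)) < p).astype(float)
A = np.triu(A, 1)
A = A + A.T                      # symmetric adjacency matrix
L = np.diag(A.sum(axis=1)) - A   # combinatorial Laplacian

# Step 1: iid uniform sampling of k columns (a DPP would sample a more
# "spread-out", less redundant subset).
k = 40
idx = rng.choice(n, size=k, replace=False)
C = L[:, idx]                    # n x k sampled columns
W = L[np.ix_(idx, idx)]          # k x k sampled block

# Step 2: solve the small problem (pseudo-inverse of the k x k block).
W_pinv = np.linalg.pinv(W)

# Step 3: extrapolate back to the original dimension.
L_approx = C @ W_pinv @ C.T

rel_err = np.linalg.norm(L - L_approx) / np.linalg.norm(L)
print(f"relative Frobenius error of the Nystrom approximation: {rel_err:.3f}")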
In a nutshell, this project aims at developing non-iid RandNLA algorithms specifically designed for Laplacian matrices, by taking advantage of the tractable repulsion provided by DPPs and the efficiency of graph-based algorithms. The originality of our proposal is to leverage the representative power of DPPs for RandNLA in the specific yet promising context of Laplacian matrices, and to implement these DPPs with very efficient graph-based sampling algorithms.
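As an illustration of the graph-based samplers alluded to above, the following sketch draws a random spanning forest with killing rate q using a Wilson-type loop-erased random walk; the set of roots of such a forest is known to form a DPP over the nodes. This is a simple, unoptimised illustration, not the project's actual implementation; the function name sample_random_forest and the cycle-graph example are assumptions made for the sake of the example.

# Wilson-type sampler for random spanning forests (illustrative sketch).
import numpy as np

def sample_random_forest(neighbors, q, rng):
    """neighbors: list of neighbour lists; returns (parent pointers, roots)."""
    n = len(neighbors)
    in_forest = [False] * n
    next_node = [None] * n          # parent pointer in the sampled forest
    for start in range(n):
        u = start
        # Random walk, killed at each step with probability q / (q + deg),
        # stopped as soon as it hits the part of the forest already built.
        while not in_forest[u]:
            deg = len(neighbors[u])
            if deg == 0 or rng.random() < q / (q + deg):
                next_node[u] = None     # u becomes a root of the forest
                in_forest[u] = True
            else:
                next_node[u] = neighbors[u][rng.integers(deg)]
                u = next_node[u]
        # Retrace from the start node: overwritten pointers implement the
        # loop erasure, so this path is simple and ends in the forest.
        u = start
        while not in_forest[u]:
            in_forest[u] = True
            u = next_node[u]
    roots = [v for v in range(n) if next_node[v] is None]
    return next_node, roots

# Example on a small cycle graph (assumed toy input).
rng = np.random.default_rng(4)
n = 10
neighbors = [[(i - 1) % n, (i + 1) % n] for i in range(n)]
parents, roots = sample_random_forest(neighbors, q=1.0, rng=rng)
print("roots of the sampled forest:", roots)

Each pass over a node costs one loop-erased random walk, which is why such samplers run in a time roughly proportional to the number of edges rather than cubic in the number of nodes.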
The goal of project GrandMa is to explore the use of Random Graph (RG) theory in modern Machine Learning (ML) methods on graphs. Large graphs appear in many fields and, given their complexity, their statistical modelling is of primary importance. RG models thus have a long history in statistics and graph theory. Such modelling is however surprisingly absent from modern ML methods on graphs, such as graph kernels or the recently very popular Graph Neural Networks (GNNs), and with it, classical ML tools such as generalization bounds or convergence rates of optimization methods. These tools are nevertheless crucial to characterize the limitations of the algorithms and to improve them. Project GrandMa therefore aims at advancing the fundamental understanding of ML algorithms on large graphs, in particular GNNs, at understanding their limitations, and at leveraging the theory to design efficient and powerful ML algorithms with guarantees on large random graphs.
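To fix ideas, here is a minimal sketch of the two objects the project combines: a graph drawn from a random-graph model (an Erdős–Rényi model is assumed here purely for illustration) and a single mean-aggregation GNN layer applied to node features. It is not a method from the project, just an illustration of the setting.

# One mean-aggregation GNN layer on a graph sampled from G(n, p).
import numpy as np

rng = np.random.default_rng(3)

# Sample G(n, p) and random node features (all sizes are assumptions).
n, p, d_in, d_out = 500, 0.02, 8, 4
A = (rng.random((n, n)) < p).astype(float)
A = np.triu(A, 1)
A = A + A.T
X = rng.standard_normal((n, d_in))

# One message-passing layer: aggregate neighbour features (degree-normalised),
# then apply a linear map and a ReLU nonlinearity.
W = rng.standard_normal((d_in, d_out)) / np.sqrt(d_in)
deg = np.maximum(A.sum(axis=1, keepdims=True), 1.0)
H = np.maximum((A @ X) / deg @ W, 0.0)

print("output node embeddings:", H.shape)   # (n, d_out)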
Gaussian Processes (GPs) are one of the most popular tools in Machine Learning and Statistics. They underlie a wide range of techniques for smoothing data, fitting nonlinear classifiers or finding latent variable representations. The role of GPs in Bayesian statistics and Machine Learning is to provide a way of expressing priors over functions. In the most common case, the data are noisy measurements of an underlying function f(x) at a set of sampling points. Instead of choosing a parametric form for f, we remain non-parametric, but we still need to make some assumptions (otherwise the most likely f is any function that interpolates our data exactly). In a Bayesian setting our assumptions enter into a prior p(f), which lets us express, for example, that f has a certain degree of smoothness. GPs make it very easy to express such assumptions. Once we have set a GP prior we can perform inference through the posterior distribution p(f|y), which again is a distribution over functions.

The theory of GPs is very elegant, but to make them work in practice one needs to take numerical shortcuts. The so-called "big-n" problem arises: inference for GPs scales poorly with the number of data points. If the data are in the Euclidean plane, many approximation techniques are available. The most effective is arguably the one developed by Lindgren et al. (2011), who use a representation of GPs as solutions of stochastic partial differential equations (SPDEs). The SPDE can be solved using the Finite Element Method, which enables large computational savings via sparse matrix operations. However, many datasets are not made up of points in the plane, but concern more complicated objects like character strings, trees, networks, or matrices. In neuroscience, for example, functional networks (coactivation between areas) are used to diagnose some diseases, and it would be of interest to develop appropriate GP classifiers (working, in this case, in a space of networks). At the moment, GPs are most useful for (a) small datasets and (b) data that are on the real line or in the plane. Current computational methods for handling non-spatial GPs are based on low-rank approximations; they are either ad hoc or greedy, come with no performance guarantees, and can indeed perform badly in practice (Chalupka et al., 2013).

We aim to extend the usability of GPs by generalising and extending an approach that has proven very successful for data in the plane or on embedded manifolds. The project is certainly timely since the recent development of deep GP models (Damianou et al., 2013) is bound to increase the interest in GP methods, which could become competitive with deep neural network models in hard classification and prediction tasks. The goal of the project is to provide fast numerical methods for GPs in structured spaces. We will do so by extending the SPDE representation of Lindgren et al. (2011) to a more flexible ARMA-like representation. This representation uses polynomials in the Laplace operator, which enables a generalisation to structured spaces via neighbour graphs and discrete Laplace operators. We will use tools from numerical linear algebra, signal and image processing, and convex optimisation. We will apply our methods to real datasets and research questions from neuroscience.
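As a rough illustration of the last point (not the project's actual method), the following sketch performs GP regression on a graph where the prior precision is a degree-2 polynomial of the graph Laplacian, in the spirit of the SPDE/Matérn construction of Lindgren et al. (2011); the posterior mean then only requires sparse linear solves. The chain graph, the parameter kappa2 and the noise level are illustrative assumptions.

# Matern-like GP on a graph via a polynomial of the graph Laplacian (sketch).
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import spsolve

rng = np.random.default_rng(1)

# Chain graph with n nodes (assumption: any neighbour graph would do).
n = 300
i = np.arange(n - 1)
A = sp.coo_matrix((np.ones(n - 1), (i, i + 1)), shape=(n, n))
A = A + A.T
L = sp.diags(np.asarray(A.sum(axis=1)).ravel()) - A          # graph Laplacian

# Prior precision Q = (kappa^2 I + L)^2, i.e. a degree-2 polynomial in L.
kappa2 = 0.5
K = kappa2 * sp.identity(n) + L
Q = (K @ K).tocsc()

# Noisy observations y = f + noise at a random subset of nodes.
obs = rng.choice(n, size=60, replace=False)
f_true = np.sin(np.linspace(0, 4 * np.pi, n))
sigma2 = 0.05
y = f_true[obs] + np.sqrt(sigma2) * rng.standard_normal(len(obs))

# Posterior precision and mean: Q_post = Q + H^T H / sigma2, where H selects
# the observed nodes; everything stays sparse.
H = sp.coo_matrix((np.ones(len(obs)), (np.arange(len(obs)), obs)), shape=(len(obs), n))
Q_post = (Q + (H.T @ H) / sigma2).tocsc()
mean_post = spsolve(Q_post, H.T @ y / sigma2)

print("posterior mean computed at all", n, "nodes via sparse solves")

Because the precision (rather than the covariance) is sparse, the cost of inference is driven by sparse factorisations instead of dense cubic-time operations, which is exactly the saving the SPDE approach exploits in the plane.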
Classical methods of automatic control are insufficient to deal with dynamics on large networks because they scale poorly with size. This project aims at overcoming this issue by studying scalable continuous methods, in which the (discrete) network and the dynamics on it (an ODE system) are replaced by dynamics over a continuous space, such as a PDE or an integral equation defined by a graphon. Among the many potential applications of these tools, which include epidemics and transportation networks, this project will focus on problems of mode detection in multi-modal mobility.
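As a toy illustration of the discrete-to-continuous replacement described above (assumed here for illustration, not taken from the project), the following sketch compares simple averaging dynamics on a graph sampled from a graphon W(u, v) with the corresponding discretised integral-operator dynamics defined by W itself; for large n the two trajectories stay close.

# Graph ODE vs. graphon limit for simple averaging dynamics (sketch).
import numpy as np

rng = np.random.default_rng(2)

def W(u, v):
    # Assumed graphon: higher connection probability for "low" latent positions.
    return 0.8 * (1 - np.maximum(u, v))

n = 400
u = np.sort(rng.random(n))                       # latent positions
P = W(u[:, None], u[None, :])                    # edge probabilities / kernel
A = (rng.random((n, n)) < P).astype(float)
A = np.triu(A, 1)
A = A + A.T                                      # sampled simple graph

x_graph = np.cos(2 * np.pi * u)                  # shared initial condition
x_graphon = x_graph.copy()

dt, steps = 0.05, 200
for _ in range(steps):
    # Discrete network dynamics, normalised by n as in the graphon scaling.
    x_graph = x_graph + dt * (A @ x_graph - A.sum(axis=1) * x_graph) / n
    # Graphon dynamics: the adjacency matrix is replaced by the kernel W.
    x_graphon = x_graphon + dt * (P @ x_graphon - P.sum(axis=1) * x_graphon) / n

print("max deviation between graph and graphon trajectories:",
      np.max(np.abs(x_graph - x_graphon)))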
We often admire high-level athletes for their extraordinary performances and for the precise coordination of their gestures. We forget that we are all vocal athletes too! Speech, which we produce every day in an automatic way, actually requires a very precise coordination of breathing, laryngeal and articulatory gestures, which takes children some time to master and which can dysfunction in the case of voice or speech disorders. The production of stop consonants (/p/, /b/, ...) is of particular interest for a better understanding of speech control, as it requires a coordination of speech gestures not only in their amplitude and force, but also in their timing. Children with articulation disorders very often have trouble with this category of sounds, while dysphonic patients demonstrate excessive tension and effort. The project StopNCo aims at improving our understanding of speech control by addressing four questions.

1. Which acoustic features are crucial for the intelligibility of stop consonants? Instead of the traditional laboratory approach that examines the perceptual consequences of varying features in synthetic stimuli, we will characterize how speakers enhance their speech in interactive situations requiring intelligibility but in which acoustic cues are altered or missing (speech produced in a noisy or reverberant environment, whispered speech, …).

2. Which coordination of breathing, laryngeal and articulatory gestures enables the variation of these acoustic features, and to what extent does it reflect physical constraints vs. speaker-specific control? We will simultaneously collect acoustic, aerodynamic (intra-oral pressure, airflow), laryngeal (electroglottography, endoscopy) and articulatory data (movement, force sensors, surface EMG). We will explore how the coordination of breathing, laryngeal and articulatory gestures varies with the speaking mode (murmured to shouted, whispered, fast, clear speech) and under speech perturbation paradigms (filtered auditory feedback, perturbed articulation, oral anesthetic, …).

3. How does this control develop normally in children, and how does it dysfunction in some of them? We will characterize how the acoustic cues to stop consonants are refined with child age, or remain deviant in children with functional articulation disorders. Using non-invasive methodologies, we will identify aspects of speech coordination that differ from adult speech and that some children with articulation disorders have trouble developing.

4. How can the coordination of speech gestures vary in efficiency? This will first require the development of methodologies to measure or estimate laryngeal and articulatory efforts. Production efforts will be compared between non-pathological adult speakers and dysphonic patients, who are hypothesized to coordinate their breathing, laryngeal and articulatory gestures less efficiently.

A functional model of speech coordination in the production of stop consonants will be built, taking into account the different physiological levels and perceptual outcomes, as well as intra- and inter-speaker variability. The financial help requested mainly aims at recruiting a Ph.D. student and a one-year post-doctoral researcher, so as to help me build a team around the questions of speech production efforts and stop consonant production, which are not yet well developed at GIPSA-lab. The consortium will be composed of a limited number of people, mainly from GIPSA-lab but also from the University of Lyon 1 and from the medical field, most of them being young researchers.
Each collaborator will bring a specific expertise – in speech physiology and pathology, in biomechanics, and in speech development in children – complementary to mine in vocal effort, speech adaptation and face-to-face interaction. The project will bring fundamental knowledge on speech efforts and coordination, as well as new measurement methodologies and indices for the diagnosis and rehabilitation of speech disorders.