10,038 Research products, page 1 of 1,004

  • Publications
  • Conference object
  • European Commission
  • EU
  • English
  • Mémoires en Sciences de l'Information et de la Communication
  • Hal-Diderot
  • ProdInra

  • English
    Authors: 
    Turan, Mehmet Ali Tugtekin; Klakow, Dietrich; Vincent, Emmanuel; Jouvet, Denis;
    Publisher: HAL CCSD
    Country: France
    Project: EC | COMPRISE (825081)

    Submitted to INTERSPEECH 2021; International audience; In recent years, voice-controlled personal assistants have revolutionized the interaction with smart devices and mobile applications. These dialogue tools are then used by system providers to improve and retrain the language models (LMs). Each spoken message reveals personal information, hence, it is necessary to remove the private data from the input utterances. However, this may harm the LM training because privacy-transformed data is unlikely to match the test distribution. This paper aims to fill the gap by focusing on the adaptation of an LM initially trained on privacy-transformed utterances. Our data sanitization process relies on named-entity recognition. We propose an LM adaptation strategy over the private data with minimum losses. Class-based modeling is an effective approach to overcome data sparsity in the context of n-gram model training. On the other hand, neural LMs can handle longer contexts which can yield better predictions. Our methodology combines the predictive power of class-based models and the generalization capability of neural models. With privacy transformation, we have a relative 11% word error rate (WER) increase compared to an LM trained on the clean data. Despite the privacy-preserving transformation, we can still achieve comparable accuracy. Empirical evaluations attain a relative WER improvement of 8% over the initial model.

  • English
    Authors: 
    Sheikh, Imran; Vincent, Emmanuel; Illina, Irina;
    Publisher: HAL CCSD
    Country: France
    Project: EC | COMPRISE (825081)

    International audience; In several ASR use cases, training and adaptation of domain-specific LMs can only rely on a small amount of manually verified text transcriptions and sometimes a limited amount of in-domain speech. Training of LSTM LMs in such limited data scenarios can benefit from alternate uncertain ASR hypotheses, as observed in our recent work. In this paper, we propose a method to train Transformer LMs on ASR confusion networks. We evaluate whether these self-attention based LMs are better at exploiting alternate ASR hypotheses as compared to LSTM LMs. Evaluation results show that Transformer LMs achieve 3–6% relative reduction in perplexity on the AMI scenario meetings but perform similarly to LSTM LMs on the smaller Verbmobil conversational corpus. Evaluation on ASR N-best rescoring shows that LSTM and Transformer LMs trained on ASR confusion networks do not bring significant WER reductions. However, a qualitative analysis reveals that they are better at predicting less frequent words.

  • Publication . Conference object . Preprint . Article . 2022
    Open Access English
    Authors: 
    Deutch, Daniel; Frost, Nave; Kimelfeld, Benny; Monet, Mikaël;
    Publisher: HAL CCSD
    Country: France
    Project: EC | ProDIS (804302)

    International audience; The Shapley value is a game-theoretic notion for wealth distribution that is nowadays extensively used to explain complex data-intensive computation, for instance, in network analysis or machine learning. Recent theoretical works show that query evaluation over relational databases fits well in this explanation paradigm. Yet, these works fall short of providing practical solutions to the computational challenge inherent to the Shapley computation. We present in this paper two practically effective solutions for computing Shapley values in query answering. We start by establishing a tight theoretical connection to the extensively studied problem of query evaluation over probabilistic databases, which allows us to obtain a polynomial-time algorithm for the class of queries for which probability computation is tractable. We then propose a first practical solution for computing Shapley values that adopts tools from probabilistic query evaluation. In particular, we capture the dependence of query answers on input database facts using Boolean expressions (data provenance), and then transform it, via Knowledge Compilation, into a particular circuit form for which we devise an algorithm for computing the Shapley values. Our second practical solution is a faster yet inexact approach that transforms the provenance to a Conjunctive Normal Form and uses a heuristic to compute the Shapley values. Our experiments on TPC-H and IMDB demonstrate the practical effectiveness of our solutions.

  • Closed Access English
    Authors: 
    Horel, Jean-Baptiste; Laugier, Christian; Marsso, Lina; Mateescu, Radu; Muller, Lucie; Paigwar, Anshul; Renzaglia, Alessandro; Serwe, Wendelin;
    Publisher: HAL CCSD
    Country: France
    Project: EC | ArchitectECA2030 (877539)

    International audience; Simulation, a common practice to evaluate autonomous vehicles, requires specifying realistic scenarios, in particular critical ones, which correspond to corner-case situations that occur rarely and are potentially dangerous to reproduce in real environments. Such simulation scenarios may be either generated randomly, or specified manually. Randomly generated scenarios are easy to produce, but their relevance might be difficult to assess, for instance when many slightly different scenarios target one feature. Manually specified scenarios can focus on a given feature, but their design might be difficult and time-consuming, especially to achieve satisfactory coverage. In this work, we propose an automatic approach to generate a large number of relevant critical scenarios for autonomous driving simulators. The approach is based on the generation of behavioural conformance tests from a formal model (specifying the ground truth configuration with the range of vehicle behaviours) and a test purpose (specifying the critical feature to focus on). The obtained abstract test cases cover, by construction, all possible executions exercising a given feature, and can be automatically translated into the inputs of autonomous driving simulators. We illustrate our approach by generating hundreds of behaviour trees for the CARLA simulator for several realistic configurations.

  • Open Access English
    Authors: 
    Julien Baste; Michael R. Fellows; Lars Jaffke; Tomáš Masařík; Mateus de Oliveira Oliveira; Geevarghese Philip; Frances A. Rosamond;
    Publisher: HAL CCSD
    Country: France
    Project: EC | CUTACOMBS (714704)

    When modeling an application of practical relevance as an instance of a combinatorial problem X, we are often interested not merely in finding one optimal solution for that instance, but in finding a sufficiently diverse collection of good solutions. In this work we initiate a systematic study of diversity from the point of view of fixed-parameter tractability theory. First, we consider an intuitive notion of diversity of a collection of solutions which suits a large variety of combinatorial problems of practical interest. We then present an algorithmic framework which --automatically-- converts a tree-decomposition-based dynamic programming algorithm for a given combinatorial problem X into a dynamic programming algorithm for the diverse version of X. Surprisingly, our algorithm has a polynomial dependence on the diversity parameter. Comment: Accepted to Twenty-Ninth International Joint Conference on Artificial Intelligence, {IJCAI} 2020, 16 pages

  • Publication . Preprint . Conference object . Article . 2022
    Open Access English
    Authors: 
    Benedikt Ahrens; Ralph Matthes; Anders Mörtberg;
    Publisher: Association for Computing Machinery (ACM)
    Countries: Netherlands, France
    Project: EC | CoqHoTT (637339)

    Accepted to CPP 2022; International audience; In previous work (“From signatures to monads in UniMath”), we described a category-theoretic construction of abstract syntax from a signature, mechanized in the UniMath library based on the Coq proof assistant. In the present work, we describe what was necessary to generalize that work to account for simply-typed languages. First, some definitions had to be generalized to account for the natural appearance of non-endofunctors in the simply typed case. As it turns out, in many cases our mechanized results carried over to the generalized definitions without any code change. Second, an existing mechanized library on ω-cocontinuous functors had to be extended by constructions and theorems necessary for constructing multi-sorted syntax. Third, the theoretical framework for the semantical signatures had to be generalized from a monoidal to a bi-categorical setting, again to account for non-endofunctors arising in the typed case. This uses actions of endofunctors on functors with given source, and the corresponding notion of strong functors between actions, all formalized in UniMath using a recently developed library of bicategory theory. We explain what needed to be done to plug all of these ingredients together, modularly. The main result of our work is a general construction that, when fed with a signature for a simply-typed language, returns an implementation of that language together with suitable boilerplate code, in particular, a certified monadic substitution operation.

  • Open Access English
    Authors: 
    Azalea Raad; Luc Maranget; Viktor Vafeiadis;
    Publisher: Association for Computing Machinery (ACM)
    Countries: France, United Kingdom
    Project: EC | PERSIST (101003349)

    Existing semantic formalisations of the Intel-x86 architecture cover only a small fragment of its available features that are relevant for the consistency semantics of multi-threaded programs as well as the persistency semantics of programs interfacing with non-volatile memory. We extend these formalisations to cover: (1) non-temporal writes, which provide higher performance and are used to ensure that updates are flushed to memory; (2) reads and writes to other Intel-x86 memory types, namely uncacheable, write-combined, and write-through; as well as (3) the interaction between these features. We develop our formal model in both operational and declarative styles, and prove that the two characterisations are equivalent. We have empirically validated our formalisation of the consistency semantics of these additional features and their subtle interactions by extensive testing on different Intel-x86 implementations.

  • Publication . Preprint . Article . Conference object . 2022
    Open Access English
    Authors: 
    Betea, Dan; Bouttier, Jérémie; Walsh, Harriet;
    Publisher: HAL CCSD
    Country: France
    Project: EC | CombiTop (716083), ANR | DIMERS (ANR-18-CE40-0033)

    We study two families of probability measures on integer partitions, which are Schur measures with parameters tuned in such a way that the edge fluctuations are characterized by a critical exponent different from the generic $1/3$. We find that the first part asymptotically follows a "higher-order analogue" of the Tracy-Widom GUE distribution, previously encountered by Le Doussal, Majumdar and Schehr in quantum statistical physics. We also compute limit shapes, and discuss an exact mapping between one of our families and the multicritical unitary matrix models introduced by Periwal and Shevitz. Comment: 12 pages, 3 figures (v2: minor modifications and clarifications). Extended abstract for FPSAC 2021

  • Open Access English
    Authors: 
    Da Silva, F.; Ricardo, E.; Ferreira, J.; Santos, J.; Heuraux, S.; Silva, A.; Ribeiro, T.; De Masi, G.; Tudisco, O.; Cavazzana, R.; +1 more
    Publisher: Institute of Physics Publishing, Bristol, United Kingdom
    Countries: France, Italy
    Project: EC | EUROfusion (633053)

    O-mode reflectometry, a technique to diagnose fusion plasmas, is foreseen as a source of real-time (RT) plasma position and shape measurements for control purposes in the coming generation of machines such as DEMO. It is, thus, of paramount importance to predict the behavior and capabilities of these new reflectometry systems using synthetic diagnostics. The use of finite-difference time-domain (FDTD) time-dependent codes permits a comprehensive description of reflectometry, including aspects such as propagation in the plasma, the system location within the vacuum vessel, its access to the plasma or the signal processing techniques. FDTD is a computationally demanding technique, especially when it comes to three-dimensional (3D) simulations, which requires access to HPC facilities. This fact makes the use of two-dimensional (2D) codes much more common. It is important to have a good evaluation of the compromises made when using a 2D model in order to decide whether it is applicable to the problem under study, or if the problem rather requires a 3D approach. This work attempts to answer this question by comparing simulations of a potential plasma position reflectometer (PPR) at the Lower Field-Side (LFS) on IDTT carried out using two full-wave FDTD codes, REFMULF (2D) and REFMUL3 (3D). In particular, the simulations consider one of IDTT's foreseen plasma scenarios, namely, a Single Null (SN) configuration, at the Start Of Flat (SOF), the start of the flat top of the plasma current.

  • Publication . 2021
    English
    Authors: 
    Baelde, David; Delaune, Stéphanie; Koutsos, Adrien; Moreau, Solène;
    Publisher: HAL CCSD
    Country: France
    Project: EC | POPSTAR (714955)

    International audience; Bana and Comon have proposed a logical approach to proving protocols in the computational model, which they call the Computationally Complete Symbolic Attacker (CCSA). The proof assistant Squirrel implements a verification technique that elaborates on this approach, building on a meta-logic over the CCSA base logic. In this paper, we show that this meta-logic can naturally be extended to handle protocols with mutable states (key updates, counters, etc.) and we extend Squirrel's proof system to be able to express the complex proof arguments that are sometimes required for these protocols. Our theoretical contributions have been implemented in Squirrel and validated on a number of case studies, including a proof of the YubiKey and YubiHSM protocols.
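
As a rough illustration of the entry above on Shapley values in query answering (Deutch, Frost, Kimelfeld, Monet), the sketch below computes the Shapley value of each database fact for a Boolean query by brute-force enumeration of subsets. It is not the authors' method: their abstract describes solutions based on provenance, Knowledge Compilation, and a CNF heuristic, whereas this is only the textbook exponential-time definition. The function `shapley_values`, the toy relations `R` and `S`, and the example query are all hypothetical names chosen for illustration.

```python
from fractions import Fraction
from itertools import combinations
from math import factorial


def shapley_values(facts, query):
    """Exact Shapley value of each fact for a Boolean query.

    Brute force over all subsets (exponential in len(facts)); this is
    the textbook definition, not the algorithm from the paper.
    """
    n = len(facts)
    values = {}
    for f in facts:
        others = [g for g in facts if g != f]
        total = Fraction(0)
        for k in range(n):
            # Weight of a coalition of size k that f then joins.
            weight = Fraction(factorial(k) * factorial(n - k - 1), factorial(n))
            for subset in combinations(others, k):
                s = frozenset(subset)
                # Marginal contribution of f: 1 if adding f flips the
                # query from false to true, 0 otherwise.
                gain = int(query(s | {f})) - int(query(s))
                total += weight * gain
        values[f] = total
    return values


def query(db):
    """Hypothetical Boolean query: is there an x with R(x) and S(x)?"""
    return any(("R", x) in db and ("S", x) in db for (_, x) in db)


# Toy database: only R(1) and S(1) can jointly satisfy the query.
facts = [("R", 1), ("S", 1), ("R", 2)]
result = shapley_values(facts, query)
for fact, value in result.items():
    print(fact, value)
```

In this toy database the two facts that jointly satisfy the query split the contribution equally and the irrelevant fact gets zero. The values sum to the query's value on the full database, which is the efficiency property of the Shapley value.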
