
The focus of this proposal is on the detection and survival of wrong-code compiler defects, which we argue present a cyber-security threat that has been largely ignored to date. First, incorrectly compiled code can introduce exploitable vulnerabilities that are not visible at the source code level, and thus cannot be detected by source-level static analysers. Second, incorrectly compiled code can undermine the reliability of the application, which can have dramatic repercussions in the context of safety-critical systems. Third, wrong-code compiler defects can also be the target of some of the most insidious security attacks. A crafty attacker posing as an open source developer can introduce a compiler-bug-based backdoor into a security-critical application by adding a patch that looks perfectly innocent but which, when compiled with a certain compiler, yields binary code that allows the attacker to compromise the software. In this project, we aim to explore automated techniques that can detect and prevent such problems. In particular, we plan to investigate techniques for automatically finding compiler-induced vulnerabilities in real software, approaches for understanding the extent to which an attacker could maliciously modify an application to create a compiler-induced vulnerability, and methods for protecting against such vulnerabilities at runtime.
Modern computing systems are becoming increasingly diverse, but the common feature of all emerging computing platforms is the increased potential for performing many computations in parallel, by providing large numbers of processor cores. Computer systems composed of different platforms have great potential for performing tasks fast and efficiently. However, programming such systems is a great challenge. The era of performance increase through increased clock speeds has come to an end and we have entered a period where performance increases can only come from increased numbers of heterogeneous computational cores and their effective exploitation by software. Because of the immense effort required to adapt existing parallel software to novel hardware architectures with present technology, there is a very real danger that future advances in hardware performance will have little impact on practical large-scale computing using legacy software. The specific challenge that we want to address in this proposal is how to exploit the parallelism of a given computing platform, e.g. a multicore CPU, a graphics processor (GPU) or a Field-Programmable Gate Array (FPGA), in the best possible way, without having to change the original program. These different platforms have very different properties in terms of the available parallelism, depending on the nature and organisation of the processing cores and the memory. In particular, FPGAs have great potential for parallelism but they are radically different in architecture from mainstream processors, which makes them very difficult to program. The key problem here is how to transform a program so that it will best use the potential for parallelism provided by the computing platform, and crucially, how to do this so that the resulting program is guaranteed to have the same behaviour as the original program.
Our proposed approach is to use an advanced type system called Multi-Party Session Types to describe the communication between the tasks that make up a computation. To use a rough analogy, the computation could for instance be viewed as a car assembly line, where every unit performs a particular task such as painting, inserting doors, wheels, motor etc. Depending on the organisation and composition of the factory, the order in which these operations are performed will determine the speed with which a car can be assembled. However, when reordering the operations, one must of course ensure that changing the order does not lead to incorrect assembly. To return to the computational problem, by using Multi-Party Session Types to describe the communication, we have a formal way of reasoning about the transformations. By developing a formal language for the transformations we can prove their correctness. This is the main novelty of the proposal: the formal system for type transformations. The actual transformations can be viewed as "programs" in this formal language. They will be informed by the properties of the computing platform. To provide this link between the transformation and the platform, we will also develop a formal description of parallel computing platforms. By building these formal systems into a compiler we will be able to transform programs to run in the most efficient way on hybrid manycore platforms. The main benefit from the proposed research is that the programmer will not need to have in-depth knowledge of the highly complex architecture of a hybrid manycore platform. This will be of great benefit in particular to scientific computing, because it also means that programs will not need to be rewritten to run with best performance on novel systems.
To demonstrate the effectiveness of our approach we aim to develop a proof-of-concept compiler which will transform programs so that they can run on FPGAs, because this type of computing platform is the most different from other platforms and hence the most challenging.
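The idea of reordering operations without changing behaviour can be illustrated with a small sketch. It is not the formal system proposed here: it assumes a deliberately simplified model in which a global protocol is a flat sequence of interactions, each role's local view is obtained by projection, and a reordering is accepted only if every role's projection is unchanged. The names `project`, `preserves_projections`, `swap` and the assembly-line roles are hypothetical and chosen to echo the analogy above.

```python
from typing import List, Set, Tuple

# One interaction in a (simplified) global protocol:
# (sender role, receiver role, message label).
Interaction = Tuple[str, str, str]


def project(global_type: List[Interaction], role: str) -> List[Tuple[str, str, str]]:
    """Return the local view of `role`: the sends and receives it takes part in, in order."""
    local = []
    for sender, receiver, label in global_type:
        if role == sender:
            local.append(("send", receiver, label))
        elif role == receiver:
            local.append(("recv", sender, label))
    return local


def roles(global_type: List[Interaction]) -> Set[str]:
    """Collect every role mentioned in the protocol."""
    return {r for sender, receiver, _ in global_type for r in (sender, receiver)}


def preserves_projections(original: List[Interaction], transformed: List[Interaction]) -> bool:
    """Accept a transformation only if no role can observe a difference."""
    all_roles = roles(original) | roles(transformed)
    return all(project(original, r) == project(transformed, r) for r in all_roles)


def swap(global_type: List[Interaction], i: int) -> List[Interaction]:
    """Candidate transformation: exchange interactions i and i + 1."""
    g = list(global_type)
    g[i], g[i + 1] = g[i + 1], g[i]
    return g


# Hypothetical assembly line expressed as a protocol.
assembly = [
    ("paint", "doors", "body"),
    ("engine", "wheels", "chassis"),
    ("doors", "wheels", "body"),
]

# Steps 0 and 1 involve disjoint roles, so they may be reordered.
print(preserves_projections(assembly, swap(assembly, 0)))
# Steps 1 and 2 both involve "wheels", so reordering changes its local view.
print(preserves_projections(assembly, swap(assembly, 1)))
```

In this simplified model a swap is safe exactly when no role sees its own sequence of sends and receives change; a full treatment would also need to handle choice, recursion and platform-specific costs.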
Heterogeneous multi-core processors are key to complex computational problems such as real-time medical imaging, financial analysis and high-definition video, because of the increased processing power they make available. The reliability and probity of such applications are of critical importance, yet heterogeneous multi-core processors are notoriously difficult to program correctly. There is a need for analysis and verification techniques to help detect and fix errors early in the design of multi-core software. The development of such techniques is the aim of the proposed Fellowship research. The research will involve extending the capabilities of existing theoretical computer science techniques based on typechecking and model checking. Typechecking is a commonly used lightweight method for eliminating errors in computer programs: a basic typechecker will reject an invalid expression such as "Hello" + 1. More complex typechecking, based on session types, can allow a protocol between two parties in a system to be automatically checked. One part of the Fellowship research will involve extending the notion of session types to be applicable to heterogeneous multi-core processors. New typechecking methods will also be developed to help programmers deal with complex issues arising from the management of separate memory spaces in multi-core systems. Model checking is a technique for verifying hardware and software systems which attempts to find system bugs by checking an abstract model of the system. Model checking is less widely used than typechecking, but model checking techniques have recently been incorporated in software products from major vendors such as Microsoft.
A major part of the Fellowship research will involve developing advanced model checking techniques to help find errors associated with the dynamic behaviour of software for heterogeneous multi-core processors. Part of the research will involve developing a set of open-source tools based on the novel formal analysis techniques. Experience has shown that developers interested in multi-core programming are reluctant to adopt new languages and formalisms, and will only consider new techniques if they are easy and intuitive to use, and can be integrated into an existing development tool-chain. To increase the potential for eventual adoption by industry, the new techniques will be developed with regular input and advice from Codeplay Software Ltd., a UK-based company specialising in development tools for multi-core processors.
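As an illustration of how session types allow a protocol between two parties to be checked automatically, the following is a minimal sketch. It assumes a toy encoding of session types as send/receive sequences and checks compatibility by duality (each send on one side must be matched by a receive of the same payload type on the other). The class and function names are hypothetical and do not correspond to any tool developed in the Fellowship.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Send:
    payload: str      # name of the payload type, e.g. "int"
    cont: "Session"   # remainder of the protocol


@dataclass(frozen=True)
class Recv:
    payload: str
    cont: "Session"


@dataclass(frozen=True)
class End:
    pass


Session = Union[Send, Recv, End]


def dual(s: Session) -> Session:
    """Swap every send with a receive (and vice versa) to get the partner's view."""
    if isinstance(s, Send):
        return Recv(s.payload, dual(s.cont))
    if isinstance(s, Recv):
        return Send(s.payload, dual(s.cont))
    return End()


def compatible(a: Session, b: Session) -> bool:
    """Two endpoints agree on the protocol when one is the dual of the other."""
    return dual(a) == b


# Client sends an int, then waits for a bool.
client = Send("int", Recv("bool", End()))
# A matching server receives the int, then sends the bool.
server = Recv("int", Send("bool", End()))
# A faulty server waits for a second message that the client never sends.
bad_server = Recv("int", Recv("bool", End()))

print(compatible(client, server))
print(compatible(client, bad_server))
```

The mismatch in `bad_server` is found by comparing types alone, before either party runs, which is the sense in which such checking is lightweight.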
Users want mobile devices that appear fast and responsive, but at the same time have long-lasting batteries and do not overheat. Achieving both of these at once is difficult. The workloads employed to evaluate mobile optimisations are rarely representative of real mobile applications and are oblivious to user perception, focussing only on performance. As a result, hardware and software designers' decisions do not respect the user's Quality of Experience (QoE). The device either runs faster than necessary for optimal QoE, wasting energy, or runs too slowly, spoiling QoE. SUMMER will develop the first framework to record, replay, and analyse mobile workloads that represent and measure real user experience. Our work will expose for the first time the real Pareto trade-off between the user's QoE and energy consumption. The results of this project will permit others, from computer architects up to library developers, to make their design decisions with QoE as their optimisation target. To show the power of this new approach, we will design the first energy-efficient operating system scheduler for heterogeneous mobile processors which takes QoE into account. With heterogeneous mobile processors just now entering the market, a scheduler able to use them optimally is urgently needed. We expect our scheduler to be at least 50% more energy efficient on average than the standard Linux scheduler on an ARM big.LITTLE system.
Computing pervades our lives, impacting our health, work, entertainment and social interaction. Over recent years, the technology inside the devices providing these services has undergone a radical change: where once processing was undertaken by relatively homogeneous "sequential" devices, in which essentially one thing happened at a time, the new systems compose a range of specialized devices, some targeting specific problem subdomains, and almost all exhibiting considerable "parallelism", where many things can happen at the same time. This is true on all scales, from the internals of a mobile phone to the massive data centres which serve web applications such as Google. This poses a substantial challenge for the software industry: writing correct and efficient programs for heterogeneous, highly parallel systems is much harder than for current technologies, and most developers lack the skills and training to write safe and efficient code. Faced with this difficulty, software developers will often avoid writing parallel code completely, or else will use inappropriate, non-scalable and error-prone approaches based on explicit threads of program execution. Given the hardware trend towards increasingly complex, increasingly parallel (manycore) systems, this is an inherently short-term strategy that is doomed to failure. Our project addresses this issue. Our key insight is that humans in general are very good at using patterns to understand, predict and act in the real world. This insight translates into the world of software engineering in general, and parallel heterogeneous programming in particular. Our work will help programmers to recognize patterns in pre-existing and new applications, and to transform these pattern occurrences into forms which allow them to be exploited, adapted and run effectively on the new hardware platforms.
The systems we develop will work in partnership with software developers, reducing the complexity of the development task through automation and semi-automation. The result will help the industry to develop new applications, and to update existing applications, with less effort, fewer errors and better resilience as the underlying technology continues to evolve.