
doi: 10.7302/8143
handle: 2027.42/177686
Data center applications consume the majority of today's compute cycles. Because current computer systems (computer architecture, compilers, and operating systems) are inefficient for data center applications, this dissertation focuses on redesigning the computer system to enable efficient data center processing. The challenges of efficient data center processing are twofold. First, data center applications operate on a large volume of data with complex software functionality to meet the demand of billions of users. Second, processors can no longer provide steady performance scaling to support this rapid growth. This dissertation addresses these challenges by proposing a feedback loop in computer systems design. The feedback loop consists of characterization methodologies that find the reasons behind the inefficiency and optimization techniques that overcome it. This dissertation realizes the feedback loop through profile-guided optimizations, which collect data center applications' profiles using the characterization methodologies and insert hints using the optimization techniques. While designing this feedback loop, I make two key contributions: (1) I propose systems interfaces through which software can reason about hardware inefficiencies; and (2) I design architectural abstractions through which software can suggest how to avoid hardware inefficiencies. By empowering software to understand and avoid inefficiencies across all major micro-architectural structures, I move the burden of latency-hiding optimizations from hardware to software. I help software diagnose hardware problems by designing systems interfaces to characterize hardware inefficiencies faced by data center applications. Drawing insights from this diagnosis, my techniques guide software optimizations to avoid hardware inefficiencies. As Moore's Law dwindles, the demand for performance remains ever-present.
To satisfy this need, data-driven optimizations of existing systems are essential. Systems observability is thus more valuable than ever and, more practically, more accessible than ever. The techniques I propose are concrete examples of how systems can proactively use observability to facilitate better communication between hardware and software. Embodying this vision, my systems techniques made proprietary workloads 2x faster. Consequently, I helped companies like ARM adopt my systems interfaces to diagnose hardware inefficiencies for their data center processors (e.g., ARM Neoverse N1 SDP), which power Amazon Web Services machines as well as Alibaba and Microsoft data centers. Hardware optimizations are no longer sufficient for data center applications that process large volumes of data with rapidly growing, complex software. I therefore design architectural abstractions that move optimizations from hardware to software. By empowering software to avoid inefficiencies across all major micro-architectural structures, including the instruction cache, data cache, and branch predictor, I redefine the way we design processors. I evaluate all of my techniques on widely deployed data center applications (e.g., Facebook HHVM, Twitter Finagle, Apache Cassandra, PostgreSQL, and MySQL) and show that they provide significant speedups (more than 2x) for these applications. As a result, Intel's data center processors have adopted a couple of my techniques. Looking forward, I will build open-source systems and benchmarking methodologies to make hardware/software co-design available to a wider audience. I will also use insights from leading these efforts to solve a wide range of efficiency problems across the systems stack.
Engineering, Profile-Guided Optimizations, Computer Science, Hardware/Software Co-Design
