Publication: Article, 01 Jan 2009
Publisher: IEEE
Journal: 2009 First International Communication Systems and Networks and Workshops

Authors: Cecchet, E; Natu, M; Sadaphal, V; Shenoy, P; Vin, H

doi: 10.1109/comsnets.2009.4808877

Performance debugging in data centers: Doing more with less

Abstract

With the increasing scale and complexity of data centers, detecting and localizing performance faults in real time has become both a pressing need and a challenge. While several approaches for performance debugging in data centers have been proposed, these techniques do not assume any constraints on the availability of operational data needed to detect and localize faults. We argue that collecting such operational data often requires significant instrumentation or intrusiveness, which is difficult to realize in production data centers. Such constraints complicate the deployment of existing techniques or limit their effectiveness in practice. In this paper, we argue that for performance debugging to become practical and effective in real-world systems, one needs to develop techniques that are “more effective” with “less instrumentation and intrusiveness”. We raise several issues and challenges in realizing this vision and present some initial ideas on addressing these challenges.

Related Organizations

Tata Research Development and Design Centre (India)
University of Massachusetts System (United States)
University of Massachusetts Amherst (United States)

Keywords

data centers, operating and distributed systems, fault detection and localization, performance debugging

Impact by BIP!

selected citations: 2. These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
popularity: Average. This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
influence: Average. This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
impulse: Average. This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.

Fields of Science

engineering and technology

electrical engineering, electronic engineering, information engineering
