CAREER: Building Resilient Internet Services with Learning and Control (NSF CNS-0844983, 09/2009-08/2014)

Project description and goals

Due to the dynamic nature and unprecedented scale of the Internet, Internet services pose scalability, reliability, and availability challenges to the underlying networked systems. This CAREER project, directed by Prof. Xiaobo Zhou, concentrates on building Internet services that are resilient to those challenges using machine learning and control techniques. Internet services are built on cluster-based computer systems that keep growing in scale and complexity. Such systems have become so complicated that even understanding the dynamic behavior of the entire system is a major challenge. The investigators take an analytical and organized approach to designing an autonomous software infrastructure on networked systems for building resilient Internet services. The project builds empirical models using statistical learning to help overcome the challenges of scale and complexity in networked systems. It designs coordinated admission control and capacity planning algorithms that provide end-to-end quality of service on multi-tier clusters. Model-independent control techniques are used together with the empirical models to allocate resources and to dynamically reconfigure the system for performance optimization. The project develops performance differentiation, isolation, and self-adaptive reconfiguration capabilities to enhance system reliability and availability. It broadens the research impact by developing a testbed in a data center lab to demonstrate the orchestration of the designed techniques for automated arrangement, coordination, and management of complex computer systems, middleware, and services.

The research project is executed in a cutting-edge lab located in the new science and engineering building. The server room is furnished with an HP data center blade facility that has three racks of HP ProLiant BL460C G6 blade server modules and a 40 TB HP EVA storage area network with 10 Gbps Ethernet and 8 Gbps Fibre/iSCSI dual channels. It has three APC InRow RP air-cooled units and UPS equipment for a maximum of 40 kW in an N+1 redundancy design.

This five-year project was kicked off in September 2009.



Project-sponsored Publications (Download Bibtex)


This material is based upon work supported by the National Science Foundation under Grant CNS-0844983. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation (NSF).
