CSR: Moving MapReduce into the Cloud: Flexibility, Efficiency, and Elasticity (NSF CNS-1422119, 10/2014-09/2018) 

Project description and goals

MapReduce, a parallel and distributed programming model for clusters of commodity hardware, has emerged as the de facto standard for processing large data sets. Although MapReduce provides a simple and generic interface for parallel programming, it suffers from several problems, including low cluster resource utilization, suboptimal scalability, and poor multi-tenancy support. This project explores and designs new techniques that let MapReduce fully exploit the benefits of flexible and elastic resource allocation in the cloud while addressing the overhead and other issues caused by server virtualization. Its broader impact lies in enabling a flexible and cost-effective way to perform big data analytics. The project also involves industry collaboration and curriculum development, and it provides more avenues to bring women, minority, and other underrepresented students into research and graduate programs.

The research project is executed in a cutting-edge lab located in the new science and engineering building. The server room is furnished with an HP data center blade facility that has three racks of HP ProLiant BL460C G6 blade server modules and a 40 TB HP EVA storage area network with 10 Gbps Ethernet and 8 Gbps Fibre/iSCSI dual channels. It has three APC InRow RP Air-Cooled units and UPS equipment rated for a maximum of 40 kW in an N+1 redundancy design.


Participants


Project-sponsored Publications

Acknowledgements

This material is based upon work supported by the National Science Foundation under Grant CNS-1422119. Any opinions, findings and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation (NSF).