"The important thing in life is to have a great aim, and the determination to attain it." - Goethe.
DISCO: Distributed, Sustainable, and Cloud Computing Systems Lab
The DISCO Lab aims to develop an in-depth understanding of distributed, sustainable, and big-data cloud computing and augmented services,
and to develop open-source technologies that enhance system performance, dependability, scalability, and sustainability. The research is supported in part by the National Science Foundation.
The DISCO Lab is located in the new science and engineering building. The server room is furnished
with a datacenter blade facility comprising three racks of HP ProLiant BL460c G6 blade server modules
and a 40 TB HP EVA storage area network with dual 10 Gbps Ethernet and 8 Gbps Fibre Channel/iSCSI
channels. It also houses three APC InRow RP air-cooled cooling units and UPS equipment for a maximum of 40 kW in an n+1 redundancy design.
- Wei's paper "OS-Augmented Oversubscription of Opportunistic Memory with a User-Assisted OOM Killer" was accepted by ACM Middleware 2019 (acceptance rate 24.5%).
- Wei's paper "Pufferfish: Container-driven Elastic Memory Management for Data-intensive Applications" was accepted by ACM SoCC 2019 (acceptance rate 24.7%).
- Eddie's paper "Semantic-aware Workflow Construction and Analysis for Distributed Data Analytic Systems" was accepted by ACM HPDC 2019 (acceptance rate 21%).
- Shaoqi's paper "Scalable Distributed DL Training: Batching Communication and Computation" was accepted by AAAI 2019 (acceptance rate 16.2%).
- Shaoqi's paper "Aggressive Synchronization with Partial Processing for Iterative ML Jobs on Clusters" was accepted by ACM Middleware 2018 (acceptance rate 23%).
- Eddie's paper "Profiling Distributed Systems in Light-weight Virtualized Environments with Logs and Resource Metrics" was accepted by ACM HPDC 2018 (acceptance rate 19.5%).
- Tiago's paper "Reference-distance Eviction and Prefetching for Cache Management in Spark" was accepted by IEEE ICPP 2018 (acceptance rate 28%).
- Wei's paper "Characterizing Scheduling Delay for Low-latency Data Analytic Workloads" was accepted by IEEE IPDPS 2018 (acceptance rate 24.5%).
- A joint paper with Dr. Palden Lama "Performance Isolation of Data-intensive Scale-out Applications in Multi-tenant Clouds" was accepted by IEEE IPDPS 2018 (acceptance rate 24.5%).
- Wei's paper "Preemptive, Low Latency Datacenter Scheduling via Lightweight Virtualization" was accepted by USENIX ATC 2017 (acceptance rate 21%).
- Wei's paper "Addressing Memory Pressure in Data-Intensive Parallel Programs via Container-based Virtualization" was accepted by IEEE ICAC 2017.
- Wei's paper "Addressing Performance Heterogeneity in MapReduce Clusters with Elastic Tasks" was accepted by IEEE IPDPS 2017 (acceptance rate 23%).
- Shaoqi's paper "Network-Adaptive Scheduling of Data-Intensive Parallel Jobs in Clusters" was accepted by IEEE ICAC 2017.
- Dazhao's paper "Adaptive Scheduling of Parallel Jobs in Spark Streaming" was accepted by IEEE INFOCOM 2017 (acceptance rate 21%).
- A joint paper with Dr. Bo Wu "FLEP: Enabling Flexible and Efficient Preemption on GPUs" was accepted by ACM ASPLOS 2017 (acceptance rate 17%).
- Ben Albernathy, PhD student, 2012 - present
- Tiago Perez, PhD student, 2014 - present
- Shaoqi Wang, PhD student, 2015 - present
- AiDi Pi, PhD student, 2016 - present
- Oluwatobi Akanbi, 2016 - present
- Tina Rose, 2017 - present
- Kathir Palaiappan, 2018 - present
- Amy Oh, 2018 - present
Graduated PhD Students
- Palden Lama, PhD in May 2013 (Assistant Professor, UT San Antonio)
- Sireesha Muppala, PhD in May 2013 (Postdoc, UCCS)
- Dennis Ippoliti, PhD in Dec 2013 (Project Manager, Microsoft)
- Yanfei Guo, PhD in May 2015 (Postdoc, Argonne National Lab)
- Dazhao Cheng, PhD in May 2016 (Assistant Professor, UNC Charlotte)
- Jason Upchurch, PhD in May 2016 (Research Fellow, Intel)
- Beaulah Navaman, PhD in Dec 2018 (FedEx)
- Wei Chen, PhD in May 2019 (Research Staff, Nvidia Lab)
Datacenters are evolving to host heterogeneous workloads on shared clusters to reduce operational cost and achieve high resource utilization.
However, it is challenging to schedule heterogeneous workloads with diverse resource requirements and performance constraints on heterogeneous hardware.
Data-parallel processing often suffers from interference and significant memory pressure, resulting in excessive garbage collection and out-of-memory errors that harm application performance and reliability.
Cluster memory management and scheduling remain inefficient, leading to low utilization and poor multi-service support.
Existing approaches focus on either application awareness or operating-system awareness, and thus are not well positioned to bridge the semantic gap between application runtimes and the operating system.
This project aims to improve application performance and cluster efficiency via lightweight virtualization-enabled elastic memory management and cluster scheduling.
It combines system experimentation with rigorous design and analyses to improve performance and efficiency, and tackle memory pressure of data-parallel processing.
Developed system software will be open-sourced, providing opportunities to foster a large ecosystem that spans system software providers and customers.
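To make the idea of elastic memory management concrete, the sketch below shows a minimal resizing policy in the spirit of the project: a container's memory limit grows when usage approaches it (to avoid OOM kills) and shrinks when usage falls well below it (to free memory for co-located workloads). This is an illustrative assumption, not the project's actual algorithm; the function name and thresholds are hypothetical.

```python
def next_limit(current_limit, observed_usage, floor, ceiling,
               grow_factor=1.5, shrink_threshold=0.5):
    """Return a new memory limit (bytes) for a container.

    Hypothetical elastic policy: grow the limit when usage nears it,
    shrink it when usage drops well below it, and clamp the result
    to [floor, ceiling]. In a real system the new limit would be
    applied through a lightweight-virtualization mechanism such as
    a cgroup memory controller.
    """
    if observed_usage > 0.9 * current_limit:
        # Under memory pressure: enlarge the limit, capped at the ceiling.
        return min(int(current_limit * grow_factor), ceiling)
    if observed_usage < shrink_threshold * current_limit:
        # Usage is low: reclaim memory, but never drop below the floor.
        return max(int(current_limit * 0.75), floor)
    return current_limit
```

A cluster scheduler could invoke such a policy periodically per container, using the reclaimed headroom to admit additional opportunistic tasks.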
MapReduce, a parallel and distributed programming model on clusters of commodity hardware, has emerged as the de facto standard for processing
large data sets. Although MapReduce provides a simple and generic interface for parallel programming, it incurs several problems including
low cluster resource utilization, suboptimal scalability and poor multi-tenancy support. This project explores and designs new techniques
that let MapReduce fully exploit the benefits of flexible and elastic resource allocations in the cloud while addressing the overhead and issues
caused by server virtualization. It broadens impact by offering a flexible and cost-effective way to perform big data analytics.
This project also involves industry collaboration and curriculum development, and provides avenues to bring women, minority,
and underrepresented students into research and graduate programs.
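For readers unfamiliar with the programming model the project builds on, the following minimal word-count sketch illustrates the map, shuffle, and reduce phases of MapReduce. It is a single-process illustration only; a real deployment partitions these phases across a cluster of commodity machines.

```python
from collections import defaultdict

def map_phase(document):
    # Map: emit a (word, 1) pair for every word in the input split.
    return [(word, 1) for word in document.split()]

def shuffle(pairs):
    # Shuffle: group intermediate values by key, as the framework
    # does between the map and reduce phases.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    # Reduce: aggregate the values for each key.
    return {key: sum(values) for key, values in groups.items()}

docs = ["the quick brown fox", "the lazy dog"]
pairs = [pair for doc in docs for pair in map_phase(doc)]
counts = reduce_phase(shuffle(pairs))
# counts["the"] == 2; every other word appears once
```

The simplicity of this interface is exactly what makes MapReduce attractive, and also why resource allocation decisions are left to the runtime, which this project targets.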