Mobile-aware Cloud Resource Management

Modern mobile applications increasingly rely on cloud data centers for both compute and storage capabilities. To guarantee performance for cloud customers, cloud platforms usually provide dynamic provisioning approaches that adjust resources to meet the demand of fluctuating workloads. Modern mobile workloads, however, exhibit three distinct characteristics that make current provisioning approaches less effective: a new type of spatial fluctuation, shorter time scales of fluctuation, and more frequent fluctuation. The MOBILESCALE project proposes new research on resource management for mobile workloads, which differ significantly from traditional cloud workloads.

Efficient Mobile Deep Inference

An ever-increasing number of mobile applications leverage deep learning models to provide novel and useful features, such as real-time language translation and object recognition. However, the current mobile inference paradigm requires application developers to statically trade off inference accuracy against inference speed at development time. As a result, the mobile user experience is negatively impacted under dynamic inference scenarios and heterogeneous device capacities. The MODI project proposes new research in designing and implementing a mobile-aware deep inference platform that combines innovations in both algorithm and system optimizations.

Cloud Cost Reduction

One key benefit of cloud platforms is the ability to acquire resources on demand to handle peak workload. For enterprises with existing IT infrastructure, it is not clear how to transition to cloud platforms cost-effectively. I built a cloud bursting system that answers questions such as when to move workload from a private data center to public clouds and how much to move, and that automates the process. This system serves as a building block for investigating problems such as pooling cloud resources.
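To make the "when and how much" question concrete, here is a minimal sketch of a threshold-based bursting decision. It is an illustration under simple assumptions, not the policy used in the system described above: the function name `plan_burst`, the `headroom` parameter, and the idea of measuring load in requests per second are all hypothetical.

```python
import math


def plan_burst(demand_rps, private_capacity_rps, per_server_rps, headroom=0.8):
    """Decide how much load to move to the public cloud (illustrative only).

    Assumption: the private data center is kept at or below `headroom`
    of its capacity, and anything above that is sent to public servers
    that each handle `per_server_rps` requests per second.
    Returns (overflow_rps, public_servers_needed).
    """
    usable = private_capacity_rps * headroom
    # "When": burst only if demand exceeds the usable private capacity.
    overflow = max(0.0, demand_rps - usable)
    # "How much": round up to whole public servers for the overflow.
    servers = math.ceil(overflow / per_server_rps) if overflow > 0 else 0
    return overflow, servers


if __name__ == "__main__":
    overflow, servers = plan_burst(900, 1000, 150, headroom=0.5)
    print(f"burst {overflow} rps to {servers} public servers")
```

A real system would also have to account for data transfer costs, provisioning delay, and workload prediction, none of which this sketch models.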

Of course, as cloud customers (people who host cloud services of some sort), we constantly face cost and performance trade-offs. Can we get away with 10 servers, 100 servers, or even 1000 servers? But what about our monthly bill? Yes, budgets are a real-world problem. To help out, I did some work to exploit a type of very cheap but volatile cloud resource, e.g., spot servers. The high-level idea is to provide system- and application-level mechanisms that allow customers to run interactive, batch, and batch-interactive applications on spot servers for as long as possible. These mechanisms are guided by our risk-aware, cost-effective policies.
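As a rough illustration of what a risk-aware, cost-effective choice between spot and on-demand servers could look like, consider the sketch below. The model is a deliberate simplification and an assumption of this example, not the policy from the work described above: each spot hour is revoked independently with probability `revoke_prob`, a revoked hour is paid for but must be redone, and `max_revoke_prob` is a hypothetical risk tolerance set by the customer.

```python
def choose_server(hours, spot_price, on_demand_price, revoke_prob,
                  max_revoke_prob=0.2):
    """Pick "spot" or "on_demand" for a job of `hours` hours (illustrative).

    Assumed model: a spot hour succeeds with probability 1 - revoke_prob,
    so finishing `hours` of work takes hours / (1 - revoke_prob) paid
    spot hours in expectation.
    Returns (choice, expected_cost).
    """
    if not 0 <= revoke_prob < 1:
        raise ValueError("revoke_prob must be in [0, 1)")
    on_demand_cost = on_demand_price * hours
    expected_spot_cost = spot_price * hours / (1 - revoke_prob)
    # Risk-aware: reject spot if revocations are too likely.
    # Cost-effective: use spot only if it is cheaper in expectation.
    if revoke_prob <= max_revoke_prob and expected_spot_cost < on_demand_cost:
        return "spot", expected_spot_cost
    return "on_demand", on_demand_cost


if __name__ == "__main__":
    print(choose_server(10, 0.25, 1.0, 0.5, max_revoke_prob=0.5))
```

An interactive application would likely use a much lower `max_revoke_prob` than a batch job, since a revocation interrupts users rather than just delaying completion.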