Data in the Cloud: New challenges or more of the same?

On Wednesday the 25th of October at 15.00, Daisy Innovation’s Business Intelligence Technology (BIT) Network holds a new meeting. This time, Professor Divyakant Agrawal from the University of California at Santa Barbara gives the talk “Data in the Cloud: New challenges or more of the same?”.

Over the past two decades, database and systems researchers have made significant advances in the development of algorithms and techniques to provide data management solutions that carefully balance the three major requirements when dealing with critical data: high availability, scalability, and data consistency. However, over the past few years the data requirements of Internet-scale enterprises that provide services and cater to millions of users have been unprecedented in terms of availability and scalability. Currently proposed solutions to scalable data management, driven primarily by prevalent application requirements, significantly downplay the data consistency requirements and instead focus on very high availability and almost unlimited scalability to support data-rich applications for millions to tens of millions of users. In particular, the “newer” data management systems limit consistent access to the granularity of single objects, rows, or keys, thereby significantly trading off consistency in order to achieve very high scalability and availability. But the growing popularity of “cloud computing”, the resulting shift of a large number of Internet applications to the cloud, and the quest to provide data management services in the cloud have opened up the challenge of designing data management systems that provide consistency guarantees at a granularity which goes beyond single rows and keys. In this talk, we analyze the design choices that allowed modern scalable data management systems to achieve orders of magnitude higher levels of scalability compared to traditional databases. With this understanding, we highlight some design principles for data management systems providing scalable and consistent data management as a service in the cloud. We conclude the talk by presenting results from two prototype systems which strike a middle ground between the two radically different data management architectures: traditional database management systems where the data is treated as a “whole” versus modern key-value stores where data is treated as a collection of independent “granules”.

Speaker Biography:
Dr. Divy Agrawal serves on the faculty of Computer Science at the University of California at Santa Barbara. His research interests are in the areas of distributed systems, databases, and large-scale information systems such as data warehouses, digital libraries, and other data/information rich environments.

The talk takes place in room 0.2.13.