Tim Kraska

Assistant Professor of Computer Science

Even everyday living, from smartphones and EZ Passes to credit card purchases, now generates a gush of data. Machines for storing it and software for making sense of it may not be keeping up with the petabytes. Tim Kraska is rethinking how and why we use Big Data.

Tim Kraska has seen the future, and it looks an awful lot like Big Data.

“Big Data is for sure the hot area, but not only in computer science. It is hot everywhere,” he said. “That is one of the big differences. Big Data is the next frontier of innovation for everyone.”

That may no longer be news. Scientists at CERN analyzed data by the hundreds of petabytes in their search for the Higgs boson. Even Hollywood producers have entered the petabyte sphere (a petabyte is one billion megabytes) when rendering a movie. The explosion of data has rendered many traditional techniques obsolete.

Sheer size is an obvious problem. “Working on megabytes is really easy, but even working on gigabytes is still hard,” Kraska said. “Petabytes are a whole different story, so that’s the scaling problem. But data is not necessarily so structured anymore. It may consist of images, plain text, videos, audio signals. How to query that data, how to make sense out of it, is another significant problem. We really need to rethink how we use Big Data.”

That has been an organizing principle throughout Kraska’s career. As a graduate student in his native Germany (Westfälische Wilhelms–Universität Münster, Master of Information Science, 2006), he worked on a proposal for continuous XQuery processing that was accepted by the World Wide Web Consortium for the XQuery 1.1 standard. XQuery is a language for querying structured text, virtually anything that is accessible as XML. His Ph.D. work (ETH Zurich, 2010) got him into building large database applications for the cloud. Since March 2010, he has been at the University of California–Berkeley’s AMPLab as a postdoctoral scholar.

Crowd-sourcing is another of his Big Data interests — giving computers access to human computation, effectively turning the human-computer relationship on its head. “There are certain tasks a computer is really, really good at and other tasks that people are really good at,” Kraska said. “For example, it takes an awful lot of work to train a computer to identify a person in an image. Humans, on the other hand, are extremely good and fast at doing that. With crowd-sourcing, a computer can ask certain questions of humans and get the answer. It’s a super-powerful way of including humans in the system for tasks at which humans are particularly good.”

Kraska will begin his work at Brown in January 2013. He was attracted to Brown because of its size and supportive environment and by the quality of students and faculty.

“I have met some of the faculty at conferences, but I hadn’t collaborated with anyone,” he said. “It was funny, though. When I came out to interview, I started a collaboration with one of my interviewers — whether they accepted me or not.”

Kraska’s curriculum vitae includes a note that he is a certified ski instructor. Did anyone tell him that the highest point in Rhode Island is all of 812 feet? Is there a chance he could be disappointed with the Ocean State?

“No, no. They told me all about that; I am prepared,” Kraska said. “I stopped teaching some time ago, but maybe there will be a chance to organize a ski seminar to the Rocky Mountains someday.”