Tuesday, June 8, 2010

Range Searching Over Compressed Kinetic Data

The ESA 2010 accepted papers are up. My advisor, David Mount, and I have a paper accepted there - "Spatio-temporal Range Searching Over Compressed Kinetic Sensor Data." There is a (not yet revised and camera ready) version on my website. I'll also be presenting the paper at MASSIVE, a workshop held in conjunction with SoCG, next week.

The motivation of our paper (and generally, of much of my thesis) is to provide an observation-based framework for kinetic data. An earlier post talks more about why we'd want to do that - essentially the goal is to model scientifically collected data from moving objects. In the previous paper, we presented the framework and an accompanying compression scheme. In this paper, we're solving the problem that occurs when you realize that you've compressed all your data but still want to be able to query it (ideally, without decompressing it). We consider spatio-temporal queries, essentially the intersection of a temporal range and a spatial one - answering questions like "how many cars were observed on my street yesterday during rush hour?"

The abstract:

As sensor networks increase in size and number, efficient techniques are required to process the very large data sets that they generate. Frequently, sensor networks monitor objects in motion within their vicinity; the data associated with the movement of these objects is known as kinetic data. In an earlier paper we introduced an algorithm which, given a set of sensor observations, losslessly compresses this data to a size that is within a constant factor of the asymptotically optimal joint entropy bound. In this paper we present an e cient algorithm for answering spatio-temporal range queries. Our algorithm operates on a compressed representation of the data, without the need to decompress it. We analyze the efficiency of our algorithm in terms of a natural measure of information content, the joint entropy of the sensor outputs. We show that with space roughly equal to entropy, queries can be answered in time that is roughly logarithmic in entropy. In addition, we show experimentally that on real-world data our range searching structures use less space and have faster query times than the naive versions. These results represent the first solutions to range searching problems over compressed kinetic sensor data.

No comments: