While the main result of the paper is a compression algorithm, the point of the paper is broader than that. When reasoning about kinetic data (the data generated by a set of moving points), the current accepted model is the kinetic data structures model that assumes that points are given flight plans in the form of algebraic expressions. This works well when reasoning theoretically (assume the points follow predetermined algebraic curves...), when modeling physical properties, or possibly when dealing with actual planes that really do need to give us their flight plans in advance. But what about if we want to do statistical analyses on collected data to determine what motion trends the objects follow? The cell phone users, migratory birds, enemy tanks, whatever you like to think about, aren't going to tell us how and when and where they're going to move. Many of these observations are collected using a sensor network, so it makes sense to design algorithms that work over these networks using the possible minimal amount of information available - the number of objects each sensor sees at each time instant. The framework we introduce assumes nothing about the motion of the objects or our advance knowledge of their plans. Instead, we analyze our results with respect to the entropy of this information. The ultimate goal, then, is that algorithms designed over this framework would perform better on "neat" motion, say all the birds following a straight line from Canada to Mexico, while algorithms on "messy" data would take longer. This paper takes the first step in that direction, giving a compression algorithm that compresses the sensor observations to within a constant factor of the optimal joint entropy bound.
For those interested in more precision, the abstract is included below:
We introduce a framework for storing and processing kinetic data observed by sensor networks. These sensor networks generate vast quantities of data, which motivates a significant need for data compression. We are given a set of sensors, each of which continuously monitors some region of space. We are interested in the kinetic data generated by a finite set of objects moving through space, as observed by these sensors. Our model relies purely on sensor observations; it allows points to move freely and requires no advance notification of motion plans. Sensor outputs are represented as random processes, where nearby sensors may be statistically dependent. We model the local nature of sensor networks by assuming that two sensor outputs are statistically dependent only if the two sensors are among the k nearest neighbors of each other. We present an algorithm for the lossless compression of the data produced by the network. We show that, under the statistical dependence and locality assumptions of our framework, asymptotically this compression algorithm encodes the data to within a constant factor of the information-theoretic lower bound optimum dictated by the joint entropy of the system.
Also, a reminder - the SODA abstract deadline is today at 4:59pm EDT.