Project: General Compression Techniques for Small, Frequent Messages

Groupware communicates by sending messages across the network, and groupware programmers use a variety of formats for these messages, such as XML, plain text, or serialized objects. Although these formats have many advantages, they are often so verbose that they overload the system's network resources. Groupware programmers could improve efficiency by using more compact formats, but that efficiency comes at the cost of increased complexity, reduced convenience, and reduced readability. In this paper we propose an alternative approach to improving efficiency: an automatic compression system that transparently shrinks messages written in verbose formats. Our general message compressor (GMC) automatically finds and removes redundancy in message streams, without any knowledge of the contents or structure of the messages, and without requiring programmers to change the way they work.

Traditional compression algorithms, such as zlib, focus on intra-message redundancy: repetition within a single message. That approach works well for large messages, but small messages contain too little intra-message redundancy to be compressed this way. However, small messages do tend to have a high degree of inter-message redundancy, because consecutive messages in a stream often share most of their content. Our techniques focus on exploiting this inter-message redundancy to reduce message size drastically and quickly.
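
The difference between the two kinds of redundancy can be illustrated with a small sketch. This is not the GMC algorithm itself; it only assumes that sender and receiver both remember the previous message, and it uses zlib's preset-dictionary feature (the zdict argument in Python's zlib module) to stand in for inter-message compression. The message contents and the function names are hypothetical, chosen for illustration.

```python
import zlib

def compress_alone(message: bytes) -> bytes:
    """Compress one message using only its own (intra-message) redundancy."""
    return zlib.compress(message)

def compress_with_history(message: bytes, previous: bytes) -> bytes:
    """Compress a message using the previous message as a preset dictionary.

    Assumption: the receiver holds the same `previous` bytes, so strings
    shared between the two messages can be sent as short back-references.
    """
    compressor = zlib.compressobj(zdict=previous)
    return compressor.compress(message) + compressor.flush()

def decompress_with_history(payload: bytes, previous: bytes) -> bytes:
    """Invert compress_with_history using the same previous message."""
    decompressor = zlib.decompressobj(zdict=previous)
    return decompressor.decompress(payload) + decompressor.flush()

# Two hypothetical groupware updates that differ only in their coordinates.
previous_msg = b'<move user="alice" object="card17" x="120" y="340"/>'
current_msg = b'<move user="alice" object="card17" x="124" y="338"/>'

alone = compress_alone(current_msg)
shared = compress_with_history(current_msg, previous_msg)
restored = decompress_with_history(shared, previous_msg)

print("original bytes:       ", len(current_msg))
print("compressed alone:     ", len(alone))
print("compressed w/ history:", len(shared))
```

For a message this small, compressing it alone gains little or nothing, while the version that can refer back to the previous message is noticeably shorter. The design cost is shared state: both ends must agree on which earlier message serves as the dictionary, so lost or reordered messages need to be handled by whatever transport carries the stream.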

Participants

Carl Gutwin
University of Saskatchewan
Mark Watson
Institute Without Boundaries
Jeff Dyck
University of Saskatchewan

Publications

Improving Network Efficiency in Real-Time Groupware with General Message Compression
Gutwin, C., Fedak, C., Watson, M., Dyck, J., Bell, T. (2006), Proceedings of the ACM Conference on Computer-Supported Cooperative Work, 119-128.