
    A Schema-Based Approach to GML Compression

    • Abstract: GML (Geography Markup Language), an XML-based language for modeling geographic data, has become a de facto encoding standard for geospatial data. GML documents are usually extremely verbose: structures such as tags and attribute names repeat very frequently, which is the price of GML's self-describing nature. They are also data-rich, containing many space-consuming textual items, including attribute values and element contents. Worse, they often hold large amounts of high-precision spatial coordinate data in text form, which occupies more storage than the same data in binary form. Storing and transferring GML documents is therefore very costly. This paper proposes an effective schema-based approach to GML compression. The compressor first infers a schema from the document and validates the document against that inferred schema. It then encodes the state transition paths of the tree automaton as bits, compresses the coordinate data with delta encoding, and finally passes the inferred schema and all encodings to a general-purpose text compressor. Experiments on real GML documents show that the proposed compressor achieves better compression ratios than typical general-purpose text compressors (gzip and PPMD), state-of-the-art XML compressors (XMill, XMLPPM, and XWRT), and the existing GML compressor GPress.
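
    The delta encoding step mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the function names `delta_encode` and `delta_decode`, the fixed precision of three decimal places, and the sample values are all assumptions made for illustration. The idea is that neighbouring coordinates in a geometry tend to be close together, so storing the first value plus successive differences yields small integers that a general-purpose compressor handles well.

```python
from decimal import Decimal


def delta_encode(coords, decimals=3):
    """Turn decimal coordinate strings into [first, diff, diff, ...].

    Assumes every input has at most `decimals` fractional digits;
    extra digits would be truncated (an assumption of this sketch).
    """
    scale = 10 ** decimals
    # Decimal keeps the scaling exact, unlike binary floats.
    ints = [int(Decimal(c) * scale) for c in coords]
    if not ints:
        return []
    return [ints[0]] + [b - a for a, b in zip(ints, ints[1:])]


def delta_decode(deltas, decimals=3):
    """Invert delta_encode, restoring fixed-precision decimal strings."""
    scale = 10 ** decimals
    step = Decimal(1).scaleb(-decimals)  # e.g. Decimal("0.001")
    out = []
    total = 0
    for d in deltas:
        total += d  # running sum rebuilds the scaled integer
        out.append(str((Decimal(total) / scale).quantize(step)))
    return out


# Hypothetical x-coordinates of consecutive points on one line string.
xs = ["121.473", "121.475", "121.480", "121.479"]
encoded = delta_encode(xs)       # [121473, 2, 5, -1]
restored = delta_decode(encoded)
```

    After the first value, the encoded sequence contains only small differences, which is what makes the later general-purpose compression stage more effective on coordinate-heavy documents.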

       
