XML-Comprifex - Fast XML Compression Program
XML-comprifex is a lightweight, fast compression program written in C. It relies on only one significant system call, which determines the file size so that the required memory can be allocated, so porting it to another OS should not be a problem.
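For anyone porting the program, the file-size lookup could be replaced with standard C library calls. The sketch below is not taken from the XML-comprifex sources; the function name file_size_portable is hypothetical. It assumes that seeking to the end of a binary stream works on the target platform, which is common in practice but not strictly guaranteed by the C standard.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: returns the size of the file in bytes,
 * or -1 if the file cannot be opened or measured. */
long file_size_portable(const char *path)
{
    FILE *f;
    long size;

    assert(path != NULL);

    f = fopen(path, "rb");            /* binary mode: no newline translation */
    if (f == NULL)
        return -1;

    if (fseek(f, 0L, SEEK_END) != 0) {
        fclose(f);
        return -1;
    }

    size = ftell(f);                  /* offset at end of file; -1 on failure */
    fclose(f);
    return size;
}
```

The returned value can then be passed to malloc to reserve a buffer for the whole input file.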
Important: This program works best on XML files without attributes, but it also compresses files that contain attributes.
Download
xml_comprifex_en.c (Source code of the encoder)
xml_comprifex_en_32 (32 bit linux executable encoder)
xml_comprifex_en_64 (64 bit linux executable encoder)
xml_comprifex_de.c (Source code of the decoder)
xml_comprifex_de_32 (32 bit linux executable decoder)
xml_comprifex_de_64 (64 bit linux executable decoder)
Compression Ideas
The XML file is split into three parts: a binary part that records where tags open and close, a tag part that holds all tag names, and a text part that holds the text between tags.
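The three-way split can be illustrated with a small scanner. This is only a sketch under simplifying assumptions, not the author's code: it handles attribute-free tags only, ignores comments, declarations and self-closing tags, stores closing tags as a flag without a name, and uses fixed-size buffers. The names xml_parts, split_xml and MAX_PART are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_PART 256  /* hypothetical fixed capacity, for the sketch only */

struct xml_parts {
    unsigned char flags[MAX_PART]; /* 1 = opening tag, 0 = closing tag */
    size_t n_flags;
    char tags[MAX_PART];           /* opening tag names, each NUL-terminated */
    size_t tags_len;
    char text[MAX_PART];           /* text runs, each NUL-terminated */
    size_t text_len;
};

/* Splits attribute-free XML into the three parts.
 * Returns 0 on success, -1 on malformed input or buffer overflow. */
int split_xml(const char *xml, struct xml_parts *p)
{
    const char *s = xml;

    assert(xml != NULL && p != NULL);
    p->n_flags = p->tags_len = p->text_len = 0;

    while (*s != '\0') {
        if (*s == '<') {
            const char *end = strchr(s, '>');
            if (end == NULL || p->n_flags == MAX_PART)
                return -1;
            if (s[1] == '/') {
                p->flags[p->n_flags++] = 0;          /* closing tag */
            } else {
                size_t len = (size_t)(end - s - 1);  /* name length */
                if (p->tags_len + len + 1 > MAX_PART)
                    return -1;
                p->flags[p->n_flags++] = 1;          /* opening tag */
                memcpy(p->tags + p->tags_len, s + 1, len);
                p->tags_len += len;
                p->tags[p->tags_len++] = '\0';
            }
            s = end + 1;
        } else {
            size_t len = strcspn(s, "<");            /* text up to next tag */
            if (p->text_len + len + 1 > MAX_PART)
                return -1;
            memcpy(p->text + p->text_len, s, len);
            p->text_len += len;
            p->text[p->text_len++] = '\0';
            s += len;
        }
    }
    return 0;
}
```

For the input <a><b>hi</b></a> this yields the flags 1, 1, 0, 0, the tag names a and b, and the single text run hi. Each part can then be compressed with a method suited to its content.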
The binary part is compressed with run-length encoding followed by a variation of Huffman encoding.
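As an illustration of the run-length step, the following sketch encodes a sequence of open/close flags as alternating run lengths. The layout is an assumption made for the example (one flag per byte on input, one run length per byte on output, first run counting zeros) and may differ from the format XML-comprifex actually writes. The function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Encodes 0/1 flags as alternating run lengths, one byte per run.
 * The first run counts zeros and may be 0 if the input starts with a 1.
 * Runs longer than 255 are split by a zero-length run of the other value.
 * Returns the number of runs written, or 0 if out is too small
 * (an empty input also yields 0). Every flag must be 0 or 1. */
size_t rle_encode_flags(const unsigned char *flags, size_t n,
                        unsigned char *out, size_t cap)
{
    size_t written = 0;
    size_t i = 0;
    unsigned char current = 0;       /* value counted by the current run */

    assert(out != NULL);
    while (i < n) {
        size_t run = 0;
        while (i < n && flags[i] == current && run < 255) {
            run++;
            i++;
        }
        if (written == cap)
            return 0;
        out[written++] = (unsigned char)run;
        current = (unsigned char)!current;
    }
    return written;
}

/* Inverse of rle_encode_flags. Returns the number of flags written,
 * or 0 if the output buffer is too small. */
size_t rle_decode_flags(const unsigned char *runs, size_t n_runs,
                        unsigned char *flags, size_t cap)
{
    size_t written = 0;
    unsigned char current = 0;

    assert(flags != NULL);
    for (size_t r = 0; r < n_runs; r++) {
        if (runs[r] > cap - written)
            return 0;
        for (unsigned k = 0; k < runs[r]; k++)
            flags[written++] = current;
        current = (unsigned char)!current;
    }
    return written;
}
```

With this layout the flags 1, 1, 0, 1 become the runs 0, 2, 1, 1. Run-length encoding pays off when the flag sequence contains long stretches of the same value.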
Tags that occur more than once are stored by address. The tag names themselves are translated with a static ASCII table so that each character takes fewer bits. The result is then compressed with the same Huffman variation.
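The idea of a static table that shortens each character can be sketched as follows. The 64-symbol alphabet and the 6-bit code width are assumptions chosen for the example; the table actually used by XML-comprifex is not described here. The names TAG_ALPHABET and pack_tag_name are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical 64-symbol alphabet; a character's index is its 6-bit code. */
static const char TAG_ALPHABET[] =
    "abcdefghijklmnopqrstuvwxyz"
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "0123456789_-";

/* Packs name into out using 6 bits per character, most significant bit
 * first; the last byte is padded with zero bits.
 * Returns the number of bytes written, or 0 if a character is not in
 * the alphabet or out is too small. */
size_t pack_tag_name(const char *name, unsigned char *out, size_t cap)
{
    size_t len, written = 0;
    unsigned acc = 0;    /* bit accumulator */
    int nbits = 0;       /* number of valid bits in acc */

    assert(name != NULL && out != NULL);
    len = strlen(name);
    if ((len * 6 + 7) / 8 > cap)
        return 0;

    for (size_t i = 0; i < len; i++) {
        const char *pos = strchr(TAG_ALPHABET, name[i]);
        if (pos == NULL)
            return 0;                        /* character not in table */
        acc = (acc << 6) | (unsigned)(pos - TAG_ALPHABET);
        nbits += 6;
        while (nbits >= 8) {                 /* flush complete bytes */
            nbits -= 8;
            out[written++] = (unsigned char)((acc >> nbits) & 0xFFu);
            acc &= (1u << nbits) - 1u;       /* keep only unflushed bits */
        }
    }
    if (nbits > 0)                           /* pad the final partial byte */
        out[written++] = (unsigned char)((acc << (8 - nbits)) & 0xFFu);
    return written;
}
```

With this table a four-character tag name such as item occupies 3 bytes instead of 4, which is a 25% saving before any Huffman pass is applied.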
Text is encoded twice with the Huffman variation. If the file is so small that this encoding brings no benefit, the text is instead translated with a static ASCII table, like the tag part.
Every Huffman pass is optional: the program calculates the resulting size before compressing, and if that size is too large, the Huffman step is skipped.
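A size check of this kind can be done without building the code tree. The sketch below uses textbook Huffman coding, not the author's variation: the total payload length in bits equals the sum of the weights of all merged nodes. The table_bytes parameter stands for the cost of storing the code table and is an assumption, as are the function names.

```c
#include <assert.h>
#include <stddef.h>

/* Returns the payload size in bits of a textbook Huffman coding of data.
 * A single distinct symbol is charged 1 bit per occurrence. */
unsigned long huffman_payload_bits(const unsigned char *data, size_t n)
{
    unsigned long w[256] = {0};
    unsigned long total = 0;
    size_t live = 0;

    assert(data != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        w[data[i]]++;
    for (size_t s = 0; s < 256; s++)         /* compact non-zero weights */
        if (w[s] != 0)
            w[live++] = w[s];

    if (live == 1)
        return w[0];

    while (live > 1) {
        size_t a = 0, b = 1;                 /* a: smallest, b: second */
        if (w[b] < w[a]) { a = 1; b = 0; }
        for (size_t i = 2; i < live; i++) {
            if (w[i] < w[a]) { b = a; a = i; }
            else if (w[i] < w[b]) { b = i; }
        }
        unsigned long merged = w[a] + w[b];
        total += merged;                     /* each merge adds one bit per symbol below it */
        w[a] = merged;
        w[b] = w[live - 1];                  /* drop slot b by moving the last entry into it */
        live--;
    }
    return total;
}

/* Returns 1 if payload plus table is smaller than the raw data, else 0. */
int huffman_is_worthwhile(const unsigned char *data, size_t n,
                          size_t table_bytes)
{
    unsigned long bytes = (huffman_payload_bits(data, n) + 7) / 8;
    return bytes + table_bytes < n;
}
```

For the seven bytes aaaabbc the payload is 10 bits, so with any realistic table overhead the encoder would leave such a short input untouched.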
License GPLv3
XML-comprifex is a small and fast program for compressing XML files.
Copyright (C) 2009 scosu (http://allfex.org)
This program is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program. If not, see <http://www.gnu.org/licenses/>.