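# Data Compression Algorithms: Introduction, Java Implementation, and Comparison

**Repository Path**: BTLimt/compress

## Basic Information

- **Project Name**: 数据压缩算法的介绍-Java实现-对比
- **Description**: When RPC messages are large, the payload is usually compressed before transmission. Depending on the scenario, commonly used algorithms include Zlib, Gzip, Bzip2, Deflate, Lz4, Lzo, and Snappy. This project introduces each algorithm's concepts, strengths and weaknesses, and Java implementation, and compares their performance in a simulated test.
- **Primary Language**: Java
- **License**: Apache-2.0
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 7
- **Created**: 2022-10-31
- **Last Updated**: 2022-10-31

## Categories & Tags

**Categories**: Uncategorized
**Tags**: None

## README

## **Data Compression Algorithms: Introduction, Java Implementation, and Comparison**

### 0 Resource Project Collection

[Resource project collection index](https://gitee.com/javanoteany/public_code.git)

### 1 Preface

In RPC communication, large message payloads are compressed before they are sent. Depending on the scenario, commonly used algorithms include Zlib, Gzip, Bzip2, Deflate, Lz4, Lzo, and Snappy. The sections below introduce each algorithm, show Java implementation code, and compare the algorithms in a simulated performance test.

### 2 Compression Schemes

+ **Zlib**

zlib is a general-purpose lossless compression library written by Jean-loup Gailly and Mark Adler. Its data format wraps a DEFLATE stream in a small header and an Adler-32 checksum. The JDK exposes it through the `Deflater` and `Inflater` classes, which produce and consume the zlib format by default.

**Core code**:

```java
// Uses java.util.zip.Deflater, java.util.zip.Inflater and java.io.ByteArrayOutputStream.
@Override
public byte[] compress(byte[] data) throws IOException {
    byte[] output;
    Deflater compresser = new Deflater();
    compresser.setInput(data);
    compresser.finish();
    ByteArrayOutputStream bos = new ByteArrayOutputStream(data.length);
    try {
        byte[] buf = new byte[1024];
        while (!compresser.finished()) {
            int i = compresser.deflate(buf);
            bos.write(buf, 0, i);
        }
        output = bos.toByteArray();
    } catch (Exception e) {
        // Fall back to the uncompressed input if compression fails.
        output = data;
        e.printStackTrace();
    } finally {
        // Release the native zlib resources even if the loop throws.
        compresser.end();
    }
    return output;
}

@Override
public byte[] uncompress(byte[] data) throws IOException {
    byte[] output;
    Inflater decompresser = new Inflater();
    decompresser.setInput(data);
    ByteArrayOutputStream o = new ByteArrayOutputStream(data.length);
    try {
        byte[] buf = new byte[1024];
        while (!decompresser.finished()) {
            int i = decompresser.inflate(buf);
            if (i == 0 && (decompresser.needsInput() || decompresser.needsDictionary())) {
                // Truncated or dictionary-based input would otherwise loop forever.
                throw new IOException("incomplete zlib stream");
            }
            o.write(buf, 0, i);
        }
        output = o.toByteArray();
    } catch (Exception e) {
        // Fall back to returning the input unchanged.
        output = data;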
        e.printStackTrace();
    } finally {
        // Release the native zlib resources on every path.
        decompresser.end();
    }
    return output;
}
```

**Test results**:

![zlib](images/zlib.png)

+ **Gzip**

gzip is still built on DEFLATE; it only adds a file header and trailer around the DEFLATE data. The JDK supports gzip through the `GZIPOutputStream` and `GZIPInputStream` classes. `GZIPOutputStream` extends `DeflaterOutputStream`, `GZIPInputStream` extends `InflaterInputStream`, and the `writeHeader` and `writeTrailer` methods can be found in the source code.

**Core code**:

```java
@Override
public byte[] compress(byte[] data) throws IOException {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    // Closing the stream writes the gzip trailer, so close it before reading the bytes.
    try (GZIPOutputStream gzip = new GZIPOutputStream(out)) {
        gzip.write(data);
    }
    return out.toByteArray();
}

@Override
public byte[] uncompress(byte[] data) throws IOException {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    // Let I/O errors propagate instead of returning a partial result.
    try (GZIPInputStream ungzip = new GZIPInputStream(new ByteArrayInputStream(data))) {
        byte[] buffer = new byte[2048];
        int n;
        while ((n = ungzip.read(buffer)) >= 0) {
            out.write(buffer, 0, n);
        }
    }
    return out.toByteArray();
}
```

**Test results**:

![gzip](images/gzip.png)

+ **Bzip2**

bzip2 is a data compression algorithm and program developed by Julian Seward and released under a free/open-source license. Seward first publicly released bzip2 version 0.15 in July 1996. Over the following years the tool became more stable and more popular, and Seward released version 1.0 in late 2000. bzip2 compresses more effectively than traditional gzip, but it is slower.

**Core code**:

```java
// BZip2CompressorOutputStream / BZip2CompressorInputStream come from Apache Commons Compress.
@Override
public byte[] compress(byte[] data) throws IOException {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    BZip2CompressorOutputStream bcos = new BZip2CompressorOutputStream(out);
    bcos.write(data);
    // close() flushes the final block; call it before reading the bytes.
    bcos.close();
    return out.toByteArray();
}

@Override
public byte[] uncompress(byte[] data) throws IOException {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    ByteArrayInputStream in = new ByteArrayInputStream(data);
    try {
        @SuppressWarnings("resource")
        BZip2CompressorInputStream bzip2In = new BZip2CompressorInputStream(in);
        byte[] buffer = new byte[2048];
        int n;
        while ((n = bzip2In.read(buffer)) >= 0)
        {
            out.write(buffer, 0, n);
        }
    } catch (IOException e) {
        e.printStackTrace();
    }
    return out.toByteArray();
}
```

**Test results**:

![bzip2](images/bzip2.png)

+ **Deflater**

DEFLATE is a lossless data compression algorithm that combines LZ77 with Huffman coding. Source code for DEFLATE compression and decompression is available in zlib, a free, general-purpose compression library (official site: http://www.zlib.net/). The JDK supports zlib through the `Deflater` compression class and the `Inflater` decompression class, both of which are backed by native methods.

**Core code**:

```java
@Override
public byte[] compress(byte[] data) throws IOException {
    ByteArrayOutputStream bos = new ByteArrayOutputStream();
    // Level 1 (Deflater.BEST_SPEED) trades compression ratio for speed.
    Deflater compressor = new Deflater(Deflater.BEST_SPEED);
    try {
        compressor.setInput(data);
        compressor.finish();
        final byte[] buf = new byte[2048];
        while (!compressor.finished()) {
            int count = compressor.deflate(buf);
            bos.write(buf, 0, count);
        }
    } finally {
        compressor.end();
    }
    return bos.toByteArray();
}

@Override
public byte[] uncompress(byte[] data) throws IOException {
    ByteArrayOutputStream bos = new ByteArrayOutputStream();
    Inflater decompressor = new Inflater();
    try {
        decompressor.setInput(data);
        final byte[] buf = new byte[2048];
        while (!decompressor.finished()) {
            int count = decompressor.inflate(buf);
            if (count == 0 && (decompressor.needsInput() || decompressor.needsDictionary())) {
                // Truncated input would otherwise loop forever.
                throw new IOException("incomplete deflate stream");
            }
            bos.write(buf, 0, count);
        }
    } catch (DataFormatException e) {
        // Report corrupt input instead of returning a partial result.
        throw new IOException(e);
    } finally {
        decompressor.end();
    }
    return bos.toByteArray();
}
```

**Test results**:

![deflater](images/deflater.png)

+ **Lz4**

LZ4 is a lossless data compression algorithm that focuses on compression and decompression speed.

**Core code**:

```java
@Override
public byte[] compress(byte[] data) throws IOException {
    LZ4Factory factory = LZ4Factory.fastestInstance();
    ByteArrayOutputStream byteOutput = new ByteArrayOutputStream();
    LZ4Compressor compressor = factory.fastCompressor();
    // Block size is 2048 bytes; every block carries its own header, so small blocks add overhead.
    try (LZ4BlockOutputStream compressedOutput = new LZ4BlockOutputStream(byteOutput, 2048, compressor)) {
        compressedOutput.write(data);
    }
    return byteOutput.toByteArray();
}

@Override
public byte[] uncompress(byte[] data) throws IOException {
    LZ4Factory factory = LZ4Factory.fastestInstance();
    ByteArrayOutputStream baos = new ByteArrayOutputStream();
    LZ4FastDecompressor decompresser =
            factory.fastDecompressor();
    // try-with-resources closes the stream even if reading fails.
    try (LZ4BlockInputStream lzis = new LZ4BlockInputStream(new ByteArrayInputStream(data), decompresser)) {
        int count;
        byte[] buffer = new byte[2048];
        while ((count = lzis.read(buffer)) != -1) {
            baos.write(buffer, 0, count);
        }
    }
    return baos.toByteArray();
}
```

**Test results**:

![lz4](images/lz4.png)

+ **Lzo**

LZO is a lossless data compression algorithm that focuses on decompression speed. The name is short for Lempel-Ziv-Oberhumer.

**Core code**:

```java
@Override
public byte[] compress(byte[] data) throws IOException {
    LzoCompressor compressor = LzoLibrary.getInstance().newCompressor(LzoAlgorithm.LZO1X, null);
    ByteArrayOutputStream os = new ByteArrayOutputStream();
    // Closing the stream flushes the last block before the bytes are read.
    try (LzoOutputStream cs = new LzoOutputStream(os, compressor)) {
        cs.write(data);
    }
    return os.toByteArray();
}

@Override
public byte[] uncompress(byte[] data) throws IOException {
    LzoDecompressor decompressor = LzoLibrary.getInstance().newDecompressor(LzoAlgorithm.LZO1X, null);
    ByteArrayOutputStream baos = new ByteArrayOutputStream();
    ByteArrayInputStream is = new ByteArrayInputStream(data);
    try (LzoInputStream us = new LzoInputStream(is, decompressor)) {
        int count;
        byte[] buffer = new byte[2048];
        while ((count = us.read(buffer)) != -1) {
            baos.write(buffer, 0, count);
        }
    }
    return baos.toByteArray();
}
```

**Test results**:

![lzo](images/lzo.png)

+ **Snappy**

Snappy (formerly Zippy) is a fast compression and decompression library that Google wrote in C++ based on the ideas of LZ77 and open-sourced in 2011. Its goal is neither maximum compression ratio nor compatibility with other compression libraries, but very high speed with a reasonable compression ratio.

**Core code**:

```java
@Override
public byte[] compress(byte[] data) throws IOException {
    return Snappy.compress(data);
}

@Override
public byte[] uncompress(byte[] data) throws IOException {
    return Snappy.uncompress(data);
}
```

**Test results**:

![snappy](images/snappy.png)

### 3 Performance Comparison

ENV:JDK:11/CPU:4C/
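
| Format | Size Before (bytes) | Size After (bytes) | Compress Time, Uncompress Time (ms) | Compress Rate (%) |
| --- | --- | --- | --- | --- |
| Zlib | 102400 | 77028 | 9, 2 | 75.22 |
| Gzip | 102400 | 77037 | 114 | 75.23 |
| Bzip2 | 102400 | 76947 | 16073 | 75.14 |
| Deflate | 102400 | 78503 | 8, 2 | 76.66 |
| Lz4 | 102400 | 103471 | 14592 | 101.04 |
| Lzo | 102400 | 102826 | 1388 | 100.42 |
| Snappy | 102400 | 102411 | 7071 | 100.01 |

Where the time column shows a single digit string (for example `114`), the compress and uncompress values run together in the source and are left unsplit.
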
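A row like those in the table above could be produced by a small harness along the following lines. This is a minimal sketch under stated assumptions, not the project's actual test code: the `Codec` interface, the `Result` class, the `sampleData` payload, and the class name are all hypothetical, only the JDK's zlib streams are wired in, and the timing is a single round trip without JIT warm-up, so the numbers it prints will differ from the table.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.Arrays;
import java.util.Random;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.InflaterInputStream;

class CompressionBenchmarkSketch {
    // Hypothetical interface mirroring the compress/uncompress pair used in section 2.
    interface Codec {
        byte[] compress(byte[] data) throws IOException;
        byte[] uncompress(byte[] data) throws IOException;
    }

    // One measured row: sizes in bytes, times in nanoseconds.
    static final class Result {
        final int sizeBefore;
        final int sizeAfter;
        final long compressNanos;
        final long uncompressNanos;

        Result(int sizeBefore, int sizeAfter, long compressNanos, long uncompressNanos) {
            this.sizeBefore = sizeBefore;
            this.sizeAfter = sizeAfter;
            this.compressNanos = compressNanos;
            this.uncompressNanos = uncompressNanos;
        }

        // Compressed size relative to original size, matching the table's
        // figures (for example 77028 / 102400 is about 75.22%).
        double ratePercent() {
            return sizeAfter * 100.0 / sizeBefore;
        }
    }

    // zlib-wrapped DEFLATE via the JDK stream classes (readAllBytes needs Java 9+).
    static final Codec ZLIB = new Codec() {
        @Override
        public byte[] compress(byte[] data) throws IOException {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            try (DeflaterOutputStream zlib = new DeflaterOutputStream(out)) {
                zlib.write(data);
            }
            return out.toByteArray();
        }

        @Override
        public byte[] uncompress(byte[] data) throws IOException {
            try (InflaterInputStream zlib = new InflaterInputStream(new ByteArrayInputStream(data))) {
                return zlib.readAllBytes();
            }
        }
    };

    // Assumption: repeatable pseudo-random hex text; the README does not say what payload was used.
    static byte[] sampleData(int size) {
        String alphabet = "0123456789abcdef";
        Random random = new Random(42);
        byte[] data = new byte[size];
        for (int i = 0; i < size; i++) {
            data[i] = (byte) alphabet.charAt(random.nextInt(alphabet.length()));
        }
        return data;
    }

    // Times one compress/uncompress round trip and checks that the data survives it.
    static Result measure(Codec codec, byte[] data) throws IOException {
        long t0 = System.nanoTime();
        byte[] compressed = codec.compress(data);
        long t1 = System.nanoTime();
        byte[] restored = codec.uncompress(compressed);
        long t2 = System.nanoTime();
        if (!Arrays.equals(data, restored)) {
            throw new IllegalStateException("round trip changed the data");
        }
        return new Result(data.length, compressed.length, t1 - t0, t2 - t1);
    }

    public static void main(String[] args) throws IOException {
        Result r = measure(ZLIB, sampleData(102400));
        System.out.printf("Zlib: %d -> %d bytes (%.2f%%), compress %.2f ms, uncompress %.2f ms%n",
                r.sizeBefore, r.sizeAfter, r.ratePercent(),
                r.compressNanos / 1e6, r.uncompressNanos / 1e6);
    }
}
```

Any of the other codecs from section 2 could be plugged in by implementing the same two methods. For steadier timings it would be reasonable to run several warm-up iterations first and average over many rounds.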
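Compression efficiency and quality vary with file size, so this performance comparison is for reference only.

Compress Rate (%) = Size After (bytes) / Size Before (bytes) * 100%

### 4 About me

A programmer who loves learning, sharing, and exchanging ideas.

You are welcome to follow my WeChat official account 【Java烂笔头】 and WeChat mini program 【Java烂笔头】 so we can exchange ideas and improve together.

![WeChat official account](images/%E6%89%AB%E7%A0%81_%E6%90%9C%E7%B4%A2%E8%81%94%E5%90%88%E4%BC%A0%E6%92%AD%E6%A0%B7%E5%BC%8F-%E7%99%BD%E8%89%B2%E7%89%88.png)

![WeChat mini program](images/%E6%89%AB%E7%A0%81_%E6%90%9C%E7%B4%A2%E8%81%94%E5%90%88%E4%BC%A0%E6%92%AD%E6%A0%B7%E5%BC%8F-%E5%BE%AE%E4%BF%A1%E6%A0%87%E5%87%86%E7%BB%BF%E7%89%88_%E5%89%AF%E6%9C%AC.jpg)

Likes, comments, and follows are the greatest support!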