# message

**Repository Path**: lijinting01/message

## Basic Information

- **Project Name**: message
- **Description**: Benchmark and usage of message serialization/deserialization.
- **Primary Language**: Java
- **License**: Apache-2.0
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 6
- **Forks**: 1
- **Created**: 2018-12-20
- **Last Updated**: 2025-06-17

## Categories & Tags

**Categories**: testing

**Tags**: None

## README

# SPEED/SPACE Benchmarks of Java Object Serialization (V1)

## 1.Summary

Many articles already compare the performance of Java serialization systems, but I think some of those comparisons have problems:

- The sample data structures are too simple.
- The testing program does not generalize the serialization systems, so they are not evaluated fairly.
- Only a few serialization systems are covered.
- The testing program is hard to extend with additional serialization systems.

This article and its testing program were written with these problems in mind, as another reference for anyone building their own evaluation of Java serialization systems.

### 1.1.Serialization systems involved

- JDK built-in
- Protobuf
- Hessian2
- Kryo
- Fastjson
- Jackson
- Gson

### 1.2.Testing results to concern

- Speed of serialization
- Speed of deserialization (pending)
- Space cost after serialization

### 1.3.Generalization

Protocol Buffers requires static compilation of its message definitions, and the JDK built-in serialization protocol requires serialized objects to implement `java.io.Serializable`, while the other frameworks serialize any plain Java object dynamically at runtime.
To compare all serialization systems on the same baseline, the following generalization constraints are defined.

#### 1.3.1.Structure Generalization

- All domain objects must have the same structure as the messages predefined in the `.proto` file, and converters must be provided to convert between POJOs and protobuf messages in both directions.
- All domain objects must implement `java.io.Serializable`.

#### 1.3.2.Input Generalization

Within the same round of benchmark testing, every serialization system receives exactly the same input and runs exactly the same number of loops.

#### 1.3.3.How to build and run

Build the testing program:

- Enter the build directory: ```cd master```
- Run a full build: ```mvn clean install```

Run the testing program:

- Enter the benchmark directory: ```cd benchmark```
- Start the run: ```java -jar target/benchmark-.jar```
- Check the output and logs in `${user.home}/benchmark.log`

## 2.Testing Program Design

### 2.1.Sample Testing Models

To cover diverse cases and evaluate space and speed performance more thoroughly, testing objects must satisfy the following requirements.

- **C1-01** The contents of testing objects are generated randomly.
- **C1-02** Testing objects must use at least integer, string, floating-point, and enum properties. All of these types are mandatory.
- **C1-03** Testing objects must hold at least one collection property.
- **C1-04** Testing objects must have structures that refer to each other.

### 2.2.Serialized Object

A serialized object is a wrapper around a POJO and must satisfy the following requirements.

- **C2-01** Accepts a POJO for initialization.
- **C2-02** Provides a no-argument method returning ```byte[]``` for accessing the serialized byte array.
- **C2-03** Provides a no-argument method returning ```int``` for accessing the length of the byte array.
- **C2-04** Provides a no-argument method returning ```String``` for accessing the UTF-8 string form of the byte array.
- **C2-05** Provides a no-argument method returning ```String``` for accessing the Base64 string form of the byte array.
- **C2-06** Provides a no-argument method that deserializes the byte array and returns an object of the same type as the accepted POJO.
- **C2-07** The method required by `C2-06` must not directly return the POJO instance accepted by `C2-01`.
- **C2-08** A serialized object must be **IMMUTABLE**:
  - It provides no mutators that change the object's state, such as `set*` or `add*`.
  - The method defined by `C2-02` must return a defensive copy of the internal byte array.

### 2.3.Benchmark Testing Objects

#### 2.3.1.Space Benchmark Testing Objects

Space benchmark testing is the simpler one: generate random samples, then record the space cost of each serialization system for each sample.
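The space benchmark described above could be sketched roughly as follows. This is a minimal illustration, not the repository's implementation: the `Sample` class, the `SpaceBenchmarkSketch` class, and the `"text"` serializer (a plain `toString()` encoding standing in for a second serialization system) are all hypothetical names introduced here, and only JDK built-in serialization is used so the block has no external dependencies.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;
import java.util.function.Function;

// Hypothetical sample type; the repository's real testing models are not shown here.
class Sample implements Serializable {
    private static final long serialVersionUID = 1L;
    final int id;
    final String name;
    final double score;
    final List<Integer> values;

    Sample(int id, String name, double score, List<Integer> values) {
        this.id = id;
        this.name = name;
        this.score = score;
        this.values = values;
    }

    @Override
    public String toString() {
        return id + "|" + name + "|" + score + "|" + values;
    }
}

class SpaceBenchmarkSketch {
    // Builds one sample with random integer, string, floating-point, and collection content.
    static Sample randomSample(Random random) {
        List<Integer> values = new ArrayList<>();
        int size = 1 + random.nextInt(5);
        for (int i = 0; i < size; i++) {
            values.add(random.nextInt(1000));
        }
        return new Sample(random.nextInt(), "name-" + random.nextInt(1000), random.nextDouble(), values);
    }

    // JDK built-in serialization into a byte array.
    static byte[] jdkSerialize(Object object) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(object);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Serializes the same sample with every serializer and records the byte count per name.
    static Map<String, Integer> measure(Object sample, Map<String, Function<Object, byte[]>> serializers) {
        Map<String, Integer> sizes = new LinkedHashMap<>();
        serializers.forEach((name, serializer) -> sizes.put(name, serializer.apply(sample).length));
        return sizes;
    }

    public static void main(String[] args) {
        Map<String, Function<Object, byte[]>> serializers = new LinkedHashMap<>();
        serializers.put("jdk", SpaceBenchmarkSketch::jdkSerialize);
        // Stand-in for a second serialization system (an assumption, for illustration only).
        serializers.put("text", o -> o.toString().getBytes(StandardCharsets.UTF_8));

        Random random = new Random(42);
        for (int i = 0; i < 3; i++) {
            System.out.println(measure(randomSample(random), serializers));
        }
    }
}
```

Keeping the serializers in an ordered map means every system sees exactly the same sample, which matches the input generalization constraint.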
#### 2.3.2.Speed Benchmark Testing Objects

To compare the serialization systems fairly, speed benchmark testing objects are subject to the following constraints.

- **C3-01** Provides a method that accepts one `Object` argument, the object to be serialized, and one `int` argument, the number of loops.
- **C3-02** Times the start and end of the method defined by `C3-01`, and calculates the total elapsed time and the average elapsed time per serialization.
- **C3-03** The execution order of the speed benchmark objects can be adjusted freely at runtime.

## 3.Testing Program Implementation

### 3.1.SerializedObject

- ```SerializedObject``` is the base class of all serialized objects and is implemented according to the requirements in [2.2](2.2).
- Subclasses of ```SerializedObject``` tell their superclass how to serialize the wrapped object into a byte array.
- Subclasses of ```SerializedObject``` tell their superclass how to deserialize the byte array back into an object.
- Subclasses of ```SerializedObject``` may implement ```beforeSerilize()``` to initialize the utilities needed during serialization.
- Checked exceptions caught by ```SerializedObject``` during serialization are wrapped in ```SerializationException``` and rethrown.
- ```SerializedObject``` provides factory methods to instantiate its subclasses, whose constructors are all package-private.
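To make the contract above concrete, here is a minimal sketch of how such a base class might look, with one subclass backed by JDK built-in serialization. Only the names `SerializedObject`, `SerializationException`, and `beforeSerilize()` come from the text. Everything else (the method names `bytes()`, `length()`, `asUtf8()`, `asBase64()`, `toObject()`, the `jdk(...)` factory, and the `JdkSerializedObject` subclass) is assumed for illustration and may differ from the repository's actual code.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Base64;

// Unchecked wrapper for checked exceptions raised during (de)serialization.
class SerializationException extends RuntimeException {
    SerializationException(Throwable cause) {
        super(cause);
    }
}

abstract class SerializedObject<T> {
    private final byte[] bytes; // never handed out directly (C2-08)

    SerializedObject(T pojo) { // C2-01; package-private, see the factory below
        byte[] result;
        try {
            beforeSerilize();
            result = serialize(pojo);
        } catch (IOException e) {
            throw new SerializationException(e);
        }
        this.bytes = result;
    }

    public final byte[] bytes() { return bytes.clone(); } // C2-02, defensive copy
    public final int length() { return bytes.length; } // C2-03
    public final String asUtf8() { return new String(bytes, StandardCharsets.UTF_8); } // C2-04
    public final String asBase64() { return Base64.getEncoder().encodeToString(bytes); } // C2-05

    // C2-06 and C2-07: always rebuilds the object from the bytes, never returns the original.
    public final T toObject() {
        try {
            return deserialize(bytes.clone());
        } catch (IOException | ClassNotFoundException e) {
            throw new SerializationException(e);
        }
    }

    // Optional hook; it runs inside the constructor, so it must not rely on subclass fields.
    protected void beforeSerilize() { }

    protected abstract byte[] serialize(T pojo) throws IOException;

    protected abstract T deserialize(byte[] data) throws IOException, ClassNotFoundException;

    // Hypothetical factory method for the JDK-backed subclass.
    static <T extends Serializable> SerializedObject<T> jdk(T pojo) {
        return new JdkSerializedObject<>(pojo);
    }
}

class JdkSerializedObject<T extends Serializable> extends SerializedObject<T> {
    JdkSerializedObject(T pojo) {
        super(pojo);
    }

    @Override
    protected byte[] serialize(T pojo) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(pojo);
        }
        return buffer.toByteArray();
    }

    @SuppressWarnings("unchecked")
    @Override
    protected T deserialize(byte[] data) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return (T) in.readObject();
        }
    }
}

class SerializedObjectDemo {
    public static void main(String[] args) {
        ArrayList<String> pojo = new ArrayList<>(Arrays.asList("a", "b"));
        SerializedObject<ArrayList<String>> wrapped = SerializedObject.jdk(pojo);
        ArrayList<String> copy = wrapped.toObject();
        System.out.println("equal content: " + copy.equals(pojo));
        System.out.println("serialized length: " + wrapped.length());
    }
}
```

In this sketch serialization happens once, in the constructor, so the wrapper can stay immutable and every accessor reads from the same byte array.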
#### 3.1.1.Hessian2

```Hessian2SerializedObject``` requires extra configuration that specifies custom serialization and deserialization strategies. The configuration is placed under the META-INF directory.

### 3.2.Benchmark Interface

The speed benchmark interface.

- ```Benchmark``` defines the method that executes a single benchmark run, in compliance with [2.3.2](2.3.2).
- ```Benchmark``` executions are timed by interception through ```ProfilingAspect```.
- ```ProfilingAspect``` records the total elapsed time in milliseconds and the average elapsed time of a single call in microseconds.

### 3.3.SpeedBenchmarks

- Assembles all known implementations of the ```Benchmark``` interface.
- Runs each ```Benchmark``` implementation 1,000, 5,000, 20,000, 50,000, and 200,000 times.
- Defines and manages the thread pool that executes the benchmark tests.

### 3.4.Automatically Generated Code

Protocol Buffers message objects must be pre-generated by static compilation. The testing program also uses lombok to avoid verbose code. If classes or libraries appear to be missing after you import the code into an IDE, go to the `master` directory and run `mvn clean install` first, then re-import the code.

### 3.5.Testing Models

```TestingModels``` is the sample testing data generator, which randomly generates sample testing objects and enum values. The sample testing model types are generated by the lombok compiler.
The original source can be found under `message/testing-models/src/main/lombok`, whether or not the project has been compiled.

### 3.6.Package `io.demo.message.domain.proto`

`io.demo.message.domain.proto` contains two kinds of classes:

- Message classes generated by the Protobuf compiler, which can be found under `message/testing-models/target/generated-sources/protobuf/java` after compilation.
- Converter classes that transform between Protobuf messages and the testing model classes in both directions.

## 4.How to extend the Testing Program