CN114726380B - Monitoring data lossless compression method, device, equipment and readable storage medium - Google Patents
- Publication number
- CN114726380B (application CN202210636103.1A)
- Authority
- CN
- China
- Prior art keywords
- monitoring
- data
- value
- compressed
- difference
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- H—ELECTRICITY
- H03—ELECTRONIC CIRCUITRY
- H03M—CODING; DECODING; CODE CONVERSION IN GENERAL
- H03M7/00—Conversion of a code where information is represented by a given sequence or number of digits to a code where the same, similar or subset of information is represented by a different sequence or number of digits
- H03M7/30—Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction
- H03M7/40—Conversion to or from variable length codes, e.g. Shannon-Fano code, Huffman code, Morse code
- H03M7/70—Type of the data to be coded, other than image and sound
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Compression, Expansion, Code Conversion, And Decoders (AREA)
Abstract
Description
Technical field
The present invention relates to the technical field of data computing, and in particular to a lossless compression method, apparatus, and device for monitoring data, and a readable storage medium.
Background
As the state places growing emphasis on the safety of traffic structures, sensors are being installed across these structures at ever higher density, and the monitoring frequency is rising accordingly. Monitoring data is collected continuously, and the data generated each year reaches the petabyte (PB) scale. In structural health monitoring of traffic structures, the structural response data collected by sensors installed at specific key locations falls into the category of massive data; as it accumulates over time, the workload of processing, analyzing, and storing it keeps growing. Existing compression techniques achieve only a low compression ratio on the monitoring data produced by traffic structure monitoring systems.
Summary of the invention
The purpose of the present invention is to provide a lossless compression method, apparatus, and device for monitoring data, and a readable storage medium, so as to mitigate the above problem. To achieve this purpose, the present invention adopts the following technical solutions:
In a first aspect, the present application provides a lossless compression method for monitoring data, comprising: acquiring a data table to be compressed, the data table including monitoring values collected by a sensor and the time data corresponding to each monitoring value, the monitoring values being collected by the sensor in chronological order; computing a timestamp compression result from all the time data according to a preset time-difference mathematical model, the timestamp compression result including one starting timestamp and the second-order timestamp differences obtained by applying the second-order difference method to all the time data; computing the values to be compressed from all the monitoring values according to a preset base-conversion mathematical model; and computing the monitoring value compression result from the values to be compressed according to a preset Huffman coding mathematical model.
In a second aspect, the present application further provides a lossless compression apparatus for monitoring data, comprising: a data acquisition module, configured to acquire a data table to be compressed, the data table including monitoring values collected by a sensor and the time data corresponding to each monitoring value; a first calculation module, configured to compute a timestamp compression result from all the time data according to a preset time-difference mathematical model, the timestamp compression result including one starting timestamp and the second-order timestamp differences obtained by applying the second-order difference method to all the time data; a second calculation module, configured to compute the values to be compressed from all the monitoring values according to a preset base-conversion mathematical model; and a third calculation module, configured to compute the monitoring value compression result from the values to be compressed according to a preset Huffman coding mathematical model.
In a third aspect, the present application further provides a lossless compression device for monitoring data, comprising:
a memory, configured to store a computer program;
a processor, configured to implement the steps of the lossless compression method for monitoring data when executing the computer program.
In a fourth aspect, the present application further provides a readable storage medium on which a computer program is stored; when the computer program is executed by a processor, the steps of the above lossless compression method for monitoring data are implemented.
The beneficial effects of the present invention are:
Targeting the floating-point nature of the monitoring values, the present invention combines three coding ideas: a coding conversion based on differencing, base-conversion coding, and Huffman coding. Together they form a lossless compression algorithm model for the monitoring values that completes their compression coding, raising the data compression ratio and reducing storage pressure on the database.
Other features and advantages of the present invention will be set forth in the description that follows, and will in part become apparent from the description or be understood by practicing the embodiments of the invention. The objectives and other advantages of the invention may be realized and attained by the structure particularly pointed out in the written description, the claims, and the drawings.
Description of drawings
To explain the technical solutions of the embodiments of the present invention more clearly, the drawings used in the embodiments are briefly introduced below. It should be understood that the following drawings show only some embodiments of the present invention and should therefore not be regarded as limiting its scope; a person of ordinary skill in the art can obtain other related drawings from them without creative effort.
FIG. 1 is a schematic flowchart of the lossless compression method for monitoring data described in an embodiment of the present invention;
FIG. 2 is a schematic structural diagram of the lossless compression apparatus for monitoring data described in an embodiment of the present invention;
FIG. 3 is a schematic structural diagram of the lossless compression device for monitoring data described in an embodiment of the present invention;
FIG. 4 is a schematic diagram of the ASCII code conversion described in an embodiment of the present invention.
Reference numerals: 1, data acquisition module; 2, first calculation module; 21, first calculation unit; 22, second calculation unit; 3, second calculation module; 31, third calculation unit; 311, first judgment unit; 312, second judgment unit; 313, third judgment unit; 32, fourth calculation unit; 321, fifth calculation unit; 322, sixth calculation unit; 4, third calculation module; 41, seventh calculation unit; 42, eighth calculation unit; 43, ninth calculation unit; 800, monitoring data lossless compression device; 801, processor; 802, memory; 803, multimedia component; 804, I/O interface; 805, communication component.
Detailed description
To make the purposes, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the drawings. The described embodiments are some, not all, of the embodiments of the present invention. The components of the embodiments generally described and illustrated in the drawings may be arranged and designed in a variety of configurations. The following detailed description of the embodiments provided in the drawings is therefore not intended to limit the scope of the claimed invention, but merely represents selected embodiments. All other embodiments obtained by a person of ordinary skill in the art from these embodiments without creative effort fall within the protection scope of the present invention.
It should be noted that similar numerals and letters denote similar items in the following drawings, so once an item is defined in one drawing it needs no further definition or explanation in subsequent drawings. In the description of the present invention, the terms "first", "second", and so on are used only to distinguish items and are not to be understood as indicating or implying relative importance.
Embodiment 1:
This embodiment provides a lossless compression method for monitoring data.
Referring to FIG. 1, the method includes step S100, step S200, step S300, and step S400.
S100. Acquire a data table to be compressed. The data table includes the monitoring values collected by a sensor and the time data corresponding to each monitoring value; the monitoring values are collected by the sensor in chronological order.
It should be noted that, in step S100, the time-series data generated by sensor equipment in the traffic structure is collected, and the collected monitoring data undergoes missing-value filling, data standardization, outlier removal, and similar operations. The resulting data is organized into the data table to be compressed, which simplifies the subsequent compression steps and improves data processing efficiency.
S200. Compute a timestamp compression result from all the time data according to a preset time-difference mathematical model. The timestamp compression result includes one starting timestamp and the second-order timestamp differences obtained by applying the second-order difference method to all the time data.
It should be noted that step S200 compresses the time data losslessly: the time data is converted to timestamp form, and the second-order differences of the timestamps give the timestamp compression result. Compressing the time data and the monitoring values separately reduces the amount of computation needed for compression.
S300. Compute the values to be compressed from all the monitoring values according to a preset base-conversion mathematical model.
It should be noted that, in step S300, the base-conversion mathematical model applies several base conversions to the floating-point monitoring values and finally converts them to ASCII codes, which effectively reduces the storage cost.
S400. Compute the monitoring value compression result from the values to be compressed according to a preset Huffman coding mathematical model.
It should be noted that, in step S400, the Huffman coding mathematical model assumes occurrence frequencies for the ASCII code symbols in the values to be compressed and constructs a Huffman coding tree from them; storing the Huffman codes substantially reduces the storage cost.
In the specific embodiment disclosed in this application, step S200 includes step S210 and step S220.
S210. Convert the time data to UNIX timestamps and arrange them in chronological order to obtain a timestamp data column.
S220. Compute the starting timestamp and the second-order timestamp differences from the timestamp data column according to a preset mathematical model of the timestamp second-order difference method.
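Step S210 can be sketched as follows. This is a minimal illustration, not the patent's implementation: the text format `%Y-%m-%d %H:%M:%S`, the UTC assumption, and the function names `to_unix_seconds` and `timestamp_column` are all assumptions, because the source does not say how the time data is stored.

```python
from datetime import datetime, timezone


def to_unix_seconds(text, fmt="%Y-%m-%d %H:%M:%S"):
    """Parse one time string (assumed format, assumed UTC) into UNIX seconds."""
    parsed = datetime.strptime(text, fmt).replace(tzinfo=timezone.utc)
    return int(parsed.timestamp())


def timestamp_column(time_strings):
    """Convert every time string and sort the result chronologically (S210)."""
    return sorted(to_unix_seconds(text) for text in time_strings)


# Hypothetical input rows, deliberately out of order.
column = timestamp_column(["1970-01-01 00:02:00", "1970-01-01 00:01:00"])
print(column)
```

Sorting after conversion keeps the column chronological even if the rows of the data table arrive out of order.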
It should be noted that Table 1 shows the calculation table for the timestamp second-order difference method. The present application converts the time data to UNIX timestamps, sorts them, and extracts the starting timestamp for storage. The remaining timestamps are differenced twice, so that more than 90% of the second-order differences are 0 and can each be stored in 1 bit. Where jumps or gaps occur, the earlier missing-value filling still keeps the second-order difference within [-60, 60]; such a difference is marked with the control bit '1' and the value D_N - D_{N-1} is stored in 7 bits. In the calculation table, the 6 timestamps need only 64 + 8×2 + 3×1 = 83 bits, far less than the 64 × 6 = 384 bits needed to store each timestamp directly as an 8-byte long integer.
Table 1 Calculation table for the timestamp second-order difference method
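The second-order difference scheme described above can be sketched as follows. This is a hedged illustration under stated assumptions: the function names are hypothetical, the spacing before the first timestamp is taken to be 0 (the source does not say how the first difference is seeded), and the sample timestamps are made up because the rows of Table 1 are not available in this text.

```python
def second_differences(timestamps):
    """Return (start, diffs), where diffs[k] is the change in spacing at step k."""
    start = timestamps[0]
    prev_delta = 0  # assumption: spacing before the first timestamp counts as 0
    diffs = []
    for earlier, later in zip(timestamps, timestamps[1:]):
        delta = later - earlier
        diffs.append(delta - prev_delta)
        prev_delta = delta
    return start, diffs


def restore(start, diffs):
    """Invert second_differences exactly, which is what makes the scheme lossless."""
    timestamps, delta = [start], 0
    for dd in diffs:
        delta += dd
        timestamps.append(timestamps[-1] + delta)
    return timestamps


def bit_cost(diffs):
    """64 bits for the start, 1 bit per zero, 1 control bit + 7 bits otherwise."""
    bits = 64
    for dd in diffs:
        if dd == 0:
            bits += 1
        elif -60 <= dd <= 60:
            bits += 8
        else:
            raise ValueError("second difference outside [-60, 60]")
    return bits


# Made-up sample: five readings 60 s apart, then one 120 s gap.
sample = [1654000000, 1654000060, 1654000120, 1654000180, 1654000240, 1654000360]
start, diffs = second_differences(sample)
assert restore(start, diffs) == sample
print(diffs, bit_cost(diffs))
```

With this made-up sample the cost comes to 64 + 8×2 + 1×3 = 83 bits, the same total as the six-timestamp example in the text; the match is illustrative only, since the actual table rows are not reproduced here.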
In the specific embodiment disclosed in this application, step S300 includes step S310 and step S320.
S310. Compute the monitoring value codes from all the monitoring values according to a preset differential coding mathematical model; the monitoring value codes are decimal codes.
In the specific embodiment disclosed in this application, step S310 includes step S311, step S312, and step S313.
S311. Traverse all the monitoring values, compute the difference between each pair of adjacent monitoring values in chronological order, and determine whether the result is 0.
S312. If the difference between adjacent monitoring values is 0, store the later of the two monitoring values as 0 in a first array; the first array records the entries whose difference is 0.
S313. If the difference between two adjacent monitoring values is not 0, difference the integer parts and the fractional parts of the two values separately and compute a positive-integer array.
It should be noted that, in the above steps, Table 2 shows the monitoring value differential coding table. The collected monitoring values are arranged in chronological order, and each one is checked to see whether its difference from the previous point is 0. If the difference is 0, a first array (the delta array) is declared to store the indices of the values whose difference from the previous point is 0.
If the difference is not 0, the following steps are performed:
Take the integer part X_i of the monitoring value and compute X_i - X_{i-1}, saving the results in the integer-part difference array dx (by default, dx[0] is the integer part of the first record);
Take the fractional part of the monitoring value (four digits by default), convert it to an integer, and save it in the array ex;
Traverse the integer-part difference array dx and the integerized fractional-part array ex and concatenate them according to the following rules, where i denotes the i-th record and rx[i] is the encoded result:
When i = 0 (the first record):
[formula not reproduced in the source text]
where rx is the positive-integer array, dx is the integer-part difference array, ex is the integerized fractional-part array, and i denotes the i-th record.
When delta[i] = 0: the monitoring value is the same as the previous one, so it is not recorded again:
[formula not reproduced in the source text]
where rx is the positive-integer array and i denotes the i-th record.
When dx[i] = 0: the integer part of the monitoring value is the same as that of the previous one, so only the fractional part is recorded (rx[i] then has a default length of 4 digits):
$rx[i] = ex[i]$
where rx is the positive-integer array, dx is the integer-part difference array, ex is the integerized fractional-part array, and i denotes the i-th record.
When dx[i] > 0: the integer part of the monitoring value is greater than that of the previous one, and the concatenation rule is $rx[i] = \operatorname{int}\big(\operatorname{str}(dx[i]) \,\Vert\, \operatorname{str}(ex[i]) \,\Vert\, \text{'0'}\big)$
where rx is the positive-integer array, dx is the integer-part difference array, ex is the integerized fractional-part array, and i denotes the i-th record.
This formula converts the string formed by concatenating the integer difference, the fractional part, and '0' into an integer; the trailing '0' is the flag for dx[i] > 0 (rx[i] is 5 to 9 digits long);
When dx[i] < 0: the integer part of the monitoring value is less than that of the previous one, and the concatenation rule is $rx[i] = \operatorname{int}\big(\operatorname{str}(\lvert dx[i] \rvert) \,\Vert\, \operatorname{str}(ex[i]) \,\Vert\, \text{'1'}\big)$
where rx is the positive-integer array, dx is the integer-part difference array, ex is the integerized fractional-part array, and i denotes the i-th record.
This formula converts the string formed by placing the integer difference first, followed by the fractional part and '1', into an integer; the trailing '1' is the flag for dx[i] < 0 (rx[i] is 5 to 9 digits long).
After encoding, the floating-point numbers are converted to positive integers of 4 to 9 digits, which are stored in 12 × 4 = 48 bytes. Encoding in this way converts floating-point numbers to positive integers while also reducing the average length of the data, which improves the compression ratio and makes more efficient use of system resources.
Table 2 Monitoring value differential coding table
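The concatenation rules above can be sketched as follows. Several points are assumptions rather than the patent's stated method: the rule for the first record (stored here as integer part followed by four fractional digits) and the code 0 for an unchanged value are guesses, because those formulas are not preserved in this text; inputs are assumed to be non-negative decimal strings with at most four fractional digits; and the names `split_value` and `differential_codes` are hypothetical.

```python
from decimal import Decimal


def split_value(text):
    """Split a non-negative decimal string into (integer part, 4 fractional digits)."""
    value = Decimal(text)
    whole = int(value)
    frac = int((value - whole) * 10000)  # four fractional digits, as in the text
    return whole, f"{frac:04d}"


def differential_codes(values):
    """Encode decimal strings as positive integers following the rules above."""
    codes = []
    previous = None
    for text in values:
        whole, frac = split_value(text)
        if previous is None:
            # Assumption: the first record is stored in full.
            codes.append(int(f"{whole}{frac}"))
        elif (whole, frac) == previous:
            # Assumption: an unchanged value (delta[i] = 0) is coded as 0.
            codes.append(0)
        elif whole == previous[0]:
            # dx[i] = 0: only the fractional digits are recorded.
            codes.append(int(frac))
        else:
            dx = whole - previous[0]
            flag = "0" if dx > 0 else "1"  # trailing flag digit carries the sign
            codes.append(int(f"{abs(dx)}{frac}{flag}"))
        previous = (whole, frac)
    return codes


# Made-up readings; these are not the rows of Table 2.
codes = differential_codes(["12.3456", "12.3456", "12.4001", "13.0500", "11.9000"])
print(codes)
```

The sketch covers encoding only. A decoder would also need the delta array and the lengths of the codes to tell the cases apart, which the source describes only in outline.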
S320. Compute the values to be compressed from the monitoring value codes according to a preset code base-conversion model; the base of the values to be compressed is higher than the base of the differential monitoring value codes.
In the specific embodiment disclosed in this application, step S320 includes step S321 and step S322.
S321. Convert the differential monitoring value codes to binary form.
S322. Encode the binary form as ASCII, 7 bits at a time, to obtain the values to be compressed; the values to be compressed are ASCII code symbols.
It should be noted that, in the above steps, FIG. 4 shows the ASCII code conversion and Table 3 shows the monitoring value ASCII coding table. In UTF-8 (Unicode Transformation Format), single-byte characters correspond to the 128 characters of ASCII. If the information in a decimal string can be tied to ASCII codes, base 10 can be converted to base 128, so that 1 byte stores log2(128) = 7 bits of information, a space utilization of 87.5%. The method adopted here converts the number represented by a string of decimal digits to binary form and then maps every 7 bits to one ASCII code (the 128 ASCII characters correspond to 0-127), giving the values to be compressed. As the ASCII coding table shows, before compression each of the 12 floating-point records needs 8 bytes, so the 12 records need 12 × 8 = 96 bytes. After differential encoding, the floating-point numbers become positive integers of 4 to 9 digits stored in 12 × 4 = 48 bytes. The base-conversion encoding then turns each integer into ASCII codes by bitwise AND operations, at a storage cost of 3+2+2+2+1+1+2+3+2+2+2+2 = 24 bytes. Compared with the 96 bytes needed to store the original floating-point numbers, the compression ratio is 4, which effectively reduces the storage cost.
Table 3 Monitoring value ASCII coding table
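The 7-bit grouping of steps S321 and S322 can be sketched as follows. The function names are hypothetical, and the choice to emit the most significant group first is an assumption; the source states only that every 7 bits map to one ASCII code and that bitwise AND is used.

```python
def to_ascii_symbols(number):
    """Split a non-negative integer into 7-bit groups, one ASCII code point each."""
    if number < 0:
        raise ValueError("only non-negative integers are supported")
    symbols = [chr(number & 0x7F)]  # bitwise AND keeps the low 7 bits
    number >>= 7
    while number:
        symbols.append(chr(number & 0x7F))
        number >>= 7
    return "".join(reversed(symbols))  # most significant group first (assumption)


def from_ascii_symbols(symbols):
    """Rebuild the integer by shifting each 7-bit group back into place."""
    number = 0
    for ch in symbols:
        number = (number << 7) | ord(ch)
    return number


# Made-up integer codes; each symbol fits in one byte of UTF-8.
sizes = [len(to_ascii_symbols(n)) for n in [0, 4001, 123456, 105000, 290001]]
print(sizes)
```

A 4-digit code such as 4001 fits in 2 symbols and a 6-digit code in 3, which is the kind of saving over a 4-byte integer that the byte counts above describe. How the boundaries between consecutive values are marked is not addressed in this sketch.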
In the specific embodiment disclosed in this application, step S400 includes step S410, step S420, and step S430.
S410. Compute a character frequency table from the values to be compressed; the table lists each symbol in the values to be compressed together with its frequency.
S420. Apply Huffman coding to the character frequency table to obtain a Huffman coding table.
S430. Compute the monitoring value compression result from the values to be compressed and the Huffman coding table.
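Steps S410 to S430 can be sketched with a textbook Huffman construction. This is a generic illustration, not the patent's code: the function names are hypothetical, the output is a string of '0' and '1' characters rather than packed bytes, and the sample text is made up.

```python
import heapq
from collections import Counter
from itertools import count


def build_code_table(text):
    """S410 + S420: count symbols, then merge the two rarest groups repeatedly."""
    freqs = Counter(text)
    tiebreak = count()  # unique counter so the heap never compares dicts
    heap = [(f, next(tiebreak), {ch: ""}) for ch, f in freqs.items()]
    heapq.heapify(heap)
    if len(heap) == 1:  # a single distinct symbol still needs a 1-bit code
        return {ch: "0" for ch in heap[0][2]}
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)
        f2, _, right = heapq.heappop(heap)
        merged = {ch: "0" + code for ch, code in left.items()}
        merged.update({ch: "1" + code for ch, code in right.items()})
        heapq.heappush(heap, (f1 + f2, next(tiebreak), merged))
    return heap[0][2]


def encode(text, table):
    """S430: replace each symbol with its prefix-free bit string."""
    return "".join(table[ch] for ch in text)


def decode(bits, table):
    """Walk the bit string, emitting a symbol whenever a full code is matched."""
    reverse = {code: ch for ch, code in table.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in reverse:
            out.append(reverse[current])
            current = ""
    return "".join(out)


table = build_code_table("aaaabbc")
bits = encode("aaaabbc", table)
assert decode(bits, table) == "aaaabbc"
print(table, len(bits))
```

Frequent symbols receive short codes, so a symbol stream with a skewed frequency distribution shrinks, while decoding stays exact because no code is a prefix of another.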
It should be noted that, in the above steps, the base conversion has already turned the decimal integers into base-128 ASCII codes. This is done to fit Huffman coding better, because the coding range then contains only 128 ASCII codes. Using the Huffman principle, the present application assumes occurrence frequencies for the ASCII code symbols in the above 8 monitoring data entries and constructs a Huffman coding tree, which yields the frequency-based Huffman coding table shown in Table 4. With the occurrence frequencies in Table 4, storing the symbols directly as ASCII codes takes 200 + 300 + 600 + 700 + 1000 + 3200 + 1900 + 2100 = 10000 bytes. Storing them with Huffman coding takes (200×5 + 300×5 + 600×4 + 700×4 + 1000×4 + 3200×2 + 1900×2 + 2100×2) / 8 = 3262.5 bytes for the encoded data plus 8 + 4 bytes for the Huffman coding table, 3274.5 bytes in total (about 3275 bytes). The resulting compression ratio of 3.05 indicates good compression: the monitoring values are compressed and encoded, the storage pressure on the database is reduced, and the system's resources are used more efficiently.
Table 4 Huffman coding table based on ASCII code frequencies
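The storage arithmetic for Table 4 can be checked with a few lines. The frequencies and code lengths are the ones quoted in the text; the variable names are hypothetical, and the 8 + 4 bytes of table overhead is taken from the text as given.

```python
# Symbol frequencies and Huffman code lengths (in bits) as quoted in the text.
frequencies = [200, 300, 600, 700, 1000, 3200, 1900, 2100]
code_lengths = [5, 5, 4, 4, 4, 2, 2, 2]

raw_bytes = sum(frequencies)  # one byte per ASCII symbol
payload_bits = sum(f * n for f, n in zip(frequencies, code_lengths))
payload_bytes = payload_bits / 8
table_bytes = 8 + 4  # Huffman table overhead stated in the text
total_bytes = payload_bytes + table_bytes
ratio = raw_bytes / total_bytes

print(raw_bytes, payload_bytes, total_bytes, round(ratio, 2))
```

The encoded data alone is 3262.5 bytes; the 3274.5-byte figure already includes the 12 bytes of table overhead, and the ratio 10000 / 3274.5 rounds to 3.05.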
实施例2:Example 2:
如图2所示,本实施例提供了一种监测数据无损压缩装置,装置包括As shown in FIG. 2 , this embodiment provides an apparatus for lossless compression of monitoring data. The apparatus includes:
数据采集模块1,用于获取待压缩数据表,待压缩数据表包括传感器采集到的监测值和每个监测值对应的时间数据。The data acquisition module 1 is used for acquiring a data table to be compressed, and the data table to be compressed includes monitoring values collected by sensors and time data corresponding to each monitoring value.
第一计算模块2,用于根据预设的时间差分数学模型将所有时间数据进行计算得到时间戳压缩结果,时间戳压缩结果包括一个起始时间戳和所有时间数据进行二次差值法处理后得到的时间戳二次差值。The
第二计算模块3,用于根据预设的数据进制转换数学模型将所有监测值进行计算得到待压缩数值。The second calculation module 3 is configured to calculate all the monitored values according to a preset mathematical model of conversion of data to system to obtain the numerical value to be compressed.
第三计算模块4,用于根据预设的哈夫曼编码数学模型将待压缩数值进行计算得到监测值压缩结果。The
在一些具体的实施例中,第一计算模块2包括:In some specific embodiments, the
第一计算单元21,用于将时间数据转换为UNIX时间戳形式并按照时间顺序排列得到时间戳数据列。The
第二计算单元22,用于根据预设的时间戳二次差值法数学模型将时间戳数据列进行计算得到起始时间戳和时间戳二次差值。The
在一些具体的实施例中,第二计算模块3包括:In some specific embodiments, the second computing module 3 includes:
第三计算单元31,用于根据预设的差分编码数学模型将所有监测值进行计算得到监测值编码,监测值编码为十进制编码。The
第四计算单元32,用于根据预设的编码进制转换模型将监测值编码进行计算得到待压缩数值,待压缩数值的进制高于监测值差分编码的进制。The
在一些具体的实施例中,第三计算单元31包括:In some specific embodiments, the
第一判断单元311,用于遍历所有监测值,根据时间顺序对相邻两个监测值进行差值计算,判断计算结果是否为0。The
第二判断单元312,用于判断若相邻监测值的差值为0,则将相邻两个监测值中时间顺序靠后一个监测值数据作为0存储至第一数组,第一数组用于存储差值为0的数组。The
第三判断单元313,用于判断若相邻两个监测值的差值不为0,则对相邻两个监测值的整数部分和浮点数部分分别进行差值处理,计算得到正整数数组。The
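In some specific embodiments, the fourth calculation unit 32 includes:
A fifth calculation unit 321, configured to perform base conversion on the monitoring value differential code to obtain a code in binary form.
A sixth calculation unit 322, configured to ASCII-encode the binary-form code in groups of 7 bits to obtain the value to be compressed, where the value to be compressed consists of ASCII code symbols.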
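The 7-bit grouping performed by units 321 and 322 can be sketched as below. The function names are hypothetical, and the choice to left-pad the binary string with zeros to a multiple of 7 is an assumption; the section does not state how a partial group is handled.

```python
def int_to_ascii7(value):
    """Encode a non-negative integer as ASCII characters, 7 bits per character."""
    if value < 0:
        raise ValueError("expects a non-negative integer")
    bits = format(value, "b")                      # decimal -> binary string
    padded_len = -(-len(bits) // 7) * 7            # round up to a multiple of 7
    bits = bits.zfill(padded_len)                  # assumption: pad on the left with zeros
    return "".join(chr(int(bits[i:i + 7], 2)) for i in range(0, len(bits), 7))

def ascii7_to_int(text):
    """Inverse of int_to_ascii7: read the characters as base-128 digits."""
    value = 0
    for ch in text:
        value = value * 128 + ord(ch)
    return value

encoded = int_to_ascii7(8658)                      # 8658 = 67 * 128 + 82
print(repr(encoded), ascii7_to_int(encoded))
```

Because each character carries 7 bits, every output symbol falls in the range 0 to 127, which is why the later Huffman stage only has to deal with an alphabet of at most 128 symbols.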
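In some specific embodiments, the third calculation module 4 includes:
A seventh calculation unit 41, configured to compute a character frequency table from the value to be compressed, where the character frequency table contains each symbol in the value to be compressed and its corresponding frequency.
An eighth calculation unit 42, configured to perform Huffman coding on the character frequency table to obtain a Huffman coding table.
A ninth calculation unit 43, configured to compute the monitoring value compression result from the value to be compressed and the Huffman coding table.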
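The three units above map onto a frequency count, a code-table construction, and a table lookup. The following is a sketch under stated assumptions: the function names and the sample string are hypothetical, the output is kept as a string of "0" and "1" characters for readability instead of packed bytes, and a textbook Huffman construction stands in for the preset model.

```python
import heapq
from collections import Counter
from itertools import count

def build_code_table(freqs):
    """Build a prefix-free Huffman code table {symbol: bit string} (role of unit 42)."""
    if not freqs:
        return {}
    if len(freqs) == 1:                       # a lone symbol still needs one bit
        return {next(iter(freqs)): "0"}
    tie = count()                             # tie-breaker so dicts are never compared
    heap = [(f, next(tie), {sym: ""}) for sym, f in freqs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)
        f2, _, right = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (f1 + f2, next(tie), merged))
    return heap[0][2]

def compress(symbols):
    """Return (frequency table, code table, encoded bit string)."""
    freqs = Counter(symbols)                        # role of unit 41
    table = build_code_table(freqs)                 # role of unit 42
    bits = "".join(table[s] for s in symbols)       # role of unit 43
    return freqs, table, bits

def decompress(bits, table):
    """Decode by matching prefixes against the reversed code table."""
    reverse = {code: sym for sym, code in table.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in reverse:
            out.append(reverse[current])
            current = ""
    return "".join(out)

text = "CRCRCCAB"                                   # hypothetical ASCII symbol string
freqs, table, bits = compress(text)
print(dict(freqs), table, len(bits))
print(decompress(bits, table) == text)
```

Decoding needs the code table as well as the bit string, which is consistent with the text counting the table's storage cost as part of the compressed size.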
Example 3:
Corresponding to the method embodiment above, this embodiment further provides a monitoring data lossless compression device. The device described below and the monitoring data lossless compression method described above may be cross-referenced with each other.
FIG. 3 is a block diagram of a monitoring data lossless compression device 800 according to an exemplary embodiment. As shown in FIG. 3, the monitoring data lossless compression device 800 may include a processor 801 and a memory 802. The device 800 may further include one or more of a multimedia component 803, an I/O interface 804, and a communication component 805.
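The processor 801 controls the overall operation of the monitoring data lossless compression device 800 so as to carry out all or part of the steps of the monitoring data lossless compression method described above. The memory 802 stores various types of data to support operation of the device 800; such data may include, for example, instructions for any application or method operating on the device 800, as well as application-related data such as contact data, sent and received messages, pictures, audio, video, and so on. The memory 802 may be implemented by any type of volatile or non-volatile storage device or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, a magnetic disk, or an optical disk. The multimedia component 803 may include a screen and an audio component. The screen may be, for example, a touch screen, and the audio component is used to output and/or input audio signals. For example, the audio component may include a microphone for receiving external audio signals. A received audio signal may be further stored in the memory 802 or sent through the communication component 805. The audio component also includes at least one speaker for outputting audio signals. The I/O interface 804 provides an interface between the processor 801 and other interface modules, which may be a keyboard, a mouse, buttons, and the like. The buttons may be virtual or physical. The communication component 805 is used for wired or wireless communication between the device 800 and other devices. Wireless communication may be, for example, Wi-Fi, Bluetooth, near field communication (NFC), 2G, 3G, or 4G, or a combination of one or more of these; accordingly, the communication component 805 may include a Wi-Fi module, a Bluetooth module, and an NFC module.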
In an exemplary embodiment, the monitoring data lossless compression device 800 may be implemented by one or more application-specific integrated circuits (ASIC), digital signal processors (DSP), digital signal processing devices (DSPD), programmable logic devices (PLD), field-programmable gate arrays (FPGA), controllers, microcontrollers, microprocessors, or other electronic components, for executing the monitoring data lossless compression method described above.
In another exemplary embodiment, a computer-readable storage medium including program instructions is further provided; when the program instructions are executed by a processor, the steps of the monitoring data lossless compression method described above are implemented. For example, the computer-readable storage medium may be the memory 802 described above that includes program instructions, and those program instructions may be executed by the processor 801 of the monitoring data lossless compression device 800 to carry out the monitoring data lossless compression method described above.
Example 4:
Corresponding to the method embodiment above, this embodiment further provides a readable storage medium. The readable storage medium described below and the monitoring data lossless compression method described above may be cross-referenced with each other.
A readable storage medium stores a computer program; when the computer program is executed by a processor, the steps of the monitoring data lossless compression method of the method embodiment above are implemented.
The readable storage medium may specifically be any readable storage medium capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.
The above are only preferred embodiments of the present invention and are not intended to limit it. For those skilled in the art, the present invention may have various modifications and changes. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention shall fall within the protection scope of the present invention.
The above are only specific embodiments of the present invention, but the protection scope of the present invention is not limited to them. Any change or substitution that a person skilled in the art could readily conceive within the technical scope disclosed by the present invention shall be covered by the protection scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.
Claims (8)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210636103.1A CN114726380B (en) | 2022-06-07 | 2022-06-07 | Monitoring data lossless compression method, device, equipment and readable storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210636103.1A CN114726380B (en) | 2022-06-07 | 2022-06-07 | Monitoring data lossless compression method, device, equipment and readable storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN114726380A (en) | 2022-07-08 |
CN114726380B (en) | 2022-09-02 |
Family
ID=82232796
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210636103.1A Active CN114726380B (en) | 2022-06-07 | 2022-06-07 | Monitoring data lossless compression method, device, equipment and readable storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114726380B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115278777A (en) * | 2022-09-26 | 2022-11-01 | 江苏骏利精密制造科技有限公司 | Data transmission method for vehicle-mounted purifier |
Family Cites Families (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2004098066A1 (en) * | 2003-04-28 | 2004-11-11 | Nippon Telegraph And Telephone Corporation | Floating point type digital signal reversible encoding method, decoding method, devices for them, and programs for them |
JP4719667B2 (en) * | 2006-12-28 | 2011-07-06 | 日立オートモティブシステムズ株式会社 | Time series data compression method |
US8000541B2 (en) * | 2008-10-28 | 2011-08-16 | Gregory James Gosian | Quantization differencing compression algorithm |
CN102263560B (en) * | 2010-05-28 | 2013-11-20 | 富士通株式会社 | Differential encoding method and system |
CN103051341B (en) * | 2012-12-31 | 2016-01-27 | 华为技术有限公司 | Data coding device and method, data deciphering device and method |
CN110210054A (en) * | 2019-04-25 | 2019-09-06 | 中国电力科学研究院有限公司 | A kind of sampled data preprocess method |
CN111988618B (en) * | 2019-05-22 | 2022-05-20 | 杭州海康威视数字技术股份有限公司 | Decoding and encoding method, decoding end and encoding end |
CN111211787A (en) * | 2019-10-09 | 2020-05-29 | 华中科技大学 | An industrial data compression method, system, storage medium and terminal |
CN112887713B (en) * | 2019-11-30 | 2022-06-28 | 华为技术有限公司 | Picture compression and decompression method and device |
CN111931345B (en) * | 2020-07-09 | 2021-11-02 | 西南交通大学 | A monitoring data prediction method, apparatus, device and readable storage medium |
CN112968751A (en) * | 2021-01-27 | 2021-06-15 | 伊之密机器人自动化科技(苏州)有限公司 | Industrial time sequence data compression method and system |
CN113078908B (en) * | 2021-03-10 | 2022-03-25 | 杭州又拍云科技有限公司 | Simple encoding and decoding method suitable for time sequence database |
Also Published As
Publication number | Publication date |
---|---|
CN114726380A (en) | 2022-07-08 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||