WO2021145105A1

WO2021145105A1 - Data compression device and data compression method

Info

Publication number: WO2021145105A1
Application number: PCT/JP2020/045668
Authority: WO
Inventors: 朋紀佐藤; 悠二西牧; 義己田中
Original assignee: ソニーグループ株式会社
Priority date: 2020-01-15
Filing date: 2020-12-08
Publication date: 2021-07-22

Abstract

This invention realizes a configuration and a process which improve compression efficiency in reversible compression processing for generating feature amount compression data and remainder data compression data by applying a learning model. This invention includes: a feature amount compression unit that applies a learning model to generate feature amount compression data of input data; a feature amount restoration unit that generates feature amount restoration data by restoration of the feature amount compression data; a difference calculation unit that calculates remainder data, which is the difference between the input data and the feature amount restoration data; a remainder data block division unit that generates a plurality of remainder division blocks from the remainder data; a remainder division block unit encoder that generates a plurality of remainder division block compression data pieces by compression processing of the remainder division blocks; and an output compression data generation unit that generates output compression data by synthesizing the feature amount compression data and the remainder division block compression data.

Description

Data compression device and data compression method

This disclosure relates to a data compression device and a data compression method. More specifically, the present invention relates to a data compression device that executes data compression processing, and a data compression method.

In recent years, the use of mobile devices such as smartphones and sensors such as cameras has increased rapidly. These devices (devices) generate various data such as communication data and image data, transfer the generated data to other devices via a network, and record media such as flash memory, optical disk, and hard disk. Performs processing such as recording in.

When transferring or recording a large amount of data, the data compression process improves the data transfer speed and reduces the data recording capacity, and an efficient compression processing technique with an increased compression rate is required.
A compression method using a machine learning model has been proposed as one aspect of a technique for compressing data generated from a sensor or a mobile device.

As an example of the compression method using the machine learning model, there is a method of extracting important feature data from the data to be compressed and performing compression processing of the extracted feature data to realize a high compression rate.
However, in this method, data other than the feature data selected from the data to be compressed is not included in the compression result data, and information loss occurs.

That is, it is impossible to restore data other than the feature data that is not included in the compressed data, and the restored data (decompressed data) will be different from the original data before compression. That is, there is a problem that the compression process to which machine learning is simply applied as described above becomes a lossy compression process.

Patent Document 1 (Japanese Unexamined Patent Publication No. 2019-14680) discloses a configuration that reduces missing data. This Patent Document 1 proposes a compression system with less information loss and high compression efficiency.
Specifically, an autoencoder, which is one of the machine learning methods, is used to extract features of a plurality of scales to restore a high-definition image from a small amount of information.

However, even if the configuration described in Patent Document 1 is applied, the missing data cannot be completely eliminated. That is, the lossless compression process has not been realized yet.

There are situations where information loss is not allowed depending on the purpose of data use, such as medical images. Lossless compression is essential for such data compression and restoration.

There is a compression system proposed by Shen et al. As a conventional technique that discloses a configuration that realizes lossless compression in a compression method using a machine learning model.
This arrangement is described in Non-Patent Document 1 ( "Lossless Compression of Curated Erythrocyte Images Using Deep Autoencoders for Malaria Infection Diagnosis" Published in 2016 Picture Coding Symposium 4-7 December 2016.Hongda Shen et al.).

In this non-patent document 1, non-extracted data other than the feature data not included in the compressed data to which the machine learning model is applied is extracted as residual data, and the residual data is further compressed and remains. Generate difference data compressed data.
That is, the final compressed data is a combination of compressed data to which the machine learning model is applied and residual data compressed data. When transmitting or storing this set data in the storage unit and restoring (decompressing) it, the compressed data to which the machine learning model, which is the constituent data of the set data, is applied and the residual data compressed data are individually extracted and individually. After restoring to, these restored data are combined to restore the data before compression processing.

However, in the configuration described in Non-Patent Document 1, the amount of the compressed data of the residual data becomes large, and the compressed data to which the machine learning model, which is the finally generated compressed data, is applied, and the residual data compression. There is also a problem that the amount of data is large and the compression efficiency is lowered as a result.

JP-A-2019-140680

This disclosure is made in view of the above problems, for example, and describes a data compression device and a data compression method that perform compression processing applying a machine learning model and realize lossless compression with an improved compression rate. The purpose is to provide.

The first aspect of the disclosure is
A feature compression unit that applies a learning model to generate feature compression data for input data,
A feature restoration unit that generates feature restoration data by executing restoration processing on the feature compression data, and a feature restoration unit.
A difference calculation unit that calculates residual data, which is the difference data between the input data and the feature amount restoration data,
A residual data block division unit that divides the residual data into blocks to generate a plurality of residual division blocks, and a residual data block division unit.
A residual division block unit encoder that executes compression processing for each of the plurality of residual division blocks and generates multiple residual division block compression data.
An output compression data generation unit that generates output compression data by synthesizing the feature amount compression data generated by the feature amount compression unit and a plurality of residual division block compression data generated by the residual division block unit encoder. It is in the data compression device that has.

Further, the second aspect of the present disclosure is
It is a data compression method executed in a data compression device.
The feature compression process that the feature compression unit applies the learning model to generate the feature compression data of the input data,
The feature amount restoration process in which the feature amount restoration unit executes the restoration process for the feature amount compressed data to generate the feature amount restoration data, and
The difference calculation unit calculates the residual data, which is the difference data between the input data and the feature amount restoration data, and the difference calculation process.
Residual data block division processing in which the residual data block division unit divides the residual data into blocks to generate a plurality of residual division blocks, and
The residual division block unit encoder executes the compression processing of each of the plurality of residual division blocks to generate a plurality of residual division block compression data, and the residual division block unit encoding processing.
The output compression data generation unit generates the output compression data by synthesizing the feature amount compression data generated by the feature amount compression unit and a plurality of residual division block compression data generated by the residual division block unit encoder. There is a data compression method that executes the output compression data generation process.

Still other objectives, features and advantages of the present disclosure will be clarified by more detailed description based on the examples of the present disclosure and the accompanying drawings described below. In the present specification, the system is a logical set configuration of a plurality of devices, and the devices having each configuration are not limited to those in the same housing.

According to the configuration of one embodiment of the present disclosure, a configuration and processing with improved compression efficiency are realized in a lossless compression process for generating feature amount compressed data and residual data compressed data to which a learning model is applied.
Specifically, for example, a feature quantity compression unit that applies a learning model to generate feature quantity compressed data of input data, a feature quantity restoration unit that generates feature quantity restoration data by restoring feature quantity compressed data, and an input. A difference calculation unit that calculates the residual data that is the difference between the data and the feature amount restoration data, a residual data block division unit that generates a plurality of residual division blocks from the residual data, and a compression process of the residual division block. Residual division block unit encoder that generates multiple residual division block compression data, and output compression data generator that synthesizes feature amount compression data and residual division block compression data to generate output compression data. Have.
With this configuration, it is possible to realize a configuration and processing in which the compression efficiency is improved in the lossless compression processing for generating the feature amount compressed data and the residual data compressed data to which the learning model is applied.
The effects described in the present specification are merely exemplary and not limited, and may have additional effects.

It is a figure explaining the configuration example of the data compression apparatus which executes the data compression processing using a learning model as a lossless compression processing. It is a figure explaining the configuration example of the data restoration apparatus which executes the data restoration processing using a learning model. It is a figure explaining the configuration example of the data compression apparatus which executes the data compression processing using a learning model as a lossless compression processing. It is a figure which shows the flowchart explaining the sequence of the data compression processing which executes the data compression processing using a learning model as a lossless compression processing. It is a figure explaining the configuration example of the data restoration apparatus which executes the data restoration processing using a learning model. It is a figure which shows the flowchart explaining the sequence of the data restoration processing executed by the data restoration apparatus which executes the data restoration processing using a learning model. It is a figure which shows the flowchart explaining the sequence of the generation and update process of a learning model. It is a figure explaining the structural example of the data compression apparatus of this disclosure. It is a figure explaining the specific example of the process executed by the residual data block division part of the data compression apparatus of this disclosure. It is a figure explaining the specific example of the process executed by the residual division block unit encoder of the data compression apparatus of this disclosure. It is a figure which shows the flowchart explaining the sequence of the data compression processing executed by the data compression apparatus of this disclosure. It is a figure explaining the error function used for learning of machine learning. It is a figure which shows the flowchart explaining the sequence of the generation | update process of the learning model of this disclosure. It is a figure which shows the flowchart explaining the determination processing sequence of the division bit number of the residual data executed by the residual data block division part of a data compression apparatus. It is a figure explaining the hardware configuration example of the data compression apparatus of this disclosure.

Hereinafter, the details of the data compression device and the data compression method of the present disclosure will be described with reference to the drawings. The explanation will be given according to the following items.
1. 1. Basic configuration for realizing lossless compression in data compression processing using a machine learning model 2. 2. About the configuration and processing sequence of the feature amount compression processing using the learning model, the data compression using the residual data compression processing, and the data restoration processing. About the generation sequence of the learning model 4. Regarding the configuration of the data compression device and the data compression processing sequence that combine the feature amount compression processing using the learning model of the present disclosure and the residual data compression processing 4-1. (Example 1) Example of a data compression device that divides each element of residual data into blocks of several bits (bit) in the compression processing of residual data and performs compression processing in block units 4-2. (Example 2) Example of generating residual data that enhances compression efficiency, that is, a learning model that generates residual data for enabling generation of residual compressed data with a smaller amount of data 4-3. (Example 3) Example of determining the optimum block division mode of residual data that enhances compression efficiency, that is, determining the block division mode that enables generation of residual compressed data with a smaller amount of data. About the hardware configuration example of the data compression device 6. Summary of the structure of this disclosure

[1. Basic configuration for realizing lossless compression in data compression processing using machine learning models]
First, before explaining the configuration and processing of the present disclosure, a basic configuration for realizing lossless compression in data compression processing using a machine learning model will be described.

Compression system that Shen et al described above has been proposed, that is, non-patent literature 1 ( "Lossless Compression of Curated Erythrocyte Images Using Deep Autoencoders for Malaria Infection Diagnosis" Published in 2016 Picture Coding Symposium 4-7 December 2016.Hongda Shen et al The explanation will be given based on the configuration disclosed in.).

The outline of the data compression configuration disclosed in Non-Patent Document 1 is as follows.
First, feature quantities are extracted from the data using an autoencoder that executes compression processing using a machine learning model, and feature quantity compression data is generated.
Further, the residual data which is the difference between the restored data based on the feature amount compressed data and the input data is calculated, and the residual compressed data which is the compressed data of the residual data is generated.
Finally, the data including the feature amount compressed data and the residual compressed data is generated as the final compressed data.

This final compressed data is data obtained by adding residual compressed data that is not included in this feature amount compressed data to the feature amount compressed data generated by the compression process using the machine learning model. Therefore, by the final restoration process of the compressed data, it is possible to generate the same restoration data as the original input data before the compression process. That is, lossless compression is realized. This has made it possible to apply it to medical images for which reversibility is strongly required.

With reference to FIG. 1, the configuration and processing of the data compression device proposed by Shen et al. Disclosed in Non-Patent Document 1 will be described.

The data compression device 20 shown in FIG. 1 is a configuration example of a data compression device that executes a data compression process using a machine learning model as a lossless compression process. The data compression device 20 shown in FIG. 1 performs compression processing on the compression target data 10 which is input data.

The compression target data 10 shown in FIG. 1 is image data. The image data is an example of the compression target data 10, and the compression target data 10 is not limited to the image data but can be various data such as numerical string data.
Here, it is assumed that the compression target data 10 is image data of 50 × 50 = 2500 pixels as shown in FIG.

The compression target data 10 is first input to the feature amount encoder 21. The feature amount encoder 21 generates the feature amount compression data acquired from the image data which is the compression target data 10, that is, the feature amount compression data 11 shown in FIG.
The feature amount acquired from the image data is various information such as edge information, color information, and shading information of the image.
The feature amount encoder 21 generates the feature amount compressed data 11 by using a machine learning model (hereinafter, described as a “learning model”) generated in advance.

The learning model is a learning model generated in advance based on various sample data, and is a learning model for generating feature-compressed data. This learning model includes, for example, parameter information for applying to compression processing.
The feature amount encoder 21 uses this learning model to generate the feature amount compressed data 11.

In the example shown in FIG. 1, an example in which image data of 2500 pixels is sequentially compressed in the order of 1500 pixels, 1000 pixels, and 30 pixels is shown.
The feature amount encoder 21 finally generates the feature amount compressed data 11 having a data amount corresponding to an image of 30 pixels.

The feature amount compressed data 11 generated by the feature amount encoder 21 is input to the feature amount decoder 22. Similar to the feature encoder 21, the feature decoder 22 executes a restoration process (decompression process) of the feature compressed data 11 by using a learning model generated in advance.

The learning model is a learning model for generating feature restoration (elongation) data generated based on various sample data, and is a learning model having parameter information to be applied to the restoration process.
The feature amount decoder 22 performs a data restoration process applying this learning model, and generates a feature amount restoration (extension) data 12, that is, a feature amount restoration image of 50 × 50 = 2500 pixels.

The feature amount restoration (elongation) data 12 is an image similar to the compression target data 10 before compression, but is not exactly the same image.
The reason is that the feature amount encoder 21 generates the feature amount compressed data 11 using only the feature amount selected from the compression target data 10, and the feature amount decoder 22 restores the feature amount compressed data 11 to restore the feature amount. This is because the (extended) data 12 is generated. That is, the feature amount restoration (extension) data 12 does not include information that has not been selected as the feature amount.

As described above, the feature amount restoration (decompression) data 12 is different from the compression target data 10 before compression, and the lossless compression processing cannot be realized by the data compression processing using the learning model.
The data compression device 20 shown in FIG. 1 further executes the following processing in order to realize the lossless compression processing.

First, the difference calculation unit 23 calculates the difference between the feature amount restoration (elongation) data 12 and the compression target data 10, that is, the residual data 13.
Next, the residual encoder 24 executes the compression process of the residual data 13.
The data compression process executed by the residual encoder 24 is not a compression process to which a learning model is applied, but a compression process whose reversibility is guaranteed. Specifically, for example, a dictionary-type compression process or a compression process to which an entropy coding process such as a Golomb coding process is applied is executed.
The residual encoder 24 generates the residual compressed data 14 by the compression process of the residual data 13.

Next, the output compressed data generation unit (bitstream generation unit) 25
(A) Feature amount compressed data 11 generated by the feature amount encoder 21.
(B) Residual compression data 14 generated by the residual encoder 24,
These two compressed data are input, and the output compressed data 15 is generated by combining them.

The output compressed data 15 is a set data of each of the compressed data (a) and (b), that is, feature amount compressed data + residual compressed data.
The output compressed data 15 is transmitted to an external device via, for example, the communication unit 26. Alternatively, it is stored in the storage unit 27.

The output compressed data 15 includes the feature amount compressed data 11 compressed by applying the learning model and the residual compressed data 14 which is the compressed data of the residual data 13 not included in the feature amount compressed data 11. It is data, and by the restoration process (decompression process) using the output compressed data 15, it is possible to reproduce the same data as the compressed target data 10 (input image) before compression.
That is, lossless compression processing is realized.

Next, the configuration and processing of the data restoration device that inputs the output compressed data 15 and executes the restoration processing will be described with reference to FIG.

The data restoration device 50 shown in FIG. 2 receives the output compressed data 15 generated by the data compression device 20 shown in FIG. 1 via a network, for example, and executes a restoration process (decompression process).

The data restoration device 50 shown in FIG. 2 inputs the output compressed data (feature amount compressed data + residual compressed data) 61 shown in FIG. 2 via the communication unit 51.
The output compressed data 61 corresponds to the output compressed data 15 generated by the data compression device 20 shown in FIG.

The output compressed data 61 is input to the data separation unit 52 and is separated into the feature amount compressed data 62 and the residual compressed data 63.
The feature amount compressed data 62 is input to the feature amount decoder 53.
Similar to the feature amount decoder 22 of the data compression device 20 described above with reference to FIG. 1, the feature amount decoder 53 uses a learning model generated in advance to restore the feature amount compressed data 62 (decompression processing). To execute.

The learning model is a learning model for generating feature restoration (elongation) data generated based on various sample data, and is a learning model having parameter information to be applied to the restoration process.
The feature amount decoder 53 performs a data restoration process applying this learning model, and generates a feature amount restoration (elongation) data 64, that is, a feature amount restoration image of 50 × 50 = 2500 pixels.

On the other hand, the residual compressed data 63 separated by the data separation unit 52 is input to the residual decoder 54.
The residual decoder 54 executes a restoration process (decompression process) on the residual compressed data 63. The restoration process executed by the residual decoder 54 is a data restoration process corresponding to the data compression process executed by the residual encoder 24 of the data compression device 20 described above with reference to FIG. 1, and has reversibility. It is executed as a restoration process of compressed data.
The residual decoder 54 generates the residual data 65 by the restoration process (decompression process) of the residual compressed data 63.
The residual data 65 is the same data as the residual data 13 shown in FIG.

The feature amount restoration (extension) data 64 generated by the feature amount decoder 53 and the residual data 65 generated by the residual decoder 54 are input to the compositing unit 55.
The compositing unit 55 executes a compositing process of the feature amount restoration (elongation) data 64 and the residual data 65 to generate the restoration (elongation) data 66.
The restored (decompressed) data 66 is the same data as the compression target data 10 before the compression process input by the data compression device 20 shown in FIG.

That is, the compressed data (compressed data 50 for output) generated by the data compression device 20 shown in FIG. 1 is restored by the restoration process executed by the data restoration device 50 shown in FIG. 2, and the data before compression (data 10 to be compressed). ), It is possible to generate the same restored (decompressed) data 66, and reversible compression is realized.

However, the data compression / restoration processing method described with reference to FIGS. 1 and 2, that is, the lossless compression method using the machine learning model has a problem that the compression rate is lowered.

As described above, the output compressed data 15 which is the final output generated by the data compression device 20 shown in FIG. 1 is
(A) Feature amount compressed data 11 generated by the feature amount encoder 21.
(B) Residual compression data 14 generated by the residual encoder 24,
It is compressed data that combines these two compressed data.

(B) Since the residual encoder 24 generates the residual compressed data 14 by the lossless compression process, the amount of the residual compressed data 14 tends to be large.
As a result, even if the data in (a) above, that is, the data amount of the feature amount compressed data 11 generated by applying the learning model by the feature amount encoder 21 is reduced, it is included in the final output compressed data 15. Due to the influence of the data of the above (b), that is, the residual compressed data 14 generated by the residual encoder 24, the amount of data of the final output compressed data 15 becomes large, and the compression efficiency is lowered. There is.

As described above, the residual encoder 24 compresses the residual data by using, for example, an entropy code that utilizes the frequency of occurrence of numerical values.
The amount of data of the residual compression data 14 generated by the residual encoder 24 of the data compression device 20 shown in FIG. 1 will be considered.
The residual encoder 24 performs a compression process of the residual data 13. The residual data 13 corresponds to the difference pixel value of each of the corresponding pixels of the compression target data (input image) 10 and the feature amount restoration data (feature amount restoration image) 12 calculated by the difference calculation unit 23.

For example, as an example, it is assumed that each pixel value of the compression target data (input image) 10 and the feature amount restoration data (feature amount restoration image) 12 is 32-bit int type (integer type) data.
In this case, the value of the residual data of each pixel is
-2,147,483,648 (-2 ³² ) to 2,147,483,647 (2 ³² -1)
It may be set to such a wide range of values.

If the residual data of each pixel may spread over an extremely wide range of values in this way, the frequency of appearance of numerical values is unlikely to be biased, and the compression rate of the entropy code does not improve.
That is, the amount of residual compressed data 14 generated by the residual encoder 24 increases.

Therefore, even if the compression process applying the learning model executed in the feature amount encoder 21 of the data compression device 20 shown in FIG. 1 realizes a high compression rate, the final output is as long as the residual data is difficult to compress. It becomes difficult to significantly reduce the amount of data of the output compressed data 15.

Further, the method of Shen et al. Described in Non-Patent Document 1 also has a problem in a learning model generation method, that is, a learning method, which is applied to the feature amount compression process in the feature amount encoder 21.

The output compressed data 15, which is the final output of the data compression device 20 shown in FIG. 1, is dominated by residual compressed data, and how to reduce the residual compressed data is the key to improving the compression rate. Therefore, the learning model should be optimized in consideration of increasing the compression efficiency of the residual data.

However, the learning model described in Non-Patent Document 1 is a model that minimizes the error between the compression target data 10 and the feature amount restoration (elongation) data 12, that is, learning that improves the quality of the extracted features. It is a model and does not consider the compression efficiency of residual data. As a result, the amount of residual compressed data in the output compressed data 15 which is the final output has increased.

The data size of the feature amount compressed data 11 generated by the feature amount encoder 21 is constant regardless of the quality of the features of the compression target data 10, and from the viewpoint of improving the compression rate, reducing the data amount of the residual compressed data is the key. Therefore, it is considered that there is room for improvement.

[2. About feature compression processing using learning model, data compression using residual data compression processing, and data restoration processing configuration and processing sequence]
Next, the configuration and processing sequence of the feature amount compression processing using the learning model, the data compression using the residual data compression processing, and the data restoration processing will be described.

The configuration described in Non-Patent Document 1 has been described with reference to FIGS. 1 and 2. Next, with reference to FIGS. 3 and 3, feature amount compression processing using a learning model and residual data compression The configuration and processing sequence of data compression and data restoration using processing will be described.

FIG. 3 is a diagram showing a configuration example of a data compression device 100 that performs data compression processing using both feature amount compression processing using a learning model and residual data compression processing.

The data compression device 100 shown in FIG. 3 includes a learning model application feature amount compression unit 101, a learning model application feature amount restoration unit 102, a difference calculation unit 103, a residual encoder 104, and an output compression data generation unit (bitstream generation unit). It has 105 and a learning model storage unit 110.

The data compression device 100 shown in FIG. 3 inputs the input data 121, which is the data to be compressed, into the learning model application feature amount compression unit 101.
The learning model application feature amount compression unit 101 applies the learning model stored in the learning model storage unit 110 to generate the feature amount compression data acquired from the input data 121, that is, the feature amount compression data 122.

The learning model stored in the learning model storage unit 110 is a learning model for generating feature amount compressed data generated based on various sample data, and is a learning model having parameter information and the like for applying to compression processing. It is a model.
The learning model application feature amount compression unit 101 uses this learning model to generate feature amount compression data 122.

The feature amount compression data 122 generated by the learning model application feature amount compression unit 101 is input to the learning model application feature amount restoration unit 102. The learning model application feature amount restoration unit 102 also applies the learning model stored in the learning model storage unit 110 to execute the restoration process (decompression process) of the feature amount compression data 122, and the feature amount restoration (elongation) data 123. To generate.

The learning model used by the learning model application feature restoration unit 102 is a learning model for generating feature restoration (elongation) data generated based on various sample data, and is a parameter for being applied to the restoration process. It is a learning model with information.
The learning model application feature amount restoration unit 102 performs a data restoration process to which the learning model is applied, and generates feature amount restoration (elongation) data 123.

The feature amount restoration (extension) data 123 generated by the learning model application feature amount restoration unit 102 is input to the difference calculation unit 1103.
The difference calculation unit 103 calculates the difference between the feature amount restoration (elongation) data 123 generated by the learning model application feature amount restoration unit 102 and the input data 121 before the compression process, that is, the residual data 124.

The residual data 124 calculated by the difference calculation unit 103 is input to the residual encoder 104.
The residual encoder 104 executes the compression process of the residual data 124 to generate the residual compressed data 125.
The data compression process executed by the residual encoder 104 is not a compression process to which a learning model is applied, but a compression process whose reversibility is guaranteed. Specifically, for example, a dictionary-type compression process or a compression process to which an entropy coding process such as a Golomb coding process is applied is executed.

However, as described above with reference to FIG. 1, there is a problem that the data compression process executed by the residual encoder 104 does not realize a sufficient reduction in the amount of data.
That is, for example, assuming that each element value of the compression target data (input data) 121 is 32-bit int type (integer type) data, the residual data with the feature amount restoration data 123 is -2,147,483. may take a wide range of values of 648 2,147,483,647 from ^{(-2 32)} ⁽² 32 -1). If each element of the residual data may spread over an extremely wide range of values in this way, the frequency of occurrence of numerical values is unlikely to be biased, and the compression rate of the entropy code does not improve.
That is, the amount of residual compressed data 125 generated by the residual encoder 104 becomes large.

The residual compressed data 125 generated by the residual encoder 104 is input to the output compressed data generation unit (bitstream generation unit) 105.

The output compressed data generation unit (bitstream generation unit) 105 is
(A) Feature compression data 122 generated by the learning model application feature compression unit 101,
(B) Residual compression data 125 generated by the residual encoder 104,
These two compressed data are input to generate output compressed data 126 in which they are combined.

The output compressed data 126 is a set data of each of the compressed data (a) and (b), that is, feature amount compressed data + residual compressed data.
The output compressed data 126 is transmitted to an external device via, for example, a communication unit. Alternatively, it is stored in the storage unit.

The output compressed data 126 includes the feature amount compressed data 122 that has been compressed by applying the training model and the residual compressed data 125 that is the compressed data of the residual data 124 that is not included in the feature amount compressed data 122. It is data, and by the restoration process (decompression process) using the output compressed data 126, it is possible to reproduce the same data as the input data 121 before compression. That is, lossless compression processing is realized.

Next, a sequence of data compression processing executed by the data compression device 100 shown in FIG. 3 will be described with reference to the flowchart shown in FIG.
The processing of each step of the flow shown in FIG. 4 will be sequentially described.

(Step S101)
First, the data compression device 100 inputs the data to be compressed in step S101.

(Step S102)
Next, in step S102, the data compression device 100 acquires a learning model (feature amount compression learning model, feature amount restoration learning model).

(Step S103)
Next, in step S103, the data compression device 100 executes a compression process applying the feature amount compression learning model to the input data (compression target data) to generate the feature amount compression data.

This process is a process executed by the learning model application feature amount compression unit 101 of the data compression device 100 shown in FIG.

(Step S104)
Next, in step S104, the data compression device 100 executes a restoration process in which the feature amount restoration learning model is applied to the feature amount compression data generated in step S103 to generate the feature amount restoration data.

This process is a process executed by the learning model application feature amount restoration unit 102 of the data compression device 100 shown in FIG.

(Step S105)
Next, the data compression device 100 calculates the difference (residual) between the input data (compression target data) and the feature amount restoration data generated in step S104 in step S105.

This process is a process executed by the difference calculation unit 103 of the data compression device 100 shown in FIG.

(Step S106)
Next, in step S106, the data compression device 100 executes a compression process for the difference (residual) calculated in step S105 to generate residual compression data.

This process is a process executed by the residual encoder 104 of the data compression device 100 shown in FIG.
However, as described above, there is a problem that the data compression process executed by the residual encoder 104 does not realize a sufficient reduction in the amount of data, that is, the compression efficiency is poor.

(Step S107)
Finally, in step S107, the data compression device 100 combines the feature amount compression data generated by the learning model application feature amount compression unit 101 in step S103 with the residual compression data generated by the residual encoder 104 in step S106. , Generate compressed data for output.

This process is a process executed by the output compressed data generation unit 105 of the data compression device 100 shown in FIG.

The output compression data generated by the output compression data generation unit 105 is the feature amount compression data generated by the learning model application feature amount compression unit 101 in step S103 and the residual compression data generated by the residual encoder 104 in step S106. It is the data that combines.
Since the output compressed data includes residual compressed data with poor compression efficiency generated by the residual encoder 104, the amount of data becomes large.

Next, with reference to FIG. 5, the configuration of the data restoration device 150 that inputs the output compressed data 105 generated by the data compression device 100 shown in FIG. 3 and executes the restoration process will be described.

As shown in FIG. 5, the data restoration device 150 includes a data separation unit 151, a learning model application feature amount restoration unit 152, a residual decoder 153, a synthesis unit 154, and a learning model storage unit 160.

The data restoration device 150 shown in FIG. 5 receives the output compression data 171 (= output compression data 126 shown in FIG. 3) generated by the data compression device 100 shown in FIG. 3 via a network, for example, and performs restoration processing. (Extension processing) is executed.

The output compressed data 171 which is the data to be restored is input to the data separation unit 151 of the data restoration device 150 shown in FIG.
The data separation unit 151 separates the two types of compressed data included in the output compressed data 171, that is, the feature amount compressed data 172 and the residual compressed data 173.
The feature amount compressed data 172 is input to the learning model application feature amount restoration unit 152.

Similar to the learning model applied feature amount restoring unit 102 of the data compression device 100 described above with reference to FIG. 3, the learning model applied feature amount restoring unit 152 uses the learning model generated in advance to generate the feature amount compressed data. The restoration process (decompression process) of 172 is executed.

The learning model is a learning model for generating feature restoration (elongation) data generated based on various sample data, and is a learning model having parameter information to be applied to the restoration process.
The learning model application feature amount restoration unit 152 performs a data restoration process to which this learning model is applied, and generates feature amount restoration (extension) data 174.

On the other hand, the residual compressed data 173 separated by the data separation unit 151 is input to the residual decoder 153.
The residual decoder 153 executes a restoration process (decompression process) on the residual compressed data 173. The restoration process executed by the residual decoder 153 is a data restoration process corresponding to the data compression process executed by the residual encoder 104 of the data compression device 100 described above with reference to FIG. 3, and has reversibility. It is executed as a restoration process of compressed data.

The residual decoder 153 generates the residual data 175 by the restoration process (decompression process) for the residual compressed data 173.
The residual data 175 is the same data as the residual data 124 shown in FIG.

The feature amount restoration (extension) data 174 generated by the learning model application feature amount restoration unit 152 and the residual data 175 generated by the residual decoder 153 are input to the synthesis unit 154.
The synthesizing unit 154 executes a synthesizing process of the feature amount restoration (elongation) data 174 and the residual data 175 to generate the restoration (elongation) data 176.
The restored (decompressed) data 176 is the same data as the input data 121, which is the data to be compressed before the compression process, which is input by the data compression device 100 shown in FIG.

In this way, the compressed data (output compressed data 126) generated by the data compression device 100 shown in FIG. 3 is restored to the input data 121 before compression by the restoration process executed by the data restoration device 150 shown in FIG. The same restored (decompressed) data 175 can be generated, and reversible compression is realized.

Next, a sequence of data restoration processing executed by the data restoration apparatus 150 shown in FIG. 5 will be described with reference to the flowchart shown in FIG.
The processing of each step of the flow shown in FIG. 6 will be sequentially described.

(Step S121)
First, the data restoration device 150 inputs the restoration target data in step S121.
The data to be restored to be input is data composed of feature amount compressed data and residual compressed data.

(Step S122)
Next, in step S122, the data restoration device 150 separates the restoration target data input in step S121 into feature amount compressed data and residual compressed data.

This process is a process executed by the data separation unit 151 of the data restoration device 150 shown in FIG.

(Step S123)
Next, the data restoration device 150 acquires a learning model (a learning model for feature amount restoration) in step S123.

(Step S124)
Next, in step S124, the data restoration device 150 executes a restoration process in which the feature amount restoration learning model is applied to the feature amount compressed data separated from the input data in the data separation process of step S122. Generate feature restoration data.

This process is a process executed by the learning model application feature amount restoration unit 152 of the data restoration device 150 shown in FIG.

(Step S125)
Next, in step S125, the data restoration device 150 executes a restoration process (decompression process) on the residual compressed data separated from the input data in the data separation process of step S122 to generate residual restoration data.

This process is a process executed by the residual decoder 153 of the data restoration device 150 shown in FIG.

(Step S126)
Next, in step S126, the data restoration device 150 executes a synthesis process of the feature amount restoration data generated in step S124 and the residual restoration data generated in step S125 to generate output restoration data.

This restored (decompressed) data is the same data as the input data 121, which is the data to be compressed before the compression process, which is input by the data compression device 100 shown in FIG.
In this way, the compressed data (output compressed data 126) generated by the data compression device 100 shown in FIG. 3 is restored by the restoration process executed by the data restoration device 150 shown in FIG. 5, and the data before compression (input data). It is possible to generate the same restored (decompressed) data 175 as in 121), and reversible compression is realized.

However, as described above, in the data compression / restoration processing method described with reference to FIGS. 3 to 6, that is, the reversible compression method using the machine learning model, the compression generated by the data compression device 100 shown in FIG. 3 There is a problem that the amount of residual compressed data included in the data (compressed data for output 126) increases and the compression efficiency decreases.

[3. About the learning model generation sequence]
Next, referring to the flowchart shown in FIG. 7, the learning model applied feature amount compression unit 101, the learning model applied feature amount restoring unit 102, and the data restoring device 150 shown in FIG. 5 of the data compression device 100 shown in FIG. The generation sequence of the learning model used by the learning model application feature amount restoration unit 151 of the above will be described.

The processing of each step of the flowchart shown in FIG. 7 will be sequentially described.
The process according to the flow shown in FIG. 7 can be executed by the data compression device 100 shown in FIG. 3, the data restoration device 150 shown in FIG. 5, or other device, and the process according to the flow shown in FIG. The learning model generated and updated by the data is stored in the learning model storage unit 110 of the data compression device 100 shown in FIG. 3 and the learning model storage unit 160 of the data restoration device 150 shown in FIG.

(Step S151)
First, in step S151, a learning model as a template is input.
The learning model that serves as a template includes a learning model for compression processing for generating feature amount compressed data and a learning model for restoration processing for generating restoration data from feature amount compression data.

The learning model for compression processing includes various parameters applied to the compression processing, and the learning model for restoration processing includes various parameters applied to the restoration processing.
The parameters of the initial template learning model can be set to any value.

(Step S152)
Next, in step S152, the loss function of the learning model input in step S151 is defined.
The loss function is a function for quantitatively evaluating the performance of the learning model.

Specifically, for example, an index value calculation function for measuring the distance between an input and an output to a data processing unit that executes data processing to which a learning model is applied, for example, a square error calculation function, a cross entropy error calculation function, or the like is used. ..

(Step S153)
Next, in step S153, the parameters of the current learning model input in step S151 are evaluated using the loss function defined in step S152.

Specifically, a feature amount is extracted from the input data, and the degree of deviation between the data restored from the extracted feature amount and the input data is evaluated by a loss function.
For example, a loss function is defined in which a value with a smaller loss (higher evaluation value) is calculated as the deviation between the restored data and the input data is smaller, and the parameters of the learning model are evaluated using this loss function.

(Steps S154 to S155)
Next, in step S154, the parameter update amount is calculated based on the evaluation value calculated in step S153, and in step S155, the parameter is updated based on the calculation result.
The evaluation and update processing of these parameters is repeatedly executed a predetermined number of times.

(Step S156)
If it is determined in step S156 that the parameter evaluation and update processing of steps S154 to S155 has reached a predetermined number of times, the process proceeds to step S157.

(Step S157)
Finally, in step S157, a learning model in which the finally updated parameters are set is generated by iterative processing of the specified number of sentences of the parameter evaluation and update processing of steps S154 to S155, and the learning model storage unit (FIG. 3). It is stored in the learning model storage unit 110 and the learning model storage unit 160) in FIG.

The learning model generated in this way is the learning of the learning model application feature amount compression unit 101 of the data compression device 100 shown in FIG. 3, the learning model application feature amount restoration unit 102, and the data restoration device 150 shown in FIG. It is used in the model application feature amount restoration unit 151, and the feature amount is compressed and restored.

However, as described above, this training model is a model that minimizes the error between the data to be compressed and the feature restoration (decompression) data, that is, a training model that improves the quality of the extracted features. , The ease of compression of residual data is not considered.
That is, since the amount of residual compressed data included in the output compressed data 126, which is the final output of the data compression device 100 shown in FIG. 3, is not taken into consideration, it does not become a learning model that contributes to the improvement of compression efficiency. There is a problem.

[4. About the configuration of the data compression device and the data compression processing sequence that combine the feature amount compression processing using the learning model of the present disclosure and the residual data compression processing]
Next, the configuration of the data compression device and the data compression processing sequence in which the feature amount compression processing using the learning model of the present disclosure and the residual data compression processing are used together will be described.

As described with reference to FIGS. 1 to 7, the lossless compression method using the existing learning model has the following problems.
(1) Poor compression efficiency of residual data (2) The learning method of the learning model is not optimal There are these two problems.

The factors that cause the above problems can be analyzed as follows.
(1) Poor compression efficiency of residual data Specifically, this problem is solved.
(1) Since each element of the residual data can take a wide range of values, it is difficult to compress the existing lexicographic order or entropy code.
It is caused by such factors.

In addition, (2) the learning method of the learning model is not optimal. Specifically, this problem is solved.
(2) The model is trained to generate good quality features, and the model is not trained for the original purpose of improving the overall compression ratio.
It is caused by such factors.

The data compression apparatus of the present disclosure described below solves these problems.
Similar to the data compression device described above with reference to FIGS. 1 and 3, the data compression device of the present disclosure also performs data compression processing using both feature amount compression processing using a learning model and residual data compression processing. Execute.

However, the data compression device of the present disclosure has improved the mode of compressing the residual data, and the amount of residual compressed data is greatly reduced as compared with the data compression device shown in FIGS. 1 and 3. .. As a result, a lossless compression process that reduces the data size of the output compression data (feature amount compression data + residual compression data) that is the final output is realized.

As an example of the data compression device of the present disclosure, the following plurality of examples will be sequentially described.
(Example 1) An embodiment of a data compression device that divides each element of residual data into blocks of several bits (bit) and performs compression processing in block units when compressing residual data.
(Example 2) An example of generating residual data that enhances compression efficiency, that is, a learning model that generates residual data for enabling generation of residual compressed data with a smaller amount of data.
(Example 3) Example of determining the optimum block division mode of residual data that enhances compression efficiency, that is, determining the block division mode that enables generation of residual compressed data with a smaller amount of data.

[4-1. (Example 1) Example of a data compression device that divides each element of residual data into blocks of several bits (bit) and performs compression processing in block units when compressing residual data]
First, as (Example 1), an embodiment of a data compression device that divides each element of residual data into blocks of several bits (bits) and performs compression processing in block units when compressing residual data. explain.

In this (Example 1), each element of the residual data is divided into blocks of several bits (bit), and compression processing is performed for each divided block, thereby narrowing the range in which the numerical value of each block can be taken. , The configuration is such that the amount of residual compressed data is reduced by improving the compression efficiency using the appearance rule and frequency of numerical values.

FIG. 8 shows a configuration example of the data compression device 200 of the first embodiment of the present disclosure.
The data compression device 200 of the present disclosure shown in FIG. 8 is data compression using both feature amount compression processing using a learning model and residual data compression processing, similar to the data compression devices of FIGS. 1 and 3 described above. Perform processing.

The data compression device 200 shown in FIG. 8 includes a learning model application feature amount compression unit 201, a learning model application feature amount restoration unit 202, a difference calculation unit 203, a residual data block division unit 204, a residual division block unit encoder 205, and an output. It has a compressed data generation unit (bitstream generation unit) 206 and a learning model storage unit 210.

The data compression device 200 shown in FIG. 8 inputs the input data 221 which is the data to be compressed into the learning model application feature amount compression unit 201.
The learning model application feature amount compression unit 201 applies the learning model stored in the learning model storage unit 210 to generate the feature amount compression data acquired from the input data 221, that is, the feature amount compression data 222.

The learning model stored in the learning model storage unit 210 is a learning model for generating feature amount compressed data generated based on various sample data, and is a learning model having parameter information for applying to compression processing. Is.
The learning model application feature amount compression unit 201 uses this learning model to generate feature amount compression data 222.
The compression process generated by the learning model application feature amount compression unit 201 is a lossy compression process.

The feature amount compression data 222 generated by the learning model application feature amount compression unit 201 is input to the learning model application feature amount restoration unit 202. The learning model application feature amount restoration unit 202 also applies the learning model stored in the learning model storage unit 210 to execute the restoration process (decompression process) of the feature amount compression data 222, and the feature amount restoration (elongation) data 223. To generate.

The learning model used by the learning model application feature restoration unit 202 is a learning model for generating feature restoration (elongation) data generated based on various sample data, and is a parameter for being applied to the restoration process. It is a learning model with information.
The learning model application feature amount restoration unit 202 performs a data restoration process to which the learning model is applied, and generates feature amount restoration (elongation) data 223.

The feature amount restoration (extension) data 223 generated by the learning model application feature amount restoration unit 202 is input to the difference calculation unit 203.
The difference calculation unit 203 calculates the difference between the feature amount restoration (elongation) data 223 generated by the learning model application feature amount restoration unit 202 and the input data 221 before the compression process, that is, the residual data 224.

The residual data 224 calculated by the difference calculation unit 203 is input to the residual data block division unit 204.
The residual data block division unit 204 divides the residual data 224 calculated by the difference calculation unit 203 into a plurality of blocks, and a plurality of (n) residual division blocks-n, 225-1 to n shown in the figure. To generate.

A specific example of the processing executed by the residual data block partitioning unit 204 will be described with reference to FIG.
The input data 221 shown in FIG. 8 is assumed to be 4-pixel image data. In this assumption, it is assumed that the residual data 224 input to the residual data block dividing unit 204 is composed of the residual data of each of the four pixels a to d shown in FIG.

The residual data 224 shown in FIG. 9 is composed of the residual data of each of the pixels a to d. These correspond to the difference pixel values of each of the four corresponding pixels a to d of the input data 221 and the feature amount restoration data 223 calculated by the difference calculation unit 203.

As shown in FIG. 9, the residual data of each of the four pixels a to d is as follows.
Residual of pixel a (difference pixel value) = 271
Residual of pixel b (difference pixel value) = 15
Residual of pixel c (difference pixel value) = 29
Residual of pixel d (difference pixel value) = 97
FIG. 9 shows a binary bit string (32 bits) corresponding to each of these numerical values.

In the example shown in FIG. 9, it is assumed that each pixel value of the input data 221 and the feature amount restoration data 223 is 32-bit int type (integer type) data, and the residual data of each of the pixels a to d is also 2. It is shown as a forward bit string (32 bits).

The residual data block partitioning unit 204 divides these 32-bit bit strings into a plurality of blocks.
The residual data block division unit 204 divides the bit strings at the same division position for all the elements (all pixels) of the residual data, and the configuration data of all the elements in the same division position is used as the configuration data of one residual division block. , Generate multiple residual partition blocks.

Specifically, for example, as shown in FIG. 9, all the elements (all pixels) of the residual data 224 are divided into four blocks in 8-bit units from the most significant bit to the least significant bit. In the example shown in FIG. 9, the number of bits per block, that is, the number of divided bits = 8 bits, is an example, and various divided modes are possible.

For example, it may be set to divide into two blocks in units of 16 bits, or it may be set to divide by a different number of bits such as 15 bits from the upper bit, the next 10 bits, and the last 7 bits.
The optimization process for the number of divided bits will be described in Example 3 described later.

In the example shown in FIG. 9, the residual data block dividing unit 204 transmits the 32-bit data of the residual data (271, 15, 29, 32) of each pixel a to d in 8-bit units from the upper bit to the lower bit. It is divided into four blocks, and four residual division blocks 1,225-1 to residual division blocks 4,225-4 are generated.

The residual division blocks 1,225-1 are residual division blocks composed of 8 bits from the most significant bit of the residual data of the pixels a to d.
The residual division blocks 2, 225-2 are residual division blocks composed of 8-bit data of 9 bits to 16 bits from the upper end of each residual data of pixels a to d.

The residual division blocks 3, 225-3 are residual division blocks composed of 8-bit data of 17 bits to 24 bits from the upper side of the residual data of the pixels a to d.
The residual division blocks 4, 225-4 are residual division blocks composed of 8-bit data (= lower 8-bit data) of 25 bits to 32 bits from the upper side of the residual data of the pixels a to d.

As described above with reference to FIG. 3, the residual encoder 104 of the conventional data compression device 100 performs entropy such as dictionary-type compression processing and Golomb coding processing on these 32-bit numerical strings. Although the code is executed, there is a problem that the data compression process executed by the residual encoder 104 does not realize a sufficient reduction in the amount of data.
In other words, residual data of each pixel may take a wide variety of values -2,147,483,648 2,147,483,647 from ^{(-2 32)} ⁽² 32 -1), such If the residual data of each pixel spreads over an extremely wide range of values, the frequency of appearance of the numerical values is unlikely to be biased, and the compression rate of the entropy code does not improve.

In order to solve this problem, the data compression device 200 of the present disclosure shown in FIG. 8 is configured to perform the following processing.
The residual data block dividing unit 204 converts a plurality of 32-bit data (271, 15, 29, 32) of the residual data (271, 15, 29, 32) of each pixel a to d from the upper bit to the lower bit (four in the example shown in FIG. 9). ) Is divided into blocks.
In the example shown in FIG. 9, four residual division blocks 1,225-1 to residual division blocks 4,225-4 are generated.
Further, the next residual division block unit encoder 205 executes the compression process in these plurality of residual division block units.

As shown in FIG. 8, the plurality of residual division blocks 1,225-1 to the residual division blocks n, 225-n generated by the residual data block division unit 204 are input to the residual division block unit encoder 205. NS.

The residual division block unit encoder 205 individually compresses these plurality of residual division blocks 1,225-1 to the residual division blocks n and 225-n, and the residual division block compression data according to the number of division blocks. -1,226-1 to Residual division block compression data -n, 226-n is generated.

An example of compression processing for each residual division block executed by the residual division block unit encoder 205 will be described with reference to FIG.

In FIG. 10, four residual division blocks generated by the residual data block division unit 204 described with reference to FIG. 9, that is, residual data of 4 pixels a to d (271, 15, 29, 32). The three residual division blocks 1,225-1 to the residual division blocks 4,225-4 are shown in which the 32-bit data of No. 1 is divided from the upper bit to the lower bit in 8-bit units.

The residual division block unit encoder 205 individually executes compression processing (encoding processing) on these four residual division blocks 1,225-1 to residual division blocks 4,225-4. The residual division block unit encoder 205 performs lexicographic compression or entropy code compression for each residual division block.
The compression process executed by the residual division block unit encoder 205 is a lossless compression process in the residual division block unit.
As a result of this compression processing, residual division block compression data-1,226-1 to residual division block compression data-4,226-4 according to the number of division blocks are generated.

The number of bits of each element (each pixel) included in each block is 8 bits, and the range in which the numerical value of this 8-bit data can be taken is narrower than that of the conventional method described above with reference to FIG. 3 and the like. Become.
As described above with reference to FIG. 3, the residual encoder 104 of the conventional data compression device 100 performs entropy such as dictionary-type compression processing and Golomb coding processing on these 32-bit numerical strings. Execute the code. However, residual data of each pixel may take a wide variety of values -2,147,483,648 2,147,483,647 from ^{(-2 32)} ⁽² 32 -1), such If the residual data of each pixel spreads over an extremely wide range of values, the frequency of appearance of the numerical values is unlikely to be biased, and the compression rate of the entropy code does not improve.

On the other hand, in the data compression device 200 of the present disclosure shown in FIG. 8, the 32-bit data string of each pixel is divided into 8-bit units for the data to be compressed. 8 Possible numerical range of bit data ^is significantly reduced and -128 (-2 8) and 127 ^{(2 8).} As a result, it is possible to greatly increase the probability that one block contains many same values (specifically, bit values = 0). As a result, the compression efficiency due to entropy encoding is greatly improved. That is, the data compression efficiency is increased, and the amount of compressed data can be reduced.

Among the four blocks of the 32-bit data string of each pixel a to d shown in FIG. 10, the block of the high-order bit has a value of almost all 0. In this way, the block in which all the bits are 0 can be reduced to the minimum number of bits as compressed data. That is, when the value of the residual data of each pixel is small, the amount of compressed data of the block of the high-order bit can be significantly reduced, and the compression rate can be improved.

The data to be compressed is the residual data 224 calculated by the difference calculation unit 203 shown in FIG. This is the difference data between the feature amount restoration (extension) data 223 generated by the learning model application feature amount restoration unit 202 and the input data 221 before the compression process.

This difference data is unlikely to take a large value and is likely to take a small value. The reason for this is that the data compression and restoration processing executed by the learning model application feature amount compression unit 201 and the learning model application feature amount restoration unit 202 is processing using the learning model, and this learning model inputs and restores data. This is because it is a learning model set to be close to each other.
Therefore, as shown in FIG. 10, most of the high-order bits of each pixel are likely to be set to 0, and the compression efficiency can be significantly improved.

Returning to FIG. 8, the configuration and processing of the data compression device 200 of the present disclosure will be described.
As described with reference to FIG. 10, the residual division block unit encoder 205 includes a plurality of residual division blocks 1,225-1 to residual division blocks n, 225- input from the residual data block division unit 204. n is individually compressed to generate residual divided block compressed data-1,226-1 to residual divided block compressed data -n, 226-n according to the number of divided blocks.

The plurality of residual division block compression data-1,226-1 to the residual division block compression data -n, 226-n generated by the residual division block unit encoder 205 are output compression data generation units (bitstream generation units). ) 206 is input.

The output compressed data generation unit (bitstream generation unit) 206 is
(A) Feature compression data 222 generated by the learning model application feature compression unit 201,
(B) Multiple residual division block compression data-1,226-1 to residual division block compression data-n, 226-n generated by the residual division block unit encoder 205,
These compressed data are input to generate output compressed data 227 in which they are combined.

The output compressed data 227 is a set data of each of the compressed data (a) and (b), that is, feature amount compressed data + residual division block compressed data.
The output compressed data 227 is transmitted to an external device via, for example, a communication unit. Alternatively, it is stored in the storage unit.

The output compression data 227 includes the feature amount compression data 222 that has been compressed by applying the learning model and the residual division block compression data 226 that is the compression data of the residual data 224 that is not included in the feature amount compression data 222. It is compressed data including, and by the restoration process (decompression process) using the output compressed data 227, it is possible to reproduce the same data as the input data 221 before compression. That is, lossless compression processing is realized.

Next, a sequence of data compression processing executed by the data compression device 200 shown in FIG. 8 will be described with reference to the flowchart shown in FIG.
The processing of each step of the flow shown in FIG. 11 will be sequentially described.

(Step S201)
First, the data compression device 200 inputs the data to be compressed in step S201.

(Step S202)
Next, in step S202, the data compression device 200 acquires a learning model (feature amount compression learning model, feature amount restoration learning model).

(Step S203)
Next, in step S203, the data compression device 200 executes a compression process applying the feature amount compression learning model to the input data (compression target data) to generate the feature amount compression data.

This process is a process executed by the learning model application feature amount compression unit 201 of the data compression device 200 shown in FIG.

(Step S204)
Next, in step S204, the data compression device 200 executes a restoration process applying the feature amount restoration learning model to the feature amount compression data generated in step S203 to generate the feature amount restoration data.

This process is a process executed by the learning model application feature amount restoration unit 202 of the data compression device 200 shown in FIG.

(Step S205)
Next, the data compression device 200 calculates the difference (residual) between the input data (compression target data) and the feature amount restoration data generated in step S204 in step S205.

This process is a process executed by the difference calculation unit 203 of the data compression device 200 shown in FIG.

(Step S206)
Next, in step S206, the data compression device 200 determines the number of division bits of the residual data calculated in step S205.

This process is a process executed by the residual data block partitioning unit 204 of the data compression device 200 shown in FIG.

The residual data block division unit 204 determines the number of division bits of the residual data calculated in step S205. Specifically, as described above with reference to FIG. 9, a predetermined number of divided bits such as a block of 8 bits from the high-order bit is used.
Alternatively, the optimum number of division bits for minimizing the compressed data of the residual data may be calculated and determined. This division bit number optimization process will be described in the latter part (Example 3).

(Step S207)
Next, in step S207, the data compression device 200 divides the residual data according to the number of division bits of the residual data determined in step S206, and generates a plurality of residual division blocks.

This process is a process executed by the residual data block partitioning unit 204 of the data compression device 200 shown in FIG.
Specifically, for example, as described above with reference to FIG. 9, a plurality of residual division blocks of 8 bits each are generated from the high-order bits.

(Step S208)
Next, in step S208, the data compression device 200 executes compression processing in units of the plurality of residual division block generated in step S207 to generate a plurality of residual division block compressed data.

This process is a process executed by the residual division block unit encoder 205 of the data compression device 200 shown in FIG.

The residual division block unit encoder 205 executes compression processing in a plurality of residual division block units generated by the residual data block division unit 204, as described above with reference to FIG. 10, and a plurality of residual division block units encoder 205. Generate residual split block compressed data.

(Step S209)
Finally, in step S209, the data compression device 200 includes the feature amount compression data generated by the learning model application feature amount compression unit 201 in step S203 and a plurality of residuals generated by the residual division block unit encoder 205 in step S208. Combine the divided block compressed data to generate output compressed data.

This process is a process executed by the output compressed data generation unit 206 of the data compression device 200 shown in FIG.

As described above, the output compression data 227 generated by the output compression data generation unit 206 of the data compression device 200 shown in FIG. 8 includes the feature amount compression data 222 that has been compressed by applying the learning model. It is the compressed data including the residual division block compressed data 226 which is the compressed data of the residual data 224 not included in the feature amount compressed data 222.
By the restoration process (decompression process) using the output compressed data 227, it is possible to reproduce the same data as the input data 221 before compression. That is, lossless compression processing is realized.

[4-2. (Example 2) About an example of generating residual data that enhances compression efficiency, that is, a learning model that generates residual data for enabling generation of residual compressed data with a smaller amount of data]
Next, as (Example 2), regarding an example of generating residual data that enhances compression efficiency, that is, a learning model that generates residual data for enabling generation of residual compressed data with a smaller amount of data. explain.

The learning model generated according to the second embodiment can be used in the data compression device 200 shown in FIG. 8 described in the first embodiment. It can also be used in data compression processing according to the processing flow described with reference to FIG.
That is, the learning model generated according to the second embodiment described below is the learning model used in the learning model applied feature amount compression unit 201 and the learning model applied feature amount restoring unit 202 of the data compression device 200 shown in FIG. Is.

The learning model application feature amount compression unit 201 of the data compression device 200 shown in FIG. 8 applies the learning model stored in the learning model storage unit 210, and compresses the feature amount acquired from the input data 221, that is, the feature. Amount compressed data 222 is generated.

The learning model stored in the learning model storage unit 210 is a learning model for generating feature amount compressed data generated based on various sample data, and is a learning model having parameter information for applying to compression processing. Is.
The learning model application feature amount compression unit 201 uses this learning model to generate feature amount compression data 222.

Further, the learning model application feature amount restoration unit 202 also applies the learning model stored in the learning model storage unit 210 to execute the restoration process (decompression processing) of the feature amount compression data 222, and the feature amount restoration (elongation). Generate data 223.

The second embodiment is an example relating to the generation processing of the learning model used by the learning model application feature amount compression unit 201 and the learning model application feature amount restoration unit 202.
The compression efficiency is improved by devising the learning model used by the learning model applied feature amount compression unit 201 and the learning model applied feature amount restoring unit 202.

Before explaining the second embodiment, the learning method for generating and updating the learning model and the compression efficiency of the residual data 224 generated in the data compression process using the data compression device 200 shown in FIG. 8 The relationship between

In the data compression process using the data compression device 200 shown in FIG. 8, the residual data 224 generated by the difference calculation unit 203 includes the feature amount restoration (elongation) data 223 generated by the learning model application feature amount restoration unit 202. This is the difference data from the input data 221 before the compression process.

The parameters of the learning model used by the learning model applied feature amount compression unit 201 and the learning model applied feature amount restoration unit 202 are the feature amount restoration (expansion) data 223 generated by the learning model applied feature amount restoration unit 202 and the compression processing. It is determined by how to evaluate the degree of deviation from the previous input data 221. That is, it depends on how the loss function corresponding to the learning model is defined.

For example, in the generation / update process of the existing learning model described above with reference to FIG. 7, the loss function is defined in step S152, and the parameters of the learning model are evaluated using the loss function defined in step S153. Is going.
In the parameter evaluation process in this existing system, as described above with reference to FIG. 7, the degree of deviation between the data restored from the feature quantity and the input data is evaluated by the loss function.
For example, a loss function is defined in which a value with a smaller loss (higher evaluation value) is calculated as the deviation between the restored data and the input data is smaller, and the parameters of the learning model are evaluated using this loss function. ..

As a typical example of the loss function for evaluating the parameters of the learning model used when generating and updating a general learning model, it is an error calculation that calculates the degree of deviation between the restored data from the feature quantity and the input data. A square error calculation function is known.

In the second embodiment, epsilon-insensitive loss is used as a loss function for evaluating the parameters of the learning model used by the learning model application feature amount compression unit 201 and the learning model application feature amount restoration unit 202. ) Use an error function called a calculation function.

With reference to FIG. 12, a square error calculation function, which is an error function used for general machine learning learning, and an epsilon-insensitive loss, which is an error function used in the second embodiment, are used. The calculation function will be described.

FIG. 12 shows the following two error functions.
(1) Square error calculation function (2) Epsilon-insensitive loss calculation function In each graph, the vertical axis (y) is the loss and the horizontal axis (x) is the difference (deviation degree) of the input / output data. ) Is shown.

The difference (degree of deviation) of the input / output data on the horizontal axis of each graph corresponds to the difference between the input data 221 of the data compression device 200 shown in FIG. 8 and the feature amount restoration data 223, that is, the residual data 224. ..

The relational expression between the loss (y) in the square error calculation function and the epsilon-insensitive loss calculation function and the difference (deviation degree) (x) of the input / output data is as follows (Equation 1). , (Equation 2).

In the above (Equation 1) and (Equation 2), i is an identifier of a component of the residual data. For example, when the data to be compressed is an image, it is a pixel identifier of the image constituent pixels.

When the squared error calculation function shown in FIG. 12 (1) is used as the loss function for evaluating the parameters of the training model, the loss increases significantly as the input / output difference, that is, the value of each element of the residual data increases (evaluation). The value drops significantly).
That is, when the squared error calculation function is used as the loss function, the learning model is generated and updated so as to make each element of the residual as small as possible.

On the other hand, when the epsilon-insensitive loss calculation function shown in FIG. 12 (2) is used as the loss function for evaluating the parameters of the training model, the absolute value of the input / output difference, that is, the residual It can be seen that the loss increases (the evaluation value decreases) when the value of each element of the data becomes larger than the specified value (ε: epsilon).

That is, when the epsilon-insensitive loss calculation function shown in FIG. 12 (2) is used as the loss function for evaluating the parameters of the training model, the value of each element of the residual data is the specified value. The training model is generated and updated so that it fits within (ε: epsilon).

As an example, the value of the specified value (ε: epsilon),
ε = 11111111
= 255
Let us consider the case.

In this case, the learning model is generated and updated so that the absolute value of the input / output difference, that is, the value of each element of the residual data is 255 (= 11111111) or less.

For example, as described above with reference to FIG. 9, when the input data is an image and the constituent pixels of the image are 32-bit data, the value of the residual data of each pixel, which is each element, is 255 (=). 11111111) The training model is generated and updated so as to be as follows.
That is, the learning model is generated and updated so that the value of the residual data of each pixel is 0 to 255 (= 00000000-111111111) or less, that is, the value fits in the lower 8 bits.

If the value of the residual data of each element is 0 to 255 (= 00000000-111111111) or less, that is, if it is within the lower 8 bits, the size of the value does not matter. , There are less restrictions on the size of the value.

In the data compression device 200 shown in FIG. 8 described in the first embodiment described above, as described above with reference to FIGS. 9 and 10, many blocks in which all the bits of the residual division block are 0 are created. This improves the compression efficiency.

As described with reference to FIGS. 9 and 10, when each element (each pixel) is 32-bit data, the residual data of each element (each pixel) is in the range of 0 to 255 (= 00000000 to 11111111). That is, if it can be converged to a value that can be expressed by the lower 8 bits, the constituent bits of the 3 blocks on the upper bit side of the residual division blocks 225-1 to 4 shown in FIGS. 9 and 10 are all 0.
With such a setting, it is possible to remarkably increase the compression efficiency of the residual division block.

That is, the epsilon-insensitive loss calculation function shown in FIG. 12 (2) is used as the loss function for evaluating the parameters of the learning model, and the specified value (ε: epsilon) is further divided into the residual division block. The learning model is generated and updated by setting it as a value that can be expressed by the number of bits included in the block on the least significant bit side of.

By setting the learning model generated and updated in this way as the learning model to be used in the learning model applied feature amount compression unit 201 of the data compression device 200 shown in FIG. 8 and the learning model applied feature amount restoring unit 202, the remaining It is possible to set the value of each element of the difference data to fit in the number of bits included in the block on the least significant bit side.

As a result, it is possible to remarkably increase the compression efficiency of the residual division block, and it is possible to reduce the amount of residual division block compressed data generated by the residual division block encoder 205 shown in FIG. 8, and finally. The amount of compressed data for output 227 can be reduced.

Next, the generation and update sequence of the learning model in the second embodiment will be described with reference to the flowchart shown in FIG.
The learning model generated and updated according to the flowchart shown in FIG. 13 is stored in the learning model storage unit 210 of the data compression device 200 described above with reference to FIG. 8 (Example 1), and the learning model application feature amount. It is used for feature amount data compression and feature amount data restoration processing in the compression unit 201 and the learning model application feature amount restoration unit 202.

The process according to the flow shown in FIG. 13 can be executed by the data compression device 200 shown in FIG. 8 or another device.
The processing of each step of the flowchart shown in FIG. 13 will be sequentially described.

(Step S251)
First, in step S251, a learning model as a template is input.
The learning model that serves as a template includes a learning model for compression processing for generating feature amount compressed data and a learning model for restoration processing for generating restoration data from feature amount compression data.

The learning model for compression processing includes various parameters applied to the compression processing, and the learning model for restoration processing includes various parameters applied to the restoration processing.
The parameters of the learning model of the initial template can be set to any value.

(Step S252)
Next, in step S252, the epsilon-insensitive loss calculation function used as the loss function of the learning model input in step S251, that is, the epsilon-insensitivity booros calculation shown in FIG. 12 (2). Determine the function parameters (ε: epsilon).
For example, the value described above, that is,
ε = 255 = 11111111
Such parameters (ε: epsilon) are determined.
The value of this parameter (ε: epsilon) can be determined to be an arbitrary value.

(Step S253)
Next, in step S253, the present input in step S251 using the loss function in which the parameter (ε: epsilon) determined in step S252 is set, that is, the epsilon-insensitive loss calculation function is used. Evaluate the parameters of the learning model of.

Specifically, the feature amount is extracted from the input data, and the loss according to the degree of deviation between the data restored from the extracted feature amount and the input data is determined by the function represented by the function (Equation 2) described above, that is, , Calculated and evaluated using the epsilon-insensitivity boulos calculation function shown in FIG. 12 (2).

As explained earlier, in the evaluation using the epsilon-insensitivity booloss calculation function, if the deviation between the restored data and the input data is less than the specified value (ε), the loss is evaluated as 0 (evaluation value = MAX). Will be done. The parameters of the learning model are evaluated using such a loss function.

(Steps S254 to S255)
Next, in step S254, the parameter update amount is calculated based on the evaluation value calculated in step S253, and in step S255, the parameter is updated based on the calculation result.
The evaluation and update processing of these parameters is repeatedly executed a predetermined number of times.

(Step S256)
When it is determined in step S256 that the parameter evaluation and update processing of steps S254 to S255 has reached a predetermined number of times, the process proceeds to step S257.

(Step S257)
Finally, in step S257, a learning model in which the finally updated parameters are set is generated by iterative processing of the specified number of sentences of the parameter evaluation and update processing of steps S254 to S255, and the learning model storage unit (FIG. 8). It is stored in the learning model storage unit 210).

The learning model generated in this way is used in the learning model applied feature amount compression unit 201 and the learning model applied feature amount restoration unit 202 of the data compression device 200 shown in FIG. 8, and the feature amount is compressed and restored. It is said.

This learning model is a learning model in which the difference between each element (for example, a pixel) of the compression target data and the feature amount restoration (elongation) data is set to a specified value (ε) or less as much as possible.
The difference calculated by the difference calculation 203 shown in FIG. 8 is calculated by performing the processing applying this learning model in the learning model applied feature amount compression unit 201 and the learning model applied feature amount restoring unit 202 of the data compression device 200 shown in FIG. The difference (residual) of each element of the data 224 is substantially equal to or less than the specified value (ε).

That is, it is possible to set the value of each element of the residual data to be substantially within the number of bits included in the block on the least significant bit side.
As a result, it is possible to remarkably increase the compression efficiency of the residual division block, and it is possible to reduce the amount of residual division block compressed data generated by the residual division block encoder 205 shown in FIG. 8, and finally. The amount of compressed data for output 227 can be reduced.

[4-3. (Example 3) Regarding the process of determining the optimum block division mode of residual data that enhances compression efficiency, that is, the embodiment of determining the block division mode that enables generation of residual compressed data with a smaller amount of data]
Next, as (Example 3), an embodiment of determining the optimum block division mode of residual data that enhances compression efficiency, that is, determining a block division mode that enables generation of residual compressed data with a smaller amount of data. Will be described.

In addition, this Example 3 corresponds to an example of the process of generating the residual data block in the residual data block division part 204 of the data compression apparatus 200 shown in FIG. 8 described in Example 1.
This is an example of the processing of steps S206 to S207 in the processing flow described above with reference to FIG.

In the third embodiment, in order to improve the compression efficiency of the data compression device 200 shown in FIG. 8 described in the first embodiment, the residual data block division 204 of the data compression device 200 shown in FIG. 8 has a residual error. In this embodiment, the semi-optimal number of bits that separate each element is determined.

As a processing example of the residual data block dividing unit 204, in the processing example described above with reference to FIG. 9, each element (pixel) of the input data 221 and the feature amount restoration data 223 is an int type (integer type) of 32 bits. An example of dividing a 32-bit bit string into 8-bit 4-blocks has been described, assuming that the data is the same as the above.

However, the bit division processing example shown in FIG. 9 is an example, and various division modes are possible.
For example, it is possible to set to divide into two blocks in units of 16 bits, or to divide by a different number of bits such as 15 bits from the upper bit, the next 10 bits, and the last 7 bits.

The number of bits that divide each element of the residual data determines the range of values that each block can take and the number of blocks in which all the bits in the block are 0. Therefore, the number of divided bits is an important parameter that affects the compression efficiency.

The third embodiment is an example of a data compression device that determines a quasi-optimal number of bits for dividing each element of the residual in the residual data block partitioning unit 204 of the data compression device 200 shown in FIG.
The residual data block division unit 204 determines the semi-optimal number of division bits based on a value called "redundancy" which is a guideline for the compression ratio.

Note that the processing according to the third embodiment is not an essential configuration of the data compression device 200 shown in FIG. 8 according to the first embodiment described above, and is an option when the calculation resources of the data compression device 200 are sufficient. It can be executed as.

The flowchart shown in FIG. 14 is a processing sequence for determining the number of divided bits of the residual data 224 executed by the residual data block dividing unit 204 of the data compression device 200 shown in FIG. 8 according to the third embodiment.
Hereinafter, the processing of each step of the flowchart shown in FIG. 14 will be described.

(Step S301)
First, in step S301, the residual data block division unit 204 of the data compression device 200 shown in FIG. 8 determines a plurality of “residual data delimiter bit number candidates” for dividing the residual data into a plurality of blocks.

(Step S302)
Next, in step S302, the residual data block division unit 204 selects one “residual data delimiter bit number candidate” from the plurality of “residual data delimiter bit number candidates” determined in step S301.

(Step S303)
Next, the residual data block division unit 204 applies one “residual data delimiter bit number candidate” selected in step S302 to the residual data 224 input from the difference calculation unit 203 in step S303. To generate a residual division block.

(Step S304)
Next, in step S304, the residual data block division unit 204 calculates the redundancy of each of the residual division blocks generated in step S303, and calculates the average redundancy which is the average value of all the calculated redundancy blocks. do.

The redundancy is defined by the following (Equation 3).
Redundancy: r = 1- (H / H_max) = 1- (entropy / maximum entropy)
... (Equation 3)

The entropy shown in the above (Equation 3) is defined by the following (Equation 4).
Entropy: H = -Σ _k p _i log ₂ p _i ... (Equation 4)

In the above (Equation 4)
p _i represents the probability of occurrence of a certain numerical value contained in the residual division blocks.
For example, suppose that the residual division block contains four numerical values 1, 2, 2, and 4.
At this time, the appearance probabilities of the numerical values 1, 2, and 4 are 1/4, 1/2, and 1/4, respectively.

The maximum entropy is assumed to be 1/3 in the example when the appearance probabilities of all the numerical values are equal, that is, when four numerical values of 1, 2, 2, and 4 are included in the residual division block. It is the value of entropy when.

In step S304, the residual data block division unit 204 calculates the redundancy of the residual division block generated in step S303 using the above (Equation 3).

In step S304, the residual data block division unit 204 calculates the redundancy of the residual division block for all of the plurality of residual division blocks, and also calculates the average value (average redundancy) thereof.
The calculated average redundancy can be regarded as an evaluation value for estimating the ease of compressing the residual data, that is, how much the compression rate can be increased.

(Step S305)
Next, in step S305, the residual data block division unit 204 determines whether or not the redundancy calculation process for all the candidates of the plurality of “residual data delimiter bit number candidates” determined in step S301 has been completed. do.

If there is an unprocessed "residual data delimiter bit number candidate", the process returns to step S302, and the processes of steps S302 to S304 are executed for the unprocessed "residual data delimiter bit number candidate".

If it is determined in step S305 that the redundancy calculation process for all the candidates of the plurality of "residual data delimiter bit number candidates" determined in step S301 is completed, the process proceeds to step S306.

(Step S306)
Next, in step S306, the residual data block division unit 204 applies the “candidate for the number of residual data delimiter bits” having the highest average calculated average redundancy to the final “residual” for generating the residual division block. It is determined as "the number of bits separated by difference data".

The number of bits when the average value of the redundancy of each block is the largest can be regarded as the number of block division bits of the semi-optimal residual data. The reason why the "quasi" optimum is used is that the lexicographic compression method and the entropy code cannot always compress the data according to the redundancy. However, in many cases, the data compression device 200 of the first embodiment described above with reference to FIG. 8 by separating each element of the residual data by the number of bits determined according to the flow shown in FIG. It can be expected to improve the compression efficiency.

[5. About the hardware configuration example of the data compression device]
Next, a hardware configuration example of the data compression device will be described with reference to FIG.
The hardware described with reference to FIG. 15 is one specific hardware configuration example of the data compression device 200 described with reference to FIG. 8 above.

The CPU (Central Processing Unit) 301 functions as a control unit or a data processing unit that executes various processes according to a program stored in the ROM (Read Only Memory) 302 or the storage unit 308. For example, the process according to the sequence described in the above-described embodiment is executed. The RAM (Random Access Memory) 303 stores programs and data executed by the CPU 301. These CPU 301, ROM 302, and RAM 303 are connected to each other by a bus 304.

The CPU 301 is connected to the input / output interface 305 via the bus 304, and the input / output interface 305 is connected to an input unit 306 consisting of various switches, a keyboard, a mouse, a microphone, a sensor, etc., and an output unit 307 consisting of a display, a speaker, and the like. Has been done. The CPU 301 executes various processes in response to a command input from the input unit 306, and outputs the process results to, for example, the output unit 307.

The storage unit 308 connected to the input / output interface 305 is composed of, for example, a hard disk or the like, and stores programs executed by the CPU 301 and various data. The communication unit 309 functions as a transmission / reception unit for Wi-Fi communication, Bluetooth (registered trademark) (BT) communication, and other data communication via a network such as the Internet or a local area network, and communicates with an external device.

The drive 310 connected to the input / output interface 305 drives a removable medium 311 such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory such as a memory card, and records or reads data.

[6. Summary of the structure of this disclosure]
As described above, the examples of the present disclosure have been described in detail with reference to the specific examples. However, it is self-evident that one of ordinary skill in the art can modify or substitute the examples without departing from the gist of the present disclosure. That is, the present invention has been disclosed in the form of an example, and should not be construed in a limited manner. In order to judge the gist of this disclosure, the column of claims should be taken into consideration.

The technology disclosed in the present specification can have the following configuration.
(1) A feature amount compression unit that applies a learning model to generate feature amount compression data of input data, and
A feature restoration unit that generates feature restoration data by executing restoration processing on the feature compression data, and a feature restoration unit.
A difference calculation unit that calculates residual data, which is the difference data between the input data and the feature amount restoration data,
A residual data block division unit that divides the residual data into blocks to generate a plurality of residual division blocks, and a residual data block division unit.
A residual division block unit encoder that executes compression processing for each of the plurality of residual division blocks and generates multiple residual division block compression data.
An output compression data generation unit that generates output compression data by synthesizing the feature amount compression data generated by the feature amount compression unit and a plurality of residual division block compression data generated by the residual division block unit encoder. Data compression device to have.

(2) The feature amount compression unit is
The data compression device according to (1), wherein a feature amount included in the input data is selected and a lossy compression process for generating the feature amount compression data is executed.

(3) The residual division block unit encoder is
The data compression device according to (1) or (2), which executes a lossless compression process.

(4) The feature amount restoration unit is
The data compression device according to any one of (1) to (3), wherein a learning model is applied to execute a restoration process on the feature amount compressed data.

(5) The data compression device is
The data compression device according to any one of (1) to (4), which has a learning model storage unit that stores the learning model.

(6) The feature amount compression unit is
The data compression device according to any one of (1) to (5), which is configured to generate the feature amount compression data by applying a learning model including parameters applied to the feature amount compression process of input data.

(7) The parameters included in the learning model are
The data compression device according to (6), which is a parameter generated based on the evaluation result using the epsilon-insensitive loss calculation function, which is a loss function for evaluating the performance of the training model. ..

(8) The residual data block partitioning portion is
The data compression device according to any one of (1) to (7), wherein the residual data is divided into blocks for each element to generate a plurality of residual division blocks.

(9) The residual data block partitioning portion is
The data compression according to any one of (1) to (8), wherein the bit string constituting each element of the residual data is divided into a plurality of blocks from the most significant bit to the least significant bit to generate a plurality of residual division blocks. Device.

(10) The residual data block partitioning portion is
For all the elements of the residual data, the bit strings are separated at the same delimiter position.
The data compression device according to any one of (1) to (9), wherein a plurality of residual division blocks are generated by using the configuration data of all the elements in the same division position as the configuration data of one residual division block.

(11) The residual data block partitioning portion is
The data compression device according to any one of (1) to (10), which executes a process of determining the number of delimiter bits for generating the residual division block.

(12) The residual data block partitioning portion is
The redundancy of the residual division block generated by applying a plurality of different delimiter bits to the same residual data is calculated.
The data compression device according to (11), wherein the optimum number of delimiter bits is determined based on the calculated redundancy.

(13) The redundancy is a redundancy that shows a larger value as the compression efficiency is higher.
The residual data block partitioning unit
The data compression device according to (12), wherein the number of delimiter bits having the maximum calculated redundancy is determined as the optimum number of delimiter bits.

(14) The residual division block unit encoder is
Any one of (1) to (13), in which lossless compression processing is individually executed for each of the plurality of residual division blocks generated by the residual data block division unit to generate a plurality of residual division block compressed data. The data compression device described in.

(15) A data compression method executed in a data compression device.
The feature compression process that the feature compression unit applies the learning model to generate the feature compression data of the input data,
The feature amount restoration process in which the feature amount restoration unit executes the restoration process for the feature amount compressed data to generate the feature amount restoration data, and
The difference calculation unit calculates the residual data, which is the difference data between the input data and the feature amount restoration data, and the difference calculation process.
Residual data block division processing in which the residual data block division unit divides the residual data into blocks to generate a plurality of residual division blocks, and
The residual division block unit encoder executes the compression processing of each of the plurality of residual division blocks to generate a plurality of residual division block compression data, and the residual division block unit encoding processing.
The output compression data generation unit generates the output compression data by synthesizing the feature amount compression data generated by the feature amount compression unit and a plurality of residual division block compression data generated by the residual division block unit encoder. Output compression A data compression method that executes data generation processing.

Further, the series of processes described in the specification can be executed by hardware, software, or a composite configuration of both. When executing processing by software, install the program that records the processing sequence in the memory in the computer built in the dedicated hardware and execute it, or execute the program on a general-purpose computer that can execute various processing. It can be installed and run. For example, the program can be pre-recorded on a recording medium. In addition to installing on a computer from a recording medium, it is possible to receive a program via a network such as LAN (Local Area Network) or the Internet and install it on a recording medium such as a built-in hard disk.

The various processes described in the specification are not only executed in chronological order according to the description, but may also be executed in parallel or individually as required by the processing capacity of the device that executes the processes. Further, in the present specification, the system is a logical set configuration of a plurality of devices, and the devices having each configuration are not limited to those in the same housing.

As described above, according to the configuration of one embodiment of the present disclosure, the configuration and processing in which the compression efficiency is improved in the lossless compression processing for generating the feature amount compressed data and the residual data compressed data to which the learning model is applied. Is realized.
Specifically, for example, a feature quantity compression unit that applies a learning model to generate feature quantity compressed data of input data, a feature quantity restoration unit that generates feature quantity restoration data by restoring feature quantity compressed data, and an input. A difference calculation unit that calculates the residual data that is the difference between the data and the feature amount restoration data, a residual data block division unit that generates a plurality of residual division blocks from the residual data, and a compression process of the residual division block. Residual division block unit encoder that generates multiple residual division block compression data, and output compression data generator that synthesizes feature amount compression data and residual division block compression data to generate output compression data. Have.
With this configuration, it is possible to realize a configuration and processing in which the compression efficiency is improved in the lossless compression processing for generating the feature amount compressed data and the residual data compressed data to which the learning model is applied.

20 Data compression device 21 Feature encoder 22 Feature decoder 23 Difference calculation unit 24 Residual encoder 25 Output compression data generation unit (bitstream generation unit)
26 Communication unit 27 Storage unit 50 Data restoration device 51 Communication unit 52 Data separation unit 53 Feature amount decoder 54 Residual decoder 55 Synthesis unit 100 Data compression device 101 Learning model application feature amount compression unit 102 Learning model application feature amount restoration unit 103 Difference Calculation unit 104 Residual encoder 105 Output compressed data generation unit (bit stream generation unit)
110 Learning model storage unit 150 Data restoration device 151 Data separation unit 152 Learning model application feature amount restoration unit 153 Residual decoder 154 Synthesis unit 160 Learning model storage unit 200 Data compression device 201 Learning model application feature amount compression unit 202 Learning model application feature Amount restoration unit 203 Difference calculation unit 204 Residual data block division unit 205 Residual division block unit encoder 206 Compressed data generation unit for output (bit stream generation unit)
210 Learning model storage 301 CPU
302 ROM
303 RAM
304 Bus 305 Input / output interface 306 Input unit 307 Output unit 308 Storage unit 309 Communication unit 310 Drive 311 Removable media

Claims

A feature compression unit that applies a learning model to generate feature compression data for input data,
A feature restoration unit that generates feature restoration data by executing restoration processing on the feature compression data, and a feature restoration unit.
A difference calculation unit that calculates residual data, which is the difference data between the input data and the feature amount restoration data,
A residual data block division unit that divides the residual data into blocks to generate a plurality of residual division blocks, and a residual data block division unit.
A residual division block unit encoder that executes compression processing for each of the plurality of residual division blocks and generates multiple residual division block compression data.
An output compression data generation unit that generates output compression data by synthesizing the feature amount compression data generated by the feature amount compression unit and a plurality of residual division block compression data generated by the residual division block unit encoder. Data compression device to have.
The feature amount compression unit is
The data compression device according to claim 1, wherein a feature amount included in the input data is selected and a lossy compression process for generating the feature amount compressed data is executed.
The residual division block unit encoder is
The data compression device according to claim 1, wherein the lossless compression process is executed.
The feature amount restoration unit is
The data compression device according to claim 1, wherein a learning model is applied to execute a restoration process on the feature amount compressed data.
The data compression device is
The data compression device according to claim 1, further comprising a learning model storage unit that stores the learning model.
The feature amount compression unit is
The data compression device according to claim 1, wherein a learning model including a parameter applied to a feature amount compression process of input data is applied to generate the feature amount compression data.
The parameters included in the learning model are
The data compression device according to claim 6, which is a parameter generated based on an evaluation result using an epsilon-insensitive loss calculation function, which is a loss function for evaluating the performance of a training model. ..
The residual data block partitioning unit
The data compression device according to claim 1, wherein the residual data is divided into blocks for each element to generate a plurality of residual division blocks.
The residual data block partitioning unit
The data compression device according to claim 1, wherein the bit string constituting each element of the residual data is divided into a plurality of blocks from the most significant bit to the least significant bit to generate a plurality of residual division blocks.
The residual data block partitioning unit
For all the elements of the residual data, the bit strings are separated at the same delimiter position.
The data compression device according to claim 1, wherein a plurality of residual division blocks are generated by using the configuration data of all the elements in the same division position as the configuration data of one residual division block.
The residual data block partitioning unit
The data compression device according to claim 1, wherein the process of determining the number of delimiter bits for generating the residual division block is executed.
The residual data block partitioning unit
The redundancy of the residual division block generated by applying a plurality of different delimiter bits to the same residual data is calculated.
The data compression device according to claim 11, wherein the optimum number of delimiter bits is determined based on the calculated redundancy.
The redundancy is a redundancy that shows a larger value as the compression efficiency is higher.
The residual data block partitioning unit
The data compression device according to claim 12, wherein the number of delimiter bits having the maximum calculated redundancy is determined as the optimum number of delimiter bits.
The residual division block unit encoder is
The data compression device according to claim 1, wherein a lossless compression process is individually executed for each of the plurality of residual division blocks generated by the residual data block division unit to generate a plurality of residual division block compressed data. ..
It is a data compression method executed in a data compression device.
The feature compression process that the feature compression unit applies the learning model to generate the feature compression data of the input data,
The feature amount restoration process in which the feature amount restoration unit executes the restoration process for the feature amount compressed data to generate the feature amount restoration data, and
The difference calculation unit calculates the residual data, which is the difference data between the input data and the feature amount restoration data, and the difference calculation process.
Residual data block division processing in which the residual data block division unit divides the residual data into blocks to generate a plurality of residual division blocks, and
The residual division block unit encoder executes the compression processing of each of the plurality of residual division blocks to generate a plurality of residual division block compression data, and the residual division block unit encoding processing.
The output compression data generation unit generates the output compression data by synthesizing the feature amount compression data generated by the feature amount compression unit and a plurality of residual division block compression data generated by the residual division block unit encoder. Output compression A data compression method that executes data generation processing.