WO2023095204A1 - Learning device, autoencoding device, learning method, autoencoding method, and program - Google Patents

Learning device, autoencoding device, learning method, autoencoding method, and program

Info

Publication number
WO2023095204A1
Authority
WO
WIPO (PCT)
Prior art keywords
encoding
feature amount
self
main data
learning
Prior art date
Application number
PCT/JP2021/042980
Other languages
French (fr)
Japanese (ja)
Inventor
忍 工藤
幸浩 坂東
正樹 北原
Original Assignee
日本電信電話株式会社
Priority date
Filing date
Publication date
Application filed by 日本電信電話株式会社
Priority to JP2023563381A
Priority to PCT/JP2021/042980
Publication of WO2023095204A1

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods

Definitions

  • the present invention relates to a learning device, a self-encoding device, a learning method, a self-encoding method and a program.
  • the technique of Non-Patent Document 1 is a technology called VQVAE.
  • in the technique of Non-Patent Document 1, end-to-end learning is not performed: the distribution of the feature vectors is assumed to be a uniform distribution, and the optimization of encoding and decoding and the optimization of the representative vectors are carried out separately.
  • the technique of Non-Patent Document 2 is a technique called Soft to Hard.
  • the technique of Non-Patent Document 2 performs end-to-end learning by approximating the quantization process using a softmax function and approximating the probability of occurrence of representative vectors with a histogram.
  • the technique of Non-Patent Document 1 and the technique of Non-Patent Document 2 both have problems related to obtaining self-encoding processing using vector quantization.
  • the problem may be, for example, a stability problem, or a processing-performance problem in which, when the dimension of the vector is large or when encoding is performed at a high rate, the amount of computation or the memory usage increases exponentially.
  • for these reasons, the burden required to obtain self-encoding processing using vector quantization was sometimes large.
  • for example, the load required for self-encoding using vector quantization can be so large that self-encoding using vector quantization cannot be realized owing to a lack of computer resources.
  • an object of the present invention is to provide a technology that reduces the burden required for self-encoding using vector quantization.
  • One aspect of the present invention is a learning device including a learning unit that updates, by learning, the encoding process and the decoding process in a self-encoding process using vector quantization, the self-encoding process using a main data feature amount that is a feature amount of a self-encoding target and an auxiliary feature amount that is a feature amount of the main data feature amount, and performing entropy coding on the result of vector quantization of the main data feature amount and on the result of scalar quantization of the auxiliary feature amount. In the learning, the learning unit executes main-data-side probability estimation processing for estimating the occurrence probability of each element of the tensor indicating the main data feature amount, and the main-data-side probability estimation processing estimates the occurrence probability using the result of integrating a parameterized probability density function over an integration region that is one of the regions into which the vector space in which the representative vectors are arranged in a lattice is divided, that contains one lattice point of the vector space, and that has a hyperrectangular parallelepiped shape.
  • One aspect of the present invention is an autoencoding device including a self-encoding target acquisition unit that acquires a self-encoding target, the learning unit described above, and a self-encoding execution unit that performs self-encoding by vector quantization on the target acquired by the self-encoding target acquisition unit, using the learned encoding process and the learned decoding process.
  • One aspect of the present invention is a learning method including a learning step of updating, by learning, the encoding process and the decoding process in a self-encoding process using vector quantization, in which the occurrence probability is estimated using the result of integrating a parameterized probability density function with a given region as the integration region.
  • One aspect of the present invention is a self-encoding method including a self-encoding target acquisition step of acquiring a self-encoding target, the learning described above, and a self-encoding execution step of performing self-encoding by vector quantization on the target acquired in the self-encoding target acquisition step, using the learned encoding process and the learned decoding process.
  • One aspect of the present invention is a program for causing a computer to function as the above learning device.
  • One aspect of the present invention is a program for causing a computer to function as the above self-encoding device.
  • the present invention makes it possible to reduce the burden required for self-encoding using vector quantization.
  • FIG. 2 is an explanatory diagram for explaining LatticeVQ in the embodiment.
  • FIG. 3 is a first explanatory diagram for explaining an example of adding noise in the embodiment.
  • FIG. 4 is a second explanatory diagram for explaining an example of adding noise in the embodiment.
  • FIG. 7 is a diagram showing an example of the flow of the processing executed by the learning unit in the embodiment.
  • FIG. 8 is a first explanatory diagram for explaining an outline of the self-encoding device according to the embodiment.
  • FIG. 9 is a second explanatory diagram for explaining the outline of the self-encoding device in the embodiment.
  • FIG. 10 is a flowchart showing an example of the flow of processing executed by the encoder according to the embodiment.
  • a flowchart showing an example of the flow of processing executed by the decoder according to the embodiment.
  • a diagram showing an example of the configuration of the control unit provided in the learning device in the embodiment.
  • a diagram showing an example of the hardware configuration of the self-encoding device in the embodiment.
  • a diagram showing an example of the configuration of the control unit provided in the self-encoding device in the embodiment.
  • FIG. 1 is an explanatory diagram for explaining an overview of the learning device 1 of the embodiment.
  • the learning device 1 includes a learning section 10 .
  • the learning unit 10 performs learning so as to improve the performance of self-encoding using vector quantization of data represented by a tensor.
  • the performance of self-encoding is evaluated by the smallness of the RD cost, which is the weighted sum D + λR of the error D between the original data and the restored data and the code amount R of the data, weighted by the Lagrangian constant λ (hereinafter, D + λR is referred to as the RD cost).
  • a smaller RD cost indicates better RD performance.
  • learning means machine learning. Learning is, for example, deep learning.
  • data self-encoding means compression of data.
  • the learning unit 10 includes a learning network 100 and an optimization unit 113.
  • Learning network 100 is a neural network.
  • the learning network 100 includes a main data acquisition unit 101, a main data side encoding unit 102, an auxiliary data side encoding unit 103, a main data side noise addition unit 104, an auxiliary data side noise addition unit 105, an auxiliary data side probability estimation unit 106, an auxiliary data side decoding unit 107, an auxiliary entropy acquisition unit 108, a main data side probability estimation unit 109, a main entropy acquisition unit 110, a main data side decoding unit 111, and a reconstruction error calculation unit 112.
  • the optimization unit 113 updates the learning network 100 based on the output of the learning network 100 .
  • the main data acquisition unit 101 acquires data represented by a tensor as main data.
  • Data represented by a tensor is, for example, image data.
  • the tensor-expressed data may be, for example, time-series data of one or more channels of audio.
  • the data acquired by the main data acquisition unit 101 is hereinafter referred to as main data.
  • the main data side encoding unit 102 executes main data feature quantity acquisition processing.
  • the main data feature amount acquisition process is a process of encoding the main data.
  • encoding is a process of obtaining information indicating the features of the object to be encoded; in other words, encoding is a process of obtaining information indicating a feature amount.
  • the encoding of the main data by the main data side encoding unit 102 is a process of acquiring the feature amount of the main data. Therefore, the main data feature amount acquisition process is a process of acquiring the main data feature amount.
  • the main data feature quantity is encoded main data.
  • the content of the main data feature quantity acquisition process is updated by learning. That is, the contents of the processing executed by main data side encoding section 102 are updated by learning.
  • the auxiliary data side encoding unit 103 further encodes the main data feature quantity.
  • encoding is a process of acquiring information indicating a feature amount
  • encoding of the main data feature amount by the auxiliary data side encoding unit 103 is therefore a process of acquiring the feature amount of the main data feature amount. Hereinafter, the information obtained by further encoding the encoded main data is referred to as an auxiliary feature amount. That is, the auxiliary feature amount is information indicating the feature amount of the main data feature amount. Since the auxiliary feature amount is information obtained by encoding the main data feature amount, the entropy of the auxiliary feature amount is smaller than the entropy of the main data feature amount.
  • the process of encoding the main data feature quantity will be referred to as the auxiliary feature quantity acquisition process.
  • the content of the auxiliary feature quantity acquisition process is updated by learning. That is, the contents of the processing executed by the auxiliary data side encoding unit 103 are updated by learning.
  • the content of the auxiliary feature acquisition process is updated through learning so that the amount of information in the main data statistics information is included in the auxiliary feature.
  • the main data statistic information is information indicating the statistic of each probability distribution followed by the value of each element of the tensor representing the main data.
  • a statistic of the probability distribution is, for example, the degree of dispersion.
  • the statistic of the probability distribution may be not only the degree of dispersion but also a set of the degree of dispersion and a representative value.
  • data generally appears according to a probability distribution, such as the probability distribution of the appearance of each Roman character that appears in English.
  • the value of each element of the tensor representing the main data also follows the probability distribution.
  • the main data side noise addition unit 104 executes noise-added main data feature quantity acquisition processing.
  • the noisy main data feature amount acquisition process is a process of applying vector noise to the main data feature amount.
  • the vector noise addition process is a process whose processing target is a K-dimensional vector having a predetermined number K of elements (K is an integer of 2 or more; this vector is hereinafter referred to as the "noise addition target vector"), and is a process of adding noise to the processing target. Therefore, the noise-added main data feature amount acquisition process is a process of acquiring information in which noise has been added to the main data feature amount (hereinafter referred to as the "noise-added main data feature amount").
  • the noise addition target vector is a K-dimensional vector included in the tensor that expresses the main data feature amount.
  • a K-dimensional vector contained in a tensor means a vector whose k-th element (k is an integer of 1 or more and K or less) is the k-th of K consecutive elements among all the elements of the tensor.
  • the number of elements K of the noise addition target vector is the number of elements mapped to one code by vector quantization by a device that performs vector quantization using the learning result of the learning unit 10 .
  • the result of learning by the learning unit 10 is hereinafter referred to as a network learning result. Therefore, when K elements are collectively mapped to one code by vector quantization using the network learning result, the number of elements of the noise addition target vector is K. Note that when K elements are collectively mapped to one code by quantization, the obtained code is an index indicating a K-dimensional vector.
  • the number of elements K of the noise addition target vector is a predetermined number. Since vector quantization of data is data encoding, vector quantization using network learning results means data encoding using network learning results.
  • Lattice is a set of lattice points in vector space.
  • LatticeVQ is vector quantization when representative vectors are arranged in a lattice in a vector space. That is, LatticeVQ is vector quantization that satisfies the condition that representative vectors are arranged in a lattice in a vector space.
  • LatticeVQ is known to have better RD performance than scalar quantization, except for certain conditions.
  • FIG. 2 is an explanatory diagram explaining LatticeVQ in the embodiment. More specifically, FIG. 2 is an explanatory diagram for explaining LatticeVQ when the number of elements of the noise addition target vector is two.
  • FIG. 2 is an example of an A2 lattice.
  • one type of lattice is the A2 lattice.
  • one type of lattice is the E8 lattice.
  • one type of lattice is the Leech lattice. The A2 lattice, the E8 lattice, and the Leech lattice are each lattices for which LatticeVQ maximizes the RD performance for uniform distributions in their respective dimensions.
  • FIG. 2 is also a diagram showing an example of a lattice space.
  • the lattice space is two-dimensional, but if the representative vectors are K-dimensional, the lattice space is also K-dimensional.
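  • As a concrete illustration of LatticeVQ, the following Python sketch quantizes a vector to the nearest point of a lattice. The embodiment's own quantization rule (Equation (11) below) is not reproduced here; the sketch instead assumes the well-known D_n lattice (integer vectors with an even coordinate sum) and the standard rounding rule for it, purely as an example of mapping a K-dimensional vector to a representative vector arranged in a lattice.

```python
import numpy as np

def quantize_Dn(y: np.ndarray) -> np.ndarray:
    """Map a K-dimensional vector to the nearest point of the D_n lattice
    (the set of integer vectors whose coordinates sum to an even number)."""
    r = np.round(y)                      # nearest point of the integer lattice Z^n
    if int(r.sum()) % 2 == 0:
        return r
    # If the coordinate sum is odd, re-round the coordinate with the largest
    # rounding error in the other direction to restore an even sum.
    k = int(np.argmax(np.abs(y - r)))
    r[k] += np.sign(y[k] - r[k]) if y[k] != r[k] else 1.0
    return r

# Example: the nearest D_2 lattice point to (0.7, 0.6) is (1, 1).
print(quantize_Dn(np.array([0.7, 0.6])))
```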
  • FIG. 3 is a first explanatory diagram illustrating an example of adding noise in the embodiment.
  • FIG. 4 is a second explanatory diagram illustrating an example of adding noise in the embodiment.
  • a plurality of random K-dimensional vectors uniformly distributed within the (K−1)-dimensional sphere circumscribing the lattice unit region of the origin lattice (hereinafter referred to as the "target lattice unit region") are generated and arranged in the lattice space.
  • points are generated until at least one internal point in FIG. 4 (a point inside the target lattice unit region) is obtained.
  • a lattice unit area is each area resulting from dividing a lattice space into a plurality of areas so that a division condition is satisfied.
  • a lattice unit area is an area that divides a vector space in which representative vectors are arranged in a lattice and includes one lattice point of the vector space.
  • a region B1 in FIG. 3 is an example of the target lattice unit region.
  • the lattice unit regions are Voronoi regions.
  • a Voronoi region is each region obtained by dividing a metric space such as a lattice space into a plurality of regions by Voronoi division.
  • the noise points in FIG. 3 are an example of samples arranged within the circumscribed circle of the target grid unit area.
  • the process of arranging the samples in the lattice space is specifically the process of acquiring coordinates in the lattice space. Since a lattice unit area in the lattice space is a region within the lattice space, the process of arranging the samples within the circumscribed circle of any one lattice unit area is the process of obtaining coordinates within that circumscribed circle.
  • the process of distinguishing between internal points and excluded points is a process of determining, for each noise point, whether the coordinates are inside the target grid unit area or outside the target grid unit area based on the coordinates of the noise point.
  • the determination process is a process in which points that match the origin lattice as a result of vector quantization of the noise points by Equation (11) are determined to be coordinates within the region.
  • the addition process specifically means a process of adding a noise-addition target vector and a position vector indicating a selected noise point.
  • the position vector indicating the selected noise point is a type of noise because it is a quantity determined by random numbers. Since the position vector indicating the selected noise point is represented by a vector, the position vector indicating the selected noise point is noise represented by a vector. Therefore, the position vector indicating the selected noise point is hereinafter referred to as a noise vector. In this way, noise is added to the noise addition target vector.
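  • The noise-vector generation described above can be read as rejection sampling: candidate points are drawn uniformly inside the sphere circumscribing the target lattice unit region, only the points that quantize back to the origin lattice point (the internal points) are kept, and one kept point is used as the noise vector added to the noise addition target vector. The following is a minimal sketch under that reading; quantize_to_lattice stands in for the quantization of Equation (11) and radius for the circumscribed-sphere radius, both of which depend on the lattice actually used.

```python
import numpy as np

def sample_lattice_noise(K, radius, quantize_to_lattice, rng=None):
    """Draw one noise vector uniformly distributed over the lattice unit region
    of the origin (the set of points that quantize to the origin lattice point)."""
    rng = rng or np.random.default_rng()
    while True:
        # Uniform sample inside the K-dimensional ball of the given radius:
        # uniform direction on the sphere, radius scaled by U**(1/K).
        direction = rng.standard_normal(K)
        direction /= np.linalg.norm(direction)
        candidate = direction * radius * rng.uniform() ** (1.0 / K)
        # Keep the candidate only if it is an internal point.
        if np.all(quantize_to_lattice(candidate) == 0):
            return candidate

# Noise addition to a noise addition target vector y_i (cf. y_i <- y_i + u_y below):
# u_y = sample_lattice_noise(K, radius, quantize_to_lattice)
```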
  • the auxiliary data-side noise addition unit 105 executes auxiliary feature quantity acquisition processing with noise.
  • the noise-attached auxiliary feature acquisition process is a process of adding noise to the auxiliary feature. That is, the noise-attached auxiliary feature acquisition process is a process of acquiring information obtained by adding scalar noise to the auxiliary feature (hereinafter referred to as "noise-attached auxiliary feature"). More specifically, the auxiliary data noise adding unit 105 adds scalar noise to each element of the tensor representing the auxiliary feature amount.
  • the scalar noise is, for example, uniform noise between -1/2 and 1/2.
  • the auxiliary data side probability estimation unit 106 executes auxiliary data side probability estimation processing.
  • the auxiliary data side probability estimation process is a process of estimating the auxiliary data side probability based on the auxiliary feature amount with noise.
  • the auxiliary data side probability is information indicating the occurrence probability of each element of the tensor indicating the auxiliary feature amount.
  • Information on the probability distribution of each element of the tensor indicating the auxiliary feature is used for estimating the occurrence probability of each element of the tensor indicating the auxiliary feature.
  • the auxiliary data side probability estimation unit 106 acquires a probability distribution given in advance by reading it from a predetermined storage device such as the storage unit 14 described later. Then, the auxiliary data side probability estimation unit 106 estimates the occurrence probability of each element of the tensor indicating the auxiliary feature amount based on the acquired probability distribution.
  • the probability distribution given in advance is, for example, a probability distribution expressed using a cumulative distribution function.
  • based on the parameterized auxiliary feature cumulative distribution function, the auxiliary data side probability estimation unit 106 estimates the occurrence probability of each element of the tensor representing the auxiliary feature quantity.
  • the parameterized auxiliary feature cumulative distribution function is a parameterized function indicating the probability distribution of each element of the tensor indicating the auxiliary feature.
  • the parameter of the parameterized auxiliary feature cumulative distribution function is specifically a parameter that changes according to the statistic representing the probability distribution of each element of the tensor representing the auxiliary feature.
  • the parameter values of the parameterized auxiliary feature cumulative distribution function are updated by learning. That is, the content of the processing executed by the auxiliary data side probability estimation unit 106 is updated by learning.
  • a parameterized cumulative distribution function is, for example, a parameterized sigmoid function or softplus function. Since the parameter values of the parameterized auxiliary feature cumulative distribution function are updated by learning as described above, the parameter values of the parameterized cumulative distribution function are updated by learning.
  • FIG. 5 is an explanatory diagram illustrating the relationship between cumulative distribution functions and occurrence probabilities in the embodiment.
  • An image G1 is an image showing an example of a probability density function that indicates the probability density of the random variable q.
  • the symbol Δ indicates the step size of quantization.
  • the image G1 shows that the value obtained by integrating the probability density function over the range ⁇ within the domain centered on the value q is the occurrence probability p(q) of the value q.
  • the step size ⁇ is, for example, the magnitude of the closed interval [ ⁇ 1/2, 1/2] (ie 1).
  • Image G2 is the cumulative distribution function cdf(q) obtained as a result of integration of the probability density function of image G1.
  • the result of integration of the probability density function is represented by a monotonically increasing function such as a sigmoid function.
  • Image G2 shows that the probability of occurrence p(q) is equal to cdf(q+ ⁇ /2) ⁇ cdf(q ⁇ /2).
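  • The relationship shown in FIG. 5 can be written directly: the occurrence probability is the difference of the cumulative distribution function evaluated half a step above and below the value. The sketch below assumes a parameterized sigmoid as the cumulative distribution function and Δ = 1, purely for illustration.

```python
import numpy as np

def cdf(x, loc=0.0, scale=1.0):
    """A parameterized cumulative distribution function (here, a sigmoid)."""
    return 1.0 / (1.0 + np.exp(-(x - loc) / scale))

def occurrence_probability(q, delta=1.0, loc=0.0, scale=1.0):
    """p(q) = cdf(q + delta/2) - cdf(q - delta/2)."""
    return cdf(q + delta / 2, loc, scale) - cdf(q - delta / 2, loc, scale)

print(occurrence_probability(0.0))   # probability mass of the bin centered at q = 0
```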
  • the auxiliary data side decoding unit 107 executes auxiliary feature amount decoding processing.
  • the auxiliary feature amount decoding process is a process for processing information obtained based on the auxiliary feature amount, and is a process for decoding the processing target.
  • a processing target of the auxiliary feature amount decoding process executed by the auxiliary data side decoding unit 107 is the auxiliary feature amount with noise.
  • the information obtained by decoding the auxiliary feature with noise will be referred to as auxiliary data. Therefore, the auxiliary data decoding unit 107 acquires auxiliary data by decoding the auxiliary feature with noise.
  • the content of the auxiliary feature decoding process is updated through learning so that the RD cost is reduced. That is, the contents of the processing executed by the auxiliary data side decoding unit 107 are updated by learning.
  • the auxiliary entropy acquisition unit 108 acquires auxiliary entropy based on the auxiliary data side probability, which is the estimation result of the auxiliary data side probability estimation unit 106 .
  • Auxiliary entropy is the entropy of auxiliary features.
  • the main data side probability estimation unit 109 executes main data side probability estimation processing.
  • the main data side probability estimation process is a process of estimating the main data side probability based on the noise-added main data feature amount and the auxiliary data.
  • the main data side probability is information indicating the occurrence probability of each element of the tensor indicating the main data feature amount.
  • Information on the probability distribution of each element of the tensor representing the main data feature is used to estimate the occurrence probability of each element of the tensor representing the main data feature.
  • based on the parameterized main data feature amount cumulative distribution function, the main data side probability estimation unit 109 estimates the occurrence probability of each element of the tensor representing the main data feature amount.
  • the parameterized main data feature value cumulative distribution function is a cumulative distribution function that indicates the probability distribution of each element of the tensor that indicates the main data feature value and is a parameterized cumulative distribution function.
  • a probability distribution is, for example, a Gaussian distribution.
  • the parameter of the parameterized main data feature quantity cumulative distribution function is specifically a statistic representing the probability distribution of each element of the tensor representing the main data feature quantity. Therefore, in learning, the value indicated by the auxiliary data is used as the value of the parameter of the parameterized main data feature quantity cumulative distribution function.
  • the auxiliary data obtained in the learned learning network 100 indicates the representative value and scatter of the Gaussian distribution.
  • an example of the main data side probability estimation processing will now be described; more specifically, an example of the main data side probability estimation processing when the LatticeVQ described above is used. When vector quantization is performed, each vector of the main data is represented by one of the representative vectors. Therefore, the explanation of the main data side probability estimation processing when LatticeVQ is used is, more specifically, an explanation of the processing of estimating the occurrence probability of a representative vector when LatticeVQ is used.
  • the process of estimating the occurrence probability of representative vectors will be referred to as representative vector occurrence probability estimation process.
  • the representative vector occurrence probability estimation process is an example of the main data side probability estimation process.
  • the main data side probability estimation process uses the parameterized main data feature amount cumulative distribution function.
  • the parameterized main data feature quantity cumulative distribution function is specifically a parameterized cumulative distribution function.
  • the cumulative distribution function is the result of integrating the probability density function. Therefore, representative vector occurrence probability estimation processing is processing using a cumulative distribution function. The use of the cumulative distribution function in the representative vector occurrence probability estimation process means that the result of integrating the parameterized probability density function in the grid unit area is used.
  • Voronoi region is one of the lattice unit regions as described above.
  • Voronoi tessellation is a well-known region segmentation method, and therefore is often used to obtain lattice unit regions.
  • the shape of the Voronoi region is hexagonal as described above.
  • a parametrized cumulative distribution function obtained using the results of hyperrectangular partitioning is used instead of Voronoi partitioning.
  • the hyperrectangular parallelepiped division is a process of dividing a lattice space into lattice unit regions each having a hyperrectangular parallelepiped shape.
  • a two-dimensional hyperrectangular parallelepiped is a rectangle, and a three-dimensional hyperrectangular parallelepiped is a rectangular parallelepiped.
  • FIG. 6 is an explanatory diagram explaining the hyperrectangular parallelepiped division in the embodiment. More specifically, FIG. 6 is a diagram showing an example of the result of hyperrectangular partitioning for a two-dimensional lattice space together with an example of the result of Voronoi partitioning. Both the “true region” and the “approximate region” in FIG. 6 are examples of grid unit regions.
  • the “true domain” is the domain resulting from the Voronoi division. That is, the “true domain” is the Voronoi domain.
  • a “true area” is a hexagonal lattice unit area.
  • “Approximate region” is the grid unit region resulting from the hypercube division.
  • the shape of the "approximation area” is a rectangular parallelepiped.
  • one example of the result of the hyperrectangular parallelepiped division executed by the main data side probability estimation unit 109 is the result of the division by the "approximate region”.
  • S1 in FIG. 6 is the length of one side of the rectangle that is the shape of the "approximate region", and indicates the side length in the first dimension of the two-dimensional lattice space; S2 indicates the side length in the second dimension.
  • the length of the side in the first dimension means the length of the projection of the rectangle in the lattice space onto one of two orthogonal one-dimensional subspaces, and the length of the side in the second dimension means the length of the projection onto the other subspace.
  • more generally, the length of the n-th side of a hyperrectangular parallelepiped in an N-dimensional lattice space means the length of the projection of the hyperrectangular parallelepiped onto the n-th one-dimensional subspace; the projection of a hyperrectangular parallelepiped onto a one-dimensional subspace is a line segment.
  • integration with a two- or more-dimensional area as the integration domain can be obtained by iterative integration.
  • iterative integration performs N one-dimensional integrations on the function, one dimension at a time.
  • when the shape of the lattice unit region is a hyperrectangular parallelepiped, the iterative integration can be performed with the integration region of each of the N integrations being one side of the hyperrectangular parallelepiped.
  • in that case, each integration is not affected by the results of the other integrations.
  • on the other hand, when the shape of the lattice unit region is not a hyperrectangular parallelepiped, the integration range of each integral in the iterative integration is affected by the other integrals. Therefore, when the shape of the lattice unit region is a hyperrectangular parallelepiped, integration is easier than when the shape of the lattice unit region is a hyperpolyhedron that is not a hyperrectangular parallelepiped.
  • in the example of FIG. 6, the "true region" has a hexagonal shape and the "approximate region" has a rectangular shape, so integration over the "approximate region" is easier than integration over the "true region".
  • in other words, in each integration of the iterative integration, the integration is performed on a one-dimensional probability density function that is not affected by the other dimensions.
  • the probability density function to be iteratively integrated is a function that indicates a predetermined type of distribution and is a parameterized function.
  • a predetermined type of distribution is a Gaussian distribution.
  • the values of the parameters of the probability density function are values according to the auxiliary data.
  • a parameterized cumulative distribution function is obtained by performing iterative integration on the probability density function, using the hyperrectangular parallelepiped obtained by the hyperrectangular parallelepiped division as the integration region.
  • the values of the parameters of the probability density function are the values of the auxiliary data, so the parameterized cumulative distribution function thus obtained is a function according to the auxiliary data.
  • auxiliary data values are substituted for the parameters of the cumulative distribution function obtained in advance in this manner.
  • the cumulative distribution function thus obtained is used to estimate the occurrence probability of the representative vector.
  • the occurrence probability of representative vectors is estimated by executing the processes represented by the following equations (1) and (2).
  • the cumulative distribution function cdf in Equation (2) represents each cumulative distribution function obtained by each integration of iterative integration.
  • the result of integrating the m-dimensional (m is a natural number) cumulative distribution function for one dimension is the (m ⁇ 1)-dimensional cumulative distribution function.
  • Equation (1) is an example of the occurrence probability obtained using the parameterized main data feature quantity cumulative distribution function.
  • Equation (1) is a function parameterized via the cumulative distribution function cdf of Equation (2).
  • Equation (2) is a cumulative distribution function obtained as a result of integrating the probability density function in the one-dimensional direction of the grid unit area.
  • Equation (1) is the product of the occurrence probabilities of each dimension obtained by Equation (2). Therefore, equation (1) is the occurrence probability expressed using the result of integrating the probability density function over the entire lattice unit area.
  • Equation (1) indicates the occurrence probability of a representative vector. Since each lattice point represents a representative vector, the occurrence probability p_i at each lattice point means the occurrence probability p_i of each representative vector.
  • Δ_j means the side length of the lattice unit region in the dimension identified by j. Since the shape and size of the lattice unit region are predetermined, Δ_j is a predetermined length.
  • y given a hat symbol represented by the following equation (3) means the feature amount of the main data with noise.
  • the symbol A with a hat is hereinafter referred to as A ⁇ .
  • therefore, the hat-marked symbol y in equation (3) below is written y^.
  • Equation (1) indicates that the occurrence probability p_i is expressed as the product of the probabilities p_i[j] obtained for each dimension. If the shape of the lattice unit region were not a hyperrectangular parallelepiped, the occurrence probability p_i could not be expressed as a simple product of the probabilities p_i[j] obtained for each dimension. Therefore, because the shape of the lattice unit region is a hyperrectangular parallelepiped, the main data side probability estimation unit 109 can easily acquire the occurrence probability p_i, as shown by Equations (1) and (2).
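  • Equations (1) and (2) themselves are not reproduced in this text, but the description above implies that the occurrence probability of a representative vector is the product, over the dimensions of the hyperrectangular lattice unit region, of per-dimension cumulative-distribution differences. The sketch below is written under that assumption, using a Gaussian cumulative distribution function whose mean and scale are the statistics indicated by the auxiliary data.

```python
import numpy as np
from scipy.stats import norm

def representative_vector_probability(y_hat, delta, mean, scale):
    """Approximate occurrence probability of the representative vector whose
    lattice unit region contains y_hat, as the product over dimensions j of
    cdf(y_hat[j] + delta[j] / 2) - cdf(y_hat[j] - delta[j] / 2)."""
    upper = norm.cdf(y_hat + delta / 2.0, loc=mean, scale=scale)
    lower = norm.cdf(y_hat - delta / 2.0, loc=mean, scale=scale)
    per_dim = upper - lower         # assumed form of p_i[j] in Equation (2)
    return float(np.prod(per_dim))  # assumed form of p_i in Equation (1)

# delta: side lengths of the hyperrectangular lattice unit region (predetermined);
# mean, scale: statistics indicated by the auxiliary data.
p_i = representative_vector_probability(
    y_hat=np.array([0.3, -0.1]), delta=np.array([1.0, 1.0]),
    mean=np.zeros(2), scale=np.ones(2))
```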
  • equations (1) and (2) are obtained by assuming that the covariance between dimensions is zero.
  • the shape of the lattice unit area is a hyperrectangular parallelepiped
  • the covariance between dimensions can be reduced to 0 by rotating the coordinate axes of the lattice space so as to be parallel to each side of the hyperrectangular parallelepiped. Therefore, equations (1) and (2) are equations that are established by rotation of the coordinate axes even if the covariance between dimensions is not zero. It is a well-known fact in linear algebra that the rotation of the coordinate axes is a unitary transformation, so it is not a transformation that changes the contents of equations (1) and (2). Note that the process of setting the covariance to 0 is the process of diagonalizing the matrix representing the variance between dimensions.
  • the main data-side probability estimation process uses the result of integrating the parameterized probability density function with the grid unit region having a hyperrectangular parallelepiped shape as the integration region, and uses each of the tensors representing the main data feature quantity. Estimate the occurrence probability of an element.
  • next, hyperrectangular parallelepiped determination processing, that is, processing for determining the shape and size of a lattice unit region that at least satisfies the condition of being a hyperrectangular parallelepiped, will be described.
  • in the hyperrectangular parallelepiped determination processing, the lattices adjacent to the origin lattice are first calculated.
  • the origin lattice is the lattice point located at the origin of the lattice space. The adjacent lattices are the lattice points next to the origin.
  • next, in the hyperrectangular parallelepiped determination processing, hyperrectangular parallelepipeds satisfying the hyperrectangular parallelepiped determination conditions are calculated.
  • the hypercube determination conditions include the condition that the lattice unit area of the origin lattice does not overlap with the lattice unit area of each adjacent lattice and the condition that the volume of the hypercube is equal to the volume of the Voronoi region.
  • the superscripts in the hyperrectangular parallelepiped notation [s1^a, s2^b, …] indicate that each side from the 1st dimension to the a-th dimension has length s1, and each side from the (a+1)-th dimension to the (a+b)-th dimension has length s2.
  • that is, in this notation the base character to which a superscript is attached is a side length, and the superscript denotes that, for a run of consecutive dimensions whose number equals the value of the superscript, the sides of the hyperrectangular parallelepiped all have that same length s.
  • s^a therefore indicates that, for a run of a consecutive dimensions, the side length of the hyperrectangular parallelepiped is s. Note that information on the order of the dimensions is required to determine whether dimensions are consecutive, but the order of the dimensions is a predetermined order.
  • [±1^2, 0^6] indicates that each of the two ±1 entries may be +1 or −1. Therefore, [±1^2, 0^6] specifically denotes [1, 1, 0^6], [−1, 1, 0^6], [1, −1, 0^6], and [−1, −1, 0^6].
  • the hyperrectangular parallelepiped of each adjacent lattice does not overlap with the hyperrectangular parallelepiped of the origin lattice.
  • One condition is that 7 or more of the sides (elements) of the hyperrectangular parallelepiped are 1 or less.
  • Another condition is that one or more of the sides (elements) of the hyperrectangular parallelepiped is 1/2 or less.
  • for example, [1/2, 1^6, 2] is obtained in the case of the eight-dimensional E8 lattice space.
  • the range of the lattice unit area is the range enclosed by the hyperrectangular parallelepiped represented by Equation (5) below.
  • the range of the lattice unit area is the range enclosed by the hyperrectangular parallelepiped represented by the following equation (6).
  • the range of the lattice unit area is the range enclosed by the hyperrectangular parallelepiped represented by the following equation (7).
  • the range of the hyperrectangular parallelepiped (the shape and size of the lattice unit area) is determined.
  • the main entropy acquisition unit 110 acquires the entropy of the main data feature amount based on the estimation result of the main data side probability estimation unit 109 .
  • the entropy of the main data feature quantity will be referred to as the main entropy.
  • the main data side decoding unit 111 executes main data feature amount decoding processing.
  • the main data feature amount decoding process is a process for processing information obtained based on the main data feature amount, and is a process for decoding the processing target.
  • the processing target of the main data feature amount decoding process executed by the main data side decoding unit 111 is the main data feature amount with noise.
  • the contents of the decoding process of the main data side decoding unit 111 are updated by learning.
  • the reconstruction error calculation unit 112 calculates the difference between the decoding result of the main data side decoding unit 111 and the main data acquired by the main data acquisition unit 101 .
  • the difference between the decoding result of the main data side decoding unit 111 and the main data acquired by the main data acquisition unit 101 is called a reconstruction error.
  • the difference between the decoding result of the main data side decoding unit 111 and the main data acquired by the main data acquisition unit 101 may be represented by, for example, the sum of mean square errors or binary cross entropy.
  • the optimization unit 113 updates the learning network 100 based on the auxiliary entropy, the primary entropy and the reconstruction error.
  • auxiliary entropy, primary entropy, and reconstruction error are all examples of outputs of learning network 100 .
  • the optimization unit 113 updates the learning network 100 so as to reduce the reconstruction error, the main entropy, and the auxiliary entropy.
  • the symbol L represents an objective function.
  • Symbol D represents the reconstruction error.
  • the symbol lambda is a predetermined constant.
  • the symbol R_y represents the main entropy.
  • the symbol R_z represents the auxiliary entropy.
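  • the equation for the objective function is not reproduced in this text; given the symbols above and the RD cost D + λR described earlier, it presumably has the rate–distortion form L = D + λ(R_y + R_z), and a smaller L corresponds to a smaller RD cost.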
  • a small entropy means a short code length, so the optimization unit 113 updates the learning network 100 so as to reduce the entropies. Also, since a smaller reconstruction error means higher accuracy of self-encoding, the optimization unit 113 updates the learning network 100 so as to reduce the reconstruction error.
  • the learning network 100 is updated by solving the minimization problem of the objective function L using the gradient method. That is, the learning network 100 is updated by updating the values of the parameters of the learning network 100 by, for example, the error backpropagation method.
  • updating the learning network 100 means updating the contents of the processing executed by each of the main data side encoding unit 102, the auxiliary data side encoding unit 103, the auxiliary data side probability estimation unit 106, the auxiliary data side decoding unit 107, and the main data side decoding unit 111.
  • the learning network 100 includes a main data side noise adding section 104 .
  • the processing of the main data side noise addition unit 104 is not itself a quantization process.
  • nevertheless, the performance of self-encoding using vector quantization is improved by performing learning precisely because the processing of the main data side noise addition unit 104 is included. The reason is explained below.
  • the main data side noise addition unit 104 adds noise to the noise addition target vector rather than applying hard quantization, whose gradient is 0 almost everywhere. Therefore, the gradient does not become 0 when learning is performed with the main data side noise addition unit 104 included.
  • peripheral processing is the processing that produces the information used in vector quantization. Specifically, the peripheral processing is the processing executed by each of the main data side encoding unit 102, the auxiliary data side encoding unit 103, the auxiliary data side probability estimation unit 106, the auxiliary data side decoding unit 107, and the main data side decoding unit 111.
  • FIG. 7 is a diagram showing an example of the flow of processing executed by the learning unit 10 in the embodiment.
  • the main data x is an N-dimensional vector having N elements from x1 to xN .
  • Each element from x 1 to x N is a tensor. Therefore, each element from x1 to xN may be a scalar or a vector.
  • the function f enc (x) is a function (hereinafter referred to as "main data encoding function") that expresses the encoding process of the main data x.
  • the main data feature quantity y is a tensor composed of k K-dimensional vectors.
  • the auxiliary data side encoding unit 103 executes auxiliary feature amount acquisition processing (step S103).
  • the function g enc (y) is a function that expresses the processing of encoding the main data feature quantity y (hereinafter referred to as "main data feature quantity encoding function").
  • the auxiliary feature z is a tensor such as a vector.
  • the main-data-side noise addition unit 104 executes noise-added main data feature amount acquisition processing (step S104).
  • that is, y_i ← y_i + u_y, where y_i represents the i-th vector element of the main data feature quantity y and u_y represents the noise.
  • the auxiliary data side noise addition unit 105 executes auxiliary feature quantity acquisition processing with noise (step S105).
  • w is an integer of 1 or more.
  • that is, z_i ← z_i + u_z, where z_i represents the i-th element of the auxiliary feature z and u_z represents the noise.
  • the auxiliary data side probability estimation unit 106 executes auxiliary data side probability estimation processing (step S106).
  • the auxiliary data-side probability estimation unit 106 estimates auxiliary data-side probabilities by executing auxiliary data-side probability estimation processing.
  • the auxiliary data side probability is specifically represented by the following equation (8).
  • Equation (8) represents the auxiliary data side probability.
  • the symbol h is a parameterized auxiliary feature cumulative distribution function.
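  • Equation (8) is not reproduced in this text; given the cumulative-distribution relationship of FIG. 5 and the scalar noise of width 1, it presumably has the form p(z^_i) = h(z^_i + 1/2) − h(z^_i − 1/2), where z^ denotes the noise-added auxiliary feature.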
  • auxiliary data side decoding unit 107 executes auxiliary feature quantity decoding processing on the auxiliary feature quantity with noise (step S107).
  • the function g dec ( ⁇ ) is a function (hereinafter referred to as “auxiliary feature quantity decoding function”) that expresses the process of decoding the auxiliary feature quantity with noise ⁇ .
  • the auxiliary data ⁇ is a tensor such as a vector.
  • the auxiliary entropy acquisition unit 108 acquires auxiliary entropy based on the auxiliary data side probability (step S108). Specifically, the auxiliary entropy acquisition unit 108 acquires the auxiliary entropy by executing the process represented by the following formula (9).
  • the symbol on the left side of Equation (9) represents the auxiliary entropy.
  • the main data side probability estimation unit 109 executes main data side probability estimation processing (step S109).
  • the main-data-side probability estimation unit 109 estimates the main-data-side probability based on the noise-added main-data feature amount and the auxiliary data by executing the main-data-side probability estimation process.
  • the main data side probability is specifically represented by the above-described formula (1).
  • the primary entropy acquisition unit 110 acquires primary entropy based on the primary data side probability (step S110). Specifically, the primary entropy acquisition unit 110 acquires the primary entropy by executing the process represented by the following formula (10).
  • the symbol on the left side of Equation (10) represents the main entropy.
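  • Equations (9) and (10) are likewise not reproduced in this text; based on the description above, they presumably compute the entropies from the estimated occurrence probabilities as negative log-probabilities summed over the elements, e.g. R_z = Σ_i −log₂ p(z^_i) for the auxiliary entropy and R_y = Σ_i −log₂ p_i for the main entropy (possibly averaged over samples).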
  • the main data side decoding unit 111 executes main data feature amount decoding processing on the main data feature amount with noise (step S111).
  • the main data feature amount with noise is decoded.
  • the function f dec ( ⁇ ) is a function (hereinafter referred to as “main data feature quantity decoding function”) that expresses the process of decoding the main data feature quantity ⁇ with noise.
  • the decoded main data x ⁇ is a tensor such as a vector.
  • the reconstruction error calculation unit 112 acquires the difference between the decoded main data and the main data acquired by the main data acquisition unit 101 (step S112).
  • the difference between the decoded main data and the main data acquired by the main data acquisition unit 101 is the reconstruction error.
  • the optimization unit 113 updates the learning network 100 based on the auxiliary data entropy, main data entropy, and reconstruction error (step S113).
  • the optimization unit 113 determines whether or not a predetermined termination condition (hereinafter referred to as "learning termination condition") regarding learning is satisfied (step S114).
  • the learning end condition is, for example, a condition that the learning network 100 has been updated a predetermined number of times.
  • if the learning end condition is satisfied (step S114: YES), the process ends. The peripheral processing at the time when the learning end condition is satisfied is used for vector quantization as the learned peripheral processing.
  • if the learning end condition is not satisfied (step S114: NO), the process returns to step S101.
  • step S101 to step S114 may be executed in any order as long as it does not violate the law of causality.
  • the auxiliary data side probability estimation unit 106 may acquire a previously given probability distribution by reading from a predetermined storage device or the like. In such a case, the contents of the auxiliary data side probability estimation process are not updated by learning. Therefore, the learned auxiliary data side probability estimation process is the same as the auxiliary data side probability estimation process before learning.
  • updating the contents of the auxiliary data side probability estimation process is, more specifically, updating the parameterized auxiliary feature quantity cumulative distribution function h. Therefore, when the auxiliary data side probability estimating unit 106 acquires a previously given probability distribution by reading from a predetermined storage device or the like, the trained parameterized auxiliary feature quantity cumulative distribution function h is the parameterized auxiliary feature before learning. It is the same as the quantity cumulative distribution function h.
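  • The flow of FIG. 7 (steps S101 to S114) can be summarized as an ordinary end-to-end training loop. The following minimal PyTorch-style sketch only illustrates the structure described above: the linear layers are placeholders for the units 102, 103, 107, and 111, the tensor sizes are arbitrary, and scalar uniform noise is used in place of the vector noise of the main data side noise addition unit 104.

```python
import torch
from torch import nn

f_enc = nn.Linear(16, 8)   # stands in for the main data side encoding unit 102
g_enc = nn.Linear(8, 4)    # stands in for the auxiliary data side encoding unit 103
g_dec = nn.Linear(4, 16)   # stands in for the auxiliary data side decoding unit 107
f_dec = nn.Linear(8, 16)   # stands in for the main data side decoding unit 111
params = (list(f_enc.parameters()) + list(g_enc.parameters())
          + list(g_dec.parameters()) + list(f_dec.parameters()))
opt = torch.optim.Adam(params, lr=1e-4)
lam = 0.01                                       # Lagrangian constant

def bits(p):                                     # code length estimate: sum of -log2 p
    return -torch.log2(p.clamp_min(1e-9)).sum()

for step in range(100):                          # until the learning end condition holds
    x = torch.randn(16)                          # S101: main data (dummy sample)
    y = f_enc(x)                                 # S102: main data feature amount
    z = g_enc(y)                                 # S103: auxiliary feature amount
    y_hat = y + torch.rand_like(y) - 0.5         # S104: noise (scalar stand-in)
    z_hat = z + torch.rand_like(z) - 0.5         # S105: scalar noise
    p_z = torch.sigmoid(z_hat + 0.5) - torch.sigmoid(z_hat - 0.5)  # S106
    stats = g_dec(z_hat)                         # S107: auxiliary data (Gaussian stats)
    mean = stats[:8]
    scale = nn.functional.softplus(stats[8:]) + 1e-6
    gauss = torch.distributions.Normal(mean, scale)
    p_y = gauss.cdf(y_hat + 0.5) - gauss.cdf(y_hat - 0.5)          # S109
    R_z, R_y = bits(p_z), bits(p_y)              # S108, S110: auxiliary / main entropy
    D = ((f_dec(y_hat) - x) ** 2).mean()         # S111, S112: reconstruction error
    L = D + lam * (R_y + R_z)                    # S113: objective (assumed form)
    opt.zero_grad()
    L.backward()
    opt.step()
```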
  • the learning unit 10 updates the encoding and decoding processes in the self-encoding process using vector quantization through learning.
  • the self-encoding process using vector quantization uses a main data feature quantity, which is a feature quantity to be self-encoded, and an auxiliary feature quantity, which is a feature quantity of the main data feature quantity. Furthermore, in the self-encoding process using vector quantization, entropy encoding is performed on the result of vector quantization of the main data feature quantity and entropy encoding is performed on the result of scalar quantization of the auxiliary feature quantity.
  • the encoding process in the self-encoding process using such vector quantization includes the main data feature amount acquisition process executed by the main data side encoding unit 102, the auxiliary data side encoding unit 103 executes auxiliary feature amount acquisition processing. Further, the decoding process in the self-encoding process using such vector quantization specifically includes the auxiliary feature amount decoding process executed by the auxiliary data side decoding unit 107 and the main data side decoding unit 111 and main data feature amount decoding processing to be executed.
  • "learned" means the state at the time when the learning end condition is satisfied. In the following, as an example of a device that performs self-encoding using vector quantization with the learned peripheral processing, the self-encoding device 2, which performs encoding and decoding, will be described.
  • the self-encoding device 2 is a type of self-encoder (autoencoder).
  • FIG. 8 is a first explanatory diagram for explaining the outline of the self-encoding device 2 in the embodiment.
  • FIG. 9 is a second explanatory diagram for explaining the outline of the self-encoding device 2 in the embodiment. More specifically, FIG. 8 is an explanatory diagram for explaining the encoding process executed by the self-encoding device 2, and FIG. 9 is an explanatory diagram for explaining the decoding process executed by the self-encoding device 2. is.
  • the self-encoding device 2 is a kind of self-encoder, so it has an encoder and a decoder.
  • the autoencoding device 2 comprises an encoder 200 and a decoder 212 .
  • the encoder 200 includes a self-encoding target acquisition unit 201, a learned main data side encoding unit 202, a learned auxiliary data side encoding unit 203, a vector quantization unit 204, a scalar quantization unit 205, a learned auxiliary data side probability estimation unit 206, a learned auxiliary data side decoding unit 207, an auxiliary entropy coding unit 208, a main data side probability estimation unit 209, a main entropy coding unit 210, and a data multiplexing unit 211.
  • the decoder 212 includes an encoded data acquisition unit 213, a data separation unit 214, an auxiliary entropy decoding unit 215, a trained auxiliary data side decoding unit 216, a main entropy decoding unit 217, and a trained main data side decoding unit 218.
  • the self-encoding target acquisition unit 201 acquires data to be self-encoded as main data.
  • An object of self-encoding is hereinafter referred to as a self-encoding object.
  • the learned main data side encoding unit 202 executes a learned main data feature amount acquisition process for the self-encoding target. By executing the learned main data feature amount acquisition process, the learned main data side encoding unit 202 acquires the main data feature amount to be self-encoded.
  • the learned auxiliary data side encoding unit 203 executes learned auxiliary feature amount acquisition processing on the main data feature amount to be self-encoded. By executing the learned auxiliary feature amount acquisition process, the learned auxiliary data side encoding unit 203 acquires the auxiliary feature amount to be self-encoded.
  • the vector quantization unit 204 executes vector quantization processing on the main data feature quantity to be self-encoded. By executing the vector quantization process, the vector quantization unit 204 acquires the vector-quantized main data feature amount (hereinafter referred to as "vector quantized feature amount”) to be self-encoded.
  • vector quantized feature amount the vector-quantized main data feature amount
  • the scalar quantization unit 205 performs scalar quantization processing on the auxiliary feature quantity to be auto-encoded. By executing the scalar quantization process, the scalar quantization unit 205 acquires a scalar-quantized auxiliary feature amount to be self-encoded (hereinafter referred to as “scalar quantized feature amount”).
  • the learned auxiliary data side probability estimation unit 206 executes the learned auxiliary data side probability estimation process.
  • the learned auxiliary data-side probability estimating unit 206 estimates the auxiliary data-side probability of the self-encoding target based on the scalar quantized feature value by executing the learned auxiliary data-side probability estimation process.
  • the learned auxiliary data side decoding unit 207 executes learned auxiliary feature quantity decoding processing on the scalar quantized feature quantity. That is, the learned auxiliary data side decoding unit 207 decodes the scalar quantized feature quantity.
  • the information obtained by decoding the scalar quantized feature quantity will be referred to as quantized auxiliary data. Therefore, the learned auxiliary data side decoding unit 207 obtains quantized auxiliary data by decoding the scalar quantized feature amount.
  • the auxiliary entropy encoding unit 208 entropy-encodes the scalar quantized feature amount based on the scalar quantized feature amount and the probability of the auxiliary data to be self-encoded.
  • Entropy coding is, for example, arithmetic coding.
  • the main-data-side probability estimation unit 209 estimates the main-data-side probability of the self-encoding target based on the vector quantized feature amount and the quantized auxiliary data.
  • the primary entropy encoding unit 210 performs entropy encoding of the vector quantized feature quantity based on the vector quantized feature quantity and the probability of the main data to be self-encoded.
  • Entropy coding is, for example, arithmetic coding.
  • the data multiplexing unit 211 outputs the entropy-encoded vector quantized feature amount and the entropy-encoded scalar quantized feature amount to the decoder 212 . In this manner, encoder 200 encodes the self-encoding object.
  • the encoded data acquisition unit 213 acquires the entropy-encoded vector quantized feature amount and the entropy-encoded scalar quantized feature amount.
  • the data separation unit 214 acquires the entropy-encoded vector quantized feature amount and the entropy-encoded scalar quantized feature amount acquired by the encoded data acquisition unit 213 .
  • the data separation unit 214 outputs the entropy-encoded scalar quantized feature amount to the auxiliary entropy decoding unit 215 and outputs the entropy-encoded vector quantized feature amount to the primary entropy decoding unit 217 .
  • the trained auxiliary entropy decoding unit 215 performs entropy decoding on the entropy-encoded scalar quantized feature using the trained parameterized auxiliary feature cumulative distribution function.
  • the learned auxiliary data side decoding unit 216 executes the learned auxiliary feature quantity decoding process on the result of entropy decoding by the trained auxiliary entropy decoding unit 215 .
  • the primary entropy decoding unit 217 performs entropy decoding of the entropy-encoded vector quantized feature amount based on the entropy-encoded vector quantized feature amount and the result of the learned auxiliary feature amount decoding process by the learned auxiliary data side decoding unit 216. More specifically, the primary entropy decoding unit 217 performs entropy decoding on the entropy-encoded vector quantized feature using the decoded cumulative distribution function.
  • the decoded cumulative distribution function is a parameterized main data feature quantity cumulative distribution function whose parameter values are values indicated by the result of learned auxiliary feature quantity decoding processing by the learned auxiliary data side decoding unit 216 .
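The following is a minimal sketch of the kind of hyperrectangular integration used to turn such a parameterized main data feature cumulative distribution function into per-lattice-point probabilities. It assumes an axis-aligned cell with side lengths delta around each lattice point and, purely for illustration, a factorized Gaussian density whose per-dimension mean and scale would come from the decoded auxiliary data; the actual parameterized density and cell shape of the embodiment may differ.

```python
import math
from typing import Sequence

def gaussian_cdf(x: float, mean: float, scale: float) -> float:
    # Cumulative distribution function of a Gaussian, via the error function.
    return 0.5 * (1.0 + math.erf((x - mean) / (scale * math.sqrt(2.0))))

def hyperrectangle_probability(lattice_point: Sequence[float],
                               delta: Sequence[float],
                               mean: Sequence[float],
                               scale: Sequence[float]) -> float:
    """Integrate the factorized density over the axis-aligned cell
    [c_k - delta_k / 2, c_k + delta_k / 2] around the lattice point."""
    p = 1.0
    for c, d, m, s in zip(lattice_point, delta, mean, scale):
        p *= gaussian_cdf(c + d / 2.0, m, s) - gaussian_cdf(c - d / 2.0, m, s)
    return p

# Example: probability mass assigned to the 2-D lattice point (0, 1).
print(hyperrectangle_probability([0.0, 1.0], [1.0, 1.0], [0.1, 0.8], [1.0, 0.5]))
```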
  • the learned main data side decoding unit 218 executes the learned main data feature value decoding process on the result of decoding by the main entropy decoding unit 217 .
  • the decoder 212 decodes the self-encoded object encoded by the encoder 200. Also, in this manner, the self-encoding device 2 self-encodes the self-encoding target.
  • FIG. 10 is a flow chart showing an example of the flow of processing executed by the encoder 200 in the embodiment.
  • the autoencoding object X is an N-dimensional vector with N elements X_1 to X_N. Each of the elements X_1 to X_N is a tensor; therefore, each element X_1 to X_N may be a scalar or a vector.
  • the learned main data side encoding unit 202 executes a learned main data feature amount acquisition process (step S202). That is, the learned main data side encoding unit 202 encodes the self-encoding target X.
  • the function F_enc(X) is the learned main data encoding function.
  • the main data feature quantity Y is a tensor such as a vector.
  • the learned auxiliary data side encoding unit 203 executes a learned auxiliary feature amount acquisition process (step S203).
  • the function G_enc(Y) is a learned main data feature quantity encoding function.
  • the auxiliary feature Z is a tensor such as a vector.
  • the vector quantization unit 204 performs vector quantization on the main data feature quantity Y to be self-encoded (step S204).
  • the vector quantized feature Ŷ = [Ŷ_1, Ŷ_2, …, Ŷ_k] = [Q(Y_1), Q(Y_2), …, Q(Y_k)] is obtained.
  • Y_i represents the i-th element of the main data feature quantity Y to be auto-encoded.
  • Ŷ_i represents the i-th element of the vector quantized feature Ŷ.
  • Q is a function represented by Equation (11) below.
  • The set symbol appearing in Equation (11) means the set of all lattice points.
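The body of Equation (11) is not shown here as text. As an illustration of a nearest-lattice-point quantization function Q, the sketch below uses the D_K lattice (integer vectors whose coordinates sum to an even number), which has a simple closed-form nearest-point rule; the embodiment may instead use other lattices such as the A2 or E8 lattice.

```python
from typing import List

def quantize_dk(y: List[float]) -> List[int]:
    """Nearest point of the D_K lattice (integer vectors with an even
    coordinate sum) to the vector y."""
    rounded = [round(v) for v in y]
    if sum(rounded) % 2 == 0:
        return rounded
    # Parity is odd: re-round the coordinate with the largest rounding error
    # toward its second-nearest integer, restoring an even coordinate sum.
    errors = [v - r for v, r in zip(y, rounded)]
    k = max(range(len(y)), key=lambda i: abs(errors[i]))
    rounded[k] += 1 if errors[k] > 0 else -1
    return rounded

print(quantize_dk([0.6, 0.2]))  # -> [0, 0]
```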
  • the scalar quantization unit 205 performs scalar quantization on the auxiliary feature Z to be self-encoded (step S205).
  • the learned auxiliary data side probability estimation unit 206 executes the learned auxiliary data side probability estimation process (step S206).
  • the learned auxiliary data-side probability estimation unit 206 estimates the auxiliary data-side probability of the self-encoding target based on the scalar quantized feature value Z ⁇ by executing the learned auxiliary data-side probability estimation process.
  • the auxiliary data side probability to be self-encoded is represented by the following equation (12).
  • Equation (12) represents the auxiliary data side probability of the self-encoding target.
  • the symbol H is a trained parameterized auxiliary feature cumulative distribution function.
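Equation (12) is likewise not reproduced here. A plausible minimal form, assuming a unit quantization step, evaluates the trained cumulative distribution function H at the two bin edges around each scalar-quantized element; in the sketch below, H is modeled purely for illustration as a logistic (sigmoid) CDF with learned location and scale parameters.

```python
import math

def logistic_cdf(x: float, loc: float, scale: float) -> float:
    return 1.0 / (1.0 + math.exp(-(x - loc) / scale))

def aux_element_probability(z_hat: float, loc: float, scale: float) -> float:
    """Probability mass of the scalar-quantized value z_hat under H,
    assuming a unit quantization step."""
    return logistic_cdf(z_hat + 0.5, loc, scale) - logistic_cdf(z_hat - 0.5, loc, scale)

print(aux_element_probability(0.0, loc=0.2, scale=1.0))
```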
  • the function G_dec(Ẑ) is a learned auxiliary feature decoding function.
  • the quantization auxiliary data ⁇ is a tensor such as a vector.
  • the auxiliary entropy encoding unit 208 entropy-encodes the scalar quantized feature Z ⁇ based on the scalar quantized feature Z ⁇ and the probability of the auxiliary data to be self-encoded (step S208).
  • the main-data-side probability estimating unit 209 estimates the main-data-side probability of the self-encoding target based on the vector quantized feature Y ⁇ and the quantized auxiliary data ⁇ (step S209).
  • the main entropy encoding unit 210 entropy-encodes the vector quantized feature Y ⁇ based on the vector quantized feature Y ⁇ and the probability of the main data to be self-encoded (step S210).
  • the data multiplexing unit 211 outputs the entropy-encoded vector quantized feature amount and the entropy-encoded scalar quantized feature amount to the decoder 212 (step S211).
  • step S201 to step S211 is an example of encoding processing by the encoder 200. Note that each process from step S201 to step S211 may be executed in any order as long as it does not violate causality.
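The encoder flow of steps S201 to S211 can be condensed into the following sketch. The component operations are passed in as callables, and the dict-based multiplexing is a placeholder; none of these names are taken from the publication.

```python
from typing import Any, Callable, Dict

def encode(x: Any,
           f_enc: Callable, g_enc: Callable, g_dec: Callable,
           vector_quantize: Callable, scalar_quantize: Callable,
           aux_probability: Callable, main_probability: Callable,
           entropy_encode: Callable) -> Dict[str, Any]:
    y = f_enc(x)                                 # S202: main data feature amount Y
    z = g_enc(y)                                 # S203: auxiliary feature amount Z
    y_hat = vector_quantize(y)                   # S204: vector quantized feature
    z_hat = scalar_quantize(z)                   # S205: scalar quantized feature
    p_aux = aux_probability(z_hat)               # S206: auxiliary data side probability
    aux_bits = entropy_encode(z_hat, p_aux)      # S208: entropy-encode the aux feature
    psi = g_dec(z_hat)                           # quantized auxiliary data
    p_main = main_probability(y_hat, psi)        # S209: main data side probability
    main_bits = entropy_encode(y_hat, p_main)    # S210: entropy-encode the main feature
    return {"main": main_bits, "aux": aux_bits}  # S211: multiplexed output
```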
  • FIG. 11 is a flowchart showing an example of the flow of processing executed by the decoder 212 in the embodiment.
  • the encoded data acquisition unit 213 acquires the result of encoding by the encoder 200 (step S301). Specifically, the results of encoding by the encoder 200 are the entropy-encoded vector quantized feature amount output in step S211 and the entropy-encoded scalar quantized feature amount.
  • the data separation unit 214 separates the entropy-encoded scalar quantized feature amount acquired in step S301 from the entropy-encoded vector quantized feature amount acquired in step S301 (step S302).
  • here, the separation means that the entropy-encoded scalar quantized feature amount obtained in step S301 is output to the auxiliary entropy decoding unit 215, and the entropy-encoded vector quantized feature amount obtained in step S301 is output to the primary entropy decoding unit 217.
  • the trained auxiliary entropy decoding unit 215 performs entropy decoding on the entropy-encoded scalar quantized feature using the trained parameterized auxiliary feature cumulative distribution function (step S303).
  • the learned auxiliary data side decoding unit 216 executes the learned auxiliary feature quantity decoding process on the result of entropy decoding by the trained auxiliary entropy decoding unit 215 (step S304).
  • the primary entropy decoding unit 217 performs entropy decoding on the entropy-encoded vector quantized feature using the decoded cumulative distribution function (step S305).
  • the learned main data side decoding unit 218 executes the learned main data feature quantity decoding process on the result of decoding by the primary entropy decoding unit 217 (step S306).
  • a series of processing from step S301 to step S306 is an example of decoding processing by the decoder 212. It should be noted that each process from step S301 to step S306 is executed after the encoding process by the encoder 200 such as step S211 is executed, and may be executed in any order as long as it does not violate causality.
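A corresponding sketch of the decoder flow of steps S301 to S306, using the same placeholder-callable convention as the encoder sketch above:

```python
from typing import Any, Callable, Dict

def decode(coded: Dict[str, Any],
           aux_entropy_decode: Callable, g_dec: Callable,
           main_entropy_decode: Callable, f_dec: Callable) -> Any:
    z_hat = aux_entropy_decode(coded["aux"])          # S303: uses the trained aux CDF
    psi = g_dec(z_hat)                                # S304: learned aux decoding
    y_hat = main_entropy_decode(coded["main"], psi)   # S305: uses the decoded CDF
    return f_dec(y_hat)                               # S306: learned main data decoding
```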
  • FIG. 12 is a diagram showing an example of the hardware configuration of the learning device 1 according to the embodiment.
  • the learning device 1 includes a control unit 11 including a processor 91 such as a CPU (Central Processing Unit) and a memory 92 that are connected via a bus, and executes a program.
  • the learning device 1 functions as a device including a control unit 11, an input unit 12, a communication unit 13, a storage unit 14, and an output unit 15 by executing a program.
  • the processor 91 reads the program stored in the storage unit 14 and stores the read program in the memory 92 .
  • the processor 91 executes the program stored in the memory 92 , whereby the learning device 1 functions as a device comprising the control section 11 , the input section 12 , the communication section 13 , the storage section 14 and the output section 15 .
  • the control unit 11 controls the operations of various functional units included in the learning device 1.
  • the control unit 11 controls the operation of the output unit 15, for example.
  • the control unit 11 records, for example, various information generated by learning in the storage unit 14 .
  • the input unit 12 includes input devices such as a mouse, keyboard, and touch panel.
  • the input unit 12 may be configured as an interface that connects these input devices to the learning device 1 .
  • the input unit 12 receives input of various information to the learning device 1 .
  • the communication unit 13 includes a communication interface for connecting the learning device 1 to an external device.
  • the communication unit 13 communicates with an external device via wire or wireless.
  • the external device is, for example, a device that transmits main data used for learning.
  • the communication unit 13 acquires main data used for learning through communication with the device that is the transmission source of the main data.
  • the external device is for example the autoencoding device 2 .
  • the communication unit 13 transmits the network learning result to the self-encoding device 2 through communication with the self-encoding device 2 .
  • the main data does not necessarily have to be input via the communication unit 13 and may be input to the input unit 12 .
  • the storage unit 14 is configured using a computer-readable storage medium device such as a magnetic hard disk device or a semiconductor storage device.
  • the storage unit 14 stores various information regarding the learning device 1 .
  • the storage unit 14 stores information input via the input unit 12 or the communication unit 13, for example.
  • the storage unit 14 stores, for example, various information generated by execution of learning.
  • the storage unit 14 pre-stores, for example, probability distributions used to acquire the occurrence probability of each element of the tensor indicating the auxiliary feature amount.
  • the storage unit 14 stores, for example, a parameterized auxiliary feature cumulative distribution function in advance.
  • the storage unit 14 stores in advance, for example, the parameterized main data feature quantity cumulative distribution function.
  • the storage unit 14 stores, for example, representative vector information in advance.
  • the storage unit 14 stores, for example, the result of hyperrectangular parallelepiped division.
  • the storage unit 14 stores, for example, the initial values of the parameters of the learning network 100 in advance.
  • the initial value is, for example, a random value.
  • the storage unit 14 stores, for example, network learning results.
  • the output unit 15 outputs various information.
  • the output unit 15 includes a display device such as a CRT (Cathode Ray Tube) display, a liquid crystal display, an organic EL (Electro-Luminescence) display, or the like.
  • the output unit 15 may be configured as an interface that connects these display devices to the learning device 1 .
  • the output unit 15 outputs information input to the input unit 12, for example.
  • the output unit 15 may display the result of learning, for example.
  • FIG. 13 is a diagram showing an example of the configuration of the control unit 11 included in the learning device 1 according to the embodiment.
  • the control unit 11 includes a learning unit 10 , a memory control unit 120 , a communication control unit 130 and an output control unit 140 .
  • the storage control unit 120 records various information in the storage unit 14 .
  • the communication control section 130 controls the operation of the communication section 13 .
  • the output control section 140 controls the operation of the output section 15 .
  • FIG. 14 is a diagram showing an example of the hardware configuration of the self-encoding device 2 in the embodiment.
  • the self-encoding device 2 includes a control section 21 including a processor 93 such as a CPU (Central Processing Unit) and a memory 94 that are connected via a bus, and executes a program.
  • the self-encoding device 2 functions as a device comprising a control section 21, an input section 22, a communication section 23, a storage section 24 and an output section 25 by executing a program.
  • the processor 93 reads the program stored in the storage unit 24 and stores the read program in the memory 94 .
  • the processor 93 executes the program stored in the memory 94 so that the self-encoding device 2 functions as a device comprising the control section 21 , the input section 22 , the communication section 23 , the storage section 24 and the output section 25 .
  • the control unit 21 controls operations of various functional units provided in the self-encoding device 2 .
  • the control unit 21 controls the operation of the output unit 25, for example.
  • the control unit 21 records various information generated by encoding by the encoder 200 and decoding by the decoder 212 in the storage unit 24, for example.
  • the input unit 22 includes input devices such as a mouse, keyboard, and touch panel.
  • the input unit 22 may be configured as an interface connecting these input devices to the autoencoding device 2 .
  • the input unit 22 receives input of various information to the self-encoding device 2 .
  • the communication unit 23 includes a communication interface for connecting the self-encoding device 2 to an external device.
  • the communication unit 23 communicates with an external device via wire or wireless.
  • the external device is, for example, the device that is the source of the self-encoding.
  • the communication unit 23 acquires the self-encoding target through communication with the device that is the transmission source of the self-encoding target.
  • the external device is the learning device 1, for example.
  • the communication unit 23 receives network learning results through communication with the learning device 1 . Note that the self-encoding target does not necessarily have to be input via the communication unit 23 and may be input to the input unit 22 .
  • the storage unit 24 is configured using a computer-readable storage medium device such as a magnetic hard disk device or a semiconductor storage device.
  • a storage unit 24 stores various information about the self-encoding device 2 .
  • the storage unit 24 stores information input via the input unit 22 or the communication unit 23, for example.
  • the storage unit 24 stores various kinds of information generated by executing encoding by the encoder 200 and decoding by the decoder 212, for example.
  • the storage unit 24 stores, for example, network learning results.
  • the storage unit 24 stores, for example, representative vector information in advance.
  • the storage unit 24 stores, for example, the result of hyperrectangular parallelepiped division.
  • the output unit 25 outputs various information.
  • the output unit 25 includes a display device such as a CRT (Cathode Ray Tube) display, a liquid crystal display, an organic EL (Electro-Luminescence) display, or the like.
  • the output unit 25 may be configured as an interface that connects these display devices to the self-encoding device 2 .
  • the output unit 25 outputs information input to the input unit 22, for example.
  • the output unit 25 may output, for example, the result of self-encoding of the object to be self-encoded.
  • FIG. 15 is a diagram showing an example of the configuration of the control unit 21 included in the self-encoding device 2 according to the embodiment.
  • the control unit 21 includes a self-encoding execution unit 20 , a memory control unit 220 , a communication control unit 230 and an output control unit 240 .
  • the self-encoding execution unit 20 performs self-encoding on a self-encoding target.
  • the autoencoding execution unit 20 comprises an encoder 200 and a decoder 212 .
  • the self-encoding execution unit 20 carries out encoding by the encoder 200 and decoding by the decoder 212 to self-encode the object to be self-encoded.
  • the storage control unit 220 records various information in the storage unit 24.
  • a communication control unit 230 controls the operation of the communication unit 23 .
  • the output control section 240 controls the operation of the output section 25 .
  • the learning device 1 configured in this way learns the peripheral processing for vector quantization using representative vector information, which is information indicating the positions of representative vectors arranged in a lattice in a vector space, as in LatticeVQ. The learning device 1 then estimates the occurrence probability of each representative vector using the result of the hyperrectangular parallelepiped partitioning. As described above, when such representative vectors are used, Voronoi division could be used to estimate the occurrence probability of the representative vectors, but the required integration over Voronoi regions is not easy.
  • By estimating the occurrence probability of a representative vector using the hyperrectangular parallelepipeds obtained by hyperrectangular parallelepiped partitioning, the learning device 1 can reduce the load required to obtain self-encoding processing using vector quantization. That is, the burden is reduced from the learning stage onward, up to the realization of self-encoding using vector quantization. Therefore, the learning device 1 can reduce the load required for self-encoding using vector quantization.
  • Since the learning device 1 configured in this way uses representative vectors to learn the peripheral processing for vector quantization, it is possible to reduce the burden that would otherwise be required for learning the representative vectors.
  • Furthermore, since the learning device 1 configured in this way learns the peripheral processing for vector quantization using the given representative vectors, there is no need to use memory for learning the representative vectors. Therefore, the learning device 1 can reduce the frequency of memory shortage problems and can process main data of a larger dimension.
  • In learning, the learning device 1 configured as described above uses, as noise, the samples located in the Voronoi region among the samples randomly generated in the (K-1)-dimensional sphere circumscribing the Voronoi region in the K-dimensional vector space. Therefore, it is possible to generate noise following a Gaussian distribution. As a result, the learning device 1 can reduce the burden required to obtain self-encoding processing using vector quantization, and hence the load required for self-encoding using vector quantization.
  • the self-encoding device 2 configured in this way performs self-encoding using vector quantization using the learning result of the learning device 1 . Therefore, the load required for self-encoding using vector quantization can be reduced.
  • the process of adding noise to the main data feature amount may be any one of the first noise adding process, the second noise adding process, or the third noise adding process.
  • the first noise adding process is a process of adding, as noise, samples in the Voronoi region among the samples randomly generated in the (K ⁇ 1)-dimensional sphere circumscribing the Voronoi region in the K-dimensional vector space. That is, the first noise adding process is the process described with reference to FIGS. 3 and 4.
  • the second noise addition process is a process of adding, as noise, samples uniformly generated within a (K-1)-dimensional sphere whose enclosed volume approximates the volume of a Voronoi region in the K-dimensional vector space.
  • the third noise addition process is a process of adding, as noise, samples uniformly generated in a hyperrectangular parallelepiped region, that is, a region that divides the vector space in which the representative vectors are arranged in a lattice, that contains one lattice point of the vector space, and that has a hyperrectangular parallelepiped shape.
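A minimal sketch of the first noise adding process, under the assumption that the circumradius of the Voronoi region and the lattice quantizer (for instance, a function like the quantize_dk sketch shown earlier) are supplied by the caller:

```python
import random
from typing import Callable, List, Sequence

def uniform_in_ball(dim: int, radius: float) -> List[float]:
    """Rejection-sample a point uniformly inside a dim-dimensional ball."""
    while True:
        v = [random.uniform(-radius, radius) for _ in range(dim)]
        if sum(c * c for c in v) <= radius * radius:
            return v

def add_voronoi_noise(target: Sequence[float],
                      quantize: Callable[[Sequence[float]], Sequence[float]],
                      circumradius: float) -> List[float]:
    """Add, to the target vector, a sample drawn uniformly from the Voronoi
    region of the origin lattice point."""
    dim = len(target)
    while True:
        candidate = uniform_in_ball(dim, circumradius)
        # Keep the sample only if the lattice quantizer maps it to the origin,
        # i.e., if it lies inside the Voronoi region of the origin.
        if all(c == 0 for c in quantize(candidate)):
            return [t + n for t, n in zip(target, candidate)]
```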
  • the learning device 1 may be implemented using a plurality of information processing devices that are communicably connected via a network.
  • each functional unit included in the learning device 1 may be distributed and implemented in a plurality of information processing devices.
  • the self-encoding device 2 may be implemented using a plurality of information processing devices communicatively connected via a network.
  • each functional unit included in the self-encoding device 2 may be distributed and implemented in a plurality of information processing devices.
  • All or part of the functions of the learning device 1 and the self-encoding device 2 are hardware such as ASIC (Application Specific Integrated Circuit), PLD (Programmable Logic Device), and FPGA (Field Programmable Gate Array). may be implemented using The program may be recorded on a computer-readable recording medium.
  • Computer-readable recording media include portable media such as flexible disks, magneto-optical disks, ROMs and CD-ROMs, and storage devices such as hard disks incorporated in computer systems.
  • the program may be transmitted over telecommunications lines.
  • Reference signs (partial list): ... Output control unit; 21 Control unit; 22 Input unit; 23 Communication unit; 24 Storage unit; 25 Output unit; 20 Self-encoding execution unit; 220 Storage control unit; 230 Communication control unit; 240 Output control unit; 91 Processor; 92 Memory; 93 Processor; 94 Memory


Abstract

A learning device according to one aspect of the present invention comprises a learning unit that executes learning to update coding and decoding processes in an autoencoding process involving using vector quantization, using a primary data feature that is a feature to be subjected to autoencoding and an auxiliary feature that is a feature of the primary data feature, and entropy-coding the result of vector-quantizing the primary data feature and entropy-coding the result of scalar-quantizing the auxiliary feature. The learning unit executes, through learning, a primary data-side probability estimation process for estimating the probability of occurrence of each element of a tensor indicating the primary data feature, and the primary data-side probability estimation process involves estimating said probability of occurrence by using the result of integrating a parameterized probability density function over an integration region that is a region obtained by dividing a vector space in which representative vectors are arranged in a lattice pattern and is a hyper-rectangular region including one lattice point of the vector space.

Description

学習装置、自己符号化装置、学習方法、自己符号化方法及びプログラムLearning device, self-encoding device, learning method, self-encoding method and program
 本発明は、学習装置、自己符号化装置、学習方法、自己符号化方法及びプログラムに関する。 The present invention relates to a learning device, a self-encoding device, a learning method, a self-encoding method and a program.
 深層学習を用いたデータ圧縮において量子化処理にはスカラー量子化とベクトル量子化の2種類があり、ベクトル量子化の方が、圧縮性能が高い。ベクトル量子化の試みとして、以下の非特許文献1や非特許文献2に記載の試みがある。非特許文献1に記載の技術は、VQVAEと呼称される技術である。非特許文献1に記載の技術では、End-to-Endでの学習はせず、特徴ベクトルの分布を一様分布と仮定して、エンコード及びデコードの最適化と代表ベクトルの最適化とを分けて実施する。非特許文献2の技術は、Soft to Hardと呼称される技術である。非特許文献2の技術は、ソフトマックス関数を用いて量子化処理を近似計算し、代表ベクトルの生起確率をヒストグラムで近似してEnd-to-Endでの学習を行う。 There are two types of quantization processing in data compression using deep learning: scalar quantization and vector quantization. Vector quantization has higher compression performance. As attempts of vector quantization, there are attempts described in Non-Patent Document 1 and Non-Patent Document 2 below. The technology described in Non-Patent Document 1 is a technology called VQVAE. In the technique described in Non-Patent Document 1, end-to-end learning is not performed, and the distribution of feature vectors is assumed to be a uniform distribution, and optimization of encoding and decoding and optimization of representative vectors are separated. to implement. The technique of Non-Patent Document 2 is a technique called Soft to Hard. The technique of Non-Patent Document 2 performs end-to-end learning by approximating the quantization process using a softmax function and approximating the probability of occurrence of representative vectors with a histogram.
 しかしながら非特許文献1に記載の技術や非特許文献2の技術など従来の技術は、ベクトル量子化を用いた自己符号化の処理を得ることに関する問題があった。問題は、例えば安定性の問題であったり、例えば処理性能の問題であったり、例えばベクトルの次元が大きい場合や高レートでの符号化を行う場合に計算量又はメモリ使用量が爆発的に増加してしまう問題であったりであった。このように従来の技術では、ベクトル量子化を用いた自己符号化の処理を得ることに要する負担が大きい場合があった。その結果、コンピュータ資源の不足によりベクトル量子化を用いた自己符号化を実現できないなど、ベクトル量子化を用いた自己符号化に要する負担が大きい場合があった。 However, conventional techniques such as the technique described in Non-Patent Document 1 and the technique of Non-Patent Document 2 have problems related to obtaining self-encoding processing using vector quantization. The problem can be, for example, a stability problem, or a processing performance problem, for example, when the dimension of the vector is large, or when encoding at a high rate, the amount of computation or memory usage increases exponentially. It was a problem that would end up. As described above, in the prior art, the burden required to obtain self-encoding processing using vector quantization was sometimes large. As a result, in some cases, the load required for self-encoding using vector quantization is large, such as the fact that self-encoding using vector quantization cannot be realized due to lack of computer resources.
 上記事情に鑑み、本発明は、ベクトル量子化を用いた自己符号化に要する負担を軽減する技術を提供することを目的としている。 In view of the above circumstances, an object of the present invention is to provide a technology that reduces the burden required for self-encoding using vector quantization.
 本発明の一態様は、ベクトル量子化を用いた自己符号化の処理であり、自己符号化の対象の特徴量である主データ特徴量と、前記主データ特徴量の特徴量である補助特徴量とを用いる自己符号化の処理であり、前記主データ特徴量をベクトル量子化した結果に対するエントロピー符号化と前記補助特徴量をスカラー量子化した結果に対するエントロピー符号化とを行う自己符号化の処理、における符号化と復号との処理を学習により更新する学習部、を備え、前記学習部は前記学習において、前記主データ特徴量を示すテンソルの各要素の生起確率を推定する主データ側確率推定処理を実行し、前記主データ側確率推定処理は、代表ベクトルが格子状に配置されたベクトル空間を分割する領域であって前記ベクトル空間の1つの格子点を含む領域であって形状が超直方体である領域を積分領域としてパラメトライズされた確率密度関数が積分された結果を用いて、前記生起確率を推定する、学習装置である。 One aspect of the present invention is a process of self-encoding using vector quantization. is a self-encoding process that performs entropy encoding on the result of vector quantization of the main data feature amount and entropy encoding on the result of scalar quantization of the auxiliary feature amount, a learning unit that updates the encoding and decoding processes in the learning by learning, wherein the learning unit estimates the probability of occurrence of each element of the tensor representing the main data feature amount in the main data side probability estimation process and the main data side probability estimation processing is performed by dividing a vector space in which the representative vectors are arranged in a grid pattern, an area including one grid point of the vector space, and having a hyperrectangular shape. The learning device estimates the occurrence probability using a result of integration of a probability density function parameterized with a given area as an integration area.
 本発明の一態様は、自己符号化の対象を取得する自己符号化対象取得部と、ベクトル量子化を用いた自己符号化の処理であり、自己符号化の対象の特徴量である主データ特徴量と、前記主データ特徴量の特徴量である補助特徴量とを用いる自己符号化の処理であり、前記主データ特徴量をベクトル量子化した結果に対するエントロピー符号化と前記補助特徴量をスカラー量子化した結果に対するエントロピー符号化とを行う自己符号化の処理、における符号化と復号との処理を学習により更新する学習部、を備え、前記学習部は前記学習において、前記主データ特徴量を示すテンソルの各要素の生起確率を推定する主データ側確率推定処理を実行し、前記主データ側確率推定処理は、代表ベクトルが格子状に配置されたベクトル空間を分割する領域であって前記ベクトル空間の1つの格子点を含む領域であって形状が超直方体である領域を積分領域としてパラメトライズされた確率密度関数が積分された結果を用いて、前記生起確率を推定する、学習装置を用いて得られた、学習済みの符号化の処理と学習済みの復号の処理とを用いて、前記自己符号化対象取得部が取得した前記対象のベクトル量子化による自己符号化を行う自己符号化実行部と、を備える自己符号化装置である。 One aspect of the present invention is a self-encoding target acquisition unit that acquires a self-encoding target, and a self-encoding process using vector quantization. and an auxiliary feature amount that is a feature amount of the main data feature amount, entropy coding of the result of vector quantization of the main data feature amount and scalar quantization of the auxiliary feature amount. a learning unit that updates the encoding and decoding processes in self-encoding processing that performs entropy encoding on the converted result by learning, and the learning unit indicates the main data feature amount in the learning. main-data-side probability estimation processing for estimating the probability of occurrence of each element of the tensor, wherein the main-data-side probability estimation processing divides a vector space in which representative vectors are arranged in a lattice, and divides the vector space into Estimate the occurrence probability using a learning device that estimates the occurrence probability using the result of integrating the probability density function parameterized as an integration region that is a region that includes one lattice point of and has a hypercube shape a self-encoding execution unit that performs self-encoding by vector quantization on the target acquired by the self-encoding target acquisition unit, using the learned encoding process and the learned decoding process, is an autoencoding device comprising:
 本発明の一態様は、ベクトル量子化を用いた自己符号化の処理であり、自己符号化の対象の特徴量である主データ特徴量と、前記主データ特徴量の特徴量である補助特徴量とを用いる自己符号化の処理であり、前記主データ特徴量をベクトル量子化した結果に対するエントロピー符号化と前記補助特徴量をスカラー量子化した結果に対するエントロピー符号化とを行う自己符号化の処理、における符号化と復号との処理を学習により更新する学習ステップ、を有し、前記学習ステップは前記学習において前記主データ特徴量を示すテンソルの各要素の生起確率を推定する主データ側確率推定処理を実行し、前記主データ側確率推定処理は、代表ベクトルが格子状に配置されたベクトル空間を分割する領域であって前記ベクトル空間の1つの格子点を含む領域であって形状が超直方体である領域を積分領域としてパラメトライズされた確率密度関数が積分された結果を用いて、前記生起確率を推定する、学習方法である。 One aspect of the present invention is a process of self-encoding using vector quantization. is a self-encoding process that performs entropy encoding on the result of vector quantization of the main data feature amount and entropy encoding on the result of scalar quantization of the auxiliary feature amount, a learning step of updating the encoding and decoding processes in the learning by learning, wherein the learning step estimates the probability of occurrence of each element of the tensor representing the main data feature amount in the learning, the main data side probability estimation process and the main data side probability estimation processing is performed by dividing a vector space in which the representative vectors are arranged in a grid pattern, an area including one grid point of the vector space, and having a hyperrectangular shape. This learning method estimates the occurrence probability using a result of integration of a probability density function parameterized with a given area as an integration area.
 本発明の一態様は、自己符号化の対象を取得する自己符号化対象取得ステップと、ベクトル量子化を用いた自己符号化の処理であり、自己符号化の対象の特徴量である主データ特徴量と、前記主データ特徴量の特徴量である補助特徴量とを用いる自己符号化の処理であり、前記主データ特徴量をベクトル量子化した結果に対するエントロピー符号化と前記補助特徴量をスカラー量子化した結果に対するエントロピー符号化とを行う自己符号化の処理、における符号化と復号との処理を学習により更新する学習部、を備え、前記学習部は前記学習において、前記主データ特徴量を示すテンソルの各要素の生起確率を推定する主データ側確率推定処理を実行し、前記主データ側確率推定処理は、代表ベクトルが格子状に配置されたベクトル空間を分割する領域であって前記ベクトル空間の1つの格子点を含む領域であって形状が超直方体である領域を積分領域としてパラメトライズされた確率密度関数が積分された結果を用いて、前記生起確率を推定する、学習装置を用いて得られた、学習済みの符号化の処理と学習済みの復号の処理とを用いて、前記自己符号化対象取得ステップが取得した前記対象のベクトル量子化による自己符号化を行う自己符号化実行ステップと、を有する自己符号化方法である。 One aspect of the present invention is a self-encoding target acquisition step for acquiring a self-encoding target, and a self-encoding process using vector quantization, wherein the main data feature is a feature quantity of the self-encoding target. and an auxiliary feature amount that is a feature amount of the main data feature amount, entropy coding of the result of vector quantization of the main data feature amount and scalar quantization of the auxiliary feature amount. a learning unit that updates the encoding and decoding processes in self-encoding processing that performs entropy encoding on the converted result by learning, and the learning unit indicates the main data feature amount in the learning. main-data-side probability estimation processing for estimating the probability of occurrence of each element of the tensor, wherein the main-data-side probability estimation processing divides a vector space in which representative vectors are arranged in a lattice, and divides the vector space into Estimate the occurrence probability using a learning device that estimates the occurrence probability using the result of integrating the probability density function parameterized as an integration region that is a region that includes one lattice point of and has a hypercube shape a self-encoding execution step of performing self-encoding by vector quantization on the target obtained by the self-encoding target obtaining step, using the learned encoding process and the learned decoding process, is a self-encoding method with .
 本発明の一態様は、上記の学習装置としてコンピュータを機能させるためのプログラムである。 One aspect of the present invention is a program for causing a computer to function as the above learning device.
 本発明の一態様は、上記の自己符号化装置としてコンピュータを機能させるためのプログラムである。 One aspect of the present invention is a program for causing a computer to function as the above self-encoding device.
 本発明により、ベクトル量子化を用いた自己符号化に要する負担を軽減することが可能となる。 The present invention makes it possible to reduce the burden required for self-encoding using vector quantization.
Brief description of the drawings: FIG. 1 is an explanatory diagram illustrating an overview of the learning device of the embodiment. FIG. 2 is an explanatory diagram illustrating LatticeVQ in the embodiment. FIG. 3 is a first explanatory diagram illustrating an example of adding noise in the embodiment. FIG. 4 is a second explanatory diagram illustrating an example of adding noise in the embodiment. FIG. 5 is an explanatory diagram illustrating the relationship between the cumulative distribution function and the occurrence probability in the embodiment. FIG. 6 is an explanatory diagram illustrating hyperrectangular parallelepiped division in the embodiment. FIG. 7 is a diagram showing an example of the flow of processing executed by the learning unit in the embodiment. FIG. 8 is a first explanatory diagram illustrating an overview of the self-encoding device in the embodiment. FIG. 9 is a second explanatory diagram illustrating an overview of the self-encoding device in the embodiment. FIG. 10 is a flowchart showing an example of the flow of processing executed by the encoder in the embodiment. FIG. 11 is a flowchart showing an example of the flow of processing executed by the decoder in the embodiment. FIG. 12 is a diagram showing an example of the hardware configuration of the learning device in the embodiment. FIG. 13 is a diagram showing an example of the configuration of the control unit included in the learning device in the embodiment. FIG. 14 is a diagram showing an example of the hardware configuration of the self-encoding device in the embodiment. FIG. 15 is a diagram showing an example of the configuration of the control unit included in the self-encoding device in the embodiment.
 (実施形態)
 図1は、実施形態の学習装置1の概要を説明する説明図である。学習装置1は、学習部10を備える。学習部10は、テンソルで表現されたデータのベクトル量子化を用いた自己符号化の性能が高まるように、学習を行う。なお、自己符号化の性能は元データと復元データの誤差Dとデータの符号量Rをラグランジュ定数λで重み付け和した量であるRDコスト(D+λR)の小ささで評価される。RD性能はRDコストが小さいほどよいことを表す。なお、学習は、機械学習を意味する。学習は、例えば深層学習である。なお、よく知られているようにデータの自己符号化とはデータの圧縮を意味する。
(embodiment)
FIG. 1 is an explanatory diagram for explaining an overview of the learning device 1 of the embodiment. The learning device 1 includes a learning section 10 . The learning unit 10 performs learning so as to improve the performance of self-encoding using vector quantization of data represented by a tensor. The performance of self-encoding is evaluated by the smallness of the RD cost (D+λR), which is the weighted sum of the error D between the original data and the restored data and the code amount R of the data with the Lagrangian constant λ. A smaller RD cost indicates better RD performance. Note that learning means machine learning. Learning is, for example, deep learning. As is well known, data self-encoding means compression of data.
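As a rough sketch of how such an RD cost could be computed during learning (the mean-squared-error distortion and the helper names are assumptions, not the publication's definitions):

```python
import math
from typing import Sequence

def rd_cost(original: Sequence[float], reconstructed: Sequence[float],
            main_probs: Sequence[float], aux_probs: Sequence[float],
            lam: float) -> float:
    # Distortion D: here a mean squared error between original and restored data.
    d = sum((o - r) ** 2 for o, r in zip(original, reconstructed)) / len(original)
    # Code amount R: ideal code length implied by the estimated probabilities.
    r = sum(-math.log2(p) for p in main_probs) + sum(-math.log2(p) for p in aux_probs)
    return d + lam * r
```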
The learning unit 10 includes a learning network 100 and an optimization unit 113. The learning network 100 is a neural network. The learning network 100 includes a main data acquisition unit 101, a main data side encoding unit 102, an auxiliary data side encoding unit 103, a main data side noise addition unit 104, an auxiliary data side noise addition unit 105, an auxiliary data side probability estimation unit 106, an auxiliary data side decoding unit 107, an auxiliary entropy acquisition unit 108, a main data side probability estimation unit 109, a main entropy acquisition unit 110, a main data side decoding unit 111, and a reconstruction error calculation unit 112. Although details will be described later, the optimization unit 113 updates the learning network 100 based on the output of the learning network 100.
 主データ取得部101は、テンソルで表現されたデータを主データとして取得する。テンソルで表現されたデータとは、例えば画像のデータである。テンソルで表現されたデータは、例えば1又は複数のチャネルの音声の時系列のデータであってもよい。以下、主データ取得部101が取得したデータを主データという。 The main data acquisition unit 101 acquires data represented by a tensor as main data. Data represented by a tensor is, for example, image data. The tensor-expressed data may be, for example, time-series data of one or more channels of audio. The data acquired by the main data acquisition unit 101 is hereinafter referred to as main data.
 主データ側符号化部102は、主データ特徴量取得処理を実行する。主データ特徴量取得処理は主データを符号化する処理である。符号化とは符号化の対象の特徴を示す情報を取得する処理であるので、符号化とは特徴量を示す情報を取得する処理である。 The main data side encoding unit 102 executes main data feature quantity acquisition processing. The main data feature amount acquisition process is a process of encoding the main data. Coding is a process of obtaining information indicating the characteristics of an object to be encoded, so encoding is a process of obtaining information indicating the amount of characteristics.
 したがって、主データ側符号化部102による主データの符号化とは主データの特徴量を取得する処理である。そのため主データ特徴量取得処理は、主データ特徴量を取得する処理である。主データ特徴量は符号化された主データである。主データ特徴量取得処理の内容は、学習により更新される。すなわち、主データ側符号化部102の実行する処理の内容は、学習により更新される。 Therefore, the encoding of the main data by the main data side encoding unit 102 is a process of acquiring the feature amount of the main data. Therefore, the main data feature amount acquisition process is a process of acquiring the main data feature amount. The main data feature quantity is encoded main data. The content of the main data feature quantity acquisition process is updated by learning. That is, the contents of the processing executed by main data side encoding section 102 are updated by learning.
 補助データ側符号化部103は、主データ特徴量をさらに符号化する。上述したように符号化とは特徴量を示す情報を取得する処理であるので、補助データ側符号化部103による主データ特徴量の符号化とは主データ特徴量の特徴量を取得する処理である。そこで、以下、符号化された主データがさらに符号化された情報を補助特徴量という。すなわち補助特徴量は、主データ特徴量の特徴量を示す情報である。補助特徴量は、主データ特徴量が符号化された情報であるので、補助特徴量のエントロピーは主データ特徴量よりもエントロピーの小さな情報である。 The auxiliary data side encoding unit 103 further encodes the main data feature quantity. As described above, encoding is a process of acquiring information indicating a feature amount, so encoding of the main data feature amount by the auxiliary data side encoding unit 103 is a process of acquiring the feature amount of the main data feature amount. be. Therefore, hereinafter, information obtained by further encoding the encoded main data will be referred to as an auxiliary feature amount. That is, the auxiliary feature amount is information indicating the feature amount of the main data feature amount. Since the auxiliary feature amount is information obtained by encoding the main data feature amount, the entropy of the auxiliary feature amount is information smaller than the entropy of the main data feature amount.
 以下、主データ特徴量を符号化する処理を、補助特徴量取得処理という。補助特徴量取得処理の内容は学習により更新される。すなわち、補助データ側符号化部103の実行する処理の内容は、学習により更新される。 Hereinafter, the process of encoding the main data feature quantity will be referred to as the auxiliary feature quantity acquisition process. The content of the auxiliary feature quantity acquisition process is updated by learning. That is, the contents of the processing executed by the auxiliary data side encoding unit 103 are updated by learning.
 補助特徴量取得処理の内容は、主データ統計量情報の情報量がより多く補助特徴量に含まれるように学習により更新される。主データ統計量情報は、主データを表現するテンソルの各要素の値が従う各確率分布の統計量を示す情報である。確率分布の統計量は、例えば散布度である。確率分布の統計量は、散布度だけでなく散布度と代表値との組であってもよい。 The content of the auxiliary feature acquisition process is updated through learning so that the amount of information in the main data statistics information is included in the auxiliary feature. The main data statistic information is information indicating the statistic of each probability distribution followed by the value of each element of the tensor representing the main data. A statistic of the probability distribution is, for example, the degree of dispersion. The statistic of the probability distribution may be not only the degree of dispersion but also a set of the degree of dispersion and a representative value.
 なお、情報処理や情報理論の分野において説明されるように、英文に出現する各ローマ字の出現の確率分布などのようにデータは一般に確率分布に従って出現する。主データを表現するテンソルの各要素の値もまた確率分布に従う。 In addition, as explained in the fields of information processing and information theory, data generally appears according to a probability distribution, such as the probability distribution of the appearance of each Roman character that appears in English. The value of each element of the tensor representing the main data also follows the probability distribution.
 主データ側ノイズ付与部104は、ノイズ付き主データ特徴量取得処理を実行する。ノイズ付き主データ特徴量取得処理は、主データ特徴量に対してベクトルノイズ付与処理を実行する処理である。ベクトルノイズ付与処理は、要素の数が予め定められた数K(Kは2以上の整数)であるK次元ベクトル(以下「ノイズ付与対象ベクトル」という。)を処理対象とする処理であって、処理対象に対してノイズを付与する処理である。したがって、ノイズ付き主データ特徴量取得処理は、主データ特徴量にノイズが付与された情報(以下「ノイズ付き主データ特徴量」という。)を取得する処理である。 The main data side noise addition unit 104 executes noise-added main data feature quantity acquisition processing. The noisy main data feature amount acquisition process is a process of applying vector noise to the main data feature amount. The vector noise addition process is a process for processing a K-dimensional vector (hereinafter referred to as a "noise addition target vector") having a predetermined number K (K is an integer of 2 or more) of elements, This is the process of adding noise to the processing target. Therefore, the noise-added main data feature amount acquisition process is a process for acquiring information in which noise is added to the main data feature amount (hereinafter referred to as "noise-added main data feature amount").
Specifically, the noise addition target vector is a K-dimensional vector included in the tensor that expresses the main data feature amount. A K-dimensional vector included in a tensor means a vector whose k-th element (k is an integer of 1 or more and K or less) is the k-th element of K consecutive elements among all the elements of the tensor.
 ノイズ付与対象ベクトルの要素数Kは、学習部10による学習の結果を用いたベクトル量子化を行う装置によるベクトル量子化によって、1つの符号に写像される要素の数である。以下、学習部10による学習の結果を、ネットワーク学習結果、という。したがって、ネットワーク学習結果を用いたベクトル量子化によってK個の要素がまとめて1つの符号に写像される場合、ノイズ付与対象ベクトルの要素数はKである。なお、量子化によってK個の要素がまとめて1つの符号に写像される場合、得られる符号はK次元のベクトルを示すインデックスである。 The number of elements K of the noise addition target vector is the number of elements mapped to one code by vector quantization by a device that performs vector quantization using the learning result of the learning unit 10 . The result of learning by the learning unit 10 is hereinafter referred to as a network learning result. Therefore, when K elements are collectively mapped to one code by vector quantization using the network learning result, the number of elements of the noise addition target vector is K. Note that when K elements are collectively mapped to one code by quantization, the obtained code is an index indicating a K-dimensional vector.
 したがって、ノイズ付与対象ベクトルの要素数Kは、予め決定された数である。なお、データのベクトル量子化とはデータの符号化であるので、ネットワーク学習結果を用いたベクトル量子化とはネットワーク学習結果を用いたデータの符号化を意味する。 Therefore, the number of elements K of the noise addition target vector is a predetermined number. Since vector quantization of data is data encoding, vector quantization using network learning results means data encoding using network learning results.
 ここでノイズを付与する方法について説明するが、ノイズの付与の説明に先立ち、LatticeVQについて説明する。 A method for adding noise will be described here, but before describing noise addition, LatticeVQ will be described.
<LatticeVQ>
 ベクトル量子化では代表ベクトルが必要である。Latticeは、ベクトル空間における格子点集合である。LatticeVQは、ベクトル空間において代表ベクトルを格子状に配置する場合のベクトル量子化である。すなわち、LatticeVQは、ベクトル空間において代表ベクトルが格子状に配置されるという条件を満たすベクトル量子化である。LatticeVQは特定の条件を除いて、スカラー量子化よりもRD性能が良いことが知られている。
<Lattice VQ>
Vector quantization requires a representative vector. Lattice is a set of lattice points in vector space. LatticeVQ is vector quantization when representative vectors are arranged in a lattice in a vector space. That is, LatticeVQ is vector quantization that satisfies the condition that representative vectors are arranged in a lattice in a vector space. LatticeVQ is known to have better RD performance than scalar quantization, except for certain conditions.
 図2は、実施形態におけるLatticeVQを説明する説明図である。より具体的には、図2は、ノイズ付与対象ベクトルの要素数が2である場合のLatticeVQを説明する説明図である。 FIG. 2 is an explanatory diagram explaining LatticeVQ in the embodiment. More specifically, FIG. 2 is an explanatory diagram for explaining LatticeVQ when the number of elements of the noise addition target vector is two.
FIG. 2 shows an example of an A2 lattice. In the two-dimensional case, one type of lattice is the A2 lattice. In the eight-dimensional case, one type of lattice is the E8 lattice. In the 24-dimensional case, one type of lattice is the Leech lattice. Each of these is the LatticeVQ lattice that maximizes the RD performance for a uniform distribution in its respective dimension.
 以下、代表ベクトルが格子状に配置された空間を格子空間という。すなわち、代表ベクトルは、格子空間の各格子点に位置する。図2は格子空間の一例を示す図でもある。図2の例では格子空間は2次元であるが、代表ベクトルがK次元であれば格子空間もK次元である。 The space in which the representative vectors are arranged in a lattice is hereinafter referred to as the lattice space. That is, the representative vector is located at each lattice point in the lattice space. FIG. 2 is also a diagram showing an example of a lattice space. In the example of FIG. 2, the lattice space is two-dimensional, but if the representative vectors are K-dimensional, the lattice space is also K-dimensional.
<ノイズの付与>
 それでは図3及び図4を用いてノイズの付与の一例を、要素数が2の場合を例に説明する。図3は、実施形態におけるノイズの付与の一例を説明する第1の説明図である。図4は、実施形態におけるノイズの付与の一例を説明する第2の説明図である。
<Addition of noise>
Now, an example of adding noise will be described with reference to FIGS. 3 and 4, where the number of elements is two. FIG. 3 is a first explanatory diagram illustrating an example of adding noise in the embodiment. FIG. 4 is a second explanatory diagram illustrating an example of adding noise in the embodiment.
When adding noise, first, a plurality of random K-dimensional vectors uniformly distributed within the (K-1)-dimensional sphere circumscribing the lattice unit region of the origin lattice point (hereinafter referred to as the "target lattice unit region") are generated and arranged in the lattice space. For example, in the case of FIG. 4, samples are generated until the number of interior points in FIG. 4 becomes at least one. A lattice unit region is each region resulting from dividing the lattice space into a plurality of regions so that a division condition is satisfied.
 分割条件は、各領域の大きさ及び形状が同一であり、各領域はそれぞれ格子空間の1つの格子点を含むという条件である。したがって、格子単位領域は、代表ベクトルが格子状に配置されたベクトル空間を分割する領域であってベクトル空間の1つの格子点を含む領域である。 The division condition is that each region has the same size and shape, and each region includes one grid point in the grid space. Therefore, a lattice unit area is an area that divides a vector space in which representative vectors are arranged in a lattice and includes one lattice point of the vector space.
 なお、(K―1)次元球面は、K=2であれば円であり、K=3であれば球面であり、Kが4以上であれば、超球面である。図3の領域B1が対象格子単位領域の一例である。 It should be noted that the (K−1)-dimensional sphere is a circle if K=2, a sphere if K=3, and a hypersphere if K is 4 or more. A region B1 in FIG. 3 is an example of the target lattice unit region.
 図3及び図4の例において、格子単位領域は、ボロノイ領域である。ボロノイ領域は、格子空間等の距離空間をボロノイ分割によって複数の領域に分割した結果得られる各前記領域である。 In the examples of FIGS. 3 and 4, the lattice unit regions are Voronoi regions. A Voronoi region is each region obtained by dividing a metric space such as a lattice space into a plurality of regions by Voronoi division.
 図3のノイズ点は、対象格子単位領域の外接円内に配置されたサンプルの一例である。なお、格子空間内にサンプルが配置される処理とは、具体的には、格子空間内の座標が取得される処理である。そして、格子空間における格子単位領域は格子空間内の領域であるので、任意の1つの格子単位領域の外接円内にサンプルが配置される処理とは、その外接円内の座標を取得する処理である。 The noise points in FIG. 3 are an example of samples arranged within the circumscribed circle of the target grid unit area. Note that the process of arranging the samples in the lattice space is specifically the process of acquiring the coordinates in the lattice space. Since the lattice unit area in the lattice space is an area within the lattice space, the process of arranging the samples within the circumscribed circle of any one lattice unit area is the process of obtaining the coordinates within the circumscribed circle. be.
 次に、配置されたノイズ点のうち、対象格子単位領域内に位置するノイズ点(以下「内部点」という。)と、対象格子単位領域外に位置するノイズ点(以下「除外点」という。)と、が図4の例に示すように、区別される。内部点と除外点との区別の処理は、具体的には、ノイズ点ごとに、ノイズ点の座標に基づき対象格子単位領域内の座標か対象格子単位領域外の座標かを判定する処理である。判定処理は具体的にはノイズ点を式(11)によってベクトル量子化した結果、原点格子と一致した点が領域内の座標と判定される処理である。 Next, among the arranged noise points, noise points located within the target lattice unit region (hereinafter referred to as “internal points”) and noise points located outside the target grid unit region (hereinafter referred to as “exclusion points”). ) and are distinguished as shown in the example of FIG. Specifically, the process of distinguishing between internal points and excluded points is a process of determining, for each noise point, whether the coordinates are inside the target grid unit area or outside the target grid unit area based on the coordinates of the noise point. . Specifically, the determination process is a process in which points that match the origin lattice as a result of vector quantization of the noise points by Equation (11) are determined to be coordinates within the region.
 次に、対象格子単位領域内位置する複数のノイズ点のうちの1つがランダムに選択される。次に、選択されたノイズ点が1つのノイズ付与対象ベクトルに付与される。付与の処理は具体的には、ノイズ付与対象ベクトルと選択されたノイズ点を示す位置ベクトルとが足し算される処理を意味する。 Next, one of the plurality of noise points located within the grid unit area of interest is randomly selected. Next, the selected noise points are added to one noise addition target vector. The addition process specifically means a process of adding a noise-addition target vector and a position vector indicating a selected noise point.
 このように、選択されたノイズ点を示す位置ベクトルは乱数で決まる量であるので、ノイズの一種である。そして、選択されたノイズ点を示す位置ベクトルはベクトルで表現されるので、選択されたノイズ点を示す位置ベクトルはベクトルで表現されたノイズである。そこで、以下、選択されたノイズ点を示す位置ベクトルをノイズベクトルという。このようにしてノイズ付与対象ベクトルにはノイズが付与される。 In this way, the position vector indicating the selected noise point is a type of noise because it is a quantity determined by random numbers. Since the position vector indicating the selected noise point is represented by a vector, the position vector indicating the selected noise point is noise represented by a vector. Therefore, the position vector indicating the selected noise point is hereinafter referred to as a noise vector. In this way, noise is added to the noise addition target vector.
 図1の説明に戻る。補助データ側ノイズ付与部105は、ノイズ付き補助特徴量取得処理を実行する。ノイズ付き補助特徴量取得処理は、補助特徴量に対してノイズを付与する処理である。すなわち、ノイズ付き補助特徴量取得処理は、補助特徴量に対してスカラーのノイズが付与された情報(以下「ノイズ付き補助特徴量」という。)を取得する処理である。より具体的には、補助データ側ノイズ付与部105は、補助特徴量を示すテンソルの各要素にスカラーのノイズを付与する。スカラーのノイズは例えば、-1/2以上1/2以下の一様ノイズである。 Return to the description of Figure 1. The auxiliary data-side noise addition unit 105 executes auxiliary feature quantity acquisition processing with noise. The noise-attached auxiliary feature acquisition process is a process of adding noise to the auxiliary feature. That is, the noise-attached auxiliary feature acquisition process is a process of acquiring information obtained by adding scalar noise to the auxiliary feature (hereinafter referred to as "noise-attached auxiliary feature"). More specifically, the auxiliary data noise adding unit 105 adds scalar noise to each element of the tensor representing the auxiliary feature amount. The scalar noise is, for example, uniform noise between -1/2 and 1/2.
 The auxiliary data side probability estimation unit 106 executes an auxiliary data side probability estimation process, which estimates the auxiliary data side probability from the noise-added auxiliary feature. The auxiliary data side probability is information indicating the occurrence probability of each element of the tensor representing the auxiliary feature. Estimating these occurrence probabilities uses information on the probability distribution of each element of that tensor.
 When the probability distribution of each element of the tensor representing the auxiliary feature is given in advance, the auxiliary data side probability estimation unit 106 obtains that distribution, for example by reading it from a predetermined storage device such as the storage unit 14 described later, and estimates the occurrence probability of each element based on the obtained distribution. The distribution given in advance is, for example, a probability distribution expressed using a cumulative distribution function.
 When the probability distribution of each element of the tensor representing the auxiliary feature is expressed by a parameterized auxiliary feature cumulative distribution function, the auxiliary data side probability estimation unit 106 estimates the occurrence probability of each element based on that function. The parameterized auxiliary feature cumulative distribution function is a parameterized function indicating the probability distribution of each element of the tensor representing the auxiliary feature; its parameters are, specifically, parameters that change according to statistics describing that probability distribution.
 The parameter values of the parameterized auxiliary feature cumulative distribution function are updated by learning. That is, the content of the processing executed by the auxiliary data side probability estimation unit 106 is updated by learning.
 The parameterized cumulative distribution function is, for example, a parameterized sigmoid function or softplus function. Since the parameter values of the parameterized auxiliary feature cumulative distribution function are updated by learning as described above, the parameter values of the parameterized cumulative distribution function are updated by learning.
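 As a simple illustration, a sigmoid-based parameterized cumulative distribution function could look like the sketch below; the parameter names loc and scale are assumptions standing in for whatever learnable parameters the embodiment actually uses.

```python
import numpy as np

def parametrized_cdf(q, loc, scale):
    """A parameterized cumulative distribution function built from a sigmoid:
    it satisfies cdf(-inf) = 0, cdf(inf) = 1 and is monotonically increasing
    whenever scale > 0.  loc and scale play the role of the parameters that
    would be updated by learning."""
    return 1.0 / (1.0 + np.exp(-(q - loc) / scale))
```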
 Here, the relationship between a cumulative distribution function and an occurrence probability is explained for the case of a one-dimensional random variable. FIG. 5 is an explanatory diagram illustrating this relationship in the embodiment. Image G1 shows an example of a probability density function of the random variable q. Δ denotes the quantization step size. Image G1 shows that the value obtained by integrating the probability density function over the range Δ centred on the value q is the occurrence probability p(q) of the value q. In the one-dimensional case of FIG. 5, the step size Δ is, for example, the size of the closed interval [-1/2, 1/2], that is, 1.
 Image G2 shows the cumulative distribution function cdf(q) obtained by integrating the probability density function of image G1. The integral of a probability density function is a monotonically increasing function such as a sigmoid function, regardless of the shape of the density. More precisely, a cumulative distribution function satisfies cdf(-∞) = 0, cdf(∞) = 1, and the derivative of cdf(q) with respect to q is greater than or equal to 0.
 Image G2 shows that the occurrence probability p(q) equals cdf(q + Δ/2) - cdf(q - Δ/2). A small numerical sketch of this relationship is given below, after which the description returns to FIG. 1.
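 The sketch below evaluates p(q) = cdf(q + Δ/2) - cdf(q - Δ/2) for a standard Gaussian; the Gaussian is chosen only as a convenient example of a cumulative distribution function, not because the embodiment is limited to it.

```python
import math

def gaussian_cdf(q, mu=0.0, sigma=1.0):
    """Cumulative distribution function of a Gaussian, used here only as a
    concrete example of a monotonically increasing cdf."""
    return 0.5 * (1.0 + math.erf((q - mu) / (sigma * math.sqrt(2.0))))

def occurrence_probability(q, delta=1.0, mu=0.0, sigma=1.0):
    """p(q) = cdf(q + delta/2) - cdf(q - delta/2): the probability mass of the
    quantization bin of width delta centred on q."""
    return gaussian_cdf(q + delta / 2, mu, sigma) - gaussian_cdf(q - delta / 2, mu, sigma)

# e.g. occurrence_probability(0.0) is about 0.383 for a standard Gaussian and delta = 1.
```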
 The auxiliary data side decoding unit 107 executes an auxiliary feature decoding process. This process takes information obtained from the auxiliary feature as its input and decodes it; for the auxiliary data side decoding unit 107, the input is the noise-added auxiliary feature. Hereinafter, the information obtained by decoding the noise-added auxiliary feature is referred to as auxiliary data. The auxiliary data side decoding unit 107 therefore acquires the auxiliary data by decoding the noise-added auxiliary feature.
 As described above, the more the content of the auxiliary feature acquisition process is updated by learning, the smaller the RD cost becomes. Accordingly, as the content of the auxiliary feature acquisition process is updated, the maximum amount of main data statistic information that the auxiliary data can contain also increases.
 The content of the auxiliary feature decoding process is updated by learning so that the RD cost becomes smaller. That is, the content of the processing executed by the auxiliary data side decoding unit 107 is updated by learning.
 The auxiliary entropy acquisition unit 108 acquires the auxiliary entropy based on the auxiliary data side probability estimated by the auxiliary data side probability estimation unit 106. The auxiliary entropy is the entropy of the auxiliary feature.
 The main data side probability estimation unit 109 executes a main data side probability estimation process, which estimates the main data side probability from the noise-added main data feature and the auxiliary data. The main data side probability is information indicating the occurrence probability of each element of the tensor representing the main data feature. Estimating these occurrence probabilities uses information on the probability distribution of each element of that tensor.
 When the probability distribution of each element of the tensor representing the main data feature is expressed using a parameterized main data feature cumulative distribution function, the main data side probability estimation unit 109 estimates the occurrence probability of each element based on that function.
 The parameterized main data feature cumulative distribution function is a parameterized cumulative distribution function indicating the probability distribution of each element of the tensor representing the main data feature. The probability distribution is, for example, a Gaussian distribution.
 The parameters of the parameterized main data feature cumulative distribution function are, specifically, statistics representing the probability distribution of each element of the tensor representing the main data feature. In learning, therefore, the values indicated by the auxiliary data are used as the parameter values of the parameterized main data feature cumulative distribution function.
 For example, when the probability distribution is a Gaussian distribution, the auxiliary data obtained in the trained learning network 100 indicates the representative value and the degree of dispersion of the Gaussian distribution.
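 Written out for this Gaussian case (a sketch, with the mean μ and standard deviation σ assumed to be the representative value and dispersion supplied by the auxiliary data), the parameterized cumulative distribution function takes the standard form

\[ \mathrm{cdf}_{\mu,\sigma}(q) \;=\; \frac{1}{2}\left(1 + \operatorname{erf}\!\left(\frac{q-\mu}{\sigma\sqrt{2}}\right)\right). \]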
<Estimation of Occurrence Probability>
 An example of the main data side probability estimation process is described here; more specifically, an example of the process when the LatticeVQ described above is used. When vector quantization is performed, each vector of the main data is represented by one of the representative vectors. The description of the main data side probability estimation process for LatticeVQ is therefore, more concretely, a description of the process of estimating the occurrence probability of a representative vector when LatticeVQ is used. Hereinafter, this process is referred to as the representative vector occurrence probability estimation process; it is an example of the main data side probability estimation process.
<Representative Vector Occurrence Probability Estimation Process>
 As described above, the main data side probability estimation process uses the parameterized main data feature cumulative distribution function, which is a parameterized cumulative distribution function, and a cumulative distribution function is the result of integrating a probability density function. The representative vector occurrence probability estimation process is therefore a process that uses a cumulative distribution function; using a cumulative distribution function here means using the result of integrating a parameterized probability density function over a lattice unit region.
 As described above, a Voronoi region is one type of lattice unit region. Because Voronoi tessellation is a well-known method of dividing space, it is often used to obtain lattice unit regions. When the lattice space is two-dimensional, however, the Voronoi region is hexagonal, as described above.
 When integrating a function over a manifold, performing the integration is not necessarily easy when the integration region is hexagonal. Here, "easy to perform" means that the amount of computation required to obtain the result with at least a predetermined accuracy is small.
 Moreover, when the manifold has more than two dimensions, integrating a function over a region obtained by Voronoi tessellation is even harder than in the two-dimensional case. In four dimensions, for example, it is often difficult even to depict the shape of the Voronoi region, and integrating a function over such a region is not easy either. Thus, integrating a function with a region obtained by Voronoi tessellation as the integration region is not necessarily easy.
 Consequently, obtaining the parameterized cumulative distribution function used in the representative vector occurrence probability estimation process may not be easy.
 In the representative vector occurrence probability estimation process, therefore, a parameterized cumulative distribution function obtained from the result of hyperrectangular partitioning is used instead of Voronoi tessellation. Hyperrectangular partitioning divides the lattice space into lattice unit regions whose shape is a hyperrectangle. A two-dimensional hyperrectangle is a rectangle, and a three-dimensional hyperrectangle is a rectangular parallelepiped.
 FIG. 6 is an explanatory diagram of hyperrectangular partitioning in the embodiment. More specifically, FIG. 6 shows an example of the result of hyperrectangular partitioning of a two-dimensional lattice space together with an example of the result of Voronoi tessellation. Both the "true region" and the "approximate region" in FIG. 6 are examples of lattice unit regions.
 The "true region" is the region obtained by Voronoi tessellation, that is, a Voronoi region; it is a hexagonal lattice unit region. The "approximate region" is the lattice unit region obtained by hyperrectangular partitioning.
 The shape of the "approximate region" is a hyperrectangle (a rectangle in two dimensions). Of the two partitionings shown in FIG. 6, an example of the hyperrectangular partitioning executed by the main data side probability estimation unit 109 is the partitioning into "approximate regions". In FIG. 6, S_1 is the side length of the "approximate region" along the first dimension of the two-dimensional lattice space, and S_2 is the side length along the second dimension.
 The side length along the first dimension means the length of the region in one of two orthogonal one-dimensional projection subspaces when the region in the lattice space is projected onto them; the side length along the second dimension is its length in the other projection subspace. In the following description, the length of the n-th side of a hyperrectangle in an N-dimensional lattice space means the length of the hyperrectangle in the n-th projection subspace when the hyperrectangle is projected onto N orthogonal one-dimensional projection subspaces. The projection of a hyperrectangle onto such a subspace is a straight line segment.
 An integral over a region of two or more dimensions can be obtained by iterated integration. Integrating a function over an N-dimensional hyperrectangle is performed as an iterated integral consisting of N one-dimensional integrations, and for a hyperrectangle each of these N integrations can use one side of the hyperrectangle as its integration range.
 When iterated integration is performed with the sides of the hyperrectangle as the integration ranges, each integration is unaffected by the results of the other integrations. For a hyperpolyhedron that is not a hyperrectangle, by contrast, each integration in the iterated integral depends on the results of the others. Integration is therefore easier when the lattice unit region is a hyperrectangle than when it is a hyperpolyhedron that is not a hyperrectangle.
 In the example of FIG. 6, the "true region" is hexagonal and the "approximate region" is a rectangle, so integration over the "approximate region" is easier than integration over the "true region".
 When the parameterized cumulative distribution function is obtained with the hyperrectangle resulting from hyperrectangular partitioning as the integration region, each step of the iterated integration integrates a one-dimensional probability density function that is unaffected by the other dimensions.
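 Written out for a density that factorizes across dimensions (the centre c_i of the region of lattice point i and the side lengths s_j are notation introduced here for illustration), the integral over the hyperrectangular region R_i separates into a product of one-dimensional integrals:

\[ \int_{R_i} p(q)\,dq \;=\; \prod_{j=1}^{N} \int_{c_i[j]-s_j/2}^{c_i[j]+s_j/2} p_j(q_j)\,dq_j \;=\; \prod_{j=1}^{N} \Bigl(\mathrm{cdf}_j\bigl(c_i[j]+\tfrac{s_j}{2}\bigr) - \mathrm{cdf}_j\bigl(c_i[j]-\tfrac{s_j}{2}\bigr)\Bigr), \]

 which is exactly the form taken by equations (1) and (2) below.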
 In the learning network 100, the probability density function subjected to iterated integration is a parameterized function representing a predetermined type of distribution, here a Gaussian distribution, and the values of its parameters depend on the auxiliary data. The parameterized cumulative distribution function is obtained by iterated integration of this probability density function over the hyperrectangle obtained by hyperrectangular partitioning. Since the parameter values of the probability density function are the values of the auxiliary data, the parameterized cumulative distribution function obtained in this way is a function that depends on the auxiliary data.
 In the representative vector occurrence probability estimation process, the values of the auxiliary data are substituted into the parameters of the cumulative distribution function obtained in advance in this way, and the resulting cumulative distribution function is used to estimate the occurrence probability of each representative vector.
 In the representative vector occurrence probability estimation process, the occurrence probability of a representative vector is estimated, for example, by the processing expressed by equations (1) and (2) below. The cumulative distribution function cdf in equation (2) denotes the cumulative distribution function obtained at each step of the iterated integration; integrating an m-dimensional cumulative distribution function (m being a natural number) over one dimension yields an (m-1)-dimensional cumulative distribution function.
 The right-hand side of equation (1) is an example of the occurrence probability obtained using the parameterized main data feature cumulative distribution function; equation (1) is parameterized through the cumulative distribution function cdf of equation (2). Equation (2) is the cumulative distribution function obtained by integrating the probability density function along one dimension of the lattice unit region, and equation (1) is the product of the per-dimension occurrence probabilities given by equation (2). Equation (1) therefore expresses the occurrence probability using the result of integrating the probability density function over the entire lattice unit region.
\[ p_i = \prod_{j} p_i[j] \tag{1} \]
\[ p_i[j] = \mathrm{cdf}\!\left(\hat{y}_i[j] + \tfrac{\Delta_j}{2}\right) - \mathrm{cdf}\!\left(\hat{y}_i[j] - \tfrac{\Delta_j}{2}\right) \tag{2} \]
 Here, i is an identifier of each lattice point and j denotes a dimension of the lattice space. The left-hand side of equation (1) is the occurrence probability of a representative vector: since each lattice point corresponds to a representative vector, the occurrence probability p_i at a lattice point means the occurrence probability p_i of the corresponding representative vector. Δ_j denotes the length of the lattice unit region along the dimension identified by j; since the shape and size of the lattice unit region are predetermined, Δ_j is a predetermined length.
 The symbol y with a hat, shown in equation (3) below, denotes the noise-added main data feature. Hereinafter a symbol A with a hat is written A^; thus the hatted symbol y of equation (3) is written y^.
\[ \hat{y} \tag{3} \]
 Equation (1) shows that the occurrence probability p_i is expressed as the product of the probabilities p_i[j] obtained for each dimension. If the lattice unit region were not a hyperrectangle, p_i could not be expressed as such a simple product. Because the lattice unit region is a hyperrectangle, the main data side probability estimation unit 109 can easily obtain the occurrence probability p_i, as shown by equations (1) and (2).
 Equations (1) and (2) are derived under the assumption that the covariance between dimensions is zero. When the lattice unit region is a hyperrectangle, however, the covariance between dimensions can be made zero by rotating the coordinate axes of the lattice space so that they are parallel to the sides of the hyperrectangle. Equations (1) and (2) therefore hold, via such a rotation of the coordinate axes, even when the covariance between dimensions is not zero. As is well known in linear algebra, a rotation of the coordinate axes is a unitary transformation and does not change the content of equations (1) and (2). Making the covariance zero corresponds to diagonalizing the matrix of covariances between dimensions.
 In this way, the main data side probability estimation process estimates the occurrence probability of each element of the tensor representing the main data feature by using the result of integrating the parameterized probability density function over a lattice unit region whose shape is a hyperrectangle.
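 A minimal sketch of equations (1) and (2) for the Gaussian case follows. The per-dimension mean mu[j] and standard deviation sigma[j] are assumed to be the parameters supplied by the auxiliary data, and sides[j] is the side length of the hyperrectangular lattice unit region in dimension j; the function names are illustrative only.

```python
import math

def gaussian_cdf(q, mu, sigma):
    """One-dimensional Gaussian cumulative distribution function."""
    return 0.5 * (1.0 + math.erf((q - mu) / (sigma * math.sqrt(2.0))))

def lattice_point_probability(y_hat, mu, sigma, sides):
    """Estimate the occurrence probability of the representative vector for a
    (noisy) feature vector y_hat as the product over dimensions of
    cdf(y_hat[j] + s_j/2) - cdf(y_hat[j] - s_j/2), following equations (1), (2)."""
    p = 1.0
    for j in range(len(y_hat)):
        upper = gaussian_cdf(y_hat[j] + sides[j] / 2, mu[j], sigma[j])
        lower = gaussian_cdf(y_hat[j] - sides[j] / 2, mu[j], sigma[j])
        p *= upper - lower
    return p
```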
<Processing for Determining the Range of the Hyperrectangle>
 The use of a hyperrectangle as the lattice unit region has been described so far, but not how the shape and size of that hyperrectangle are determined. An example of the process of determining the shape and size of a lattice unit region that at least satisfies the condition of being a hyperrectangle (hereinafter referred to as the "hyperrectangle determination process") is described here.
 In the hyperrectangle determination process, the lattice points adjacent to the origin lattice point are first calculated. The origin lattice point is the lattice point located at the origin of the lattice space, and the adjacent lattice points are the lattice points next to the origin. A hyperrectangle satisfying the hyperrectangle determination conditions is then calculated. These conditions include the condition that the lattice unit region of the origin lattice point does not overlap the lattice unit region of any adjacent lattice point, and the condition that the volume of the hyperrectangle equals the volume of the Voronoi region.
 As a concrete example of the hyperrectangle determination process, the conditions used to determine the shape and size of the hyperrectangle in the eight-dimensional E8 lattice space are described. There are two types of lattice points adjacent to the origin lattice point: [±1^2, 0^6] and [±(1/2)^8], where the superscript indicates the number of dimensions; for example, [1^2, 0^6] means [1, 1, 0, 0, 0, 0, 0, 0]. The square brackets [ ] denote a vector.
<Regarding the Notation for Hyperrectangles>
 The superscripts in the hyperrectangle notation [s_1^a, s_2^b, ..., s_3^c] indicate that each side from the 1st to the a-th dimension has length s_1, that each side from the (a+1)-th to the (a+b)-th dimension has length s_2, and so on. In this notation, a superscript attached to a symbol s indicates that, for a run of consecutive dimensions whose number equals the superscript, the hyperrectangle has the same side length s; that is, s^a indicates that the side length of the hyperrectangle is s for a run of a consecutive dimensions. Determining whether dimensions are consecutive requires information on the order of the dimensions, and that order is predetermined.
 The "±" in [±1^2, 0^6] indicates that both +1 and -1 occur. [±1^2, 0^6] therefore stands for the four vectors [1, 1, 0^6], [-1, 1, 0^6], [1, -1, 0^6] and [-1, -1, 0^6].
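 As a small illustration of this run-length notation (the helper name expand_sides is hypothetical), the side-length list can be expanded as follows.

```python
def expand_sides(spec):
    """Expand run-length side-length notation, e.g.
    [(0.5, 1), (1.0, 6), (2.0, 1)] -> [0.5, 1, 1, 1, 1, 1, 1, 2],
    which corresponds to the notation [1/2, 1^6, 2]."""
    sides = []
    for value, count in spec:
        sides.extend([value] * count)
    return sides

# expand_sides([(0.5, 1), (1.0, 6), (2.0, 1)])
# -> [0.5, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 2.0]
```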
 Returning to the determination of the shape and size of the hyperrectangle in the eight-dimensional lattice space: the conditions under which the hyperrectangle of each adjacent lattice point does not overlap the hyperrectangle of the origin lattice point are the following two. One is that at least seven of the sides (elements) of the hyperrectangle are 1 or less; the other is that at least one of the sides (elements) is 1/2 or less.
 In addition, since the volume of the Voronoi region of the E8 lattice is 1, the hyperrectangle satisfies the condition of equation (4) below.
\[ \prod_{j=1}^{8} s_j = 1 \tag{4} \]
 A hyperrectangle satisfying these conditions in the eight-dimensional E8 lattice space is, for example, s = [1/2, 1^6, 2].
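 The sketch below simply checks the three conditions quoted above for a candidate side-length vector; it is illustrative only and not part of the described apparatus.

```python
import math

def satisfies_e8_conditions(sides, tol=1e-9):
    """Check, for the E8 lattice, that (a) at least seven side lengths are
    <= 1, (b) at least one side length is <= 1/2, and (c) the volume of the
    hyperrectangle equals the Voronoi volume (= 1)."""
    cond_a = sum(1 for s in sides if s <= 1.0) >= 7
    cond_b = any(s <= 0.5 for s in sides)
    cond_c = math.isclose(math.prod(sides), 1.0, rel_tol=tol, abs_tol=tol)
    return cond_a and cond_b and cond_c

# s = [1/2, 1^6, 2]
print(satisfies_e8_conditions([0.5, 1, 1, 1, 1, 1, 1, 2]))  # True
```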
 Examples for other dimensions are given next. In the case of the two-dimensional A2 lattice space, the range of the lattice unit region is the range enclosed by the hyperrectangle expressed by equation (5) below.
[Equation (5): side lengths of the hyperrectangle for the two-dimensional A2 lattice space (equation image in the original document)]
 In the case of the eight-dimensional E8 lattice space, the range of the lattice unit region is the range enclosed by the hyperrectangle expressed by equation (6) below.
[Equation (6): side lengths of the hyperrectangle for the eight-dimensional E8 lattice space (equation image in the original document)]
 In the case of the 24-dimensional Leech lattice space, the range of the lattice unit region is the range enclosed by the hyperrectangle expressed by equation (7) below.
[Equation (7): side lengths of the hyperrectangle for the 24-dimensional Leech lattice space (equation image in the original document)]
 In this way, the range of the hyperrectangle (the shape and size of the lattice unit region) is determined. The description now returns to FIG. 1.
 The main entropy acquisition unit 110 acquires the entropy of the main data feature based on the estimation result of the main data side probability estimation unit 109. Hereinafter, the entropy of the main data feature is referred to as the main entropy.
 The main data side decoding unit 111 executes a main data feature decoding process. This process takes information obtained from the main data feature as its input and decodes it; for the main data side decoding unit 111, the input is the noise-added main data feature. The content of the decoding performed by the main data side decoding unit 111 is updated by learning.
 The reconstruction error calculation unit 112 calculates the difference between the decoding result of the main data side decoding unit 111 and the main data acquired by the main data acquisition unit 101. Hereinafter, this difference is referred to as the reconstruction error. It may be expressed, for example, as a mean squared error or as a binary cross entropy.
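 A minimal sketch of the two error measures mentioned above, using PyTorch as an assumed framework; the function name is illustrative only.

```python
import torch
import torch.nn.functional as F

def reconstruction_error(x, x_hat, mode="mse"):
    """Difference between the decoded result and the original main data,
    expressed either as a mean squared error or as a binary cross entropy."""
    if mode == "mse":
        return torch.mean((x - x_hat) ** 2)
    # binary cross entropy assumes x and x_hat take values in [0, 1]
    return F.binary_cross_entropy(x_hat, x)
```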
 The optimization unit 113 updates the learning network 100 based on the auxiliary entropy, the main entropy and the reconstruction error, all of which are examples of outputs of the learning network 100. Specifically, the optimization unit 113 updates the learning network 100 so as to reduce the optimization error, the main entropy and the auxiliary entropy. The objective function used by the optimization unit 113 is, for example, L = D + λ(R_y + R_z).
 The symbol L denotes the objective function, D the reconstruction error, and λ a predetermined constant. R_y denotes the main entropy and R_z the auxiliary entropy.
 Since a small entropy means a short code length, the optimization unit 113 updates the learning network 100 so as to reduce the entropies. Since a smaller optimization error means more accurate self-encoding, the optimization unit 113 also updates the learning network 100 so as to reduce the optimization error.
 The learning network 100 is updated, for example, by solving the minimization problem of the objective function L with a gradient method; that is, the learning network 100 is updated by updating the values of its parameters by, for example, error backpropagation.
 Updating the learning network 100 specifically means updating the content of the processing of the main data side encoding unit 102, the auxiliary data side encoding unit 103, the auxiliary data side probability estimation unit 106, the auxiliary data side decoding unit 107 and the main data side decoding unit 111.
<Relationship between Self-Encoding and the Learning Network 100>
 The learning network 100 includes the main data side noise addition unit 104. The main data side noise addition unit 104 itself does not perform quantization. It is precisely because the learning network 100 includes the processing of the main data side noise addition unit 104, however, that learning improves the performance of self-encoding using vector quantization. The reason is as follows.
 For example, as described in reference 1 below, if vector quantization is performed during learning, the gradient becomes zero, and it is known that learning that improves the efficiency of self-encoding using vector quantization cannot then be carried out. Therefore, in order to make the self-encoding process more efficient through learning, a process that adds noise to the vectors is performed during learning instead of vector quantization itself.
 By adding noise instead of performing vector quantization itself, the performance of the other processes that generate the information used for vector quantization, rather than the vector quantization itself, is improved. As a result, even though vector quantization is performed in place of noise addition at the time of self-encoding, self-encoding is more efficient than before learning.
 Reference 1: Balle, "Variational image compression with a scale hyperprior," 2018
 In the learning unit 10, the main data side noise addition unit 104 adds noise to the noise addition target vectors, so the gradient does not become zero when learning is performed including the main data side noise addition unit 104. As a result, it is possible to update the content of the main data side encoding unit 102, the auxiliary data side encoding unit 103, the auxiliary data side probability estimation unit 106, the auxiliary data side decoding unit 107 and the main data side decoding unit 111, which are also used in self-encoding with vector quantization.
 What has been referred to so far as the network learning result therefore means, specifically, the peripheral processing updated by training the neural network that applies noise instead of vector quantization (that is, the learning network 100). The peripheral processing is the processing that generates the information used in vector quantization; concretely, it is the processing executed by each of the main data side encoding unit 102, the auxiliary data side encoding unit 103, the auxiliary data side probability estimation unit 106, the auxiliary data side decoding unit 107 and the main data side decoding unit 111.
 FIG. 7 shows an example of the flow of the processing executed by the learning unit 10 in the embodiment. The main data acquisition unit 101 acquires the main data x = [x_1, x_2, ..., x_N] (step S101). The main data x is an N-dimensional vector with N elements x_1 to x_N, each of which is a tensor; each element may therefore be a scalar or a vector.
 Next, the main data side encoding unit 102 executes the main data feature acquisition process (step S102); that is, it encodes the main data x. Encoding the main data yields the main data feature y = f_enc(x), where the function f_enc(x) expresses the encoding of the main data x (hereinafter the "main data encoding function"). The main data feature y is a tensor consisting of k K-dimensional vectors.
 Next, the auxiliary data side encoding unit 103 executes the auxiliary feature acquisition process (step S103), which yields the auxiliary feature z = g_enc(y), where the function g_enc(y) expresses the encoding of the main data feature y (hereinafter the "main data feature encoding function"). The auxiliary feature z is a tensor such as a vector.
 Next, the main data side noise addition unit 104 executes the noise-added main data feature acquisition process (step S104), which adds noise to the main data feature. This yields the noise-added main data feature y^ = [y_1^, y_2^, ..., y_k^], where y_i^ = y_i + u_y, y_i is the i-th vector element of the main data feature y, and u_y denotes the noise.
 Next, the auxiliary data side noise addition unit 105 executes the noise-added auxiliary feature acquisition process (step S105), which adds noise to the auxiliary feature. This yields the noise-added auxiliary feature z^ = [z_1^, z_2^, ..., z_w^], where w is an integer of 1 or more, z_i^ = z_i + u_z, z_i is the i-th element of the auxiliary feature z, and u_z denotes the noise.
 Next, the auxiliary data side probability estimation unit 106 executes the auxiliary data side probability estimation process (step S106) and thereby estimates the auxiliary data side probability, which is expressed specifically by equation (8) below.
\[ p(\hat{z}_i) = h\!\left(\hat{z}_i + \tfrac{1}{2}\right) - h\!\left(\hat{z}_i - \tfrac{1}{2}\right) \tag{8} \]
 The symbol on the left-hand side of equation (8) denotes the auxiliary data side probability, and h is the parameterized auxiliary feature cumulative distribution function.
 Next, the auxiliary data side decoding unit 107 executes the auxiliary feature decoding process on the noise-added auxiliary feature (step S107), thereby decoding it. This yields the auxiliary data θ = g_dec(z^), where the function g_dec(z^) expresses the decoding of the noise-added auxiliary feature z^ (hereinafter the "auxiliary feature decoding function"). The auxiliary data θ is a tensor such as a vector.
 Next, the auxiliary entropy acquisition unit 108 acquires the auxiliary entropy based on the auxiliary data side probability (step S108), specifically by executing the processing expressed by equation (9) below.
\[ R_z = -\sum_{i} \log_2 p(\hat{z}_i) \tag{9} \]
 The symbol on the left-hand side of equation (9) denotes the auxiliary entropy.
 Next, the main data side probability estimation unit 109 executes the main data side probability estimation process (step S109) and thereby estimates the main data side probability from the noise-added main data feature and the auxiliary data; the main data side probability is expressed specifically by equation (1) above.
 Next, the main entropy acquisition unit 110 acquires the main entropy based on the main data side probability (step S110), specifically by executing the processing expressed by equation (10) below.
\[ R_y = -\sum_{i} \log_2 p_i \tag{10} \]
 The symbol on the left-hand side of equation (10) denotes the main entropy.
 Next, the main data side decoding unit 111 executes the main data feature decoding process on the noise-added main data feature (step S111), thereby decoding it. Hereinafter, the information obtained by decoding the noise-added main data feature is referred to as the decoded main data. The main data side decoding unit 111 thus obtains the decoded main data x^ = f_dec(y^), where the function f_dec(y^) expresses the decoding of the noise-added main data feature y^ (hereinafter the "main data feature decoding function"). The decoded main data x^ is a tensor such as a vector.
 Next, the reconstruction error calculation unit 112 acquires the difference between the decoded main data and the main data acquired by the main data acquisition unit 101 (step S112); this difference is the reconstruction error.
 Next, the optimization unit 113 updates the learning network 100 based on the auxiliary entropy, the main entropy and the reconstruction error (step S113). The optimization unit 113 then determines whether a predetermined termination condition for learning (hereinafter the "learning termination condition") is satisfied (step S114). The learning termination condition is, for example, that the learning network 100 has been updated a predetermined number of times.
 If the learning termination condition is satisfied (step S114: YES), the processing ends; if it is not satisfied (step S114: NO), the processing returns to step S101. The peripheral processing at the time the learning termination condition is satisfied is used for vector quantization as the trained peripheral processing.
 The steps S101 to S114 may be executed in any order that does not violate causality.
 As described above, the auxiliary data side probability estimation unit 106 may obtain a probability distribution given in advance, for example by reading it from a predetermined storage device. In that case, the content of the auxiliary data side probability estimation process is not updated by learning, so the trained auxiliary data side probability estimation process is the same as the process before learning.
 Updating the content of the auxiliary data side probability estimation process means, more specifically, updating the parameterized auxiliary feature cumulative distribution function h. Therefore, when the auxiliary data side probability estimation unit 106 obtains a probability distribution given in advance by reading it from a predetermined storage device or the like, the trained parameterized auxiliary feature cumulative distribution function h is the same as the one before learning.
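 A minimal sketch of one training iteration (steps S102 to S113) is given below, using PyTorch as an assumed framework. The callables f_enc, g_enc, g_dec, f_dec, aux_prob, main_prob, add_lattice_noise and the optimizer are assumed to be supplied; all names are illustrative and not the actual API of the described apparatus.

```python
import torch

def training_step(x, f_enc, g_enc, g_dec, f_dec, aux_prob, main_prob,
                  add_lattice_noise, optimizer, lam):
    y = f_enc(x)                               # S102: main data feature y = f_enc(x)
    z = g_enc(y)                               # S103: auxiliary feature z = g_enc(y)
    y_hat = add_lattice_noise(y)               # S104: add vector noise to y
    z_hat = z + (torch.rand_like(z) - 0.5)     # S105: add uniform noise in [-1/2, 1/2]
    p_z = aux_prob(z_hat)                      # S106: auxiliary data side probabilities
    theta = g_dec(z_hat)                       # S107: auxiliary data theta = g_dec(z_hat)
    R_z = -torch.sum(torch.log2(p_z))          # S108: auxiliary entropy
    p_y = main_prob(y_hat, theta)              # S109: main data side probabilities
    R_y = -torch.sum(torch.log2(p_y))          # S110: main entropy
    x_hat = f_dec(y_hat)                       # S111: decoded main data
    D = torch.mean((x - x_hat) ** 2)           # S112: reconstruction error (MSE here)
    loss = D + lam * (R_y + R_z)               # S113: objective L = D + lambda(R_y + R_z)
    optimizer.zero_grad()
    loss.backward()                            # error backpropagation
    optimizer.step()                           # gradient-based parameter update
    return loss.item()
```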
 In this way, the learning unit 10 updates, by learning, the encoding and decoding processes of self-encoding using vector quantization. Self-encoding using vector quantization uses the main data feature, which is the feature of the self-encoding target, and the auxiliary feature, which is the feature of the main data feature. It further performs entropy coding of the result of vector quantization of the main data feature and entropy coding of the result of scalar quantization of the auxiliary feature.
 The encoding processes in such self-encoding using vector quantization are, specifically, the main data feature acquisition process executed by the main data side encoding unit 102 and the auxiliary feature acquisition process executed by the auxiliary data side encoding unit 103. The decoding processes are, specifically, the auxiliary feature decoding process executed by the auxiliary data side decoding unit 107 and the main data feature decoding process executed by the main data side decoding unit 111.
 Self-encoding using vector quantization with the trained peripheral processing is described with reference to FIGS. 8 and 9. "Trained" here means at the time the learning termination condition is satisfied. More specifically, the self-encoding device 2, which executes encoding and decoding, is described as an example of a device that performs self-encoding using vector quantization with the trained peripheral processing. The self-encoding device 2 is a type of autoencoder.
 FIG. 8 is a first explanatory diagram outlining the self-encoding device 2 in the embodiment, and FIG. 9 is a second explanatory diagram. More specifically, FIG. 8 illustrates the encoding processing executed by the self-encoding device 2, and FIG. 9 illustrates the decoding processing executed by the self-encoding device 2.
 Being a type of autoencoder, the self-encoding device 2 comprises an encoder and a decoder; specifically, it comprises an encoder 200 and a decoder 212. The encoder 200 comprises a self-encoding target acquisition unit 201, a trained main data side encoding unit 202, a trained auxiliary data side encoding unit 203, a vector quantization unit 204, a scalar quantization unit 205, a trained auxiliary data side probability estimation unit 206, a trained auxiliary data side decoding unit 207, an auxiliary entropy encoding unit 208, a main data side probability estimation unit 209, a main entropy encoding unit 210 and a data multiplexing unit 211.
 The decoder 212 comprises an encoded data acquisition unit 213, a data separation unit 214, an auxiliary entropy decoding unit 215, a trained auxiliary data side decoding unit 216, a main entropy decoding unit 217 and a trained main data side decoding unit 218.
 The self-encoding target acquisition unit 201 acquires the data to be self-encoded as the main data. Hereinafter, the target of self-encoding is referred to as the self-encoding target.
 The trained main data side encoding unit 202 executes the trained main data feature acquisition process on the self-encoding target and thereby acquires the main data feature of the self-encoding target.
 The trained auxiliary data side encoding unit 203 executes the trained auxiliary feature acquisition process on the main data feature of the self-encoding target and thereby acquires the auxiliary feature of the self-encoding target.
 The vector quantization unit 204 executes vector quantization on the main data feature of the self-encoding target and thereby obtains the vector-quantized main data feature of the self-encoding target (hereinafter the "vector quantized feature").
 The scalar quantization unit 205 executes scalar quantization on the auxiliary feature of the self-encoding target and thereby obtains the scalar-quantized auxiliary feature of the self-encoding target (hereinafter the "scalar quantized feature").
 The trained auxiliary data side probability estimation unit 206 executes the trained auxiliary data side probability estimation process and thereby estimates the auxiliary data side probability of the self-encoding target based on the scalar quantized feature.
 学習済み補助データ側復号部207は、スカラー量子化特徴量に対して学習済みの補助特徴量復号処理を実行する。すなわち、学習済み補助データ側復号部207は、スカラー量子化特徴量を復号する。以下、スカラー量子化特徴量が復号された情報を量子化補助データという。したがって、学習済み補助データ側復号部207は、スカラー量子化特徴量を復号することで量子化補助データを取得する処理である。 The learned auxiliary data side decoding unit 207 executes learned auxiliary feature quantity decoding processing on the scalar quantized feature quantity. That is, the learned auxiliary data side decoding unit 207 decodes the scalar quantized feature quantity. Hereinafter, the information obtained by decoding the scalar quantized feature quantity will be referred to as quantized auxiliary data. Therefore, the learned auxiliary data side decoding unit 207 is a process of obtaining quantized auxiliary data by decoding the scalar quantized feature amount.
 補助エントロピー符号化部208は、スカラー量子化特徴量と自己符号化対象の補助データ側確率とに基づき、スカラー量子化特徴量のエントロピー符号化を行う。エントロピー符号化は例えば算術符号化である。 The auxiliary entropy encoding unit 208 entropy-encodes the scalar quantized feature amount based on the scalar quantized feature amount and the probability of the auxiliary data to be self-encoded. Entropy coding is, for example, arithmetic coding.
 主データ側確率推定部209は、ベクトル量子化特徴量と量子化補助データとに基づき自己符号化対象の主データ側確率を推定する。 The main-data-side probability estimation unit 209 estimates the main-data-side probability of the self-encoding target based on the vector quantized feature amount and the quantized auxiliary data.
 主エントロピー符号化部210は、ベクトル量子化特徴量と自己符号化対象の主データ側確率とに基づき、ベクトル量子化特徴量のエントロピー符号化を行う。エントロピー符号化は例えば算術符号化である。 The primary entropy encoding unit 210 performs entropy encoding of the vector quantized feature quantity based on the vector quantized feature quantity and the probability of the main data to be self-encoded. Entropy coding is, for example, arithmetic coding.
 データ多重化部211は、エントロピー符号化されたベクトル量子化特徴量と、エントロピー符号化されたスカラー量子化特徴量とをデコーダ212に出力する。このようにして、エンコーダ200は、自己符号化対象をエンコードする。 The data multiplexing unit 211 outputs the entropy-encoded vector quantized feature amount and the entropy-encoded scalar quantized feature amount to the decoder 212 . In this manner, encoder 200 encodes the self-encoding object.
 The encoded data acquisition unit 213 acquires the entropy-encoded vector quantized feature amount and the entropy-encoded scalar quantized feature amount.
 The data separation unit 214 receives the entropy-encoded vector quantized feature amount and the entropy-encoded scalar quantized feature amount acquired by the encoded data acquisition unit 213. The data separation unit 214 outputs the entropy-encoded scalar quantized feature amount to the auxiliary entropy decoding unit 215 and outputs the entropy-encoded vector quantized feature amount to the main entropy decoding unit 217.
 The auxiliary entropy decoding unit 215 performs entropy decoding on the entropy-encoded scalar quantized feature amount using the trained parameterized auxiliary feature amount cumulative distribution function.
 The trained auxiliary data side decoding unit 216 executes the trained auxiliary feature amount decoding process on the result of the entropy decoding by the auxiliary entropy decoding unit 215.
 The main entropy decoding unit 217 performs entropy decoding of the entropy-encoded vector quantized feature amount based on the entropy-encoded vector quantized feature amount and the result of the trained auxiliary feature amount decoding process by the trained auxiliary data side decoding unit 216. More specifically, the main entropy decoding unit 217 performs the entropy decoding of the entropy-encoded vector quantized feature amount using a decoding cumulative distribution function. The decoding cumulative distribution function is the parameterized main data feature amount cumulative distribution function whose parameter values are the values indicated by the result of the trained auxiliary feature amount decoding process by the trained auxiliary data side decoding unit 216.
 The trained main data side decoding unit 218 executes the trained main data feature amount decoding process on the result of the decoding by the main entropy decoding unit 217.
 In this way, the decoder 212 decodes the self-encoding target encoded by the encoder 200. Also in this way, the self-encoding device 2 performs self-encoding of the self-encoding target.
 FIG. 10 is a flowchart showing an example of the flow of processing executed by the encoder 200 in the embodiment. The self-encoding target acquisition unit 201 acquires the self-encoding target X = [X_1, X_2, ..., X_N] (step S201). The self-encoding target X is an N-dimensional vector having N elements from X_1 to X_N. Each element from X_1 to X_N is a tensor; therefore, each element may be a scalar or a vector.
 Next, the trained main data side encoding unit 202 executes the trained main data feature amount acquisition process (step S202). That is, the trained main data side encoding unit 202 encodes the self-encoding target X. Encoding the self-encoding target yields the main data feature amount Y = F_enc(X) of the self-encoding target. The function F_enc(X) is the trained main data encoding function. Note that the main data feature amount Y is a tensor such as a vector.
 Next, the trained auxiliary data side encoding unit 203 executes the trained auxiliary feature amount acquisition process (step S203). By executing this process, the auxiliary feature amount Z = G_enc(Y) of the self-encoding target is obtained. The function G_enc(Y) is the trained main data feature amount encoding function. Note that the auxiliary feature amount Z is a tensor such as a vector.
 Next, the vector quantization unit 204 performs vector quantization on the main data feature amount Y of the self-encoding target (step S204). By performing vector quantization, the vector quantized feature amount Ŷ = [Ŷ_1, Ŷ_2, ..., Ŷ_k] = [Q(Y_1), Q(Y_2), ..., Q(Y_k)] is obtained. Y_i represents the i-th element of the main data feature amount Y of the self-encoding target. Ŷ_i represents the i-th element of the vector quantized feature amount Ŷ. Q is the function expressed by the following equation (11).
 [Equation (11): definition of the vector quantization function Q over the set Λ of lattice points]
 The symbol Λ denotes the set of all lattice points.
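 Although equation (11) itself is not reproduced here, a lattice vector quantizer of this kind maps each element to a representative vector in Λ. The following is a minimal illustrative sketch, not part of the embodiment, that assumes the integer lattice Z^K (or a lattice given by a small basis matrix) and uses helper names chosen for illustration:

    import numpy as np

    def quantize_to_lattice(y, basis=None):
        # Map y to a nearby lattice point. If no basis is given, the integer
        # lattice Z^K is assumed, for which elementwise rounding suffices.
        y = np.asarray(y, dtype=float)
        if basis is None:
            return np.round(y)
        # General case: search candidate lattice points whose coefficients lie
        # around the rounded coordinates in the given basis (small K only).
        coeffs = np.round(np.linalg.solve(basis.T, y))
        offsets = np.array(np.meshgrid(*[[-1, 0, 1]] * len(y))).T.reshape(-1, len(y))
        candidates = (coeffs + offsets) @ basis
        return candidates[np.argmin(np.linalg.norm(candidates - y, axis=1))]

    # Example: quantize a 2-dimensional feature element
    print(quantize_to_lattice([0.4, -1.7]))   # -> [ 0. -2.]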
 Next, the scalar quantization unit 205 performs scalar quantization on the auxiliary feature amount Z of the self-encoding target (step S205). By performing scalar quantization, the scalar quantization unit 205 acquires the scalar quantized feature amount Ẑ = round(Z), where round denotes the rounding operation.
 Next, the trained auxiliary data side probability estimation unit 206 executes the trained auxiliary data side probability estimation process (step S206). By executing this process, the trained auxiliary data side probability estimation unit 206 estimates the auxiliary data side probability of the self-encoding target based on the scalar quantized feature amount Ẑ. Specifically, the auxiliary data side probability of the self-encoding target is expressed by the following equation (12).
 [Equation (12): auxiliary data side probability of the self-encoding target]
 The symbol on the left-hand side of equation (12) represents the auxiliary data side probability of the self-encoding target. The symbol H is the trained parameterized auxiliary feature amount cumulative distribution function.
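 Although equation (12) itself is not reproduced here, it expresses this probability through the trained parameterized cumulative distribution function H. The sketch below is illustrative only: a logistic CDF stands in for H, and the quantization bin is taken to have unit width to match the round() quantizer:

    import numpy as np

    def logistic_cdf(x, loc=0.0, scale=1.0):
        # Illustrative stand-in for the trained parameterized CDF H.
        return 1.0 / (1.0 + np.exp(-(np.asarray(x, dtype=float) - loc) / scale))

    def bin_probability(z_hat, cdf=logistic_cdf, **params):
        # With round() as the scalar quantizer, the bin around z_hat is
        # [z_hat - 0.5, z_hat + 0.5], so its mass is a CDF difference.
        return cdf(z_hat + 0.5, **params) - cdf(z_hat - 0.5, **params)

    z_hat = np.round(np.array([0.2, -1.6, 3.4]))      # scalar quantized feature
    p = bin_probability(z_hat, loc=0.0, scale=1.5)     # per-element probabilities
    bits = -np.log2(p)                                 # code length estimate per element
    print(p, bits)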
 The trained auxiliary data side decoding unit 207 executes the trained auxiliary feature amount decoding process on the scalar quantized feature amount Ẑ (step S207). That is, the trained auxiliary data side decoding unit 207 obtains the quantized auxiliary data Θ = G_dec(Ẑ) by decoding the scalar quantized feature amount. The function G_dec(Ẑ) is the trained auxiliary feature amount decoding function. Note that the quantized auxiliary data Θ is a tensor such as a vector.
 Next, the auxiliary entropy encoding unit 208 performs entropy encoding of the scalar quantized feature amount Ẑ based on the scalar quantized feature amount Ẑ and the auxiliary data side probability of the self-encoding target (step S208).
 Next, the main data side probability estimation unit 209 estimates the main data side probability of the self-encoding target based on the vector quantized feature amount Ŷ and the quantized auxiliary data Θ (step S209).
 Next, the main entropy encoding unit 210 performs entropy encoding of the vector quantized feature amount Ŷ based on the vector quantized feature amount Ŷ and the main data side probability of the self-encoding target (step S210).
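 For intuition only (the probabilities and helper below are illustrative assumptions, not values from the embodiment), the estimated probabilities determine the code length that an entropy coder such as an arithmetic coder can approach:

    import numpy as np

    def ideal_code_length_bits(probabilities):
        # Lower bound (in bits) on the entropy-coded size of a symbol sequence
        # whose per-symbol probabilities come from the probability estimators.
        p = np.asarray(probabilities, dtype=float)
        return float(np.sum(-np.log2(p)))

    # Assumed example: probabilities assigned to four vector quantized elements
    p_main = [0.12, 0.40, 0.05, 0.22]
    print(ideal_code_length_bits(p_main))   # about 10.9 bits for these four elements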
 Next, the data multiplexing unit 211 outputs the entropy-encoded vector quantized feature amount and the entropy-encoded scalar quantized feature amount to the decoder 212 (step S211).
 The series of processes from step S201 to step S211 is an example of the encoding process by the encoder 200. Note that the processes of steps S201 to S211 may be executed in any order as long as causality is not violated.
 FIG. 11 is a flowchart showing an example of the flow of processing executed by the decoder 212 in the embodiment. The encoded data acquisition unit 213 acquires the result of encoding by the encoder 200 (step S301). Specifically, the result of encoding by the encoder 200 is the entropy-encoded vector quantized feature amount and the entropy-encoded scalar quantized feature amount output in step S211.
 Next, the data separation unit 214 separates the entropy-encoded scalar quantized feature amount and the entropy-encoded vector quantized feature amount acquired in step S301 (step S302). Specifically, separating means outputting the entropy-encoded scalar quantized feature amount acquired in step S301 to the auxiliary entropy decoding unit 215 and outputting the entropy-encoded vector quantized feature amount acquired in step S301 to the main entropy decoding unit 217.
 Next, the auxiliary entropy decoding unit 215 performs entropy decoding on the entropy-encoded scalar quantized feature amount using the trained parameterized auxiliary feature amount cumulative distribution function (step S303).
 Next, the trained auxiliary data side decoding unit 216 executes the trained auxiliary feature amount decoding process on the result of the entropy decoding by the auxiliary entropy decoding unit 215 (step S304).
 Next, the main entropy decoding unit 217 performs entropy decoding of the entropy-encoded vector quantized feature amount using the decoding cumulative distribution function (step S305).
 Next, the trained main data side decoding unit 218 executes the trained main data feature amount decoding process on the result of the decoding by the main entropy decoding unit 217 (step S306).
 The series of processes from step S301 to step S306 is an example of the decoding process by the decoder 212. Note that the processes of steps S301 to S306 are executed after the encoding process by the encoder 200, such as step S211, and may otherwise be executed in any order as long as causality is not violated.
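 One possible reading, stated here purely as an assumption for illustration, is that the decoded quantized auxiliary data Θ supplies per-element location and scale parameters of the parameterized main data feature amount cumulative distribution function; the sketch below shows how such a decoding cumulative distribution function could be built and evaluated under that assumption:

    import numpy as np

    def logistic_cdf(x, loc, scale):
        # Illustrative parameterized CDF for the main data feature amount.
        return 1.0 / (1.0 + np.exp(-(np.asarray(x, dtype=float) - loc) / scale))

    def build_decoding_cdf(theta):
        # Assumed layout of Θ: first half = locations, second half = scales.
        theta = np.asarray(theta, dtype=float)
        k = theta.size // 2
        loc, scale = theta[:k], np.abs(theta[k:]) + 1e-6
        return lambda x: logistic_cdf(x, loc, scale)

    theta = np.array([0.0, 1.5, -0.3, 0.8, 1.2, 0.5])   # decoded auxiliary data (assumed)
    cdf = build_decoding_cdf(theta)
    print(cdf(np.zeros(3)))   # CDF of each of the three main-feature elements at 0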
<Description of hardware>
 FIG. 12 is a diagram showing an example of the hardware configuration of the learning device 1 in the embodiment. The learning device 1 includes a control unit 11 that includes a processor 91 such as a CPU (Central Processing Unit) and a memory 92 connected via a bus, and executes a program. By executing the program, the learning device 1 functions as a device including the control unit 11, an input unit 12, a communication unit 13, a storage unit 14, and an output unit 15.
 More specifically, the processor 91 reads the program stored in the storage unit 14 and stores the read program in the memory 92. The processor 91 executes the program stored in the memory 92, whereby the learning device 1 functions as a device including the control unit 11, the input unit 12, the communication unit 13, the storage unit 14, and the output unit 15.
 The control unit 11 controls the operation of each functional unit included in the learning device 1. For example, the control unit 11 controls the operation of the output unit 15. The control unit 11 records, for example, various information generated by the learning in the storage unit 14.
 The input unit 12 includes input devices such as a mouse, a keyboard, and a touch panel. The input unit 12 may be configured as an interface that connects these input devices to the learning device 1. The input unit 12 receives input of various information to the learning device 1.
 The communication unit 13 includes a communication interface for connecting the learning device 1 to an external device. The communication unit 13 communicates with the external device via a wired or wireless connection. The external device is, for example, the device that transmits the main data used for learning. The communication unit 13 acquires the main data used for learning through communication with that device. The external device may also be, for example, the self-encoding device 2. The communication unit 13 transmits the network learning result to the self-encoding device 2 through communication with the self-encoding device 2. Note that the main data does not necessarily have to be input via the communication unit 13 and may instead be input to the input unit 12.
 The storage unit 14 is configured using a computer-readable storage medium device such as a magnetic hard disk device or a semiconductor storage device. The storage unit 14 stores various information about the learning device 1. The storage unit 14 stores, for example, information input via the input unit 12 or the communication unit 13, and various information generated by the execution of learning.
 The storage unit 14 stores in advance, for example, the probability distribution used to obtain the occurrence probability of each element of the tensor representing the auxiliary feature amount. The storage unit 14 stores in advance, for example, the parameterized auxiliary feature amount cumulative distribution function. The storage unit 14 stores in advance, for example, the parameterized main data feature amount cumulative distribution function. The storage unit 14 stores in advance, for example, the representative vector information. The storage unit 14 stores, for example, the result of the hyperrectangle partitioning.
 The storage unit 14 stores in advance, for example, the initial values of the parameters of the learning network 100. The initial values are, for example, random values. The storage unit 14 stores, for example, the network learning result.
 The output unit 15 outputs various information. The output unit 15 includes a display device such as a CRT (Cathode Ray Tube) display, a liquid crystal display, or an organic EL (Electro-Luminescence) display. The output unit 15 may be configured as an interface that connects these display devices to the learning device 1. The output unit 15 outputs, for example, information input to the input unit 12. The output unit 15 may also display, for example, the result of learning.
 FIG. 13 is a diagram showing an example of the configuration of the control unit 11 included in the learning device 1 in the embodiment. The control unit 11 includes a learning unit 10, a storage control unit 120, a communication control unit 130, and an output control unit 140. The storage control unit 120 records various information in the storage unit 14. The communication control unit 130 controls the operation of the communication unit 13. The output control unit 140 controls the operation of the output unit 15.
 FIG. 14 is a diagram showing an example of the hardware configuration of the self-encoding device 2 in the embodiment. The self-encoding device 2 includes a control unit 21 that includes a processor 93 such as a CPU (Central Processing Unit) and a memory 94 connected via a bus, and executes a program. By executing the program, the self-encoding device 2 functions as a device including the control unit 21, an input unit 22, a communication unit 23, a storage unit 24, and an output unit 25.
 More specifically, the processor 93 reads the program stored in the storage unit 24 and stores the read program in the memory 94. The processor 93 executes the program stored in the memory 94, whereby the self-encoding device 2 functions as a device including the control unit 21, the input unit 22, the communication unit 23, the storage unit 24, and the output unit 25.
 The control unit 21 controls the operation of each functional unit included in the self-encoding device 2. For example, the control unit 21 controls the operation of the output unit 25. The control unit 21 records, for example, various information generated by the encoding by the encoder 200 and the decoding by the decoder 212 in the storage unit 24.
 The input unit 22 includes input devices such as a mouse, a keyboard, and a touch panel. The input unit 22 may be configured as an interface that connects these input devices to the self-encoding device 2. The input unit 22 receives input of various information to the self-encoding device 2.
 The communication unit 23 includes a communication interface for connecting the self-encoding device 2 to an external device. The communication unit 23 communicates with the external device via a wired or wireless connection. The external device is, for example, the device that transmits the self-encoding target. The communication unit 23 acquires the self-encoding target through communication with that device. The external device may also be, for example, the learning device 1. The communication unit 23 receives the network learning result through communication with the learning device 1. Note that the self-encoding target does not necessarily have to be input via the communication unit 23 and may instead be input to the input unit 22.
 The storage unit 24 is configured using a computer-readable storage medium device such as a magnetic hard disk device or a semiconductor storage device. The storage unit 24 stores various information about the self-encoding device 2. The storage unit 24 stores, for example, information input via the input unit 22 or the communication unit 23, and various information generated by the execution of the encoding by the encoder 200 and the decoding by the decoder 212.
 The storage unit 24 stores, for example, the network learning result. The storage unit 24 stores, for example, the representative vector information in advance. The storage unit 24 stores, for example, the result of the hyperrectangle partitioning.
 The output unit 25 outputs various information. The output unit 25 includes a display device such as a CRT (Cathode Ray Tube) display, a liquid crystal display, or an organic EL (Electro-Luminescence) display. The output unit 25 may be configured as an interface that connects these display devices to the self-encoding device 2. The output unit 25 outputs, for example, information input to the input unit 22. The output unit 25 may also output, for example, the result of self-encoding of the self-encoding target.
 FIG. 15 is a diagram showing an example of the configuration of the control unit 21 included in the self-encoding device 2 in the embodiment. The control unit 21 includes a self-encoding execution unit 20, a storage control unit 220, a communication control unit 230, and an output control unit 240. The self-encoding execution unit 20 performs self-encoding on the self-encoding target. The self-encoding execution unit 20 includes the encoder 200 and the decoder 212. The self-encoding execution unit 20 performs self-encoding of the self-encoding target by performing the encoding by the encoder 200 and the decoding by the decoder 212.
 The storage control unit 220 records various information in the storage unit 24. The communication control unit 230 controls the operation of the communication unit 23. The output control unit 240 controls the operation of the output unit 25.
 The learning device 1 configured in this way learns the peripheral processing for vector quantization using representative vector information, which is information indicating the positions of representative vectors arranged in a lattice in a vector space, as in LatticeVQ. The learning device 1 then estimates the occurrence probability of each representative vector using the result of the hyperrectangle partitioning. As described above, when representative vectors are used, the occurrence probability of a representative vector may be estimated using Voronoi partitioning, but the integration over a Voronoi region is not easy.
 Therefore, the learning device 1, which estimates the occurrence probability of representative vectors using the hyperrectangles obtained by hyperrectangle partitioning, can reduce the burden required to obtain self-encoding processing using vector quantization. This reduces the burden, starting from the learning stage, required to realize self-encoding using vector quantization. The learning device 1 can therefore reduce the burden required for self-encoding using vector quantization.
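 As a reference, a minimal sketch of this integration idea follows, under assumptions that are not part of the embodiment: the parameterized density factorizes per dimension, an illustrative logistic CDF stands in for the trained parameterization, and the hyperrectangle assigned to each lattice point is an axis-aligned box of side 1 centered on it. Under these assumptions the integral reduces to a product of one-dimensional CDF differences, which is far easier than integrating over a Voronoi region:

    import numpy as np

    def logistic_cdf(x, loc, scale):
        return 1.0 / (1.0 + np.exp(-(np.asarray(x, dtype=float) - loc) / scale))

    def hyperrectangle_probability(lattice_point, loc, scale, half_width=0.5):
        # Integral of a per-dimension factorized density over the axis-aligned
        # hyperrectangle centered on lattice_point = product of CDF differences.
        c = np.asarray(lattice_point, dtype=float)
        upper = logistic_cdf(c + half_width, loc, scale)
        lower = logistic_cdf(c - half_width, loc, scale)
        return float(np.prod(upper - lower))

    # Assumed example in K = 3 dimensions
    p = hyperrectangle_probability([0.0, 1.0, -1.0],
                                   loc=np.array([0.2, 0.5, -0.5]),
                                   scale=np.array([1.0, 0.7, 0.9]))
    print(p)   # occurrence probability assigned to this representative vector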
 In addition, because the learning device 1 configured in this way uses the representative vectors to learn the peripheral processing for vector quantization, it can reduce the burden required for learning the representative vectors. Furthermore, because it uses the representative vectors to learn the peripheral processing for vector quantization, it does not use the memory that would be needed when learning the representative vectors themselves. Therefore, the learning device 1 can reduce the frequency of memory shortage problems and can process main data of larger dimensions.
 Incidentally, as described above, in learning for vector quantization it is necessary to add noise instead of performing quantization. However, when representative vector information such as LatticeVQ is used, the extent of the Voronoi region becomes the quantization error, so generating noise whose probability distribution is a Gaussian distribution is itself not easy.
 On the other hand, in learning, the learning device 1 configured in this way uses as noise only those samples, among samples randomly generated within the (K−1)-dimensional sphere circumscribing the Voronoi region in the K-dimensional vector space, that fall inside the Voronoi region. This makes it possible to generate noise following a Gaussian distribution. Therefore, the learning device 1 can reduce the burden required to obtain self-encoding processing using vector quantization, and thus can reduce the burden required for self-encoding using vector quantization.
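 A rough sketch of this rejection sampling, under simplifying assumptions that are not part of the embodiment (the lattice is taken to be the integer lattice Z^K, so the Voronoi region of the origin is the unit cube and membership can be checked by rounding; the helper names are illustrative):

    import numpy as np

    def sample_in_ball(k, radius, rng):
        # Uniform sample inside the K-dimensional ball of the given radius.
        v = rng.normal(size=k)
        v /= np.linalg.norm(v)
        return radius * rng.uniform() ** (1.0 / k) * v

    def voronoi_noise(k, rng, max_tries=10000):
        # Keep only samples that fall inside the Voronoi region of the origin
        # (for Z^K this means the nearest lattice point is the origin itself).
        radius = 0.5 * np.sqrt(k)           # circumscribing sphere of the unit cube
        for _ in range(max_tries):
            s = sample_in_ball(k, radius, rng)
            if np.all(np.round(s) == 0.0):  # inside the Voronoi region of 0
                return s
        raise RuntimeError("rejection sampling did not converge")

    rng = np.random.default_rng(0)
    noise = voronoi_noise(k=8, rng=rng)     # noise added to one main-feature vector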
 The self-encoding device 2 configured in this way performs self-encoding using vector quantization by using the result of learning by the learning device 1. Therefore, the burden required for self-encoding using vector quantization can be reduced.
(Modification)
 In adding noise to the main data feature amount, samples generated uniformly within a (K−1)-dimensional sphere whose volume is made to approximate the volume of the Voronoi region in the K-dimensional vector space may be used as noise points. In adding noise to the main data feature amount, samples generated uniformly within a hyperrectangle obtained by the hyperrectangle partitioning may also be used as noise points.
 Therefore, the process of adding noise to the main data feature amount (that is, the vector noise addition process) may be any one of a first noise addition process, a second noise addition process, and a third noise addition process. The first noise addition process adds as noise those samples, among samples randomly generated within the (K−1)-dimensional sphere circumscribing the Voronoi region in the K-dimensional vector space, that fall inside the Voronoi region. That is, the first noise addition process is the process described with reference to FIGS. 3 and 4.
 The second noise addition process adds as noise samples generated uniformly within a (K−1)-dimensional sphere whose volume is made to approximate the volume of the Voronoi region in the K-dimensional vector space. The third noise addition process adds as noise samples generated uniformly within a region that partitions the vector space in which the representative vectors are arranged in a lattice, that contains one lattice point of the vector space, and whose shape is a hyperrectangle. A sketch of these two variants follows.
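 The second and third noise addition processes can be sketched in the same spirit; the assumptions below (integer lattice Z^K, so the Voronoi volume is 1 and the hyperrectangle is the unit cube, plus the illustrative helper names) are not part of the embodiment:

    import numpy as np
    from math import gamma, pi

    def volume_matched_ball_noise(k, rng):
        # Second process: uniform sample in a ball whose volume equals the
        # Voronoi region volume (taken here as 1 for Z^K).
        radius = (gamma(k / 2.0 + 1.0) / pi ** (k / 2.0)) ** (1.0 / k)  # unit-volume ball
        v = rng.normal(size=k)
        v /= np.linalg.norm(v)
        return radius * rng.uniform() ** (1.0 / k) * v

    def hyperrectangle_noise(k, rng, half_width=0.5):
        # Third process: uniform sample in the axis-aligned hyperrectangle
        # containing one lattice point (the unit cube for Z^K).
        return rng.uniform(-half_width, half_width, size=k)

    rng = np.random.default_rng(0)
    n2 = volume_matched_ball_noise(8, rng)
    n3 = hyperrectangle_noise(8, rng)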
 Note that the learning device 1 may be implemented using a plurality of information processing devices communicably connected via a network. In this case, the functional units of the learning device 1 may be distributed and implemented across the plurality of information processing devices.
 Note that the self-encoding device 2 may also be implemented using a plurality of information processing devices communicably connected via a network. In this case, the functional units of the self-encoding device 2 may be distributed and implemented across the plurality of information processing devices.
 All or some of the functions of the learning device 1 and the self-encoding device 2 may be realized using hardware such as an ASIC (Application Specific Integrated Circuit), a PLD (Programmable Logic Device), or an FPGA (Field Programmable Gate Array). The program may be recorded on a computer-readable recording medium. Computer-readable recording media include portable media such as flexible disks, magneto-optical disks, ROMs, and CD-ROMs, and storage devices such as hard disks built into computer systems. The program may also be transmitted via a telecommunication line.
 Although the embodiment of the present invention has been described in detail with reference to the drawings, the specific configuration is not limited to this embodiment and includes designs and the like within a scope not departing from the gist of the present invention.
 1… learning device, 10… learning unit, 100… learning network, 101… main data acquisition unit, 102… main data side encoding unit, 103… auxiliary data side encoding unit, 104… main data side noise addition unit, 105… auxiliary data side noise addition unit, 106… auxiliary data side probability estimation unit, 107… auxiliary data side decoding unit, 108… auxiliary entropy acquisition unit, 109… main data side probability estimation unit, 110… main entropy acquisition unit, 111… main data side decoding unit, 112… reconstruction error calculation unit, 113… optimization unit, 2… self-encoding device, 200… encoder, 201… self-encoding target acquisition unit, 202… trained main data side encoding unit, 203… trained auxiliary data side encoding unit, 204… vector quantization unit, 205… scalar quantization unit, 206… trained auxiliary data side probability estimation unit, 207… trained auxiliary data side decoding unit, 208… auxiliary entropy encoding unit, 209… main data side probability estimation unit, 210… main entropy encoding unit, 211… data multiplexing unit, 212… decoder, 213… encoded data acquisition unit, 214… data separation unit, 215… auxiliary entropy decoding unit, 216… trained auxiliary data side decoding unit, 217… main entropy decoding unit, 218… trained main data side decoding unit, 11… control unit, 12… input unit, 13… communication unit, 14… storage unit, 15… output unit, 120… storage control unit, 130… communication control unit, 140… output control unit, 21… control unit, 22… input unit, 23… communication unit, 24… storage unit, 25… output unit, 20… self-encoding execution unit, 220… storage control unit, 230… communication control unit, 240… output control unit, 91… processor, 92… memory, 93… processor, 94… memory

Claims (8)

  1. A learning device comprising:
     a learning unit that updates, by learning, encoding and decoding processes in a self-encoding process using vector quantization, the self-encoding process using a main data feature amount that is a feature amount of a self-encoding target and an auxiliary feature amount that is a feature amount of the main data feature amount, and performing entropy encoding of a result of vector quantization of the main data feature amount and entropy encoding of a result of scalar quantization of the auxiliary feature amount,
     wherein, in the learning, the learning unit executes a main data side probability estimation process of estimating an occurrence probability of each element of a tensor representing the main data feature amount, and
     the main data side probability estimation process estimates the occurrence probability using a result of integrating a parameterized probability density function over an integration region that is a region partitioning a vector space in which representative vectors are arranged in a lattice, that contains one lattice point of the vector space, and whose shape is a hyperrectangle.
  2. The learning device according to claim 1, wherein the learning unit updates the encoding and decoding processes using an entropy of the main data feature amount obtained based on the occurrence probability estimated by the main data side probability estimation process.
  3. The learning device according to claim 1 or 2, wherein the vector quantization is LatticeVQ.
  4. A self-encoding device comprising:
     a self-encoding target acquisition unit that acquires a self-encoding target; and
     a self-encoding execution unit that performs self-encoding, by vector quantization, of the target acquired by the self-encoding target acquisition unit, using a trained encoding process and a trained decoding process obtained using a learning device, the learning device comprising a learning unit that updates, by learning, encoding and decoding processes in a self-encoding process using vector quantization, the self-encoding process using a main data feature amount that is a feature amount of a self-encoding target and an auxiliary feature amount that is a feature amount of the main data feature amount, and performing entropy encoding of a result of vector quantization of the main data feature amount and entropy encoding of a result of scalar quantization of the auxiliary feature amount, wherein, in the learning, the learning unit executes a main data side probability estimation process of estimating an occurrence probability of each element of a tensor representing the main data feature amount, and the main data side probability estimation process estimates the occurrence probability using a result of integrating a parameterized probability density function over an integration region that is a region partitioning a vector space in which representative vectors are arranged in a lattice, that contains one lattice point of the vector space, and whose shape is a hyperrectangle.
  5. A learning method comprising:
     a learning step of updating, by learning, encoding and decoding processes in a self-encoding process using vector quantization, the self-encoding process using a main data feature amount that is a feature amount of a self-encoding target and an auxiliary feature amount that is a feature amount of the main data feature amount, and performing entropy encoding of a result of vector quantization of the main data feature amount and entropy encoding of a result of scalar quantization of the auxiliary feature amount,
     wherein, in the learning, the learning step executes a main data side probability estimation process of estimating an occurrence probability of each element of a tensor representing the main data feature amount, and
     the main data side probability estimation process estimates the occurrence probability using a result of integrating a parameterized probability density function over an integration region that is a region partitioning a vector space in which representative vectors are arranged in a lattice, that contains one lattice point of the vector space, and whose shape is a hyperrectangle.
  6. A self-encoding method comprising:
     a self-encoding target acquisition step of acquiring a self-encoding target; and
     a self-encoding execution step of performing self-encoding, by vector quantization, of the target acquired in the self-encoding target acquisition step, using a trained encoding process and a trained decoding process obtained using a learning device, the learning device comprising a learning unit that updates, by learning, encoding and decoding processes in a self-encoding process using vector quantization, the self-encoding process using a main data feature amount that is a feature amount of a self-encoding target and an auxiliary feature amount that is a feature amount of the main data feature amount, and performing entropy encoding of a result of vector quantization of the main data feature amount and entropy encoding of a result of scalar quantization of the auxiliary feature amount, wherein, in the learning, the learning unit executes a main data side probability estimation process of estimating an occurrence probability of each element of a tensor representing the main data feature amount, and the main data side probability estimation process estimates the occurrence probability using a result of integrating a parameterized probability density function over an integration region that is a region partitioning a vector space in which representative vectors are arranged in a lattice, that contains one lattice point of the vector space, and whose shape is a hyperrectangle.
  7. A program for causing a computer to function as the learning device according to any one of claims 1 to 3.
  8. A program for causing a computer to function as the self-encoding device according to claim 4.
PCT/JP2021/042980 2021-11-24 2021-11-24 Learning device, autoencoding device, learning method, autoencoding method, and program WO2023095204A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
JP2023563381A JPWO2023095204A1 (en) 2021-11-24 2021-11-24
PCT/JP2021/042980 WO2023095204A1 (en) 2021-11-24 2021-11-24 Learning device, autoencoding device, learning method, autoencoding method, and program

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/JP2021/042980 WO2023095204A1 (en) 2021-11-24 2021-11-24 Learning device, autoencoding device, learning method, autoencoding method, and program

Publications (1)

Publication Number Publication Date
WO2023095204A1 true WO2023095204A1 (en) 2023-06-01

Family

ID=86539066

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2021/042980 WO2023095204A1 (en) 2021-11-24 2021-11-24 Learning device, autoencoding device, learning method, autoencoding method, and program

Country Status (2)

Country Link
JP (1) JPWO2023095204A1 (en)
WO (1) WO2023095204A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2024068368A1 (en) * 2022-09-27 2024-04-04 Interdigital Ce Patent Holdings, Sas Uniform vector quantization for end-to-end image/video compression

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
EIRIKUR AGUSTSSON, MENTZER FABIAN, TSCHANNEN MICHAEL, CAVIGELLI LUKAS, TIMOFTE RADU, BENINI LUCA, VAN GOOL LUC: "Soft-to-Hard Vector Quantization for End-to-End Learning Compressible Representations", 8 June 2017 (2017-06-08), XP055730082, Retrieved from the Internet <URL:https://arxiv.org/pdf/1704.00648.pdf> [retrieved on 20200911] *
LIJUN ZHAO; HUIHUI BAI; ANHONG WANG; YAO ZHAO: "Deep Optimized Multiple Description Image Coding via Scalar Quantization Learning", ARXIV.ORG, CORNELL UNIVERSITY LIBRARY, 201 OLIN LIBRARY CORNELL UNIVERSITY ITHACA, NY 14853, 12 January 2020 (2020-01-12), 201 Olin Library Cornell University Ithaca, NY 14853 , XP081577430 *

Also Published As

Publication number Publication date
JPWO2023095204A1 (en) 2023-06-01

Similar Documents

Publication Publication Date Title
US11558620B2 (en) Image encoding and decoding, video encoding and decoding: methods, systems and training methods
Xin et al. Maximal sparsity with deep networks?
CN113424202A (en) Adjusting activation compression for neural network training
US11221990B2 (en) Ultra-high compression of images based on deep learning
Cohen et al. Nonlinear approximation of random functions
Won et al. Stochastic image processing
CN111818346A (en) Image encoding method and apparatus, image decoding method and apparatus
US20210065052A1 (en) Bayesian optimization of sparsity ratios in model compression
JP5349407B2 (en) A program to cluster samples using the mean shift procedure
CN108959322B (en) Information processing method and device for generating image based on text
Saravanan et al. Intelligent Satin Bowerbird Optimizer Based Compression Technique for Remote Sensing Images.
WO2023095204A1 (en) Learning device, autoencoding device, learning method, autoencoding method, and program
WO2023095207A1 (en) Learning device, autoencoding device, learning method, autoencoding method, and program
US20240202982A1 (en) 3d point cloud encoding and decoding method, compression method and device based on graph dictionary learning
US11544881B1 (en) Method and data processing system for lossy image or video encoding, transmission and decoding
Xing et al. Flexible signal denoising via flexible empirical Bayes shrinkage
CN115311515A (en) Training method for generating countermeasure network by mixed quantum classical and related equipment
US11922018B2 (en) Storage system and storage control method including dimension setting information representing attribute for each of data dimensions of multidimensional dataset
Mathieu et al. Geometric neural diffusion processes
Psenka et al. Representation learning via manifold flattening and reconstruction
Kharinov Model of the quasi-optimal hierarchical segmentation of a color image
WO2023248427A1 (en) Learning device, autoencoding device, learning method, autoencoding method, and program
WO2023248431A1 (en) Training device, training method, and program
CN116261856A (en) Point cloud layering method, decoder, encoder and storage medium
JP2018182531A (en) Division shape determining apparatus, learning apparatus, division shape determining method, and division shape determining program

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21965578

Country of ref document: EP

Kind code of ref document: A1

WWE Wipo information: entry into national phase

Ref document number: 2023563381

Country of ref document: JP