WO2021250868A1 - Restoration possibility determination method, restoration possibility determination device, and program - Google Patents

Restoration possibility determination method, restoration possibility determination device, and program

Info

Publication number
WO2021250868A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
determination
unit
compressed data
learned
Prior art date
Application number
PCT/JP2020/023094
Other languages
English (en)
Japanese (ja)
Inventor
伸 水谷
Original Assignee
日本電信電話株式会社
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 日本電信電話株式会社 filed Critical 日本電信電話株式会社
Priority to JP2022529972A priority Critical patent/JPWO2021250868A1/ja
Priority to PCT/JP2020/023094 priority patent/WO2021250868A1/fr
Publication of WO2021250868A1 publication Critical patent/WO2021250868A1/fr

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F13/00Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units

Definitions

  • the present invention relates to a restoration possibility determination method, a restoration possibility determination device, and a program.
  • Sensor nodes used in sensor networks operate with low power consumption, and can have not only sensing and communication functions but also some information processing functions, such as data compression and data classification/identification and event detection.
  • This field is called edge computing (for example, Non-Patent Document 1 and the like).
  • If the sensing data obtained from the sensor nodes is sent to the center as it is, the amount of communication to the center increases as the number of sensor nodes increases, and may exceed the limit (bandwidth) of the possible amount of communication. Therefore, compressing the sensor data at the sensor node to reduce the amount of information communicated to the center is considered.
  • Communication based on data classification/identification or event detection is also a kind of communication-volume compression, because the sensor data itself is not communicated.
  • The following three can be considered as the contents of communication between the sensor node and the center.
  • The sensor data is identified on the sensor node side, and only the identification result is sent to the center.
  • The sensor node side obtains feature amounts of the sensor data, and sends only the feature amounts to the center.
  • The sensor node side compresses the sensor data, and sends only the compressed data to the center, where it is restored.
  • In the case of sending feature amounts, identification is performed using the feature amounts transmitted to the center; however, for sensor data outside the expected identification targets, the required feature amounts also differ, so it is difficult to identify an unexpected input.
  • a data-dependent compression / restoration method can be used depending on what kind of sensor data is to be compressed.
  • Different compression/restoration methods are used depending on whether the sensor data is an image, sound, or another type of time series.
  • There are lossless compression and lossy compression, and in general, lossy compression achieves a higher compression ratio than lossless compression.
  • An autoencoder (AE) can be used with any kind of sensor data, but the AE must be trained in advance using the sensor data (Non-Patent Document 2 and Non-Patent Document 3).
  • Since the compressor/restorer is created by learning, data other than the assumed training data (and data close to it) cannot be restored, and the output of the AE differs from the input. For sensor data outside such assumptions, the AE does not function as a compressor/restorer.
  • the computing power and energy consumption on the sensor node side are limited compared to the center side.
  • an autoencoder may be used as a compression restorer (Non-Patent Document 4).
  • the autoencoder consists of an encoder part and a decoder part, the encoder compresses the input data, and the decoder restores the compressed data.
  • An autoencoder is a kind of neural network, and is trained so that the input data given to the encoder is reproduced as it is by the decoder.
  • The data used as training data, and data close to it, can be compressed and restored thanks to learning and generalization. On the other hand, for other data it is unclear whether compression and restoration are possible, that is, whether the input sensor data can be restored on the decoder side.
  • Whether restoration is possible could be determined by checking whether the difference between the input and output of the autoencoder is less than a certain threshold; however, in the above sensor network, the input data exists at the sensor node and the output data at the center. Comparing the input data with the output data would therefore require transmitting the input data to the center or the output data to the sensor node.
  • Since the autoencoder can be expected to generalize through learning, the training data itself can naturally be compressed and restored, and data close to the training data can also be compressed and restored. Therefore, if the data obtained by the sensor node can be limited to the training data and data close to it, the sensor network can function and operate.
  • the present invention has been made in view of the above points, and an object of the present invention is to make it possible to determine whether or not restoration is possible based on compressed data.
  • In order to make such a determination possible, a computer executes a determination procedure in which an anomaly detection technique is applied to compressed data generated by compressing input data using the encoder of a trained autoencoder, thereby determining whether or not the compressed data can be restored.
  • FIG. 1 is a diagram showing a configuration example of a sensor network according to an embodiment of the present invention.
  • one or more sensor nodes 20 are connected to the center 10 via a network such as the Internet.
  • The sensor node 20 is connected to a sensor (or has a sensor), compresses the data sensed by the sensor (hereinafter referred to as "sensor data"), and transmits the compressed data (hereinafter referred to as "compressed data") to the center 10.
  • the center 10 is one or more computers having a function of receiving compressed data transmitted from the sensor node 20 and restoring the compressed data.
  • the sensor data is assumed to be image data.
  • the details of the image data will be described later.
  • the data to which this embodiment is applicable is not limited to the specific format.
  • FIG. 2 is a diagram showing a hardware configuration example of the center 10 according to the embodiment of the present invention.
  • the center 10 of FIG. 2 has a drive device 100, an auxiliary storage device 102, a memory device 103, a processor 104, an interface device 105, and the like, which are connected to each other by a bus B, respectively.
  • the program that realizes the processing at the center 10 is provided by a recording medium 101 such as a CD-ROM.
  • the program is installed in the auxiliary storage device 102 from the recording medium 101 via the drive device 100.
  • the program does not necessarily have to be installed from the recording medium 101, and may be downloaded from another computer via the network.
  • the auxiliary storage device 102 stores the installed program and also stores necessary files, data, and the like.
  • the memory device 103 reads the program from the auxiliary storage device 102 and stores it when there is an instruction to start the program.
  • the processor 104 is a CPU or GPU (Graphics Processing Unit), or a CPU and GPU, and executes a function related to the center 10 according to a program stored in the memory device 103.
  • the interface device 105 is used as an interface for connecting to a network.
  • the sensor node 20 may also have the same hardware configuration as the center 10. However, the performance of the hardware of the sensor node 20 may be lower than the performance of the hardware of the center 10.
  • FIG. 3 is a diagram showing a functional configuration example of the sensor network according to the first embodiment.
  • the sensor node 20 has a compression unit 21 and a transmission unit 22.
  • Each of these parts is realized by a process in which one or more programs installed in the sensor node 20 are executed by a processor (for example, a CPU) of the sensor node 20.
  • The compression unit 21 generates compressed data by compressing the sensor data. In this embodiment, lossy compression is performed.
  • the transmission unit 22 transmits the compressed data to the center 10.
  • the center 10 has a receiving unit 11, a determination unit 12, a restoring unit 13, a classification unit 14, and a learning unit 15. Each of these parts is realized by a process in which one or more programs installed in the center 10 are executed by the processor 104.
  • the center 10 also uses the data storage unit 16.
  • the data storage unit 16 can be realized by using, for example, an auxiliary storage device 102, a storage device that can be connected to the center 10 via a network, or the like.
  • the receiving unit 11 receives the compressed data transmitted from the sensor node 20.
  • The determination unit 12 performs a restorability determination on the compressed data.
  • The restorability determination means a determination as to whether or not the compressed data can be restored.
  • Here, "restorable" means a state in which the difference between the input and output of the compressor/restorer (the compression unit 21 and the restoration unit 13) is less than a defined threshold value, and "unrestorable" means a state in which the difference is greater than or equal to the threshold value.
  • the restoration unit 13 generates the restoration data by restoring the compressed data determined to be recoverable.
  • the classification unit 14 classifies the restored data, which will be described later.
  • The learning unit 15 trains the neural networks constituting the compression unit 21 and the restoration unit 13 by using the data group stored in the data storage unit 16. In this embodiment, compression and restoration (encoding and decoding) of the sensor data are performed using an autoencoder (AE).
  • FIG. 4 is a diagram schematically showing a compression unit 21 and a restoration unit 13 in the embodiment of the present invention.
  • The compression unit 21 is realized by the encoder of the AE.
  • The restoration unit 13 is realized by the decoder of the AE.
  • The information at the intermediate layer of the AE, where the number of units is smallest, is communicated between the sensor node 20 and the center 10 as the compressed data, which reduces the amount of communication.
  • The AE is a kind of layered neural network, and includes a compression/encoding (encoder) part and a restoration/decoding (decoder) part (Non-Patent Document 2, Non-Patent Document 3).
  • the white circles represent the units of the neural network, and the lines connecting the units represent the weights (links) between the units.
  • FIG. 5 shows an AE having a five-layer structure in which the input is five-dimensional, compressed in two dimensions, and the input is reproduced on the output side.
  • the number of units gradually decreases as the processing in each layer progresses from left to right in order to compress the dimension of the input data.
  • In the decoder, the number of units, reduced at the intermediate layer, increases back to the same number of units as the input layer of the encoder, and the input information is restored.
  • The encoder and the decoder each have a plurality of layers, and the intermediate layer in the middle has the smallest number of units, forming an hourglass shape.
  • By transmitting only the intermediate-layer information, the amount of communication is reduced.
  • The AE is trained by supervised learning in which the input itself serves as the teacher signal, so that the input and output become the same.
  • the loss function used in training varies depending on the data set used, but is MSE (Mean Square Error), BCE (Binary Cross Entropy), CCE (Categorical Cross Entropy), or the like.
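  • A minimal sketch of such an hourglass AE and its training loop, written in Python with PyTorch, is shown below for illustration. The layer sizes, activations, optimizer, and dummy input data are assumptions chosen for the sketch and are not part of the present disclosure.

```python
# Minimal sketch of the hourglass AE of FIG. 5 (5-dim input, 2-dim middle layer).
# Assumptions: PyTorch, ReLU activations, MSE loss, Adam optimizer, random dummy data.
import torch
import torch.nn as nn

class HourglassAE(nn.Module):
    def __init__(self, in_dim=5, latent_dim=2):
        super().__init__()
        # Encoder: number of units decreases toward the middle layer.
        self.encoder = nn.Sequential(nn.Linear(in_dim, 4), nn.ReLU(),
                                     nn.Linear(4, latent_dim))
        # Decoder: number of units increases back to the input dimension.
        self.decoder = nn.Sequential(nn.Linear(latent_dim, 4), nn.ReLU(),
                                     nn.Linear(4, in_dim))

    def forward(self, x):
        z = self.encoder(x)      # compressed data (what would be sent over the network)
        return self.decoder(z)   # restored data

model = HourglassAE()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()           # MSE; BCE/CCE are alternatives depending on the data set

x = torch.rand(256, 5)           # dummy sensor data for illustration
for _ in range(100):
    opt.zero_grad()
    loss = loss_fn(model(x), x)  # trained so that the output reproduces the input
    loss.backward()
    opt.step()
```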
  • In this embodiment, the sensor data is image data. It is assumed that each image is a 28 × 28 matrix and each pixel has 8 bits. An example of an AE suitable for such image data is the AE shown in FIG. 6.
  • The AE in FIG. 6 compresses and restores image data given as a 28 × 28 matrix of 8-bit values; a CNN (Convolutional Neural Network) is usually used for such data (https://qiita.com/icoxfog417/items/5fd55fad152231d706c2).
  • the dimension of information is reduced and compressed into information such as features while maintaining the information of spatial position.
  • the CNN in FIG. 6 has a 9-layer structure, and the input / output represents a 28 ⁇ 28 dimensional vector.
  • N and M of each intermediate layer represented by a rectangular parallelepiped represent the number of unit planes (types of filters), and the rectangle of the shaded portion represents the range of connection from the previous layer.
  • a neural network having a structure in which a fully connected layer (FC (Fully Connected) layer) is sandwiched between CNNs is also suitable for image data.
  • Here, a neural network having a symmetrical structure centered on the middle layer is considered; however, as long as the input and output have the same dimension (number of units) and the input data can be restored at the output layer, any neural network can be used as a compressor/restorer.
  • the determination unit 12 will be described in more detail.
  • As described above, a state in which the difference between the input and output of the compressor/restorer (the compression unit 21 and the restoration unit 13) is less than a defined threshold value is defined as restorable, and a state in which the difference is greater than or equal to the threshold value is regarded as unrestorable.
  • To strictly perform the restorability determination according to this definition, both the uncompressed sensor data and its restored data are required. This would make it necessary to constantly transmit the sensor data from the sensor node 20, and the significance of compressing the sensor data would be lost. Therefore, in the present embodiment, the restorability determination is made by determining whether the compressed sensor data is learned data or unlearned data for the autoencoder serving as the compression unit 21 and the restoration unit 13 (hereinafter, "learned determination"). That is, a determination result of "learned" indicates that the data can be restored, and a determination result of "unlearned" indicates that it cannot be restored.
  • an abnormality detection technique is used for the learned determination.
  • There are various conventional methods in the technical field of anomaly detection, and these conventional methods can be used in the present embodiment.
  • Learned data is given in advance to the autoencoder serving as the compressor/restorer, but unlearned data is not given in advance; therefore, anomaly detection methods that can be trained using only normal data are applicable.
  • The learned determination is performed by applying an anomaly detection technique to the compressed data. That is, as a result of applying the anomaly detection technique to the compressed data, if no anomaly is detected (if the data is determined to be normal), the data is determined to be learned, and if the data is determined to be anomalous, it is determined to be unlearned.
  • Accordingly, it must be possible to make the learned determination based on the compressed data obtained by compressing normal data (that is, the learned data).
  • In a VAE (Variational AutoEncoder), a loss function containing a regularization term that imposes a constraint on the compressed data obtained by compressing the input data is used (Non-Patent Document 2, Non-Patent Document 3).
  • Anomaly detection can be performed according to how well this constraint is satisfied.
  • input / output data is called an observed variable, and compressed data is called a latent variable.
  • A VAE is obtained by adding, to the loss function between the input and output of a normal AE, a term that constrains the compressed data in the middle layer to follow a specified probability density function. Therefore, the VAE encoder outputs the parameters of that probability density function. In other words, a VAE is an AE in which the encoder is an inference model that obtains the latent variable Z (information that cannot be obtained directly from observations, such as a classification label or handwriting style) from the observed variables (observed data), and the decoder is a generative model that obtains the observed variables from the latent variable Z.
  • VAE training is usually performed so that, in the space of the latent variable Z in the middle layer, the data points follow a Gaussian distribution N(0, I).
  • Specifically, the encoder independently computes the mean and standard deviation of a Gaussian distribution for each dimension of the latent variable, the latent variable Z sampled from those probability distributions is used as the input of the decoder, and the network is trained so that the input to the encoder is reproduced at the output of the decoder.
  • The regularization term is the KLD (Kullback-Leibler Divergence) between the distribution output by the encoder and N(0, I).
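  • The following is a minimal Python (PyTorch) sketch of such a VAE: the encoder outputs a per-dimension mean and log-variance, the latent variable Z is sampled by the reparameterization trick, and the loss is the reconstruction term plus the KLD term toward N(0, I). The layer sizes and the fully connected structure (rather than the CNN+FC structure of FIG. 7) are assumptions for illustration.

```python
# Minimal fully connected VAE sketch (assumed sizes; the target AE of FIG. 7 uses a CNN+FC structure).
import torch
import torch.nn as nn
import torch.nn.functional as F

class VAE(nn.Module):
    def __init__(self, in_dim=784, hidden=256, latent_dim=16):
        super().__init__()
        self.enc = nn.Linear(in_dim, hidden)
        self.mu = nn.Linear(hidden, latent_dim)       # encoder outputs the mean ...
        self.logvar = nn.Linear(hidden, latent_dim)   # ... and log-variance per latent dimension
        self.dec1 = nn.Linear(latent_dim, hidden)
        self.dec2 = nn.Linear(hidden, in_dim)

    def encode(self, x):
        h = F.relu(self.enc(x))
        return self.mu(h), self.logvar(h)

    def reparameterize(self, mu, logvar):
        std = torch.exp(0.5 * logvar)
        return mu + std * torch.randn_like(std)       # sample Z from N(mu, sigma^2)

    def decode(self, z):
        return torch.sigmoid(self.dec2(F.relu(self.dec1(z))))

    def forward(self, x):
        mu, logvar = self.encode(x)
        z = self.reparameterize(mu, logvar)
        return self.decode(z), mu, logvar

def vae_loss(x, x_hat, mu, logvar):
    # Reconstruction term (BCE for pixels in [0, 1]) + KLD term pulling N(mu, sigma^2) toward N(0, I).
    bce = F.binary_cross_entropy(x_hat, x, reduction="sum")
    kld = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return bce + kld

model = VAE()
x = torch.rand(8, 784)                                # dummy images normalized to [0, 1]
x_hat, mu, logvar = model(x)
loss = vae_loss(x, x_hat, mu, logvar)
```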
  • the VAE is used as the compression unit 21 and the restoration unit 13.
  • the AEs shown in FIGS. 6 and 7 can also be used as VAEs (learnable).
  • Since the sensor data is image data, a case where the AE of FIG. 7 is used as a VAE will be described.
  • the VAE is hereinafter referred to as "target AE".
  • The target AE is prepared by training it in advance with the sensor data observed by the sensor node 20 as its input/output, so that the input can be restored at the output.
  • As the training data for the target AE, it is usually sufficient to have data sampled from the assumed probability distribution under which a certain event occurs, in an amount sufficient to capture the characteristics of that distribution. Data points other than the sample points are considered to be interpolated by the generalization ability of the AE.
  • As the training data set of the target AE, the handwritten digit data set MNIST shown in FIG. was used; MNIST is often used in the field of supervised machine learning (http://yann.lecun.com/exdb/mnist/).
  • the total number of MNIST data is 60,000, and it is classified into 50,000 data for normal learning and 10,000 data for testing.
  • the observation data obtained by the sensor node 20 is 8-bit handwritten numeric data of a 28 ⁇ 28 matrix, and the situation where the center 10 side wants the classification label is considered. That is, the image of the handwritten numbers taken by the camera (sensor node 20) is the input data, and the output data on the center 10 side is the classification.
  • Fashion-MNIST (F-MNIST, https://github.com/zalandoresearch/fashion-mnist) is used as unlearned data that is not assumed in advance.
  • Like MNIST, F-MNIST data is monochrome image data of fashion items as shown in FIG. 9, in which each element of a 28 × 28 matrix is assigned a value from 0 to 255 (2^8, i.e., 8 bits) and a digit from 0 to 9 is given as a classification label.
  • the total number of data of F-MNIST is 60,000, and it is classified into 50,000 data for normal learning and 10,000 data for testing.
  • The data values to be input are the integer pixel-luminance values of the image, from 0 to 255 (2^8, i.e., 8 bits), normalized to [0, 1].
  • Since the target AE is trained so that the latent variable Z follows the probability density function N(0, I), it is conceivable that learned data lies close to this distribution and unlearned data lies away from it.
  • FIGS. 10 to 12 are diagrams showing variations in the latent space of trained / unlearned data points in the case of various latent variable dimension numbers.
  • FIG. 10 shows the variation when the number of dimensions of the latent variable Z is 2.
  • FIG. 11 shows the variation when the number of dimensions of the latent variable Z is 16.
  • FIG. 12 shows the variation when the number of dimensions of the latent variable Z is 64.
  • The visualization uses t-SNE (t-distributed Stochastic Neighbor Embedding).
  • black dots indicate trained data points (MNIST data points)
  • white dots indicate unlearned data points (F-MNIST data points).
  • Here, the learned data points refer to the points, in the latent space of the target AE trained on MNIST, of each data item included in MNIST.
  • The unlearned data points refer to the points, in the same latent space, of each data item included in F-MNIST.
  • By using an appropriate number of latent dimensions, the learned determination can be made from the difference in the positions where the learned/unlearned data points exist. According to FIGS. 10 to 12, the positions of the learned/unlearned data points appear to be sufficiently separated in the 16-dimensional case (FIG. 11). Therefore, in the present embodiment, the dimension of the latent variable Z is set to 16, and cases in which various anomaly detection methods (anomaly detection techniques) are applied to this latent variable Z will be described.
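  • A sketch of how such latent-space plots can be produced with t-SNE is shown below, assuming the latent vectors of the learned and unlearned data have already been computed by the encoder; random stand-in arrays are used here so that the sketch is self-contained.

```python
# Sketch of the latent-space visualization behind FIGS. 10-12 using t-SNE.
# Assumptions: in practice z_learned / z_unlearned would be latent vectors produced by the trained
# encoder for MNIST and F-MNIST images; random stand-ins are used so the sketch runs on its own.
import numpy as np
import matplotlib.pyplot as plt
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)
z_learned = rng.normal(0.0, 1.0, size=(500, 16))    # stand-in: latent vectors of learned (MNIST) data
z_unlearned = rng.normal(3.0, 1.0, size=(500, 16))  # stand-in: latent vectors of unlearned (F-MNIST) data

z_2d = TSNE(n_components=2, random_state=0).fit_transform(np.vstack([z_learned, z_unlearned]))

n = len(z_learned)
plt.scatter(z_2d[:n, 0], z_2d[:n, 1], c="black", s=4, label="learned (MNIST)")
plt.scatter(z_2d[n:, 0], z_2d[n:, 1], c="white", edgecolors="gray", s=4, label="unlearned (F-MNIST)")
plt.legend()
plt.show()
```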
  • [Case 1] FIG. 13 shows a histogram of the squared Mahalanobis distance (anomaly score function) of the learned data points.
  • The white histogram is the histogram of the learned data.
  • The curve is the χ² distribution, which is the theoretical curve.
  • The squared Mahalanobis distance between new data following the same distribution as the learned data and the mean of the learned-data distribution follows a χ² distribution. With this χ² distribution, once a false-recognition rate is decided, the threshold value can be determined from the distribution.
  • Alternatively, the intersection between the histogram of the squared Mahalanobis distance for the unlearned data set (hereinafter referred to as the "unlearned histogram") and the histogram for the above-mentioned learned data set (hereinafter referred to as the "learned histogram") can be used as the threshold value for the squared Mahalanobis distance that minimizes the false-determination rate.
  • Here, a false determination means that learned data is determined to be unlearned, or vice versa.
  • FIG. 14 is the first diagram showing the threshold value at the square Mahalanobis distance.
  • the X-axis corresponds to the square Mahalanobis distance.
  • the white histogram shows the trained histogram
  • the black-painted histogram shows the unlearned histogram (this point is the same in FIGS. 15 to 19).
  • the broken line L1 indicates the intersection of the unlearned histogram and the learned histogram, that is, the position of the threshold value (this point is also the same in FIGS. 15 to 19).
  • The determination unit 12 uses this threshold value to determine whether new input data has been learned. Specifically, the determination unit 12 calculates the squared Mahalanobis distance of the latent variable Z (that is, the compressed data) of the new input data; if the calculated value is equal to or greater than the threshold value, the data is determined to be unlearned, and if it is less than the threshold value, the data is determined to be learned.
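  • The Case 1 determination can be sketched as follows in Python, using the squared Mahalanobis distance as the anomaly score and a threshold taken from the χ² distribution for a chosen false-recognition rate. The mean and covariance estimated from stand-in learned latent vectors, the dimension of 16, and the 1% rate are illustrative assumptions.

```python
# Sketch of the Case 1 learned determination: squared Mahalanobis distance vs. a chi-squared threshold.
# Assumptions: z_learned holds latent vectors of the learned data (random stand-ins here); dim = 16.
import numpy as np
from scipy.stats import chi2

dim = 16
z_learned = np.random.randn(10000, dim)        # stand-in for latent vectors of learned data

mu = z_learned.mean(axis=0)
cov = np.cov(z_learned, rowvar=False)
cov_inv = np.linalg.inv(cov)

def sq_mahalanobis(z):
    d = z - mu
    return float(d @ cov_inv @ d)              # anomaly score (squared Mahalanobis distance)

alpha = 0.01                                   # allowed false-recognition rate
threshold = chi2.ppf(1.0 - alpha, df=dim)      # threshold taken from the chi-squared distribution

def is_learned(z):
    # learned (restorable) if the anomaly score is below the threshold
    return sq_mahalanobis(z) < threshold

print(is_learned(np.zeros(dim)))               # data near the learned distribution -> True
print(is_learned(np.full(dim, 10.0)))          # data far from it -> False
```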
  • [Case 2] When a multivariate Gaussian distribution N(μ, Σ) is fitted using all learned data points, the intersection between the histogram of the squared Mahalanobis distance from the mean of the learned data points (hereinafter referred to as the "learned histogram") and the histogram of the squared Mahalanobis distance for the unlearned data set F-MNIST (hereinafter referred to as the "unlearned histogram") is used as the threshold.
  • FIG. 15 is a second diagram showing a threshold value at the squared Mahalanobis distance.
  • FIG. 15 is read in the same way as FIG. 14. The method of determining whether data has been learned is also the same as in Case 1.
  • [Case 3] FIG. 16 is a diagram showing a threshold value for the −ln probability value.
  • the X-axis is the ⁇ ln probability value.
  • the case where the probability value is 0 is excluded, and in this case, the abnormal score function is considered to be infinite.
  • The determination unit 12 determines whether new input data has been learned using the threshold value shown in FIG. 16. Specifically, the determination unit 12 calculates the −ln probability value of the latent variable Z (that is, the compressed data) of the new input data; if the calculated value is equal to or greater than the threshold value, the data is determined to be unlearned, and if it is less than the threshold value, the data is determined to be learned.
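  • One plausible reading of the −ln probability value of Case 3 is the negative log-density of the latent vector under a Gaussian fitted to the learned latent vectors; a sketch under that assumption is shown below, with a tail-quantile threshold standing in for the threshold of FIG. 16.

```python
# Sketch of a "-ln probability value" anomaly score (one plausible reading of Case 3):
# negative log-density of a latent vector under a Gaussian fitted to the learned latent vectors.
import numpy as np
from scipy.stats import multivariate_normal

dim = 16
z_learned = np.random.randn(10000, dim)                      # stand-in for learned latent vectors
dist = multivariate_normal(mean=z_learned.mean(axis=0),
                           cov=np.cov(z_learned, rowvar=False))

def neg_ln_prob(z):
    return -dist.logpdf(z)                                   # anomaly score; larger = more anomalous

scores_learned = neg_ln_prob(z_learned)                      # learned-side histogram of FIG. 16
threshold = np.quantile(scores_learned, 0.99)                # assumed tail threshold; the intersection
                                                             # with the unlearned histogram is another option

def is_learned(z):
    return neg_ln_prob(z) < threshold
```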
  • [Case 4] The encoder outputs μ and σ, so one distribution N(μ, σ) is obtained for each input. The difference (anomaly score function) can therefore be defined by the distance between distributions using the KLD, and the threshold may again be determined in consideration of the corresponding histograms. As the distribution representing the learned data, various choices are conceivable, such as the learned-data distribution with the smallest KLD to the input distribution, or the distribution obtained by averaging the distributions of all the learned data. Alternatively, using only the mean μ obtained from the encoder, a difference based on the squared Mahalanobis distance may be considered as described above. In general, the difference (anomaly score function) may be defined using any quantity that can be calculated and suits the purpose and measurement, and the learned determination may be performed using anomaly detection techniques.
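  • A sketch of a KLD-based anomaly score of the kind described above is shown below, using the closed-form KLD between diagonal Gaussians. Taking the prior N(0, I) as the reference distribution for the learned data is one of the possible choices mentioned above and is an assumption of this sketch.

```python
# Sketch of a KLD-based anomaly score (Case 4): distance between the Gaussian N(mu, sigma^2)
# output by the encoder for a new input and a reference Gaussian representing the learned data.
# The reference here is the prior N(0, I); a distribution fitted to the learned data is another option.
import numpy as np

def kld_diag_gaussians(mu0, var0, mu1, var1):
    # KL( N(mu0, diag(var0)) || N(mu1, diag(var1)) ), closed form for diagonal covariances
    return 0.5 * np.sum(np.log(var1 / var0) + (var0 + (mu0 - mu1) ** 2) / var1 - 1.0)

dim = 16
mu_ref, var_ref = np.zeros(dim), np.ones(dim)         # reference distribution: the prior N(0, I)

def anomaly_score(mu_enc, var_enc):
    # mu_enc, var_enc: mean and variance output by the VAE encoder for one input
    return kld_diag_gaussians(mu_enc, var_enc, mu_ref, var_ref)

print(anomaly_score(np.zeros(dim), np.ones(dim)))      # 0.0 for data mapped exactly onto the prior
print(anomaly_score(np.full(dim, 3.0), np.ones(dim)))  # large for data far from the learned region
```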
  • The above (Cases 1 to 4) are methods based on threshold processing, using either an error-rate threshold in the tail of the distribution or the intersection threshold of the histograms of the learned/unlearned data sets.
  • [Case 5] FIG. 17 is a diagram showing a threshold value by LOF (Local Outlier Factor), which is based on the nearest-neighbor method.
  • the horizontal axis is LOF.
  • the position of the broken line L2 at the right end of the histogram of the trained data set is the threshold value.
  • the intersection (position of the broken line L1) in the histogram of the trained / unlearned data set may be set as the threshold value.
  • The determination unit 12 uses either of these threshold values to determine whether new input data has been learned. Specifically, the determination unit 12 calculates the LOF of the latent variable Z (that is, the compressed data) of the new input data; if the calculated value is equal to or greater than the threshold value, the data is determined to be unlearned, and if it is less than the threshold value, the data is determined to be learned.
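  • A sketch of the Case 5 determination with scikit-learn's LocalOutlierFactor is shown below; the neighbor count, the stand-in latent vectors, and the 99th-percentile threshold (standing in for the right edge of the learned histogram) are assumptions.

```python
# Sketch of the Case 5 learned determination using LOF, in scikit-learn's novelty-detection mode.
import numpy as np
from sklearn.neighbors import LocalOutlierFactor

dim = 16
z_learned = np.random.randn(2000, dim)                  # stand-in for latent vectors of learned data

lof = LocalOutlierFactor(n_neighbors=20, novelty=True)  # novelty=True allows scoring new samples
lof.fit(z_learned)

def lof_score(z):
    # -score_samples is the LOF value: larger means more outlier-like
    return -lof.score_samples(z.reshape(1, -1))[0]

# LOF values of the training (learned) data; their upper quantile stands in for the histogram edge
threshold = np.quantile(-lof.negative_outlier_factor_, 0.99)

def is_learned(z):
    return lof_score(z) < threshold
```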
  • [Case 6] One-class SVM is a method conceived to apply the SVM (Support Vector Machine), which is a two-class classifier, to discriminating the training data from other data (One-class SVM with non-linear kernel (RBF); see https://scikit-learn.org/stable/auto_examples/svm/plot_oneclass.html#sphx-glr-auto-examples-svm-plot-oneclass-py, etc.).
  • In Case 6, the One-class SVM is used as the anomaly score function with the latent variable Z as the input.
  • FIG. 18 is a diagram showing a threshold value by One-class SVM.
  • the horizontal axis is the score by One-class SVM.
  • the position of the broken line L2 at the right end of the histogram of the trained data set is the threshold value.
  • the intersection (position of the broken line L1) in the histogram of the trained / unlearned data set may be set as the threshold value.
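  • A corresponding sketch of Case 6 using scikit-learn's OneClassSVM with an RBF kernel, in the spirit of the example cited above, follows; the parameter values and the threshold choice are assumptions.

```python
# Sketch of the Case 6 learned determination using a One-class SVM with an RBF kernel.
import numpy as np
from sklearn.svm import OneClassSVM

dim = 16
z_learned = np.random.randn(2000, dim)                # stand-in for latent vectors of learned data

ocsvm = OneClassSVM(kernel="rbf", nu=0.01, gamma="scale").fit(z_learned)

def anomaly_score(z):
    # decision_function is positive inside the learned region and negative outside,
    # so its negation is used as an anomaly score (larger = more anomalous).
    return -ocsvm.decision_function(z.reshape(1, -1))[0]

threshold = np.quantile(-ocsvm.decision_function(z_learned), 0.99)  # e.g. right edge of learned histogram

def is_learned(z):
    return anomaly_score(z) < threshold
```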
  • Isolation Forest is a method conceived to apply Random Forest to anomaly detection (IsolationForest example, https://scikit-learn.org/stable/auto_examples/ensemble/plot_isolation_forest.html, etc.). In Case 7, Isolation Forest is used as an anomalous score function with the latent variable Z as input.
  • FIG. 19 is a diagram showing a threshold value by Isolation Forest.
  • the horizontal axis is the score by Isolation Forest.
  • the position of the broken line L2 at the right end of the histogram of the trained data set is the threshold value.
  • the intersection (position of the broken line L1) in the histogram of the trained / unlearned data set may be set as the threshold value.
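  • Similarly, Case 7 can be sketched with scikit-learn's IsolationForest as follows; again the parameter values and the threshold choice are assumptions.

```python
# Sketch of the Case 7 learned determination using Isolation Forest.
import numpy as np
from sklearn.ensemble import IsolationForest

dim = 16
z_learned = np.random.randn(2000, dim)                # stand-in for latent vectors of learned data

iforest = IsolationForest(n_estimators=100, random_state=0).fit(z_learned)

def anomaly_score(z):
    # score_samples is higher for normal points, so its negation is used as an anomaly score
    return -iforest.score_samples(z.reshape(1, -1))[0]

threshold = np.quantile(-iforest.score_samples(z_learned), 0.99)  # e.g. right edge of learned histogram

def is_learned(z):
    return anomaly_score(z) < threshold
```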
  • Other anomaly detection methods that can be performed using only learned data (known data) can also be used for the learned determination (https://scikit-learn.org/stable/auto_examples/plot_anomaly_comparison.html#sphx-glr-auto-examples-plot-anomaly-comparison-py; Takeshi Ide, Masashi Sugiyama, "Anomaly Detection and Change Detection", Kodansha, etc.).
  • FIG. 20 shows the evaluation results of the misrecognition rate in the experiments conducted by the inventor of the present application for Cases 1 to 3 and Cases 5 to 7 described above. It should be noted that the reason why there is no value in the item of "distribution" for cases 5 to 7 is that the abnormal score of cases 5 to 7 is not an abnormal score based on the distribution of the latent variable Z.
  • FIG. 21 is a flowchart for explaining an example of a processing procedure at the time of communication of compressed data.
  • The compression unit 21 compresses the sensor data using the encoder of the target AE to generate compressed data (S102).
  • the transmission unit 22 transmits the compressed data to the center 10.
  • The determination unit 12 executes the learned determination (that is, the restorability determination) for the target data by comparing the value obtained by applying the anomaly score function to the compressed data (that is, the value of the latent variable Z) (hereinafter, the "anomaly score") with the threshold value corresponding to that anomaly score function (S103).
  • the threshold value is set in advance.
  • When the target data is determined to be learned, the restoration unit 13 restores the compressed data using the decoder of the target AE, thereby generating restored data (S104). Subsequently, the restoration unit 13 outputs the restored data to the classification unit 14 and stores the restored data in the data storage unit 16 (S105). As a result, a set of restored data related to the sensor data determined to be learned (restorable) is accumulated in the data storage unit 16.
  • The classification unit 14 classifies the restored data (S106). If the sensor data is an image of handwritten digits, classification determines which digit is contained in the restored data.
  • On the other hand, when the target data is determined to be unlearned, the receiving unit 11 acquires the target data (the uncompressed sensor data) from the sensor node 20 and stores it in the data storage unit 16 (S107). For example, the receiving unit 11 requests the sensor node 20 to transmit the target data, the transmission unit 22 of the sensor node 20 transmits the target data to the center 10 in response to the request, and the receiving unit 11 stores the received target data in the data storage unit 16. As a result, a set of sensor data determined to be unlearned (unrestorable) is accumulated in the data storage unit 16.
  • Therefore, it is necessary to prepare a communication path capable of sending the sensor data itself to the center 10, and a certain amount of memory for holding the target data in the sensor node 20 in preparation for a transmission request of the target data.
  • the communication path may be one for normal compressed data transmission. Further, if the learned determination and the transmission request of the target data can be processed in real time, no memory is particularly required.
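  • The center-side flow of steps S103 to S107 described above can be summarized by the following Python sketch. The functions anomaly_score, decoder, classify, and request_raw_data are hypothetical placeholders standing in for the determination unit 12, the restoration unit 13, the classification unit 14, and the transmission request to the sensor node 20, respectively.

```python
# Sketch of the center-side flow of FIG. 21 (steps S103-S107).
# All callables and stores passed in are placeholders for the components described above.
def on_compressed_data_received(z, threshold, anomaly_score, decoder, classify,
                                request_raw_data, restored_store, unlearned_store):
    if anomaly_score(z) < threshold:      # S103: learned determination (restorability determination)
        x_hat = decoder(z)                # S104: restoration unit generates restored data
        restored_store.append(x_hat)      # S105: store restored data
        return classify(x_hat)            # S106: classification unit classifies restored data
    else:
        x_raw = request_raw_data()        # S107: request and store the unlearned sensor data
        unlearned_store.append(x_raw)
        return None
```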
  • Subsequently, the learning unit 15 additionally trains or retrains the target AE using the data sets stored in the data storage unit 16 (the set of restored data and the set of unlearned data) together with the learned data set used for training the initial target AE (S109).
  • a known method may be used for learning the target AE.
  • the learning unit 15 executes a process for updating the target AE (S110). Specifically, the learning unit 15 updates the model parameters of the decoder as the restoration unit 13 to the values after additional learning or re-learning. Further, the learning unit 15 transmits the model parameters of the encoder as the compression unit 21 to the sensor node 20. The compression unit 21 of the sensor node 20 updates the model parameter of the encoder according to the received value.
  • As a result, the target AE becomes able to restore the previously unlearned data while still being able to restore the original learned data.
  • Next, the second embodiment will be described, focusing on the differences from the first embodiment.
  • the points not particularly mentioned in the second embodiment may be the same as those in the first embodiment.
  • FIG. 22 is a diagram showing a functional configuration example of the sensor network according to the second embodiment.
  • the same parts as those in FIG. 3 are designated by the same reference numerals, and the description thereof will be omitted.
  • the sensor node 20 further has a determination unit 23.
  • the determination unit 23 makes a learned determination in the same manner as the determination unit 12.
  • the center 10 does not have the determination unit 12. That is, the second embodiment is an example in which the learned determination is performed at the sensor node 20.
  • the compressed data is not transmitted in step S102, and the determination unit 23 of the sensor node 20 makes a learned determination in step S103.
  • When it is determined that the data has been learned, the transmission unit 22 transmits the compressed data to the center 10, and step S104 and the subsequent steps are executed.
  • When it is determined that the data is unlearned, the transmission unit 22 transmits the target data (the uncompressed sensor data) to the center 10.
  • The receiving unit 11 of the center 10 receives the target data and stores it in the data storage unit 16.
  • The structure of the target VAE is not limited to that shown in FIG. 7; for example, a structure that does not use a CNN may be adopted.
  • the evaluation results by the present inventor are shown for compression restoration by a plurality of types of AEs.
  • FIG. 23 is a diagram showing the evaluation results of compression / restoration by a plurality of types of AEs.
  • the input data, the true value of the classification label of the input data, the restored data, and the classification label output are shown for the five types of AEs of AE, CNN-AE, VAE, CVAE, and CNN + FC-VAE.
  • AE indicates a fully connected AE,
  • CNN-AE indicates an AE with a CNN structure,
  • VAE indicates a fully connected VAE,
  • CVAE indicates a fully connected conditional VAE, and
  • CNN+FC-VAE indicates the target VAE, that is, a VAE that combines a CNN structure with fully connected layers.
  • The AE applicable to each of the above embodiments is not limited to the specific VAE described above. Further, the AE is not limited to a VAE: any AE for which a distribution can be defined in the space of the latent variable Z can be used as a compressor/restorer capable of the learned determination. Moreover, with the methods of Case 5 and later, an AE can be used as a compressor/restorer capable of the learned determination even if no distribution can be specified in the latent variable Z space.
  • As described above, according to the present embodiment, data determined to be unlearned (unrestorable) by the learned determination (restorability determination) is collected, and the compressor/restorer is additionally trained or retrained using it. This makes it possible for the compressor/restorer to handle previously unlearned data.
  • Each of the above embodiments may also be applied to various kinds of data to be compressed other than data transmitted from the sensor node 20 to the center 10.
  • the center 10 and the sensor node 20 are examples of the restoreability determination device.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Image Analysis (AREA)

Abstract

In the present invention, a computer executes a determination procedure in which an anomaly detection technique is applied to compressed data generated by compressing input data using an encoder of a trained autoencoder, and it is thereby determined whether the compressed data can be restored. By executing the determination procedure, restorability can be determined on the basis of the compressed data.
PCT/JP2020/023094 2020-06-11 2020-06-11 Procédé de détermination de faisabilité de décompression, dispositif de détermination de faisabilité de décompression et programme WO2021250868A1 (fr)

Priority Applications (2)

Application Number Priority Date Filing Date Title
JP2022529972A JPWO2021250868A1 (fr) 2020-06-11 2020-06-11
PCT/JP2020/023094 WO2021250868A1 (fr) 2020-06-11 2020-06-11 Procédé de détermination de faisabilité de décompression, dispositif de détermination de faisabilité de décompression et programme

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/JP2020/023094 WO2021250868A1 (fr) 2020-06-11 2020-06-11 Procédé de détermination de faisabilité de décompression, dispositif de détermination de faisabilité de décompression et programme

Publications (1)

Publication Number Publication Date
WO2021250868A1 true WO2021250868A1 (fr) 2021-12-16

Family

ID=78847132

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2020/023094 WO2021250868A1 (fr) 2020-06-11 2020-06-11 Procédé de détermination de faisabilité de décompression, dispositif de détermination de faisabilité de décompression et programme

Country Status (2)

Country Link
JP (1) JPWO2021250868A1 (fr)
WO (1) WO2021250868A1 (fr)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114547970A (zh) * 2022-01-25 2022-05-27 中国长江三峡集团有限公司 一种水电厂顶盖排水系统异常智能诊断方法
JP7385840B1 (ja) * 2023-06-30 2023-11-24 株式会社Supwat プログラム、情報処理装置、方法、システム


Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2017094267A1 (fr) * 2015-12-01 2017-06-08 株式会社Preferred Networks Système de détection d'anomalie, procédé de détection d'anomalie, programme de détection d'anomalie et procédé de génération de modèle appris
JP2018152004A (ja) * 2017-03-15 2018-09-27 富士ゼロックス株式会社 情報処理装置及びプログラム
JP2019101781A (ja) * 2017-12-04 2019-06-24 日本電信電話株式会社 検知システム、学習方法及び学習プログラム

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114547970A (zh) * 2022-01-25 2022-05-27 中国长江三峡集团有限公司 一种水电厂顶盖排水系统异常智能诊断方法
CN114547970B (zh) * 2022-01-25 2024-02-20 中国长江三峡集团有限公司 一种水电厂顶盖排水系统异常智能诊断方法
JP7385840B1 (ja) * 2023-06-30 2023-11-24 株式会社Supwat プログラム、情報処理装置、方法、システム

Also Published As

Publication number Publication date
JPWO2021250868A1 (fr) 2021-12-16

Similar Documents

Publication Publication Date Title
Denouden et al. Improving reconstruction autoencoder out-of-distribution detection with mahalanobis distance
Belghazi et al. Mutual information neural estimation
EP3276540B1 (fr) Procédé et appareil de réseau neuronal
US11822579B2 (en) Apparatus for functioning as sensor node and data center, sensor network, communication method and program
Min et al. Network anomaly detection using memory-augmented deep autoencoder
US20200349673A1 (en) Method for processing image for improving the quality of the image and apparatus for performing the same
Nagaraj et al. Competent ultra data compression by enhanced features excerption using deep learning techniques
WO2021250868A1 (fr) Procédé de détermination de faisabilité de décompression, dispositif de détermination de faisabilité de décompression et programme
KR102412829B1 (ko) 개인 정보 보호를 위하여 원본 데이터를 변조하는 변조 네트워크를 학습하는 방법 및 테스팅하는 방법, 그리고, 이를 이용한 학습 장치 및 테스팅 장치
CN116310473B (zh) 一种基于误差缓解的量子神经网络的图像分类方法
Berg et al. Searching for Hidden Messages: Automatic Detection of Steganography.
JP7439944B2 (ja) 復元可否判定方法、復元可否判定装置及びプログラム
CN115632660B (zh) 一种数据压缩方法、装置、设备及介质
Miranda et al. Hyperdimensional computing encoding schemes for improved image classification
CN113191380B (zh) 一种基于多视角特征的图像取证方法及系统
KR102242904B1 (ko) 압축 알고리즘의 파라미터를 추정하는 방법 및 장치
Boufounos et al. Universal embeddings for kernel machine classification
Zhao et al. Genetic simulated annealing-based kernel vector quantization algorithm
Benbarrad et al. Impact of standard image compression on the performance of image classification with deep learning
Salunkhe et al. An efficient video steganography for pixel location optimization using Fr-WEWO algorithm based deep CNN model
Kokalj-Filipovic et al. Generative Lossy Sensor Data Reconstructions for Robust Deep Inference
CN118353723B (zh) 攻击检测方法、装置、设备及介质
US20240291503A1 (en) System and method for multi-type data compression or decompression with a virtual management layer
Manikandan et al. Classification of Intrusion Affected Stego Images Over a Channel Using Deep Learning Techniques
Pandit et al. Improving the Security of Data in the Internet of Things by Performing Data Aggregation Using Neural Network-Based Autoencoders

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20939739

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 2022529972

Country of ref document: JP

Kind code of ref document: A

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20939739

Country of ref document: EP

Kind code of ref document: A1