CN112200307A - Recognizer processing method based on picture data expansion - Google Patents

Recognizer processing method based on picture data expansion

Info

Publication number
CN112200307A
CN112200307A (application CN202011111459.0A)
Authority
CN
China
Prior art keywords
data
recognizer
data expansion
expansion
network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011111459.0A
Other languages
Chinese (zh)
Inventor
纪雪飞
王珏
李业
孙强
丁瑞
徐晨
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nantong University
Original Assignee
Nantong University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nantong University filed Critical Nantong University
Priority to CN202011111459.0A priority Critical patent/CN112200307A/en
Publication of CN112200307A publication Critical patent/CN112200307A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Artificial Intelligence (AREA)
  • Software Systems (AREA)
  • Evolutionary Computation (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computing Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Biophysics (AREA)
  • Molecular Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Biomedical Technology (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Medical Informatics (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a recognizer processing method based on picture data expansion, belonging to the field of machine learning, and comprising the following steps. Data set preparation: make the existing multiple types of picture data into a data set and attach labels. Data expansion: input the data set consisting of pictures and labels into a generating network, which completes the data expansion; the expanded data is then sent, together with the original data, to train the weight parameters of a recognizer. Feedback: feed the recognizer's test result on the original data back to the data expansion module, and dynamically adjust the weight parameters of the data expansion module. Taking the complete training of one batch of data as a period, repeat the expansion and feedback steps until the preset number of training periods is reached. By using data expansion, a limited number of samples is expanded, and the original data and the expanded data together train the weight parameters of the recognizer; the method improves the recognition accuracy of the recognizer in few-sample scenarios.

Description

Recognizer processing method based on picture data expansion
Technical Field
The invention relates to the technical field of machine learning, in particular to a recognizer processing method based on picture data expansion.
Background
Deep learning, a branch of machine learning, is a class of algorithms that has received much attention in recent years. Image recognition is an active research topic; compared with traditional recognition methods based on hand-crafted feature extraction, deep learning achieves higher accuracy and faster recognition. In deep learning, the amount of data usually determines the recognition accuracy of the network: a network trained on massive data has better generalization ability, more mature weight parameters, and stronger recognition of unseen data. In some cases, however, the number of available samples is very limited, and how to train a better network with fewer training samples is an important open question. On the other hand, to extract more feature information the depth of the network is increased, which brings the problem of overfitting; the most direct remedy for overfitting is to increase the size of the training data set.
There are two common data expansion algorithms: data regeneration based on the generative adversarial network (GAN) and data expansion based on the variational autoencoder (VAE). The GAN, motivated by game theory, sets up a generator and a discriminator: the discriminator tries to distinguish data produced by the generator from real data, while the generator tries to produce data that the discriminator cannot tell apart; the two are trained adversarially and optimized together. The VAE takes a probabilistic view: it extracts the mean and variance of the original data and synthesizes new data from them. Both methods operate on image data, and their main purpose, from the standpoint of the generated data, is to produce data that is close to, yet different from, the original data. Two kinds of indexes are currently used to assess the quality of generated data: diversity and clarity. Diversity reflects the differences among generated images, i.e., whether identical data is being generated; clarity reflects the sharpness of the generated images.
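As a rough illustration of the VAE idea described above (extracting a mean and variance from the original data and synthesizing new data around them), the following numpy sketch works on a toy data set; all names and values here are illustrative and are not from the patent:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "original data": 200 samples of a 4-dimensional feature vector.
x = rng.normal(loc=2.0, scale=0.5, size=(200, 4))

# A real VAE learns these statistics with an encoder network;
# here they are simply estimated from the data.
mu = x.mean(axis=0)
sigma = x.std(axis=0)

# Synthesize new data by sampling around the extracted statistics.
eps = rng.standard_normal(size=(100, 4))
x_new = mu + sigma * eps

print(x_new.shape)  # (100, 4)
```

The synthesized samples follow the same first- and second-order statistics as the originals while differing sample by sample, which is the "close to, yet different from" property described above.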
Disclosure of Invention
The invention aims to provide a recognizer processing method based on picture data expansion, which trains the recognizer network with added generated data and improves the generalization ability of the recognizer.
To achieve this purpose, the invention adopts the following technical scheme. The recognizer processing method based on picture data expansion comprises the following steps:
s1, data set preparation: making the existing multiple types of picture data into a data set and attaching labels;
s2, data expansion: inputting the data set consisting of pictures and labels into a generating network, which completes the data expansion; the expanded data set is then sent, together with the original data, to train the weight parameters of the recognizer;
s3, feedback: feeding the test result of the recognizer on the original data back to the data expansion module, and dynamically adjusting the weight parameters of the data expansion module; taking the complete training of one batch of data as a period, repeating the expansion and feedback steps until the preset number of training periods is reached;
wherein, a feedback structure is formed between the data expansion module and the recognizer module.
Furthermore, the data expansion module adopts a deep learning method, and its deep learning network is a conditional variational autoencoder (CVAE).
Further, the recognizer adopts a deep learning method, and the deep learning network is a convolutional neural network.
Further, the feedback structure calculates a loss value between the output distribution of the identifier and the data tag distribution, and uses the loss value as a part of a loss function of the data expansion module.
Furthermore, the data expansion module comprises an encoding unit and a decoding unit; the encoding unit samples the posterior distribution of the original data to obtain an intermediate variable, and the decoding unit reconstructs the original data from the distribution of the intermediate variable to obtain expanded data.
Furthermore, a feedback structure is formed between the decoding unit and the identifier, the weight parameters of the decoding unit are adjusted, and the generated data reconstructed by the decoder is optimized.
Further, the recognizer trains the weight parameters of its network with the expanded data together with the raw data.
The invention has the beneficial effects that:
the invention utilizes a data expansion method to expand the existing data, then trains the recognizer network together with the expanded data, feeds back the recognition result to the data expansion module, and dynamically adjusts the weight parameters of the data expansion module. Compared with the method that after thousands of training, the generated countermeasure network trains the recognizer by using the expanded data, the efficiency is lower. The method can obviously improve the generalization ability of the recognizer only by hundreds of times of training, improves the recognition accuracy of the recognizer, and is simple and convenient and high in efficiency.
Drawings
FIG. 1 is a flow chart of a method of the present invention;
FIG. 2 is a diagram of a conventional recognizer trained without the expanded data;
FIG. 3 is a diagram of recognizers, as trained by other researchers, with data expanded by a generative adversarial network;
FIG. 4 shows the test accuracy of the network trained only with the original data in scenario 1;
FIG. 5 shows the test accuracy of the network trained with expanded data generated by a generative adversarial network in scenario 1;
FIG. 6 shows the test accuracy of the network trained with expanded data generated by the present method in scenario 1;
FIG. 7 shows the test accuracy of the network trained with the original data plus expanded data from the present method in scenario 2.
Detailed Description
The invention will be further described with reference to the accompanying drawings.
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention will be described in detail with reference to the accompanying drawings and the detailed description. It should be understood that the detailed description and specific examples, while indicating the invention, are intended for purposes of illustration only and are not intended to limit the scope of the invention.
Fig. 1 is a block diagram of an algorithm, the functions of the different modules being as follows:
The encoder obtains the mean and variance from the original data, recorded as μ_z and σ_z; the parameter to be updated by the encoder is φ.
The decoder uses the intermediate variables to generate the extended data. The parameter of the decoder is θ.
The recognizer uses a convolutional neural network to extract the feature information of the signal, and then uses a logistic regression layer to output probability values for the different types of labels. θ_C denotes the parameters of the recognizer.
The feedback structure informs the decoder of the decision result of the identifier so as to adjust the weight parameters of the decoder.
The algorithm has two steps: first, train the recognizer with the generated data and the original data; second, recognize newly regenerated data with the recognizer and feed the result back to the decoder to adjust the decoder's parameters. The algorithm flow is as follows:
A batch of data is randomly extracted from the data set, where x and y are the pictures and labels respectively; the labels are preprocessed with one-hot coding. This is recorded as {x, y} ~ P_data, i.e., the distribution of x, y follows the distribution of the original data set.
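The one-hot preprocessing of labels mentioned in this step can be sketched as follows (a minimal numpy illustration, not the patent's implementation):

```python
import numpy as np

def one_hot(labels, num_classes):
    """Convert integer class labels to one-hot row vectors."""
    out = np.zeros((len(labels), num_classes))
    out[np.arange(len(labels)), labels] = 1.0
    return out

y = np.array([0, 2, 1, 2])   # e.g. modulation-type indices (illustrative)
print(one_hot(y, 3))         # each row has a single 1 in its class column
```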
The posterior probability of the intermediate variable z is q_φ(z|x, y); a prior probability p_θ(z) is assumed for z, and p_θ(z) follows a standard normal distribution.
Sampling from the probability distribution q_φ(z|x, y) yields z:

z = g_φ(x, y, ε) = μ_z + σ_z · ε (1)

where g_φ(x, y, ε) represents sampling from the distribution q_φ(z|x, y), and ε follows a standard normal distribution.
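Equation (1) is the standard reparameterization trick; a minimal numpy sketch, with illustrative values for μ_z and σ_z, is:

```python
import numpy as np

def sample_z(mu_z, sigma_z, rng):
    """Reparameterized sampling of Eq. (1): z = mu_z + sigma_z * eps, eps ~ N(0, I)."""
    eps = rng.standard_normal(mu_z.shape)
    return mu_z + sigma_z * eps

rng = np.random.default_rng(42)
mu_z = np.array([0.5, -1.0])     # illustrative encoder output mean
sigma_z = np.array([0.1, 0.2])   # illustrative encoder output std
z = sample_z(mu_z, sigma_z, rng)
print(z.shape)  # (2,)
```

Writing the sample as a deterministic function of (μ_z, σ_z, ε) is what makes the sampling step differentiable with respect to the encoder parameters φ.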
The loss function of the encoder can be written as:

L_KL = KL(q_φ(z|x, y) ‖ p_θ(z)) (2)

where KL(·) denotes the KL divergence, i.e., the relative entropy of q_φ(z|x, y) with respect to p_θ(z); the aim is for the posterior distribution of the intermediate variable to approximate the standard normal distribution.
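For a diagonal Gaussian posterior, the KL divergence in Eq. (2) has the closed form 0.5 · Σ(σ² + μ² − 1 − log σ²); a small numpy sketch (illustrative values only):

```python
import numpy as np

def kl_to_standard_normal(mu, sigma):
    """Closed-form KL(N(mu, diag(sigma^2)) || N(0, I)) for a diagonal Gaussian,
    as in the encoder loss of Eq. (2)."""
    return 0.5 * np.sum(sigma**2 + mu**2 - 1.0 - np.log(sigma**2))

# The KL term vanishes exactly when the posterior already matches N(0, I) ...
print(kl_to_standard_normal(np.zeros(2), np.ones(2)))   # 0.0
# ... and is positive otherwise.
print(kl_to_standard_normal(np.array([0.5, -1.0]), np.array([0.1, 0.2])) > 0)  # True
```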
Sampling from the probability distribution p_θ(x̂|z, y), the generated data is recorded as x̂ and is produced by the decoder:

x̂ = D(z, y) (3)

where D(·) denotes the decoder.
The newly generated data is then identified with the recognizer, whose loss function is:

L_C = E_{q_φ(z|x,y)}[ E( p_{θ_C}(ŷ|x̂), y ) ] (4)

where ŷ represents the category output by the recognizer, p_{θ_C}(ŷ|x̂) represents the probabilities that the recognizer outputs for the different types of labels given the input x̂, and E(·) is the cross-entropy loss function, computed under the distribution q_φ(z|x, y). The aim is for the generated data x̂ to approach the distribution of x.
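The cross-entropy E(·) appearing in Eq. (4) can be sketched in numpy as follows (the probabilities and labels below are hypothetical):

```python
import numpy as np

def cross_entropy(probs, y_onehot, eps=1e-12):
    """Mean cross-entropy between predicted label probabilities and one-hot
    targets, playing the role of E(.) in Eq. (4)."""
    return -np.mean(np.sum(y_onehot * np.log(probs + eps), axis=1))

probs = np.array([[0.7, 0.2, 0.1],    # hypothetical recognizer outputs
                  [0.1, 0.8, 0.1]])
y = np.array([[1.0, 0.0, 0.0],        # one-hot labels
              [0.0, 1.0, 0.0]])
print(round(cross_entropy(probs, y), 4))  # 0.2899
```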
Accordingly, the loss function of the decoder can be written as the expected reconstruction error of the original data under the posterior:

L_D = E_{q_φ(z|x,y)}[ −log p_θ(x|z, y) ] (5)
then, the loss function of the encoder is reversely differentiated by using a gradient descent algorithm, and the weight parameters of the encoder are updated:
Figure BDA00027287320500000511
where λ is an empirical parameter, it may be taken to be 0.5.
The recognizer is updated with the gradient descent algorithm:

θ_C ← θ_C − η ∇_{θ_C} L_C (7)
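The updates in Eqs. (6) and (7) are ordinary gradient-descent steps. The following toy numpy sketch shows the update rule θ ← θ − η·∇L on a quadratic loss chosen purely for illustration:

```python
import numpy as np

def gd_step(theta, grad, lr=0.1):
    """One gradient-descent update: theta <- theta - lr * grad."""
    return theta - lr * grad

# Toy loss L(theta) = ||theta||^2 / 2, whose gradient is theta itself.
theta = np.array([4.0, -2.0])
for _ in range(100):
    theta = gd_step(theta, grad=theta)
print(np.allclose(theta, 0.0, atol=1e-2))  # the parameters approach the minimum
```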
the encoder again generates a new batch of data (step two), which is recorded as
Figure BDA00027287320500000513
Figure BDA00027287320500000514
The loss function of the feedback structure is:

L_DC = E( p_{θ_C}(ỹ|x̃), y ) (9)

where ỹ represents the recognition result for the input data x̃, and p_{θ_C}(ỹ|x̃) represents the probabilities that the recognizer outputs for the different types of labels.
The gradient descent algorithm is then used to update the parameters of the decoder:

θ ← θ − η ∇_θ L_DC (10)
the complete loss function is as follows:
L=LCVAE+LC+LDC (11)
repeating the steps (1) to (11) until a preset training period is reached.
And saving the weight parameter of the last training as the final training weight.
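The overall procedure of steps (1)-(11) can be outlined as a training-loop skeleton. Every module below is a deliberately trivial stub standing in for the real encoder, decoder, and recognizer networks, so this is scaffolding under stated assumptions rather than the patent's implementation:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, num_classes, epochs = 64, 8, 3, 5

# Toy data set: "pictures" flattened to d features, with one-hot labels.
x = rng.normal(size=(n, d))
y = np.eye(num_classes)[rng.integers(0, num_classes, size=n)]

def encode(batch):
    """Stub encoder: per-feature mean/std of the batch (stands in for mu_z, sigma_z)."""
    return batch.mean(axis=0), batch.std(axis=0)

def decode(z, labels):
    """Stub decoder: perturb z to 'reconstruct' expanded data (labels unused here)."""
    return z + 0.1 * rng.standard_normal(z.shape)

def recognize(batch):
    """Stub recognizer: uniform label probabilities (a real one would be a CNN)."""
    return np.full((len(batch), num_classes), 1.0 / num_classes)

losses = []
for epoch in range(epochs):
    mu_z, sigma_z = encode(x)                          # posterior statistics
    z = mu_z + sigma_z * rng.standard_normal((n, d))   # sampling, cf. Eq. (1)
    x_aug = decode(z, y)                               # expanded data, cf. Eq. (3)
    probs = recognize(np.vstack([x, x_aug]))           # recognizer sees both sets
    y_all = np.vstack([y, y])
    l_c = -np.mean(np.sum(y_all * np.log(probs), axis=1))   # cf. Eq. (4)
    probs_fb = recognize(decode(z, y))                 # feedback pass, cf. Eq. (9)
    l_dc = -np.mean(np.sum(y * np.log(probs_fb), axis=1))
    losses.append(l_c + l_dc)                          # cf. Eq. (11)

print(len(losses))  # one total-loss value per training period
```

In a real implementation the stubs would be trainable networks and each loss would drive a gradient update as in Eqs. (6), (7), and (10); the skeleton only shows how the expansion, recognition, and feedback passes interleave within one period.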
For comparison, this embodiment also trains a recognizer without data expansion, as shown in FIG. 2, and considers recognizers trained by other researchers with data expanded by a generative adversarial network, as in FIG. 3.
Test cases:
To test the effectiveness of the algorithm provided by this patent, actual picture recognition is taken as an example.
A transmitted signal is modulated with common modulation types to obtain received signals under different channel environments; the signals can be represented as pictures, and the purpose of this case is to recover the corresponding modulation type from the received signal.
The collected data is divided into a training set and a test set in different proportions; the training set is used to train the model, and the test set is used to measure the model's recognition accuracy. The recognizer adopts a very simple deep learning network with one convolutional layer and one fully connected layer.
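The split by proportion described above can be sketched as follows (an illustrative helper, not the patent's code):

```python
import numpy as np

def train_test_split(x, y, train_ratio, rng):
    """Shuffle a data set and split it by the given training-set proportion."""
    idx = rng.permutation(len(x))
    cut = int(train_ratio * len(x))
    return x[idx[:cut]], y[idx[:cut]], x[idx[cut:]], y[idx[cut:]]

rng = np.random.default_rng(1)
pictures = np.arange(300).reshape(300, 1)   # e.g. 300 pictures of one modulation type
labels = np.arange(300) % 5                 # hypothetical label indices
x_tr, y_tr, x_te, y_te = train_test_split(pictures, labels, 0.8, rng)
print(len(x_tr), len(x_te))  # 240 60
```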
The test scenarios are divided into the following two types:
the recognition accuracy under the condition of additive white Gaussian noise simulation is that the sum of training data and test data of each modulation type is 300 pictures.
And actually measuring the identification accuracy under the indoor channel condition, wherein the sum of the training data and the test data of each modulation type is 500 pictures.
And (3) testing results:
Under the additive white Gaussian noise channel, FIG. 4, FIG. 5 and FIG. 6 show the relationship between test accuracy and picture quality when using only the original data, when adding data generated by a generative adversarial network, and when adding data generated by the present method, respectively. The legends give the split ratio of the training and test sets. Test accuracy increases as the training set grows; accuracy varies with picture quality and reaches 100% when the picture quality is good; and adding expanded data further improves accuracy, especially when the picture quality is poor. Compared with expansion by a generative adversarial network, the present method performs better.
Under measured indoor channel conditions, the test results are shown in FIG. 7. The recognizer provided by this patent again outperforms the traditional recognizer trained only with the raw data.
The above description is only for the purpose of illustrating the technical solutions of the present invention and not for the purpose of limiting the same, and other modifications or equivalent substitutions made by those skilled in the art to the technical solutions of the present invention should be covered within the scope of the claims of the present invention without departing from the spirit and scope of the technical solutions of the present invention.

Claims (7)

1. A recognizer processing method based on picture data expansion is characterized by comprising the following steps:
s1, data set preparation: making the existing multiple types of picture data into a data set and attaching labels;
s2, data expansion: inputting the data set consisting of pictures and labels into a generating network, which completes the data expansion; the expanded data set is then sent, together with the original data, to train the weight parameters of the recognizer;
s3, feedback: feeding the test result of the recognizer on the original data back to the data expansion module, and dynamically adjusting the weight parameters of the data expansion module; taking the complete training of one batch of data as a period, repeating the expansion and feedback steps until the preset number of training periods is reached;
wherein, a feedback structure is formed between the data expansion module and the recognizer module.
2. The picture data expansion-based recognizer processing method according to claim 1, wherein: the data expansion module adopts a deep learning method, and its deep learning network is a conditional variational autoencoder (CVAE).
3. The picture data expansion-based recognizer processing method according to claim 1, wherein: the recognizer adopts a deep learning method, and the deep learning network is a convolutional neural network.
4. The picture data expansion-based recognizer processing method according to claim 1, wherein: the feedback structure calculates a loss value between the output distribution of the recognizer and the data tag distribution, and uses the loss value as a part of a loss function of the data expansion module.
5. The picture data expansion-based recognizer processing method according to claim 1, wherein: the data expansion module comprises an encoding unit and a decoding unit; the encoding unit samples the posterior distribution of the original data to obtain an intermediate variable, and the decoding unit reconstructs the original data from the distribution of the intermediate variable to obtain expanded data.
6. The picture data expansion-based recognizer processing method according to claim 5, wherein: and a feedback structure is formed between the decoding unit and the identifier, the weight parameter of the decoding unit is adjusted, and the generated data reconstructed by the decoder is optimized.
7. The picture data expansion-based recognizer processing method according to claim 6, wherein: the recognizer trains the weight parameters of its network with the expanded data together with the raw data.
CN202011111459.0A 2020-10-16 2020-10-16 Recognizer processing method based on picture data expansion Pending CN112200307A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011111459.0A CN112200307A (en) 2020-10-16 2020-10-16 Recognizer processing method based on picture data expansion

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011111459.0A CN112200307A (en) 2020-10-16 2020-10-16 Recognizer processing method based on picture data expansion

Publications (1)

Publication Number Publication Date
CN112200307A true CN112200307A (en) 2021-01-08

Family

ID=74010383

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011111459.0A Pending CN112200307A (en) 2020-10-16 2020-10-16 Recognizer processing method based on picture data expansion

Country Status (1)

Country Link
CN (1) CN112200307A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116881723A (en) * 2023-09-06 2023-10-13 北京城建设计发展集团股份有限公司 Data expansion method and system for existing structure response prediction

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111258992A (en) * 2020-01-09 2020-06-09 电子科技大学 Seismic data expansion method based on variational self-encoder

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111258992A (en) * 2020-01-09 2020-06-09 电子科技大学 Seismic data expansion method based on variational self-encoder

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
X. JI 等: "Data-Limited Modulation Classification With a CVAE-Enhanced Learning Model", 《IEEE COMMUNICATIONS LETTERS》, vol. 24, no. 10, pages 2191 - 2195, XP011813680, DOI: 10.1109/LCOMM.2020.3004877 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116881723A (en) * 2023-09-06 2023-10-13 北京城建设计发展集团股份有限公司 Data expansion method and system for existing structure response prediction
CN116881723B (en) * 2023-09-06 2024-02-20 北京城建设计发展集团股份有限公司 Data expansion method and system for existing structure response prediction

Similar Documents

Publication Publication Date Title
CN110287374B (en) Self-attention video abstraction method based on distribution consistency
CN111565318A (en) Video compression method based on sparse samples
CN109890043B (en) Wireless signal noise reduction method based on generative countermeasure network
CN108921285B (en) Bidirectional gate control cyclic neural network-based classification method for power quality disturbance
CN109743275B (en) Signal modulation identification method based on under-complete self-encoder
CN110135386B (en) Human body action recognition method and system based on deep learning
CN111652233B (en) Text verification code automatic identification method aiming at complex background
CN114092964A (en) Cross-domain pedestrian re-identification method based on attention guidance and multi-scale label generation
CN109581339B (en) Sonar identification method based on automatic adjustment self-coding network of brainstorming storm
CN111935042B (en) Probability shaping recognition system and method based on machine learning and receiving end
CN109800768B (en) Hash feature representation learning method of semi-supervised GAN
CN114006870A (en) Network flow identification method based on self-supervision convolution subspace clustering network
CN113627266A (en) Video pedestrian re-identification method based on Transformer space-time modeling
CN116939320B (en) Method for generating multimode mutually-friendly enhanced video semantic communication
CN115311605B (en) Semi-supervised video classification method and system based on neighbor consistency and contrast learning
Huang et al. A parallel architecture of age adversarial convolutional neural network for cross-age face recognition
CN111291705B (en) Pedestrian re-identification method crossing multiple target domains
CN112507778A (en) Loop detection method of improved bag-of-words model based on line characteristics
CN111967358A (en) Neural network gait recognition method based on attention mechanism
CN112200307A (en) Recognizer processing method based on picture data expansion
CN114972904A (en) Zero sample knowledge distillation method and system based on triple loss resistance
CN116434759B (en) Speaker identification method based on SRS-CL network
CN103295007B (en) A kind of Feature Dimension Reduction optimization method for Chinese Character Recognition
Aziz et al. Multi-level refinement feature pyramid network for scale imbalance object detection
CN111401263A (en) Expert knowledge fused optimal effect combined modulation identification method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20210108