CN113807490A - Data linear correlation judgment method based on convolutional neural network - Google Patents

Data linear correlation judgment method based on convolutional neural network

Info

Publication number
CN113807490A
Authority
CN
China
Prior art keywords
data
convolutional neural
linear
neural network
generating
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010535355.6A
Other languages
Chinese (zh)
Inventor
汪丽莉
刘烨
李大明
李伟豪
郭博研
朱子杰
田浥岐
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai University of Engineering Science
Original Assignee
Shanghai University of Engineering Science
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai University of Engineering Science
Priority to CN202010535355.6A
Publication of CN113807490A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Health & Medical Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a method for judging the linear correlation of data based on a convolutional neural network, which overcomes the limited application range of traditional analysis. The training strategy is specifically one of generating random measurement data according to a nonlinear rate and a data quality parameter and then generating images from that data. The method does not depend on statistical assumptions about the variables and can give the judgment reliability of the network under different conditions.

Description

Data linear correlation judgment method based on convolutional neural network
Technical Field
The invention relates to data linear correlation judgment, in particular to a data linear correlation judgment method based on a convolutional neural network.
Background
In recent years, the rapid growth of deep learning theory and practice has provided a research basis for establishing new linear correlation analysis methods. The convolutional neural network, a deep learning model with very high learning efficiency, is widely applied in image and speech recognition, financial analysis and scientific research, and has developed by leaps and bounds. Its strong feature extraction capability makes it a powerful tool for analysis and modeling.
Correlation analysis and regression analysis are two closely related means of analysis with important applications in the processing of scientific experimental data and in engineering practice. The objective of regression analysis is to obtain a quantitative mathematical relationship between the variables under study by data fitting. The variables may be fitted with a linear function, a nonlinear function, or a specified function. However, because the relationship between the variables is not known in advance, it is difficult to select the correct functional form for the fit, which distorts the model; and if fitting accuracy alone is pursued, the model is overfitted. Correlation analysis can provide a reasonable reference for regression analysis. In classical correlation analysis, the linear correlation coefficient based on the Pearson product-moment reflects the strength of the linear correlation between two variables, and thereby indicates how reasonable a linear regression analysis is.
However, the Pearson product-moment coefficient rests on the assumption that both variables follow a normal distribution, which greatly limits its range of application. Although the theory of correlation analysis continues to develop, there is still no linear correlation analysis method with wide applicability.
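For reference, the classical coefficient mentioned above can be computed in a few lines. The sketch below is illustrative only: the data values and the helper name `pearson_r` are assumptions, not part of the patent. It shows a related, well-known property rather than the normality issue itself: a visibly curved data set can still yield a coefficient close to 1, so the coefficient alone does not separate linear from mildly nonlinear data.

```python
import math

def pearson_r(xs, ys):
    """Pearson product-moment correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Illustrative data (not from the patent): 11 evenly spaced points on [0, 1].
xs = [i / 10 for i in range(11)]
linear = [x + 1 for x in xs]             # y = x + 1
quadratic = [x * x + x + 1 for x in xs]  # y = x^2 + x + 1

r_linear = pearson_r(xs, linear)         # exactly linear data
r_quadratic = pearson_r(xs, quadratic)   # curved data, yet still strongly correlated
```

Comparing `r_linear` and `r_quadratic` shows how small the gap between the two cases can be.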
Disclosure of Invention
The invention aims to provide a data linear correlation judgment method based on a convolutional neural network that does not depend on statistical assumptions about the variables and gives the judgment reliability of the network under different conditions.
The technical purpose of the invention is achieved by the following technical scheme:
A data linear correlation judgment method based on a convolutional neural network comprises the following steps:
establishing a convolutional neural network;
generating linear data and nonlinear data of accurate data according to the generating function;
setting nonlinear rate and data quality parameters and generating random measurement data based on accurate data;
generating an image according to a training strategy by using the generated random measurement data;
inputting the generated image into a convolutional neural network for training;
obtaining a convolutional neural network with data linear correlation judgment capability;
and detecting different nonlinear rates and data quality parameters to obtain corresponding recognition capability limits of the convolutional neural network, so as to obtain judgment reliability.
Preferably, the specific steps of generating the random measurement data from the accurate data are as follows:
A training data set and a testing data set are generated, classified into linear data and nonlinear data, and the accurate data are generated according to the following two generating functions:
$y_l = bx + c$;
$y_{nl} = ax^2 + bx + c$;
where $y_l$ is the accurate linear data, $y_{nl}$ is the accurate nonlinear data, $a$ is the coefficient of the second-order nonlinear term, $b$ is the coefficient of the linear term, $c$ is an arbitrary constant, and the nonlinear rate is defined as $P_{nl} = a/b$.
The data quality parameter is adjusted, and the random measurement data are obtained from the generating probability functions expressed by the following two formulas:
$f(y'_l) = \dfrac{1}{\sqrt{2\pi}\,\sigma y_l}\exp\!\left(-\dfrac{(y'_l - y_l)^2}{2(\sigma y_l)^2}\right)$;
$f(y'_{nl}) = \dfrac{1}{\sqrt{2\pi}\,\sigma y_{nl}}\exp\!\left(-\dfrac{(y'_{nl} - y_{nl})^2}{2(\sigma y_{nl})^2}\right)$;
where $\sigma$ is the data quality parameter, which represents the relative deviation of the random measurement data from the accurate data and can be understood as the relative uncertainty of a real experimental measurement; $y'_l$ is a linear random number in the measured data, and $y'_{nl}$ is a nonlinear random number in the measured data.
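The data generation step can be sketched as follows. This is a minimal sketch under stated assumptions, not the applicants' implementation: it takes the noise to be normally distributed around each exact value with standard deviation σ·y, as the detailed description states, and borrows the 11 points on [0, 1], a = b = c = 1 and σ = 0.02 from the embodiment. The function names, the fixed seed, and the use of `abs(y)` to keep the standard deviation non-negative are assumptions.

```python
import random

def exact_values(xs, a, b, c):
    """Return exact linear values y_l = b*x + c and nonlinear values y_nl = a*x^2 + b*x + c."""
    y_l = [b * x + c for x in xs]
    y_nl = [a * x * x + b * x + c for x in xs]
    return y_l, y_nl

def measure(ys, sigma, rng):
    """Draw one simulated measurement per exact value: normal, mean y, std sigma*|y|."""
    return [rng.gauss(y, sigma * abs(y)) for y in ys]

rng = random.Random(0)             # fixed seed: an assumption, for reproducibility only
xs = [i / 10 for i in range(11)]   # 11 evenly spaced points on [0, 1]
y_l, y_nl = exact_values(xs, a=1.0, b=1.0, c=1.0)
y_l_meas = measure(y_l, sigma=0.02, rng=rng)    # linear random numbers y'_l
y_nl_meas = measure(y_nl, sigma=0.02, rng=rng)  # nonlinear random numbers y'_nl
```

Setting `sigma=0.0` returns the exact values unchanged, which is a convenient sanity check.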
Preferably, the strategy for generating the image is specifically:
values of $x$ are taken uniformly within the $x$ interval, and linear random numbers and nonlinear random numbers are generated according to the generating functions of the measurement data; the ($x \sim y'_l$) and ($x \sim y'_{nl}$) function images are then generated directly from the correspondence between the $x$ values and the linear and nonlinear random numbers.
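The patent does not say how the function images are rendered. One simple possibility, shown here purely as an assumption, is to rasterize the points onto a small binary grid. The grid size, the plotting window and the function name `rasterize` are hypothetical choices made for illustration.

```python
def rasterize(xs, ys, size=32, x_range=(0.0, 1.0), y_range=(0.0, 4.0)):
    """Plot (x, y) points as 1-pixels on a size-by-size grid of 0s; row 0 is the top."""
    image = [[0] * size for _ in range(size)]
    x_min, x_max = x_range
    y_min, y_max = y_range
    for x, y in zip(xs, ys):
        col = round((x - x_min) / (x_max - x_min) * (size - 1))
        row = round((y_max - y) / (y_max - y_min) * (size - 1))
        if 0 <= row < size and 0 <= col < size:  # skip points outside the window
            image[row][col] = 1
    return image

xs = [i / 10 for i in range(11)]
ys = [x * x + x + 1 for x in xs]  # exact nonlinear values with a = b = c = 1
image = rasterize(xs, ys)
```

A fixed plotting window keeps linear and nonlinear images on the same scale, so the network would see the curve shape rather than axis rescaling. Whether the applicants did this is not stated.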
Preferably, the recognition capability limit of the convolutional neural network is detected for different nonlinear rates and data quality parameters as follows:
the function images generated by the training strategy are input into the convolutional neural network for training, to obtain a convolutional neural network with judgment capability;
the corresponding data images are then input into the convolutional neural networks trained with different nonlinear rates and data quality parameters for identification and judgment, to obtain the recognition limit of each network.
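The limit-detection step amounts to measuring accuracy over a grid of nonlinear rates and data quality parameters. The sketch below shows only that bookkeeping. To stay runnable without a deep-learning library, it replaces the trained convolutional neural network with a hypothetical three-point curvature test (`midpoint_classifier`), which is plainly not the patent's network. The values b = c = 1, the threshold, the trial count and all function names are assumptions.

```python
import random

def make_sample(p_nl, sigma, nonlinear, rng, b=1.0, c=1.0, n=11):
    """Simulated measurements at n evenly spaced x in [0, 1].

    The quadratic coefficient is a = p_nl * b for nonlinear samples and 0 otherwise.
    """
    a = p_nl * b if nonlinear else 0.0
    xs = [i / (n - 1) for i in range(n)]
    exact = [a * x * x + b * x + c for x in xs]
    return xs, [rng.gauss(y, sigma * abs(y)) for y in exact]

def midpoint_classifier(xs, ys, threshold=0.05):
    """Hypothetical stand-in for the trained CNN (NOT the patent's network).

    Returns True (nonlinear) when the middle point lies farther than `threshold`
    from the straight line joining the first and last points.
    """
    mid = len(ys) // 2
    chord_mid = (ys[0] + ys[-1]) / 2
    return abs(ys[mid] - chord_mid) > threshold

def accuracy_grid(classify, p_nl_values, sigmas, trials, rng):
    """Fraction of correct decisions for every (p_nl, sigma) pair on a balanced test set."""
    grid = {}
    for p_nl in p_nl_values:
        for sigma in sigmas:
            correct = 0
            for t in range(trials):
                nonlinear = t % 2 == 1  # alternate linear and nonlinear samples
                xs, ys = make_sample(p_nl, sigma, nonlinear, rng)
                correct += classify(xs, ys) == nonlinear
            grid[(p_nl, sigma)] = correct / trials
    return grid

grid = accuracy_grid(midpoint_classifier, [0.8, 0.6, 0.2], [0.01, 0.02, 0.03],
                     trials=200, rng=random.Random(0))
```

With a real network, `classify` would render the sample as an image and return the network's decision. The recognition limit is then the region of the grid where the accuracy falls toward 0.5, that is, chance level for two classes.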
In conclusion, the invention has the following beneficial effects:
A novel data linear correlation judgment method based on a convolutional neural network is provided, whose training strategy can give the judgment reliability of the network under different nonlinear rates and data quality conditions.
Drawings
FIG. 1 is a schematic flow block diagram of the method;
FIG. 2 shows images generated according to the set strategy;
FIG. 3 shows the recognition results of the network for different data quality parameters when the nonlinear rate $P_{nl} = 0.4$;
FIG. 4 shows the recognition results of the network for different nonlinear rates when the data quality parameter $\sigma = 0.02$.
Detailed Description
The present invention will be described in further detail with reference to the accompanying drawings.
According to one or more embodiments, a method for determining linear correlation of data based on a convolutional neural network is disclosed, which comprises the following steps:
establishing a convolutional neural network;
generating linear data and nonlinear data of accurate data according to the generating function;
setting a nonlinear rate and a data quality parameter to generate random measurement data based on accurate data;
generating an image according to a training strategy by using the generated random measurement data;
inputting the generated image into a convolutional neural network for training;
obtaining a neural network with data linear correlation judgment capability;
and detecting different nonlinear rates and data quality parameters to obtain corresponding recognition capability limits of the convolutional neural network, so as to obtain the judgment reliability of the method.
The specific steps of generating the measurement data from the accurate data are as follows:
A training data set and a testing data set are generated, classified into linear data and nonlinear data, and the accurate data are generated according to the generating functions:
$y_l = bx + c$;
$y_{nl} = ax^2 + bx + c$;
where $y_l$ is the accurate linear data, $y_{nl}$ is the accurate nonlinear data, $a$ is the coefficient of the second-order nonlinear term, $b$ is the coefficient of the linear term, $c$ is an arbitrary constant, and the nonlinear rate is defined as $P_{nl} = a/b$.
The data quality parameter is adjusted, and the random measurement data are obtained from the generating probability functions expressed by the following two formulas:
$f(y'_l) = \dfrac{1}{\sqrt{2\pi}\,\sigma y_l}\exp\!\left(-\dfrac{(y'_l - y_l)^2}{2(\sigma y_l)^2}\right)$;
$f(y'_{nl}) = \dfrac{1}{\sqrt{2\pi}\,\sigma y_{nl}}\exp\!\left(-\dfrac{(y'_{nl} - y_{nl})^2}{2(\sigma y_{nl})^2}\right)$;
where $\sigma$ is the data quality parameter and represents the relative deviation of the measured data from the accurate data; $y'_l$ is a linear random number in the measured data, and $y'_{nl}$ is a nonlinear random number in the measured data.
The strategy for generating the image is specifically:
values of $x$ are taken uniformly within the $x$ interval, and linear random numbers and nonlinear random numbers are generated according to the generating functions of the measurement data; the ($x \sim y'_l$) and ($x \sim y'_{nl}$) function images are then generated directly from the correspondence between the $x$ values and the linear and nonlinear random numbers.
The recognition capability limit of the convolutional neural network is detected for different nonlinear rates and data quality parameters as follows:
the function images generated by the training strategy are input into the convolutional neural network for training, to obtain a convolutional neural network with judgment capability;
the corresponding data images are then input into the convolutional neural networks trained with different nonlinear rates and data quality parameters for identification and judgment, to obtain the recognition limit of each network.
The method builds on the strong capability of the convolutional neural network to extract features from complex data, and converts the traditional linear correlation analysis problem into an image recognition problem solved by deep learning. By establishing a convolutional neural network able to identify the degree of linear correlation of data, a new linear correlation analysis method is obtained. This method does not depend on statistical assumptions about the variables, has a wider application range than the classical Pearson product-moment correlation coefficient, and is more extensible. It can therefore better support the judgments needed for regression analysis.
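The feature extraction referred to above rests on the convolution operation applied to the input image. The patent does not disclose a network architecture, so the sketch below only illustrates that basic operation on a toy image. The image, the hand-written kernel and the function name `conv2d_valid` are assumptions; a real convolutional neural network learns its kernels during training and stacks many such layers.

```python
def conv2d_valid(image, kernel):
    """2D cross-correlation (the 'convolution' of CNN layers), no padding, stride 1."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]

# Toy 4x4 image containing a rising diagonal of plotted points (illustrative only).
image = [
    [0, 0, 0, 1],
    [0, 0, 1, 0],
    [0, 1, 0, 0],
    [1, 0, 0, 0],
]
# Hand-written 2x2 kernel that responds to a rising diagonal.
kernel = [
    [0, 1],
    [1, 0],
]
feature_map = conv2d_valid(image, kernel)
# feature_map == [[0, 0, 2], [0, 2, 0], [2, 0, 0]]
```

The large responses trace the diagonal, which is the kind of local shape cue a trained network could combine to tell a straight point pattern from a curved one.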
A network training strategy based on different data imaging methods is provided. The judgment capability of the networks is compared using the accuracy of the trained network under different data quality and nonlinearity conditions, yielding the optimal convolutional neural network with linear correlation analysis capability.
Through training on large amounts of data, the convolutional neural network can establish the internal mapping between input and output. Training and testing data sets of accurate data are generated, classified into linear data and nonlinear data, and used to train the network. The generating functions of the accurate linear data and nonlinear data are given by formula (1) and formula (2).
$y_l = bx + c$; (1)
$y_{nl} = ax^2 + bx + c$; (2)
The images input into the convolutional neural network are not generated from the accurate data. In practical applications, there are errors between measured data and accurate data. The measurement data used for training are random numbers drawn from normal distributions centered on the accurate linear data $y_l$ and nonlinear data $y_{nl}$, with standard deviations $\sigma y_l$ and $\sigma y_{nl}$ respectively. The distributions of the linear random number $y'_l$ and the nonlinear random number $y'_{nl}$ that serve as measurement data are given by formulas (3) and (4).
$f(y'_l) = \dfrac{1}{\sqrt{2\pi}\,\sigma y_l}\exp\!\left(-\dfrac{(y'_l - y_l)^2}{2(\sigma y_l)^2}\right)$; (3)
$f(y'_{nl}) = \dfrac{1}{\sqrt{2\pi}\,\sigma y_{nl}}\exp\!\left(-\dfrac{(y'_{nl} - y_{nl})^2}{2(\sigma y_{nl})^2}\right)$; (4)
Here $\sigma$ is the data quality parameter, which represents the relative deviation of the random measurement data from the accurate data and can be understood as the relative uncertainty of a real experimental measurement. When $\sigma = 0.02$, the standard deviation of the measured value about the true value is 2% of the true value. By setting different $\sigma$ values, test data of different data quality can be generated, and the recognition capability of the network can be tested under different data quality conditions to find the limit of that capability.
Without loss of generality, 11 data points are taken uniformly in the interval $x \in [0, 1]$, and 11 exact values each of $y_l$ and $y_{nl}$ are generated. According to formulas (3) and (4), 11 linear random numbers $y'_l$ and 11 nonlinear random numbers $y'_{nl}$ are then generated. Finally, function images are made from the generated $y'_l$ and $y'_{nl}$ data and input into the convolutional neural network for training. FIG. 2 (a) and (b) show images of linear data and nonlinear data generated according to the training strategy. In this data generation, $a = b = c = 1$ and $\sigma = 0.02$ were selected.
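As a worked illustration of the noise scale with these parameter values, consider the single point $x = 0.5$. The point is chosen here only for illustration and is not singled out in the patent.

```latex
% Assumed illustration point: x = 0.5, with a = b = c = 1 and \sigma = 0.02 as in the text.
\begin{aligned}
y_l    &= b x + c = 0.5 + 1 = 1.5,
  & \sigma y_l    &= 0.02 \times 1.5 = 0.03, \\
y_{nl} &= a x^2 + b x + c = 0.25 + 0.5 + 1 = 1.75,
  & \sigma y_{nl} &= 0.02 \times 1.75 = 0.035.
\end{aligned}
```

At this point the simulated linear measurement therefore scatters about 1.5 with a standard deviation of 0.03, and the nonlinear one about 1.75 with a standard deviation of 0.035.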
For clarity, two examples are given:
1. Recognition rate versus data quality:
As shown in FIG. 3, with $P_{nl} = 0.4$ and the data quality parameter $\sigma$ set to 0.01, 0.02 and 0.03, the data quality gradually deteriorates as $\sigma$ increases, which reduces the recognition capability of the convolutional neural network. When $\sigma = 0.02$, the convolutional neural network is already unable to distinguish a linear image from a nonlinear image.
2. Recognition rate versus nonlinear rate:
As shown in FIG. 4, with $\sigma$ fixed at 0.02, different nonlinear rates are used. When $P_{nl} = 0.8$, as shown in FIG. 4(a), the recognition rate reaches 99% or more; when $P_{nl} = 0.6$, as shown in FIG. 4(b), the recognition rate also reaches 99% or more; when $P_{nl} = 0.2$, as shown in FIG. 4(c), the effect of the nonlinear term becomes weaker and the recognition rate falls to 0.5.
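A short derivation may help explain why a smaller nonlinear rate is harder to recognize. It compares the quadratic curve with the straight line through its own endpoints on $[0, 1]$. This comparison is an interpretive aid and does not appear in the patent.

```latex
\begin{aligned}
y_{nl}(x) - \bigl[(a + b)\,x + c\bigr] &= a x^2 - a x = -a\,x\,(1 - x), \\
\max_{x \in [0,1]} \bigl| a\,x\,(1 - x) \bigr| &= \frac{|a|}{4} = \frac{|P_{nl}\,b|}{4}
  \quad \text{at } x = \tfrac{1}{2}.
\end{aligned}
```

For a fixed $b$, the largest departure from a straight line is thus proportional to $P_{nl}$, while the measurement scatter is set by $\sigma$. As $P_{nl}$ shrinks, the curvature approaches the size of the noise and the two classes of image become hard to tell apart.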
In actual network training, the training set contains 20000 pictures. The figures show the accuracy of the training strategy under different $\sigma$ conditions. During training, the nonlinear rate $P_{nl}$ is decreased step by step from 1 to 0.4; as $P_{nl}$ decreases, the contribution of the nonlinear term becomes smaller and smaller. Under such conditions, even where identification by traditional manual means cannot tell the two kinds of data apart, the method gives the judgment reliability of the convolutional neural network under different nonlinear rates and data quality conditions, which facilitates the judgment and analysis of physical experiment measurement data.
This embodiment only explains the present invention and does not limit it. After reading this specification, those skilled in the art may modify the embodiment as needed without inventive contribution, and all such modifications are protected by patent law within the scope of the claims of the present invention.

Claims (4)

1. A data linear correlation judgment method based on a convolutional neural network is characterized by comprising the following steps:
establishing a convolutional neural network;
generating linear data and nonlinear data of accurate data according to the generating function;
setting nonlinear rate and data quality parameters and generating random measurement data based on accurate data;
generating an image according to a training strategy by using the generated random measurement data;
inputting the generated image into a convolutional neural network for training;
obtaining a convolutional neural network with data linear correlation judgment capability;
and detecting different nonlinear rates and data quality parameters to obtain corresponding recognition capability limits of the convolutional neural network, so as to obtain judgment reliability.
2. The convolutional neural network-based data linear correlation judgment method as claimed in claim 1, wherein the specific steps of generating the measurement data from the accurate data are as follows:
a training data set and a testing data set are generated, classified into linear data and nonlinear data, and the accurate data are generated according to the following generating functions:
$y_l = bx + c$;
$y_{nl} = ax^2 + bx + c$;
where $y_l$ is the accurate linear data, $y_{nl}$ is the accurate nonlinear data, $a$ is the coefficient of the second-order nonlinear term, $b$ is the coefficient of the linear term, $c$ is an arbitrary constant, and the nonlinear rate is defined as $P_{nl} = a/b$;
the data quality parameter is adjusted, and the random measurement data are obtained from the generating probability functions expressed by the following two formulas:
$f(y'_l) = \dfrac{1}{\sqrt{2\pi}\,\sigma y_l}\exp\!\left(-\dfrac{(y'_l - y_l)^2}{2(\sigma y_l)^2}\right)$;
$f(y'_{nl}) = \dfrac{1}{\sqrt{2\pi}\,\sigma y_{nl}}\exp\!\left(-\dfrac{(y'_{nl} - y_{nl})^2}{2(\sigma y_{nl})^2}\right)$;
where $\sigma$ is the data quality parameter, which represents the relative deviation of the random measurement data from the accurate data and can be understood as the relative uncertainty of a real experimental measurement; $y'_l$ is a linear random number in the measured data, and $y'_{nl}$ is a nonlinear random number in the measured data.
3. The convolutional neural network-based data linear correlation judgment method as claimed in claim 2, wherein the training strategy for generating the image is specifically:
values of $x$ are taken uniformly within the $x$ interval, and linear random numbers and nonlinear random numbers are generated according to the generating functions of the measurement data; the ($x \sim y'_l$) and ($x \sim y'_{nl}$) function images are then generated directly from the correspondence between the $x$ values and the linear and nonlinear random numbers.
4. The convolutional neural network-based data linear correlation judgment method as claimed in claim 3, wherein obtaining the recognition capability limit of the convolutional neural network for different nonlinear rates and data quality parameters specifically comprises:
inputting the function images generated by the training strategy into the convolutional neural network for training, to obtain a convolutional neural network with judgment capability;
and inputting the corresponding data images into the convolutional neural networks trained with different nonlinear rates and data quality parameters for identification and judgment, to obtain the recognition limit of each network.
CN202010535355.6A 2020-06-12 2020-06-12 Data linear correlation judgment method based on convolutional neural network Pending CN113807490A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010535355.6A CN113807490A (en) 2020-06-12 2020-06-12 Data linear correlation judgment method based on convolutional neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010535355.6A CN113807490A (en) 2020-06-12 2020-06-12 Data linear correlation judgment method based on convolutional neural network

Publications (1)

Publication Number Publication Date
CN113807490A true CN113807490A (en) 2021-12-17

Family

ID=78892114

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010535355.6A Pending CN113807490A (en) 2020-06-12 2020-06-12 Data linear correlation judgment method based on convolutional neural network

Country Status (1)

Country Link
CN (1) CN113807490A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination