CN109815444A - Method and apparatus based on multiple linear regression processing nonlinear data - Google Patents

Info

Publication number
CN109815444A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910104454.6A
Other languages
Chinese (zh)
Inventor
贾瑞凯
郭森
肖芳
叶桦
贾延凯
廖国娟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
SUZHOU GENEWIZ BIOLOGICAL TECHNOLOGY Co Ltd
Original Assignee
SUZHOU GENEWIZ BIOLOGICAL TECHNOLOGY Co Ltd

Abstract

This application relates to a method and apparatus for processing nonlinear data based on multiple linear regression. The method includes: obtaining a data model containing multiple mutually independent explanatory variables; converting each explanatory variable with preset functions to obtain multiple converted explanatory variables; combining the converted explanatory variables according to the data model to obtain multiple candidate linear regression formulas; and performing linear regression analysis on the candidate formulas to determine a target regression formula. In this way a multiple nonlinear regression analysis is converted into several multiple linear regression analyses, so that nonlinear data can be processed.

Description

Method and apparatus for processing nonlinear data based on multiple linear regression
Technical Field
This application relates to the technical field of data processing, and in particular to a method, an apparatus, a computer device, and a storage medium for processing nonlinear data based on multiple linear regression.
Background
In engineering design or scientific experiments, the data obtained are usually a table of discrete data points, with no analytical expression describing the x-y relationship. A curve drawn through such discrete data points is generally called an irregular curve, and problems of this kind are usually solved by nonlinear regression.
Nonlinear regression establishes a data relationship (a mathematical model) from the given discrete data points, computes a series of interpolation points, and connects them with short straight segments to form a curve. Provided the spacing of the interpolation points is chosen properly, the result is a smooth curve.
The mainstream nonlinear regression methods at present mainly include the nls (nonlinear least squares) function and the lm function of the R language and the curvefit (curve fitting) function of Matlab (matrix laboratory). Such methods, however, require the functional form that the data follow, and initial estimates of its parameters, to be specified in advance. They are therefore limited by the complexity and unknown nature of the data, and place high demands on the user's knowledge of mathematics and statistics.
Summary
In view of the above technical problem that processing discrete data with nonlinear regression places high demands on the user's knowledge of mathematics, statistics, and the like, it is necessary to provide an easily implemented method, apparatus, computer device, and storage medium for processing nonlinear data based on multiple linear regression.
To achieve the above object, in one aspect, an embodiment of this application provides a method for processing nonlinear data based on multiple linear regression, including:
obtaining a data model containing multiple explanatory variables, where the explanatory variables are mutually independent;
converting each of the explanatory variables with preset functions to obtain multiple converted explanatory variables;
combining the converted explanatory variables according to the data model to obtain multiple candidate linear regression formulas;
performing linear regression analysis on each of the candidate linear regression formulas to determine a target regression formula.
In one embodiment, performing linear regression analysis on each of the candidate linear regression formulas to determine the target regression formula includes: performing linear regression analysis on each candidate linear regression formula using a linear regression function of the R language to determine the target regression formula.
In one embodiment, each candidate linear regression formula includes multiple regression terms, and performing linear regression analysis on each candidate linear regression formula using the linear regression function of the R language includes: estimating, with the linear regression function of the R language, the coefficient of each regression term in the candidate linear regression formulas; and determining the target regression formula from the estimation results using a T test.
In one embodiment, determining the target regression formula from the estimation results using a T test includes: performing a T test on the candidate linear regression formulas according to the estimation results to obtain T test results; rejecting, in each candidate linear regression formula, the regression terms whose P value in the T test results is greater than 0.01; calculating, for each candidate linear regression formula, the mean of the T values of the remaining regression terms; and fitting the estimation result with the largest mean T value to obtain the target regression formula.
In one embodiment, the linear regression function of the R language is either the lm function or the glm function of the R language.
In one embodiment, the preset functions include preset elementary functions and preset user-defined functions.
In another aspect, an embodiment of this application provides an apparatus for processing nonlinear data based on multiple linear regression, including:
an obtaining module, configured to obtain a data model containing multiple explanatory variables, where the explanatory variables are mutually independent;
a conversion module, configured to convert each of the explanatory variables with preset functions to obtain multiple converted explanatory variables;
a combination module, configured to combine the converted explanatory variables according to the data model to obtain multiple candidate linear regression formulas;
a determining module, configured to perform linear regression analysis on each of the candidate linear regression formulas to determine a target regression formula.
In one embodiment, the determining module includes: an estimation unit, configured to estimate, with a linear regression function of the R language, the coefficient of each regression term in the candidate linear regression formulas; and a target regression formula determination unit, configured to determine the target regression formula from the estimation results using a T test.
In one embodiment, the target regression formula determination unit includes: a T test subunit, configured to perform a T test on the candidate linear regression formulas according to the estimation results to obtain T test results; a rejection subunit, configured to reject the regression terms whose P value in the T test results is greater than 0.01; a calculation subunit, configured to calculate, for each candidate linear regression formula, the mean of the T values of the remaining regression terms; and a fitting subunit, configured to fit the estimation result with the largest mean T value to obtain the target regression formula.
In a third aspect, an embodiment of this application provides a computer device including a memory and a processor, the memory storing a computer program, where the processor implements the steps of the above method when executing the computer program.
In a fourth aspect, an embodiment of this application provides a computer-readable storage medium storing a computer program, where the steps of the above method are implemented when the computer program is executed by a processor.
With the above method, apparatus, computer device, and storage medium, a data model containing multiple mutually independent explanatory variables is obtained; each explanatory variable is converted with preset functions to obtain multiple converted explanatory variables; the converted explanatory variables are combined according to the data model to obtain multiple candidate linear regression formulas; and the target regression formula is determined by linear regression analysis. A multiple nonlinear regression analysis is thereby converted into several multiple linear regression analyses, so that nonlinear data can be processed.
Brief Description of the Drawings
Fig. 1 is a diagram of the application environment of the method for processing nonlinear data based on multiple linear regression in one embodiment;
Fig. 2 is a flow diagram of the method for processing nonlinear data based on multiple linear regression in one embodiment;
Fig. 3 is a flow diagram of the step of performing linear regression analysis on each candidate linear regression formula to determine the target regression formula in one embodiment;
Fig. 4 is a flow diagram of the step of determining the target regression formula from the estimation results using the T test and the lm function in one embodiment;
Fig. 5 is a structural block diagram of the apparatus for processing nonlinear data based on multiple linear regression in one embodiment;
Fig. 6 is a structural block diagram of the determining module in Fig. 5;
Fig. 7 is a structural block diagram of the target regression formula determination unit in Fig. 6.
Detailed Description
To make the objects, technical solutions, and advantages of this application clearer, the application is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here only explain the application and do not limit it.
Processing discrete data with nonlinear regression places high demands on the user's knowledge of mathematics, statistics, and the like. Linear regression, for its part, usually requires the data to be transformed into linear form first, because engineering design and scientific experimental data are complex and diverse. The structure of experimental data is also unknown in advance and many transformations are possible, so considerable time is usually needed to explore the underlying relationship. When there are two or more independent variables, the regression analysis is called multiple regression. In practice a phenomenon is often associated with several factors, and predicting or estimating the dependent variable from the optimal combination of several independent variables is more effective and more realistic than using a single independent variable.
On this basis, this application provides a method for processing nonlinear data based on multiple linear regression, which can be applied to the computer device shown in Fig. 1. The computer device includes a processor, a memory, a network interface, a display screen, and an input device connected by a system bus. The processor provides computing and control capability. The memory includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program, and the internal memory provides an environment for running the operating system and the computer program in the non-volatile storage medium. The network interface is used to communicate with external terminal devices over a network connection. When executed by the processor, the computer program implements a method for processing nonlinear data based on multiple linear regression. The display screen may be a liquid crystal display or an electronic ink display, and may serve as an input device of, and be communicatively connected to, a processing system. The input device of the computer device may be a touch layer covering the display screen; a key, trackball, or trackpad provided on the housing of the computer device; or an external keyboard, trackpad, mouse, or the like.
Optionally, the computer device may be a server, a PC, a personal digital assistant, or another terminal device such as a PAD or a mobile phone, and may also be a cloud or remote server. The embodiments of this application do not limit the specific form of the computer device.
Those skilled in the art will understand that the structure shown in Fig. 1 is only a block diagram of the part of the structure relevant to the solution of this application and does not limit the computer device to which the solution is applied. A specific computer device may include more or fewer components than shown, combine certain components, or arrange the components differently.
In one embodiment, as shown in Fig. 2, a method for processing nonlinear data based on multiple linear regression is provided. Taking its application to the computer device in Fig. 1 as an example, the method includes the following steps:
Step 202: obtain a data model containing multiple explanatory variables.
The data model contains the linear relationship between the explained variable and the multiple explanatory variables. For example, the general form of the data model is as follows:
where Y is the explained variable, Xi is an explanatory variable, n is the number of explanatory variables (including the constant explanatory variable), Gi is a regression function, and Gi(Xi) is a regression term. In this embodiment the explanatory variables are mutually independent and do not influence one another.
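The formula itself is not reproduced in this text. One form consistent with the symbol definitions above and with the worked example later in the description, offered here as an assumption rather than as the original formula, is an additive model in which each regression term depends on a single explanatory variable:

```latex
% Assumed reconstruction: additive model built from the symbols Y, X_i, G_i, n defined above.
Y = \sum_{i=1}^{n} G_i(X_i)
```

Under this reading, the example model Y = sin(X1) + X2^2 + cos(X3) corresponds to G_1 = sin, G_2 = the square function, and G_3 = cos.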
Step 204: convert each of the explanatory variables with preset functions to obtain multiple converted explanatory variables.
Specifically, each explanatory variable is converted by the preset functions. For example, X = g(x), where x is an explanatory variable, X is the result of converting x, and g is a functional relationship, which may be an elementary function or a user-defined function. Because several functions g are applied, converting a single explanatory variable yields a group of converted explanatory variables. In this embodiment the preset functions may include preset elementary functions and preset user-defined functions.
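The conversion in step 204 can be sketched as follows. This is a minimal illustration in Python rather than R; the function library `PRESET_FUNCTIONS`, its four entries, and the helper name `transform_variable` are assumptions made for the example and are not taken from the application.

```python
import math

# Hypothetical library of preset functions; the names and the choice of
# functions are illustrative assumptions, not the application's own list.
PRESET_FUNCTIONS = {
    "identity": lambda v: v,
    "sin": math.sin,
    "cos": math.cos,
    "square": lambda v: v * v,
}

def transform_variable(values, functions=PRESET_FUNCTIONS):
    """Return {function name: converted column} for one explanatory variable."""
    return {name: [f(v) for v in values] for name, f in functions.items()}

# One explanatory variable yields one group of converted explanatory variables.
x1 = [0.0, 1.0, 2.0]
transformed_x1 = transform_variable(x1)
```

Each key of `transformed_x1` names one conversion g, and each value is the converted column g(x1).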
Step 206: combine the converted explanatory variables according to the data model to obtain multiple candidate linear regression formulas.
Converting one explanatory variable yields one group of converted explanatory variables, so converting all explanatory variables in the data model yields several such groups. In this embodiment, one converted explanatory variable is selected from each group and the selected variables are combined according to the data model to give one candidate linear regression formula. Repeating this for every possible selection across the groups gives multiple candidate linear regression formulas, so that the multiple nonlinear regression analysis is converted into several multiple linear regression analyses.
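The combination in step 206 amounts to taking one conversion per explanatory variable in every possible way. The sketch below enumerates the resulting formula strings with `itertools.product`; the helper name `candidate_formulas` and the three function names are hypothetical, and the "~" and "+" notation follows the R formula style used later in the description.

```python
from itertools import product

def candidate_formulas(response, variables, function_names):
    """Build one formula string per choice of one conversion for each variable."""
    formulas = []
    for choice in product(function_names, repeat=len(variables)):
        terms = [f"{fn}({var})" for fn, var in zip(choice, variables)]
        formulas.append(f"{response} ~ " + " + ".join(terms))
    return formulas

# Three variables with three conversions each give 3**3 = 27 candidates.
formulas = candidate_formulas("Y", ["X1", "X2", "X3"], ["sin", "cos", "square"])
```

The number of candidates grows as the number of preset functions raised to the number of explanatory variables, so the size of the function library directly controls how many linear regressions must be run.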
Step 208: perform linear regression analysis on each of the candidate linear regression formulas to determine the target regression formula.
The linear regression analysis may be carried out with a linear regression function of the R language; that is, each candidate linear regression formula is analyzed with the linear regression function of the R language to determine the target regression formula.
In the above method, a data model containing multiple mutually independent explanatory variables is obtained; each explanatory variable is converted with preset functions to obtain multiple converted explanatory variables; the converted explanatory variables are combined according to the data model to obtain multiple candidate linear regression formulas; and the target regression formula is determined by linear regression analysis. A multiple nonlinear regression analysis is thereby converted into several multiple linear regression analyses, so that nonlinear data can be processed.
In one embodiment, performing linear regression analysis on each of the candidate linear regression formulas to determine the target regression formula may include: performing linear regression analysis on each candidate linear regression formula using a linear regression function of the R language, where each candidate linear regression formula includes multiple regression terms. Specifically, the linear regression function of the R language is either the lm function or the glm (generalized linear model) function of the R language.
Specifically, as shown in Fig. 3, this may include the following steps:
Step 302: estimate, with the linear regression function of the R language, the coefficient of each regression term in the candidate linear regression formulas.
Specifically, in this embodiment the coefficient of each regression term in each candidate linear regression formula is calculated by estimation with the linear regression function of the R language, which gives the relationship between the explanatory variables and the explained variable in the data model.
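The coefficient estimation in step 302 is ordinary least squares. As a stand-in for R's lm, the sketch below solves the normal equations in plain Python; the helper names `solve_linear_system` and `fit_ols` and the toy data are assumptions for illustration, and a production implementation would normally rely on a numerically more robust routine.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def fit_ols(columns, y):
    """Least-squares coefficients [intercept, b1, b2, ...] for the term columns."""
    rows = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    k = len(rows[0])
    xtx = [[sum(r[p] * r[q] for r in rows) for q in range(k)] for p in range(k)]
    xty = [sum(r[p] * yi for r, yi in zip(rows, y)) for p in range(k)]
    return solve_linear_system(xtx, xty)

# Toy data (assumed): Y = 2 + 3 * X^2, fitted against the converted column X^2.
x = [0.0, 1.0, 2.0, 3.0, 4.0]
squares = [v * v for v in x]
y = [2.0 + 3.0 * s for s in squares]
coefficients = fit_ols([squares], y)  # approximately [2.0, 3.0]
```

Because the converted column enters the model linearly, a nonlinear relationship in X is recovered by a linear fit, which is the point of the conversion step.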
Step 304: determine the target regression formula from the estimation results using a T test.
The T test is also known as Student's t test. In this embodiment, regression analysis is carried out on the above estimation results using the T test, and the target regression formula is determined by fitting according to the analysis results, which improves the ability to process nonlinear data.
In one embodiment, as shown in Fig. 4, determining the target regression formula from the estimation results using a T test may include the following steps:
Step 402: perform a T test on the candidate linear regression formulas according to the estimation results to obtain T test results.
Step 404: reject, in each candidate linear regression formula, the regression terms whose P value in the T test results is greater than 0.01.
Step 406: calculate, for each candidate linear regression formula, the mean of the T values of the remaining regression terms.
Step 408: fit the estimation result with the largest mean T value to obtain the target regression formula.
The T test checks the significance of an explanatory variable for the explained variable in the data model. Significance is usually judged by the P value: a P value below 0.01 indicates a significant correlation between the explanatory variable and the explained variable, and the smaller the P value, the more significant the correlation. The test also gives a T value; the larger the T value, the stronger the correlation.
Specifically, in this embodiment a T test is performed on the candidate linear regression formulas according to the estimation results to obtain T test results; the regression terms whose P value is greater than 0.01 are rejected from each candidate linear regression formula; the mean of the T values of the remaining regression terms is calculated for each candidate; and the estimation result with the largest mean T value is fitted to obtain the target regression formula.
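Steps 404 to 408 can be sketched as a small selection routine. The helper names are hypothetical, and the T values and P values below are made-up numbers for illustration only, not results from the application. The text does not say whether signed or absolute T values are averaged; this sketch assumes absolute values.

```python
P_VALUE_THRESHOLD = 0.01  # rejection threshold stated in the text

def mean_t_of_significant_terms(term_stats):
    """Mean |T| over the regression terms that survive the P value filter."""
    kept = [abs(t) for t, p in term_stats.values() if p <= P_VALUE_THRESHOLD]
    return sum(kept) / len(kept) if kept else 0.0

def select_target(candidates):
    """Return the candidate formula whose remaining terms have the largest mean |T|."""
    return max(candidates, key=lambda f: mean_t_of_significant_terms(candidates[f]))

# Made-up (T value, P value) pairs per regression term, for illustration only.
candidates = {
    "Y ~ sin(X1) + sin(X2) + sin(X3)": {
        "sin(X1)": (0.4, 0.69),
        "sin(X2)": (-3.1, 0.003),
        "sin(X3)": (1.2, 0.23),
    },
    "Y ~ sin(X1) + square(X2) + cos(X3)": {
        "sin(X1)": (55.0, 1e-12),
        "square(X2)": (900.0, 1e-12),
        "cos(X3)": (48.0, 1e-12),
    },
}
target = select_target(candidates)
```

With these illustrative numbers, the first candidate keeps only one term after filtering, while the second keeps all three and has the larger mean, so it is selected.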
The method of this application is further illustrated below by a specific embodiment. Suppose a data set A contains three independent explanatory variables X1, X2, and X3 and one explained variable Y, each variable containing 100 elements, and that the corresponding data model is Y = sin(X1) + X2^2 + cos(X3).
According to the above method, the three explanatory variables in data set A are first converted with elementary functions and user-defined functions. Denote the converted X1 as g(X1), the converted X2 as g(X2), and the converted X3 as g(X3), where g() covers multiple conversion relationships.
The converted explanatory variables are then combined to obtain candidate linear regression formulas, some examples of which are shown in the following table:
where the symbol "~" indicates analysis of the linear relationship between the explained variable and the explanatory variables joined by the symbol "+". At this point the multiple nonlinear regression problem has been converted into several multiple linear regression problems.
For each candidate linear regression formula, the coefficient of each regression term is then calculated by parameter estimation, which gives the relationship between the explanatory variables and the explained variable in the data set. The goal is to find the line for which the sum of squared differences between the predicted values of the explained variable on the line and the actual values in the data is smallest, that is, for which (Y1 actual - Y1 predicted)^2 + (Y2 actual - Y2 predicted)^2 + ... + (Yn actual - Yn predicted)^2 is minimized.
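In compact notation, the least-squares criterion described above can be written as follows. The symbols $\hat{Y}_j$ for the predicted value and $S$ for the sum are introduced here for illustration and do not appear in the original text.

```latex
% Y_j: actual value of the explained variable at data point j
% \hat{Y}_j: value predicted by the candidate linear regression formula
S = \sum_{j=1}^{n} \left( Y_j - \hat{Y}_j \right)^2 \;\longrightarrow\; \min
```

Here n is the number of data points, which is 100 in the example data set A.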
In this embodiment, parameter estimation of the regression model of the data is illustrated using the lm function of the R language. The modeling process for multiple linear regression with the lm() function is as follows:
where Intercept is the constant term; sin(X1), sin(X2), sin(X3), and so on are regression terms; Estimate is the estimated coefficient; Std. Error is the standard error; t value is the T value in the T test; and Pr(>|t|) is the P value in the T test.
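For reference, the columns of the coefficient table printed by R's summary of an lm fit are related as shown below. The symbols $\hat{\beta}_j$ and $\nu$ are notation introduced here, not taken from the application.

```latex
% \hat{\beta}_j: Estimate, SE(\hat{\beta}_j): Std. Error
% \nu: residual degrees of freedom (observations minus estimated coefficients)
t_j = \frac{\hat{\beta}_j}{\mathrm{SE}(\hat{\beta}_j)}, \qquad
\Pr(>|t|)_j = 2\, P\!\left( T_{\nu} > |t_j| \right)
```

A regression term is therefore kept by the 0.01 threshold only when its estimated coefficient is large relative to its standard error.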
Thus, for the candidate regression formula Y ~ sin(X1) + sin(X2) + sin(X3), the relationship obtained between the explanatory variables and the explained variable is: Y = 3.687*sin(X1) - 35.52*sin(X2) + 13.624*sin(X3) + 2266.9.
From the analysis results of all candidate linear regression formulas, the regression terms whose P value in the T test results is greater than 0.01 are rejected, the mean of the T values of the remaining regression terms is calculated for each candidate regression formula, and the regression analysis result with the largest mean T value is selected as the final result: Y1 = 1.015*sin(X1) + 1.000*X2^2 + 0.987*cos(X3).
After parameter estimation is complete, a T test, an F test, or an R^2 correlation coefficient test may also be carried out as needed, and the quality of the estimation judged from the test results.
The T test checks the significance of an individual explanatory variable in the data model for the explained variable. Significance is usually judged by the P value: a P value below 0.01 indicates a significant correlation between the explanatory variable and the explained variable, and the smaller the P value, the more significant the correlation. The test also gives a T value; the larger the T value, the stronger the correlation.
The F test checks the overall linear significance of all explanatory variables for the explained variable. Significance is again judged by the P value: a P value below 0.01 indicates that, taken as a whole, the explanatory variables are significantly correlated with the explained variable, and the smaller the P value, the more significant the correlation.
The R^2 correlation coefficient test judges the goodness of fit of the regression equation. The value of R^2 lies between 0 and 1, and the closer it is to 1, the better the fit.
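The R^2 and overall F statistics described above can be computed from the actual and predicted values as in the following sketch. The function names `r_squared` and `f_statistic` are hypothetical; the formulas are the standard ones for a linear model with an intercept, and converting the F statistic into a P value requires the F distribution, which is not included here.

```python
def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_y) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot

def f_statistic(r2, n_obs, n_terms):
    """Overall F statistic for a model with an intercept and n_terms regression terms."""
    return (r2 / n_terms) / ((1.0 - r2) / (n_obs - n_terms - 1))

perfect_fit = r_squared([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])   # predictions match the data
mean_only_fit = r_squared([1.0, 2.0, 3.0], [2.0, 2.0, 2.0])  # predictions equal the mean
```

A model that reproduces the data exactly gives R^2 = 1, while a model that only predicts the mean gives R^2 = 0. Note that `f_statistic` is undefined when R^2 is exactly 1, because the residual variance in the denominator is then zero.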
The validity indicators of the above regression function are shown in the following table: for the final result, R^2 is 1 and the P value of the F test is less than 2.2e-16. This shows that the analysis result is good and deviates little from the actual situation. It follows that, when a multiple nonlinear regression problem is handled by the method of this application, there is no need to specify in advance the functional form that the explanatory variables may follow or initial parameter estimates. The method therefore has clear advantages for exploring and mining unknown data relationships and for handling problems with multiple independent explanatory variables.
It should be noted that if the glm function of the R language is used for parameter estimation of the regression model, the code can be adjusted according to the characteristics of the glm function, and the estimation output differs slightly from that of the lm function. For example, the output of the glm function may include a z value; the judgment can be made directly from the z value, or the z value can be converted into a t value, with the same effect as estimation with the lm function. Those skilled in the art can choose between the two according to the actual situation.
It should be understood that although the steps in the flow diagrams of Figs. 2-4 are shown in the order indicated by the arrows, they are not necessarily executed in that order. Unless expressly stated otherwise herein, there is no strict restriction on the order of execution, and the steps may be executed in other orders. Moreover, at least some of the steps in Figs. 2-4 may include multiple sub-steps or stages, which are not necessarily completed at the same moment but may be executed at different times; nor are they necessarily executed in sequence, but may be executed in turn or alternately with other steps or with at least part of the sub-steps or stages of other steps.
In one embodiment, as shown in Fig. 5, an apparatus for processing nonlinear data based on multiple linear regression is provided, including an obtaining module 510, a conversion module 520, a combination module 530, and a determining module 540, where:
the obtaining module 510 is configured to obtain a data model containing multiple explanatory variables, where the explanatory variables are mutually independent;
the conversion module 520 is configured to convert each of the explanatory variables with preset functions to obtain multiple converted explanatory variables;
the combination module 530 is configured to combine the converted explanatory variables according to the data model to obtain multiple candidate linear regression formulas;
the determining module 540 is configured to perform linear regression analysis on each of the candidate linear regression formulas to determine a target regression formula.
In one embodiment, as shown in Fig. 6, the determining module 540 includes:
an estimation unit 541, configured to estimate, with a linear regression function of the R language, the coefficient of each regression term in the candidate linear regression formulas;
a target regression formula determination unit 542, configured to determine the target regression formula from the estimation results using a T test.
In one embodiment, as shown in Fig. 7, the target regression formula determination unit 542 further includes:
a T test subunit 5421, configured to perform a T test on the candidate linear regression formulas according to the estimation results to obtain T test results;
a rejection subunit 5422, configured to reject, in each candidate linear regression formula, the regression terms whose P value in the T test results is greater than 0.01;
a calculation subunit 5423, configured to calculate, for each candidate linear regression formula, the mean of the T values of the remaining regression terms;
a fitting subunit 5424, configured to fit the estimation result with the largest mean T value to obtain the target regression formula.
For specific limitations on the apparatus for processing nonlinear data based on multiple linear regression, refer to the limitations on the method above, which are not repeated here. Each module in the above apparatus may be implemented wholly or partly in software, hardware, or a combination of the two. The modules may be embedded in hardware form in, or be independent of, a processor in the computer device, or may be stored in software form in a memory of the computer device, so that the processor can call them and execute the operations corresponding to each module.
In one embodiment, a computer device is provided, including a memory and a processor. The memory stores a computer program, and the processor performs the following steps when executing the computer program:
Obtaining a data model comprising multiple explanatory variables, wherein the multiple explanatory variables are mutually independent;
Transforming the multiple explanatory variables respectively using preset functions to obtain multiple transformed explanatory variables;
Combining the multiple transformed explanatory variables according to the data model to obtain multiple candidate linear regression formulas;
Performing linear regression analysis on each of the multiple candidate linear regression formulas to determine the target regression formula.
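The transformation and combination steps above can be illustrated with a minimal sketch. It is written in Python rather than the R language named in this application, and it rests on assumptions that the text does not confirm: the table `PRESET_FUNCTIONS`, the helper names `transform_variables` and `candidate_formulas`, and the rule that each candidate applies exactly one preset function to each explanatory variable are all hypothetical. The formula strings are labels for illustration only.

```python
import math
from itertools import product

# Hypothetical preset functions: a few elementary functions plus one
# custom function ("square"). The application does not list them.
PRESET_FUNCTIONS = {
    "identity": lambda x: x,
    "log": math.log,
    "sqrt": math.sqrt,
    "square": lambda x: x * x,
}


def transform_variables(data, functions):
    """Apply every preset function to every explanatory variable.

    data maps a variable name to its list of observations.
    Returns {variable: {function_name: transformed observations}}.
    """
    return {
        var: {name: [f(v) for v in values] for name, f in functions.items()}
        for var, values in data.items()
    }


def candidate_formulas(response, variables, function_names):
    """Build one candidate formula per choice of one function per variable.

    Assumption: candidates are the Cartesian product of the function
    choices, so m functions and k variables give m ** k candidates.
    """
    formulas = []
    for choice in product(function_names, repeat=len(variables)):
        terms = [
            var if fn == "identity" else f"{fn}({var})"
            for var, fn in zip(variables, choice)
        ]
        formulas.append(f"{response} ~ " + " + ".join(terms))
    return formulas


data = {"x1": [1.0, 2.0, 4.0], "x2": [1.0, 4.0, 9.0]}
transformed = transform_variables(data, PRESET_FUNCTIONS)
formulas = candidate_formulas("y", list(data), list(PRESET_FUNCTIONS))
print(len(formulas))  # 4 functions, 2 variables
print(formulas[1])
```

Because the number of candidates grows exponentially with the number of explanatory variables under this assumed rule, a practical implementation would probably restrict the preset functions allowed for each variable.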
In one embodiment, performing linear regression analysis on each of the multiple candidate linear regression formulas to determine the target regression formula comprises: performing linear regression analysis on each of the multiple candidate linear regression formulas using a linear regression function of the R language to determine the target regression formula.
In one embodiment, each candidate linear regression formula includes multiple regression terms. Performing linear regression analysis on each of the multiple candidate linear regression formulas using the linear regression function of the R language to determine the target regression formula then comprises: estimating, using the linear regression function of the R language, the coefficient corresponding to each regression term in the multiple candidate linear regression formulas; and determining the target regression formula by applying a t-test to the estimation result.
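As a rough illustration of the coefficient estimation described above, the following sketch fits one candidate formula by ordinary least squares and computes a T value for each regression term. It is a plain-Python stand-in, not the R linear regression function this application refers to; the helper names `invert` and `ols_with_t` and the sample data are hypothetical. P values are omitted because they require the Student t distribution, which the Python standard library does not provide.

```python
import math


def invert(a):
    """Invert a small square matrix by Gauss-Jordan elimination
    with partial pivoting."""
    n = len(a)
    aug = [
        row[:] + [1.0 if i == j else 0.0 for j in range(n)]
        for i, row in enumerate(a)
    ]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("design matrix is singular or collinear")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [rv - factor * cv for rv, cv in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def ols_with_t(columns, y):
    """Fit y = b0 + b1*col1 + ... by least squares.

    columns is a list of transformed explanatory variables.
    Returns (coefficients, T values), intercept first.
    Requires more observations than coefficients and a non-perfect fit.
    """
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(X[0])
    xtx = [[sum(X[i][a] * X[i][b] for i in range(n)) for b in range(k)]
           for a in range(k)]
    xty = [sum(X[i][a] * y[i] for i in range(n)) for a in range(k)]
    inv = invert(xtx)
    beta = [sum(inv[a][b] * xty[b] for b in range(k)) for a in range(k)]
    residuals = [y[i] - sum(X[i][a] * beta[a] for a in range(k))
                 for i in range(n)]
    sigma2 = sum(r * r for r in residuals) / (n - k)  # residual variance
    t = [beta[a] / math.sqrt(sigma2 * inv[a][a]) for a in range(k)]
    return beta, t


# Hypothetical data following y = 2 + 3*log(x) plus small fixed noise.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
noise = [0.1, -0.1, 0.05, -0.05, 0.1, -0.1]
y = [2.0 + 3.0 * math.log(v) + e for v, e in zip(x, noise)]
beta, t = ols_with_t([[math.log(v) for v in x]], y)
print(beta, t)
```

In R, the coefficient table returned by `summary()` on an `lm` fit reports the same two quantities together with the P value for each term.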
In one embodiment, determining the target regression formula by applying a t-test to the estimation result comprises: performing a t-test on the multiple candidate linear regression formulas according to the estimation result to obtain t-test results; rejecting, from the multiple candidate linear regression formulas, the regression terms whose t-test P value is greater than 0.01; calculating, for each candidate linear regression formula, the mean of the T values in the t-test results of the remaining regression terms; and fitting the estimation result corresponding to the largest mean T value to obtain the target regression formula.
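The selection steps above can be sketched as follows, assuming the T values and P values have already been computed for every candidate. The function name `select_candidate`, the input layout, and the sample numbers are hypothetical. The sketch also assumes that the mean is taken over absolute T values, which the text does not state, and it stops before the final refit of the selected candidate.

```python
def select_candidate(t_results, p_threshold=0.01):
    """Pick the candidate whose retained terms have the largest mean T value.

    t_results maps a formula label to a list of
    (term, t_value, p_value) tuples.
    Returns (label, retained terms, mean T), or (None, [], -inf)
    if every term of every candidate is rejected.
    """
    best_label, best_terms, best_mean = None, [], float("-inf")
    for label, terms in t_results.items():
        # Reject regression terms whose P value exceeds the threshold.
        kept = [(term, t) for term, t, p in terms if p <= p_threshold]
        if not kept:
            continue  # no term of this candidate survived
        # Assumption: use absolute T values so that strongly negative
        # coefficients count as strongly significant.
        mean_t = sum(abs(t) for _, t in kept) / len(kept)
        if mean_t > best_mean:
            best_label = label
            best_terms = [term for term, _ in kept]
            best_mean = mean_t
    return best_label, best_terms, best_mean


# Hypothetical t-test results for three candidate formulas.
t_results = {
    "y ~ x1 + x2": [("x1", 2.1, 0.08), ("x2", 5.0, 0.004)],
    "y ~ log(x1) + x2": [("log(x1)", 9.0, 0.0005), ("x2", 6.0, 0.002)],
    "y ~ x1 + sqrt(x2)": [("x1", 1.2, 0.3), ("sqrt(x2)", 1.5, 0.2)],
}
label, terms, mean_t = select_candidate(t_results)
print(label, terms, mean_t)
```

The retained terms of the selected candidate would then be fitted once more to produce the target regression formula.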
In one embodiment, the linear regression function of the R language includes either of the lm function and the glm function of the R language.
In one embodiment, the preset functions include preset elementary functions and preset custom functions.
In one embodiment, a computer-readable storage medium is provided, on which a computer program is stored. When executed by a processor, the computer program performs the following steps:
Obtaining a data model comprising multiple explanatory variables, wherein the multiple explanatory variables are mutually independent;
Transforming the multiple explanatory variables respectively using preset functions to obtain multiple transformed explanatory variables;
Combining the multiple transformed explanatory variables according to the data model to obtain multiple candidate linear regression formulas;
Performing linear regression analysis on each of the multiple candidate linear regression formulas to determine the target regression formula.
In one embodiment, performing linear regression analysis on each of the multiple candidate linear regression formulas to determine the target regression formula comprises: performing linear regression analysis on each of the multiple candidate linear regression formulas using a linear regression function of the R language to determine the target regression formula.
In one embodiment, each candidate linear regression formula includes multiple regression terms. Performing linear regression analysis on each of the multiple candidate linear regression formulas using the linear regression function of the R language to determine the target regression formula then comprises: estimating, using the linear regression function of the R language, the coefficient corresponding to each regression term in the multiple candidate linear regression formulas; and determining the target regression formula by applying a t-test to the estimation result.
In one embodiment, determining the target regression formula from the estimation result using a t-test and the lm function comprises: performing a t-test on the multiple candidate linear regression formulas according to the estimation result to obtain t-test results; rejecting, from the multiple candidate linear regression formulas, the regression terms whose t-test P value is greater than 0.01; calculating, for each candidate linear regression formula, the mean of the T values in the t-test results of the remaining regression terms; and fitting the estimation result corresponding to the largest mean T value to obtain the target regression formula.
In one embodiment, the linear regression function of the R language includes either of the lm function and the glm function of the R language.
In one embodiment, the preset functions include preset elementary functions and preset custom functions.
Those of ordinary skill in the art will understand that all or part of the processes in the methods of the above embodiments can be completed by a computer program instructing the relevant hardware. The computer program can be stored in a non-volatile computer-readable storage medium and, when executed, may include the processes of the embodiments of the above methods. Any reference to memory, storage, a database, or other media used in the embodiments provided in this application may include non-volatile and/or volatile memory. Non-volatile memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory may include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).
The technical features of the above embodiments can be combined arbitrarily. For brevity, not every possible combination of the technical features in the above embodiments has been described; however, as long as a combination of these technical features involves no contradiction, it should be considered within the scope of this specification.
The above embodiments express only several implementations of this application. Their description is relatively specific and detailed, but they should not therefore be construed as limiting the scope of the patent. It should be noted that those of ordinary skill in the art can make various modifications and improvements without departing from the concept of this application, and these all fall within the protection scope of this application. Therefore, the protection scope of this patent application shall be subject to the appended claims.

Claims (10)

1. A method for processing nonlinear data based on multiple linear regression, characterized by comprising:
Obtaining a data model comprising multiple explanatory variables, the multiple explanatory variables being mutually independent;
Transforming the multiple explanatory variables respectively using preset functions to obtain multiple transformed explanatory variables;
Combining the multiple transformed explanatory variables according to the data model to obtain multiple candidate linear regression formulas;
Performing linear regression analysis on each of the multiple candidate linear regression formulas to determine a target regression formula.
2. The method for processing nonlinear data based on multiple linear regression according to claim 1, characterized in that performing linear regression analysis on each of the multiple candidate linear regression formulas to determine the target regression formula comprises:
Performing linear regression analysis on each of the multiple candidate linear regression formulas using a linear regression function of the R language to determine the target regression formula.
3. The method for processing nonlinear data based on multiple linear regression according to claim 2, characterized in that each candidate linear regression formula includes multiple regression terms, and performing linear regression analysis on each of the multiple candidate linear regression formulas using the linear regression function of the R language to determine the target regression formula comprises:
Estimating, using the linear regression function of the R language, the coefficient corresponding to each regression term in the multiple candidate linear regression formulas;
Determining the target regression formula by applying a t-test to the estimation result.
4. The method for processing nonlinear data based on multiple linear regression according to claim 3, characterized in that determining the target regression formula by applying a t-test to the estimation result comprises:
Performing a t-test on the multiple candidate linear regression formulas according to the estimation result to obtain t-test results;
Rejecting, from the multiple candidate linear regression formulas, the regression terms whose t-test P value is greater than 0.01;
Calculating, for each candidate linear regression formula, the mean of the T values in the t-test results of the remaining regression terms;
Fitting the estimation result corresponding to the largest mean T value to obtain the target regression formula.
5. The method for processing nonlinear data based on multiple linear regression according to any one of claims 2 to 4, characterized in that the linear regression function of the R language includes either of the lm function and the glm function of the R language.
6. The method for processing nonlinear data based on multiple linear regression according to any one of claims 1 to 4, characterized in that the preset functions include preset elementary functions and preset custom functions.
7. A device for processing nonlinear data based on multiple linear regression, characterized by comprising:
An obtaining module, configured to obtain a data model comprising multiple explanatory variables, the multiple explanatory variables being mutually independent;
A transformation module, configured to transform the multiple explanatory variables respectively using preset functions to obtain multiple transformed explanatory variables;
A combination module, configured to combine the multiple transformed explanatory variables according to the data model to obtain multiple candidate linear regression formulas;
A determining module, configured to perform linear regression analysis on each of the multiple candidate linear regression formulas to determine a target regression formula.
8. The device for processing nonlinear data based on multiple linear regression according to claim 7, characterized in that the determining module includes:
An estimation unit, configured to estimate, using a linear regression function of the R language, the coefficient corresponding to each regression term in the multiple candidate linear regression formulas;
A target regression formula determination unit, configured to determine the target regression formula by applying a t-test to the estimation result.
9. The device for processing nonlinear data based on multiple linear regression according to claim 8, characterized in that the target regression formula determination unit includes:
A t-test subunit, configured to perform a t-test on the multiple candidate linear regression formulas according to the estimation result to obtain t-test results;
A rejection subunit, configured to reject, from the multiple candidate linear regression formulas, the regression terms whose t-test P value is greater than 0.01;
A calculation subunit, configured to calculate, for each candidate linear regression formula, the mean of the T values in the t-test results of the remaining regression terms;
A fitting subunit, configured to fit the estimation result corresponding to the largest mean T value to obtain the target regression formula.
10. A computer device, including a memory and a processor, the memory storing a computer program, characterized in that the processor implements the steps of the method according to any one of claims 1 to 6 when executing the computer program.
CN201910104454.6A 2019-02-01 2019-02-01 Method and apparatus based on multiple linear regression processing nonlinear data Pending CN109815444A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910104454.6A CN109815444A (en) 2019-02-01 2019-02-01 Method and apparatus based on multiple linear regression processing nonlinear data

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910104454.6A CN109815444A (en) 2019-02-01 2019-02-01 Method and apparatus based on multiple linear regression processing nonlinear data

Publications (1)

Publication Number Publication Date
CN109815444A true CN109815444A (en) 2019-05-28

Family

ID=66605125

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910104454.6A Pending CN109815444A (en) 2019-02-01 2019-02-01 Method and apparatus based on multiple linear regression processing nonlinear data

Country Status (1)

Country Link
CN (1) CN109815444A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112380489A (en) * 2020-11-03 2021-02-19 武汉光庭信息技术股份有限公司 Data processing time calculation method, data processing platform evaluation method and system
CN112380489B (en) * 2020-11-03 2024-04-16 武汉光庭信息技术股份有限公司 Data processing time calculation method, data processing platform evaluation method and system
CN112818287A (en) * 2021-01-29 2021-05-18 三一海洋重工有限公司 Weighing method and system of material grabbing machine, storage medium and electronic equipment
CN112818287B (en) * 2021-01-29 2024-04-09 三一海洋重工有限公司 Weighing method and system of grabbing machine, storage medium and electronic equipment

Similar Documents

Publication Publication Date Title
CN110991649A (en) Deep learning model building method, device, equipment and storage medium
CN111178512B (en) Device operation neural network test method and device
US10902089B2 (en) Method for predicting stochastic output performance or scaling stochastic inputs
CN109815444A (en) Method and apparatus based on multiple linear regression processing nonlinear data
CN105868102B (en) A kind of mobile terminal application test systems and method based on computer vision
CN106093897A (en) The test system of a kind of radar system and method for testing
CN109542779A (en) Test method, device and storage medium
CN110598305B (en) Sensitivity analysis method for comparing scanning simulation increment of circuit
CN110110406B (en) Slope stability prediction method for achieving LS-SVM model based on Excel computing platform
JP6647992B2 (en) Design support equipment
CN109815126A (en) Method for testing software, device, computer equipment and storage medium
CN111581586A (en) Lake and reservoir water quality anisotropic interpolation method and device based on registration model
CN108921459A (en) Index generates method, apparatus, computer equipment and storage medium
CN109815127B (en) Automatic script conversion method and device, computer equipment and storage medium
CN110020402A (en) Variation function Nested model method for parameter estimation, device, equipment and storage medium
JP5348351B2 (en) Risk profile generator
CN110276802A (en) Illness tissue localization method, device and equipment in medical image
Opara et al. Parametrized benchmarking: an outline of the idea and a feasibility study
CN110210046A (en) Application program and dedicated instruction set processor integration agile design method
CN112632787A (en) Simulation test method for multi-solution flash evaporation optimization strategy
Boher et al. Implications of model misspecification in robust tests for recurrent events
Mittas et al. StatREC: A graphical user interface tool for visual hypothesis testing of cost prediction models
CN109544661A (en) Area drawing drawing method, device, computer equipment and storage medium
Łazȩcka et al. Multiple testing of conditional independence hypotheses using information-theoretic approach
CN107992287A (en) Method and device for checking system demand priority ranking result

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20190528