CN108170770A - Analysis and training platform based on big data - Google Patents
Analysis and training platform based on big data
- Publication number: CN108170770A
- Application number: CN201711428840.8A
- Authority: CN (China)
- Prior art keywords: data, analysis, row, matrix, value
- Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G06F16/2282: Tablespace storage structures; management thereof
- G06F16/2465: Query processing support for facilitating data mining operations in structured databases
- G06F16/283: Multi-dimensional databases or data warehouses, e.g. MOLAP or ROLAP
- G06F16/284: Relational databases
- G06F18/2135: Feature extraction based on approximation criteria, e.g. principal component analysis
- G06F18/214: Generating training patterns; bootstrap methods, e.g. bagging or boosting
- G06F18/23213: Non-hierarchical clustering techniques using statistics or function optimisation, with a fixed number of clusters, e.g. K-means clustering
Abstract
The present invention relates to a big-data-based analysis and training platform comprising the following modules: a Web analysis service module, for publishing analysis results and for browser-based page design; an automation service component module, for invoking and automatically updating data tasks; a management and integration service module, for centrally deploying functions such as data integration, user management, analysis-document management, and data-source information management; a statistical service module, providing an advanced statistical computing engine and a deployment manager; and an application data service module that integrates directly with enterprise data sources: SAP R/3, SAP B/W, UFIDA ERP, Oracle EBS, Kingdee ERP. The platform further comprises a persistence-layer loading module, which includes a flat-file data processing unit, a database-file data loading unit, and a data-model creation unit.
Description
Technical field
The invention belongs to the field of big data technology, and in particular relates to a big-data-based analysis and training platform.
Background art
Big data refers to data sets whose scale exceeds the acquisition, storage, management, and analysis capabilities of traditional database software. Big data analysis does not rely on traditional random sampling; instead, all of the data is analyzed and processed. Big data has four main characteristics: massive scale, diverse data types, fast data flow, and low value density.
Massive scale: the data volume is so huge that centralized storage and centralized computation cannot handle it.
Diverse data types: the types and sources are diversified, including logs, pictures, video, documents, and geographic locations.
Fast data flow: the data must be analyzed and processed quickly so that the analysis of massive data remains timely and effective.
Low value density: information of high commercial value is mixed with a large amount of irrelevant information, so complex, deep analysis is required to mine the value.
The strategic significance of big data technology does not lie in how much data one holds, but in carrying out specialized processing of the data in hand with big data techniques. In fact, if big data is regarded as an industry, the key to profitability in this industry is improving the ability to process big data, and through this processing ability realizing the value of big data. People study big data precisely in order to use it to realize value, and for commercial departments in particular, mining big data has tangible significance.
However, the prior art provides no dedicated big data experience system for this demand. This is a shortcoming of the prior art.
Therefore, in view of the above drawbacks of the prior art, it is necessary to design a big-data-based analysis and training platform to solve the above technical problems.
Summary of the invention
An object of the present invention is, in view of the above drawbacks of the prior art, to provide a big-data-based analysis and training platform that solves the above technical problems.
To achieve the above object, the present invention provides the following technical scheme:
A big-data-based analysis and training platform, characterized by comprising the following modules:
a Web analysis service module, for publishing analysis results and for browser-based page design;
an automation service component module, for invoking and automatically updating data tasks;
a management and integration service module, for centrally deploying functions such as data integration, user management, analysis-document management, and data-source information management;
a statistical service module, providing an advanced statistical computing engine and a deployment manager;
an application data service module, directly integrated with enterprise data sources: SAP R/3, SAP B/W, UFIDA ERP, Oracle EBS, Kingdee ERP;
further comprising a persistence-layer loading module;
the persistence-layer loading module comprises a flat-file data processing unit, a database-file data loading unit, and a data-model creation unit;
The flat-file data processing unit comprises the following steps:
S1.1: loading a plain txt file
(1) select "File" -> "Add Data Table" -> "Add" to load the data;
(2) add the text file to be analyzed;
(3) open the analysis document;
(4) click "OK" to import the text.
S1.2: loading a plain Excel file
(1) select "File" -> "Add Data Table" -> "Add" to load the data;
(2) add the Excel analysis document;
(3) open the Excel analysis document;
(4) click "OK" to import the Excel analysis document.
S1.3: transforming data:
(1) select "File" - "Replace Data Table", or select "Insert" - "Columns", or select "Insert" - "Rows";
(2) select the transformation method;
(3) the various normalization methods can be written as expressions, or applied as transformation steps while a data table is being added;
(4) transformations can be applied in the add-data dialog box, in the "Add Data Table" dialog box, or in columns or rows inserted from an external data tool; click "Transform" - "Normalize" - "Add" to show the controls described below.
S1.4: preprocessing the data: calculated-column and normalization operations.
S1.5: pivoting (transposing) data
(1) select "Transform" - "Pivot" - "Add" to pivot the data;
(2) row identifier: each unique value in the selected identity column or hierarchy forms a row in the generated table;
(3) column headings: each unique value in the selected category column or hierarchy forms a new column, one for each aggregation method, in the generated data table;
(4) value (%V) and aggregation (%M): the columns from which the data values are calculated; the values in the generated data table are calculated according to the method selected under "Aggregation" in the column selector menu.
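The transposition (pivot) operation of S1.5, with its row identifier, column headings, and selectable aggregation method, can be illustrated with a small sketch. This is not the platform's implementation, only a toy Python version with invented data, where the aggregation method is passed in the same way the "Aggregation" menu selects it:

```python
from collections import defaultdict

# Toy records: (region, product, sales) rows to be pivoted.
rows = [
    ("North", "A", 10), ("North", "B", 5),
    ("South", "A", 7),  ("North", "A", 3),
]

def pivot(rows, agg=sum):
    """Pivot: row identifier = region, column heading = product,
    value = aggregated sales (aggregation method selectable)."""
    cells = defaultdict(list)
    regions, products = [], []
    for region, product, value in rows:
        if region not in regions:
            regions.append(region)
        if product not in products:
            products.append(product)
        cells[(region, product)].append(value)
    # One output row per unique row-identifier value,
    # one output column per unique column-heading value.
    return {r: {p: agg(cells.get((r, p), [0])) for p in products}
            for r in regions}

table = pivot(rows)
print(table)
# {'North': {'A': 13, 'B': 5}, 'South': {'A': 7, 'B': 0}}
```

Swapping `agg=sum` for `max` or a mean function changes the aggregation exactly as selecting a different method in the menu would.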
S1.6: unpivoting data
(1) select "Transform" - "Unpivot" - "Add" to unpivot the data;
(2) columns to pass through: the selected columns containing information that should be transferred unconverted to the unpivoted data set;
(3) columns to transform: the selected columns containing the values to be merged into a single column; the names of these columns become the category labels in the new category column;
(4) category column name: type a name for the column that summarizes the information provided in the transformed columns;
(5) value column name: type a name for the column that indicates the type of information contained in the new value column.
The database-file data loading unit comprises the following steps:
S1.7: opening data using OLE DB
(1) select "File" -> "Add Data Table" -> "Add" -> "Other" -> "Database";
(2) in the "Open Database" dialog box, click to select "OleDb Data Provider";
(3) click "Configure".
S1.8: opening data using ODBC
(1) select "File" -> "Add Data Table" -> "Add" -> "Other" -> "Database";
(2) in the "Open Database" dialog box, click to select "Odbc Data Provider";
(3) click "Configure".
The data-model creation unit comprises the following steps:
S1.9: adding data connections
S1.91: adding a data source in the library:
(1) in the menu bar, click "Tools" and select the "Manage Data Connections" option; the "Manage Data Connections" dialog box pops up;
(2) select "Add New" - "Data Source", and select the data source type from the list;
(3) according to the selected data source type, fill in the corresponding information, connect to the data source, select the database, and click "OK";
(4) optionally add a "Description" in the data source dialog box;
(5) click "Save"; the "Save as Library Item" dialog box is shown;
(6) the new data source is stored at the designated location in the library.
S1.92: adding a data connection in the library:
(1) in the menu bar, click "Tools" and select the "Manage Data Connections" option;
(2) select "Add New Item" - "Data Connection", and select the "Data connection in library" option from the list;
(3) according to the selected data connection type, fill in the corresponding information, connect to the data source, select the database, and click "OK"; the "Views in Connection" dialog box pops up;
(4) in the "Available tables in database" list, double-click the tables to be used in the big data analysis and training platform;
(5) when finished, click "OK"; the "Data Connection Settings" dialog box pops up; the added data tables are shown in the "Data Table Views" list;
(6) enter a description in the "Connection Description" box so that other users can understand and use the connection;
(7) click "Save" to store the data connection at the designated location in the library.
S1.10: using a data connection in an analysis
(1) click "File" and select the "Add Data Table" option; the "Add Data Table" dialog box pops up;
(2) click "Add" and then the "Data connection in library" option; the "Select Data Connection" dialog box pops up;
(3) select the data connection to be used in the library, and click "OK";
(4) in the "Add Data Table" dialog box, use the check boxes to select which views in the data connection are added as new data tables;
(5) select whether the "load method" is "Import data table" or "Keep data table external", and optionally specify whether data is loaded on demand;
(6) click "OK".
S1.11: editing data connections
S1.111: editing a data connection in the library:
(1) click "Tools" and select "Manage Data Connections";
(2) select the data connection to be edited, then click "Edit"; the "Data Connection Settings" dialog box is shown;
(3) make the changes and save the data connection.
S1.112: editing a data source in the library:
(1) click "Tools" and select "Manage Data Connections";
(2) select the data source to be edited, then click "Edit"; the "Data Source Settings" dialog box is shown;
(3) make the changes and save the data source.
S1.12: custom queries
(1) using Tools > Manage Data Connections or File > Add Data Table..., create a new data connection to the relational database, then select the necessary content until the "Views in Connection" dialog box is shown;
(2) in the "Views in Connection" dialog box, select "Custom Query" > "New Custom Query";
(3) type the query name under "Custom Query";
(4) type the query in the language of the selected database;
(5) click "Verify";
(6) browse the result columns to ensure that all required result columns are listed and that they have the correct data types;
(7) click "OK".
S1.13: creating data table relationships
Relationships between data tables are established in the data connection; when relationships exist between the data tables, the data model is established through these relationships.
S1.131: creating a table "relationship":
(1) in "Views in Connection", select the data tables between which the "relationship" is to be established;
(2) select "Relationship" -> "New Relationship" to establish the relationship;
(3) click "OK".
S1.132: viewing table "relationships":
a plus sign indicates that the table has one or more structural relationships with other tables in the database; to view the relationship structure, click the plus sign to expand the view.
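The table "relationship" described above corresponds, in database terms, to a join key between two tables, and the data model is derived by joining through it. As an illustrative sketch only (the patent does not specify the platform's storage engine, and all table and column names below are invented), Python's standard-library sqlite3 can demonstrate the idea:

```python
import sqlite3

# In-memory database with two related tables; the "relationship"
# corresponds to the foreign key linking orders to customers.
con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customers(id),
        amount REAL
    );
    INSERT INTO customers VALUES (1, 'Acme'), (2, 'Globex');
    INSERT INTO orders VALUES (10, 1, 99.5), (11, 1, 12.0), (12, 2, 7.25);
""")
# A data model built on the relationship: join the tables through the key.
rows = con.execute("""
    SELECT c.name, COUNT(o.id), SUM(o.amount)
    FROM customers c JOIN orders o ON o.customer_id = c.id
    GROUP BY c.name ORDER BY c.name
""").fetchall()
print(rows)  # [('Acme', 2, 111.5), ('Globex', 1, 7.25)]
```

The joined, aggregated result is exactly the kind of derived table a relationship-based data model exposes for analysis.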
It further comprises a big data mining module, which includes:
Principal component analysis function prcomp(x, ...)
prcomp.default(x, retx = TRUE, center = TRUE, scale. = FALSE, tol = NULL, ...)
Parameter description:
x: the numeric or complex matrix to be analyzed;
retx: logical variable specifying whether the rotated variables are returned;
center: logical variable specifying whether the variables are centered;
scale.: logical variable specifying whether the variables are standardized;
tol: numeric variable specifying a precision; components whose magnitude falls below this value are omitted;
Principal component analysis is a statistical analysis technique that reduces multiple indicators to a few composite indicators; the generated principal components reflect most of the information of the original variables and are usually linear combinations of the original variables.
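As a worked illustration of what prcomp computes (independent of R, which internally uses a singular value decomposition), the sketch below centres two invented variables, forms their sample covariance matrix, and takes its eigenvalues in closed form for the 2x2 case; the eigenvalues are the variances carried by the principal components:

```python
import math

# Two perfectly correlated variables: all variance should land on PC1.
x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 4.0, 6.0, 8.0]
n = len(x)

# center = TRUE in prcomp: subtract the column means.
mx, my = sum(x) / n, sum(y) / n
xc = [v - mx for v in x]
yc = [v - my for v in y]

# Sample covariance matrix [[a, b], [b, c]].
a = sum(v * v for v in xc) / (n - 1)
b = sum(u * v for u, v in zip(xc, yc)) / (n - 1)
c = sum(v * v for v in yc) / (n - 1)

# Eigenvalues of a symmetric 2x2 matrix in closed form; these are the
# variances explained by the two principal components.
d = math.sqrt(((a - c) / 2) ** 2 + b * b)
lam1 = (a + c) / 2 + d
lam2 = (a + c) / 2 - d

share = lam1 / (lam1 + lam2)
print(round(share, 4))  # 1.0 -- PC1 carries all the variance here
```

Because y is an exact multiple of x, the second eigenvalue is zero and a single composite indicator reproduces both variables, which is the dimension reduction the passage describes.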
Principal component information query function summary.aov(object, intercept = FALSE, split, expand.split = TRUE, keep.zero.df = TRUE, ...);
The summary function extracts the principal model information: the minimum, maximum, quartiles, and mean of numeric variables, and frequency statistics for factor vectors and logical vectors;
Parameter description:
object: a model object inheriting from class aov;
intercept: this option applies only to single-layer models; by default, intercept = FALSE;
expand.split: whether the split is continued in sub-levels; by default, expand.split = TRUE;
keep.zero.df: whether the original data length is retained;
Linear least-squares fitting lsfit(x, y, wt = NULL, intercept = TRUE, tolerance = 1e-07, yname = NULL)
Fits a weighted least-squares multiple regression; returns a list containing the estimated coefficients for the matrix of explanatory variables, the residuals, and the QR decomposition;
Parameter description:
x: a vector or matrix of explanatory variables;
y: the response variable; it may be a matrix;
wt: optional parameter, a vector of execution weights for weighted least squares;
intercept: whether an intercept term is used;
tolerance: the tolerance to be used in the matrix decomposition;
yname: names to be used for the response variables;
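For a single explanatory variable with an intercept, the coefficients that a least-squares fit such as lsfit estimates have a closed form. A minimal sketch with invented data:

```python
# Minimal least-squares fit y = intercept + slope * x, mirroring what
# lsfit(x, y) estimates for one explanatory variable with an intercept.
x = [0.0, 1.0, 2.0, 3.0]
y = [1.0, 3.0, 5.0, 7.0]   # exactly y = 1 + 2x
n = len(x)

mx = sum(x) / n
my = sum(y) / n
slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) \
        / sum((xi - mx) ** 2 for xi in x)
intercept = my - slope * mx

# lsfit also returns the residuals of the fit.
residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
print(slope, intercept)  # 2.0 1.0
```

Since the invented data lie exactly on a line, every residual is zero; with noisy data the same formulas minimize the sum of squared residuals.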
Factor analysis factanal(x, factors, data = NULL, covmat = NULL, n.obs = NA, subset, na.action, start = NULL, scores = c("none", "regression", "Bartlett"), rotation = "varimax", control = NULL, ...)
The factanal function performs factor analysis on data starting from the sample, the sample covariance matrix, or the sample correlation matrix;
Parameter description:
x: a matrix or data frame of the data;
factors: the number of factors;
data: a data frame or matrix; it is used only when x is a formula;
covmat: the covariance matrix or correlation matrix of the sample;
scores: the method used to compute factor scores; rotation: the name of the rotation function to be used;
Principal component analysis prediction function predict.prcomp(object, newdata, ...)
Applies the prediction model obtained by principal component analysis to new data;
Parameter description:
object: an object of class prcomp;
newdata: the data matrix or values to be analyzed;
Analysis of variance aov(formula, data = NULL, projections = FALSE, qr = TRUE, contrasts = NULL, ...)
Fits an analysis-of-variance model of the specified class "aov";
Parameter description:
formula: the formula of the analysis of variance; in one-way analysis of variance it is x ~ A;
data: the data frame on which the analysis of variance is performed;
projections: a logical value indicating whether the projections are returned;
qr: a logical flag indicating whether the orthogonal (QR) decomposition is returned;
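The one-way ANOVA that aov fits for x ~ A reduces to comparing between-group and within-group variability through an F statistic. A hand computation on invented data, shown only to illustrate the quantity being fitted:

```python
# One-way ANOVA (x ~ A): F statistic computed by hand for three groups.
groups = [
    [3.0, 4.0, 5.0],
    [6.0, 7.0, 8.0],
    [9.0, 10.0, 11.0],
]

n = sum(len(g) for g in groups)
k = len(groups)
grand = sum(sum(g) for g in groups) / n

# Between-group and within-group sums of squares.
ss_between = sum(len(g) * (sum(g) / len(g) - grand) ** 2 for g in groups)
ss_within = sum((v - sum(g) / len(g)) ** 2 for g in groups for v in g)

# F = (between-group mean square) / (within-group mean square).
f_stat = (ss_between / (k - 1)) / (ss_within / (n - k))
print(f_stat)  # 27.0
```

A large F, as here, indicates the group means differ far more than the within-group scatter would explain.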
Regression prediction function lm(formula, data, weights, subset, na.action, method = "qr", model = FALSE, x = FALSE, y = FALSE, contrasts = NULL, ...)
The lm() function returns the fitting result; it can be used for regression, single-stratum analysis of variance, and analysis of covariance;
Parameter description:
formula: the model formula to be fitted;
data: an optional data frame or list;
subset: the subset of observations to be used;
model, x, y, qr: logical values; if TRUE, the corresponding components of the fitted object (the model frame, the model matrix, the response, and the QR decomposition of the matrix) are returned;
Hierarchical clustering function hclust(d, method = "complete", members = NULL)
hclust performs hierarchical clustering on a distance or similarity structure;
Parameter description:
d: a distance structure or distance matrix;
method: a character string giving the clustering method;
"complete": the farthest-distance (complete linkage) method;
members: NULL or a vector of the length of d; by default, the value of all elements is 1;
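The method = "complete" option measures the distance between two clusters as the largest pairwise distance between their members. A minimal agglomerative sketch (not R's implementation) on invented 1-D points, stopping when two clusters remain:

```python
# Complete-linkage ("farthest distance") agglomerative clustering on 1-D
# points, the method = "complete" behaviour described for hclust.
points = [0.0, 0.1, 0.2, 10.0, 10.1]

def complete_linkage(a, b):
    """Distance between clusters = largest pairwise member distance."""
    return max(abs(p - q) for p in a for q in b)

clusters = [[p] for p in points]
while len(clusters) > 2:             # merge until two clusters remain
    best = None
    for i in range(len(clusters)):
        for j in range(i + 1, len(clusters)):
            d = complete_linkage(clusters[i], clusters[j])
            if best is None or d < best[0]:
                best = (d, i, j)
    _, i, j = best
    clusters[i] = clusters[i] + clusters[j]   # merge the closest pair
    del clusters[j]

print(sorted(map(sorted, clusters)))
# [[0.0, 0.1, 0.2], [10.0, 10.1]]
```

hclust itself records every merge as a dendrogram rather than stopping at a fixed cluster count; the merge rule, however, is the same.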
K-means clustering kmeans(x, centers, iter.max = 10, nstart = 1, algorithm = c())
The K-means clustering algorithm partitions the N*P matrix X into K classes so that the distances between objects within a class are minimized and the distances between the classes are maximized;
Parameter description:
x: a matrix or data frame of the data;
centers: the number of clusters, or the centers of the initial classes;
iter.max: the maximum number of iterations; it is 10 by default;
nstart: the number of random initial sets;
algorithm: specifies the algorithm used for the cluster calculation;
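The K-means iteration alternates an assignment step and a centre-update step. The sketch below uses fixed initial centres for reproducibility, whereas kmeans() draws nstart random initial sets; the data and centres are invented:

```python
# Minimal K-means sketch: partition points into K classes so that
# within-class distances shrink, with fixed initial centres.
points = [(1.0, 1.0), (1.5, 2.0), (8.0, 8.0), (9.0, 9.0)]
centers = [(0.0, 0.0), (10.0, 10.0)]   # initial class centres

def dist2(p, q):
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2

for _ in range(10):                     # iter.max-style cap
    # Assignment step: each point joins its nearest centre's class.
    labels = [min(range(len(centers)), key=lambda k: dist2(p, centers[k]))
              for p in points]
    # Update step: each centre moves to the mean of its class.
    new = []
    for k in range(len(centers)):
        members = [p for p, l in zip(points, labels) if l == k]
        new.append((sum(p[0] for p in members) / len(members),
                    sum(p[1] for p in members) / len(members)))
    if new == centers:                  # converged
        break
    centers = new

print(labels)   # [0, 0, 1, 1]
print(centers)  # [(1.25, 1.5), (8.5, 8.5)]
```

Each pass can only reduce the total within-class squared distance, which is why the loop converges; the iter.max cap guards against slow convergence on larger data.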
Association analysis (confidence interval) function confint(object, parm, level = 0.95, ...)
The confint function obtains confidence intervals for the model parameters;
Parameter description:
object: the fitted model;
parm: a vector of character strings specifying the parameters for which interval estimates are required;
level: the confidence level of the confidence interval; it must be between 0 and 1;
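As an illustration of an interval at level = 0.95, the sketch below computes a normal-approximation confidence interval for a sample mean; confint itself derives intervals from the fitted model's own standard errors, so this shows only the underlying idea, on invented data:

```python
import math

# Sketch of a 95% confidence interval for a mean using the normal
# approximation: estimate +/- z * standard error.
sample = [4.8, 5.1, 5.0, 4.9, 5.2, 5.0, 4.7, 5.3]
n = len(sample)
mean = sum(sample) / n
sd = math.sqrt(sum((v - mean) ** 2 for v in sample) / (n - 1))
se = sd / math.sqrt(n)                # standard error of the mean

z = 1.96                              # level = 0.95 two-sided quantile
lower, upper = mean - z * se, mean + z * se
print(round(lower, 3), round(upper, 3))
```

Raising the level widens the interval (z grows), which is the trade-off the level parameter controls.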
ARIMA time-series modelling function arima(x, order = c(0, 0, 0), seasonal = list(order = c(0, 0, 0), period = NA), xreg = NULL, include.mean = TRUE, transform.pars = TRUE, fixed = NULL, init = NULL, method = c("CSS-ML", "ML", "CSS"), n.cond, optim.method = "BFGS", optim.control = list(), kappa = 1e+06)
The ARIMA model converts a non-stationary time series into a stationary time series, and then establishes a model by regressing the dependent variable on its own lagged values and on the present and lagged values of the random error term;
Parameter description:
x: a numeric vector or univariate time series;
order: an integer vector giving the order (p, d, q) of the model;
seasonal: specifies whether the model is a seasonal model; fixed: specifies which parameters of the model are held fixed;
xreg: a time series, vector, or regression matrix;
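The d in order = (p, d, q) is the number of differencing passes used to convert the non-stationary series into a stationary one before the AR and MA parts are fitted. A minimal illustration on an invented trending series:

```python
# The "I" in ARIMA: d-th order differencing removes trend so that the
# AR and MA parts can be fitted to a stationary series.
def difference(series, d=1):
    """Apply first differencing d times."""
    for _ in range(d):
        series = [b - a for a, b in zip(series, series[1:])]
    return series

trend = [1.0, 3.0, 6.0, 10.0, 15.0, 21.0]   # quadratic trend
print(difference(trend, d=1))  # [2.0, 3.0, 4.0, 5.0, 6.0] -- still trending
print(difference(trend, d=2))  # [1.0, 1.0, 1.0, 1.0] -- constant, stationary
```

Here one differencing pass leaves a linear trend, while d = 2 removes the quadratic trend entirely, which is why the order of differencing is chosen to match the trend in the data.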
Time function time(x, ...)
cycle(x, ...)
These give the time points of a time series, or the positions within each cycle, and return a single time series or a similar object;
x: a time series object;
Autoregression function ar(x, aic = TRUE, order.max = NULL, method = c("yule-walker", "burg", "ols", "mle", "yw"), na.action = na.fail, series = deparse(substitute(x)), ...)
Fits an autoregressive model to a time series;
Parameter description:
x: a univariate or multivariate time series inheriting from class "ts";
order.max: the maximum order of the autoregressive model to be fitted;
na.action: the function used to handle missing values.
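For an AR(1) model, the "yule-walker" method listed for ar() reduces to dividing the lag-1 autocovariance by the lag-0 autocovariance. A sketch on an invented, deterministically decaying series (real usage would fit noisy data, and higher orders solve a system of such equations):

```python
# Yule-Walker sketch for AR(1): the coefficient estimate is the lag-1
# autocovariance divided by the lag-0 autocovariance.
def yule_walker_ar1(x):
    n = len(x)
    mean = sum(x) / n
    c0 = sum((v - mean) ** 2 for v in x) / n
    c1 = sum((x[t] - mean) * (x[t + 1] - mean) for t in range(n - 1)) / n
    return c1 / c0

# A deterministic series generated by x[t+1] = 0.8 * x[t].
x = [1.0]
for _ in range(200):
    x.append(0.8 * x[-1])

phi = yule_walker_ar1(x)
print(round(phi, 3))  # 0.8 -- recovers the generating coefficient
```

The estimate lands close to the generating coefficient 0.8 because the sample autocovariances of a long series approximate the population ones.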
The beneficial effects of the present invention are as follows. In the Internet era, big data is developing rapidly and the volume of data and information is rising exponentially; people perform higher-level analysis on the data so that it can be put to better use. The distributed storage and computing capabilities of cloud computing provide the technical support; the core of big data is data processing, and data mining technology becomes the core technique for using the data efficiently and discovering its value. Skilled mastery of big data analysis tools, and the ability to combine big data analysis results with enterprise sales and operation management practice, are the new requirements. The big data analysis and training platform is big data analysis software of the latest generation: it can quickly analyze and process many types of data, and it can meet the analysis and decision requirements for large amounts of data in management and R&D processes of different natures. Its greatest characteristic is that, through a variety of dynamic graphics and filter conditions, it quickly analyzes and processes large amounts of data; it can generate many display forms, including block diagrams, curve graphs, pie charts, scatter plots, combination charts, maps, tree diagrams, heat maps, box plots, summary tables, and cross tables; all graphics provide numerous dimensions of data analysis, and access and display through a variety of client interfaces and web interfaces are supported. In addition, the platform has the following characteristics: where a data warehouse already exists, the existing data warehouse can be fully utilized, so that the existing IT investment continues to produce greater value; analysis of large data volumes is supported; on-demand access strategies are supported; data warehouse data is loaded directly, avoiding repeated modelling work; various table components are provided: tables, cross tables, graphical tables, and summary tables; rich graphics are provided: bar charts, line charts, combination charts, pie charts, scatter plots, three-dimensional scatter plots, maps, tree diagrams, heat maps, parallel coordinate plots, and box plots; text components are provided; and drag-and-drop visual operation with instant display of analysis results is supported. In addition, the design principle of the present invention is reliable and the structure is simple, so it has a very broad application prospect.
It can thus be seen that, compared with the prior art, the present invention has prominent substantive features and represents notable progress, and the beneficial effects of its implementation are also obvious.
Specific embodiments
The present invention is described in detail below through specific embodiments; the following embodiments are explanatory of the invention, and the invention is not limited to the following implementations.
A big-data-based analysis and training platform provided by the invention is characterized by comprising the following modules:
a Web analysis service module, for publishing analysis results and for browser-based page design;
an automation service component module, for invoking and automatically updating data tasks;
a management and integration service module, for centrally deploying functions such as data integration, user management, analysis-document management, and data-source information management;
a statistical service module, providing an advanced statistical computing engine and a deployment manager;
an application data service module, directly integrated with enterprise data sources: SAP R/3, SAP B/W, UFIDA ERP, Oracle EBS, Kingdee ERP;
further comprising a persistence-layer loading module;
the persistence-layer loading module comprises a flat-file data processing unit, a database-file data loading unit, and a data-model creation unit;
The flat file data processing unit includes the following steps:
S1.1: Plain txt file application
(1) select "File" -> "Add Data Table" -> "Add" to load the data;
(2) add the text file to be analyzed;
(3) open the analysis file;
(4) click "OK" to import the text;
S1.2: Plain Excel file application
(1) select "File" -> "Add Data Table" -> "Add" to load the data;
(2) add the Excel analysis file;
(3) open the Excel analysis file;
(4) click "OK" to import the Excel analysis file;
S1.3: Data conversion steps:
(1) select "File" - "Replace Data Table", or select "Insert" - "Columns", or select "Insert" - "Rows";
(2) select the conversion method;
(3) multiple normalization methods can be written as expressions, or applied as a conversion step when adding a data table;
(4) conversions can be applied in the add-data dialog box, in the "Add Data Table" dialog box, or in columns or rows inserted from an external data tool; click "Convert" - "Normalize" - "Add" to display the control described below;
S1.4: Pre-process the data: calculated columns and normalization operations;
S1.5: Transposing data
(1) select "Convert" - "Transpose" - "Add" to transpose the data;
(2) row identifier: each unique value in the selected identity column or hierarchy forms a row in the generated table;
(3) column headings: each unique value in the selected category column or hierarchy forms a new column in the generated data table for each aggregation method;
(4) values (%V) and aggregation (%M): the columns from which the data values are calculated; the values in the generated data table are calculated according to the method selected under "Aggregation" in the column selector menu;
S1.6: Unpivoting data
(1) select "Convert" - "Unpivot" - "Add" to unpivot the data;
(2) columns to pass through: the selected columns whose information should be transferred unchanged to the unpivoted data set;
(3) columns to convert: the selected columns containing the values to be merged into a single column; the names of these columns will serve as category labels in the new category column of the generated table;
(4) category column name: type a name under which the information provided in the selected converted columns can be summarized;
(5) value column name: type a column name describing the type of information contained in the new value column;
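The unpivot operation of S1.6 can be sketched in plain Python; the tiny `melt` helper below is an invented illustration, not part of the platform: pass-through columns are copied unchanged, the names of the converted columns become labels in a new category column, and their values are merged into a single value column.

```python
def melt(rows, id_cols, value_cols, category_name="category", value_name="value"):
    """Unpivot: merge value_cols into one value column, with the original
    column names serving as labels in a new category column."""
    out = []
    for row in rows:
        for col in value_cols:
            rec = {c: row[c] for c in id_cols}   # pass-through columns
            rec[category_name] = col             # class label = old column name
            rec[value_name] = row[col]           # merged value column
            out.append(rec)
    return out

# Invented wide-format data: one column per year.
wide = [{"id": 1, "2016": 10, "2017": 12},
        {"id": 2, "2016": 8, "2017": 9}]
long = melt(wide, id_cols=["id"], value_cols=["2016", "2017"],
            category_name="year", value_name="sales")
print(long[0])  # → {'id': 1, 'year': '2016', 'sales': 10}
```

The transpose of S1.5 is the inverse direction: grouping the long rows back into one column per category label.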
The database-file data loading unit performs the following steps:
S1.7: Opening data using OLE DB
(1) select "File" -> "Add Data Table" -> "Add" -> "Other" -> "Database";
(2) in the "Open Database" dialog box, click to select "OleDb Data Provider";
(3) click "Configure";
S1.8: Opening data using ODBC
(1) select "File" -> "Add Data Table" -> "Add" -> "Other" -> "Database";
(2) in the "Open Database" dialog box, click to select "Odbc Data Provider";
(3) click "Configure".
The data-model creation unit performs the following steps:
S1.9: Adding data connections
S1.91: adding a data source in the library:
(1) click "Tools" in the menu bar and select the "Manage Data Connections" option; the "Manage Data Connections" dialog box pops up;
(2) select "Add New" - "Data Source" and select the data source type from the list;
(3) according to the selected data source type, fill in the corresponding information, connect to the data source, select the database and click "OK";
(4) optionally add a "Description" in the data source dialog box;
(5) click "Save" to display the "Save as Library Item" dialog box;
(6) the new data source is saved to the specified location in the library;
S1.92: adding a data connection in the library:
(1) click "Tools" in the menu bar and select the "Manage Data Connections" option;
(2) select "Add New Item" - "Data Connection" and select the "Data connection in library" option from the list;
(3) according to the selected data connection type, fill in the corresponding information, connect to the data source, select the database and click "OK"; the "Views in Connection" dialog box pops up;
(4) in the "Available tables in database" list, double-click the tables to be used in the big-data analysis and training platform;
(5) when finished, click OK; the "Data Connection Settings" dialog box pops up; the added data tables are shown in the "spreadsheet views" list;
(6) enter a connection description in the "Connection description" box to help other users understand and use it;
(7) click "Save" to save the data connection to the specified location in the library;
S1.10: Using a data connection in an analysis
(1) click "File" and select the "Add Data Table" option; the "Add Data Table" dialog box pops up;
(2) click "Add" and select the "Data connection in library" option; the "Select Data Connection" dialog box pops up;
(3) select the data connection to be used in the library and click "OK";
(4) in the "Add Data Table" dialog box, tick the check boxes to select which views of the data connection are added as new data tables;
(5) select the loading method, "Import data table" or "Keep data table external", and optionally specify whether data is loaded on demand;
(6) click "OK";
S1.10: Editing data connections
S1.101: editing a data connection in the library:
(1) select "Tools" and click "Manage Data Connections";
(2) select the data connection to be edited, then click "Edit"; the "Data Connection Settings" dialog box is displayed;
(3) make the changes and save the data connection;
S1.102: editing a data source in the library:
(1) select "Tools" and click "Manage Data Connections";
(2) select the data source to be edited, then click "Edit"; the "Data Source Settings" dialog box is displayed;
(3) make the changes and save the data source;
S1.11: Custom queries
(1) using Tools > Manage Data Connections or File > Add Data Table..., create a new data connection to a relational database, then select the necessary content until the "Views in Connection" dialog box is displayed;
(2) in the "Views in Connection" dialog box, select "Custom Query" > "New Custom Query";
(3) type the query name in "Custom Query";
(4) type the query in the language of the selected database;
(5) click Verify;
(6) browse the result columns to ensure that all required result columns are listed and have the correct data types;
(7) click OK;
S1.12: Creating data table relations
Used to establish relations between data tables in a data connection; when relations exist between data tables, a data model is established through the relations;
S1.121: creating a data table "relation":
(1) in "Views in Connection", select the data tables between which a "relation" is to be established;
(2) select "Relations" -> "New Relation" to establish the relation;
(3) click OK;
S1.122: viewing table "relations":
a plus sign indicates that the table has one or more structural relations with other tables in the database; to view the relation structure, click the plus sign to expand the view;
The platform further comprises a big-data mining module, which includes:
Principal component analysis function prcomp(x, ...)
prcomp.default(x, retx = TRUE, center = TRUE, scale. = FALSE, tol = NULL, ...)
Parameter description:
x: the numeric or complex matrix to be analyzed;
retx: logical variable specifying whether the rotated variables are returned;
center: logical variable specifying whether the variables are centered;
scale.: logical variable specifying whether the variables are standardized;
tol: numeric variable specifying the precision; values below this value are ignored;
Principal component analysis is a statistical analysis technique that reduces multiple indicators to a few comprehensive indicators; the generated principal components reflect most of the information in the original variables and are usually linear combinations of the original variables;
Principal component information query function summary.aov(object, intercept = FALSE, split, expand.split = TRUE, keep.zero.df = TRUE, ...);
The summary function extracts principal component information, giving the minimum, maximum, quartiles and mean of numeric variables, and frequency statistics for factor vectors and logical vectors;
Parameter description:
object: a model object inheriting from class aov;
intercept: this option applies only to single-layer models; by default intercept = FALSE;
expand.split: whether splitting continues within levels; by default expand.split = TRUE;
keep.zero.df: whether the original data length is retained;
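The summary statistics named above (minimum, maximum, quartiles, mean) can be reproduced with the Python standard library; this is a sketch of the numeric summary only, not of the summary.aov function itself, and the data values are invented.

```python
import statistics

values = [2.0, 4.0, 4.0, 5.0, 7.0, 9.0, 10.0, 12.0]

# quantiles(n=4) yields the three quartiles Q1, Q2 (median), Q3
# using the default "exclusive" method.
q1, q2, q3 = statistics.quantiles(values, n=4)
summary = {
    "min": min(values),
    "q1": q1,
    "median": q2,
    "mean": statistics.mean(values),
    "q3": q3,
    "max": max(values),
}
print(summary["min"], summary["median"], summary["max"])  # → 2.0 6.0 12.0
```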
Linear least-squares fitting lsfit(x, y, wt = NULL, intercept = T, tolerance = 1.e-07, yname = NULL)
Suitable for weighted least-squares multiple regression; returns a list containing the estimated coefficients for the explanatory variable matrix, the residuals and the QR decomposition;
Parameter description:
x: a vector or a matrix of explanatory variables;
y: the response variable, which may be a matrix;
wt: optional parameter, the weight vector used to perform weighted least squares;
intercept: whether an intercept term should be used;
tolerance: the tolerance to be used in the matrix decomposition;
yname: the name used for the response variable;
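The weighted least-squares fit that lsfit performs can be sketched with numpy (a hedged Python analogue, not R's implementation): scale the rows of the design matrix and the response by the square roots of the weights, then solve the ordinary least-squares problem.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100
x = rng.normal(size=(n, 2))                  # explanatory variable matrix
beta_true = np.array([3.0, 1.5, -2.0])       # intercept and two slopes (invented)
y = beta_true[0] + x @ beta_true[1:] + 0.01 * rng.normal(size=n)

wt = rng.uniform(0.5, 2.0, size=n)           # weight vector (wt parameter)
X = np.column_stack([np.ones(n), x])         # analogue of intercept=T
sw = np.sqrt(wt)

# Weighted least squares: scale rows by sqrt(weights), then solve.
coef, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
residuals = y - X @ coef
print(np.round(coef, 2))
```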
Factor analysis factanal(x, factors, data = NULL, covmat = NULL, n.obs = NA, subset, na.action, start = NULL, scores = c("none", "regression", "Bartlett"), rotation = "varimax", control = NULL, ...)
The factanal function performs factor analysis on data starting from the sample, the sample variance matrix or the sample correlation matrix;
Parameter description:
x: a matrix or data frame of the data;
factors: the number of factors;
data: a data frame or matrix, used only when x is a formula;
covmat: the covariance matrix or the correlation matrix of the sample;
scores: the method for computing factor scores; rotation: the name of the rotation function to be used;
Principal component analysis prediction function predict.prcomp(object, newdata, ...)
Performs prediction on data through the prediction model of principal component analysis;
Parameter description:
object: an object of class prcomp;
newdata: the data matrix or values to be analyzed;
Analysis of variance aov(formula, data = NULL, projections = FALSE, qr = TRUE, contrasts = NULL, ...)
Fits an analysis-of-variance model of class "aov" for the specified model;
Parameter description:
formula: the formula of the analysis of variance; for one-way analysis of variance it is x ~ A;
data: the data frame on which the analysis of variance is performed;
projections: logical value indicating whether the projections are returned;
qr: logical flag indicating whether the orthogonal (QR) decomposition is returned;
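The one-way analysis of variance x ~ A mentioned above reduces to comparing between-group and within-group variation; this numpy sketch computes the F statistic directly on invented data (an illustration of the idea, not R's aov).

```python
import numpy as np

# Three groups (levels of factor A) with clearly different means.
groups = [np.array([5.1, 4.9, 5.3, 5.0]),
          np.array([6.8, 7.1, 6.9, 7.2]),
          np.array([9.0, 9.2, 8.8, 9.1])]

all_vals = np.concatenate(groups)
grand_mean = all_vals.mean()
k, n = len(groups), len(all_vals)

# Between-group and within-group sums of squares.
ss_between = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)
ss_within = sum(float(((g - g.mean()) ** 2).sum()) for g in groups)

# F = mean square between / mean square within.
f_stat = (ss_between / (k - 1)) / (ss_within / (n - k))
print(f_stat > 10.0)  # a large F: the group means clearly differ
```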
Regression prediction function lm(formula, data, weights, subset, na.action, method = "qr", model = FALSE, x = FALSE, y = FALSE, contrasts = NULL, ...)
The lm() function returns the fitted result; it can be used for regression, single-stratum analysis of variance and analysis of covariance;
Parameter description:
formula: the model to be fitted;
data: an optional data frame or list;
subset: the subset of observations to be used;
model, x, y, qr: logical values; if set, the corresponding components of the fitted object (the model frame, the model matrix, the response and the QR decomposition of the matrix) are returned;
Hierarchical clustering function hclust(d, method = "complete", members = NULL)
hclust performs hierarchical clustering on a distance or similarity structure;
Parameter description:
d: a distance structure or distance matrix;
method: a character string giving the clustering method; "complete" is the longest-distance (complete linkage) method;
members: NULL or a vector of the length of d; by default the value of every element is 1;
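Complete linkage ("longest distance") means the distance between two clusters is the largest pairwise distance between their members; a minimal agglomerative sketch over an invented distance matrix (an illustration of the linkage rule, not hclust itself):

```python
import numpy as np

# Distance matrix for 4 points: {0,1} close, {2,3} close, groups far apart.
d = np.array([[0.0, 1.0, 9.0, 8.0],
              [1.0, 0.0, 8.5, 9.5],
              [9.0, 8.5, 0.0, 1.2],
              [8.0, 9.5, 1.2, 0.0]])

clusters = [[i] for i in range(len(d))]
while len(clusters) > 2:
    # Complete linkage: inter-cluster distance = max pairwise distance.
    best = None
    for i in range(len(clusters)):
        for j in range(i + 1, len(clusters)):
            dist = max(d[a][b] for a in clusters[i] for b in clusters[j])
            if best is None or dist < best[0]:
                best = (dist, i, j)
    _, i, j = best
    clusters[i] = clusters[i] + clusters[j]  # merge the closest pair
    del clusters[j]

print(sorted(sorted(c) for c in clusters))  # → [[0, 1], [2, 3]]
```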
K-means clustering kmeans(x, centers, iter.max = 10, nstart = 1, algorithm = c())
The K-means clustering algorithm partitions the N*P matrix X into K classes so that the distance between objects within a class is minimized and the distance between classes is maximized;
Parameter description:
x: a matrix or data frame of the data;
centers: the number of clusters, or the centers of the initial classes;
iter.max: the maximum number of iterations, 10 by default;
nstart: the number of random initial sets;
algorithm: specifies the algorithm used for the cluster calculation;
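Lloyd's algorithm behind k-means alternates assigning each point to its nearest center with recomputing the centers; a compact numpy sketch on invented data (an illustration, not R's kmeans; the initial centers are picked deterministically, one from each blob, to keep the demo reproducible):

```python
import numpy as np

rng = np.random.default_rng(2)
# Two well-separated blobs in the plane.
x = np.vstack([rng.normal(0.0, 0.3, size=(30, 2)),
               rng.normal(5.0, 0.3, size=(30, 2))])

k, iter_max = 2, 10
centers = x[[0, 30]]                 # one starting point from each blob
for _ in range(iter_max):
    # Assignment step: nearest center for each point.
    dists = np.linalg.norm(x[:, None, :] - centers[None, :, :], axis=2)
    labels = dists.argmin(axis=1)
    # Update step: each center becomes the mean of its assigned points.
    new_centers = np.array([x[labels == j].mean(axis=0) for j in range(k)])
    if np.allclose(new_centers, centers):
        break
    centers = new_centers

print(sorted(round(float(c.mean()), 1) for c in centers))
```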
Association analysis function confint(object, parm, level = 0.95, ...)
The confint function obtains confidence intervals for model parameters;
Parameter description:
object: the fitted model;
parm: a vector of character strings naming the parameters for which interval estimates are required;
level: the confidence level of the confidence interval, which must be between 0 and 1;
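As an illustration of the interval estimate a level = 0.95 call produces (a simplified numpy sketch on invented data, using the normal quantile 1.96 as an approximation to the exact t quantile that confint would use): the 95% interval for a regression slope is the estimate plus or minus 1.96 standard errors.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 200
x = rng.normal(size=n)
y = 2.5 * x + rng.normal(scale=0.5, size=n)   # true slope = 2.5 (invented)

X = np.column_stack([np.ones(n), x])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ coef

# Coefficient covariance from the residual variance.
sigma2 = float(resid @ resid) / (n - 2)
cov = sigma2 * np.linalg.inv(X.T @ X)
se = np.sqrt(np.diag(cov))

z = 1.96  # normal approximation to the 95% quantile (an assumption)
lower, upper = coef[1] - z * se[1], coef[1] + z * se[1]
print(f"95% CI for slope: [{lower:.2f}, {upper:.2f}]")
```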
ARIMA time series modelling function arima(x, order = c(0,0,0), seasonal = list(order = c(0,0,0), period = NA), xreg = NULL, include.mean = TRUE, transform.pars = TRUE, fixed = NULL, init = NULL, method = c("CSS-ML", "ML", "CSS"), n.cond, optim.method = "BFGS", optim.control = list(), kappa = 1e+06)
An ARIMA model converts a non-stationary time series into a stationary time series, and then establishes a model by regressing the dependent variable on its own lagged values and on the present and lagged values of the random error term;
Parameter description:
x: a numeric vector or univariate time series;
order: an integer vector giving the order (p, d, q) of the model;
seasonal: specifies whether the model is seasonal; fixed specifies whether model parameters are fixed, with preset parameters of 0;
xreg: a time series, vector or matrix of regressors;
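The regression-on-lagged-values idea behind ARIMA can be shown in miniature: fit an AR(1) model by regressing the series on its own first lag (a hedged numpy sketch on simulated data, far simpler than the arima function itself).

```python
import numpy as np

rng = np.random.default_rng(4)
n, phi_true = 2000, 0.7

# Simulate a stationary AR(1) series: x_t = phi * x_{t-1} + e_t.
x = np.zeros(n)
for t in range(1, n):
    x[t] = phi_true * x[t - 1] + rng.normal()

# Fit by least squares: regress x_t on x_{t-1} (with an intercept).
X = np.column_stack([np.ones(n - 1), x[:-1]])
coef, *_ = np.linalg.lstsq(X, x[1:], rcond=None)
phi_hat = float(coef[1])
print(round(phi_hat, 2))  # should be close to the true phi of 0.7
```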
Time functions time(x, ...) and cycle(x, ...)
Give the time points of a time series, or the positions within a cycle; return a single time series or another object.
Claims (1)
1. An analysis and training platform based on big data, characterized in that it comprises the following modules:
a Web analysis service module, for browser-based publication of analysis results and page design;
an automation services component module, for invoking and automatically updating data tasks;
a management and integration services module, providing centrally deployed functions such as data integration, user management, analysis-file management and data-source information management;
a statistical services module, providing an advanced statistical computing engine and a deployment manager;
an application data service module, directly integrated with enterprise data sources: SAP R/3, SAP B/W, UF ERP, Oracle EBS and Kingdee ERP;
it further comprises a persistence architecture loading module;
the persistence architecture loading module comprises a flat-file data processing unit, a database-file data loading unit and a data-model creation unit;
the flat-file data processing unit performs the following steps:
S1.1: Plain txt file application
(1) select "File" -> "Add Data Table" -> "Add" to load the data;
(2) add the text file to be analyzed;
(3) open the analysis file;
(4) click "OK" to import the text;
S1.2: Plain Excel file application
(1) select "File" -> "Add Data Table" -> "Add" to load the data;
(2) add the Excel analysis file;
(3) open the Excel analysis file;
(4) click "OK" to import the Excel analysis file;
S1.3: Data conversion steps:
(1) select "File" - "Replace Data Table", or select "Insert" - "Columns", or select "Insert" - "Rows";
(2) select the conversion method;
(3) multiple normalization methods can be written as expressions, or applied as a conversion step when adding a data table;
(4) conversions can be applied in the add-data dialog box, in the "Add Data Table" dialog box, or in columns or rows inserted from an external data tool; click "Convert" - "Normalize" - "Add" to display the control described below;
S1.4: Pre-process the data: calculated columns and normalization operations;
S1.5: Transposing data
(1) select "Convert" - "Transpose" - "Add" to transpose the data;
(2) row identifier: each unique value in the selected identity column or hierarchy forms a row in the generated table;
(3) column headings: each unique value in the selected category column or hierarchy forms a new column in the generated data table for each aggregation method;
(4) values and aggregation: the columns from which the data values are calculated; the values in the generated data table are calculated according to the method selected under "Aggregation" in the column selector menu;
S1.6: Unpivoting data
(1) select "Convert" - "Unpivot" - "Add" to unpivot the data;
(2) columns to pass through: the selected columns whose information should be transferred unchanged to the unpivoted data set;
(3) columns to convert: the selected columns containing the values to be merged into a single column; the names of these columns will serve as category labels in the new category column of the generated table;
(4) category column name: type a name under which the information provided in the selected converted columns can be summarized;
(5) value column name: type a column name describing the type of information contained in the new value column;
the database-file data loading unit performs the following steps:
S1.7: Opening data using OLE DB
(1) select "File" -> "Add Data Table" -> "Add" -> "Other" -> "Database";
(2) in the "Open Database" dialog box, click to select "OleDb Data Provider";
(3) click "Configure";
S1.8: Opening data using ODBC
(1) select "File" -> "Add Data Table" -> "Add" -> "Other" -> "Database";
(2) in the "Open Database" dialog box, click to select "Odbc Data Provider";
(3) click "Configure";
the data-model creation unit performs the following steps:
S1.9: Adding data connections
S1.91: adding a data source in the library:
(1) click "Tools" in the menu bar and select the "Manage Data Connections" option; the "Manage Data Connections" dialog box pops up;
(2) select "Add New" - "Data Source" and select the data source type from the list;
(3) according to the selected data source type, fill in the corresponding information, connect to the data source, select the database and click "OK";
(4) optionally add a "Description" in the data source dialog box;
(5) click "Save" to display the "Save as Library Item" dialog box;
(6) the new data source is saved to the specified location in the library;
S1.92: adding a data connection in the library:
(1) click "Tools" in the menu bar and select the "Manage Data Connections" option;
(2) select "Add New Item" - "Data Connection" and select the "Data connection in library" option from the list;
(3) according to the selected data connection type, fill in the corresponding information, connect to the data source, select the database and click "OK"; the "Views in Connection" dialog box pops up;
(4) in the "Available tables in database" list, double-click the tables to be used in the big-data analysis and training platform;
(5) when finished, click OK; the "Data Connection Settings" dialog box pops up; the added data tables are shown in the "data table view" list;
(6) enter a connection description in the "Connection description" box to help other users understand and use it;
(7) click "Save" to save the data connection to the specified location in the library;
S1.10: Using a data connection in an analysis
(1) click "File" and select the "Add Data Table" option; the "Add Data Table" dialog box pops up;
(2) click "Add" and select the "Data connection in library" option; the "Select Data Connection" dialog box pops up;
(3) select the data connection to be used in the library and click "OK";
(4) in the "Add Data Table" dialog box, tick the check boxes to select which views of the data connection are added as new data tables;
(5) select the loading method, "Import data table" or "Keep data table external", and optionally specify whether data is loaded on demand;
(6) click "OK";
S1.10: Editing data connections
S1.101: editing a data connection in the library:
(1) select "Tools" and click "Manage Data Connections";
(2) select the data connection to be edited, then click "Edit"; the "Data Connection Settings" dialog box is displayed;
(3) make the changes and save the data connection;
S1.102: editing a data source in the library:
(1) select "Tools" and click "Manage Data Connections";
(2) select the data source to be edited, then click "Edit"; the "Data Source Settings" dialog box is displayed;
(3) make the changes and save the data source;
S1.11: Custom queries
(1) using Tools > Manage Data Connections or File > Add Data Table..., create a new data connection to a relational database, then select the necessary content until the "Views in Connection" dialog box is displayed;
(2) in the "Views in Connection" dialog box, select "Custom Query" > "New Custom Query";
(3) type the query name in "Custom Query";
(4) type the query in the language of the selected database;
(5) click Verify;
(6) browse the result columns to ensure that all required result columns are listed and have the correct data types;
(7) click OK;
S1.12: Creating data table relations
Used to establish relations between data tables in a data connection; when relations exist between data tables, a data model is established through the relations;
S1.121: creating a data table "relation":
(1) in "Views in Connection", select the data tables between which a "relation" is to be established;
(2) select "Relations" -> "New Relation" to establish the relation;
(3) click OK;
S1.122: viewing table "relations":
a plus sign indicates that the table has one or more structural relations with other tables in the database; to view the relation structure, click the plus sign to expand the view;
the platform further comprises a big-data mining module, which includes:
Principal component analysis function prcomp(x, ...)
prcomp.default(x, retx = TRUE, center = TRUE, scale. = FALSE, tol = NULL, ...)
Parameter description:
x: the numeric or complex matrix to be analyzed;
retx: logical variable specifying whether the rotated variables are returned;
center: logical variable specifying whether the variables are centered;
scale.: logical variable specifying whether the variables are standardized;
tol: numeric variable specifying the precision; values below this value are ignored;
Principal component analysis is a statistical analysis technique that reduces multiple indicators to a few comprehensive indicators; the generated principal components reflect most of the information in the original variables and are usually linear combinations of the original variables;
Principal component information query function summary.aov(object, intercept = FALSE, split, expand.split = TRUE, keep.zero.df = TRUE, ...);
The summary function extracts principal component information, giving the minimum, maximum, quartiles and mean of numeric variables, and frequency statistics for factor vectors and logical vectors;
Parameter description:
object: a model object inheriting from class aov;
intercept: this option applies only to single-layer models; by default intercept = FALSE;
expand.split: whether splitting continues within levels; by default expand.split = TRUE;
keep.zero.df: whether the original data length is retained;
Linear least-squares fitting lsfit(x, y, wt = NULL, intercept = T, tolerance = 1.e-07, yname = NULL)
Suitable for weighted least-squares multiple regression; returns a list containing the estimated coefficients for the explanatory variable matrix, the residuals and the QR decomposition;
Parameter description:
x: a vector or a matrix of explanatory variables;
y: the response variable, which may be a matrix;
wt: optional parameter, the weight vector used to perform weighted least squares;
intercept: whether an intercept term should be used;
tolerance: the tolerance to be used in the matrix decomposition;
yname: the name used for the response variable;
Factor analysis factanal(x, factors, data = NULL, covmat = NULL, n.obs = NA, subset, na.action, start = NULL, scores = c("none", "regression", "Bartlett"), rotation = "varimax", control = NULL, ...)
The factanal function performs factor analysis on data starting from the sample, the sample variance matrix or the sample correlation matrix;
Parameter description:
x: a matrix or data frame of the data;
factors: the number of factors;
data: a data frame or matrix, used only when x is a formula;
covmat: the covariance matrix or the correlation matrix of the sample;
scores: the method for computing factor scores; rotation: the name of the rotation function to be used;
Principal component analysis prediction function predict.prcomp(object, newdata, ...)
Performs prediction on data through the prediction model of principal component analysis;
Parameter description:
object: an object of class prcomp;
newdata: the data matrix or values to be analyzed;
Analysis of variance aov(formula, data = NULL, projections = FALSE, qr = TRUE, contrasts = NULL, ...)
Fits an analysis-of-variance model of class "aov" for the specified model;
Parameter description:
formula: the formula of the analysis of variance; for one-way analysis of variance it is x ~ A;
data: the data frame on which the analysis of variance is performed;
projections: logical value indicating whether the projections are returned;
qr: logical flag indicating whether the orthogonal (QR) decomposition is returned;
Regression prediction function lm(formula, data, weights, subset, na.action, method = "qr", model = FALSE, x = FALSE, y = FALSE, contrasts = NULL, ...)
The lm() function returns the fitted result; it can be used for regression, single-stratum analysis of variance and analysis of covariance;
Parameter description:
formula: the model to be fitted;
data: an optional data frame or list;
subset: the subset of observations to be used;
model, x, y, qr: logical values; if set, the corresponding components of the fitted object (the model frame, the model matrix, the response and the QR decomposition of the matrix) are returned;
Hierarchical clustering function hclust(d, method = "complete", members = NULL)
hclust performs hierarchical clustering on a distance or similarity structure;
Parameter description:
d: a distance structure or distance matrix;
method: a character string giving the clustering method; "complete" is the longest-distance (complete linkage) method;
members: NULL or a vector of the length of d; by default the value of every element is 1;
K-means clustering kmeans(x, centers, iter.max = 10, nstart = 1, algorithm = c())
The K-means clustering algorithm partitions the N*P matrix X into K classes so that the distance between objects within a class is minimized and the distance between classes is maximized;
Parameter description:
x: a matrix or data frame of the data;
centers: the number of clusters, or the centers of the initial classes;
iter.max: the maximum number of iterations, 10 by default;
nstart: the number of random initial sets;
algorithm: specifies the algorithm used for the cluster calculation;
Association analysis function confint(object, parm, level = 0.95, ...)
The confint function obtains confidence intervals for model parameters;
Parameter description:
object: the fitted model;
parm: a vector of character strings naming the parameters for which interval estimates are required;
level: the confidence level of the confidence interval, which must be between 0 and 1;
ARIMA time series modelling function arima(x, order = c(0,0,0), seasonal = list(order = c(0,0,0), period = NA), xreg = NULL, include.mean = TRUE, transform.pars = TRUE, fixed = NULL, init = NULL, method = c("CSS-ML", "ML", "CSS"), n.cond, optim.method = "BFGS", optim.control = list(), kappa = 1e+06)
An ARIMA model converts a non-stationary time series into a stationary time series, and then establishes a model by regressing the dependent variable on its own lagged values and on the present and lagged values of the random error term;
Parameter description:
x: a numeric vector or univariate time series;
order: an integer vector giving the order (p, d, q) of the model;
seasonal: specifies whether the model is seasonal; fixed specifies whether model parameters are fixed, with preset parameters of 0;
xreg: a time series, vector or matrix of regressors;
Time functions time(x, ...) and cycle(x, ...)
Give the time points of a time series, or the positions within a cycle; return a single time series or another object;
x: a time series object;
Time series function ar(x, aic = TRUE, order.max = NULL, method = c("yule-walker", "burg", "ols", "mle", "yw"), na.action = na.fail, series = deparse(substitute(x)), ...)
Fits an autoregressive model to a time series;
Parameter description:
x: a univariate or multivariate time series inheriting from class "ts";
order.max: the maximum order of the autoregressive model to be fitted to the time series;
na.action: the function used to handle missing values.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711428840.8A CN108170770A (en) | 2017-12-26 | 2017-12-26 | A kind of analyzing and training platform based on big data |
Publications (1)
Publication Number | Publication Date |
---|---|
CN108170770A true CN108170770A (en) | 2018-06-15 |
Family
ID=62520729
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201711428840.8A Pending CN108170770A (en) | 2017-12-26 | 2017-12-26 | A kind of analyzing and training platform based on big data |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108170770A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109508317A (en) * | 2018-10-31 | 2019-03-22 | 武汉光谷联众大数据技术有限责任公司 | A kind of Large Volume Data and service management system |
CN110660461A (en) * | 2019-09-23 | 2020-01-07 | 广州市番禺区中心医院(广州市番禺区人民医院、广州市番禺区心血管疾病研究所) | Cross-platform medical data information uploading system based on artificial intelligence |
Citations (9)

| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN103544299A * | 2013-10-30 | 2014-01-29 | 刘峰 | Construction method for commercial intelligent cloud computing system |
| CN103605732A * | 2013-11-19 | 2014-02-26 | 北京京东尚科信息技术有限公司 | Data warehouse, data warehouse system and data warehouse construction method based on Infobright |
| CN105549982A * | 2016-01-14 | 2016-05-04 | 国网山东省电力公司物资公司 | Automated development platform based on model configuration |
| CN105843842A * | 2016-03-08 | 2016-08-10 | 东北大学 | Multi-dimensional gathering querying and displaying system and method in big data environment |
| US20160267503A1 * | 2013-07-01 | 2016-09-15 | Salespredict Sw Ltd. | System and method for predicting sales |
| CN106022477A * | 2016-05-18 | 2016-10-12 | 国网信通亿力科技有限责任公司 | Intelligent analysis decision system and method |
| CN106340161A * | 2016-08-25 | 2017-01-18 | 山东联科云计算科技有限公司 | Public security early warning system based on big data |
| CN107103050A * | 2017-03-31 | 2017-08-29 | 海通安恒(大连)大数据科技有限公司 | A kind of big data Modeling Platform and method |
| CN107491553A * | 2017-08-31 | 2017-12-19 | 武汉光谷信息技术股份有限公司 | A kind of data digging method and system |
2017-12-26: Application CN201711428840.8A filed (published as CN108170770A); status: Pending
Cited By (4)

| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN109508317A * | 2018-10-31 | 2019-03-22 | 武汉光谷联众大数据技术有限责任公司 | A kind of Large Volume Data and service management system |
| CN109508317B * | 2018-10-31 | 2023-06-09 | 陕西合友网络科技有限公司 | High-capacity data and service management system |
| CN110660461A * | 2019-09-23 | 2020-01-07 | 广州市番禺区中心医院(广州市番禺区人民医院、广州市番禺区心血管疾病研究所) | Cross-platform medical data information uploading system based on artificial intelligence |
| CN110660461B * | 2019-09-23 | 2023-03-24 | 广州市番禺区中心医院(广州市番禺区人民医院、广州市番禺区心血管疾病研究所) | Cross-platform medical data information uploading system based on artificial intelligence |
Similar Documents

| Publication | Publication Date | Title |
|---|---|---|
| CN107704608A | | A kind of OLAP multidimensional analyses and data digging system |
| US9639814B2 | | Automated default dimension selection within a multidimensional enterprise software system |
| CN106202536B | | Global metadata standardized platform system and its construction method based on XBRL |
| US8533152B2 | | System and method for data provenance management |
| US7593955B2 | | Generation of aggregatable dimension information within a multidimensional enterprise software system |
| CN110134724A | | A kind of the data intelligence extraction and display system and method for Building Information Model |
| CN109684330A | | User's portrait base construction method, device, computer equipment and storage medium |
| US20140181154A1 | | Generating information models in an in-memory database system |
| CN103631882A | | Semantization service generation system and method based on graph mining technique |
| US9043337B1 | | Data mining and model generation using an in-database analytic flow generator |
| CN103226743A | | TRL-based technical maturity assessment information processing method for aeronautic equipment |
| US20090055429A1 | | Method and system for data collection |
| Kalampokis et al. | | Linked open cube analytics systems: Potential and challenges |
| CN108170770A | | A kind of analyzing and training platform based on big data |
| CN115640300A | | Big data management method, system, electronic equipment and storage medium |
| CN110377668A | | Data analysing method and system |
| Delchev et al. | | Big Data Analysis Architecture |
| Wang et al. | | Leveraging relational database for fast construction of OPC UA information model |
| Alhaj Ali et al. | | Distributed data mining systems: techniques, approaches and algorithms |
| CN107562909A | | A kind of big data analysis system and its analysis method for merging search and calculating |
| CN111813555B | | Super-fusion infrastructure layered resource management system based on internet technology |
| Fisun et al. | | Implementation of the information system of the association rules generation from OLAP-cubes in the post-relational DBMS Caché |
| Jiang | | Research and practice of big data analysis process based on hadoop framework |
| Zhang et al. | | Application of SQL server in data mining |
| Dijkman et al. | | Advanced queueing models for quantitative business process analysis |
Legal Events

| Date | Code | Title | Description |
|---|---|---|---|
| | PB01 | Publication | |
| | SE01 | Entry into force of request for substantive examination | |
| | RJ01 | Rejection of invention patent application after publication | Application publication date: 2018-06-15 |