CN108897818A - Method, apparatus, and readable storage medium for determining the aging state of a data processing procedure - Google Patents
- Publication number: CN108897818A
- Application number: CN201810638889.4A
- Authority: CN (China)
- Prior art keywords: processing procedure, data processing, binomial, data, predicted value
- Legal status: Granted (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abstract
This application provides a method, an apparatus, and a readable storage medium for determining the aging state of a data processing procedure. The method includes: determining a data processing procedure that can be scheduled by a data warehouse system; determining an aging feature vector of the data processing procedure; inputting the aging feature vector into a trained binary classification model and calculating, by the binary classification model, a predicted value corresponding to the data processing procedure; and determining the aging state of the data processing procedure based on the predicted value. Because the predicted value calculated by the binary classification model predicts the aging state of a data processing procedure comprehensively and accurately, aged data processing procedures are discovered proactively and in time. Aged data processing procedures can also be discovered in real time, which greatly improves the timeliness and effectiveness of data processing procedure detection.
Description
Technical field
This application relates to the technical field of data processing, and in particular to a method, an apparatus, and a readable storage medium for determining the aging state of a data processing procedure.
Background art
In a data warehouse, ETL data processing procedures such as data extraction (Extract), transformation (Transform), and loading (Load) accumulate as system complexity grows. To reduce system complexity, one existing implementation registers a life cycle for each data processing procedure before it goes online; when a data processing procedure reaches the end of its life cycle, the data corresponding to it is archived and destroyed. However, a life-cycle-based approach cannot accurately determine the aging state of a data processing procedure. If the life cycle expires while the data processing procedure still has business value, that value is lost when the procedure is destroyed. Conversely, some data processing procedures no longer have business value but, because of their configured life cycle, are not destroyed in time.
Summary of the invention
In view of this, this application provides a method, an apparatus, and a readable storage medium for detecting data processing procedures. By actively detecting the aging state of a data processing procedure, it ensures that data processing procedures without business value are destroyed in time and that data processing procedures that still have business value are not destroyed.
To achieve the above object, this application provides the following technical solutions:
According to a first aspect of this application, a method for determining the aging state of a data processing procedure is proposed, including:
determining a data processing procedure that can be scheduled by a data warehouse system;
determining an aging feature vector of the data processing procedure;
inputting the aging feature vector into a trained binary classification model, and calculating, by the binary classification model, a predicted value corresponding to the data processing procedure;
determining the aging state of the data processing procedure based on the predicted value.
According to a second aspect of this application, an apparatus for determining the aging state of a data processing procedure is proposed, including:
a first determining module, configured to determine a data processing procedure that can be scheduled by a data warehouse system;
a second determining module, configured to determine an aging feature vector of the data processing procedure determined by the first determining module;
a computing module, configured to input the aging feature vector determined by the second determining module into a trained binary classification model, and to calculate, by the binary classification model, a predicted value corresponding to the data processing procedure;
a third determining module, configured to determine the aging state of the data processing procedure based on the predicted value calculated by the computing module.
According to a third aspect of this application, a computer-readable storage medium is proposed. The storage medium stores a computer program, and the computer program is used to execute the method for determining the aging state of a data processing procedure proposed in the first aspect.
As can be seen from the above technical solutions, the predicted value calculated by the binary classification model allows this application to predict the aging state of a data processing procedure comprehensively and accurately, so that aged data processing procedures are discovered proactively and in time. Aged data processing procedures can also be discovered in real time, which greatly improves the timeliness and effectiveness of data processing procedure detection.
Brief description of the drawings
Fig. 1 is a flowchart of a method for determining the aging state of a data processing procedure according to an exemplary embodiment of this application.
Fig. 2 is a flowchart of a method for determining the aging state of a data processing procedure according to another exemplary embodiment of this application.
Fig. 3 is a flowchart of training a binary classification model according to an exemplary embodiment of this application.
Fig. 4 is a structural diagram of an apparatus for determining the aging state of a data processing procedure according to an exemplary embodiment of this application.
Fig. 5 is a structural diagram of an apparatus for determining the aging state of a data processing procedure according to another exemplary embodiment of this application.
Fig. 6 is a structural diagram of an electronic device according to an exemplary embodiment of this application.
Detailed description of the embodiments
Exemplary embodiments are described in detail here, and examples of them are shown in the accompanying drawings. When the following description refers to the drawings, the same numbers in different drawings denote the same or similar elements unless otherwise indicated. The implementations described in the following exemplary embodiments do not represent all implementations consistent with this application. Rather, they are merely examples of apparatuses and methods consistent with some aspects of this application as detailed in the appended claims.
The terms used in this application are for the purpose of describing particular embodiments only and are not intended to limit this application. The singular forms "a", "said", and "the" used in this application and the appended claims are also intended to include the plural forms, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein refers to and includes any or all possible combinations of one or more of the associated listed items.
It should be understood that although the terms first, second, third, and so on may be used in this application to describe various information, the information should not be limited by these terms. These terms are only used to distinguish information of the same type from one another. For example, without departing from the scope of this application, first information may also be referred to as second information, and similarly, second information may also be referred to as first information. Depending on the context, the word "if" as used herein may be interpreted as "when", "while", or "in response to determining".
Fig. 1 is a flowchart of a method for determining the aging state of a data processing procedure according to an exemplary embodiment of this application. As shown in Fig. 1, the method includes the following steps:
Step 101: determine a data processing procedure that can be scheduled by the data warehouse system.
In one embodiment, a data processing procedure may be a procedure in the data warehouse such as data extraction, data transformation, data loading, or data updating. In one embodiment, the state of each data processing procedure in the data warehouse may be recorded in a data table, and the data processing procedures that can be scheduled by the data warehouse system are determined from that table. The state of a data processing procedure may be online or offline. The online state means that the data processing procedure can be scheduled and executed by the scheduling system of the data warehouse; the offline state means that the data processing procedure will not be scheduled and executed by the scheduling system. For example, a data warehouse contains 100 data processing procedures. Of these, 80 are in the online state, meaning that they can be called or accessed by other data processing procedures, and 20 are in the offline state, meaning that they are not called or accessed by other data processing procedures. Step 101 finds, among the 100 data processing procedures, the 80 that can be called by the data warehouse.
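The filtering in step 101 can be illustrated with a minimal sketch. It assumes the status table is available as a list of records; `ProcedureRecord`, `schedulable_procedures`, and the `"online"`/`"offline"` strings are hypothetical names chosen for illustration and do not come from the application.

```python
from dataclasses import dataclass


@dataclass
class ProcedureRecord:
    name: str    # hypothetical identifier of a data processing procedure
    status: str  # "online" or "offline", as recorded in the status table


def schedulable_procedures(status_table):
    """Return the records whose status allows the scheduler to run them."""
    return [rec for rec in status_table if rec.status == "online"]


# Mirror the example in the text: 100 procedures, 80 online and 20 offline.
status_table = (
    [ProcedureRecord(f"proc_{i}", "online") for i in range(80)]
    + [ProcedureRecord(f"proc_{i}", "offline") for i in range(80, 100)]
)
online = schedulable_procedures(status_table)
print(len(online))  # 80
```

Only the procedures returned here would go on to feature extraction in step 102.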
Step 102: determine the aging feature vector of the data processing procedure.
In one embodiment, the aging feature vector may include multiple elements, each of which may be a parameter that characterizes the data processing procedure, for example, the number of downstream dependencies of the data processing procedure, the latest logic update time of the data processing procedure, the query heat of the data processing procedure, and so on. The number of downstream dependencies is the number of downstream data processing procedures that can run only after the present data processing procedure has finished. For example, if data processing procedures A1, A2, and A3 can run only after data processing procedure A has finished, the number of downstream dependencies of data processing procedure A is 3. The latest logic update duration is the length of time between the most recent modification of the procedure's logic and the current time point, and may be measured in days, hours, and so on. For example, if the data processing procedure was modified at 24:00 on April 1, 2018 and the current time point is 12:00 on April 3, 2018, the latest logic update duration is 36 hours. The query heat is the number of times the data model corresponding to the present data processing procedure has been queried (or queried through self-service queries) within a preset duration before the current time point (for example, the most recent day or the most recent N days, where N is a natural number).
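The three feature parameters described above can be computed roughly as follows. This is a sketch under assumptions: the dependency graph is a mapping from each procedure to the set of upstream procedures it waits on, and query times are plain `datetime` values. All function names are hypothetical.

```python
from datetime import datetime, timedelta


def downstream_dependency_count(procedure, upstreams):
    """Count procedures that list `procedure` among their upstream dependencies."""
    return sum(1 for deps in upstreams.values() if procedure in deps)


def hours_since_logic_update(last_modified, now):
    """Length of time, in hours, since the procedure's logic was last modified."""
    return (now - last_modified).total_seconds() / 3600.0


def query_heat(query_times, now, window=timedelta(days=1)):
    """Number of queries that fall within `window` before `now`."""
    return sum(1 for t in query_times if timedelta(0) <= now - t <= window)


# A1, A2, and A3 can run only after A has finished.
upstreams = {"A": set(), "A1": {"A"}, "A2": {"A"}, "A3": {"A"}}
print(downstream_dependency_count("A", upstreams))  # 3

# 24:00 on April 1, 2018 is 00:00 on April 2, 2018.
now = datetime(2018, 4, 3, 12, 0)
print(hours_since_logic_update(datetime(2018, 4, 2, 0, 0), now))  # 36.0
```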
Step 103: input the aging feature vector into the trained binary classification model, and calculate, by the binary classification model, the predicted value corresponding to the data processing procedure.
In one embodiment, the binary classification model may be a decision tree model, a naive Bayes classification model, a binomial logistic regression model, and so on. Different binary classification models have different training parameters, and this application does not limit the specific parameters of the binary classification model. In one embodiment, the range of the predicted value is determined by the output of the specific binary classification model; for example, for a binomial logistic regression model, the predicted value lies between 0 and 1.
Step 104: determine the aging state of the data processing procedure based on the predicted value.
In one embodiment, the size of the predicted value may indicate the degree of aging of the data processing procedure. For example, the aging feature vector of a data processing procedure is input into a binomial logistic regression model to obtain the predicted value of that data processing procedure, which lies between 0 and 1. The closer the predicted value is to 1, the more severe the aging indicated by the corresponding aging state, meaning that the data processing procedure is no longer called by the data warehouse. The closer the predicted value is to 0, the more the corresponding aging state indicates that the data processing procedure has not aged and is still called by the data warehouse.
In this embodiment, the predicted value is calculated from the aging feature vector of the data processing procedure, and the elements of the aging feature vector reflect whether the data processing procedure is being called or accessed. The predicted value calculated by the binary classification model can therefore predict the aging state of the data processing procedure comprehensively and accurately, so that aged data processing procedures are discovered proactively and in time. In addition, by actively detecting the aging state of data processing procedures and discovering aged data processing procedures in real time, this application ensures that data processing procedures without business value are destroyed in time and that data processing procedures that still have business value are not destroyed. This greatly improves the timeliness and effectiveness of data processing procedure detection, keeps data processing procedures that still have business value online in the data warehouse, and ensures the healthy operation of the data ecosystem.
Fig. 2 is a flowchart of a method for determining the aging state of a data processing procedure according to another exemplary embodiment of this application. On the basis of the embodiment shown in Fig. 1, this embodiment illustrates how to determine the aging state of a data processing procedure based on the predicted value and how to determine the aging feature vector of the data processing procedure. As shown in Fig. 2, the method includes the following steps:
Step 201: determine a data processing procedure that can be scheduled by the data warehouse system.
For the description of step 201, refer to the description of the embodiment shown in Fig. 1; it is not repeated here.
Step 202: determine the feature parameters of the data processing procedure within a preset period.
In one embodiment, the preset period may be measured in days or in hours. For example, with one day as a period, the feature parameters of the data processing procedure are collected for the day before the current time point. The feature parameters may be those described in the embodiment shown in Fig. 1: the number of downstream dependencies of the data processing procedure, the latest logic update time of the data processing procedure, the query heat of the data processing procedure, and so on. In one embodiment, the number of feature parameters is the same as the number of training parameters of the binary classification model.
Step 203: determine the aging feature vector of the data processing procedure based on the feature parameters.
In one embodiment, the order of the feature parameters in the aging feature vector may be determined from the parameter vector of the binary classification model. For example, if the elements of the parameter vector of the binary classification model are ordered as number of downstream dependencies, latest logic update time, query heat, then the feature parameters in the aging feature vector are ordered in the same way: number of downstream dependencies, latest logic update time, query heat. As another example, if the elements of the parameter vector of the binary classification model are ordered as latest logic update time, number of downstream dependencies, query heat, then the feature parameters in the aging feature vector are also ordered as latest logic update time, number of downstream dependencies, query heat.
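One way to keep the aging feature vector aligned with the model's parameter vector is to store the parameter order once and build every vector from it. The sketch below assumes feature parameters arrive as a dictionary; `FEATURE_ORDER` and `build_aging_feature_vector` are hypothetical names.

```python
# Order of the elements in the model's parameter vector (an assumed example order).
FEATURE_ORDER = (
    "downstream_dependency_count",
    "hours_since_logic_update",
    "query_heat",
)


def build_aging_feature_vector(feature_params, feature_order=FEATURE_ORDER):
    """Arrange feature parameters in the same order as the model's parameter vector."""
    missing = [name for name in feature_order if name not in feature_params]
    if missing:
        raise KeyError(f"missing feature parameters: {missing}")
    return [float(feature_params[name]) for name in feature_order]


params = {"query_heat": 5, "downstream_dependency_count": 3, "hours_since_logic_update": 36}
print(build_aging_feature_vector(params))  # [3.0, 36.0, 5.0]
```

Passing a different `feature_order` reproduces the second ordering described in the text without changing how the feature parameters are collected.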
Step 204: input the aging feature vector into the trained binary classification model, and calculate, by the binary classification model, the predicted value corresponding to the data processing procedure.
In one embodiment, when the binary classification model is a binomial logistic regression model, the predicted value may be calculated by the following formula:
$$h_\theta(X) = \frac{1}{1 + e^{-\theta^{T} X}}$$
where $h_\theta(X)$ denotes the predicted value, $X$ denotes the aging feature vector, $\theta$ denotes the parameter vector of the binomial logistic regression model, whose elements are obtained by training, and $T$ denotes vector transposition. The dimension of the aging feature vector is the same as the dimension of the parameter vector.
According to the properties of the sigmoid function, the closer $h_\theta(X)$ is to 1, the more severely aged the data processing procedure is and the greater the probability that it should be taken offline; the closer $h_\theta(X)$ is to 0, the less aged the data processing procedure is and the greater the probability that it should remain online.
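The predicted value of a binomial logistic regression model is the sigmoid of the inner product of the parameter vector and the aging feature vector. The following sketch uses only the standard library; `predict_aging` is a hypothetical name, and the two-branch form is an implementation choice that avoids overflow for large inputs.

```python
import math


def predict_aging(theta, x):
    """Return sigmoid(theta . x), a value between 0 and 1."""
    if len(theta) != len(x):
        raise ValueError("parameter vector and feature vector must have the same dimension")
    z = sum(t * xi for t, xi in zip(theta, x))
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)  # z < 0 here, so exp(z) cannot overflow
    return ez / (1.0 + ez)


# With an all-zero parameter vector the model is undecided.
print(predict_aging([0.0, 0.0, 0.0], [3.0, 36.0, 5.0]))  # 0.5
```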
Step 205: determine the magnitude relationship between the predicted value and a preset threshold.
In one embodiment, the preset threshold may be determined according to the binary classification model used. For example, when the binary classification model is a binomial logistic regression model, the predicted value lies between 0 and 1, and the preset threshold may be set to 0.5. It should be noted that this application does not limit the number or size of preset thresholds; they may be determined according to the specific binary classification model, and the aging state result can be adjusted by fine-tuning the size of the preset threshold.
Step 206: determine the aging state of the data processing procedure based on the magnitude relationship.
For example, when the predicted value is greater than 0.5, the data processing procedure may be regarded as a data processing procedure that has aged in the data warehouse. When the predicted value is less than 0.5, the data processing procedure may be regarded as a data processing procedure that has not yet aged in the data warehouse and still needs to be called or accessed.
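Steps 205 and 206 amount to a comparison against the preset threshold. In the sketch below, the function name and the state labels are hypothetical. The text does not say how a predicted value exactly equal to the threshold is handled; treating it as not aged is an assumption made here.

```python
def aging_state(predicted_value, threshold=0.5):
    """Map a predicted value to an aging state by comparing it with the threshold."""
    # Assumption: a value exactly equal to the threshold is treated as not aged.
    return "aged" if predicted_value > threshold else "not_aged"


print(aging_state(0.83))  # aged
print(aging_state(0.12))  # not_aged
```

Raising the threshold makes the check more conservative, so fewer data processing procedures are flagged as aged.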
On the basis of the embodiment shown in Fig. 1, in this embodiment the feature parameters of a data processing procedure within the preset period accurately reflect whether the data processing procedure is being called or accessed, so the aging feature vector determined from those feature parameters also accurately reflects that state. The predicted value obtained by feeding the aging feature vector into the binary classification model can therefore accurately predict the aging state of the data processing procedure, which greatly improves the accuracy of the predicted value. Because the aging state result can be adjusted by fine-tuning the size of the preset threshold, the aging state obtained from the magnitude relationship between the predicted value and the preset threshold can accurately reflect the true state of the data processing procedure, ensuring the accuracy of the prediction result.
On the basis of the embodiments shown in Fig. 1 and Fig. 2, after the aging state of the data processing procedure is determined in step 104 or step 206, the method may further include the following steps:
generating a notification message from the aging state of the data processing procedure;
pushing the notification message.
Automatically pushing the notification message enables closed-loop handling of data processing procedures and reduces the work of manually maintaining their life cycles.
Optionally, the method may further include the following step:
if the aging state indicates that the data processing procedure needs to be switched from the online state to the offline state, controlling the data processing procedure to switch from the online state to the offline state.
Automatically switching a data processing procedure from the online state to the offline state enables closed-loop handling of data processing procedures and reduces the workload of manually maintaining their life cycles.
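The notification and automatic offline steps can be sketched together. The sketch assumes the status table is a dictionary from procedure name to state and that a notification channel is any callable taking a message string; `handle_aging_result` and the message format are hypothetical.

```python
def handle_aging_result(name, state, status_table, notify):
    """Push a notification and take an aged online procedure offline."""
    notify(f"Data processing procedure {name!r}: aging state = {state}")
    if state == "aged" and status_table.get(name) == "online":
        status_table[name] = "offline"
    return status_table.get(name)


sent = []  # stand-in for a real notification channel
status_table = {"proc_a": "online", "proc_b": "online"}
handle_aging_result("proc_a", "aged", status_table, sent.append)
handle_aging_result("proc_b", "not_aged", status_table, sent.append)
print(status_table)  # {'proc_a': 'offline', 'proc_b': 'online'}
```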
Fig. 3 is a flowchart of training a binary classification model according to an exemplary embodiment of this application. This embodiment illustrates how to train the binary classification model. As shown in Fig. 3, the method includes the following steps:
Step 301: determine the data processing procedures that could be scheduled by the data warehouse system within a preset time period before a preset time point.
In one embodiment, the preset time point may be the current time point, or a time point separated from the current time point by a certain period; this application does not limit the specific time of the preset time point. In one embodiment, the preset time period may be measured in years or months, and may be set relatively long to ensure that the parameter vector obtained from the samples accurately reflects the aging state of data processing procedures. In one embodiment, the metadata of the data processing procedures may be consolidated. For example, statistics may be collected on data processing procedures that, in the most recent year before the current time point, either stayed online throughout (that is, their state did not change within the year) or changed from online to offline (that is, their state within the year changed from the online state to the offline state).
Step 302: determine the aging feature vector and the corresponding state change of each data processing procedure that can be scheduled by the data warehouse system.
In one embodiment, similarly to the statistics on the online state described above, the feature parameters of data processing procedures that were online in the most recent year before the preset time point (for example, the current time point) may be collected. The feature parameters may be, for example, the number of downstream dependencies of the data processing procedure, the latest logic update time of the data processing procedure, the query engine heat of the data processing procedure, and so on. For how the aging feature vector is obtained from the feature parameters, refer to the description of the embodiment shown in Fig. 2; it is not repeated here.
Step 303: generate a training sample set based on the aging feature vectors and corresponding state changes of the data processing procedures that can be scheduled by the data warehouse system.
For example, the generated training sample set is:
$$\{(X_1, y_1), (X_2, y_2), \ldots, (X_m, y_m)\}$$
where $y_i$ is a label for the difference between the state of the data processing procedure in the preset period containing the preset time point and its state in the previous preset period. A label of 0 means that the state in the current preset period is unchanged from the previous preset period; for example, the state on the previous day was online and the state on the current day remains online. A label of 1 means that the state in the current preset period has changed from the previous preset period; for example, the state in the previous preset period was online and an offline operation is required in the current preset period. $X_i = (x_{i1}, x_{i2}, \ldots, x_{id})$ denotes an instance in the d-dimensional feature sample space, and $x_{i1}, x_{i2}, \ldots, x_{id}$ denote the d feature parameters of the instance; these d feature parameters form the aging feature vector described in this application.
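The labeling rule for the training sample set can be sketched as follows, assuming each historical record carries an aging feature vector together with the states in the previous and current preset periods. The record layout and function names are hypothetical.

```python
def state_change_label(previous_state, current_state):
    """Return 0 if the state is unchanged between periods, 1 if it changed."""
    return 0 if previous_state == current_state else 1


def build_training_set(records):
    """Turn (feature_vector, previous_state, current_state) records into (X_i, y_i) pairs."""
    return [
        (list(features), state_change_label(previous, current))
        for features, previous, current in records
    ]


records = [
    ([3.0, 36.0, 5.0], "online", "online"),    # stayed online  -> label 0
    ([0.0, 900.0, 0.0], "online", "offline"),  # taken offline  -> label 1
]
print(build_training_set(records))
# [([3.0, 36.0, 5.0], 0), ([0.0, 900.0, 0.0], 1)]
```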
Step 304: train the binary classification model based on the training sample set.
The following takes a binomial logistic regression model as an example of the binary classification model:
Construct the loss function:
The loss function above indicates how large the loss calculated with the parameter θ is when a sample has y = 1. When the final parameter θ is obtained, the error made by the binomial logistic regression model when classifying the training data set is at its minimum.
Compute the θ that minimizes the loss function:
According to the gradient descent method, the update process for θ is:
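The expressions for the loss function, its minimizer, and the update rule are not reproduced in this text. Assuming the standard cross-entropy loss for binomial logistic regression (an assumption, since the original expressions are unavailable), they would take the following form, where $m$ is the number of training samples and the learning rate $\alpha$ is a symbol introduced here for illustration:

$$J(\theta) = -\frac{1}{m}\sum_{i=1}^{m}\left[y_i \log h_\theta(X_i) + (1 - y_i)\log\left(1 - h_\theta(X_i)\right)\right]$$

$$\theta^{*} = \arg\min_{\theta} J(\theta)$$

$$\theta_j := \theta_j - \alpha\,\frac{1}{m}\sum_{i=1}^{m}\left(h_\theta(X_i) - y_i\right)x_{ij}, \qquad j = 1, \ldots, d$$

For a single sample with $y_i = 1$, the loss reduces to $-\log h_\theta(X_i)$, which grows as the predicted value moves away from 1. This is consistent with the remark about the loss for a sample with y = 1.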
In this embodiment, the training sample set reflects, for each data processing procedure in the data warehouse that can be scheduled by the data warehouse system, its aging feature vector and the corresponding state change. The parameter vector of the trained binary classification model can therefore ensure accurate prediction of the aging state of data processing procedures.
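Training by gradient descent can be sketched in plain Python. This is a minimal illustration, not the application's implementation: it assumes batch gradient descent on the standard cross-entropy loss, features already scaled to a small range, and a constant 1.0 as the first feature to act as a bias term. The function names, learning rate, and epoch count are all hypothetical.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def dot(theta, x):
    return sum(t * xi for t, xi in zip(theta, x))


def train_logistic_regression(samples, learning_rate=0.1, epochs=2000):
    """Fit theta by batch gradient descent on (feature_vector, label) pairs."""
    m = len(samples)
    d = len(samples[0][0])
    theta = [0.0] * d
    for _ in range(epochs):
        grad = [0.0] * d
        for x, y in samples:
            error = sigmoid(dot(theta, x)) - y  # h_theta(X_i) - y_i
            for j in range(d):
                grad[j] += error * x[j]
        theta = [t - learning_rate * g / m for t, g in zip(theta, grad)]
    return theta


# Toy data: [bias, scaled staleness]; label 1 means the procedure went offline.
samples = [([1.0, 0.0], 0), ([1.0, 0.2], 0), ([1.0, 0.8], 1), ([1.0, 1.0], 1)]
theta = train_logistic_regression(samples)
print(sigmoid(dot(theta, [1.0, 1.0])) > 0.5)  # True
```

Raw features such as hours since the last update would usually need scaling before training, otherwise a fixed learning rate may converge slowly or diverge.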
On the basis of the embodiment shown in Fig. 3, the method may further include the following steps:
obtaining the training samples newly added within a preset period;
updating the training sample set based on the newly added training samples;
training the binary classification model based on the updated training sample set.
In one embodiment, the preset period is measured in months or days; this application does not limit the specific length of the preset period. For how the binary classification model is trained on the updated training sample set, refer to the description of the embodiment shown in Fig. 3; it is not repeated here.
Training the binary classification model on the updated training sample set ensures the accuracy of the predicted values it calculates.
Corresponding to the foregoing embodiments of the method for determining the aging state of a data processing procedure, this application also provides embodiments of an apparatus for determining the aging state of a data processing procedure.
Fig. 4 is a structural diagram of an apparatus for determining the aging state of a data processing procedure according to an exemplary embodiment of this application. Referring to Fig. 4, the apparatus includes:
a first determining module 41, configured to determine a data processing procedure that can be scheduled by the data warehouse system;
a second determining module 42, configured to determine the aging feature vector of the data processing procedure determined by the first determining module 41;
a computing module 43, configured to input the aging feature vector determined by the second determining module 42 into the trained binary classification model, and to calculate, by the binary classification model, the predicted value corresponding to the data processing procedure;
a third determining module 44, configured to determine the aging state of the data processing procedure based on the predicted value calculated by the computing module 43.
In this embodiment, the predicted value is calculated by the computing module 43 from the aging feature vector of the data processing procedure, and the elements of the aging feature vector reflect whether the data processing procedure is being called or accessed. The third determining module 44 can therefore predict the aging state of the data processing procedure comprehensively and accurately from the predicted value calculated by the binary classification model, so that aged data processing procedures are discovered proactively and in time. In addition, because the second determining module 42 actively detects the aging feature vector of the data processing procedure, the third determining module 44 can discover aged data processing procedures in real time. This ensures that data processing procedures without business value are destroyed in time and that data processing procedures that still have business value are not destroyed, greatly improves the timeliness and effectiveness of data processing procedure detection, keeps data processing procedures that still have business value online in the data warehouse, and ensures the healthy operation of the data ecosystem.
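The division of work among modules 41 to 44 can be pictured as a small pipeline. The sketch below is illustrative only: it models each module as a callable supplied by the caller, and the class name, attribute names, and stand-in lambdas are hypothetical.

```python
class AgingStateApparatus:
    """Chains the four modules of Fig. 4, each modeled as a plain callable."""

    def __init__(self, first_determining, second_determining, computing, third_determining):
        self.first_determining = first_determining    # module 41: list schedulable procedures
        self.second_determining = second_determining  # module 42: procedure -> feature vector
        self.computing = computing                    # module 43: feature vector -> predicted value
        self.third_determining = third_determining    # module 44: predicted value -> aging state

    def run(self):
        results = {}
        for procedure in self.first_determining():
            features = self.second_determining(procedure)
            predicted = self.computing(features)
            results[procedure] = self.third_determining(predicted)
        return results


# Stand-in modules with fixed outputs, used only to show the data flow.
apparatus = AgingStateApparatus(
    first_determining=lambda: ["proc_a", "proc_b"],
    second_determining=lambda p: [1.0] if p == "proc_a" else [0.0],
    computing=lambda x: 0.9 if x[0] > 0.5 else 0.1,
    third_determining=lambda v: "aged" if v > 0.5 else "not_aged",
)
print(apparatus.run())  # {'proc_a': 'aged', 'proc_b': 'not_aged'}
```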
Fig. 5 is a structural diagram of an apparatus for determining the aging state of a data processing procedure according to another exemplary embodiment of this application. As shown in Fig. 5, on the basis of the embodiment shown in Fig. 4, the third determining module 44 may include:
a first determination unit 441, configured to determine the magnitude relationship between the predicted value and a preset threshold;
a second determination unit 442, configured to determine the aging state of the data processing procedure based on the magnitude relationship determined by the first determination unit 441.
Because the aging state result can be adjusted by fine-tuning the size of the preset threshold, the aging state obtained from the magnitude relationship between the predicted value and the preset threshold can accurately reflect the true state of the data processing procedure, ensuring the accuracy of the prediction result.
In one embodiment, the second determining module 42 may include:
a third determination unit 421, configured to determine the feature parameters of the data processing procedure within a preset period;
a fourth determination unit 422, configured to determine the aging feature vector of the data processing procedure based on the feature parameters determined by the third determination unit 421.
Because the feature parameters of a data processing procedure within the preset period accurately reflect whether the data processing procedure is being called or accessed, the aging feature vector that the fourth determination unit 422 determines from those feature parameters also accurately reflects that state. The predicted value obtained by feeding the aging feature vector into the binary classification model can therefore accurately predict the aging state of the data processing procedure, which greatly improves the accuracy of the predicted value.
In one embodiment, the apparatus further includes:
a fourth determining module 45, configured to determine the data processing procedures that could be scheduled by the data warehouse system within a preset time period before a preset time point;
a fifth determining module 46, configured to determine the aging feature vector and the corresponding state change of each data processing procedure, determined by the fourth determining module 45, that can be scheduled by the data warehouse system;
a sample set generation module 47, configured to generate a training sample set based on the aging feature vectors and corresponding state changes, determined by the fifth determining module 46, of the data processing procedures that can be scheduled by the data warehouse system;
a training module 48, configured to train the binary classification model based on the training sample set obtained by the sample set generation module 47, where the computing module 43 calculates the predicted value based on the trained binary classification model.
Because the training sample set reflects, for each data processing procedure in the data warehouse that can be scheduled by the data warehouse system, its aging feature vector and the corresponding state change, the parameter vector of the binary classification model trained by the training module 48 can ensure accurate prediction of the aging state of data processing procedures.
In one embodiment, the binary classification model is a binomial logistic regression model, and the computing module 43 calculates the predicted value by the following formula:
$$h_\theta(X) = \frac{1}{1 + e^{-\theta^{T} X}}$$
where $h_\theta(X)$ denotes the predicted value, $X$ denotes the aging feature vector, $\theta$ denotes the parameter vector of the binomial logistic regression model, whose elements are obtained by training, and $T$ denotes vector transposition.
The embodiment of the device for determining the aging state of a data handling procedure can be applied in an electronic device. The device embodiment may be implemented in software, in hardware, or in a combination of software and hardware. Taking software implementation as an example, the device in a logical sense is formed when the processor of the electronic device in which it resides reads the corresponding computer program instructions from the nonvolatile memory into memory and runs them. At the hardware level, Fig. 6 shows a hardware structure diagram of the electronic device in which the device of this application for determining the aging state of a data handling procedure resides. In addition to the processor, memory, network interface, and nonvolatile memory shown in Fig. 6, the electronic device in which the device of the embodiment resides may include other hardware depending on its actual function, which is not described further here.
The nonvolatile memory may also be referred to as a computer-readable storage medium. It stores a computer program that is used to execute the method flow of any of the embodiments shown in Fig. 1 to Fig. 3.
For the implementation of the functions and roles of each unit in the above device, refer to the implementation of the corresponding steps in the above method; details are not repeated here. Because the device embodiment essentially corresponds to the method embodiment, refer to the description of the method embodiment for the relevant parts. The device embodiments described above are merely illustrative: the units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units, that is, they may be located in one place or distributed over multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this application. A person of ordinary skill in the art can understand and implement it without creative effort.
After considering the specification and practicing the invention disclosed herein, those skilled in the art will readily conceive of other embodiments of this application. This application is intended to cover any variations, uses, or adaptations of this application that follow its general principles and include common knowledge or conventional technical means in the art that are not disclosed in this application. The specification and embodiments are to be regarded as exemplary only, and the true scope and spirit of this application are indicated by the following claims.
The above are merely preferred embodiments of this application and are not intended to limit it. Any modification, equivalent replacement, or improvement made within the spirit and principles of this application shall fall within its scope of protection.
Claims (11)
1. A method for determining the aging state of a data handling procedure, characterized in that the method comprises:
determining a data handling procedure that can be scheduled by a data warehouse;
determining an aging feature vector of the data handling procedure;
inputting the aging feature vector into a trained binomial classification model, and calculating, by the binomial classification model, a predicted value corresponding to the data handling procedure;
determining the aging state of the data handling procedure based on the predicted value.
2. The method according to claim 1, characterized in that determining the aging state of the data handling procedure based on the predicted value comprises:
determining the magnitude relationship between the predicted value and a preset threshold;
determining the aging state of the data handling procedure based on the magnitude relationship.
3. The method according to claim 1, characterized in that the step of determining the aging feature vector of the data handling procedure comprises:
determining characteristic parameters of the data handling procedure within a predetermined period;
determining the aging feature vector of the data handling procedure based on the characteristic parameters.
4. The method according to claim 1, characterized in that the method further comprises:
determining the data handling procedures that could be scheduled by the data warehouse within a preset time period before a preset time point;
determining the aging feature vector and the corresponding state change of each data handling procedure that could be scheduled by the data warehouse;
generating a training sample set based on the aging feature vectors and the corresponding state changes of the data handling procedures that could be scheduled by the data warehouse;
training the binomial classification model based on the training sample set.
5. The method according to claim 1, characterized in that the binomial classification model is a binomial logistic regression model, and in the step of calculating, by the binomial classification model, the predicted value corresponding to the data handling procedure, the predicted value is calculated by the following formula:
$$h_\theta(X) = \frac{1}{1 + e^{-\theta^{T} X}}$$
where hθ(X) denotes the predicted value, X denotes the aging feature vector, θ denotes the parameter vector of the binomial logistic regression model, whose elements are obtained by training, and T denotes the vector transpose.
6. A device for determining the aging state of a data handling procedure, characterized in that the device comprises:
a first determining module, configured to determine a data handling procedure that can be scheduled by a data warehouse;
a second determining module, configured to determine an aging feature vector of the data handling procedure determined by the first determining module;
a computing module, configured to input the aging feature vector determined by the second determining module into a trained binomial classification model, and to calculate, by the binomial classification model, a predicted value corresponding to the data handling procedure;
a third determining module, configured to determine the aging state of the data handling procedure based on the predicted value calculated by the computing module.
7. The device according to claim 6, characterized in that the third determining module comprises:
a first determination unit, configured to determine the magnitude relationship between the predicted value and a preset threshold;
a second determination unit, configured to determine the aging state of the data handling procedure based on the magnitude relationship determined by the first determination unit.
8. The device according to claim 6, characterized in that the second determining module comprises:
a third determination unit, configured to determine characteristic parameters of the data handling procedure within a predetermined period;
a fourth determination unit, configured to determine the aging feature vector of the data handling procedure based on the characteristic parameters determined by the third determination unit.
9. The device according to claim 6, characterized in that the device further comprises:
a fourth determining module, configured to determine the data handling procedures that could be scheduled by the data warehouse within a preset time period before a preset time point;
a fifth determining module, configured to determine the aging feature vector and the corresponding state change of each data handling procedure determined by the fourth determining module;
a sample set generation module, configured to generate a training sample set based on the aging feature vectors and corresponding state changes determined by the fifth determining module;
a training module, configured to train the binomial classification model based on the training sample set obtained by the sample set generation module.
10. The device according to claim 6, characterized in that the binomial classification model is a binomial logistic regression model, and the computing module calculates the predicted value by the following formula:
$$h_\theta(X) = \frac{1}{1 + e^{-\theta^{T} X}}$$
where hθ(X) denotes the predicted value, X denotes the aging feature vector, θ denotes the parameter vector of the binomial logistic regression model, whose elements are obtained by training, and T denotes the vector transpose.
11. A computer-readable storage medium, characterized in that the storage medium stores a computer program, and the computer program is used to execute the method for determining the aging state of a data handling procedure according to any one of claims 1 to 5.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810638889.4A CN108897818B (en) | 2018-06-20 | 2018-06-20 | Method and device for determining aging state of data processing process and readable storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN108897818A true CN108897818A (en) | 2018-11-27 |
CN108897818B CN108897818B (en) | 2020-12-01 |
Family
ID=64345294
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810638889.4A Active CN108897818B (en) | 2018-06-20 | 2018-06-20 | Method and device for determining aging state of data processing process and readable storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108897818B (en) |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7058615B2 (en) * | 2003-04-24 | 2006-06-06 | International Business Machines Corporation | Scheduling for data warehouse ETL processing and data mining execution |
CN102117306A (en) * | 2010-01-04 | 2011-07-06 | 阿里巴巴集团控股有限公司 | Method and system for monitoring ETL (extract-transform-load) data processing process |
CN102999528A (en) * | 2011-09-16 | 2013-03-27 | 阿里巴巴集团控股有限公司 | Method and device for ETL (Extract Transform and Load) task off-lining and data cleaning in data warehouse |
CN106407233A (en) * | 2015-08-03 | 2017-02-15 | 阿里巴巴集团控股有限公司 | A data processing method and apparatus |
CN107484189A (en) * | 2017-07-27 | 2017-12-15 | 北京市天元网络技术股份有限公司 | LTE data handling systems |
US20170371939A1 (en) * | 2016-06-23 | 2017-12-28 | International Business Machines Corporation | Shipping of data through etl stages |
CN107977754A (en) * | 2017-12-18 | 2018-05-01 | 深圳前海微众银行股份有限公司 | Data predication method, system and computer-readable recording medium |
Non-Patent Citations (2)
Title |
---|
嵇晓 等: "工业数据仓库设计方法及其在质量分析中的应用", 《控制与决策》 * |
黎毅编著: "《新世纪研究生教学用书 会计实证研究方法》", 30 July 2015, 东北财经大学出版社 * |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110297820A (en) * | 2019-06-28 | 2019-10-01 | 京东数字科技控股有限公司 | A kind of data processing method, device, equipment and storage medium |
CN110297820B (en) * | 2019-06-28 | 2020-09-01 | 京东数字科技控股有限公司 | Data processing method, device, equipment and storage medium |
Also Published As
Publication number | Publication date |
---|---|
CN108897818B (en) | 2020-12-01 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||