CN108897818A - Method, apparatus, and readable storage medium for determining the aging state of a data processing procedure - Google Patents

Publication number
CN108897818A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201810638889.4A
Other languages
Chinese (zh)
Other versions
CN108897818B (en)
Inventor
喻灿
夏睿
刘强
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Sankuai Online Technology Co Ltd
Original Assignee
Beijing Sankuai Online Technology Co Ltd
Abstract

The application provides a method, an apparatus, and a readable storage medium for determining the aging state of a data processing procedure. The method includes: determining a data processing procedure that can be scheduled by a data warehouse; determining an aging feature vector of the data processing procedure; inputting the aging feature vector into a trained binary classification model and computing, with that model, a predicted value corresponding to the data processing procedure; and determining the aging state of the data processing procedure based on the predicted value. Because the predicted value computed by the binary classification model allows the aging state of a data processing procedure to be predicted comprehensively and accurately, aged data processing procedures can be found proactively and promptly. They can also be found in real time, which greatly improves the timeliness and effectiveness of data processing procedure detection.

Description

Method, apparatus, and readable storage medium for determining the aging state of a data processing procedure
Technical field
This application relates to the technical field of data processing, and in particular to a method, an apparatus, and a readable storage medium for determining the aging state of a data processing procedure.
Background
In a data warehouse, data processing procedures such as extract, transform, and load (ETL) keep accumulating as system complexity grows. To reduce system complexity, one prior-art approach registers a life cycle for each data processing procedure before it goes online; when the procedure reaches the end of that life cycle, the data corresponding to the procedure is archived and destroyed. However, a life-cycle-based approach cannot accurately determine the aging state of a data processing procedure. If the life cycle expires while the procedure still has business value, destroying the procedure loses that value. Conversely, some data processing procedures no longer have business value but, because of how their life cycle was configured, are not destroyed in time.
Summary of the invention
In view of this, the application provides a detection method, an apparatus, and a readable storage medium for data processing procedures. By actively detecting the aging state of a data processing procedure, it ensures that procedures without business value are destroyed in time and that procedures that still have business value are not destroyed.
To achieve the above object, the application provides the following technical solutions:
According to a first aspect of the application, a method for determining the aging state of a data processing procedure is proposed, including:
determining a data processing procedure that can be scheduled by a data warehouse;
determining an aging feature vector of the data processing procedure;
inputting the aging feature vector into a trained binary classification model, and computing, by the binary classification model, a predicted value corresponding to the data processing procedure;
determining the aging state of the data processing procedure based on the predicted value.
According to a second aspect of the application, an apparatus for determining the aging state of a data processing procedure is proposed, including:
a first determining module, configured to determine a data processing procedure that can be scheduled by a data warehouse;
a second determining module, configured to determine an aging feature vector of the data processing procedure determined by the first determining module;
a computing module, configured to input the aging feature vector determined by the second determining module into a trained binary classification model, and to compute, by the binary classification model, a predicted value corresponding to the data processing procedure;
a third determining module, configured to determine the aging state of the data processing procedure based on the predicted value computed by the computing module.
According to a third aspect of the application, a computer-readable storage medium is proposed. The storage medium stores a computer program, and the computer program is used to execute the method for determining the aging state of a data processing procedure proposed in the first aspect.
As the above technical solutions show, the predicted value computed by the binary classification model allows the aging state of a data processing procedure to be predicted comprehensively and accurately, so that aged data processing procedures are found proactively and promptly. Aged procedures can also be found in real time, which greatly improves the timeliness and effectiveness of data processing procedure detection.
Brief description of the drawings
Fig. 1 is a flowchart of a method for determining the aging state of a data processing procedure according to an exemplary embodiment of the application.
Fig. 2 is a flowchart of a method for determining the aging state of a data processing procedure according to another exemplary embodiment of the application.
Fig. 3 is a flowchart of training the binary classification model according to an exemplary embodiment of the application.
Fig. 4 is a structural diagram of an apparatus for determining the aging state of a data processing procedure according to an exemplary embodiment of the application.
Fig. 5 is a structural diagram of an apparatus for determining the aging state of a data processing procedure according to another exemplary embodiment of the application.
Fig. 6 is a structural diagram of an electronic device according to an exemplary embodiment of the application.
Detailed description
Exemplary embodiments are described in detail here, and examples of them are illustrated in the accompanying drawings. Where the following description refers to the drawings, the same numbers in different drawings denote the same or similar elements unless otherwise indicated. The implementations described in the following exemplary embodiments do not represent all implementations consistent with the application. Rather, they are merely examples of apparatuses and methods that are consistent with some aspects of the application as detailed in the appended claims.
The terms used in this application are for the purpose of describing particular embodiments only and are not intended to limit the application. The singular forms "a", "said", and "the" used in this application and the appended claims are also intended to include the plural forms, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein refers to and includes any and all possible combinations of one or more of the associated listed items.
It should be understood that although the terms first, second, third, and so on may be used in this application to describe various pieces of information, the information should not be limited by these terms. The terms are used only to distinguish information of the same type from one another. For example, without departing from the scope of the application, first information may also be called second information, and similarly second information may also be called first information. Depending on the context, the word "if" as used herein may be interpreted as "when", "at the time of", or "in response to determining".
Fig. 1 is a flowchart of a method for determining the aging state of a data processing procedure according to an exemplary embodiment of the application. As shown in Fig. 1, the method includes the following steps:
Step 101: determine the data processing procedures that can be scheduled by the data warehouse.
In one embodiment, a data processing procedure may be a procedure in the data warehouse such as data extraction, data transformation, data loading, or data updating. In one embodiment, the state of each data processing procedure in the data warehouse may be recorded in a data table, and the procedures that can be scheduled by the data warehouse are determined from that table. The state of a data processing procedure may be online or offline: the online state indicates that the procedure can be scheduled and executed by the data warehouse's scheduling system, and the offline state indicates that the scheduling system will not schedule or execute it. For example, suppose the data warehouse contains 100 data processing procedures. Eighty are online, meaning they can be called or accessed by other data processing procedures; twenty are offline, meaning they are not called or accessed by other data processing procedures. Step 101 finds, among these 100 procedures, the 80 that can be called by the data warehouse.
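The selection in step 101 can be sketched as a filter over a state table. This is only an illustration under assumptions: the `ProcedureRecord` type, its field names, and the `"online"`/`"offline"` strings are hypothetical and do not come from the application.

```python
from dataclasses import dataclass


@dataclass
class ProcedureRecord:
    """One row of a hypothetical state table; field names are illustrative."""
    name: str
    state: str  # "online" or "offline"


def schedulable_procedures(state_table):
    """Return the records whose state is online, i.e. those the scheduler can still run."""
    return [rec for rec in state_table if rec.state == "online"]


# 100 procedures: 80 online and 20 offline, as in the example above.
state_table = (
    [ProcedureRecord(f"etl_{i}", "online") for i in range(80)]
    + [ProcedureRecord(f"etl_{i}", "offline") for i in range(80, 100)]
)
online = schedulable_procedures(state_table)
print(len(online))  # 80
```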
Step 102: determine the aging feature vector of the data processing procedure.
In one embodiment, the aging feature vector may include multiple elements, each of which is a parameter characterizing the data processing procedure, for example the procedure's number of downstream dependencies, the time since its logic was last updated, its query popularity, and so on. The number of downstream dependencies is the number of downstream data processing procedures that can execute only after this procedure has executed. For example, if data processing procedures A1, A2, and A3 can execute only after data processing procedure A has finished, then A has 3 downstream dependencies. The time since the last logic update is the length of time between the most recent modification of the procedure's logic and the current time, measured in days, hours, or similar units. For example, if the procedure was modified at 24:00 on April 1, 2018 and the current time is 12:00 on April 3, 2018, the time since the last logic update is 36 hours. Query popularity is the number of times the data model corresponding to this procedure was queried (or queried through self-service) within a preset duration before the current time (for example, the most recent day or the most recent N days, where N is a natural number).
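The three feature parameters described above can be computed along the following lines. This is a minimal sketch: the dependency map, the timestamps, and all function names are assumptions made for illustration, not structures defined by the application.

```python
from datetime import datetime


def downstream_dependency_count(dependencies, name):
    """Count procedures that list `name` as an upstream they must wait for."""
    return sum(1 for upstreams in dependencies.values() if name in upstreams)


def hours_since_last_update(last_modified, now):
    """Time elapsed since the logic was last modified, in hours."""
    return (now - last_modified).total_seconds() / 3600.0


def query_popularity(query_times, now, window_hours):
    """Number of queries that fall within the last `window_hours` hours."""
    window_seconds = window_hours * 3600
    return sum(1 for t in query_times if 0 <= (now - t).total_seconds() <= window_seconds)


# A1, A2 and A3 each wait for A, so A has 3 downstream dependencies.
dependencies = {"A1": {"A"}, "A2": {"A"}, "A3": {"A"}, "A": set()}
now = datetime(2018, 4, 3, 12, 0)
last_modified = datetime(2018, 4, 2, 0, 0)  # 24:00 on April 1 is 00:00 on April 2
query_times = [datetime(2018, 4, 3, 9, 0), datetime(2018, 4, 1, 9, 0)]

features = (
    downstream_dependency_count(dependencies, "A"),
    hours_since_last_update(last_modified, now),
    query_popularity(query_times, now, window_hours=24),
)
print(features)  # (3, 36.0, 1)
```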
Step 103: input the aging feature vector into the trained binary classification model, and compute, by the binary classification model, the predicted value corresponding to the data processing procedure.
In one embodiment, the binary classification model may be a decision tree model, a naive Bayes classification model, a binomial logistic regression model, and so on. Different binary classification models have different training parameters, and the application places no restriction on the specific parameters of the binary classification model. In one embodiment, the range of the predicted value is determined by the output of the specific binary classification model; for a binomial logistic regression model, for example, the predicted value lies between 0 and 1.
Step 104: determine the aging state of the data processing procedure based on the predicted value.
In one embodiment, the magnitude of the predicted value indicates the degree of aging of the data processing procedure. For example, when the aging feature vector of a data processing procedure is input into a binomial logistic regression model, the resulting predicted value lies between 0 and 1. The closer the predicted value is to 1, the more severe the aging: the procedure is no longer called by the data warehouse. The closer the predicted value is to 0, the less aged the procedure: it is still called by the data warehouse.
In this embodiment, the predicted value is computed from the aging feature vector of the data processing procedure, and the elements of that vector reflect whether the procedure is being called or accessed. The predicted value computed by the binary classification model therefore allows the aging state of the procedure to be predicted comprehensively and accurately, so that aged procedures are found proactively and promptly. In addition, by actively detecting the aging state of data processing procedures, the application finds aged procedures in real time, ensures that procedures without business value are destroyed in time, and avoids destroying procedures that still have business value. This greatly improves the timeliness and effectiveness of data processing procedure detection, keeps procedures that still have business value online in the data warehouse, and keeps the data ecosystem running healthily.
Fig. 2 is a flowchart of a method for determining the aging state of a data processing procedure according to another exemplary embodiment of the application. Building on the embodiment shown in Fig. 1, this embodiment illustrates how to determine the aging state of a data processing procedure based on the predicted value and how to determine the aging feature vector of the procedure. As shown in Fig. 2, the method includes the following steps:
Step 201: determine the data processing procedures that can be scheduled by the data warehouse.
For step 201, refer to the description of the embodiment shown in Fig. 1; details are not repeated here.
Step 202: determine the feature parameters of the data processing procedure within a preset period.
In one embodiment, the preset period may be measured in days or in hours. For example, with a period of one day, the feature parameters of the data processing procedure during the day before the current time are detected. The feature parameters may be those described in the embodiment shown in Fig. 1: the procedure's number of downstream dependencies, its last logic update time, its query popularity, and so on. In one embodiment, the number of feature parameters equals the number of training parameters of the binary classification model.
Step 203: determine the aging feature vector of the data processing procedure based on the feature parameters.
In one embodiment, the order of the feature parameters in the aging feature vector may be determined by the parameter vector of the binary classification model. For example, if the elements of the model's parameter vector are ordered as number of downstream dependencies, last logic update time, query popularity, then the feature parameters in the aging feature vector are ordered the same way: number of downstream dependencies, last logic update time, query popularity. As another example, if the elements of the model's parameter vector are ordered as last logic update time, number of downstream dependencies, query popularity, then the feature parameters in the aging feature vector are also ordered as last logic update time, number of downstream dependencies, query popularity.
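The ordering rule in step 203 can be sketched as follows. The feature names and the `build_feature_vector` helper are hypothetical; the point is only that the vector is assembled in whatever order the model's parameter vector uses.

```python
def build_feature_vector(feature_params, feature_order):
    """Arrange named feature parameters in the order the model's parameter vector expects."""
    missing = [name for name in feature_order if name not in feature_params]
    if missing:
        raise KeyError(f"missing feature parameters: {missing}")
    return [float(feature_params[name]) for name in feature_order]


params = {"query_popularity": 1, "downstream_dependencies": 3, "hours_since_update": 36}
order_a = ["downstream_dependencies", "hours_since_update", "query_popularity"]
order_b = ["hours_since_update", "downstream_dependencies", "query_popularity"]

print(build_feature_vector(params, order_a))  # [3.0, 36.0, 1.0]
print(build_feature_vector(params, order_b))  # [36.0, 3.0, 1.0]
```

Keeping the order in one shared list, rather than relying on dictionary order, makes it harder for training and prediction to drift apart.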
Step 204: input the aging feature vector into the trained binary classification model, and compute, by the binary classification model, the predicted value corresponding to the data processing procedure.
In one embodiment, when the binary classification model is a binomial logistic regression model, the predicted value may be computed with the following formula:
$$h_\theta(X) = \frac{1}{1 + e^{-\theta^{T} X}}$$
where $h_\theta(X)$ denotes the predicted value, $X$ denotes the aging feature vector, $\theta$ denotes the parameter vector of the binomial logistic regression model, whose elements are obtained by training, and $T$ denotes vector transposition. The dimension of the aging feature vector equals the dimension of the parameter vector.
From the properties of the sigmoid function, the closer $h_\theta(X)$ is to 1, the more aged the data processing procedure is and the more likely it is to be taken offline; the closer $h_\theta(X)$ is to 0, the less aged the procedure is and the more likely it is to be kept online.
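A minimal sketch of the prediction step, assuming the standard logistic (sigmoid) form of a binomial logistic regression model. The parameter values below are illustrative placeholders, not trained parameters from the application.

```python
import math


def predict(theta, x):
    """h_theta(X) = 1 / (1 + exp(-theta^T X)) for equal-length vectors."""
    if len(theta) != len(x):
        raise ValueError("theta and x must have the same dimension")
    z = sum(t * xi for t, xi in zip(theta, x))
    # Branch on the sign of z so that exp() never overflows for large |z|.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


theta = [-0.5, 0.02, -1.0]  # illustrative values only (assumption)
x = [3.0, 36.0, 1.0]        # downstream dependencies, hours since update, query popularity
p = predict(theta, x)
print(round(p, 3))
```

Here the weighted sum is negative, so the predicted value falls below 0.5, toward the "not aged" end of the range.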
Step 205: determine how the predicted value compares with a preset threshold.
In one embodiment, the preset threshold may be determined according to the binary classification model in use. For example, when the binary classification model is a binomial logistic regression model, the predicted value lies between 0 and 1 and the preset threshold may be set to 0.5. Note that the application places no restriction on the number or the value of preset thresholds; both may be determined according to the specific binary classification model, and the resulting aging state can be adjusted by fine-tuning the threshold.
Step 206: determine the aging state of the data processing procedure based on the comparison.
For example, when the predicted value is greater than 0.5, the data processing procedure may be regarded as already aged in the data warehouse; when the predicted value is less than 0.5, it may be regarded as not yet aged, and it still needs to be called or accessed.
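Steps 205 and 206 reduce to a comparison against the threshold, sketched below. The state labels are hypothetical, and treating a predicted value exactly equal to the threshold as "not aged" is an assumption, since the text only covers the greater-than and less-than cases.

```python
def aging_state(predicted_value, threshold=0.5):
    """Map a predicted value to a state label by comparing it with the threshold."""
    if not 0.0 <= predicted_value <= 1.0:
        raise ValueError("predicted value must lie in [0, 1]")
    # Equality is treated as "not aged" here; the source does not specify this case.
    return "aged" if predicted_value > threshold else "not aged"


print(aging_state(0.83))                 # aged
print(aging_state(0.12))                 # not aged
print(aging_state(0.55, threshold=0.6))  # not aged
```

Raising the threshold makes the check more conservative about declaring a procedure aged, which is one way to read the fine-tuning mentioned above.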
In this embodiment, building on the embodiment shown in Fig. 1, the feature parameters of the data processing procedure within the preset period accurately reflect whether the procedure is being called or accessed, so the aging feature vector determined from those parameters reflects it accurately as well. The predicted value obtained by feeding the aging feature vector into the binary classification model can therefore accurately predict the aging state of the data processing procedure, which greatly improves the accuracy of the predicted value. Because the resulting aging state can be adjusted by fine-tuning the preset threshold, the aging state obtained by comparing the predicted value with the threshold can accurately reflect the true state of the data processing procedure, which ensures the accuracy of the prediction result.
On the basis of the embodiments shown in Fig. 1 and Fig. 2, after the aging state of the data processing procedure is determined in step 104 or step 206, the method may further include the following steps:
generating a notification message from the aging state of the data processing procedure;
pushing the notification message.
Pushing notification messages automatically enables closed-loop operation on data processing procedures and reduces the work of manually maintaining their life cycles.
Optionally, the method may further include the following step:
If the aging state indicates that the data processing procedure needs to be changed from the online state to the offline state, control the data processing procedure to change from the online state to the offline state.
Automatically changing a data processing procedure from the online state to the offline state enables closed-loop operation on data processing procedures and reduces the workload of manually maintaining their life cycles.
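The two follow-up actions, pushing a notification and taking an aged procedure offline, can be sketched together. The registry dictionary, the `notify` callback, and the message format are all assumptions; a real system would call its own messaging and scheduling interfaces.

```python
def handle_aging_result(name, state, registry, notify):
    """Push a notification and, if the procedure is aged, mark it offline."""
    message = f"data processing procedure {name}: {state}"
    notify(message)
    if state == "aged" and registry.get(name) == "online":
        registry[name] = "offline"
    return message


sent = []  # stands in for a message channel
registry = {"etl_orders": "online", "etl_users": "online"}

handle_aging_result("etl_orders", "aged", registry, sent.append)
handle_aging_result("etl_users", "not aged", registry, sent.append)
print(registry)
```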
Fig. 3 is a flowchart of training the binary classification model according to an exemplary embodiment of the application. This embodiment illustrates how to train the binary classification model. As shown in Fig. 3, the method includes the following steps:
Step 301: determine the data processing procedures that could be scheduled by the data warehouse within a preset time period before a preset time point.
In one embodiment, the preset time point may be the current time or a time point a certain interval away from the current time; the application places no restriction on the specific preset time point. In one embodiment, the preset time period may be measured in years or months, and it may be set fairly long so that the parameter vector obtained from the samples accurately reflects the aging state of data processing procedures. In one embodiment, the metadata of the data processing procedures may be consolidated. For example, statistics may be gathered on procedures that remained online throughout the most recent year before the current time (their state did not change within the year) and on procedures that were once online and then changed to offline (their state changed from online to offline within the year).
Step 302: determine the aging feature vectors and the corresponding state changes of the data processing procedures that can be scheduled by the data warehouse.
In one embodiment, similar to the statistics on the online state described above, statistics may be gathered on the feature parameters of the data processing procedures that were online during the most recent year before the preset time point (for example, the current time). The feature parameters may be, for example, the procedure's number of downstream dependencies, its last logic update time, its query-engine popularity, and so on. For how the aging feature vector is obtained from the feature parameters, refer to the description of the embodiment shown in Fig. 2; details are not repeated here.
Step 303: generate a training sample set based on the aging feature vectors and the corresponding state changes of the data processing procedures that can be scheduled by the data warehouse.
For example, the generated training sample set is:
$$\{(X_1, y_1), (X_2, y_2), \ldots, (X_m, y_m)\}$$
where $y_i$ is a label for the difference between the state of the data processing procedure in the preset period containing the preset time point and its state in the previous preset period. A value of 0 means the state is unchanged between the previous preset period and the current one; for example, the procedure was online the previous day and remains online today. A value of 1 means the state changed between the previous preset period and the current one; for example, the procedure was online in the previous preset period and must be taken offline in the current one. $X_i = (x_{i1}, x_{i2}, \ldots, x_{id})$ denotes an instance in the $d$-dimensional feature sample space; $x_{i1}, x_{i2}, \ldots, x_{id}$ denote the $d$ feature parameters of that instance, and these $d$ feature parameters form the aging feature vector described in this application.
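Building the labeled sample set from state-change records can be sketched as below. The record layout, the field names, and the feature names are hypothetical; only the labeling rule (0 for "stayed online", 1 for "went from online to offline") follows the text.

```python
def build_training_set(history, feature_order):
    """Create (X_i, y_i) pairs; y_i is 1 when an online procedure went offline, else 0."""
    samples = []
    for record in history:
        if record["previous_state"] != "online":
            continue  # only procedures that were online are counted
        x = [float(record["features"][name]) for name in feature_order]
        y = 1 if record["current_state"] == "offline" else 0
        samples.append((x, y))
    return samples


feature_order = ["downstream_dependencies", "hours_since_update", "query_popularity"]
history = [
    {"name": "etl_a", "previous_state": "online", "current_state": "online",
     "features": {"downstream_dependencies": 3, "hours_since_update": 36, "query_popularity": 12}},
    {"name": "etl_b", "previous_state": "online", "current_state": "offline",
     "features": {"downstream_dependencies": 0, "hours_since_update": 9000, "query_popularity": 0}},
    {"name": "etl_c", "previous_state": "offline", "current_state": "offline",
     "features": {"downstream_dependencies": 0, "hours_since_update": 500, "query_popularity": 0}},
]

training_set = build_training_set(history, feature_order)
print(training_set)
```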
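Step 304: train the binary classification model based on the training sample set.
Taking a binomial logistic regression model as the binary classification model, training proceeds as follows:
Construct the loss function (shown here in the standard form for binomial logistic regression):
$$J(\theta) = -\frac{1}{m}\sum_{i=1}^{m}\left[\, y_i \log h_\theta(X_i) + (1 - y_i)\log\bigl(1 - h_\theta(X_i)\bigr) \right]$$
The loss function expresses how large the loss computed with parameter $\theta$ is when a sample has $y = 1$. With the final parameter $\theta$, the error of the binomial logistic regression model in classifying the training data set is minimal.
Compute the $\theta$ that minimizes the loss function:
$$\theta^{*} = \arg\min_{\theta} J(\theta)$$
According to the gradient descent method, $\theta$ is updated as follows:
$$\theta_j := \theta_j - \alpha\,\frac{1}{m}\sum_{i=1}^{m}\bigl(h_\theta(X_i) - y_i\bigr)\,x_{ij}, \qquad j = 1, \ldots, d$$
where $\alpha$ denotes the learning rate (step size).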
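The training step can be illustrated with plain batch gradient descent on the average cross-entropy loss of a logistic regression model. This is a sketch under assumptions, not the application's implementation: the learning rate, the iteration count, the constant bias feature, and the toy data are all chosen for illustration.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def predict(theta, x):
    """Predicted value for one feature vector."""
    return sigmoid(sum(t * xi for t, xi in zip(theta, x)))


def log_loss(theta, samples, eps=1e-12):
    """Average cross-entropy loss over (x, y) samples with y in {0, 1}."""
    total = 0.0
    for x, y in samples:
        h = min(max(predict(theta, x), eps), 1.0 - eps)  # keep log() finite
        total -= y * math.log(h) + (1 - y) * math.log(1.0 - h)
    return total / len(samples)


def train(samples, learning_rate=0.5, iterations=2000):
    """Batch gradient descent: theta_j -= learning_rate * mean((h - y) * x_j)."""
    dim = len(samples[0][0])
    theta = [0.0] * dim
    for _ in range(iterations):
        grad = [0.0] * dim
        for x, y in samples:
            error = predict(theta, x) - y
            for j in range(dim):
                grad[j] += error * x[j]
        theta = [t - learning_rate * g / len(samples) for t, g in zip(theta, grad)]
    return theta


# Toy data: [constant bias term, scaled query popularity]; label 1 means "went offline".
samples = [([1.0, 0.0], 1), ([1.0, 0.1], 1), ([1.0, 0.9], 0), ([1.0, 1.0], 0)]
theta = train(samples)
print(log_loss([0.0, 0.0], samples), log_loss(theta, samples))
```

Scaling features to a similar range, as the toy data does, keeps a fixed learning rate usable; raw values such as hours since the last update would likely need normalization first.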
In this embodiment, the training sample set reflects, for each data processing procedure in the data warehouse that can be scheduled, its aging feature vector and the corresponding state change, so the parameter vector of the trained binary classification model supports accurate prediction of the aging state of data processing procedures.
On the basis of the embodiment shown in Fig. 3, the method may further include the following steps:
obtaining the training samples newly added within a preset period;
updating the training sample set with the newly added training samples;
training the binary classification model based on the updated training sample set.
In one embodiment, the preset period is measured in months or days; the application places no restriction on its specific length. For how to train the binary classification model on the updated training sample set, refer to the description of the embodiment shown in Fig. 3; details are not repeated here.
Training the binary classification model on the updated training sample set helps maintain the accuracy of the predicted values the model computes.
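The periodic update can be sketched as "merge the new samples, then retrain". The trainer below is a deliberately trivial stand-in (an assumption for illustration) so the example stays self-contained; in practice it would be whatever routine trains the binary classification model.

```python
def retrain_with_new_samples(training_set, new_samples, train):
    """Return the updated training set and the model retrained on it."""
    updated_set = list(training_set) + list(new_samples)
    return updated_set, train(updated_set)


def positive_rate_trainer(samples):
    """Stand-in trainer for illustration only: returns the share of samples labeled 1."""
    return sum(y for _, y in samples) / len(samples)


training_set = [([3.0, 36.0, 1.0], 0), ([0.0, 900.0, 0.0], 1)]
new_samples = [([1.0, 400.0, 0.0], 1), ([5.0, 12.0, 9.0], 0)]

training_set, model = retrain_with_new_samples(training_set, new_samples, positive_rate_trainer)
print(len(training_set), model)  # 4 0.5
```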
Corresponding to the foregoing embodiments of the method for determining the aging state of a data processing procedure, the application also provides embodiments of an apparatus for determining the aging state of a data processing procedure.
Fig. 4 is a structural diagram of an apparatus for determining the aging state of a data processing procedure according to an exemplary embodiment of the application. Referring to Fig. 4, the apparatus includes:
a first determining module 41, configured to determine a data processing procedure that can be scheduled by the data warehouse;
a second determining module 42, configured to determine the aging feature vector of the data processing procedure determined by the first determining module 41;
a computing module 43, configured to input the aging feature vector determined by the second determining module 42 into the trained binary classification model, and to compute, by the binary classification model, the predicted value corresponding to the data processing procedure;
a third determining module 44, configured to determine the aging state of the data processing procedure based on the predicted value computed by the computing module 43.
In this embodiment, the computing module 43 computes the predicted value from the aging feature vector of the data processing procedure, and the elements of that vector reflect whether the procedure is being called or accessed. The third determining module 44 can therefore predict the aging state of the data processing procedure comprehensively and accurately from the predicted value computed by the binary classification model, so that aged procedures are found proactively and promptly. In addition, because the second determining module 42 actively determines the aging feature vector of the data processing procedure, the third determining module 44 can find aged procedures in real time. This ensures that procedures without business value are destroyed in time and avoids destroying procedures that still have business value, which greatly improves the timeliness and effectiveness of data processing procedure detection, keeps procedures that still have business value online in the data warehouse, and keeps the data ecosystem running healthily.
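The way the four modules hand results to one another can be sketched as a small pipeline of injected callables. The class name, the lambda stand-ins, and the sample data are hypothetical and only show the data flow between modules 41 to 44.

```python
class AgingStateDetector:
    """Chains four callables that stand in for the four modules of the apparatus."""

    def __init__(self, select, featurize, predict, decide):
        self.select = select        # first determining module: schedulable procedures
        self.featurize = featurize  # second determining module: aging feature vector
        self.predict = predict      # computing module: predicted value
        self.decide = decide        # third determining module: aging state

    def run(self, state_table):
        """Return a mapping from procedure name to its aging state."""
        results = {}
        for name in self.select(state_table):
            vector = self.featurize(name)
            results[name] = self.decide(self.predict(vector))
        return results


# Illustrative stand-ins; each would be replaced by a real implementation.
state_table = {"etl_a": "online", "etl_b": "online", "etl_c": "offline"}
feature_store = {"etl_a": [3.0, 36.0, 12.0], "etl_b": [0.0, 9000.0, 0.0]}

detector = AgingStateDetector(
    select=lambda table: [n for n, s in table.items() if s == "online"],
    featurize=lambda name: feature_store[name],
    predict=lambda vec: 0.9 if vec[2] == 0.0 else 0.1,  # placeholder for a trained model
    decide=lambda p: "aged" if p > 0.5 else "not aged",
)
result = detector.run(state_table)
print(result)
```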
Fig. 5 is a kind of device of determining data handling procedure ageing state shown in the application another exemplary embodiment Structure chart, as shown in figure 5, on the basis of above-mentioned embodiment illustrated in fig. 4, third determining module 44 may include:
First determination unit 441, for determining the size relation of predicted value and preset threshold;
Second determination unit 442, the size relation for being determined based on the first determination unit 441 determine data handling procedure Ageing state.
Due to it is can adjusting ageing state by finely tuning the size of preset threshold as a result, can make based on predicted value with The ageing state that the size relation of preset threshold obtains can accurately embody the time of day of data handling procedure, it is ensured that prediction As a result accuracy.
In one embodiment, the second determining module 42 may include:
Third determination unit 421, for determining characteristic parameter of the data handling procedure in predetermined period;
4th determination unit 422, the characteristic parameter for being determined based on third determination unit 421 determine data handling procedure Aging character vector.
It is adjusted since characteristic parameter of the data handling procedure in predetermined period can accurately embody data handling procedure With or accessed state, therefore the aging character vector that the 4th determination unit 422 is determined based on characteristic parameter can also be quasi- The called or accessed state of data handling procedure is really embodied, and then by aging character vector as binomial disaggregated model The accuracy of predicted value can be substantially increased with the ageing state of Accurate Prediction data handling procedure by inputting obtained predicted value.
In one embodiment, the device further includes:
A fourth determining module 45, configured to determine the data handling procedures that can be scheduled by the data warehouse system within a preset time period before a preset time point;
A fifth determining module 46, configured to determine the aging feature vector and the corresponding state change of each data handling procedure, determined by the fourth determining module 45, that can be scheduled by the data warehouse system;
A sample set generation module 47, configured to generate a training sample set based on the aging feature vectors and corresponding state changes, determined by the fifth determining module 46, of the data handling procedures that can be scheduled by the data warehouse system;
A training module 48, configured to train the binomial classification model based on the training sample set obtained by the sample set generation module 47; the computing module 43 calculates the predicted value based on the trained binomial classification model.
Because the training sample set reflects the aging feature vector and corresponding state change of each data handling procedure in the data warehouse system that can be scheduled by the data warehouse system, the parameter vector of the binomial classification model obtained through training by the training module 48 can ensure accurate prediction of the aging state of a data handling procedure.
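The training flow described above can be sketched in Python under several assumptions: the binomial classification model takes a logistic-regression form, each training sample pairs an aging feature vector with a 0/1 label derived from the observed state change (1 meaning the procedure became aged), and the parameters are fitted by plain batch gradient descent. The function names, the learning rate, and the epoch count are illustrative choices, not the application's own training procedure.

```python
import math
from typing import List, Sequence, Tuple

# One training sample: (aging feature vector, 0/1 label from the state change).
Sample = Tuple[Sequence[float], int]


def _sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def predict(theta: Sequence[float], x: Sequence[float]) -> float:
    """Predicted value for feature vector x under parameter vector theta."""
    return _sigmoid(sum(t * xi for t, xi in zip(theta, x)))


def train_logistic_regression(
    samples: List[Sample], lr: float = 0.1, epochs: int = 2000
) -> List[float]:
    """Fit theta by batch gradient descent on the mean log-loss."""
    n_features = len(samples[0][0])
    theta = [0.0] * n_features
    for _ in range(epochs):
        grad = [0.0] * n_features
        for x, y in samples:
            err = predict(theta, x) - y
            for j, xj in enumerate(x):
                grad[j] += err * xj
        theta = [t - lr * g / len(samples) for t, g in zip(theta, grad)]
    return theta


# Toy sample set: [bias term, scaled idle time]; label 1 = became aged.
toy_samples: List[Sample] = [
    ([1.0, 0.0], 0),
    ([1.0, 0.1], 0),
    ([1.0, 0.9], 1),
    ([1.0, 1.0], 1),
]
```

Calling `train_logistic_regression(toy_samples)` returns a parameter vector that `predict` can then apply to the feature vector of any newly observed procedure.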
In one embodiment, the binomial classification model is a binomial logistic regression model, and the computing module 43 calculates the predicted value by the following formula:
$$h_\theta(X) = \frac{1}{1 + e^{-\theta^{T} X}}$$
Wherein, hθ(X) denotes the predicted value, X denotes the aging feature vector, θ denotes the parameter vector of the binomial logistic regression model, the elements of the parameter vector are obtained by training, and T denotes vector transposition.
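A short sketch of this calculation follows, assuming the formula is the standard logistic (sigmoid) function applied to the inner product of θ and X, as is usual for binomial logistic regression. The function name `predicted_value` and the example numbers are hypothetical.

```python
import math
from typing import Sequence


def predicted_value(theta: Sequence[float], x: Sequence[float]) -> float:
    """Compute 1 / (1 + exp(-theta^T x)) for equal-length vectors."""
    if len(theta) != len(x):
        raise ValueError("theta and x must have the same length")
    z = sum(t * xi for t, xi in zip(theta, x))  # inner product theta^T x
    # Branch on the sign of z to avoid overflow in exp for large |z|.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)
```

For example, with θ = (-1, 2) and X = (1, 1) the inner product is 1, so the predicted value is 1 / (1 + e^(-1)), roughly 0.731. An all-zero parameter vector always yields 0.5.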
The embodiment of the device for determining the aging state of a data handling procedure provided by the present application can be applied to an electronic device. The device embodiment may be implemented in software, in hardware, or in a combination of software and hardware. Taking a software implementation as an example, the device in a logical sense is formed when the processor of the electronic device in which it resides reads the corresponding computer program instructions from the non-volatile memory into memory and runs them. At the hardware level, Fig. 6 shows a hardware structure diagram of the electronic device in which the device for determining the aging state of a data handling procedure of the present application resides. In addition to the processor, memory, network interface, and non-volatile memory shown in Fig. 6, the electronic device in which the device of the embodiment resides may also include other hardware according to the actual functions of the electronic device, which is not described further here.
The non-volatile memory may also be referred to as a computer-readable storage medium. A computer program is stored on it, and the computer program is used to execute the method flow of any of the embodiments shown in Fig. 1 to Fig. 3 above.
For the implementation of the functions and roles of each unit in the above device, refer to the implementation of the corresponding steps in the above method; details are not repeated here. Since the device embodiment basically corresponds to the method embodiment, refer to the description of the method embodiment for relevant details. The device embodiments described above are merely illustrative. The units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present application. A person of ordinary skill in the art can understand and implement it without creative effort.
Other embodiments of the present application will readily occur to those skilled in the art after considering the specification and practicing the invention disclosed here. The present application is intended to cover any variations, uses, or adaptations of the present application that follow its general principles and include common knowledge or conventional technical means in the art not disclosed in the present application. The specification and examples are to be regarded as exemplary only, and the true scope and spirit of the present application are indicated by the following claims.
The foregoing are merely preferred embodiments of the present application and are not intended to limit it. Any modification, equivalent substitution, or improvement made within the spirit and principles of the present application shall fall within the scope of protection of the present application.

Claims (11)

1. A method for determining the aging state of a data handling procedure, characterized in that the method comprises:
Determining a data handling procedure that can be scheduled by a data warehouse system;
Determining the aging feature vector of the data handling procedure;
Inputting the aging feature vector into a trained binomial classification model, and calculating, by the binomial classification model, a predicted value corresponding to the data handling procedure;
Determining the aging state of the data handling procedure based on the predicted value.
2. The method according to claim 1, characterized in that determining the aging state of the data handling procedure based on the predicted value comprises:
Determining the size relation between the predicted value and a preset threshold;
Determining the aging state of the data handling procedure based on the size relation.
3. The method according to claim 1, characterized in that the step of determining the aging feature vector of the data handling procedure comprises:
Determining characteristic parameters of the data handling procedure within a predetermined period;
Determining the aging feature vector of the data handling procedure based on the characteristic parameters.
4. The method according to claim 1, characterized in that the method further comprises:
Determining the data handling procedures that can be scheduled by the data warehouse system within a preset time period before a preset time point;
Determining the aging feature vector and the corresponding state change of each of the data handling procedures that can be scheduled by the data warehouse system;
Generating a training sample set based on the aging feature vectors and the corresponding state changes of the data handling procedures that can be scheduled by the data warehouse system;
Training the binomial classification model based on the training sample set.
5. The method according to claim 1, characterized in that the binomial classification model is a binomial logistic regression model, and in the step of calculating, by the binomial classification model, the predicted value corresponding to the data handling procedure, the predicted value is calculated by the following formula:
$$h_\theta(X) = \frac{1}{1 + e^{-\theta^{T} X}}$$
Wherein, hθ(X) denotes the predicted value, X denotes the aging feature vector, θ denotes the parameter vector of the binomial logistic regression model, the elements of the parameter vector are obtained by training, and T denotes vector transposition.
6. A device for determining the aging state of a data handling procedure, characterized in that the device comprises:
A first determining module, configured to determine a data handling procedure that can be scheduled by a data warehouse system;
A second determining module, configured to determine the aging feature vector of the data handling procedure determined by the first determining module;
A computing module, configured to input the aging feature vector determined by the second determining module into a trained binomial classification model, and to calculate, by the binomial classification model, a predicted value corresponding to the data handling procedure;
A third determining module, configured to determine the aging state of the data handling procedure based on the predicted value calculated by the computing module.
7. The device according to claim 6, characterized in that the third determining module comprises:
A first determination unit, configured to determine the size relation between the predicted value and a preset threshold;
A second determination unit, configured to determine the aging state of the data handling procedure based on the size relation determined by the first determination unit.
8. The device according to claim 6, characterized in that the second determining module comprises:
A third determination unit, configured to determine characteristic parameters of the data handling procedure within a predetermined period;
A fourth determination unit, configured to determine the aging feature vector of the data handling procedure based on the characteristic parameters determined by the third determination unit.
9. The device according to claim 6, characterized in that the device further comprises:
A fourth determining module, configured to determine the data handling procedures that can be scheduled by the data warehouse system within a preset time period before a preset time point;
A fifth determining module, configured to determine the aging feature vector and the corresponding state change of each of the data handling procedures, determined by the fourth determining module, that can be scheduled by the data warehouse system;
A sample set generation module, configured to generate a training sample set based on the aging feature vectors and the corresponding state changes, determined by the fifth determining module, of the data handling procedures that can be scheduled by the data warehouse system;
A training module, configured to train the binomial classification model based on the training sample set obtained by the sample set generation module.
10. The device according to claim 6, characterized in that the binomial classification model is a binomial logistic regression model, and the computing module calculates the predicted value by the following formula:
$$h_\theta(X) = \frac{1}{1 + e^{-\theta^{T} X}}$$
Wherein, hθ(X) denotes the predicted value, X denotes the aging feature vector, θ denotes the parameter vector of the binomial logistic regression model, the elements of the parameter vector are obtained by training, and T denotes vector transposition.
11. A computer-readable storage medium, characterized in that the storage medium stores a computer program, and the computer program is used to execute the method for determining the aging state of a data handling procedure according to any one of claims 1 to 5.
CN201810638889.4A 2018-06-20 2018-06-20 Method and device for determining aging state of data processing process and readable storage medium Active CN108897818B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810638889.4A CN108897818B (en) 2018-06-20 2018-06-20 Method and device for determining aging state of data processing process and readable storage medium

Publications (2)

Publication Number Publication Date
CN108897818A true CN108897818A (en) 2018-11-27
CN108897818B CN108897818B (en) 2020-12-01

Family

ID=64345294

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810638889.4A Active CN108897818B (en) 2018-06-20 2018-06-20 Method and device for determining aging state of data processing process and readable storage medium

Country Status (1)

Country Link
CN (1) CN108897818B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110297820A (en) * 2019-06-28 2019-10-01 京东数字科技控股有限公司 A kind of data processing method, device, equipment and storage medium

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7058615B2 (en) * 2003-04-24 2006-06-06 International Business Machines Corporation Scheduling for data warehouse ETL processing and data mining execution
CN102117306A (en) * 2010-01-04 2011-07-06 阿里巴巴集团控股有限公司 Method and system for monitoring ETL (extract-transform-load) data processing process
CN102999528A (en) * 2011-09-16 2013-03-27 阿里巴巴集团控股有限公司 Method and device for ETL (Extract Transform and Load) task off-lining and data cleaning in data warehouse
CN106407233A (en) * 2015-08-03 2017-02-15 阿里巴巴集团控股有限公司 A data processing method and apparatus
CN107484189A (en) * 2017-07-27 2017-12-15 北京市天元网络技术股份有限公司 LTE data handling systems
US20170371939A1 (en) * 2016-06-23 2017-12-28 International Business Machines Corporation Shipping of data through etl stages
CN107977754A (en) * 2017-12-18 2018-05-01 深圳前海微众银行股份有限公司 Data predication method, system and computer-readable recording medium

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
嵇晓 等: "工业数据仓库设计方法及其在质量分析中的应用", 《控制与决策》 *
黎毅编著: "《新世纪研究生教学用书 会计实证研究方法》", 30 July 2015, 东北财经大学出版社 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110297820A (en) * 2019-06-28 2019-10-01 京东数字科技控股有限公司 A kind of data processing method, device, equipment and storage medium
CN110297820B (en) * 2019-06-28 2020-09-01 京东数字科技控股有限公司 Data processing method, device, equipment and storage medium

Also Published As

Publication number Publication date
CN108897818B (en) 2020-12-01


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant