CN118295682A - Application updating method and device based on time sequence data - Google Patents

Application updating method and device based on time sequence data

Info

Publication number
CN118295682A
Authority
CN
China
Prior art keywords
data
time sequence
model
sequence data
decision tree
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202410484148.0A
Other languages
Chinese (zh)
Inventor
王贺
夏冬
苏德
武钊庆
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Industrial and Commercial Bank of China Ltd ICBC
Original Assignee
Industrial and Commercial Bank of China Ltd ICBC
Filing date
Publication date

Abstract

The invention discloses an application updating method and device based on time-series data, which relate to the technical field of artificial intelligence and can also be used in the financial field. The method comprises the following steps: receiving time-series data generated when a user uses an application; detecting abnormal data in the time-series data according to the time-series data and a pre-generated time-series data anomaly detection model, wherein the anomaly detection model comprises a plurality of decision tree models, each subsequent decision tree model is used to correct the prediction residual of the preceding decision tree model, and the anomaly detection model is trained jointly on labelled and unlabelled training data; and determining the cause of the abnormal data according to the abnormal data and a pre-generated graph model, and updating the application according to the cause. Based on multidimensional information and oriented to quantitative user experience indicators, the method and device identify and analyze anomalies in the user's time-series data, and can generate user experience analysis reports without the participation of human experts, so that the application can be updated in a more targeted manner.

Description

Application updating method and device based on time sequence data
Technical Field
The application belongs to the technical field of artificial intelligence, and can be applied to the technical field of the financial industry, in particular to an application updating method and device based on time sequence data.
Background
In the prior art there are many user experience analysis methods, but most of them merely compute quantitative indicators; few perform anomaly identification, and all of them rely on manual processing when analyzing the causes of anomalies, which is time-consuming and labor-intensive. Specifically:
1. Anomaly identification and cause analysis for quantitative user experience indicators cannot be completed automatically and require the participation of human experts;
2. The limited input data types of existing anomaly detection models restrict the generalization capability of the models; when facing a time-series analysis task, a model can only acquire time-series data in the normal mode, so it is prone to overfitting during training;
3. Most existing methods for attributing anomalies in users' time-series data for an application are based on statistics and manual work, lack visualization and flexibility, can only handle simple linear relationships, and have difficulty handling the nonlinear, dynamic and interdependent relationships in a complex system.
Disclosure of Invention
The invention can be applied to artificial intelligence processing technology in the financial field, and can also be used in any field other than the financial field.
The invention aims to provide an application updating method based on time sequence data, which is based on multidimensional information and oriented to user experience quantitative indexes to realize automatic anomaly identification and analysis, so that a user experience analysis report can be automatically generated without the participation of expert manpower to update the application more directionally and pertinently.
Another object of the present invention is to provide an application updating apparatus based on time series data. It is a further object of the present invention to provide an electronic device comprising a memory storing a computer program and a processor implementing the steps of the above-mentioned time-series data based application updating method when the processor executes the computer program. It is a further object of the present invention to provide a readable medium having stored thereon a computer program which, when executed by a processor, implements the steps of the above-described time-series data based application updating method.
In order to solve the technical problems in the background technology of the application, the application provides the following technical scheme:
In a first aspect, the present invention provides an application update method based on time series data, including:
receiving time-series data generated when a user uses an application;
detecting abnormal data in the time-series data according to the time-series data and a pre-generated time-series data anomaly detection model, wherein the anomaly detection model comprises a plurality of decision tree models, each subsequent decision tree model is used to correct the prediction residual of the preceding decision tree model, and the anomaly detection model is trained jointly on labelled and unlabelled training data; and
determining the cause of the abnormal data according to the abnormal data and a pre-generated graph model, and updating the application according to the cause.
In some embodiments of the invention, the step of generating the time series data anomaly detection model comprises:
performing the following iterative operation until the difference between the target value and the true value corresponding to the current training round of the time sequence data anomaly detection model is smaller than a preset threshold value:
Generating a target value of the current training round according to the residual error of the last training round of the time sequence data anomaly detection model;
determining a decision tree model to be added for the current training round according to the target value;
And generating a time sequence data anomaly detection model corresponding to the current training round according to the decision tree model added by the current training round and the decision tree model corresponding to the training round before the current training round.
In some embodiments of the present invention, before determining the decision tree model to which the current training round should be added according to the target value, the method further comprises:
Calculating a loss function of each node in the decision tree model corresponding to the training round before the current training round;
and determining the adding nodes of the decision tree model to be added in the current training round according to the loss function.
In some embodiments of the invention, determining a decision tree model to which the current training round should be added based on the target value comprises:
determining the decision tree model to be added according to the target value;
and adding the decision tree model to be added to the adding node.
In some embodiments of the invention, the step of generating the graph model comprises:
generating an initial graph model of the graph model according to a plurality of pieces of business data that cause the abnormal data, wherein each node of the initial graph model corresponds to at least one piece of business data, and each edge of the initial graph model corresponds to at least one piece of business data;
determining the direction of each edge, the weight of each node and the weight of each edge according to the association relationships among the plurality of pieces of business data;
and generating the graph model according to the initial graph model, the directions of the edges, the weights of the nodes and the weights of the edges.
In some embodiments of the present invention, determining a cause of the anomaly data from the anomaly data and a pre-generated graph model includes:
Traversing the graph model according to the abnormal data to calculate the contribution degree of each node and the contribution degree of each side;
and determining the reason for causing the abnormal data according to the contribution degree of each node and the contribution degree of each edge.
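The traversal described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed method: the text does not give the contribution formula, so the sketch assumes a directed acyclic graph whose edges point from cause to effect, takes the contribution of an edge as the product of the edge weights along the path back from the anomalous node, and takes the contribution of a node as that product multiplied by the node weight. The function name `contributions` and all node names are hypothetical.

```python
def contributions(edges, node_weight, anomaly_node):
    """Walk from the anomalous node back toward its causes.

    edges: {(cause, effect): weight}; assumed to form a directed acyclic graph.
    node_weight: {node: weight}; nodes missing from the map default to 1.0.
    Returns (node_score, edge_score) dictionaries.
    """
    # Index incoming edges so each effect can be traced back to its causes.
    parents = {}
    for (cause, effect), weight in edges.items():
        parents.setdefault(effect, []).append((cause, weight))

    node_score, edge_score = {}, {}
    stack = [(anomaly_node, 1.0)]  # (node, strength carried along the path so far)
    while stack:
        node, strength = stack.pop()
        for cause, weight in parents.get(node, []):
            passed = strength * weight  # assumed rule: multiply weights along the path
            edge_score[(cause, node)] = edge_score.get((cause, node), 0.0) + passed
            node_score[cause] = (
                node_score.get(cause, 0.0) + passed * node_weight.get(cause, 1.0)
            )
            stack.append((cause, passed))
    return node_score, edge_score


# Hypothetical business-data nodes and weights, for illustration only.
example_edges = {
    ("slow_page_load", "low_satisfaction"): 0.5,
    ("login_failure", "low_satisfaction"): 0.25,
    ("server_overload", "slow_page_load"): 0.5,
}
example_node_weight = {"slow_page_load": 1.0, "login_failure": 1.0, "server_overload": 1.0}

node_score, edge_score = contributions(example_edges, example_node_weight, "low_satisfaction")
most_likely_cause = max(node_score, key=node_score.get)
```

Under these assumptions the node with the highest score is reported as the most likely cause; any other scoring rule could be substituted without changing the traversal.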
In some embodiments of the present invention, detecting abnormal data of the time series data according to the time series data and a pre-generated time series data abnormal detection model includes:
calculating the learning rate, the depth and the number of decision trees corresponding to the time sequence data according to the time sequence data abnormality detection model;
And determining the abnormal data according to the learning rate, the depth of the decision tree and the number.
In a second aspect, the present invention provides an application updating apparatus based on time series data, the apparatus comprising:
the time sequence data receiving module is used for receiving time sequence data of application used by a user;
The abnormal data detection module is used for detecting abnormal data of the time sequence data according to the time sequence data and a pre-generated time sequence data abnormal detection model; the time sequence data anomaly detection model comprises a plurality of decision tree models, wherein the latter decision tree model is used for correcting the prediction residual error of the former decision tree model, and the time sequence data anomaly detection model is generated by training together according to marked training data and unmarked training data;
and the reason determining module is used for determining the reason for causing the abnormal data according to the abnormal data and the pre-generated graph model and updating the application according to the reason.
In some embodiments of the present invention, an application updating apparatus based on time series data further includes:
a detection model generation module, configured to generate the time-series data anomaly detection model, where the detection model generation module includes:
the iterative operation unit is used for performing the following iterative operation until the difference between the target value and the true value corresponding to the current training round of the time sequence data anomaly detection model is smaller than a preset threshold value:
The target value generation unit is used for generating a target value of the current training round according to the residual error of the last training round of the time sequence data anomaly detection model;
the decision tree model determining unit is used for determining a decision tree model to be added in the current training round according to the target value;
The detection model generation unit is used for generating a time sequence data abnormality detection model corresponding to the current training round according to the decision tree model added by the current training round and the decision tree model corresponding to the training round before the current training round.
In some embodiments of the invention, the detection model generation module further comprises:
the node loss function calculation unit is used for calculating the loss function of each node in the decision tree model corresponding to the training round before the current training round;
And the adding node determining unit is used for determining adding nodes of the decision tree model to be added in the current training round according to the loss function.
In some embodiments of the invention, the decision tree model determination unit comprises:
a decision tree model determining subunit, configured to determine the decision tree model to be added according to the target value;
and the decision tree model adding unit is used for adding the decision tree model to be added into the adding node.
In some embodiments of the present invention, an application updating apparatus based on time series data further includes:
A graph model generation module for generating the graph model, the graph model generation module comprising:
An initial model generation unit configured to generate an initial graph model of the graph model from a plurality of pieces of business data that cause the abnormal data, wherein each node of the initial graph model corresponds to at least one piece of business data, and each edge of the initial graph model corresponds to at least one piece of business data;
A direction weight determining unit, configured to determine the direction of each edge, the weight of each node, and the weight of each edge according to the association relationships among the plurality of pieces of business data;
and the graph model generating unit is used for generating the graph model according to the initial graph model, the direction of the edge, the weight of the node and the weight of the edge.
In some embodiments of the invention, the cause determination module comprises:
A contribution calculation unit for traversing the graph model according to the abnormal data to calculate the contribution of each node and the contribution of each edge;
And the reason determining unit is used for determining the reason for causing the abnormal data according to the contribution degree of each node and the contribution degree of each side.
In some embodiments of the invention, the abnormal data detection module includes:
The learning rate calculation unit is used for calculating the learning rate, the depth of the decision tree and the number corresponding to the time sequence data according to the time sequence data abnormality detection model;
And the abnormal data determining unit is used for determining the abnormal data according to the learning rate, the depth of the decision tree and the number.
In a third aspect, the invention provides a computer program product comprising computer programs/instructions which, when executed by a processor, implement the steps of a method of updating an application based on time series data.
In a fourth aspect, the present invention provides an electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the steps of a method for updating an application based on time series data when the program is executed by the processor.
In a fifth aspect, the present invention provides a computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of a time series data based application updating method.
As can be seen from the above description, the embodiment of the present invention provides an application updating method and apparatus based on time series data, and the corresponding method includes: firstly, receiving time sequence data of application used by a user; then, detecting abnormal data of the time sequence data according to the time sequence data and a pre-generated time sequence data abnormal detection model; the time sequence data anomaly detection model comprises a plurality of decision tree models, wherein the latter decision tree model is used for correcting the prediction residual error of the former decision tree model, and the time sequence data anomaly detection model is generated by training together according to marked training data and unmarked training data; and finally, determining the reason for causing the abnormal data according to the abnormal data and the pre-generated graph model, and updating the application according to the reason.
According to the application updating method and device based on time-series data provided by the embodiments of the present invention, firstly, anomalous time-series data is identified and analyzed based on multidimensional information and oriented to quantitative user experience indicators, and a user experience analysis report can be generated automatically without the participation of human experts.
Secondly, the invention provides a new time-series data anomaly detection model into which a self-supervised learning module is introduced, so that generalizable feature representations can be learned from general data and the generalization capability of the model can be improved.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings that are required in the embodiments or the description of the prior art will be briefly described, and it is obvious that the drawings in the following description are some embodiments of the present invention, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flowchart showing a method for updating an application based on time series data according to an embodiment of the present invention;
FIG. 2 is a second flow chart of an application update method based on time series data according to an embodiment of the invention;
FIG. 3 is a flowchart illustrating a step 400 of an application updating method based on time series data according to an embodiment of the present invention;
FIG. 4 is a second flowchart illustrating a step 400 of an application update method based on time series data according to an embodiment of the present invention;
FIG. 5 is a flowchart illustrating a step 403 of an application update method based on time series data according to an embodiment of the present invention;
FIG. 6 is a third flowchart of step 400 of an application update method based on time series data according to an embodiment of the present invention;
FIG. 7 is a flowchart illustrating a method for updating applications based on time series data according to an embodiment of the present invention;
FIG. 8 is a flowchart illustrating step 500 of an application update method based on time series data according to an embodiment of the present invention;
FIG. 9 is a flowchart illustrating a step 300 of an application update method based on time series data according to an embodiment of the present invention;
FIG. 10 is a flowchart illustrating a step 200 of an application update method based on time series data according to an embodiment of the present invention;
FIG. 11 is a flowchart of an application update method based on time series data according to an embodiment of the present invention;
FIG. 12 is a schematic flow chart of step S1 in the embodiment of the invention;
FIG. 13 is a schematic flow chart of step S2 in the embodiment of the invention;
FIG. 14 is a block diagram of an application updating device based on time series data according to an embodiment of the invention;
FIG. 15 is a block diagram of a timing data based application update device according to an embodiment of the invention;
FIG. 16 is a first block diagram of the detection model generation module 40 according to an embodiment of the present invention;
FIG. 17 is a second block diagram of the detection model generation module 40 according to an embodiment of the present invention;
FIG. 18 is a block diagram of a decision tree model determination unit 40c in an embodiment of the present invention;
FIG. 19 is a third block diagram of an application updating device based on time series data according to an embodiment of the invention;
FIG. 20 is a block diagram of the graph model generation module 50 in an embodiment of the invention;
FIG. 21 is a block diagram of the cause determination module 30 according to an embodiment of the present invention;
FIG. 22 is a block diagram of the abnormal data detection module 20 according to an embodiment of the present invention;
FIG. 23 is a schematic structural diagram of an electronic device in an embodiment of the invention.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present invention more apparent, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is apparent that the described embodiments are some embodiments of the present invention, but not all embodiments of the present invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
It will be appreciated by those skilled in the art that embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
It is noted that the terms "comprises" and "comprising," and any variations thereof, in the description and claims of the present application and in the foregoing figures, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed or inherent to such process, method, article, or apparatus. Embodiments of the application and features of the embodiments may be combined with each other without conflict. The application will be described in detail below with reference to the drawings in connection with embodiments.
The information collected in this technical solution is information and data authorized by the user or fully authorized by all parties. The collection, storage, use, processing, transmission, provision, disclosure and application of the related data comply with the relevant laws, regulations and standards of the relevant countries and regions; necessary security measures are taken; public order and good morals are not violated; and a corresponding operation entry is provided for the user to choose to grant or refuse authorization.
A corresponding operation entry is also provided for the user to accept or reject the automated decision result; if the user chooses to reject it, an expert decision process is entered.
An embodiment of the present invention provides a specific implementation manner of an application update method based on time series data, referring to fig. 1, the method specifically includes the following contents:
step 100: receiving time sequence data of application used by a user;
step 200: detecting abnormal data of the time sequence data according to the time sequence data and a pre-generated time sequence data abnormal detection model; the time sequence data anomaly detection model comprises a plurality of decision tree models, wherein the latter decision tree model is used for correcting the prediction residual error of the former decision tree model, and the time sequence data anomaly detection model is generated by training together according to marked training data and unmarked training data;
step 300: and determining the reason for causing the abnormal data according to the abnormal data and the pre-generated graph model, and updating the application according to the reason.
As can be seen from the above description, the embodiment of the present invention provides an application updating method based on time series data, including: firstly, receiving time sequence data of application used by a user; then, detecting abnormal data of the time sequence data according to the time sequence data and a pre-generated time sequence data abnormal detection model; the time sequence data anomaly detection model comprises a plurality of decision tree models, wherein the latter decision tree model is used for correcting the prediction residual error of the former decision tree model, and the time sequence data anomaly detection model is generated by training together according to marked training data and unmarked training data; and finally, determining the reason for causing the abnormal data according to the abnormal data and the pre-generated graph model, and updating the application according to the reason.
According to the application updating method based on time-series data provided by the embodiment of the present invention, firstly, anomalous time-series data is identified and analyzed based on multidimensional information and oriented to quantitative user experience indicators, and a user experience analysis report can be generated automatically without the participation of human experts.
Secondly, the invention provides a new time-series data anomaly detection model into which a self-supervised learning module is introduced, so that generalizable feature representations can be learned from general data and the generalization capability of the model can be improved.
The time-series data in step 100 refers to a sequence of data points arranged in chronological order, wherein each data point is associated with a timestamp. Time-series data has the following characteristics:
Time dependence: there is a temporal dependence between data points, i.e. the data at an earlier point in time may affect the data at a later point in time.
Seasonality: a pattern recurs within a fixed period of time.
Trend: the data shows a long-term rise or fall over time.
Periodicity: in addition to ordinary seasonality, there may be long-period fluctuations, such as economic cycles.
Noise: random fluctuations that may be introduced during data collection and transmission.
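The characteristics listed above can be illustrated with a toy series. The following is a minimal sketch, assuming an additive model (trend + seasonal component + noise) and a sine-shaped seasonal pattern; the function names and default parameter values are hypothetical and are not taken from the embodiment.

```python
import math
import random


def synthetic_series(n, trend=0.05, period=7, amplitude=2.0, noise_sd=0.3, seed=0):
    """Build a toy series as trend + seasonal component + Gaussian noise."""
    rng = random.Random(seed)
    series = []
    for t in range(n):
        trend_part = trend * t                                       # long-term rise
        seasonal_part = amplitude * math.sin(2 * math.pi * t / period)  # repeating pattern
        noise_part = rng.gauss(0.0, noise_sd)                        # random fluctuation
        series.append(trend_part + seasonal_part + noise_part)
    return series


def lag_autocorrelation(series, lag):
    """Sample autocorrelation at a given lag, a simple measure of time dependence."""
    n = len(series)
    mean = sum(series) / n
    variance = sum((x - mean) ** 2 for x in series)
    covariance = sum(
        (series[t] - mean) * (series[t - lag] - mean) for t in range(lag, n)
    )
    return covariance / variance


daily_metric = synthetic_series(70)
weekly_dependence = lag_autocorrelation(daily_metric, lag=7)
```

A high autocorrelation at a lag equal to the seasonal period is one simple way to see both time dependence and seasonality in the same series.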
Preferably, the time-series data in step 100 includes quantitative user experience indicators, such as satisfaction score, user dwell time, user click volume, conversion rate, average visit duration, bounce rate, page load speed and user return rate; and characteristic data of the user, such as age, assets, login frequency, transfer frequency and investment records.
It will be appreciated that the time-series data anomaly detection model may overfit the training data, particularly when the amount of data is small or the feature space is large. The model has a number of hyperparameters to be tuned, such as the tree depth and the learning rate; proper selection of these hyperparameters is critical to the performance of the model, but requires extensive experimentation and tuning. In addition, the model is typically used for structured data and is not the best choice for unstructured data (e.g., text and images). For these reasons, a self-supervised learning module needs to be introduced into the time-series data anomaly detection model.
The time-series data anomaly detection model in step 200 has the following advantages: the model is kept simpler, which helps prevent overfitting; in addition, labels are generated from the characteristics of the data itself, which helps reduce overfitting to normal-mode data. The labels generated by self-supervised learning provide more learning signals for the model, thereby improving its generalization capability on unseen data.
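One way to picture labels generated from the data itself, combined with expert-labelled data for joint training, is sketched below. The labelling rule is an assumption made only for illustration: the text does not say how the self-supervised labels are produced, so this sketch swaps in a robust z-score based on the median absolute deviation (MAD). The function names and the threshold of 3.0 are hypothetical.

```python
from statistics import median


def pseudo_labels(values, threshold=3.0):
    """Label each point 1 (anomalous) or 0 (normal) using only the data itself.

    Assumed rule: a robust z-score |v - median| / (1.4826 * MAD) above the threshold.
    """
    med = median(values)
    mad = median(abs(v - med) for v in values) or 1e-9  # avoid division by zero
    return [1 if abs(v - med) / (1.4826 * mad) > threshold else 0 for v in values]


def build_training_set(labelled, unlabelled):
    """Combine expert-labelled (value, label) pairs with pseudo-labelled values."""
    generated = list(zip(unlabelled, pseudo_labels(unlabelled)))
    return list(labelled) + generated


# Hypothetical page-load times in seconds.
expert_labelled = [(10, 0), (48, 1)]
unlabelled_values = [10, 11, 9, 10, 12, 10, 50]
training_set = build_training_set(expert_labelled, unlabelled_values)
```

The combined set can then be passed to whatever training procedure is used, so that unlabelled observations also contribute a learning signal.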
For step 300, the graph model refers to a model that represents the conditional dependencies among a set of variables (the multiple causes of the abnormal data). In this model, each node represents a cause, and each edge represents the probabilistic relationship between causes.
In some embodiments of the present invention, referring to fig. 2, an application updating method based on time series data further includes:
Step 400: and generating the time sequence data abnormality detection model.
Specifically, the time-series data anomaly detection model is generated by means of gradient boosting, in which the existing model is improved by iteratively adding new decision tree models. That is, in each iteration a new "weak learner" is added to compensate for the deficiencies of the existing model.
Next, referring to fig. 3, step 400 includes:
Step 401: performing the following iterative operation until the difference between the target value and the true value corresponding to the current training round of the time sequence data anomaly detection model is smaller than a preset threshold value:
step 402: generating a target value of the current training round according to the residual error of the last training round of the time sequence data anomaly detection model;
In each training round, the residuals of the ensemble of all decision trees from the previous round (i.e., the differences between the true values and the current predicted values) need to be computed. The negative gradient of the loss function, evaluated at the prediction of the current model, is used to generate the target value for the next decision tree.
Step 403: determining a decision tree model to be added for the current training round according to the target value;
Each iteration of the above steps reduces the residual left by the previous iteration. When the model's prediction is inconsistent with the actual observed value, a new decision tree is generated along the gradient direction in which the residual decreases, so as to reduce the previous residual; the iteration is repeated until the output is substantially consistent with the actual observed value (one sign that the model is being continuously optimized is that its loss function decreases from iteration to iteration).
Step 404: and generating a time sequence data anomaly detection model corresponding to the current training round according to the decision tree model added by the current training round and the decision tree model corresponding to the training round before the current training round.
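Steps 401 to 404 can be sketched as a small boosting loop. This is an illustration under stated assumptions rather than the embodiment's implementation: it assumes a squared-error loss (so the negative gradient equals the residual), a single numeric feature with at least two distinct values, and one-split trees ("stumps") as the weak learners. The names `fit_stump` and `boost` and the default parameter values are hypothetical.

```python
def fit_stump(xs, targets):
    """Fit a one-split regression tree to the targets by least squares."""
    best = None
    for split in sorted(set(xs))[:-1]:  # drop the largest value so the right side is never empty
        left = [t for x, t in zip(xs, targets) if x <= split]
        right = [t for x, t in zip(xs, targets) if x > split]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = sum((t - left_mean) ** 2 for t in left) + sum(
            (t - right_mean) ** 2 for t in right
        )
        if best is None or sse < best[0]:
            best = (sse, split, left_mean, right_mean)
    _, split, left_mean, right_mean = best
    return lambda x: left_mean if x <= split else right_mean


def boost(xs, ys, learning_rate=0.5, threshold=0.05, max_rounds=200):
    """Add trees until every prediction is within `threshold` of its true value."""
    trees = []
    preds = [0.0] * len(xs)
    for _ in range(max_rounds):
        # Step 402: target value of this round = residual of the previous round.
        residuals = [y - p for y, p in zip(ys, preds)]
        # Step 401: stop once the gap to the true values is below the threshold.
        if max(abs(r) for r in residuals) < threshold:
            break
        # Step 403: determine the tree to add from the target values.
        tree = fit_stump(xs, residuals)
        # Step 404: the new model is the previous trees plus the added tree.
        trees.append(tree)
        preds = [p + learning_rate * tree(x) for x, p in zip(xs, preds)]
    return trees, preds


sample_x = [1, 2, 3, 4, 5, 6]
sample_y = [0, 0, 0, 1, 1, 1]
fitted_trees, fitted_preds = boost(sample_x, sample_y)
```

Each added tree corrects part of what the earlier trees left unexplained, which is why the residual shrinks from round to round.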
In some embodiments of the present invention, referring to fig. 4, prior to step 403, step 400 further comprises:
Step 405: calculating a loss function of each node in the decision tree model corresponding to the training round before the current training round;
step 406: and determining the adding nodes of the decision tree model to be added in the current training round according to the loss function.
For each node, the potential improvement in model performance offered by each candidate split point of each feature is calculated (this computation can be performed in parallel). Using the loss function of each node, the split point that reduces the loss the most is selected as the adding node. The process preferably uses a depth-first algorithm, and the decision tree stops growing once it reaches a specified maximum depth.
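The split search and depth-limited growth described above can be sketched as follows. The sketch assumes a squared-error loss at each node and evaluates candidate splits sequentially rather than in parallel; the function names and the dictionary representation of the tree are hypothetical.

```python
def sse(values):
    """Squared-error loss of a node that predicts the mean of its targets."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(rows, targets):
    """Return (feature_index, threshold, gain) for the split that reduces the loss most.

    Returns (None, None, 0.0) when no split reduces the loss.
    """
    parent_loss = sse(targets)
    best = (None, None, 0.0)
    for feature in range(len(rows[0])):
        for threshold in sorted({row[feature] for row in rows})[:-1]:
            left = [t for row, t in zip(rows, targets) if row[feature] <= threshold]
            right = [t for row, t in zip(rows, targets) if row[feature] > threshold]
            gain = parent_loss - (sse(left) + sse(right))
            if gain > best[2]:
                best = (feature, threshold, gain)
    return best


def grow(rows, targets, max_depth, depth=0):
    """Grow a tree depth-first; stop at max_depth or when no split reduces the loss."""
    feature, threshold, _ = best_split(rows, targets)
    if depth >= max_depth or feature is None:
        return {"leaf": sum(targets) / len(targets)}
    left = [(r, t) for r, t in zip(rows, targets) if r[feature] <= threshold]
    right = [(r, t) for r, t in zip(rows, targets) if r[feature] > threshold]
    return {
        "feature": feature,
        "threshold": threshold,
        "left": grow([r for r, _ in left], [t for _, t in left], max_depth, depth + 1),
        "right": grow([r for r, _ in right], [t for _, t in right], max_depth, depth + 1),
    }


sample_rows = [(1, 5), (2, 1), (3, 4), (4, 2)]
sample_targets = [0, 0, 1, 1]
sample_tree = grow(sample_rows, sample_targets, max_depth=2)
```

Because the left branch is fully expanded before the right branch is visited, the recursion is depth-first, and the `max_depth` check bounds how far it can go.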
In some embodiments of the present invention, referring to fig. 5, step 403 includes:
step 4031: determining the decision tree model to be added according to the target value;
Specifically, the root node, the internal nodes and the leaf nodes of the decision tree model to be added are determined according to the target value, where:
Root node: has no incoming edge and zero or more outgoing edges;
Internal node: has exactly one incoming edge and a plurality of outgoing edges;
Leaf node: has exactly one incoming edge and no outgoing edge.
It should be noted that each leaf node carries a class label, while the root node and the internal nodes contain attribute test conditions; each of them corresponds to one condition that separates records with different characteristics. To judge a record, start from the root node and follow the branch selected by each test until a leaf node is reached; the class of that leaf node is the classification result.
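The traversal just described can be sketched as follows, assuming binary tests of the form "feature value <= threshold". The `Node` class, the feature names, and the labels are hypothetical and only serve the example.

```python
from dataclasses import dataclass
from typing import Dict, Optional

@dataclass
class Node:
    # Root and internal nodes hold an attribute test; leaf nodes hold a label.
    feature: Optional[str] = None
    threshold: float = 0.0
    left: Optional["Node"] = None    # branch taken when the test passes
    right: Optional["Node"] = None   # branch taken when the test fails
    label: Optional[str] = None

def classify(node: Node, record: Dict[str, float]) -> str:
    """Walk from the root to a leaf and return the leaf's class label."""
    while node.label is None:
        node = node.left if record[node.feature] <= node.threshold else node.right
    return node.label

# Hypothetical two-level tree: bounce rate first, then page load time.
tree = Node(
    feature="bounce_rate", threshold=0.5,
    left=Node(label="normal"),
    right=Node(feature="page_load_seconds", threshold=3.0,
               left=Node(label="normal"), right=Node(label="abnormal")),
)
```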
Step 4032: adding the decision tree model to be added at the adding node.
In some embodiments of the present invention, referring to fig. 6, step 400 further comprises:
Step 407: adding a regular term in an objective function of the decision tree model to be added in the current training round;
The regularization term includes the number of leaf nodes of the decision tree model to be added and the L2 norm of the leaf node output values. In this case, the objective function of the decision tree model to be added is the loss function $L$ summed over the training samples plus the regularization term $\Omega$.
Here, $L$ is a loss function that measures the error between the model prediction $\hat{y}_i^{(t)}$ and the true value $y_i$, and $\Omega$ is a regularization term for controlling model complexity, defined as $\Omega(f) = \gamma T + \frac{1}{2}\lambda \lVert w \rVert^{2}$, where $T$ is the number of leaf nodes of the tree, $w$ is the vector of leaf node weights, and $\gamma$ and $\lambda$ are the regularization coefficients of the two parts, respectively.
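The displayed formula did not survive in this text. Under the definitions given here, a regularized objective of the usual gradient-boosting form would read as below; this is a reconstruction from the surrounding description, and the sample count $n$ and the symbol $f_t$ for the tree added in round $t$ are notational assumptions.

```latex
\mathrm{Obj}^{(t)} = \sum_{i=1}^{n} L\left(y_i, \hat{y}_i^{(t)}\right) + \Omega(f_t),
\qquad
\Omega(f) = \gamma T + \frac{1}{2}\lambda \lVert w \rVert^{2}
```

A larger $\gamma$ penalizes every additional leaf, and a larger $\lambda$ shrinks the leaf output values, so both coefficients push toward simpler trees.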
In some embodiments of the present invention, referring to fig. 7, an application updating method based on time series data further includes:
Step 500: generating the graph model. Referring to fig. 8, step 500 includes:
step 501: generating an initial graph model of the graph model according to a plurality of business data causing the abnormal data; wherein, the node of the initial graph model at least corresponds to one business data, and the edge of the initial graph model at least corresponds to one business data;
preferably, the service data in step 501 includes:
① Multidimensional user experience quantitative indexes calculated from user behavior data of the mobile banking app, such as satisfaction score, user dwell time, user click count, conversion rate, average visit duration, bounce rate, page loading speed and user return rate. It is assumed that daily index data covering several years has already been acquired.
② Characteristic data of the user, such as age, assets, login frequency, transfer frequency and investment records.
③ Marketing campaign data of the mobile bank, such as whether there is advertising for financial products and the marketing scope (in-app messages, SMS messages, WeChat official account messages, offline branch messages).
④ Modifications to the mobile bank: whether the UI of the financial module has been redesigned, whether the business flow of the financial module has been optimized, and whether back-end performance has been optimized.
⑤ Social public opinion and economic data: whether there is public opinion related to the company and its degree of influence, whether there is public opinion related to financial products and its degree of influence, the broad market index, per capita annual income and other data reflecting social focus and the economy.
Step 502: determining the direction of the edge, the weight of the node and the weight of the edge according to the association relation among the plurality of service data;
Step 503: generating the graph model according to the initial graph model, the direction of the edge, the weight of the node and the weight of the edge.
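Steps 501 to 503 can be pictured with a small directed, weighted graph. The sketch below is one possible in-memory representation; the `GraphModel` class, the factor names, and every weight are hypothetical values chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict

@dataclass
class GraphModel:
    node_weight: Dict[str, float] = field(default_factory=dict)
    # edges[src][dst] = weight; the direction src -> dst means "src influences dst"
    edges: Dict[str, Dict[str, float]] = field(default_factory=dict)

    def add_node(self, name: str, weight: float = 1.0) -> None:
        self.node_weight[name] = weight
        self.edges.setdefault(name, {})

    def add_edge(self, src: str, dst: str, weight: float) -> None:
        for name in (src, dst):
            if name not in self.node_weight:
                self.add_node(name)
        self.edges[src][dst] = weight

# Step 501: nodes from business data; Step 502: directions and weights;
# Step 503: the resulting graph model.
graph = GraphModel()
graph.add_edge("page_load_speed", "bounce_rate", 0.5)
graph.add_edge("bounce_rate", "conversion_rate", 0.7)
graph.add_edge("marketing_campaign", "conversion_rate", 0.2)
```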
In some embodiments of the present invention, referring to fig. 9, step 300 comprises:
step 301: traversing the graph model according to the abnormal data to calculate the contribution degree of each node and the contribution degree of each side;
First, a depth-first search is used to determine the propagation path of the abnormal data in the graph model. The contribution degree of a node can be regarded as its importance in propagating or identifying the abnormal data. Next, the contribution degree of each node is calculated from the centrality (e.g., degree centrality, closeness centrality, betweenness centrality) of that node on the propagation path of the abnormal data in the graph model.
The contribution degree of an edge can be evaluated from the traffic or amount of information the edge carries when connecting abnormal data. For example, if an edge connects two nodes that are frequently abnormal, the contribution degree of that edge may be high.
Step 302: determining the reason for causing the abnormal data according to the contribution degree of each node and the contribution degree of each edge.
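A minimal sketch of steps 301 and 302 follows. Instead of the centrality measures named above, it uses a simpler stand-in: the contribution of a node or edge is the product of edge weights along each depth-first path leading back from the anomalous index, summed over paths. That scoring rule, the function name `contributions`, and the sample weights are assumptions made for this illustration.

```python
from typing import Dict, FrozenSet, Tuple

Parents = Dict[str, Dict[str, float]]  # node -> {upstream node: edge weight}

def contributions(parents: Parents, anomalous: str):
    """Depth-first walk upstream from the anomalous node, scoring nodes and edges."""
    node_score: Dict[str, float] = {}
    edge_score: Dict[Tuple[str, str], float] = {}

    def dfs(node: str, strength: float, on_path: FrozenSet[str]) -> None:
        for upstream, weight in parents.get(node, {}).items():
            if upstream in on_path:  # do not walk around cycles
                continue
            reached = strength * weight
            node_score[upstream] = node_score.get(upstream, 0.0) + reached
            key = (upstream, node)
            edge_score[key] = edge_score.get(key, 0.0) + reached
            dfs(upstream, reached, on_path | {upstream})

    dfs(anomalous, 1.0, frozenset([anomalous]))
    return node_score, edge_score

parents = {
    "conversion_rate": {"bounce_rate": 0.7, "marketing_campaign": 0.2},
    "bounce_rate": {"page_load_speed": 0.5},
}
node_score, edge_score = contributions(parents, "conversion_rate")
# Step 302: the highest-scoring node is reported as the most likely cause.
likely_cause = max(node_score, key=node_score.get)
```

Because mutually influencing factors can form cycles, the walk tracks the nodes on the current path and refuses to revisit them.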
In some embodiments of the present invention, referring to fig. 10, step 200 comprises:
Step 201: calculating the learning rate, the depth and the number of decision trees corresponding to the time sequence data according to the time sequence data abnormality detection model;
step 202: and determining the abnormal data according to the learning rate, the depth of the decision tree and the number.
It will be appreciated that when different time series data (generally of different volumes) are input into the time series data anomaly detection model, the learning rate, the depth of the decision trees and the number of decision trees involved in computing the abnormal data contained in that time series data differ, and the abnormal data can be deduced in reverse on this basis.
As can be seen from the above description, the embodiment of the present invention provides an application updating method based on time series data, including: firstly, receiving time sequence data of application used by a user; then, detecting abnormal data of the time sequence data according to the time sequence data and a pre-generated time sequence data abnormal detection model; the time sequence data anomaly detection model comprises a plurality of decision tree models, wherein the latter decision tree model is used for correcting the prediction residual error of the former decision tree model, and the time sequence data anomaly detection model is generated by training together according to marked training data and unmarked training data; and finally, determining the reason for causing the abnormal data according to the abnormal data and the pre-generated graph model, and updating the application according to the reason.
With the application updating method based on time sequence data, the timeliness of anomaly identification and cause analysis can be customized per index, down to the hour level; for example, each index can be independently configured to be analyzed once per hour, day, week or month.
In addition, the graph-based attribution analysis model can identify the key factors in a complex system and the way those factors act on one another, thereby overcoming the limitations of attribution analysis in existing methods.
In order to further explain the scheme, the invention also provides a specific implementation mode of the application updating method based on time sequence data, referring to fig. 11, specifically comprising the following steps.
Term interpretation:
User experience quantization index: a numerical criterion for measuring and evaluating the quality of experience perceived by a user when using a product or service; here it refers to the financial function of a mobile banking app. These indexes are typically obtained by collecting and analyzing user behavior, feedback, and data in order to understand user satisfaction with the product or service, frequency of use, conversion rate, and the like.
Abnormality identification and analysis: in general, the user experience quantization indexes follow the same change rule across different time periods, and abnormal data identification and analysis are performed on this assumption. For example, the conversion rate from January to December 2022 and the conversion rate from January to December 2023 are expected to follow the same change rule; if the conversion rate in December 2023 is found to be abnormally low or abnormally high, it is regarded as abnormal data. This is a simple method of anomaly identification.
Attribution analysis: attribution analysis is a commonly used method of data analysis for determining the cause or influencing factor of an event or phenomenon occurring.
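The simple identification rule given under "Abnormality identification and analysis" can be sketched as a month-by-month comparison of two years. The relative tolerance of 30% and the function name `flag_anomalies` are hypothetical; the text does not state how large a deviation counts as abnormal.

```python
from typing import List

def flag_anomalies(previous: List[float], current: List[float],
                   tolerance: float = 0.3) -> List[int]:
    """Return the 1-based months whose value deviates from the same month
    of the previous year by more than the given relative tolerance."""
    flagged = []
    for month, (prev, cur) in enumerate(zip(previous, current), start=1):
        if prev == 0:
            continue  # no baseline to compare against
        if abs(cur - prev) / abs(prev) > tolerance:
            flagged.append(month)
    return flagged
```

For example, if the conversion rate was 0.10 in every month of the earlier year and drops to 0.02 only in December of the later year, only month 12 is flagged.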
S1: performing anomaly identification on the time sequence data generated by the user's use of the mobile bank.
It is assumed that the analysis timeliness of the conversion rate index is one day, that is, whether the actual value on day T is abnormal is analyzed on day T+1. First, based on the five kinds of data (the historical conversion rate and the conversion rate on day T-1, the characteristic data of the user, the marketing activity data of the mobile bank, the modifications to the mobile bank, and the social public opinion and economic data), abnormal points are determined using the time sequence data anomaly detection model.
Preferably, random noise (preferably Gaussian noise) can be added to the time series data to increase the diversity of the data and improve the robustness of the model to noise. The temporal order of the time series data can also be changed to increase the diversity of the data, for example by randomly shuffling the order of some segments of the time series to test the sensitivity of the model to changes in temporal order. Alternatively, the scale of the time series data can be changed to test the sensitivity of the model to different scales: the data is scaled linearly, or the sign of the data is inverted to simulate a reverse change. In addition, the time series data can be convolved with a Gaussian kernel to reduce noise and fluctuation and smooth the data.
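The augmentations listed above can be sketched with the standard library alone. The function names, the segment length, the kernel radius, and the edge handling of the smoothing step (renormalizing the kernel near the ends of the series) are assumptions made for this example.

```python
import math
import random
from typing import List

def add_gaussian_noise(series: List[float], sigma: float, rng: random.Random) -> List[float]:
    """Add zero-mean Gaussian noise to every point."""
    return [x + rng.gauss(0.0, sigma) for x in series]

def shuffle_segments(series: List[float], segment_len: int, rng: random.Random) -> List[float]:
    """Cut the series into fixed-length segments and shuffle their order."""
    segments = [series[i:i + segment_len] for i in range(0, len(series), segment_len)]
    rng.shuffle(segments)
    return [x for segment in segments for x in segment]

def scale(series: List[float], factor: float) -> List[float]:
    """Linear scaling; a factor of -1 inverts the sign of the data."""
    return [factor * x for x in series]

def gaussian_smooth(series: List[float], sigma: float = 1.0, radius: int = 2) -> List[float]:
    """Convolve with a Gaussian kernel, renormalizing where the kernel overhangs."""
    kernel = {k: math.exp(-k * k / (2 * sigma * sigma)) for k in range(-radius, radius + 1)}
    smoothed = []
    for i in range(len(series)):
        taps = [(w, series[i + k]) for k, w in kernel.items() if 0 <= i + k < len(series)]
        total = sum(w for w, _ in taps)
        smoothed.append(sum(w * x for w, x in taps) / total)
    return smoothed
```

Passing an explicitly seeded `random.Random` instance keeps the augmented training sets reproducible between runs.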
It will be appreciated that adding the self-supervised learning module to the time series data anomaly detection model may bring the following beneficial effects:
the self-supervised learning module may utilize the self-characteristics of the data to generate tags that may help reduce overfitting to the normal mode data.
Through the labels generated by self-supervision learning, more learning signals can be provided for the model, so that the generalization capability of the model on unseen data is improved.
The self-supervised learning module may learn useful representations of data that may be used as input features to the time series data anomaly detection model to enhance the expressive power of the model.
The self-supervision learning module can automatically generate labels, so that the cost and the workload of manually marking data are reduced.
It should be noted that the self-supervised learning module and the time series data anomaly detection model need to be trained simultaneously, so that the performance and generalization capability of the model are improved.
In summary, by integrating the self-supervised learning module, the performance of the time series data anomaly detection model can be improved by utilizing the self-characteristics of the data, and the risk of overfitting to the normal mode data can be reduced.
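The text does not specify which pretext task the self-supervised learning module uses. As one common possibility, labels can be derived from the series itself by asking the model to predict the next value from a sliding window; the sketch below builds such pairs, and the function name and window length are hypothetical.

```python
from typing import List, Tuple

def make_pretext_pairs(series: List[float], window: int) -> List[Tuple[List[float], float]]:
    """Turn an unlabeled series into (input window, next value) training pairs."""
    return [(series[i:i + window], series[i + window])
            for i in range(len(series) - window)]
```

Every pair is produced without manual labeling, which is the sense in which unlabeled training data can contribute learning signals alongside the labeled data.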
S2: performing attribution analysis according to the anomaly identification result and the graph model.
A graph-based attribution analysis model is used to calculate the influence weights of a plurality of influencing factors on the financial conversion rate and to determine which factors caused the abnormality of the conversion rate index. Referring to fig. 13, step S2 specifically includes the following steps:
Sa: constructing a graph model.
It can be understood that the candidate factors in this scenario do not merely influence the result in one direction; different candidate factors can also influence one another, so a graph model represents the relationships between candidate factors better.
Specifically, a graph model comprising nodes and edges is constructed from the conversion rate and the candidate factors. The specific candidate factors are software performance parameters (client crash rate, client page rendering time, client service response time, etc.), business parameters (whether the financial function has undergone back-end restructuring, whether the financial function has undergone page redesign, whether the financial function has undergone flow restructuring, whether there are marketing activities for the financial function or financial products, the conversion rate of competing financial products such as funds, the number of newly added financial accounts, the numbers of financial product purchases and redemptions, the monthly active users of the mobile bank, etc.) and social information (deposit interest rate, loan interest rate, resident income level, deposit amount and the broad market index).
Sb: preprocessing data.
Data cleaning, conversion and standardization are carried out to ensure the accuracy and consistency of the data. It should be noted that the conversion rate and the various factors should all be time series data.
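As a small illustration of the cleaning and standardization step, the sketch below forward-fills missing observations and then applies z-score standardization. Both choices, and the function names, are assumptions; the text does not say which cleaning or scaling method is used.

```python
from statistics import mean, pstdev
from typing import List, Optional

def forward_fill(series: List[Optional[float]]) -> List[Optional[float]]:
    """Replace each missing value with the last observed one.
    Gaps before the first observation stay as None."""
    filled, last = [], None
    for x in series:
        if x is not None:
            last = x
        filled.append(last)
    return filled

def standardize(series: List[float]) -> List[float]:
    """Z-score standardization using the population standard deviation."""
    mu, sigma = mean(series), pstdev(series)
    if sigma == 0:
        return [0.0 for _ in series]  # a constant series carries no variation
    return [(x - mu) / sigma for x in series]
```

Standardizing each factor puts indexes with different units, such as page rendering time and interest rates, on a comparable scale before weights are assigned.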
Sc: assigning weights.
An initial weight, i.e., the weight of each edge, is specified based on expert experience.
Sd: traversing the graph model.
The graph model is traversed with a depth-first search algorithm, and the contribution degree of each node and each edge is calculated to identify the key factors that influence the result.
Se: outputting the result.
The result of the traversal calculation is visualized and the weights are output (relations among entities are constructed, and the connection attribute between entities is the influence weight), so that the attribution analysis result is displayed intuitively and is easy for users to understand and apply. In addition, the analysis timeliness can be configured for each index, automatic report generation is supported, the report generation period can be customized, and the recipients to whom reports are distributed can be configured.
First, the invention targets user experience quantification indexes and uses anomaly identification and attribution analysis to automate anomaly identification and analysis, so that a user experience analysis report can be generated automatically without expert involvement. Second, the time sequence data anomaly detection model used in the invention has improved generalization capability owing to the added self-supervised learning module. Finally, the invention uses graph-based attribution analysis, and can therefore perform intuitive and flexible in-depth analysis of time sequence data.
Based on the same inventive concept, the embodiment of the present application also provides an application updating device based on time series data, which can be used to implement the method described in the above embodiment, such as the following embodiment. Since the principle of the application updating device based on the time sequence data for solving the problem is similar to that of the application updating method based on the time sequence data, the implementation of the application updating device based on the time sequence data can be referred to the implementation of the application updating method based on the time sequence data, and the repetition is omitted. As used below, the term "unit" or "module" may be a combination of software and/or hardware that implements the intended function. While the system described in the following embodiments is preferably implemented in software, implementation in hardware, or a combination of software and hardware, is also possible and contemplated.
An embodiment of the present invention provides a specific implementation manner of a time series data based application updating apparatus capable of implementing a time series data based application updating method, referring to fig. 14, the time series data based application updating apparatus specifically includes the following:
a time series data receiving module 10 for receiving time series data of a user application;
An abnormal data detection module 20, configured to detect abnormal data of the time series data according to the time series data and a pre-generated time series data abnormal detection model; the time sequence data anomaly detection model comprises a plurality of decision tree models, wherein the latter decision tree model is used for correcting the prediction residual error of the former decision tree model, and the time sequence data anomaly detection model is generated by training together according to marked training data and unmarked training data;
the reason determining module 30 is configured to determine a reason for causing the abnormal data according to the abnormal data and a pre-generated graph model, and update an application according to the reason.
In some embodiments of the present invention, referring to fig. 15, an application updating apparatus based on time series data further includes:
A detection model generating module 40, configured to generate the time series data anomaly detection model, referring to fig. 16, the detection model generating module 40 includes:
An iterative operation unit 40a, configured to perform the following iterative operation until the difference between a target value and a true value corresponding to a current training round of the time series data anomaly detection model is smaller than a preset threshold value:
a target value generating unit 40b, configured to generate a target value of a current training round according to a residual error of a previous training round of the time-series data anomaly detection model;
a decision tree model determining unit 40c, configured to determine, according to the target value, a decision tree model to which the current training round should be added;
The detection model generating unit 40d is configured to generate a time sequence data anomaly detection model corresponding to the current training round according to the decision tree model to which the current training round should be added and the decision tree model corresponding to the training round before the current training round.
In some embodiments of the present invention, referring to fig. 17, the detection model generating module 40 further includes:
A node loss function calculation unit 40e, configured to calculate a loss function of each node in the decision tree model corresponding to a training round before the current training round;
An adding node determining unit 40f, configured to determine an adding node of the decision tree model to which the current training round should be added according to the loss function.
In some embodiments of the present invention, referring to fig. 18, the decision tree model determining unit 40c includes:
a decision tree model determination subunit 40c1, configured to determine the decision tree model to be added according to the target value;
A decision tree model adding unit 40c2 for adding the decision tree model to be added to the adding node.
In some embodiments of the present invention, referring to fig. 19, an application updating apparatus based on time series data further includes:
A graph model generation module 50 for generating the graph model, see fig. 20, the graph model generation module 50 comprising:
an initial model generating unit 50a for generating an initial graph model of the graph model from a plurality of business data that cause the abnormal data; wherein, the node of the initial graph model at least corresponds to one business data, and the edge of the initial graph model at least corresponds to one business data;
A direction weight determining unit 50b for determining the direction of the edge, the weight of the node, and the weight of the edge according to the association relationship between the plurality of service data;
a graph model generating unit 50c for generating the graph model according to the initial graph model, the direction of the edge, the weight of the node, and the weight of the edge.
In some embodiments of the present invention, referring to fig. 21, the cause determination module 30 includes:
A contribution calculation unit 30a for traversing the graph model according to the anomaly data to calculate a contribution of each node and a contribution of each edge;
a cause determining unit 30b, configured to determine a cause of the abnormal data according to the contribution degree of each node and the contribution degree of each edge.
In some embodiments of the present invention, referring to fig. 22, the abnormal data detection module 20 includes:
A learning rate calculation unit 20a, configured to calculate a learning rate, a depth of a decision tree, and a number corresponding to the time series data according to the time series data anomaly detection model;
An abnormal data determining unit 20b for determining the abnormal data according to the learning rate, the depth of the decision tree and the number.
As can be seen from the above description, the embodiment of the present invention provides an application updating device based on time series data, including: the time sequence data receiving module is used for receiving time sequence data of application used by a user; the abnormal data detection module is used for detecting abnormal data of the time sequence data according to the time sequence data and a pre-generated time sequence data abnormal detection model; the time sequence data anomaly detection model comprises a plurality of decision tree models, wherein the latter decision tree model is used for correcting the prediction residual error of the former decision tree model, and the time sequence data anomaly detection model is generated by training together according to marked training data and unmarked training data; and the reason determining module is used for determining the reason for causing the abnormal data according to the abnormal data and the pre-generated graph model and updating the application according to the reason.
According to the application updating device based on the time sequence data, firstly, based on multidimensional information, the recognition and analysis of abnormal time sequence data are realized for the user experience quantization index, and the user experience analysis report can be automatically generated without the participation of expert manpower.
Secondly, the invention provides a new time sequence data anomaly detection model into which a self-supervised learning module is introduced, so that generalizable feature representations can be learned from general data and the generalization capability of the model is improved.
It should be noted that, the method and the device for updating the application based on the time sequence data provided by the embodiment of the invention can be used in the financial field and also can be used in any technical field except the financial field, and the application field of the method and the device for updating the application based on the time sequence data is not limited.
The embodiment of the present application further provides a specific implementation manner of an electronic device capable of implementing all the steps in the application updating method based on time series data in the foregoing embodiment, and referring to fig. 23, the electronic device specifically includes the following contents:
a processor 1201, a memory 1202, a communication interface (Communications Interface) 1203, and a bus 1204;
wherein the processor 1201, the memory 1202 and the communication interface 1203 perform communication with each other through the bus 1204; the communication interface 1203 is configured to implement information transmission between the server device and the client device;
The processor 1201 is configured to invoke a computer program in the memory 1202, and when the processor executes the computer program, the processor implements all the steps in the time-series data-based application updating method in the above embodiment, for example, when the processor executes the computer program, the processor implements the following steps:
step 100: receiving time sequence data of application used by a user;
step 200: detecting abnormal data of the time sequence data according to the time sequence data and a pre-generated time sequence data abnormal detection model; the time sequence data anomaly detection model comprises a plurality of decision tree models, wherein the latter decision tree model is used for correcting the prediction residual error of the former decision tree model, and the time sequence data anomaly detection model is generated by training together according to marked training data and unmarked training data;
step 300: and determining the reason for causing the abnormal data according to the abnormal data and the pre-generated graph model, and updating the application according to the reason.
The embodiment of the present application also provides a computer-readable storage medium capable of implementing all the steps in the time-series data-based application updating method in the above embodiment, on which a computer program is stored, which when executed by a processor implements all the steps in the time-series data-based application updating method in the above embodiment, for example, the processor implements the following steps when executing the computer program:
step 100: receiving time sequence data of application used by a user;
step 200: detecting abnormal data of the time sequence data according to the time sequence data and a pre-generated time sequence data abnormal detection model; the time sequence data anomaly detection model comprises a plurality of decision tree models, wherein the latter decision tree model is used for correcting the prediction residual error of the former decision tree model, and the time sequence data anomaly detection model is generated by training together according to marked training data and unmarked training data;
step 300: and determining the reason for causing the abnormal data according to the abnormal data and the pre-generated graph model, and updating the application according to the reason.
In this specification, each embodiment is described in a progressive manner; identical and similar parts of the embodiments may be referred to each other, and each embodiment focuses on its differences from the other embodiments. In particular, the hardware-plus-program embodiments are described relatively briefly because they are substantially similar to the method embodiments; for relevant details, see the corresponding description of the method embodiments.
The foregoing describes specific embodiments of the present disclosure. Other embodiments are within the scope of the following claims. In some cases, the actions or steps recited in the claims can be performed in a different order than in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing are also possible or may be advantageous.
Although the application provides method operational steps as an example or a flowchart, more or fewer operational steps may be included based on conventional or non-inventive labor. The order of steps recited in the embodiments is merely one way of performing the order of steps and does not represent a unique order of execution. When implemented in an actual device or client product, the instructions may be executed sequentially or in parallel (e.g., in a parallel processor or multi-threaded processing environment) as shown in the embodiments or figures.
For convenience of description, the above devices are described as being functionally divided into various modules, respectively. Of course, when implementing the embodiments of the present disclosure, the functions of each module may be implemented in the same or multiple pieces of software and/or hardware, or a module that implements the same function may be implemented by multiple sub-modules or a combination of sub-units, or the like. The above-described apparatus embodiments are merely illustrative, for example, the division of the units is merely a logical function division, and there may be additional divisions when actually implemented, for example, multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed with each other may be an indirect coupling or communication connection via some interfaces, devices or units, which may be in electrical, mechanical or other form.
Those skilled in the art will also appreciate that, in addition to implementing the controller in a pure computer readable program code, it is well possible to implement the same functionality by logically programming the method steps such that the controller is in the form of logic gates, switches, application specific integrated circuits, programmable logic controllers, embedded microcontrollers, etc. Such a controller can be regarded as a hardware component, and means for implementing various functions included therein can also be regarded as a structure within the hardware component. Or even means for achieving the various functions may be regarded as either software modules implementing the methods or structures within hardware components.
In one typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include volatile memory in a computer-readable medium, such as random access memory (RAM), and/or nonvolatile memory, such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of computer-readable media.
In this specification, each embodiment is described in a progressive manner; identical and similar parts of the embodiments may be referred to each other, and each embodiment focuses on its differences from the other embodiments. In particular, the system embodiments are described relatively briefly because they are substantially similar to the method embodiments; for relevant details, see the corresponding description of the method embodiments. In the description of the present specification, a description referring to terms "one embodiment," "some embodiments," "examples," "specific examples," or "some examples," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the embodiments of the present specification. In this specification, schematic representations of the above terms are not necessarily directed to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. Furthermore, those skilled in the art may combine the different embodiments or examples described in this specification, and the features of the different embodiments or examples, provided they do not contradict each other.
The foregoing is merely an example of an embodiment of the present disclosure and is not intended to limit the embodiment of the present disclosure. Various modifications and variations of the illustrative embodiments will be apparent to those skilled in the art. Any modification, equivalent replacement, improvement, or the like, which is within the spirit and principles of the embodiments of the present specification, should be included in the scope of the claims of the embodiments of the present specification.

Claims (10)

1. An application updating method based on time sequence data, comprising:
receiving time sequence data of an application used by a user;
detecting abnormal data in the time sequence data according to the time sequence data and a pre-generated time sequence data anomaly detection model; wherein the time sequence data anomaly detection model comprises a plurality of decision tree models, each subsequent decision tree model is used for correcting the prediction residual of the preceding decision tree model, and the time sequence data anomaly detection model is generated by joint training on labeled training data and unlabeled training data;
and determining the reason for the abnormal data according to the abnormal data and a pre-generated graph model, and updating the application according to the reason.
2. The application updating method according to claim 1, wherein the step of generating the time sequence data anomaly detection model comprises:
performing the following iterative operation until the difference between the target value and the true value corresponding to the current training round of the time sequence data anomaly detection model is smaller than a preset threshold:
generating a target value of the current training round according to the residual of the previous training round of the time sequence data anomaly detection model;
determining, according to the target value, the decision tree model to be added in the current training round;
and generating the time sequence data anomaly detection model corresponding to the current training round according to the decision tree model added in the current training round and the decision tree models corresponding to the training rounds before the current training round.
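The iteration in claim 2 resembles residual-fitting gradient boosting. The following is a minimal sketch under stated assumptions, not the patented implementation: it uses depth-1 regression trees ("stumps") on a single feature, squared error, and a stopping rule based on the largest remaining residual. The names `fit_stump`, `predict`, `train`, `lr`, `threshold`, and `max_rounds` are illustrative and do not appear in the source.

```python
from typing import List, Tuple

# A depth-1 regression tree: (split threshold, value if x <= threshold, value otherwise).
Stump = Tuple[float, float, float]


def fit_stump(xs: List[float], targets: List[float]) -> Stump:
    """Fit one stump to the current targets by minimizing squared error."""
    best_err, best_stump = float("inf"), (0.0, 0.0, 0.0)
    for t in sorted(set(xs)):
        left = [r for x, r in zip(xs, targets) if x <= t]
        right = [r for x, r in zip(xs, targets) if x > t]
        lv = sum(left) / len(left)  # left is never empty because t is one of xs
        rv = sum(right) / len(right) if right else 0.0
        err = sum((r - lv) ** 2 for r in left) + sum((r - rv) ** 2 for r in right)
        if err < best_err:
            best_err, best_stump = err, (t, lv, rv)
    return best_stump


def predict(stumps: List[Stump], x: float, lr: float) -> float:
    """Sum the learning-rate-scaled outputs of all trees added so far."""
    return sum(lr * (lv if x <= t else rv) for t, lv, rv in stumps)


def train(xs: List[float], ys: List[float], lr: float = 0.5,
          threshold: float = 1e-3, max_rounds: int = 200) -> List[Stump]:
    """Add one tree per round; each tree is fitted to the previous residuals."""
    stumps: List[Stump] = []
    for _ in range(max_rounds):
        # Assumed reading of the claim: the round's target value is the residual
        # left by the ensemble built in the previous rounds.
        residuals = [y - predict(stumps, x, lr) for x, y in zip(xs, ys)]
        if max(abs(r) for r in residuals) < threshold:
            break  # ensemble output is within the preset threshold of the true values
        stumps.append(fit_stump(xs, residuals))
    return stumps
```

Calling `train([1, 2, 3, 4, 5, 6], [0, 0, 0, 1, 1, 1])` returns a list of stumps whose combined prediction separates the two groups; a production system would more likely rely on an established boosting library than on hand-written trees.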
3. The application updating method according to claim 2, further comprising, before determining, according to the target value, the decision tree model to be added in the current training round:
calculating a loss function of each node in the decision tree models corresponding to the training rounds before the current training round;
and determining, according to the loss function, the adding node for the decision tree model to be added in the current training round.
4. The application updating method according to claim 3, wherein determining, according to the target value, the decision tree model to be added in the current training round comprises:
determining the decision tree model to be added according to the target value;
and adding the decision tree model to be added to the adding node.
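Claims 3 and 4 do not state which loss function is used or how the adding node is chosen from it. As a hedged illustration only, the sketch below assumes a squared-error loss per leaf and attaches the new tree at the leaf with the largest loss, similar in spirit to leaf-wise tree growth. The function names and the leaf labels are hypothetical.

```python
from typing import Dict, List


def squared_error_loss(residuals: List[float]) -> float:
    """Squared error of the residuals in one node around their mean (assumed loss)."""
    mean = sum(residuals) / len(residuals)
    return sum((r - mean) ** 2 for r in residuals)


def choose_adding_node(leaf_residuals: Dict[str, List[float]]) -> str:
    """Return the node whose samples are currently fitted worst (assumed rule)."""
    losses = {name: squared_error_loss(rs) for name, rs in leaf_residuals.items()}
    return max(losses, key=losses.get)
```

With this rule, a leaf whose samples still show widely scattered residuals is selected ahead of leaves that are already well fitted, so the tree added in the current round is spent where the remaining error is concentrated.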
5. The application updating method according to claim 1, wherein the step of generating the graph model includes:
generating an initial graph model of the graph model according to a plurality of items of business data causing the abnormal data; wherein each node of the initial graph model corresponds to at least one item of business data, and each edge of the initial graph model corresponds to at least one item of business data;
determining the direction of each edge, the weight of each node, and the weight of each edge according to the association relations among the plurality of items of business data;
and generating the graph model according to the initial graph model, the direction of the edge, the weight of the node and the weight of the edge.
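The graph construction in claim 5 can be pictured as a directed, weighted graph. The sketch below is one possible reading under stated assumptions: association relations are supplied as hypothetical `(cause, effect, strength)` triples, the edge direction runs from cause to effect, the edge weight is the strength, and a node's weight is the total strength of its outgoing edges. None of these choices is specified in the source.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class GraphModel:
    # Node name -> node weight.
    node_weight: Dict[str, float] = field(default_factory=dict)
    # Directed edge (cause, effect) -> edge weight.
    edges: Dict[Tuple[str, str], float] = field(default_factory=dict)


def build_graph(items: List[str],
                associations: List[Tuple[str, str, float]]) -> GraphModel:
    """Create one node per business-data item, then add directed weighted edges."""
    graph = GraphModel(node_weight={name: 0.0 for name in items})
    for cause, effect, strength in associations:
        graph.edges[(cause, effect)] = strength  # direction: cause -> effect
        graph.node_weight[cause] += strength     # assumed node-weight rule
    return graph
```

For example, `build_graph(["network_latency", "page_load_time"], [("network_latency", "page_load_time", 0.8)])` yields a two-node graph with a single edge of weight 0.8 pointing at `page_load_time`; the metric names are invented for illustration.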
6. The application updating method according to any one of claims 1 to 5, characterized in that determining a cause of the abnormal data from the abnormal data and a pre-generated graph model includes:
traversing the graph model according to the abnormal data to calculate the contribution degree of each node and the contribution degree of each edge;
and determining the reason for the abnormal data according to the contribution degree of each node and the contribution degree of each edge.
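The claim does not define how contribution degrees are computed. One simple possibility, shown here purely as a sketch, is to walk the graph backwards from the anomalous metric and multiply the anomaly score by the edge weights along each path. The sketch assumes the graph is acyclic (otherwise the traversal would not terminate), and the function name, metric names, and scoring rule are all hypothetical.

```python
from collections import defaultdict, deque
from typing import Dict, List, Tuple

Edge = Tuple[str, str]  # (cause, effect)


def rank_causes(edges: Dict[Edge, float], anomalous: str,
                score: float = 1.0) -> Tuple[List[Tuple[str, float]], Dict[Edge, float]]:
    """Propagate an anomaly score from effect to cause along weighted edges."""
    incoming = defaultdict(list)
    for (cause, effect), weight in edges.items():
        incoming[effect].append((cause, weight))

    node_contrib: Dict[str, float] = defaultdict(float)
    edge_contrib: Dict[Edge, float] = defaultdict(float)
    queue = deque([(anomalous, score)])
    while queue:
        node, carried = queue.popleft()
        for cause, weight in incoming[node]:
            share = carried * weight          # contribution carried over this edge
            edge_contrib[(cause, node)] += share
            node_contrib[cause] += share      # sum over all paths reaching the cause
            queue.append((cause, share))

    # Highest contribution first; the first entry is the most likely reason.
    ranked = sorted(node_contrib.items(), key=lambda kv: kv[1], reverse=True)
    return ranked, dict(edge_contrib)
```

The ranked list can then feed the update decision: the top-ranked node names the business data most strongly associated with the anomaly, and the per-edge values show which relation carried that contribution.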
7. The application updating method according to claim 1, wherein detecting abnormal data in the time sequence data according to the time sequence data and a pre-generated time sequence data anomaly detection model comprises:
calculating, according to the time sequence data anomaly detection model, the learning rate, the depth of the decision trees, and the number of decision trees corresponding to the time sequence data;
and determining the abnormal data according to the learning rate, the depth of the decision trees, and the number of decision trees.
8. An application updating apparatus based on time sequence data, comprising:
a time sequence data receiving module, configured to receive time sequence data of an application used by a user;
an abnormal data detection module, configured to detect abnormal data in the time sequence data according to the time sequence data and a pre-generated time sequence data anomaly detection model; wherein the time sequence data anomaly detection model comprises a plurality of decision tree models, each subsequent decision tree model is used for correcting the prediction residual of the preceding decision tree model, and the time sequence data anomaly detection model is generated by joint training on labeled training data and unlabeled training data;
and a reason determining module, configured to determine the reason for the abnormal data according to the abnormal data and a pre-generated graph model, and to update the application according to the reason.
9. An electronic device comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, characterized in that the processor, when executing the computer program, implements the steps of the application updating method based on time sequence data according to any one of claims 1 to 7.
10. A computer-readable storage medium having a computer program stored thereon, characterized in that the computer program, when executed by a processor, implements the steps of the application updating method based on time sequence data according to any one of claims 1 to 7.
CN202410484148.0A 2024-04-22 Application updating method and device based on time sequence data Pending CN118295682A (en)

Publications (1)

Publication Number Publication Date
CN118295682A (en) 2024-07-05


Legal Events

Date Code Title Description
PB01 Publication