CN111913887B - Software behavior prediction method based on beta distribution and Bayesian estimation - Google Patents

Info

Publication number
CN111913887B
Authority
CN
China
Prior art keywords
probability
software
behavior
model
beta
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010836514.6A
Other languages
Chinese (zh)
Other versions
CN111913887A (en)
Inventor
唐剑
赵亮
唐艺
浦戈光
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
National Defense Technology Innovation Institute PLA Academy of Military Science
Original Assignee
National Defense Technology Innovation Institute PLA Academy of Military Science
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by National Defense Technology Innovation Institute PLA Academy of Military Science
Priority to CN202010836514.6A
Publication of CN111913887A
Application granted
Publication of CN111913887B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/36 Preventing errors by testing or debugging software
    • G06F11/3604 Software analysis for verifying properties of programs
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N7/00 Computing arrangements based on specific mathematical models
    • G06N7/01 Probabilistic graphical models, e.g. probabilistic networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Computing Systems (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Pure & Applied Mathematics (AREA)
  • Mathematical Analysis (AREA)
  • Computational Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Algebra (AREA)
  • Probability & Statistics with Applications (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Optimization (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Computer Hardware Design (AREA)
  • Quality & Reliability (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Complex Calculations (AREA)

Abstract

The invention discloses a software behavior prediction method based on beta distribution and Bayesian estimation. Given behavior training data, the method lets each observation substantially change the model rather than converging the model to a fixed value, predicts the next binary behavior, and achieves good model training and behavior prediction results. The invention builds on model prediction techniques using probability distributions, Bayesian estimation, and hidden Markov models, and performs model training and prediction when the data volume is small and the model changes substantially. The tool uses a hidden Markov model to model and describe this situation, then selects the beta probability distribution and, on the basis of Bayesian theory, updates the model with a single data point. By setting prerequisite prior data, the model can change substantially according to the actual situation and that prior data, while its internal regularity still follows the beta distribution. Given binary behavior training data, the invention trains a model that changes with each single data point and predicts the next behavior, which broadens the range of cases to which model training software applies.

Description

Software behavior prediction method based on beta distribution and Bayesian estimation
Technical Field
The invention relates to the technical field of artificial intelligence, in particular to a software behavior prediction method based on beta distribution and Bayesian estimation.
Background
Currently, most model training software and algorithms, such as neural network algorithms, clustering algorithms, and the expectation-maximization algorithm, rely on a large amount of data and reach convergence through repeated iteration, fitting, and learning, yielding a model with a fixed probability. Mainstream model training software and algorithms therefore cannot be used when the model is changing and does not tend to converge. This situation has two characteristics. First, little data is available for any period in which the model is stable and its probability unchanged; in the extreme case there may be only a single data point. Second, although the model follows an inherent regularity as it changes, it can vary widely after a single data update. Therefore, without a large amount of data or a model with explicit convergence properties, existing model training methods cannot produce a converged, stable model.
Bayesian estimation typically involves four steps: treat the parameter to be estimated as a random variable, assume it follows a certain prior probability distribution, observe samples, and convert the prior probability density into the posterior probability density through Bayes' rule. There are two basic approaches to probability density estimation. Parametric estimation assumes, from general knowledge of the problem, that the random variable follows a certain distribution and estimates the parameters of the distribution function from training data; maximum likelihood (ML) estimation and Bayesian estimation are examples. Nonparametric estimation assumes no model and estimates the probability density from the training data alone. Bayesian estimation is used in many fields; in positioning applications, for example, the goal is to obtain a good estimate of the target being located or tracked from a series of measurements.
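To make the contrast between ML and Bayesian estimation concrete when very little data is available, the following minimal sketch compares the two for a binary event. The function names and the uniform Beta(1, 1) prior are assumptions made for illustration and do not come from the patent.

```python
from fractions import Fraction


def ml_estimate(successes: int, trials: int) -> Fraction:
    """Maximum-likelihood estimate of a Bernoulli parameter: k / n."""
    return Fraction(successes, trials)


def bayes_estimate(successes: int, trials: int, a: int = 1, b: int = 1) -> Fraction:
    """Posterior mean under an assumed Beta(a, b) prior: (a + k) / (a + b + n)."""
    return Fraction(a + successes, a + b + trials)


# One observation in which the event occurred.
print(ml_estimate(1, 1))     # 1
print(bayes_estimate(1, 1))  # 2/3
```

After a single observation the ML estimate is already 1, whereas the Bayesian estimate moves only partway from the prior mean of 1/2 toward the observation.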
This project uses a model prediction technique based on probability distributions, Bayesian estimation, and hidden Markov models to perform model training and prediction when the data volume is small and the model changes substantially. A hidden Markov model is used to model and describe the situation, and the beta probability distribution is then selected. On the basis of Bayesian theory, the model is updated with a single data point; by setting prerequisite prior data, the model can change substantially according to the actual situation and that prior data, while its regularity still follows the beta distribution.
Disclosure of Invention
The invention provides a software behavior prediction method based on beta distribution and Bayesian estimation and implements a corresponding behavior prediction tool. The tool takes the behavior trajectory data of an experimental object and probability parameters set by the experimenter as input. It first preprocesses the behavior parameters of the experimental object to obtain its behavior trajectory, then uses that trajectory to train a beta distribution step by step. In each training step, which handles a single binary behavior of the experimental object, the tool first uses past behaviors as the prior and integrates to obtain the current probability, then applies Bayesian theory with the experimenter-set probability parameters as a prior to change how the probability evolves, and finally obtains the predicted probability of the next behavior.
The invention provides a software behavior prediction method based on beta distribution and Bayesian estimation, which comprises the following steps:
S1, preprocessing software behavior data to obtain a software behavior trajectory;
S2, training the beta distribution step by step using the software behavior trajectory, wherein the software behavior trajectory can be decomposed into consecutive single binary behaviors, and a beta distribution training algorithm is constructed for a single binary behavior of the software, namely:
First, a beta probability distribution is established. The beta distribution is a conjugate prior, so the posterior distribution after a single data update has the same form as the prior and differs only in its parameters. Let A and B be two mutually exclusive events: if A occurs, B does not; if B occurs, A does not; and at least one of A and B occurs. The quantity to be predicted is the probability that A occurs. Let $s_k$ indicate whether A occurs at the $k$-th time ($s_k = 1$ if A occurs), and let $r_k$ be the probability that A occurs at the $k$-th time, where $r_k$ follows the beta probability distribution. Then:
$$p(s_k = 1 \mid S_{k-1}) = \int p(s_k = 1 \mid r_k)\, p(r_k \mid S_{k-1})\, dr_k$$
$$p(s_k = 1 \mid S_{k-1}) = \int r_k\, p(r_k \mid S_{k-1})\, dr_k$$
Second, through training on the software behavior trajectory, let $\alpha$ be the probability that the probability of A occurring remains unchanged, and let $p_0$ be the probability distribution of $r_k$ at initialization. The beta probability distribution for predicting a single software behavior is then:
$$p(r_k \mid S_{k-1}) = \alpha\, p(r_{k-1} \mid S_{k-1}) + (1 - \alpha)\, p_0(r_k)$$
S3, constructing a prior-value optimization model algorithm that takes the single same behavior of the software in the past as a prior value, namely:
Prior value information is added to the established beta probability distribution model. Let the probability that the prior value holds be p(experimenter), and let the prediction probability before the prior value information is added be p(original). According to Bayesian theory, the following formula holds:
[Formula image BDA0002639873160000031: not reproduced in this text]
where p(experimenter | original) is the probability that the prior information holds given the prediction probability, that is, the probability that the prior value p(original) holds in the corresponding beta distribution;
S4, optionally optimizing the predicted software behavior value using a memory decay optimization algorithm, wherein the memory decay optimization proceeds as follows:
After the prior value information is added, an optional memory decay optimization is applied. The memory decay algorithm receives a behavior sequence actionList and, after memory decay processing, returns a list of beta distributions betaMemoDecayList. The list has the same length as actionList, and each entry is the beta distribution in the probability prediction function corresponding to one behavior in actionList.
The one-time memory decay algorithm receives a behavior sequence actionList and returns a single beta distribution that has undergone memory decay. In essence, when training on each behavior in actionList, later behaviors receive higher weights and so influence the model more, while earlier behaviors receive lower weights, reflecting the reduced influence of long-past history on the current model. The weighting scheme is selectable, with two options, discrete and linear: the discrete option strengthens the influence of recent behaviors on the model, and the linear option makes the influence of behaviors increase linearly from the most distant to the most recent.
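Step S2 above can be sketched numerically as follows. This is an illustrative grid approximation, not the patent's implementation: the grid resolution, the uniform Beta(1, 1) choice for the initial distribution p_0, the value of alpha, the order of operations (observe a behavior, mix with p_0, then predict), and all function names are assumptions.

```python
import math

N = 1000
GRID = [(i + 0.5) / N for i in range(N)]  # midpoints of N equal cells on (0, 1)


def beta_pdf(r, a, b):
    """Density of Beta(a, b) at r, computed in log space for stability."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm + (a - 1) * math.log(r) + (b - 1) * math.log(1 - r))


def normalize(density):
    """Rescale so the midpoint-rule integral over (0, 1) equals 1."""
    total = sum(density) / N
    return [d / total for d in density]


def predict(density):
    """p(s_k = 1 | S_{k-1}) = integral of r_k * p(r_k | S_{k-1}) dr_k."""
    return sum(r * d for r, d in zip(GRID, density)) / N


def observe(density, s):
    """Bayes update on one binary behavior s (1: A occurred, 0: B occurred)."""
    return normalize([(r if s == 1 else 1 - r) * d for r, d in zip(GRID, density)])


def mix_with_initial(density, p0, alpha):
    """p(r_k | S_{k-1}) = alpha * p(r_{k-1} | S_{k-1}) + (1 - alpha) * p0(r_k)."""
    return [alpha * d + (1 - alpha) * q for d, q in zip(density, p0)]


def train(actions, alpha=0.9, a0=1.0, b0=1.0):
    """Return the prediction made before each behavior, and the next prediction."""
    p0 = normalize([beta_pdf(r, a0, b0) for r in GRID])
    density = p0
    predictions = []
    for s in actions:
        predictions.append(predict(density))
        density = mix_with_initial(observe(density, s), p0, alpha)
    return predictions, predict(density)


history, next_prob = train([1, 1, 0, 1])
print([round(p, 3) for p in history], round(next_prob, 3))
```

Because every step mixes a fraction (1 - alpha) of the initial distribution back in, the density never collapses to a point, so a single new behavior can still shift the prediction noticeably. The density is kept on a grid here because a mixture of two beta densities is in general not itself a single beta density.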
The method provided by the invention has the following advantages:
The software behavior prediction method based on beta distribution and Bayesian estimation updates the model from a single binary behavior, rather than requiring a large amount of training data to obtain a stable model with a fixed probability.
In the software behavior prediction method based on beta distribution and Bayesian estimation, prior information can be added so that the predicted probability better fits the actual behavior prediction situation. That is, the model is not confined to small, stable changes; it can change substantially based on its historical data and the prior information, allowing more accurate judgments.
In the software behavior prediction method based on beta distribution and Bayesian estimation provided by the invention, the user can choose to add a memory decay model and further choose the form of memory decay: discrete, linear, or exponential.
Drawings
FIG. 1 is a flow chart of a method for behavior prediction;
FIG. 2 is a diagram of a beta distribution based behavior prediction architecture;
FIG. 3 is a functional block diagram of a behavior prediction tool.
Detailed Description
The invention is described in detail below with reference to the figures and examples.
The embodiment of the present invention provides a behavior prediction method based on model training with beta distribution and Bayesian estimation. So that those skilled in the art can understand the invention more clearly, it is described in detail below with reference to specific implementations and the accompanying drawings. All other results obtained on the basis of the invention without creative work fall within the scope of protection of the invention.
As shown in figs. 2 and 3, the beta-distribution-based behavior prediction system takes the behavior trajectory data of an experimental object and probability parameters set by the experimenter as input. It first preprocesses the behavior parameters of the experimental object to obtain its behavior trajectory, then uses that trajectory to train a beta distribution step by step. In each training step, which handles a single binary behavior of the experimental object, past behaviors are first used as the prior and the current probability is obtained by integration; the experimenter-set probability parameters are then incorporated as a prior through Bayesian theory, changing how the probability evolves; finally, the predicted probability of the next behavior is obtained.
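The Bayes-rule formula used in the prior-adjustment step appears only as an image and is not reproduced in this text. Assuming it is the standard form of Bayes' rule written in the quantities the description names, p(experimenter), p(original), and p(experimenter | original), it would read as below. The left-hand side, the prediction adjusted by the experimenter's prior, is an inference and is not confirmed by the source.

```latex
p(\text{original} \mid \text{experimenter})
  = \frac{p(\text{experimenter} \mid \text{original})\; p(\text{original})}
         {p(\text{experimenter})}
```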
As shown in fig. 1, the behavior prediction method based on model training with beta distribution and Bayesian estimation includes the following steps:
S1, preprocessing software behavior data to obtain a software behavior trajectory;
S2, training the beta distribution step by step using the software behavior trajectory, wherein the software behavior trajectory is decomposed into consecutive single binary behaviors, and a beta distribution training algorithm is constructed for a single binary behavior of the software, namely:
First, a beta probability distribution is established. The beta distribution is a conjugate prior, so the posterior distribution after a single data update has the same form as the prior and differs only in its parameters. Let A and B be two mutually exclusive events: if A occurs, B does not; if B occurs, A does not; and at least one of A and B occurs. The quantity to be predicted is the probability that A occurs. Let $s_k$ indicate whether A occurs at the $k$-th time ($s_k = 1$ if A occurs), and let $r_k$ be the probability that A occurs at the $k$-th time, where $r_k$ follows the beta probability distribution. Then:
$$p(s_k = 1 \mid S_{k-1}) = \int p(s_k = 1 \mid r_k)\, p(r_k \mid S_{k-1})\, dr_k$$
$$p(s_k = 1 \mid S_{k-1}) = \int r_k\, p(r_k \mid S_{k-1})\, dr_k$$
Second, through training on the software behavior trajectory, let $\alpha$ be the probability that the probability of A occurring remains unchanged, and let $p_0$ be the probability distribution of $r_k$ at initialization. The beta probability distribution for predicting a single software behavior is then:
$$p(r_k \mid S_{k-1}) = \alpha\, p(r_{k-1} \mid S_{k-1}) + (1 - \alpha)\, p_0(r_k)$$
S3, constructing a prior-value optimization model algorithm that takes the single same behavior of the software in the past as a prior value, namely:
Prior value information is added to the established beta probability distribution model. Let the probability that the prior value holds be p(experimenter), and let the prediction probability before the prior value information is added be p(original). According to Bayesian theory, the following formula holds:
[Formula image BDA0002639873160000051: not reproduced in this text]
where p(experimenter | original) is the probability that the prior information holds given the prediction probability, that is, the probability that the prior value p(original) holds in the corresponding beta distribution;
S4, optionally optimizing the predicted software behavior value using a memory decay optimization algorithm, wherein the memory decay optimization proceeds as follows:
After the prior value information is added, an optional memory decay optimization is applied. The memory decay algorithm (Memory Decay Algorithm) receives a behavior sequence actionList and, after memory decay processing, returns a list of beta distributions betaMemoDecayList. The list has the same length as actionList, and each entry is the beta distribution in the probability prediction function corresponding to one behavior in actionList.
The memory decay algorithm is as follows:
[Algorithm images BDA0002639873160000052 and BDA0002639873160000061: not reproduced in this text]
The one-time memory decay algorithm receives a behavior sequence actionList and returns a single beta distribution that has undergone memory decay. In essence, when training on each behavior in actionList, later behaviors receive higher weights and so influence the model more, while earlier behaviors receive lower weights, reflecting the reduced influence of long-past history on the current model. The weighting scheme is selectable; as shown in fig. 1, there are two options, discrete and linear: the discrete option strengthens the influence of recent behaviors on the model, and the linear option makes the influence of behaviors increase linearly from the most distant to the most recent.
The discrete memory decay model algorithm is as follows:
[Algorithm images BDA0002639873160000062 and BDA0002639873160000071: not reproduced in this text]
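The algorithm listings for memory decay are images and are not reproduced in this text. As a rough illustration of the weighting idea only, the sketch below treats each behavior as a weighted pseudo-count added to a Beta(a, b) distribution. The weighted-count update, the Beta(1, 1) starting point, the specific discrete weights (the hypothetical `recent` and `boost` parameters), the convention that list entry k uses the behaviors up to and including k, and all function names are assumptions, not the patent's algorithm.

```python
from typing import List, Sequence, Tuple

Beta = Tuple[float, float]  # (a, b) parameters of a beta distribution


def linear_weights(n: int) -> List[float]:
    """Weights rising linearly from 1/n (oldest) to 1 (most recent)."""
    return [(i + 1) / n for i in range(n)]


def discrete_weights(n: int, recent: int = 3, boost: float = 2.0) -> List[float]:
    """Hypothetical two-level scheme: the last `recent` behaviors get `boost`."""
    return [boost if i >= n - recent else 1.0 for i in range(n)]


def one_time_memory_decay(action_list: Sequence[int], mode: str = "linear",
                          prior: Beta = (1.0, 1.0)) -> Beta:
    """Return one decayed Beta(a, b) fitted to the whole behavior sequence."""
    n = len(action_list)
    if mode == "linear":
        weights = linear_weights(n)
    elif mode == "discrete":
        weights = discrete_weights(n)
    else:
        raise ValueError(f"unknown mode: {mode!r}")
    a, b = prior
    for s, w in zip(action_list, weights):
        if s == 1:
            a += w  # behavior A occurred: weighted count toward a
        else:
            b += w  # behavior B occurred: weighted count toward b
    return (a, b)


def memory_decay(action_list: Sequence[int], mode: str = "linear",
                 prior: Beta = (1.0, 1.0)) -> List[Beta]:
    """Return one decayed beta distribution per behavior (same length as input)."""
    return [one_time_memory_decay(action_list[:k + 1], mode, prior)
            for k in range(len(action_list))]


a, b = one_time_memory_decay([1, 0, 1, 1], mode="linear")
print(a, b, a / (a + b))  # parameters and mean of the decayed distribution
```

Under these assumptions, the mean a / (a + b) of each returned distribution plays the role of the predicted probability of behavior A, with recent behaviors counting for more than older ones.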
Finally, it should be noted that the above description only illustrates embodiments of the present invention and is not intended to limit it. Although the invention has been described in detail with reference to the foregoing embodiments, those skilled in the art can still modify the technical solutions of those embodiments or make equivalent substitutions for some of their technical features. Any modification, replacement, or improvement made within the spirit and scope of the present invention falls within its scope of protection.

Claims (3)

1. A software behavior prediction method based on beta distribution and Bayesian estimation is characterized in that:
S1, preprocessing software behavior data to obtain a software behavior trajectory;
S2, training the beta distribution step by step using the software behavior trajectory, wherein the software behavior trajectory can be decomposed into consecutive single binary behaviors, and a beta distribution training algorithm is constructed for a single binary behavior of the software, namely:
first, establishing a beta probability distribution, and letting A and B be two mutually exclusive events, that is, the occurrence of A means B does not occur, the occurrence of B means A does not occur, and at least one of A and B occurs; letting $s_k$ indicate whether A occurs at the $k$-th time ($s_k = 1$ if A occurs) and $r_k$ be the probability that A occurs at the $k$-th time, with $r_k$ following the beta probability distribution, then:
$$p(s_k = 1 \mid S_{k-1}) = \int p(s_k = 1 \mid r_k)\, p(r_k \mid S_{k-1})\, dr_k$$
$$p(s_k = 1 \mid S_{k-1}) = \int r_k\, p(r_k \mid S_{k-1})\, dr_k$$
second, through training on the software behavior trajectory, letting $\alpha$ be the probability that the probability of A occurring remains unchanged and $p_0$ be the probability distribution of $r_k$ at initialization, the beta probability distribution for predicting a single software behavior is obtained, namely:
$$p(r_k \mid S_{k-1}) = \alpha\, p(r_{k-1} \mid S_{k-1}) + (1 - \alpha)\, p_0(r_k)$$
S3, constructing a prior-value optimization model algorithm that takes the single same behavior of the software in the past as a prior value, namely:
adding prior value information to the established beta probability distribution model, where the probability that the prior value holds is p(experimenter) and the prediction probability before the prior value information is added is p(original); according to Bayesian theory, the following formula holds:
[Formula image FDA0002639873150000011: not reproduced in this text]
where p(experimenter | original) is the probability that the prior information holds given the prediction probability, that is, the probability that the prior value p(original) holds in the corresponding beta distribution;
the predicted value of a single binary behavior of the software is then obtained, and from it a software behavior prediction sequence;
S4, optionally optimizing the predicted software behavior value using a memory decay optimization algorithm, wherein the memory decay optimization proceeds as follows: after the prior value information is added, a historical sequence of single software behaviors is received and, after memory decay processing, a list of beta distributions is returned, wherein the list has the same length as the historical sequence of software behaviors, and each entry is the beta distribution in the probability prediction function corresponding to one software behavior in the historical sequence.
2. The software behavior prediction method based on beta distribution and Bayesian estimation according to claim 1, characterized in that: in S4, a weight is set for each behavior in the historical sequence of single software behaviors, wherein the further a behavior is from the current time, the lower its weight, and the closer it is to the current time, the higher its weight.
3. The software behavior prediction method based on beta distribution and Bayesian estimation according to claim 2, characterized in that: the weights set for the historical sequence of single software behaviors in S4 have two options, discrete and linear.
CN202010836514.6A 2020-08-19 2020-08-19 Software behavior prediction method based on beta distribution and Bayesian estimation Active CN111913887B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010836514.6A CN111913887B (en) 2020-08-19 2020-08-19 Software behavior prediction method based on beta distribution and Bayesian estimation

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010836514.6A CN111913887B (en) 2020-08-19 2020-08-19 Software behavior prediction method based on beta distribution and Bayesian estimation

Publications (2)

Publication Number    Publication Date
CN111913887A (en)    2020-11-10
CN111913887B (en)    2022-11-11

Family

ID=73278346

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010836514.6A Active CN111913887B (en) 2020-08-19 2020-08-19 Software behavior prediction method based on beta distribution and Bayesian estimation

Country Status (1)

Country Link
CN (1) CN111913887B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113590486A (en) * 2021-02-23 2021-11-02 中国人民解放军军事科学院国防科技创新研究院 Open source software code quality evaluation method based on measurement
CN113158234B (en) * 2021-03-29 2022-09-27 上海雾帜智能科技有限公司 Method, device, equipment and medium for quantifying occurrence frequency of security event

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107358311A (en) * 2017-06-07 2017-11-17 西安工业大学 A kind of Time Series Forecasting Methods
CN107679566A (en) * 2017-09-22 2018-02-09 西安电子科技大学 A kind of Bayesian network parameters learning method for merging expert's priori

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8589228B2 (en) * 2010-06-07 2013-11-19 Microsoft Corporation Click modeling for URL placements in query response pages

Also Published As

Publication number Publication date
CN111913887A (en) 2020-11-10

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant