CN114242169B - Antigen epitope prediction method for B cells - Google Patents

Antigen epitope prediction method for B cells

Info

Publication number
CN114242169B
CN114242169B (application CN202111537519.XA; publication of application CN114242169A)
Authority
CN
China
Prior art keywords
action
amino acid
state
value
sequence
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202111537519.XA
Other languages
Chinese (zh)
Other versions
CN114242169A (en)
Inventor
羊红光
周云飞
成彬
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Institute Of Applied Mathematics Hebei Academy Of Sciences
Original Assignee
Institute Of Applied Mathematics Hebei Academy Of Sciences
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Institute Of Applied Mathematics Hebei Academy Of Sciences filed Critical Institute Of Applied Mathematics Hebei Academy Of Sciences
Priority to CN202111537519.XA priority Critical patent/CN114242169B/en
Publication of CN114242169A publication Critical patent/CN114242169A/en
Application granted granted Critical
Publication of CN114242169B publication Critical patent/CN114242169B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G - PHYSICS
    • G16 - INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B - BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B30/00 - ICT specially adapted for sequence analysis involving nucleotides or amino acids
    • G - PHYSICS
    • G16 - INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B - BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B40/00 - ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding

Abstract

An epitope prediction method for B cells. The method first forms a pre-training set PT. In each episode of the Q_learning algorithm, the Q agent takes any 8 consecutive amino acid residues in the primary sequence of a protein as a state; as the first action it selects k residues from the 12 consecutive residues following the state and incorporates them into the state, and as the second action it selects one of n complementary classifiers. It searches in PT according to the continuous action search method, the searched amino acid sequence is given an instant reward by the tendentious reward rule, and the Q value is calculated and updated until the change of the value function is less than 1%, at which point training ends. An amino acid sequence is then searched for in the protein primary sequence using the trained strategy and classified by the selected classifier. According to the invention, the prediction capability for B cell epitopes is greatly enhanced through automatic iteration, and the accuracy of epitope classification is improved.

Description

Antigen epitope prediction method for B cells
Technical Field
The invention relates to an epitope prediction method for B cells which can accurately predict B cell epitopes, and belongs to the technical field of artificial-intelligence-based detection of microorganisms.
Background
Accurate determination of B cell antigen epitopes is an important basis for designing bioactive drugs and epitope vaccines, a key step in developing diagnostic kits, and a fundamental technology for immunodiagnosis and immunotherapy research. Machine-learning-based B cell epitope prediction is an important technical route for determining epitopes and, compared with other routes, greatly saves time, money and labor.
SEPPA is an epitope prediction software recommended by the Immune Epitope Database (IEDB), established by the National Institute of Allergy and Infectious Diseases, and was updated to version 3.0 in 2019. The researchers responsible for developing SEPPA 3.0 pointed out in their paper that conformational epitope prediction has progressed steadily but slowly over the last decade.
Existing epitope prediction adopts a supervised learning strategy: epitope samples and non-epitope samples are learned to obtain a classification predictor. Although new epitope prediction methods are continuously developed and the prediction accuracy keeps improving, problems such as low generality, low classification accuracy and slow updating of the prediction model remain. In particular, the conventional window method, in which an integer is preset before prediction as the number of amino acids in the predicted result, is highly artificial, making it difficult to predict an epitope of optimal length.
The revolutionary breakthrough of AlphaFold in the field of protein structure prediction and the victory of AlphaGo over the strongest human Go players offer great insight. Both breakthroughs share a common characteristic: an automatic learning mechanism is introduced so that the model continuously iterates on itself and gradually develops strong recognition capability.
However, existing methods do not learn automatically and cannot enhance their prediction capability through automatic iteration. It is therefore necessary to introduce an automatic mechanism into B cell antigen epitope prediction and to design a method capable of accurately determining B cell antigen epitopes.
Disclosure of Invention
The invention aims to overcome the defects of the prior art and provides an epitope prediction method for B cells in order to improve the accuracy of B cell epitope prediction.
The problems addressed by the invention are solved by the following technical solution:
An epitope prediction method for B cells: first, B cell epitope sequence data are retrieved from the IEDB database to form a set EPT, and the corresponding protein primary sequences are extracted from the UniProt database to form a pre-training set PT. Training is based on the Q_learning algorithm, with the single action of the algorithm changed into two actions. In each episode, the Q agent takes any 8 consecutive amino acid residues in the primary sequence of a protein as a state; as the first action it selects k residues from the 12 consecutive residues following the state and incorporates them into the state, and as the second action it selects one of n complementary classifiers. It searches the protein primary sequences in PT according to the continuous action search method, the searched amino acid sequence is given an instant reward by the tendentious reward rule, and the Q value is calculated and updated until the change of the value function is less than 1%, at which point training ends. An amino acid sequence is then searched for in the protein primary sequence using the strategy obtained by training and classified by the selected classifier, thereby realizing B cell epitope prediction.
The above epitope prediction method for B cells, comprising the steps of:
a. B cell antigen epitope sequence data are retrieved from the IEDB database to form a set EPT, the corresponding protein primary sequences are extracted from the UniProt database to form a pre-training set PT, and a set containing n ≥ 2 complementary classifiers is selected as the second action set;
b. taking any 8 consecutive amino acid residues in the primary sequence of the protein as a state, and selecting k residues from the 12 consecutive residues following each state to incorporate into the state as the first action; selecting one of the n complementary classifiers as the second action; initializing the Q values corresponding to all states and actions to 0, setting the learning rate α to any number between 0 and 1, setting the discount factor γ to any number between 0 and 1, setting the number of episodes, and initializing the state s_0 to any 8 amino acid residues of the pre-training set;
c. in each episode, the Q agent searches among the primary sequences of the proteins in the set PT according to the continuous action search method: at step t, the Q agent selects an action a_t^1 from the first action set and then an action a_t^2 from the second action set; after the two actions are executed, the reward R_t and the next observed state s_{t+1} are given according to the tendentious reward rule; the Q value is then updated, and the state and action tables are updated at the same time; the search training process ends when the change of the value function is less than 1%;
d. searching out an amino acid combination in the primary sequence of each protein using the strategy obtained by training and classifying it with the selected classifier; if the classifier indicates that the searched amino acid sequence is an epitope, it is judged to be a B cell epitope, otherwise it is not.
In the above method for predicting B cell epitopes, the specific search process of the continuous action search method is as follows:
any 8 amino acid residues of the primary sequence of each protein are taken as initial state s 0 The corresponding amino acid sequence is denoted as X 1 X 2 …X 8, wherein Xj Represents the jth amino acid, j=1, 2, …,8 to go from the initial state s 0 K residues in the following 12 continuous residues are selected to be combined into the state to be used as a first action, wherein k is more than or equal to 1 and less than or equal to 12, and one of n complementary classifiers is selected to be used as a second action option; according to the correspondingValue selection of a first action and a second action, wherein a 1 ,a 2 Respectively, all possible actions in the first action and all possible actions in the second action, then calculating rewards for the two actions by a tendentiousness rewards rule, and calculating a cost function according to the following formula:
wherein ,Vπ (s) is the cost function in state s, pi is the policy,is expected to be R t Is the benefit of the t step, V(s) t+1 ) Is the next state s t+1 A lower cost function;
the Q value is calculated according to the following formula:
wherein Qπ (s,a 1 ,a 2 ) Is a cost function of performing two consecutive actions in state s,is the next state s t+1 Execute two consecutive actions down->Is a cost function of (2);
and simultaneously updating the Q value according to the following steps:
then changing the state, repeating the steps, and updating the Q value.
In the above method for predicting B cell epitopes, the tendentious reward rule is as follows:
Features are extracted from the amino acid sequence searched out by the first action and used as the input of the classifier selected by the second action, and the classifier calculates the classification score SC_t of the amino acid sequence. In the set EPT, the occurrence probability of each amino acid and of each amino acid pair consisting of two consecutive amino acids is calculated. For any amino acid as_i, the occurrence probability P(as_i) is calculated according to the following formula:
P(as_i) = ( num(as_i) − minnum(as_1, as_2, …, as_20) ) / ( maxnum(as_1, as_2, …, as_20) − minnum(as_1, as_2, …, as_20) )
where as_i denotes any one of the 20 amino acids, num(as_i) denotes the number of times as_i occurs in the set EPT, maxnum(as_1, as_2, …, as_20) denotes the maximum of the occurrence counts of the 20 amino acids in the set EPT, and minnum(as_1, as_2, …, as_20) denotes the minimum of the occurrence counts of the 20 amino acids in the set EPT.
For any amino acid pair AA_i, the occurrence probability P(AA_i) is calculated according to the following formula:
P(AA_i) = ( num(AA_i) − minnum(AA_1, AA_2, …, AA_400) ) / ( maxnum(AA_1, AA_2, …, AA_400) − minnum(AA_1, AA_2, …, AA_400) )
where AA_i denotes one of the 400 amino acid pairs, num(AA_i) denotes the number of times AA_i occurs in the set EPT, maxnum(AA_1, AA_2, …, AA_400) denotes the maximum of the occurrence counts of the 400 amino acid pairs in the set EPT, and minnum(AA_1, AA_2, …, AA_400) denotes the minimum of the occurrence counts of the 400 amino acid pairs in the set EPT.
The instant reward obtained for the amino acid sequence sq_t generated at step t is calculated from the classification score SC_t together with these occurrence probabilities, where len(sq_t) denotes the number of amino acids contained in the sequence sq_t and len(sq_t) − 1 denotes the number of consecutive amino acid pairs.
Advantageous effects
The method combines the Q_learning algorithm, the continuous action search method and the tendentious reward rule, and introduces complementary classifiers; the prediction capability for B cell epitopes is greatly enhanced through automatic iteration, and the accuracy of epitope classification is improved.
Drawings
The invention is described in further detail below with reference to the accompanying drawings.
FIG. 1 is a schematic diagram of the Q_learning algorithm.
Detailed Description
The invention provides an epitope prediction method for B cells based on continuous action search with the Q-learning method, in which one action selects the sequence length and the other action selects a complementary classifier, so that the sequence length is chosen autonomously and the most suitable classifier can be selected for classification.
In the Q-Learning reinforcement learning algorithm, each state-action pair has a corresponding Q value. The learning process of the Q-Learning algorithm is therefore to iteratively calculate the Q values of the state-action pairs. Finally, the optimal action strategy obtained by the learner is to select, in state s, the action corresponding to the maximum Q value. The Q value Q(s, a) of action a in state s is defined as the cumulative return obtained by the learner executing action a in state s and thereafter acting according to a certain action strategy. The basic equation for the Q value update is:
Q(s_t, a_t) = Q(s_t, a_t) + α[ R_{s_t} + γ·max_a Q(s_{t+1}, a) − Q(s_t, a_t) ]
the above formula (wherein: a is optional action in the state; rs) t An immediate prize awarded for the environment in state s at time t; alpha is the learning rate; q(s) t ,a t ) Evaluation value of state-operation (s, a) at time t.
The Q-Learning algorithm pseudocode is shown in Table 1:
TABLE 1: Q-Learning algorithm pseudocode
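Table 1 itself is not reproduced above. As an aid only, the following is a minimal Python sketch of standard tabular Q-Learning that matches the update equation given earlier; the environment interface (env.reset, env.step, env.actions) and the hyper-parameter values are assumptions for illustration, not part of the patented method.

```python
import random
from collections import defaultdict

def q_learning(env, episodes=1000, alpha=0.1, gamma=0.9, epsilon=0.1):
    """Standard tabular Q-Learning corresponding to the update
    Q(s,a) <- Q(s,a) + alpha * [R + gamma * max_a' Q(s',a') - Q(s,a)]."""
    Q = defaultdict(float)                      # Q[(state, action)] initialised to 0
    for _ in range(episodes):
        s = env.reset()                         # initial state of the episode
        done = False
        while not done:
            # epsilon-greedy action selection
            if random.random() < epsilon:
                a = random.choice(env.actions(s))
            else:
                a = max(env.actions(s), key=lambda x: Q[(s, x)])
            s_next, reward, done = env.step(s, a)
            # Bellman-style update; the future value is zero at a terminal state
            best_next = 0.0 if done else max(Q[(s_next, a2)] for a2 in env.actions(s_next))
            Q[(s, a)] += alpha * (reward + gamma * best_next - Q[(s, a)])
            s = s_next
    return Q
```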
FIG. 1 is a schematic diagram of the Q_learning algorithm.
Interpretation of technical terms
(1) Protein primary sequence: a sequence composed of the 20 amino acids, for example ADFCEGHIKLST.
(2) B cell epitope: a part of the primary sequence of a protein; it may be composed of a partial sequence of that protein.
(3) Episode: the process in which an agent executes a policy within an environment from start to finish.
(4) Agent: a software or hardware mechanism that takes corresponding actions through interaction with its surrounding environment. In the Q-learning algorithm it is often referred to as the Q agent, and it is the subject that explores and learns from the environment.
(5) Action: one of the various moves the agent may take. Although actions are largely self-explanatory, the agent must be able to choose from a set of discrete possible actions.
(6) Environment: the external world that interacts with and responds to the agent. The environment takes the agent's current state and action as input and returns the agent's reward and next state as output. The environment is everything that exists outside the agent.
(7) State: the concrete, immediate situation in which the agent finds itself, including its specific location, the moment in time, and the instantaneous configuration relating the agent to other relevant things.
(8) Rewards (Reward): rewards are feedback by which we can measure the success or failure of various actions of an agent in a given state.
(9) Discount factor (discount): a multiplier. Future rewards found by the agent are multiplied by this factor to attenuate their cumulative impact on the agent's current action choice. This is central to reinforcement learning: the value of future rewards is gradually decreased so that more weight is given to nearer-term rewards. It is critical for a paradigm built on the principle of delayed reward.
(10) Policy: a function that takes a state observation as input and outputs an action. It is what the agent uses to determine its next action from the current state, mapping states to the actions that promise the highest reward.
(11) Value: the discounted long-term expected reward (as opposed to the short-term reward) of the current state under a specific policy. The short-term reward is the transient reward the agent earns by taking a specific action in a certain state; the value is the total reward the agent expects to accumulate from a certain state onward into the future.
(12) Q value (Q-value) or action value: it differs from the "value" in that it takes an additional parameter, the current action. It refers to the long-term reward obtained from the current state when a given action is taken under a specific policy.
(13) Bellman equation: a set of equations that decompose the value function into an instant reward plus discounted future values.
(14) Value iteration: an algorithm that computes the optimal state-value function by iteratively refining the value estimates. The algorithm initializes the value function to arbitrary random values and then repeatedly updates the Q values and the value function until they converge.
(15) Policy iteration: since the agent is only concerned with finding the optimal policy, the optimal policy sometimes converges before the value function does. Policy iteration therefore does not keep refining the value-function estimate alone; instead, it redefines the policy at each step and computes the value under the new policy until the policy converges.
(16) Q learning (Q-learning): as a model-free learning algorithm, it does not assume that the agent has complete knowledge of the state-transition and reward models; rather, it assumes the agent will find the correct actions through trial and error. The basic idea of Q learning is therefore: while the agent interacts with the environment, samples of the Q-value function are observed and used to approximate the Q function of the state-action pairs.
Machine-learning-based epitope prediction methods mostly use labeled sequences as positive samples and unlabeled sequences as negative samples, take physicochemical properties of amino acids, statistical properties, structural information and the like as feature inputs, train a classifier with a common classification learning algorithm, and then use the classifier to classify sequences. Such prediction methods usually set a window in advance, with the window size fixing the number of amino acids contained in the result. Because epitope sequences differ greatly, it is difficult to predict epitopes with a single trained classifier, so an ensemble approach is more advantageous.
The method first retrieves B cell epitope sequence data from the IEDB database to form the set EPT and extracts the corresponding protein primary sequences from the UniProt database to form the pre-training set PT. In each episode, the Q agent searches for a combination of residues in PT according to the continuous action search method, taking any 8 consecutive amino acid residues in PT as a state and, as the first action, selecting k residues from the 12 consecutive residues following the state to incorporate into it; as the second action it selects one of n complementary classifiers. The searched amino acid sequence is given an instant reward by the tendentious reward rule, and the Q value is calculated and updated until the change of the value function is less than 1%; the B cell epitope is then predicted using the trained strategy and classifier.
The specific searching steps are as follows:
first, searching B cell epitope sequence data from an IEDB database to form a set EPT, extracting corresponding protein primary sequences from a uniport database to form a pre-training set PT, and selecting a set containing n more than or equal to 2 complementary classifiers as a second action.
Second, the Q values corresponding to all states and actions are initialized to 0, the learning rate α is set to any number between 0 and 1, the discount factor γ is set to any number between 0 and 1, the number of episodes is set, and the state s_0 is initialized to any 8 amino acid residues from the pre-training set.
Third, in each episode the Q agent searches in the set PT according to the continuous action search method. At step t, the Q agent selects an action a_t^1 from the first action set and then an action a_t^2 from the second action set; the two actions are carried out consecutively, and after they are executed the reward R_t and the next observed state s_{t+1} are given according to the tendentious reward rule. The Q value is then updated, as are the state and action tables. When the change in the value function is less than 1%, the search training process ends.
Fourth, an amino acid combination is searched out in the primary sequence of each protein using the strategy obtained by training and classified by the selected classifier; if the classifier indicates that the searched amino acid sequence is an epitope, it is considered a B cell epitope, otherwise it is not.
In the third step, the continuous action search method proceeds as follows. Any 8 amino acid residues of the primary sequence of each protein are taken as the initial state s_0, and the corresponding sequence is denoted X_1X_2…X_8; in s_0 an action incorporates the following i residues into the state, i = 1, 2, …, 12. According to the corresponding Q(s_0, a^1, a^2) values and the state s_0, the first action a_0^1 and the second action a_0^2 are selected, where a^1 and a^2 refer to all possible actions in the first action set and in the second action set, respectively. Executing action a_0^1 corresponds, in the initial state s_0, to selecting k residues from the following 12 consecutive residues and incorporating them into the state, yielding a first amino acid sequence fragment of length k+8, X_1X_2…X_8…X_{k+8}. Executing action a_0^2 selects the m-th classifier in the second action set, with 1 ≤ m ≤ n. The reward for the two actions is calculated with the tendentious reward rule, the value function is calculated according to formula (1), the Q value is calculated according to formula (2), the next state s_1 is observed, and at the same time the Q value is updated according to formula (3). Then, from Q(s_1, a^1, a^2) and state s_1, the actions a_1^1 and a_1^2 are selected, the value function is calculated according to formula (1), the Q value according to formula (2), the next state s_2 is observed, and the Q value is updated according to formula (3). Training then continues in the same manner.
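As an illustration of a single step of the continuous action search described above, the sketch below selects the first action (the number k of appended residues) and the second action (the classifier index) greedily from a Q table and forms the fragment of length 8+k. The ε-greedy exploration, the Q-table layout and the helper names are assumptions for illustration only and are not taken from the patent.

```python
import random

def select_actions(Q, state, n_classifiers, epsilon=0.1):
    """Choose the first action (k = number of residues appended, 1..12) and the
    second action (index of the complementary classifier, 0..n-1) for the state,
    greedily with respect to Q(s, a1, a2), with epsilon-greedy exploration."""
    candidates = [(k, m) for k in range(1, 13) for m in range(n_classifiers)]
    if random.random() < epsilon:
        return random.choice(candidates)
    return max(candidates, key=lambda a: Q.get((state, a[0], a[1]), 0.0))

def apply_first_action(sequence, start, k):
    """Form the fragment X1...X(8+k): the 8-residue state plus the next k residues."""
    return sequence[start:start + 8 + k]

# usage sketch with hypothetical data
Q = {}
protein = "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQAPILSRVGDGTQDNLSGAEKAVQ"
state = protein[0:8]
k, m = select_actions(Q, state, n_classifiers=3)
fragment = apply_first_action(protein, 0, k)
```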
The value function under each group of consecutive actions in the learning network is:
V_π(s_t) = E_π[ R_t + γ·V(s_{t+1}) ]    (1)
where V_π(s_t) is the value function in state s_t, π is the policy, E_π denotes the expectation under π, R_t is the reward at step t, and V(s_{t+1}) is the value function in the next state s_{t+1}.
The Q value of each group of actions at step t is calculated by the formula:
Q_π(s_t, a_t^1, a_t^2) = E_π[ R_t + γ·max_{a^1,a^2} Q(s_{t+1}, a^1, a^2) ]    (2)
where Q_π(s_t, a_t^1, a_t^2) is the value function for performing the two consecutive actions a_t^1, a_t^2 in state s_t, and Q(s_{t+1}, a^1, a^2) is the value function for performing two consecutive actions in the next state s_{t+1}.
The Q value is updated according to:
Q(s_t, a_t^1, a_t^2) ← Q(s_t, a_t^1, a_t^2) + α[ R_t + γ·max_{a^1,a^2} Q(s_{t+1}, a^1, a^2) − Q(s_t, a_t^1, a_t^2) ]    (3)
In the third step, the tendentious reward rule is calculated from two parts: the classification score given to the sequence by the classifier, and the occurrence probability of the sequence. Features are extracted from the amino acid sequence searched out by the first action and used as input to the classifier selected by the second action, which calculates a classification score for the sequence; for example, at step t the score calculated by the classifier of the second action for the amino acid sequence obtained by the first action is denoted SC_t. In the set EPT, the numbers of occurrences of each amino acid and of each amino acid pair consisting of two consecutive amino acids are counted. For any amino acid as_i, the occurrence probability P(as_i) is calculated according to the formula
P(as_i) = ( num(as_i) − minnum(as_1, as_2, …, as_20) ) / ( maxnum(as_1, as_2, …, as_20) − minnum(as_1, as_2, …, as_20) )
where as_i denotes any one of the 20 amino acids, num(as_i) denotes the number of times as_i occurs in the set EPT, maxnum(as_1, as_2, …, as_20) denotes the maximum of the occurrence counts of the 20 amino acids in the set EPT, and minnum(as_1, as_2, …, as_20) denotes the minimum of the occurrence counts of the 20 amino acids in the set EPT. For any amino acid pair AA_i, the occurrence probability P(AA_i) is calculated according to the formula
P(AA_i) = ( num(AA_i) − minnum(AA_1, AA_2, …, AA_400) ) / ( maxnum(AA_1, AA_2, …, AA_400) − minnum(AA_1, AA_2, …, AA_400) )
where AA_i denotes one of the 400 amino acid pairs, num(AA_i) denotes the number of times AA_i occurs in the set EPT, maxnum(AA_1, AA_2, …, AA_400) denotes the maximum of the occurrence counts of the 400 amino acid pairs in the set EPT, and minnum(AA_1, AA_2, …, AA_400) denotes the minimum of the occurrence counts of the 400 amino acid pairs in the set EPT.
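The two occurrence probabilities above are min-max normalisations of occurrence counts over the set EPT. The following short sketch shows one way to compute them; the function names and the use of Python's Counter are illustrative assumptions.

```python
from collections import Counter

def occurrence_probabilities(ept_sequences):
    """Min-max normalised occurrence probabilities of single amino acids and of
    consecutive amino-acid pairs over all epitope sequences in the set EPT."""
    aa_counts, pair_counts = Counter(), Counter()
    for seq in ept_sequences:
        aa_counts.update(seq)                                          # single amino acids
        pair_counts.update(seq[i:i + 2] for i in range(len(seq) - 1))  # consecutive pairs

    def normalise(counts):
        lo, hi = min(counts.values()), max(counts.values())
        span = (hi - lo) or 1                                          # guard against all-equal counts
        return {key: (num - lo) / span for key, num in counts.items()}

    return normalise(aa_counts), normalise(pair_counts)

# usage sketch with toy data
p_aa, p_pair = occurrence_probabilities(["ADFCEGHIKLST", "KLSTADQW"])
```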
The instant reward obtained for the amino acid sequence sq_t generated at step t is calculated from the classification score SC_t together with these occurrence probabilities, where len(sq_t) denotes the number of amino acids contained in the sequence sq_t and len(sq_t) − 1 denotes the number of consecutive amino acid pairs.
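Because the reward formula image is not reproduced in this text, the exact combination of terms cannot be recovered here. The sketch below only illustrates the structure described above: the classification score SC_t combined with the length-averaged amino acid and amino acid pair probabilities, with the equal weighting of the three terms being an assumption rather than the patented formula. It uses the probability dictionaries from the previous sketch.

```python
def tendentious_reward(sc_t, fragment, p_aa, p_pair):
    """Illustrative reward for the fragment sq_t produced at step t: classifier
    score plus the mean single-amino-acid probability (over len(sq_t) residues)
    and the mean consecutive-pair probability (over len(sq_t) - 1 pairs).
    The equal weighting of the three terms is an assumption."""
    mean_aa = sum(p_aa.get(a, 0.0) for a in fragment) / len(fragment)
    pairs = [fragment[i:i + 2] for i in range(len(fragment) - 1)]
    mean_pair = sum(p_pair.get(p, 0.0) for p in pairs) / len(pairs)
    return sc_t + mean_aa + mean_pair
```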
Compared with the traditional epitope prediction scheme based on the window method, the epitope prediction method based on continuous action search realizes autonomous selection of the epitope sequence, eliminates the influence of human factors, introduces a complementary classifier, and improves the classification accuracy.
The invention adopts the continuous action search method, designed around the fact that common epitope sequences contain 8-20 amino acids; it thus covers common epitope lengths while meeting the computational requirements of the complementary classifiers.
The method adopts the tendentious reward rule, which considers not only the classifier's score for the searched sequence but also the statistical characteristics of the sequence; it jointly weighs the classifier score, the amino acid occurrence probabilities and the amino acid pair occurrence probabilities, and calculates the instant reward of the two consecutive actions in each state.

Claims (1)

1. An epitope prediction method for B cells, characterized in that: first, B cell epitope sequence data are retrieved from the IEDB database to form a set EPT, and the corresponding protein primary sequences are extracted from the UniProt database to form a pre-training set PT; training is based on the Q_learning algorithm, with the single action of the algorithm changed into two actions; in each episode, the Q agent takes any 8 consecutive amino acid residues in the primary sequence of a protein as a state and, as the first action, selects k residues from the 12 consecutive residues following the state to incorporate into it; as the second action it selects one of n complementary classifiers; it searches the protein primary sequences in PT according to the continuous action search method, the searched amino acid sequence is given an instant reward by the tendentious reward rule, and the Q value is calculated and updated until the change of the value function is less than 1%, at which point training ends; an amino acid sequence is then searched for in the protein primary sequence using the strategy obtained by training and classified by the selected classifier, thereby realizing B cell epitope prediction;
the method comprises the following steps:
a. B cell antigen epitope sequence data are retrieved from the IEDB database to form a set EPT, the corresponding protein primary sequences are extracted from the UniProt database to form a pre-training set PT, and n complementary classifiers are selected as the second action set, where n ≥ 2;
b. taking any 8 consecutive amino acid residues in the primary sequence of the protein as a state, and selecting k residues from the 12 consecutive residues following each state to incorporate into the state as the first action; selecting one of the n complementary classifiers as the second action; initializing the Q values corresponding to all states and actions to 0, setting the learning rate α to any number between 0 and 1, setting the discount factor γ to any number between 0 and 1, setting the number of episodes, and initializing the state s_0 to any 8 amino acid residues of the pre-training set;
c. in each episode, the Q agent searches among the primary sequences of the proteins in the set PT according to the continuous action search method: at step t, let the state be s_t; the Q agent selects an action a_t^1 from the first action set and then an action a_t^2 from the second action set; after the two actions are executed, the reward R_t and the next observed state s_{t+1} are given according to the tendentious reward rule; the Q value is then updated, and the state and action tables are updated at the same time; the search training process ends when the change of the value function is less than 1%;
d. searching out an amino acid combination in the primary sequence of each protein using the strategy obtained by training and classifying it with the selected classifier; if the classifier indicates that the searched amino acid sequence is an epitope, it is considered a B cell epitope, otherwise it is not;
the specific searching process of the continuous action searching method comprises the following steps:
any 8 amino acid residues of the primary sequence of each protein are taken as the initial state s_0, and the corresponding amino acid sequence is denoted X_1X_2…X_8, where X_j denotes the j-th amino acid, j = 1, 2, …, 8; from the initial state s_0, k residues among the following 12 consecutive residues are selected and incorporated into the state as the first action, with 1 ≤ k ≤ 12, and one of the n complementary classifiers is selected as the second action; the first action and the second action are selected according to the corresponding Q(s_t, a^1, a^2) values, where a^1 and a^2 are all possible actions in the first action set and all possible actions in the second action set, respectively; the reward for the two actions is then calculated by the tendentious reward rule, and the value function is calculated according to the following formula:
V_π(s_t) = E_π[ R_t + γ·V(s_{t+1}) ]
where V_π(s_t) is the value function in state s_t, π is the policy, E_π denotes the expectation under π, R_t is the reward at step t, and V(s_{t+1}) is the value function in the next state s_{t+1};
the Q value is calculated according to the following formula:
Q_π(s_t, a_t^1, a_t^2) = E_π[ R_t + γ·max_{a^1,a^2} Q(s_{t+1}, a^1, a^2) ]
where Q_π(s_t, a_t^1, a_t^2) is the value function for performing the two consecutive actions a_t^1, a_t^2 in state s_t, and Q(s_{t+1}, a^1, a^2) is the value function for performing two consecutive actions in the next state s_{t+1};
and the Q value is updated at the same time according to:
Q(s_t, a_t^1, a_t^2) ← Q(s_t, a_t^1, a_t^2) + α[ R_t + γ·max_{a^1,a^2} Q(s_{t+1}, a^1, a^2) − Q(s_t, a_t^1, a_t^2) ]
the state is then changed, the above steps are repeated, and the Q value is updated;
the tendentious reward rule is as follows:
features are extracted from the amino acid sequence searched out by the first action and used as the input of the classifier selected by the second action, and the classifier calculates the classification score SC_t of the amino acid sequence; in the set EPT, the occurrence probability of each amino acid and of each amino acid pair consisting of two consecutive amino acids is calculated; for any amino acid as_u, the occurrence probability P(as_u) is calculated according to the following formula:
P(as_u) = ( num(as_u) − minnum(as_1, as_2, …, as_20) ) / ( maxnum(as_1, as_2, …, as_20) − minnum(as_1, as_2, …, as_20) )
where as_u denotes any one of the 20 amino acids, num(as_u) denotes the number of times as_u occurs in the set EPT, maxnum(as_1, as_2, …, as_20) denotes the maximum of the occurrence counts of the 20 amino acids in the set EPT, and minnum(as_1, as_2, …, as_20) denotes the minimum of the occurrence counts of the 20 amino acids in the set EPT;
for any amino acid pair AA_v, the occurrence probability P(AA_v) is calculated according to the following formula:
P(AA_v) = ( num(AA_v) − minnum(AA_1, AA_2, …, AA_400) ) / ( maxnum(AA_1, AA_2, …, AA_400) − minnum(AA_1, AA_2, …, AA_400) )
where AA_v denotes one of the 400 amino acid pairs, num(AA_v) denotes the number of times AA_v occurs in the set EPT, maxnum(AA_1, AA_2, …, AA_400) denotes the maximum of the occurrence counts of the 400 amino acid pairs in the set EPT, and minnum(AA_1, AA_2, …, AA_400) denotes the minimum of the occurrence counts of the 400 amino acid pairs in the set EPT;
the instant reward obtained for the amino acid sequence sq_t generated at step t is calculated from the classification score SC_t together with these occurrence probabilities, where len(sq_t) denotes the number of amino acids contained in the sequence sq_t.
CN202111537519.XA 2021-12-15 2021-12-15 Antigen epitope prediction method for B cells Active CN114242169B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111537519.XA CN114242169B (en) 2021-12-15 2021-12-15 Antigen epitope prediction method for B cells

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111537519.XA CN114242169B (en) 2021-12-15 2021-12-15 Antigen epitope prediction method for B cells

Publications (2)

Publication Number Publication Date
CN114242169A CN114242169A (en) 2022-03-25
CN114242169B (en) 2023-10-20

Family

ID=80756774

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111537519.XA Active CN114242169B (en) 2021-12-15 2021-12-15 Antigen epitope prediction method for B cells

Country Status (1)

Country Link
CN (1) CN114242169B (en)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2013177214A2 (en) * 2012-05-21 2013-11-28 Distributed Bio Inc Epitope focusing by variable effective antigen surface concentration
CN105868583A (en) * 2016-04-06 2016-08-17 东北师范大学 Method for predicting epitope through cost-sensitive integrating and clustering on basis of sequence
CN107033226A (en) * 2017-06-27 2017-08-11 中国农业科学院兰州兽医研究所 A kind of PPR virus F protein epitope peptide and its determination, preparation method and application
WO2017184590A1 (en) * 2016-04-18 2017-10-26 The Broad Institute Inc. Improved hla epitope prediction
CN107341363A (en) * 2017-06-29 2017-11-10 河北省科学院应用数学研究所 A kind of Forecasting Methodology of proteantigen epitope
CN107909153A (en) * 2017-11-24 2018-04-13 天津科技大学 The modelling decision search learning method of confrontation network is generated based on condition
JP6500144B1 (en) * 2018-03-28 2019-04-10 Kotaiバイオテクノロジーズ株式会社 Efficient clustering of immune entities
CN112106141A (en) * 2018-03-16 2020-12-18 弘泰生物科技股份有限公司 Efficient clustering of immune entities

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2008097802A2 (en) * 2007-02-02 2008-08-14 Medical Discovery Partners Llc Epitope-mediated antigen prediction
CA3060900A1 (en) * 2018-11-05 2020-05-05 Royal Bank Of Canada System and method for deep reinforcement learning

Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2013177214A2 (en) * 2012-05-21 2013-11-28 Distributed Bio Inc Epitope focusing by variable effective antigen surface concentration
CN105868583A (en) * 2016-04-06 2016-08-17 东北师范大学 Method for predicting epitope through cost-sensitive integrating and clustering on basis of sequence
WO2017184590A1 (en) * 2016-04-18 2017-10-26 The Broad Institute Inc. Improved hla epitope prediction
CN107033226A (en) * 2017-06-27 2017-08-11 中国农业科学院兰州兽医研究所 A kind of PPR virus F protein epitope peptide and its determination, preparation method and application
CN107341363A (en) * 2017-06-29 2017-11-10 河北省科学院应用数学研究所 A kind of Forecasting Methodology of proteantigen epitope
CN107909153A (en) * 2017-11-24 2018-04-13 天津科技大学 The modelling decision search learning method of confrontation network is generated based on condition
CN112106141A (en) * 2018-03-16 2020-12-18 弘泰生物科技股份有限公司 Efficient clustering of immune entities
JP6500144B1 (en) * 2018-03-28 2019-04-10 Kotaiバイオテクノロジーズ株式会社 Efficient clustering of immune entities

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
A Survey of Reinforcement Learning Research; Ma Chengqian; Xie Wei; Sun Weijie; Command Control & Simulation, (06); full text *

Also Published As

Publication number Publication date
CN114242169A (en) 2022-03-25

Similar Documents

Publication Publication Date Title
Erev et al. Predicting how people play games: Reinforcement learning in experimental games with unique, mixed strategy equilibria
CN109190537A (en) A multi-person posture estimation method based on mask-aware deep reinforcement learning
CN106021990B (en) A method of biological gene is subjected to classification and Urine scent with specific character
CN111367790B (en) Meta heuristic test case ordering method based on mixed model
CN111950393B (en) Time sequence action fragment segmentation method based on boundary search agent
CN109145373B (en) Residual life prediction method and device based on improved ESGP and prediction interval
CN111352419B (en) Path planning method and system for updating experience playback cache based on time sequence difference
CN111260658B (en) Deep reinforcement learning method for image segmentation
CN111784595A (en) Dynamic label smooth weighting loss method and device based on historical records
CN111310799A (en) Active learning algorithm based on historical evaluation result
CN113239211A (en) Reinforced learning knowledge graph reasoning method based on course learning
CN114242169B (en) Antigen epitope prediction method for B cells
CN110488020A (en) A kind of protein glycation site identification method
CN112651499A (en) Structural model pruning method based on ant colony optimization algorithm and interlayer information
CN113283467A (en) Weak supervision picture classification method based on average loss and category-by-category selection
CN106709829B (en) Learning situation diagnosis method and system based on online question bank
CN116865251A (en) Short-term load probability prediction method and system
CN109378034B (en) Protein prediction method based on distance distribution estimation
CN116680578A (en) Cross-modal model-based deep semantic understanding method
JP2021047586A (en) Apple quality estimation program and system
CN115062759A (en) Fault diagnosis method based on improved long and short memory neural network
JP2001312712A (en) Non-linear time series prediction method and recording medium with non-linear time series prediction program recorded thereon
CN115240871A (en) Epidemic disease prediction method based on deep embedded clustering element learning
CN112989088B (en) Visual relation example learning method based on reinforcement learning
CN109326319B (en) Protein conformation space optimization method based on secondary structure knowledge

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant