CN113577769A - Game character action control method, device, equipment and storage medium

Info

Publication number: CN113577769A
Application number: CN202110769501.6A
Authority: CN (China)
Prior art keywords: game, predicted, action, character, enemy
Legal status: Granted; currently active
Other languages: Chinese (zh)
Other versions: CN113577769B
Inventors: 刘舟, 徐键滨, 吴梓辉, 徐雅, 王理平
Current and original assignee: Guangzhou Sanqi Jiyao Network Technology Co., Ltd.
Priority date / filing date: 2021-07-07
Application filed by Guangzhou Sanqi Jiyao Network Technology Co., Ltd.; priority to CN202110769501.6A; publication of CN113577769A; application granted; publication of CN113577769B

Classifications

    • A: HUMAN NECESSITIES
    • A63: SPORTS; GAMES; AMUSEMENTS
    • A63F: CARD, BOARD, OR ROULETTE GAMES; INDOOR GAMES USING SMALL MOVING PLAYING BODIES; VIDEO GAMES; GAMES NOT OTHERWISE PROVIDED FOR
    • A63F13/00: Video games, i.e. games using an electronically generated display having two or more dimensions
    • A63F13/50: Controlling the output signals based on the game progress
    • A63F13/52: Controlling the output signals based on the game progress involving aspects of the displayed game scene
    • A63F13/55: Controlling game characters or game objects based on the game progress

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Human Computer Interaction (AREA)
  • Processing Or Creating Images (AREA)

Abstract

The invention discloses a game character action control method, which comprises the following steps: acquiring current game state information of a local game character, and acquiring current game state information of an enemy game character in a special game state; predicting position information of the enemy game character at the next moment according to the current game state information of the enemy game character; inputting the current game state information of the local game character and the predicted position information of the enemy game character into an AI model, and determining the predicted action output strategy currently to be taken by the game character; and controlling the game character to output the predicted action according to the predicted action output strategy. The invention also discloses a game character action control apparatus, a device, and a computer-readable storage medium. With the embodiments of the invention, the predicted action output strategy of a game character can be determined in combination with the current game states of other characters, so that the predicted output of the character's action is accurate and reasonable.

Description

Game character action control method, device, equipment and storage medium
Technical Field
The present invention relates to the field of game character control technologies, and in particular, to a game character action control method, apparatus, device, and storage medium.
Background
With the gradual development of internet technology, the electronic sports industry is receiving more and more attention. In a competitive game, to ensure a good user experience during play, the character action that a game character will take at the next moment needs to be predicted during the game; for example, when the user is away (hanging up), the game character still outputs actions normally according to the character action predicted for the next moment. However, in the prior art the predicted action of a game character is not output in combination with the current game state of other characters, so the predicted output of the character action is neither accurate nor reasonable. For example, a game character may predict, based on its current game state information, the action of releasing a certain skill at an enemy game character, but the enemy game character may have moved out of the skill's hit range by the next moment, so the predicted action cannot hit the enemy game character normally.
Disclosure of Invention
The embodiments of the present invention aim to provide a game character action control method, apparatus, device, and storage medium, which can determine the predicted action output strategy of a game character in combination with the current game states of other characters, so that the predicted output of the character's action is accurate and reasonable.
In order to achieve the above object, an embodiment of the present invention provides a method for controlling actions of a game character, including:
acquiring current game state information of a local game character, and acquiring current game state information of an enemy game character in a special game state;
predicting position information of the enemy game character at the next moment according to the current game state information of the enemy game character;
inputting the current game state information of the local game character and the predicted position information of the enemy game character into an AI model, and determining the predicted action output strategy currently to be taken by the game character;
and controlling the game character to output the predicted action according to the predicted action output strategy.
As an improvement of the above, the predicted action output policy includes at least one of a determination policy of an attack target, a determination policy of a game action, and a determination policy of a movement path.
As an improvement of the above, controlling the game character to output the predicted action according to the predicted action output strategy includes:
when unreasonable actions exist in the predicted actions, an action limiting mechanism corresponding to the predicted actions is obtained;
performing action limiting operation on the unreasonable action according to the action limiting mechanism so as to adjust the sampling probability of the unreasonable action;
and outputting the predicted action after the action limiting operation is finished.
As an improvement of the above, after outputting the predicted action after the action limiting operation is completed, the method further includes:
and inputting the predicted action after the action limiting operation is finished into the AI model as sample data so as to train the AI model.
As an improvement of the above, the predicted action is the pitch angle of the game character; then, controlling the game character to output the predicted action according to the predicted action output strategy further comprises:
acquiring the application scene in which the game character uses the pre-aiming skill;
when the game character cannot see an attack target, using the pre-aiming skill and outputting the AI model's predicted angle as the pitch angle;
when the game character can see the attack target, using the pre-aiming skill and calculating the included angle between the game character and the attack target as the pitch angle.
As an improvement of the scheme, the AI model comprises a critic network and an actor network, and the predicted action is output through the actor network.
As an improvement of the above scheme, the AI model is obtained by training a preset AI model training method, and the AI model training method includes:
acquiring sample data; wherein the sample data comprises game state data and actual rewards for the game character;
inputting the game state data into the AI model to generate a predicted reward for the game character;
calculating a loss function of the AI model according to the predicted reward and the actual reward; the actual reward is calculated through a preset reward mechanism;
and optimizing the AI model according to the loss function until the loss function is converged.
In order to achieve the above object, an embodiment of the present invention further provides a game character action control apparatus, including:
a game state information obtaining module, configured to obtain current game state information of a local game character and current game state information of an enemy game character in a special game state;
a predicted position information obtaining module, configured to predict position information of the enemy game character at the next moment according to the current game state information of the enemy game character;
a predicted action output strategy determining module, configured to input the current game state information of the local game character and the predicted position information of the enemy game character into an AI model and determine the predicted action output strategy currently to be taken by the game character;
and a predicted action output module, configured to control the game character to output the predicted action according to the predicted action output strategy.
To achieve the above object, an embodiment of the present invention further provides a game character action control device, including a processor, a memory, and a computer program stored in the memory and configured to be executed by the processor, wherein the processor implements the game character action control method according to any one of the above embodiments when executing the computer program.
In order to achieve the above object, an embodiment of the present invention further provides a computer-readable storage medium, where the computer-readable storage medium includes a stored computer program, and when the computer program runs, a device on which the computer-readable storage medium is located is controlled to execute the game character action control method according to any one of the above embodiments.
Compared with the prior art, in the game character action control method, apparatus, device, and storage medium disclosed by the embodiments of the present invention, the current game state information of the local game character is first obtained, along with the current game state information of an enemy game character in a special game state; then, the position of the enemy game character at the next moment is predicted according to the enemy game character's current game state information; finally, the current game state information of the local game character and the predicted position information of the enemy game character are input into an AI model to determine the predicted action output strategy the game character should currently take, so as to control the game character to output the predicted action according to that strategy. Because the game state information of the enemy game character is fully considered when determining the predicted action output strategy, the position of the enemy game character at the next moment can be predicted from its game state information, the game character can formulate a response strategy in advance, the skill hit rate of the game character is improved, and the predicted output of the character's action is accurate and reasonable.
Drawings
FIG. 1 is a flowchart of a game character action control method according to an embodiment of the present invention;
FIG. 2 is a block diagram of a game character action control apparatus according to an embodiment of the present invention;
FIG. 3 is a block diagram of a game character action control device according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Referring to fig. 1, fig. 1 is a flowchart of a game character action control method according to an embodiment of the present invention, where the game character action control method includes:
S1, obtaining current game state information of a local game character, and obtaining current game state information of an enemy game character in a special game state;
S2, predicting position information of the enemy game character at the next moment according to the current game state information of the enemy game character;
S3, inputting the current game state information of the local game character and the predicted position information of the enemy game character into an AI model, and determining the predicted action output strategy currently to be taken by the game character;
and S4, controlling the game character to output the predicted action according to the predicted action output strategy.
Specifically, in step S1, the game state information includes position information, attribute information, combat state information, skill information, and ray information used to detect obstacles around the AI character (AI characters only). The position information is the current position of the game character. The attribute information includes the weapon type of the game character (for example, a firearm), other weapon attributes (such as the weapon's bullet count), and team information used to distinguish the local game character from enemy game characters. The combat state information includes the character's life value, whether it has killed an enemy, whether it has died, whether it is in a squatting state, the remaining bullet count (AI characters only), whether it is firing, whether it is invisible, and whether the character is visible to the AI (AI only). The skill information includes the skill type encoding and the current state of each skill in the game character's skill bar; for example, if three skills are in the character's skill bar, the current state of each is either available or unavailable, and it can be understood that a skill is unavailable while it is recovering.
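To make the model input concrete, the following is a minimal sketch, not taken from the patent, of how such game state information might be flattened into a fixed-length feature vector; every field name and dimension here is an illustrative assumption.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class GameState:
    # Illustrative fields inferred from the description above; all names are assumptions.
    position: List[float]     # current (x, y, z) position
    weapon_type: int          # categorical weapon id, e.g. a firearm
    bullet_count: int         # remaining bullets (AI characters only)
    team_id: int              # distinguishes the local character from enemies
    hp: float                 # life value
    is_squatting: bool
    is_firing: bool
    is_invisible: bool
    skill_states: List[int]   # 1 = available, 0 = unavailable (recovering)
    rays: List[float]         # obstacle-detection ray distances (AI characters only)

def encode_state(s: GameState) -> List[float]:
    """Flatten a GameState into a fixed-length feature vector for the AI model."""
    return (
        list(s.position)
        + [float(s.weapon_type), float(s.bullet_count), float(s.team_id), s.hp]
        + [float(s.is_squatting), float(s.is_firing), float(s.is_invisible)]
        + [float(v) for v in s.skill_states]
        + list(s.rays)
    )
```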
It should be noted that before obtaining the current game state information of the game character and the enemy game character, it is required to first detect whether the game character and the enemy game character satisfy a preset combat condition, where the combat condition includes: the hostile game character is in the attack range of the game character. There may be a plurality of enemy game characters within the attack range of the game character, and when the enemy game character is in the special game state, the position information of at least one enemy game character at the next moment can be predicted at the same time.
Illustratively, the special game state includes at least one of a blood volume below a preset blood volume threshold and a skill support amount below a preset skill support threshold, where the blood volume represents the life value of the game character and the skill support amount represents the degree to which the character's skills are available. Since the behavior of an enemy game character is easily restricted, and generally has regularity, when its blood volume or skill support amount is low, the position of such an enemy game character at the next moment can be predicted with high accuracy. It should be noted that the blood volume threshold and the skill support threshold may be set according to different types of game characters, and the present invention is not limited in this respect.
Specifically, in step S2, after the current game state information of the enemy game character is acquired, the predicted position information of the enemy game character at the next time is predicted based on the current game state information of the enemy game character.
For example, the current game state information of the enemy game character can be input into a preset position prediction model, which outputs the predicted position of the enemy game character at the next moment. When sample data is collected for training the position prediction model, samples in which the enemy's blood volume is below the blood volume threshold and samples in which the skill support amount is below the skill support threshold are extracted separately; the game state information of the enemy game character is used as training data (the model input), and the character's position at the next moment is extracted as target data (the model output). Prediction is performed with a multilayer perceptron model, trained with the mean square error as the model's loss function. It should be noted that the multilayer perceptron model and the use of mean square error as a loss function are known in the prior art and are not described further here.
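As a rough illustration of the described predictor, the sketch below trains a small multilayer perceptron with a mean squared error loss, as the passage specifies; the layer sizes, feature dimension, and placeholder batch are assumptions rather than details from the patent.

```python
import torch
import torch.nn as nn

class PositionPredictor(nn.Module):
    """MLP mapping an enemy's current game state features to its predicted
    (x, y, z) position at the next moment."""
    def __init__(self, state_dim: int = 32, hidden_dim: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, 3),  # position at the next moment
        )

    def forward(self, state: torch.Tensor) -> torch.Tensor:
        return self.net(state)

model = PositionPredictor()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()  # mean square error, as in the description

# One training step on a placeholder batch of low-blood-volume / low-skill-support samples.
states = torch.randn(128, 32)         # enemy game state features (model input)
next_positions = torch.randn(128, 3)  # observed next-moment positions (target output)

optimizer.zero_grad()
loss = loss_fn(model(states), next_positions)
loss.backward()
optimizer.step()
```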
Specifically, in step S3, the current game state information of the local game character and the predicted position information of the enemy game character are input into the AI model, so that the AI model outputs the predicted action output strategy of the local game character; the predicted action output strategy includes at least one of a strategy for determining an attack target, a strategy for determining a game action, and a strategy for determining a movement path.
For example, when two enemy game characters (say, enemy game character A and enemy game character B) are within the attack range of the local game character, the AI model may determine the primary attack target according to the enemy characters' predicted positions, for example selecting enemy game character A, whose predicted position is closer to the local game character, as the attack target. After the attack target is determined, the AI model can plan the movement path of the game character toward the attack target and the game action to take, such as attacking with a specified skill.
In order to keep the input dimensions of the AI model consistent, when the predicted position information is input into the AI model, the predicted position values of the other enemy game characters, for which no position prediction is performed and no predicted value exists, are filled with 0 and input into the AI model together with the predicted position information of the enemy game characters in the special game state that do have predicted values.
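A minimal sketch of this zero-filling step follows; the function and argument names are hypothetical, introduced only for illustration.

```python
import torch

def build_predicted_positions(predictions: dict, enemy_ids: list, dim: int = 3) -> torch.Tensor:
    """Assemble a fixed-size predicted-position input for the AI model.

    `predictions` maps the ids of enemies in the special game state to their
    predicted (x, y, z); enemies without a predicted value are filled with 0
    so that the input dimension of the AI model stays constant.
    """
    rows = [
        torch.tensor(predictions[eid], dtype=torch.float32)
        if eid in predictions else torch.zeros(dim)
        for eid in enemy_ids
    ]
    return torch.cat(rows)  # shape: (len(enemy_ids) * dim,)

# Example: only enemy 1 is in the special game state and has a prediction.
vec = build_predicted_positions({1: [3.0, 0.0, 7.5]}, enemy_ids=[1, 2, 3])
```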
Optionally, the AI model is obtained by training through a preset AI model training method, where the AI model training method includes steps S31 to S34:
s31, acquiring sample data; wherein the sample data comprises game state data and actual awards for the game character;
s32, inputting the game state data into the AI model to generate a predicted reward of the game role;
s33, calculating a loss function of the AI model according to the predicted reward and the actual reward; the actual reward is calculated through a preset reward mechanism;
s34, optimizing the AI model according to the loss function until the convergence of the loss function
Specifically, in step S31, the game state data includes: first state information of the game character at the current time step, first action information at the current time step, and second state information at the next time step.
Specifically, in step S32, the AI model in the embodiment of the present invention is a DDPG model, which has two networks: an actor network and a critic network. The first state information and the first action information are input into the critic network to obtain the first state-action value for the current time step; then the second state information for the next time step is input into the actor network to obtain the predicted action for the next time step, i.e., the second action information, and the second state information and the second action information are input into the critic network to obtain the second state-action value for the next time step; finally, the difference between the first state-action value and the second state-action value is used as the predicted reward.
It is worth noting that the action information refers to the action taken by the virtual character in the environment, while the state information is the result of taking that action, reflected in the state of the game. For example, if the action information is shooting, the corresponding state information may be that the enemy loses blood or that the enemy dies; if the action information is jumping and the character jumps onto a box, the corresponding state information is that the character's height increases by the height of the box. In the embodiment of the present invention, reading a state from the game environment every n frames is called a time step; it can be understood that the current time step and the next time step are two consecutive time steps. The state-action value refers to the expected return for a certain state and action; for example, the return is the accumulated game reward, and the expected return is the average return over multiple games. The critic network is a neural network whose input is a state and an action and whose output is the value of that state-action pair; its parameters are adjusted toward the true values by back-propagating the loss function.
Specifically, in steps S33-S34, the reward mechanism means that the game character obtains a corresponding reward when certain conditions are met during the game, for example: a positive reward is given when the operation state of the virtual character is a normal operation, and a negative reward is given when the operation state of the virtual character is an abnormal operation. The critic network and the actor network are each divided into an eval network and a target network; the target network does not participate in training, and its parameters are periodically copied from the eval network. The predicted action is output through the actor network. Illustratively, the sum of the squares of the differences between the predicted reward and the actual reward is the loss function of the critic network, and the loss function of the actor network is the predicted value of the critic network. The critic network loss function satisfies the following equation:
$$L = \frac{1}{N}\sum_i \left( y_i - Q(s_i, a_i \mid \theta^Q) \right)^2$$

where $N$ is the total number of samples; $y_i = r_i + \gamma\, Q'(s_{i+1}, \mu'(s_{i+1} \mid \theta^{\mu'}) \mid \theta^{Q'})$; $r_i$ is the reward of the $i$-th time step; $\gamma$ is a hyperparameter, the discount factor; $Q(s, a \mid \theta^Q)$ is the critic network, whose parameters are $\theta^Q$; $\mu(s \mid \theta^\mu)$ is the actor network, whose parameters are $\theta^\mu$; $Q'$ is the target network of the critic network; and $\mu'$ is the target network of the actor network. The parameters of the target networks do not participate in training and are periodically updated from the $Q$ network and the $\mu$ network; the actor network updates its parameters by maximizing the predicted value of the critic network.
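The following is a minimal sketch of one DDPG update step consistent with this loss, assuming small fully connected networks and a placeholder batch; none of the dimensions or hyperparameters come from the patent. In line with the passage, the target networks Q′ and μ′ are held fixed during the update and would be refreshed from the eval networks periodically.

```python
import copy
import torch
import torch.nn as nn

state_dim, action_dim, gamma = 32, 4, 0.99  # assumed sizes and discount factor

critic = nn.Sequential(nn.Linear(state_dim + action_dim, 64), nn.ReLU(), nn.Linear(64, 1))
actor = nn.Sequential(nn.Linear(state_dim, 64), nn.ReLU(), nn.Linear(64, action_dim), nn.Tanh())
critic_target = copy.deepcopy(critic)  # Q': copied periodically, not trained
actor_target = copy.deepcopy(actor)    # mu': copied periodically, not trained

critic_opt = torch.optim.Adam(critic.parameters(), lr=1e-3)
actor_opt = torch.optim.Adam(actor.parameters(), lr=1e-4)

# Placeholder batch of transitions (s_i, a_i, r_i, s_{i+1}).
s = torch.randn(64, state_dim)
a = torch.randn(64, action_dim)
r = torch.randn(64, 1)
s_next = torch.randn(64, state_dim)

# Critic loss: L = (1/N) * sum_i (y_i - Q(s_i, a_i))^2,
# with y_i = r_i + gamma * Q'(s_{i+1}, mu'(s_{i+1})).
with torch.no_grad():
    y = r + gamma * critic_target(torch.cat([s_next, actor_target(s_next)], dim=1))
critic_loss = ((y - critic(torch.cat([s, a], dim=1))) ** 2).mean()
critic_opt.zero_grad()
critic_loss.backward()
critic_opt.step()

# Actor update: maximize the critic's predicted value of the actor's action.
actor_loss = -critic(torch.cat([s, actor(s)], dim=1)).mean()
actor_opt.zero_grad()
actor_loss.backward()
actor_opt.step()
```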
Specifically, in step S4, after the predicted action output strategy is determined, the game character is controlled to output the corresponding predicted action according to the predicted action output strategy. When the game character outputs the predicted action, the reasonableness of the predicted action needs to be judged, and if a predicted action is unreasonable, the sampling probability of the unreasonable action needs to be adjusted.
Optionally, controlling the game character to output the predicted action according to the predicted action output strategy includes steps S411 to S413:
s411, when unreasonable actions exist in the predicted actions, acquiring an action limiting mechanism corresponding to the predicted actions;
s412, performing action limiting operation on the unreasonable action according to the action limiting mechanism to adjust the sampling probability of the unreasonable action;
and S413, outputting the predicted action after the action limiting operation is finished.
Specifically, the output of the AI model is restricted by judging whether the predicted action complies with the rules. For discrete actions, the AI model outputs the probability of selecting each action; for an unreasonable action, the sampling probability is adjusted to 0 so that the action cannot be sampled.
Illustratively, when the predicted action is movement path selection, the unreasonable action is moving in a direction in which an obstacle is present, and the action limiting mechanism adjusts the sampling probability of that direction to 0. When the predicted action is attack target selection, the unreasonable action is selecting an invisible attack target, and the action limiting mechanism adjusts the sampling probability of the invisible attack target to 0. When the predicted action is a game action, the unreasonable action is selecting a skill that is in the cooling state, and the action limiting mechanism adjusts the sampling probability of that skill to 0.
For example, suppose the predicted action is selecting an attack target, and the AI model might select an enemy that is invisible because it is outside the field of view. The action output of the AI model then needs some processing: for candidate attack targets enemy 1, enemy 2, and enemy 3, the model outputs the probability of selecting each, with the probabilities summing to 1. If enemy 1 is invisible at that moment, the probability of selecting enemy 1 is adjusted to 0 so that only a visible enemy can be selected, as shown in the sketch below.
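A minimal sketch of this masking and renormalization, with hypothetical names:

```python
import torch

def mask_unreasonable_actions(probs: torch.Tensor, valid_mask: torch.Tensor) -> torch.Tensor:
    """Set the sampling probability of unreasonable actions to 0 and
    renormalize so the remaining probabilities again sum to 1.
    Assumes at least one action stays valid."""
    masked = probs * valid_mask
    return masked / masked.sum()

# Example: three candidate attack targets; enemy 1 is currently invisible.
probs = torch.tensor([0.5, 0.3, 0.2])    # AI model output, sums to 1
visible = torch.tensor([0.0, 1.0, 1.0])  # enemy 1 cannot be selected
target = torch.multinomial(mask_unreasonable_actions(probs, visible), 1)
```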
Further, when the predicted action is the pitch angle of the game character, controlling the game character to output the predicted action according to the predicted action output strategy further includes steps S421 to S423:
s421, acquiring an application scene of the game role using the pre-aiming skill;
s422, when the game role does not see an attack target, using a pre-aiming skill, and outputting the AI model prediction angle as a pitching angle;
and S423, when the game character sees the attack target, using the pre-aiming skill, and calculating a preset position included angle between the game character and the attack target as a pitching angle.
Illustratively, the pitch angle is the pitch turning angle, i.e., the vertical angle the game character adopts when using the pre-aiming skill. When the game character cannot see the attack target, the pre-aiming skill is used and the AI model's predicted angle is output as the pitch angle: because the attack target is not within the field of view, the pitch angle cannot be calculated directly, so the AI model's predicted angle is adopted. When the game character can see the attack target, the pitch angle can be calculated directly when using the pre-aiming skill; for example, if the preset position is the head, the included angle between the horizontal line and the line connecting the game character to the head of the enemy game character is used as the pitch angle.
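A minimal sketch of the visible-target case, assuming a Y-up coordinate system; the function name and positions are purely illustrative.

```python
import math

def pitch_to_target(own_pos, target_head_pos) -> float:
    """Pitch angle in degrees between the horizontal line and the line
    connecting the character to the enemy's head (the preset position)."""
    dx = target_head_pos[0] - own_pos[0]
    dy = target_head_pos[1] - own_pos[1]  # vertical axis (assumed Y-up)
    dz = target_head_pos[2] - own_pos[2]
    horizontal = math.hypot(dx, dz)
    return math.degrees(math.atan2(dy, horizontal))

# Example: enemy head 2 m higher and 10 m away horizontally -> about 11.3 degrees.
angle = pitch_to_target((0.0, 1.6, 0.0), (6.0, 3.6, 8.0))
```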
Further, after the predicted action after the action limiting operation is output in step S413, the method further includes step S414:
and S414, inputting the predicted action after the action limiting operation is completed into the AI model as sample data so as to train the AI model.
For example, the predicted action after action limitation can be input into the AI model as sample data to train the AI model. An unreasonable action does not yield a good reward; if the predicted action were input into the AI model directly, without action limitation, the AI model would generate too many invalid samples, and such invalid samples cannot be used to train the AI model.
Compared with the prior art, in the game character action control method disclosed by the embodiment of the present invention, the current game state information of the local game character is first obtained, along with the current game state information of an enemy game character in a special game state; then, the position of the enemy game character at the next moment is predicted according to the enemy game character's current game state information; finally, the current game state information of the local game character and the predicted position information of the enemy game character are input into an AI model to determine the predicted action output strategy the game character should currently take, so as to control the game character to output the predicted action according to that strategy. Because the game state information of the enemy game character is fully considered when determining the predicted action output strategy, the position of the enemy game character at the next moment can be predicted from its game state information, the game character can formulate a response strategy in advance, the skill hit rate of the game character is improved, and the predicted output of the character's action is accurate and reasonable.
Referring to fig. 2, fig. 2 is a block diagram of a game character action control apparatus 10 according to an embodiment of the present invention, where the game character action control apparatus 10 includes:
a game state information obtaining module 11, configured to obtain current game state information of a local game character and current game state information of an enemy game character in a special game state;
a predicted position information obtaining module 12, configured to predict position information of the enemy game character at the next moment according to the current game state information of the enemy game character;
a predicted action output strategy determining module 13, configured to input the current game state information of the local game character and the predicted position information of the enemy game character into an AI model and determine the predicted action output strategy currently to be taken by the game character;
and a predicted action output module 14, configured to control the game character to output the predicted action according to the predicted action output strategy.
Specifically, the game state information includes position information, attribute information, combat state information, skill information, and ray information used to detect obstacles around the AI character (AI characters only). The position information is the current position of the game character. The attribute information includes the weapon type of the game character (for example, a firearm), other weapon attributes (such as the weapon's bullet count), and team information used to distinguish the local game character from enemy game characters. The combat state information includes the character's life value, whether it has killed an enemy, whether it has died, whether it is in a squatting state, the remaining bullet count (AI characters only), whether it is firing, whether it is invisible, and whether the character is visible to the AI (AI only). The skill information includes the skill type encoding and the current state of each skill in the game character's skill bar; for example, if three skills are in the character's skill bar, the current state of each is either available or unavailable, and it can be understood that a skill is unavailable while it is recovering.
It should be noted that before obtaining the current game state information of the game character and the enemy game character, it is required to first detect whether the game character and the enemy game character satisfy a preset combat condition, where the combat condition includes: the hostile game character is in the attack range of the game character. There may be a plurality of enemy game characters within the attack range of the game character, and when the enemy game character is in the special game state, the position information of at least one enemy game character at the next moment can be predicted at the same time.
Illustratively, the special game state includes at least one of a blood volume below a preset blood volume threshold and a skill support amount below a preset skill support threshold, where the blood volume represents the life value of the game character and the skill support amount represents the degree to which the character's skills are available. Since the behavior of an enemy game character is easily restricted, and generally has regularity, when its blood volume or skill support amount is low, the position of such an enemy game character at the next moment can be predicted with high accuracy. It should be noted that the blood volume threshold and the skill support threshold may be set according to different types of game characters, and the present invention is not limited in this respect.
Specifically, after the predicted position information obtaining module 12 obtains the current game state information of the enemy game character, the predicted position information of the enemy game character at the next moment is predicted according to the current game state information of the enemy game character.
For example, the current game state information of the enemy game character can be input into a preset position prediction model, which outputs the predicted position of the enemy game character at the next moment. When sample data is collected for training the position prediction model, samples in which the enemy's blood volume is below the blood volume threshold and samples in which the skill support amount is below the skill support threshold are extracted separately; the game state information of the enemy game character is used as training data (the model input), and the character's position at the next moment is extracted as target data (the model output). Prediction is performed with a multilayer perceptron model, trained with the mean square error as the model's loss function. It should be noted that the multilayer perceptron model and the use of mean square error as a loss function are known in the prior art and are not described further here.
Specifically, the predicted action output strategy determining module 13 inputs the current game state information of the local game character and the predicted position information of the enemy game character into the AI model, so that the AI model outputs the predicted action output strategy of the local game character, where the predicted action output strategy includes at least one of a strategy for determining an attack target, a strategy for determining a game action, and a strategy for determining a movement path.
For example, when two enemy game characters (say, enemy game character A and enemy game character B) are within the attack range of the local game character, the AI model may determine the primary attack target according to the enemy characters' predicted positions, for example selecting enemy game character A, whose predicted position is closer to the local game character, as the attack target. After the attack target is determined, the AI model can plan the movement path of the game character toward the attack target and the game action to take, such as attacking with a specified skill.
In order to keep the input dimensions of the AI model consistent, when the predicted position information is input into the AI model, the predicted position values of the other enemy game characters, for which no position prediction is performed and no predicted value exists, are filled with 0 and input into the AI model together with the predicted position information of the enemy game characters in the special game state that do have predicted values.
Optionally, the AI model is obtained by training through a preset AI model training method, where the AI model training method includes:
s31, acquiring sample data; wherein the sample data comprises game state data and actual awards for the game character;
s32, inputting the game state data into the AI model to generate a predicted reward of the game role;
s33, calculating a loss function of the AI model according to the predicted reward and the actual reward; the actual reward is calculated through a preset reward mechanism;
s34, optimizing the AI model according to the loss function until the convergence of the loss function
Specifically, in step S31, the game state data includes: first state information of the game character at the current time step, first action information at the current time step, and second state information at the next time step.
Specifically, in step S32, the AI model in the embodiment of the present invention is a DDPG model, which has two networks: an actor network and a critic network. The first state information and the first action information are input into the critic network to obtain the first state-action value for the current time step; then the second state information for the next time step is input into the actor network to obtain the predicted action for the next time step, i.e., the second action information, and the second state information and the second action information are input into the critic network to obtain the second state-action value for the next time step; finally, the difference between the first state-action value and the second state-action value is used as the predicted reward.
It is worth noting that the action information refers to the action taken by the virtual character in the environment, while the state information is the result of taking that action, reflected in the state of the game. For example, if the action information is shooting, the corresponding state information may be that the enemy loses blood or that the enemy dies; if the action information is jumping and the character jumps onto a box, the corresponding state information is that the character's height increases by the height of the box. In the embodiment of the present invention, reading a state from the game environment every n frames is called a time step; it can be understood that the current time step and the next time step are two consecutive time steps. The state-action value refers to the expected return for a certain state and action; for example, the return is the accumulated game reward, and the expected return is the average return over multiple games. The critic network is a neural network whose input is a state and an action and whose output is the value of that state-action pair; its parameters are adjusted toward the true values by back-propagating the loss function.
Specifically, in steps S33-S34, the reward mechanism means that the game character obtains a corresponding reward when certain conditions are met during the game, for example: a positive reward is given when the operation state of the virtual character is a normal operation, and a negative reward is given when the operation state of the virtual character is an abnormal operation. The critic network and the actor network are each divided into an eval network and a target network; the target network does not participate in training, and its parameters are periodically copied from the eval network. The predicted action is output through the actor network. Illustratively, the sum of the squares of the differences between the predicted reward and the actual reward is the loss function of the critic network, and the loss function of the actor network is the predicted value of the critic network. The critic network loss function satisfies the following equation:
$$L = \frac{1}{N}\sum_i \left( y_i - Q(s_i, a_i \mid \theta^Q) \right)^2$$

where $N$ is the total number of samples; $y_i = r_i + \gamma\, Q'(s_{i+1}, \mu'(s_{i+1} \mid \theta^{\mu'}) \mid \theta^{Q'})$; $r_i$ is the reward of the $i$-th time step; $\gamma$ is a hyperparameter, the discount factor; $Q(s, a \mid \theta^Q)$ is the critic network, whose parameters are $\theta^Q$; $\mu(s \mid \theta^\mu)$ is the actor network, whose parameters are $\theta^\mu$; $Q'$ is the target network of the critic network; and $\mu'$ is the target network of the actor network. The parameters of the target networks do not participate in training and are periodically updated from the $Q$ network and the $\mu$ network; the actor network updates its parameters by maximizing the predicted value of the critic network.
Specifically, after determining the predicted action output strategy, the predicted action output module 14 controls the game character to output the corresponding predicted action according to the predicted action output strategy. When the game character outputs the predicted action, the reasonableness of the predicted action needs to be judged, and if a predicted action is unreasonable, the sampling probability of the unreasonable action needs to be adjusted.
Optionally, the predicted action output module 14 is configured to:
when unreasonable actions exist in the predicted actions, an action limiting mechanism corresponding to the predicted actions is obtained;
performing action limiting operation on the unreasonable action according to the action limiting mechanism so as to adjust the sampling probability of the unreasonable action;
and outputting the predicted action after the action limiting operation is finished.
Specifically, the output of the AI model is restricted by judging whether the predicted action complies with the rules. For discrete actions, the AI model outputs the probability of selecting each action; for an unreasonable action, the sampling probability is adjusted to 0 so that the action cannot be sampled.
Illustratively, when the predicted action is movement path selection, the unreasonable action is moving in a direction in which an obstacle is present, and the action limiting mechanism adjusts the sampling probability of that direction to 0. When the predicted action is attack target selection, the unreasonable action is selecting an invisible attack target, and the action limiting mechanism adjusts the sampling probability of the invisible attack target to 0. When the predicted action is a game action, the unreasonable action is selecting a skill that is in the cooling state, and the action limiting mechanism adjusts the sampling probability of that skill to 0.
For example, suppose the predicted action is selecting an attack target, and the AI model might select an enemy that is invisible because it is outside the field of view. The action output of the AI model then needs some processing: for candidate attack targets enemy 1, enemy 2, and enemy 3, the model outputs the probability of selecting each, with the probabilities summing to 1. If enemy 1 is invisible at that moment, the probability of selecting enemy 1 is adjusted to 0 so that only a visible enemy can be selected.
Further, when the predicted action is the pitch angle of the game character, the predicted action output module is further configured to:
acquire the application scene in which the game character uses the pre-aiming skill;
when the game character cannot see an attack target, use the pre-aiming skill and output the AI model's predicted angle as the pitch angle;
when the game character can see the attack target, use the pre-aiming skill and calculate the included angle between the game character and the attack target as the pitch angle.
Illustratively, the pitch angle is the pitch turning angle, i.e., the vertical angle the game character adopts when using the pre-aiming skill. When the game character cannot see the attack target, the pre-aiming skill is used and the AI model's predicted angle is output as the pitch angle: because the attack target is not within the field of view, the pitch angle cannot be calculated directly, so the AI model's predicted angle is adopted. When the game character can see the attack target, the pitch angle can be calculated directly when using the pre-aiming skill; for example, if the preset position is the head, the included angle between the horizontal line and the line connecting the game character to the head of the enemy game character is used as the pitch angle.
Further, after the predicted action output module 14 outputs the predicted action after the action limiting operation is performed, the predicted action output module 14 is further configured to:
and inputting the predicted action after the action limiting operation is finished into the AI model as sample data so as to train the AI model.
For example, the predicted action after action limitation can be input into the AI model as sample data to train the AI model. An unreasonable action does not yield a good reward; if the predicted action were input into the AI model directly, without action limitation, the AI model would generate too many invalid samples, and such invalid samples cannot be used to train the AI model.
Compared with the prior art, the game character action control apparatus 10 disclosed in the embodiment of the present invention first obtains the current game state information of the local game character and the current game state information of an enemy game character in a special game state; then predicts the position of the enemy game character at the next moment according to the enemy game character's current game state information; and finally inputs the current game state information of the local game character and the predicted position information of the enemy game character into an AI model to determine the predicted action output strategy the game character should currently take, so as to control the game character to output the predicted action according to that strategy. Because the game state information of the enemy game character is fully considered when determining the predicted action output strategy, the position of the enemy game character at the next moment can be predicted from its game state information, the game character can formulate a response strategy in advance, the skill hit rate of the game character is improved, and the predicted output of the character's action is accurate and reasonable.
Referring to fig. 3, fig. 3 is a block diagram of a game character action control device 20 according to an embodiment of the present invention. The game character action control device 20 includes: a processor 21, a memory 22, and a computer program stored in the memory 22 and executable on the processor 21. The processor 21 implements the steps of the above game character action control method embodiments when executing the computer program; alternatively, the processor 21 implements the functions of the modules/units in the above apparatus embodiments when executing the computer program.
Illustratively, the computer program may be divided into one or more modules/units, which are stored in the memory 22 and executed by the processor 21 to implement the present invention. The one or more modules/units may be a series of computer program instruction segments capable of performing specific functions, and the instruction segments are used to describe the execution process of the computer program in the game character action control device 20.
The game character action control device 20 may be a desktop computer, a notebook computer, a palmtop computer, a cloud server, or another computing device. The game character action control device 20 may include, but is not limited to, the processor 21 and the memory 22. Those skilled in the art will appreciate that the schematic diagram is merely an example of the game character action control device 20 and does not constitute a limitation on it; the device may include more or fewer components than shown, combine certain components, or use different components. For example, the game character action control device 20 may also include input and output devices, network access devices, buses, and the like.
The Processor 21 may be a Central Processing Unit (CPU), another general purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. The general purpose processor may be a microprocessor, or the processor may be any conventional processor. The processor 21 is the control center of the game character action control device 20 and connects the various parts of the entire device by means of various interfaces and lines.
The memory 22 may be used to store the computer programs and/or modules, and the processor 21 implements the various functions of the game character action control device 20 by running or executing the computer programs and/or modules stored in the memory 22 and calling the data stored in the memory 22. The memory 22 may mainly include a program storage area and a data storage area: the program storage area may store an operating system and the application programs required by at least one function (such as a sound playing function or an image playing function), and the data storage area may store data created according to the use of the device (such as audio data or a phone book). In addition, the memory 22 may include high-speed random access memory, and may also include non-volatile memory, such as a hard disk, a memory, a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, a flash card, at least one magnetic disk storage device, a flash memory device, or another non-volatile solid-state storage device.
The integrated modules/units of the game character action control device 20, if implemented in the form of software functional units and sold or used as independent products, may be stored in a computer-readable storage medium. Based on this understanding, all or part of the flow of the methods of the above embodiments may be implemented by a computer program, which may be stored in a computer-readable storage medium and, when executed by the processor 21, implements the steps of the above method embodiments. The computer program comprises computer program code, which may be in the form of source code, object code, an executable file, some intermediate form, or the like. The computer-readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash disk, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a Read-Only Memory (ROM), a Random Access Memory (RAM), an electrical carrier signal, a telecommunications signal, a software distribution medium, and the like.
It should be noted that the above-described device embodiments are merely illustrative, where the units described as separate parts may or may not be physically separate, and the parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. In addition, in the drawings of the embodiment of the apparatus provided by the present invention, the connection relationship between the modules indicates that there is a communication connection between them, and may be specifically implemented as one or more communication buses or signal lines. One of ordinary skill in the art can understand and implement it without inventive effort.
While the foregoing is directed to the preferred embodiment of the present invention, it will be understood by those skilled in the art that various changes and modifications may be made without departing from the spirit and scope of the invention.

Claims (10)

1. A game character action control method, characterized by comprising the following steps:
acquiring current game state information of a local game character, and acquiring current game state information of an enemy game character in a special game state;
predicting position information of the enemy game character at the next moment according to the current game state information of the enemy game character;
inputting the current game state information of the local game character and the predicted position information of the enemy game character into an AI model, and determining the predicted action output strategy currently to be taken by the game character;
and controlling the game character to output the predicted action according to the predicted action output strategy.
2. The game character action control method according to claim 1, wherein the predicted action output strategy includes at least one of a strategy for determining an attack target, a strategy for determining a game action, and a strategy for determining a movement path.
3. The game character action control method according to claim 1, wherein controlling the game character to output the predicted action according to the predicted action output strategy includes:
when unreasonable actions exist in the predicted actions, an action limiting mechanism corresponding to the predicted actions is obtained;
performing action limiting operation on the unreasonable action according to the action limiting mechanism so as to adjust the sampling probability of the unreasonable action;
and outputting the predicted action after the action limiting operation is finished.
4. The game character action control method according to claim 3, wherein after outputting the predicted action after the action limiting operation is completed, the method further comprises:
and inputting the predicted action after the action limiting operation is finished into the AI model as sample data so as to train the AI model.
5. The game character action control method according to claim 1, wherein the predicted action is a pitch angle of the game character; in this case, controlling the game character to output the predicted action according to the predicted action output strategy further comprises:
acquiring an application scenario in which the game character uses a pre-aiming skill;
when the game character uses the pre-aiming skill without seeing an attack target, outputting the angle predicted by the AI model as the pitch angle;
when the game character uses the pre-aiming skill and sees the attack target, calculating the included angle between the game character and the attack target as the pitch angle.
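
A minimal sketch of the pitch-angle branch in claim 5, assuming 3-D positions with a vertical z axis; the elevation formula and all names are illustrative assumptions, not taken from the patent.

    import math

    def pitch_angle(sees_target: bool, model_angle: float,
                    own_pos, target_pos) -> float:
        # While the attack target is unseen, trust the AI model's predicted angle;
        # once the target is visible, compute the included angle to it directly.
        if not sees_target:
            return model_angle
        dx = target_pos[0] - own_pos[0]
        dy = target_pos[1] - own_pos[1]
        dz = target_pos[2] - own_pos[2]  # height difference to the target
        return math.atan2(dz, math.hypot(dx, dy))  # elevation angle in radians
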
6. The game character action control method according to claim 1, wherein the AI model includes a critic network and an actor network, and the predicted action is output through the actor network.
7. The game character action control method according to any one of claims 1 to 6, wherein the AI model is trained by a preset AI model training method, the AI model training method comprising:
acquiring sample data, wherein the sample data comprises game state data and an actual reward of the game character;
inputting the game state data into the AI model to generate a predicted reward of the game character;
calculating a loss function of the AI model according to the predicted reward and the actual reward, wherein the actual reward is calculated through a preset reward mechanism;
and optimizing the AI model according to the loss function until the loss function converges.
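
For claim 7, a minimal training-step sketch using PyTorch, assuming a mean-squared-error loss between predicted and actual reward; the network shape, the MSE choice, and all names are assumptions, as the claim only requires some loss over the reward discrepancy.

    import torch
    import torch.nn as nn

    class RewardCritic(nn.Module):  # hypothetical critic predicting the reward
        def __init__(self, state_dim: int):
            super().__init__()
            self.net = nn.Sequential(nn.Linear(state_dim, 64), nn.ReLU(),
                                     nn.Linear(64, 1))

        def forward(self, state: torch.Tensor) -> torch.Tensor:
            return self.net(state).squeeze(-1)  # predicted reward per sample

    def train_step(model, optimizer, states, actual_rewards) -> float:
        # Loss per claim 7: discrepancy between predicted and actual reward;
        # the actual reward comes from the preset reward mechanism.
        predicted = model(states)
        loss = nn.functional.mse_loss(predicted, actual_rewards)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        return loss.item()  # optimize repeatedly until this converges

    # model = RewardCritic(state_dim=16)
    # optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
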
8. A game character action control device, characterized by comprising:
a game state information acquisition module, configured to acquire current game state information of a game character and acquire current game state information of an enemy game character that is in a special game state;
a predicted position information acquisition module, configured to predict position information of the enemy game character at the next moment according to the current game state information of the enemy game character;
a predicted action output strategy determination module, configured to input the current game state information of the game character and the predicted position information of the enemy game character into an AI model and determine a predicted action output strategy to be currently taken by the game character;
and a predicted action output module, configured to control the game character to output a predicted action according to the predicted action output strategy.
9. A game character action control apparatus, comprising a processor, a memory, and a computer program stored in the memory and configured to be executed by the processor, wherein the processor implements the game character action control method according to any one of claims 1 to 7 when executing the computer program.
10. A computer-readable storage medium comprising a stored computer program, wherein, when the computer program runs, a device on which the computer-readable storage medium is located is controlled to execute the game character action control method according to any one of claims 1 to 7.
CN202110769501.6A 2021-07-07 2021-07-07 Game character action control method, apparatus, device and storage medium Active CN113577769B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110769501.6A CN113577769B (en) 2021-07-07 2021-07-07 Game character action control method, apparatus, device and storage medium


Publications (2)

Publication Number Publication Date
CN113577769A true CN113577769A (en) 2021-11-02
CN113577769B CN113577769B (en) 2024-03-08

Family

ID=78246226

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110769501.6A Active CN113577769B (en) 2021-07-07 2021-07-07 Game character action control method, apparatus, device and storage medium

Country Status (1)

Country Link
CN (1) CN113577769B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106390456A (en) * 2016-09-30 2017-02-15 腾讯科技(深圳)有限公司 Generating method and generating device for role behaviors in game
KR20180083703A (en) * 2017-01-13 2018-07-23 주식회사 엔씨소프트 Method of decision making for a fighting action game character based on artificial neural networks and computer program therefor
CN111632379A (en) * 2020-04-28 2020-09-08 腾讯科技(深圳)有限公司 Game role behavior control method and device, storage medium and electronic equipment
KR102154828B1 (en) * 2019-05-13 2020-09-10 숭실대학교산학협력단 Method and apparatus for predicting game results
CN112221120A (en) * 2020-10-15 2021-01-15 网易(杭州)网络有限公司 Game state synchronization method and device, electronic equipment and storage medium


Also Published As

Publication number Publication date
CN113577769B (en) 2024-03-08

Similar Documents

Publication Publication Date Title
KR102523888B1 (en) Method, Apparatus and Device for Scheduling Virtual Objects in a Virtual Environment
WO2021213026A1 (en) Virtual object control method and apparatus, and device and storage medium
CN108920221B (en) Game difficulty adjusting method and device, electronic equipment and storage medium
CN109529352B (en) Method, device and equipment for evaluating scheduling policy in virtual environment
CN113440846B (en) Game display control method and device, storage medium and electronic equipment
CN111672116B (en) Method, device, terminal and storage medium for controlling virtual object release technology
CN113134233B (en) Control display method and device, computer equipment and storage medium
US20230336792A1 (en) Display method and apparatus for event livestreaming, device and storage medium
CN111111204A (en) Interactive model training method and device, computer equipment and storage medium
JP7447296B2 (en) Interactive processing method, device, electronic device and computer program for virtual tools
CN112221152A (en) Artificial intelligence AI model training method, device, equipment and medium
JP2023533078A (en) Automatic harassment monitoring system
US20200384359A1 (en) Latency erasure
EP3995190A1 (en) Virtual environment image display method and apparatus, device and medium
CN113509726B (en) Interaction model training method, device, computer equipment and storage medium
Lopez-Gordo et al. Performance prediction at single-action level to a first-person shooter video game
CN111265871A (en) Virtual object control method and device, equipment and storage medium
CN113577769B (en) Game character action control method, apparatus, device and storage medium
CN114935893B (en) Motion control method and device for aircraft in combat scene based on double-layer model
CN113663335A (en) AI model training method, device, equipment and storage medium for FPS game
WO2023138175A1 (en) Card placing method and apparatus, device, storage medium and program product
JP6843410B1 (en) Programs, information processing equipment, methods
CN112933600B (en) Virtual object control method, device, computer equipment and storage medium
US20240123341A1 (en) Method, apparatus, electronic device and storage medium for combat control
CN114254722B (en) Multi-intelligent-model fusion method for game confrontation

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant