WO2023071221A1 - Interaction method and apparatus in game, computer device, storage medium, computer program and computer program product - Google Patents

Interaction method and apparatus in game, computer device, storage medium, computer program and computer program product

Info

Publication number
WO2023071221A1
WO2023071221A1 (PCT/CN2022/098707)
Authority
WO
WIPO (PCT)
Prior art keywords
game
target
data
state information
type
Prior art date
Application number
PCT/CN2022/098707
Other languages
English (en)
French (fr)
Inventor
孙阳霆
周航
刘宇
Original Assignee
上海商汤智能科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 上海商汤智能科技有限公司
Publication of WO2023071221A1

Classifications

    • A HUMAN NECESSITIES
    • A63 SPORTS; GAMES; AMUSEMENTS
    • A63F CARD, BOARD, OR ROULETTE GAMES; INDOOR GAMES USING SMALL MOVING PLAYING BODIES; VIDEO GAMES; GAMES NOT OTHERWISE PROVIDED FOR
    • A63F13/00 Video games, i.e. games using an electronically generated display having two or more dimensions
    • A63F13/50 Controlling the output signals based on the game progress
    • A63F13/53 Controlling the output signals based on the game progress involving additional visual information provided to the game scene, e.g. by overlay to simulate a head-up display [HUD] or displaying a laser sight in a shooting game
    • A63F13/537 Controlling the output signals based on the game progress involving additional visual information provided to the game scene, e.g. by overlay to simulate a head-up display [HUD] or displaying a laser sight in a shooting game using indicators, e.g. showing the condition of a game character on screen
    • A63F13/5372 Controlling the output signals based on the game progress involving additional visual information provided to the game scene, e.g. by overlay to simulate a head-up display [HUD] or displaying a laser sight in a shooting game using indicators, e.g. showing the condition of a game character on screen for tagging characters, objects or locations in the game scene, e.g. displaying a circle under the character controlled by the player
    • A HUMAN NECESSITIES
    • A63 SPORTS; GAMES; AMUSEMENTS
    • A63F CARD, BOARD, OR ROULETTE GAMES; INDOOR GAMES USING SMALL MOVING PLAYING BODIES; VIDEO GAMES; GAMES NOT OTHERWISE PROVIDED FOR
    • A63F13/00 Video games, i.e. games using an electronically generated display having two or more dimensions
    • A63F13/55 Controlling game characters or game objects based on the game progress
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Definitions

  • The present disclosure relates to, but is not limited to, the technical field of deep learning, and in particular to an interaction method and apparatus in a game, a computer device, a storage medium, a computer program, and a computer program product.
  • AI (Artificial Intelligence)
  • Embodiments of the present disclosure at least provide an interaction method, device, computer equipment, storage medium, computer program, and computer program product in a game.
  • An embodiment of the present disclosure provides an interaction method in a game, which is applied to an agent; the game includes a plurality of virtual objects; the plurality of virtual objects include a target virtual object controlled by the agent;
  • the method includes: obtaining current game state information; the current game state information includes at least one of the following: the current state information of the target virtual object, the current state information of the non-target virtual object, and the current state information of the game scene;
  • The trained target neural network performs interactive action prediction processing based on the current game state information to obtain the target time at which the target virtual object is to perform an interactive action and the target type; in response to the current time reaching the target time, the target virtual object is controlled to perform an interactive action of the target type.
  • In this way, the target time and target type of the interactive action are predicted, and after the current time reaches the target time, the target virtual object is controlled to perform the interactive action of the target type, so that the intention of the operation can be communicated to other agents or players; mutual cooperation between players and agents and among agents is realized, and the degree of cooperation during the game is improved.
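  • As a concrete illustration, the following minimal sketch shows this observe-predict-act loop in Python. The `game` handle, its methods, and `target_net.predict` are hypothetical stand-ins for the surrounding game engine and the trained target neural network, not APIs defined by this disclosure:

```python
from dataclasses import dataclass

@dataclass
class Prediction:
    target_time: float  # game time at which the action should be performed
    target_type: str    # e.g. "retreat", "attack", "defense", "request_support"

def agent_step(game, target_net):
    """One agent tick: observe the game, predict, and act when the target time arrives."""
    state = game.get_state()                      # current game state information (S101)
    pred: Prediction = target_net.predict(state)  # interactive-action prediction (S102)
    if game.current_time() >= pred.target_time:   # current time reaches the target time (S103)
        game.perform_interaction(pred.target_type)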
  • The current state information of the target virtual object includes at least one of the following: the first virtual resource type, first virtual resource amount, first building state, first skill state, first position in the game scene, first life value, first mana value, first camp information, first gain data, and first debuff data of the target virtual object;
  • the current state information of the non-target virtual object includes at least one of the following: the second virtual resource type, second virtual resource amount, second building state, second skill state, second position in the game scene, second life value, second mana value, second camp information, second gain data, second debuff data, type information, and interactive action information of the non-target virtual object;
  • the current state information of the game scene includes at least one of the following: the visible area information of the target virtual object, the third virtual resource type, the virtual resource location, and the remaining time for refreshing virtual resources.
  • The interactive action includes at least one of the following: marking an action on the map of the game scene, sending voice interaction information to non-target virtual objects of the same camp, and sending text interaction information to non-target virtual objects of the same camp; the type of the interactive action includes at least one of the following: retreat, attack, defense, and request for support.
  • The target neural network is trained in the following manner: determining target game matches from first game matches in which multiple players respectively participate, based on at least one of the players' levels in the game matches, the game match duration, and the number of occurrences of interactive actions; determining, based on the target game matches, first game data respectively corresponding to at least one type of interactive action to be predicted, where the first game data includes at least one frame of target game state information between a first time point and a second time point in the target game match, the first time point being earlier than the occurrence time of the interactive action and the second time point being the same as or later than the occurrence time of the interactive action; and training an original neural network using the first game data as sample data to obtain the target neural network.
  • In this way, target game matches of higher game quality are selected, which improves the quality of the sample data and thus the quality of the resulting target neural network.
  • Before determining the first game data respectively corresponding to at least one type of interactive action to be predicted based on the target game matches, the method further includes: counting the occurrences of the different types of interactive actions appearing in the target game matches to obtain the total number of occurrences corresponding to each type; and determining the types to be predicted based on these total numbers of occurrences.
  • Determining the first game data respectively corresponding to at least one type of interactive action to be predicted based on the target game matches includes: for each target game match, determining the occurrence time of the interactive action of the type to be predicted in that match; determining the first time point and the second time point based on the occurrence time; and, based on the first time point and the second time point, intercepting the target game state information of the interactive action of the type to be predicted in each target game match from the original game data corresponding to that match, where the original game data corresponding to a target game match includes multiple frames of game state information.
  • Using the first game data as sample data to train the original neural network to obtain the target neural network includes: using the first game data as sample data and the to-be-predicted type of the interactive action corresponding to the first game data as supervision data, performing supervised training on the original neural network to obtain a plurality of initialized neural networks, where different initialized neural networks use different training parameters; forming different agents based on the different initialized neural networks and using them to control virtual characters to play games, obtaining second original game data; performing reinforcement training on the corresponding initialized neural networks using the second original game data, obtaining candidate neural networks respectively corresponding to the initialized neural networks; and determining at least one target neural network from the candidate neural networks based on performance information respectively corresponding to the candidate neural networks.
  • In this way, the target neural network is first trained by supervised learning, so that it can learn the different types of interactive actions of human players, for example attack, retreat, defense, and request-for-support actions; then multi-agent reinforcement learning is used to further train the target neural network, which increases the cooperative awareness of the finally generated target neural network and gives it better combat capability.
  • the performance information respectively corresponding to the candidate neural networks includes: game scores when using the initialized neural networks corresponding to the candidate neural networks to play games.
  • An embodiment of the present disclosure provides an interaction device in a game, applied to an agent; the game includes a plurality of virtual objects, which include a target virtual object controlled by the agent.
  • The device includes: an acquisition part configured to acquire current game state information, where the current game state information includes at least one of the following: the current state information of the target virtual object, the current state information of the non-target virtual object, and the current state information of the game scene;
  • a processing part configured to use a pre-trained target neural network to perform interactive action prediction processing based on the current game state information, obtaining the target time at which the target virtual object is to perform an interactive action and the target type; and a control part configured to control the target virtual object to perform the interactive action of the target type in response to the current time reaching the target time.
  • The current state information of the target virtual object includes at least one of the following: the first virtual resource type, first virtual resource amount, first building state, first skill state, first position in the game scene, first life value, first mana value, first camp information, first gain data, and first debuff data of the target virtual object;
  • the current state information of the non-target virtual object includes at least one of the following: the second virtual resource type, second virtual resource amount, second building state, second skill state, second position in the game scene, second life value, second mana value, second camp information, second gain data, second debuff data, type information, and interactive action information of the non-target virtual object;
  • the current state information of the game scene includes at least one of the following: the visible area information of the target virtual object, the third virtual resource type, the virtual resource location, and the remaining time for refreshing virtual resources.
  • The interactive action includes at least one of the following: marking an action on the map of the game scene, sending voice interaction information to non-target virtual objects of the same camp, and sending text interaction information to non-target virtual objects of the same camp; the type of the interactive action includes at least one of the following: retreat, attack, defense, and request for support.
  • The processing part is further configured to train the target neural network in the following manner: determining target game matches from first game matches in which multiple players respectively participate, based on at least one of the players' levels in the game matches, the game match duration, and the number of occurrences of interactive actions; and determining, based on the target game matches, first game data respectively corresponding to at least one type of interactive action to be predicted;
  • the first game data includes at least one frame of target game state information between a first time point and a second time point in the target game match, where the first time point is earlier than the occurrence time of the interactive action and the second time point is the same as or later than the occurrence time of the interactive action; the first game data is used as sample data to train an original neural network to obtain the target neural network.
  • Before determining the first game data respectively corresponding to at least one type of interactive action to be predicted based on the target game matches, the processing part is further configured to: count the occurrences of the different types of interactive actions appearing in the target game matches to obtain the total number of occurrences corresponding to each type; and determine the types to be predicted based on these total numbers of occurrences.
  • When determining the first game data respectively corresponding to at least one type of interactive action to be predicted based on the target game matches, the processing part is further configured to: for each target game match, determine the occurrence time of the interactive action of the type to be predicted in that match; determine the first time point and the second time point based on the occurrence time; and, based on the first time point and the second time point, intercept the target game state information of the interactive action of the type to be predicted in each target game match from the original game data corresponding to that match, where the original game data corresponding to a target game match includes multiple frames of game state information.
  • When using the first game data as sample data to train the original neural network to obtain the target neural network, the processing part is further configured to: use the first game data as sample data and the to-be-predicted type of the interactive action corresponding to the first game data as supervision data, and perform supervised training on the original neural network to obtain a plurality of initialized neural networks, where different initialized neural networks use different training parameters; form different agents based on the different initialized neural networks and use them to control virtual characters to play games, obtaining second original game data; perform reinforcement training on the corresponding initialized neural networks using the second original game data, obtaining candidate neural networks respectively corresponding to the initialized neural networks; and determine at least one target neural network from the candidate neural networks based on performance information respectively corresponding to the candidate neural networks.
  • the performance information respectively corresponding to the candidate neural networks includes: game scores when using the initialized neural networks corresponding to the candidate neural networks to play games.
  • An embodiment of the present disclosure provides a computer device, including a processor and a memory, where the memory stores machine-readable instructions executable by the processor; when the machine-readable instructions are executed by the processor, the processor performs some or all of the steps of the above method.
  • An embodiment of the present disclosure provides a computer-readable storage medium on which a computer program is stored; when the computer program is executed, some or all of the steps of the above method are performed.
  • An embodiment of the present disclosure provides a computer program including computer-readable code; when the computer-readable code runs in a computer device, a processor in the computer device performs some or all of the steps of the above method.
  • An embodiment of the present disclosure provides a computer program product; the computer program product includes a non-transitory computer-readable storage medium storing a computer program; when the computer program is read and executed by a computer, some or all of the steps of the above method are implemented.
  • FIG. 1 is a schematic diagram of an implementation flow of an interaction method in a game provided by an embodiment of the present disclosure
  • FIG. 2 is a schematic diagram of a display interface after marking actions in the game scene map in the interaction method in the game provided by the embodiment of the present disclosure
  • FIG. 3 is a schematic diagram of the implementation flow of a training method for a target neural network in an interaction method in a game provided by an embodiment of the present disclosure
  • FIG. 4 is a schematic diagram of the composition and structure of an interactive device in a game provided by an embodiment of the present disclosure
  • FIG. 5 is a schematic diagram of the composition and structure of a computer device provided by an embodiment of the present disclosure.
  • The agent has a certain degree of decision-making intelligence; using this decision-making intelligence, it can control virtual objects in the game to play together with players.
  • In the related art, game partners are usually played by hard-coded robots (bots).
  • RTS (Real-Time Strategy)
  • MOBA (Multiplayer Online Battle Arena)
  • To this end, an embodiment of the present disclosure provides an interaction method in a game that uses a pre-trained target neural network to predict, based on at least one of the current state information of the target virtual object, the current state information of non-target virtual objects, and the current state information of the game scene during the game, the target time at which the target virtual object is to perform an interactive action and the target type, and controls the target virtual object to perform the interactive action of the target type after the current time reaches the target time. The intention of the operation can thus be communicated to other agents or players, mutual cooperation between players and agents and among agents is realized, and the degree of cooperation during the game is improved.
  • The execution subject of the interaction method provided by the embodiments of the present disclosure is generally a computer device with certain computing power, including, for example, a terminal device, a server, or other processing device; the terminal device may be a user equipment (UE), a mobile device, a user terminal, a terminal, a cellular phone, a cordless phone, a personal digital assistant (PDA), a handheld device, a computing device, a vehicle-mounted device, a wearable device, etc.
  • the interaction method in the game may be implemented by the processor invoking computer-readable instructions stored in the memory.
  • Fig. 1 is a schematic diagram of the implementation flow of an interaction method in a game provided by an embodiment of the present disclosure. As shown in Fig. 1, the method is applied to an agent; the game includes a plurality of virtual objects, which include a target virtual object controlled by the agent; the method includes steps S101 to S103, wherein:
  • The agent includes an artificial intelligence (AI) generated by training with deep learning and reinforcement learning techniques, which can control the target virtual object in the game to play the game.
  • The game contains multiple virtual objects, including a target virtual object and non-target virtual objects; the target virtual object is the virtual object controlled by the agent in the game;
  • a non-target virtual object may include, but is not limited to, a virtual object controlled by a human player, a virtual object controlled by another agent, or a non-player virtual character in the game, where the non-player virtual character may include, but is not limited to, non-player characters (NPCs) in the game scene, etc.
  • The current game state information includes, but is not limited to, at least one of the following: current state information of the target virtual object, current state information of non-target virtual objects, and current state information of the game scene; the current state information of the target virtual object includes, but is not limited to, at least one of the following: the first virtual resource type, first virtual resource amount, first building state, first skill state, first position in the game scene, first life value, first mana value, first camp information, first gain (buff) data, first debuff data, etc. of the target virtual object.
  • The first virtual resource type may include, but is not limited to, at least one of gold coins owned by the target virtual object for purchasing items such as props, building materials owned by the target virtual object, buildings owned by the target virtual object, types of combat units owned by the target virtual object, etc.; the first virtual resource amount may include the quantity of the corresponding virtual resources, for example at least one of the number of gold coins, the amount of building materials, the number of buildings, and the number of combat units owned by the target virtual object; the first building state includes the state of the buildings owned by the target virtual object, which may include, but is not limited to, at least one of a completed state, an unconstructed state, an under-construction state, a damage state, remaining hit points, etc.; the first skill state may include, but is not limited to, the number of skills possessed by the target virtual object, the skill level, whether a skill is cooling down, the skill type, etc.
  • the first skill state also includes the time the skill needs to be cooled;
  • the skill type can include at least one of attack skills, defensive skills, etc.;
  • The first position in the game scene includes the position information of the target virtual object in the game scene;
  • the first camp information may include, but not limited to: the camp to which the target virtual object belongs, for example including: friendly camp, enemy camp, neutral camp, etc.;
  • The first gain data (buff) may include, but is not limited to, at least one of the target virtual object's life value gain data, mana value gain data, status gains, etc.;
  • the first debuff data may include, but is not limited to, at least one of the target virtual object's life value debuff data, mana value debuff data, status debuffs, etc.
  • the current state information of the non-target virtual object may include but not limited to at least one of the following: the second virtual resource type, the second virtual resource amount, the second building state, the second skill state, The second position, the second life value, the second mana value, the second camp information, the second gain data, the second debuff data, type information, and interactive action information in the game scene;
  • The second virtual resource type may include, but is not limited to, gold coins owned by the non-target virtual object for purchasing items such as props, building materials owned by the non-target virtual object, buildings owned by the non-target virtual object, combat units owned by the non-target virtual object, etc.;
  • the second virtual resource amount may include the quantity of the corresponding virtual resources, for example the number of gold coins, the amount of building materials, the number of buildings, and the number of combat units owned by the non-target virtual object.
  • the second skill state also includes the time the skill needs to be cooled;
  • the skill type can include at least one of attack skills, defensive skills, etc.;
  • The second position in the game scene includes the position information of the non-target virtual object in the game scene;
  • the second camp information may include, but not limited to: the camp to which the non-target virtual object belongs, for example, may include but not limited to friendly camp, enemy camp, neutral camp, etc.
  • the second gain data (buff) may include but not limited to: at least one of the non-target virtual object's life value gain data, mana value gain data, state gain, etc.
  • The second debuff data may include, but is not limited to, at least one of the non-target virtual object's life value debuff data, mana value debuff data, status debuffs, etc.
  • The type information includes at least one of a game character controlled by a player, a minion in the game scene, and a wild monster in the game scene; here, the game character controlled by the player and the minions in the game scene may belong to the same camp or to different camps, but the wild monsters in the game scene belong to a different, hostile camp from both the game character controlled by the player and the minions in the game scene;
  • The interactive action information may include, but is not limited to, at least one of the moment at which the interactive action is issued and the type of the interactive action. The type of the interactive action here may include, but is not limited to, at least one of a retreat interaction used to notify teammates to retreat, an attack interaction used to notify teammates to attack, a defense interaction used to notify teammates to defend, and a request-for-support interaction used to request support from teammates.
  • the current state information of the game scene may include, but not limited to, at least one of the following: the visible area information of the target virtual object, the third virtual resource type, the position of the virtual resource, and the remaining time for refreshing the virtual resource, etc.;
  • the visible area information of the target virtual object may include but not limited to: the game map area located within the current visible range of the target virtual character;
  • The third virtual resource type may include, but is not limited to, at least one of unoccupied buildings, unrecruited combat units, uncollected resources, etc. in the game scene;
  • the virtual resource location may include, but is not limited to, the position of the above resources in the game scene, for example at least one of the position of an unoccupied building, the position of unrecruited combat units, the position of uncollected resources, etc.;
  • the remaining time for refreshing virtual resources may include, but is not limited to, at least one of the time until resources in the game scene become available for collection, the refresh time of resources in the game scene, etc.
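  • To make the above concrete, the following sketch shows one plausible way to flatten such state information into a numeric feature vector for the network input; the dictionary field names are hypothetical and chosen only to mirror the items listed above:

```python
import numpy as np

def encode_object_state(obj):
    """Flatten one virtual object's state into numeric features.
    `obj` is a hypothetical dict mirroring the fields described above."""
    return np.array([
        obj["gold"],                              # virtual resource amount
        obj["position"][0], obj["position"][1],   # position in the game scene
        obj["hp"], obj["mana"],                   # life value, mana value
        obj["camp_id"],                           # camp information
        len(obj["buffs"]), len(obj["debuffs"]),   # gain / debuff data
        float(obj["skill_ready"]),                # skill state (off cooldown or not)
    ], dtype=np.float32)

def encode_game_state(target, others, scene):
    """Concatenate target-object, non-target-object, and scene features."""
    parts = [encode_object_state(target)]
    parts += [encode_object_state(o) for o in others]
    parts.append(np.array([scene["resource_refresh_remaining"]], dtype=np.float32))
    return np.concatenate(parts)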
  • After the current game state information is obtained, the embodiments of the present disclosure proceed to the following step S102:
  • The interactive action may include, but is not limited to, at least one of the following: marking an action on the map of the game scene, sending voice interaction information to non-target virtual objects of the same camp, and sending text interaction information to non-target virtual objects of the same camp; here, voice interaction information and text interaction information can be sent to non-target virtual objects of the same camp by at least one of broadcasting or private chat.
  • A schematic diagram of the display interface after an action is marked on the game scene map is shown in Figure 2: in the game scene 10, the defense mark point 11 is marked with a black dot to remind players and agents in the game to defend the position of the defense mark point 11.
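  • The four interaction types and the delivery channels described above can be captured in a small dispatch helper; `game.mark_map` and `game.send_text_to_camp` are assumed, hypothetical engine calls, not APIs defined by this disclosure:

```python
from enum import Enum

class InteractionType(Enum):
    RETREAT = "retreat"
    ATTACK = "attack"
    DEFENSE = "defense"
    REQUEST_SUPPORT = "request_support"

def perform_interaction(game, action_type: InteractionType, position=None):
    """Dispatch a predicted interaction via the channels named above."""
    if position is not None:
        # e.g. place a defense mark point on the game scene map (cf. Figure 2)
        game.mark_map(position, label=action_type.value)
    # broadcast a text interaction to non-target virtual objects of the same camp
    game.send_text_to_camp(f"[{action_type.value}]")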
  • The target neural network can be obtained by training in the following manner: determining target game matches from first game matches in which multiple players respectively participate, based on at least one of the players' levels in the game matches, the game match duration, and the number of occurrences of interactive actions; determining, based on the target game matches, first game data respectively corresponding to at least one type of interactive action to be predicted; and training the original neural network using the first game data as sample data to obtain the target neural network.
  • the first game data may include but not limited to: in the target game match, at least one frame of target game state information between the first time point and the second time point;
  • The first time point is a time point earlier than the occurrence time of the interactive action, and the second time point is a time point the same as or later than the occurrence time of the interactive action;
  • The target game state information may include, but is not limited to, at least one of the state information of the target virtual object, the state information of non-target virtual objects, the state information of the game scene, etc. For the target game state information here, refer to the description of the current game state information in S101.
  • The first game matches can be screened based on, but not limited to, at least one of the following criteria A1 to A6 to select high-quality target game matches:
  • A1: Based on the battle levels corresponding to the first game matches, select first game matches whose battle level is greater than a preset battle level threshold as target game matches. The battle level corresponding to each first game match can be determined according to the levels of the virtual objects controlled by the multiple players in that match; the preset battle level threshold can be set as required and is not limited here.
  • The levels of the virtual objects controlled by the multiple players in each match may be determined according to the game scores obtained by those virtual objects in the match.
  • For example, suppose the levels of the virtual objects controlled by the three players on our side are 50, 60, and 66, and the levels of the virtual objects controlled by the three players on the enemy side are 55, 58, and 70; the battle level corresponding to the match can then be determined from these levels to be 60. Since the battle level of 60 is higher than the preset battle level threshold of 50, the match is determined to be a high-quality game match and can be used as a target game match.
  • A2: According to the battle levels corresponding to the first game matches, rank the first game matches in descending order of battle level, and take the first game matches ranked before a first preset rank as target game matches.
  • The first preset rank can be set according to actual needs and is not limited here.
  • For example, the first game matches whose battle levels rank in the top 9 may be used as target game matches.
  • A3: Based on the game match durations corresponding to the first game matches, select first game matches whose duration is greater than a preset duration threshold as target game matches. The preset game duration threshold can be set as required and is not limited here.
  • For example, the first game matches whose duration exceeds 20 minutes may be used as target game matches.
  • A4: According to the game match durations corresponding to the first game matches, rank the first game matches in descending order of duration, and take the first game matches ranked before a second preset rank as target game matches.
  • The second preset rank can be set according to actual needs and is not limited here.
  • For example, the first game matches whose durations rank in the top 7 may be used as target game matches.
  • A5: Based on the number of occurrences of interactive actions corresponding to the first game matches, select first game matches in which the number of occurrences of interactive actions is greater than a first preset times threshold as target game matches.
  • the first preset times threshold can be set according to actual needs, and there is no limitation here.
  • For example, interactive action recognition can be performed on each first game match to determine the number of interactive actions in it; if the first preset times threshold is 5, the first game matches in which interactive actions appear more than 5 times can be selected as target game matches.
  • A6: According to the number of occurrences of interactive actions corresponding to the first game matches, rank the first game matches in descending order of the number of interactive actions, and take the first game matches ranked before a third preset rank as target game matches.
  • The third preset rank can be set according to actual needs and is not limited here.
  • For example, if the third preset rank is 6, the first game matches in which the number of occurrences of interactive actions ranks in the top 5 may be used as target game matches.
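  • The following sketch combines the threshold variants (A1, A3, A5) with an optional top-k ranking (A2/A4/A6); the field names and the illustrative thresholds (battle level 50, 20 minutes, 5 occurrences) follow the examples above and are otherwise assumptions:

```python
def battle_level(match):
    """Battle level of one match, taken here as the mean level of all players'
    virtual objects (the worked example above yields approximately 60)."""
    levels = match["ally_levels"] + match["enemy_levels"]
    return sum(levels) / len(levels)

def select_target_matches(matches, level_thresh=50, duration_thresh=20 * 60,
                          action_thresh=5, top_k=None):
    """Screen first game matches into high-quality target game matches."""
    kept = [m for m in matches
            if battle_level(m) > level_thresh           # A1
            and m["duration_s"] > duration_thresh       # A3
            and m["interaction_count"] > action_thresh] # A5
    if top_k is not None:  # optional ranking variant (A2/A4/A6)
        kept = sorted(kept, key=battle_level, reverse=True)[:top_k]
    return kept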
  • The number of occurrences of the different types of interactive actions appearing in the target game matches can be counted to obtain the total number of occurrences corresponding to each type; based on these totals, the types to be predicted are determined.
  • interactive actions may include, but not limited to: attack, defense, request for support, retreat, and assembly, for example.
  • For example, based on the total numbers of occurrences corresponding to the different types of interactive actions, the types whose total number of occurrences is greater than a second preset times threshold can be selected as the types to be predicted.
  • the second preset times threshold can be set according to actual needs, and there is no limitation here; the first preset times threshold and the second preset times threshold can be the same or different.
  • Alternatively, the different types of interactive actions can be ranked in descending order of their total numbers of occurrences, and the types ranked before a fourth preset rank can be taken as the types to be predicted.
  • The fourth preset rank can be set according to actual needs and is not limited here; it should be noted that the first, second, third, and fourth preset ranks may be the same or different.
  • In addition, the importance of the different types of interactive actions appearing in the target game matches can be analyzed, and the types whose importance is greater than a preset importance threshold can be taken as the types to be predicted; the preset importance threshold can be set according to actual needs and is not limited here.
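  • A minimal sketch of this counting step, assuming each match records its interactions as (time, type) dictionaries; `min_total` stands in for the second preset times threshold and `top_k` for the fourth preset rank:

```python
from collections import Counter

def types_to_predict(target_matches, min_total=100, top_k=None):
    """Count each interaction type across the target matches and keep
    the frequent ones; the threshold values here are assumptions."""
    totals = Counter()
    for match in target_matches:
        for action in match["interactions"]:  # (time, type) records
            totals[action["type"]] += 1
    types = [t for t, n in totals.items() if n > min_total]
    if top_k is not None:                     # the ranking variant
        types = [t for t, _ in totals.most_common(top_k)]
    return types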
  • After the types of interactive actions to be predicted are determined, for each target game match, the occurrence time of the interactive action of the type to be predicted in that match is determined; based on the occurrence time, the first time point and the second time point are determined; and based on the first time point and the second time point, the target game state information of the interactive action of the type to be predicted in each target game match is intercepted from the original game data corresponding to that match. The original game data corresponding to a target game match includes multiple frames of game state information.
  • That is, the first time point and the second time point around each occurrence of an interactive action of the type to be predicted can be determined in each target game match, and the target game state information of that interactive action is determined based on the original game data between the first time point and the second time point in the match.
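  • A sketch of this interception step; the window offsets `before_s` and `after_s` are assumed values standing in for how the first and second time points are chosen around the occurrence time:

```python
def extract_first_game_data(match, predict_types, before_s=10.0, after_s=2.0):
    """Cut out the frames between the first and second time points around each
    occurrence of a to-be-predicted interactive action."""
    samples = []
    for action in match["interactions"]:
        if action["type"] not in predict_types:
            continue
        t0 = action["time"] - before_s  # first time point: earlier than the action
        t1 = action["time"] + after_s   # second time point: same as or later
        frames = [f for f in match["frames"] if t0 <= f["time"] <= t1]
        samples.append({"frames": frames, "label": action["type"]})
    return samples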
  • The original neural network can be trained in the following manner to obtain the target neural network: using the first game data as sample data and the to-be-predicted type of the interactive action corresponding to the first game data as supervision data, perform supervised training on the original neural network to obtain multiple initialized neural networks; form different agents based on the different initialized neural networks and use them to control virtual characters to play games, obtaining second original game data; perform reinforcement training on the corresponding initialized neural networks using the second original game data, obtaining candidate neural networks respectively corresponding to the initialized neural networks; and determine at least one target neural network from the candidate neural networks based on performance information respectively corresponding to the candidate neural networks.
  • The original neural network may include, but is not limited to, a composite network built from a convolutional neural network and a recurrent neural network; different initialized neural networks use different training parameters during training.
  • The second original game data includes the original game data generated when agents built on the initialized neural networks control virtual objects to play games.
  • The performance information corresponding to a candidate neural network may include, but is not limited to, the game scores achieved when the initialized neural network corresponding to that candidate neural network was used to play games.
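  • The disclosure does not fix a concrete architecture beyond this convolutional-plus-recurrent composition, so the following PyTorch sketch is only one plausible instantiation; the layer sizes, the 1-D convolution, the GRU, and the two output heads (a classification head for the interaction type and a regression head for the target time) are all assumptions:

```python
import torch.nn as nn

class InteractionNet(nn.Module):
    """One plausible CNN + RNN composition with two prediction heads."""
    def __init__(self, feat_dim=128, hidden=256, n_types=4):
        super().__init__()
        self.conv = nn.Sequential(            # per-frame feature extraction
            nn.Conv1d(feat_dim, hidden, kernel_size=3, padding=1),
            nn.ReLU(),
        )
        self.rnn = nn.GRU(hidden, hidden, batch_first=True)  # temporal aggregation
        self.type_head = nn.Linear(hidden, n_types)  # retreat/attack/defense/support
        self.time_head = nn.Linear(hidden, 1)        # offset until the target time

    def forward(self, x):                     # x: (batch, frames, feat_dim)
        h = self.conv(x.transpose(1, 2)).transpose(1, 2)  # (batch, frames, hidden)
        _, last = self.rnn(h)                 # last hidden state: (1, batch, hidden)
        last = last.squeeze(0)
        return self.type_head(last), self.time_head(last)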
  • The target game state information of at least one type of interactive action to be predicted in each target game match is used as sample data, the type of interactive action corresponding to the target game state information is used as supervision data, and supervised training is performed on the original neural network with different training parameters to obtain different initialized neural networks.
  • Then, different agents can be formed based on the different initialized neural networks and used to control virtual characters in game matches to generate the second original game data; the second original game data is used to perform reinforcement training on the initialized neural network that generated it, obtaining candidate neural networks respectively corresponding to the initialized neural networks. After the candidate neural networks are obtained, based on the game scores achieved when the corresponding initialized neural networks controlled virtual objects in play, the candidate neural networks whose game scores are greater than a preset score threshold (that is, the candidate neural networks that cooperate better during the game) can be selected as target neural networks; the preset score threshold can be set according to actual needs and is not limited here.
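  • Putting the two stages together, the following skeleton mirrors the supervised-then-reinforcement scheme; the training, rollout, and scoring callables are hypothetical stand-ins supplied by the surrounding system, and the score threshold of 0.5 is an arbitrary placeholder:

```python
def train_target_networks(first_game_data, configs,
                          supervised_train, play_games, reinforce_train,
                          evaluate_score, score_thresh=0.5):
    """Two-stage training sketch under the assumptions stated above."""
    # Stage 1: supervised learning on human interaction labels; different
    # training parameters per config yield different initialized networks.
    init_nets = [supervised_train(first_game_data, cfg) for cfg in configs]

    # Stage 2: agents built on each initialized network play matches to
    # produce second original game data, which drives reinforcement training.
    candidates = []
    for net in init_nets:
        second_game_data = play_games(net)  # self-play rollouts
        candidates.append(reinforce_train(net, second_game_data))

    # Select candidates whose game score (performance information) is high enough.
    return [net for net in candidates if evaluate_score(net) > score_thresh]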
  • FIG. 3 is a schematic diagram of the implementation flow of a method for training a target neural network provided in an embodiment of the present disclosure. As shown in FIG. 3, the training method includes the following steps S301 to S304:
  • S301: Determine target game matches from the first game matches in which multiple players respectively participate, based on at least one of the players' levels in the game matches, the game match duration, and the number of occurrences of interactive actions.
  • the current game state information can be input into the target neural network, and the target neural network is used to predict the target time and target type of the target virtual object to perform an interactive action.
  • For example, in an embodiment of the present disclosure, suppose the predicted target time for the target virtual object to perform the interactive action is a first moment and the target type of the interactive action is defense; then, after the current time reaches the first moment, the target virtual object is controlled to mark the defensive action on the map of the game scene.
  • In this way, the target time and target type of the interactive action to be performed by the target virtual object are predicted, and after the current time reaches the target time, the target virtual object is controlled to perform the interactive action of the target type, so that the operation intention can be communicated to other agents or players, mutual cooperation between players and agents and among agents is realized, and the degree of cooperation during the game is improved.
  • a simpler target neural network can be supported to improve the efficiency of agent learning and communication.
  • It should be noted that the writing order of the steps does not imply a strict execution order or constitute any limitation on the implementation process; the specific execution order of each step should be determined by its function and possible internal logic.
  • Based on the same concept, the embodiments of the present disclosure also provide an in-game interaction device corresponding to the in-game interaction method; since the principle by which the device solves the problem is similar to that of the above interaction method, the implementation of the device may refer to the implementation of the method.
  • FIG. 4 is a schematic diagram of an interaction device in a game provided by an embodiment of the present disclosure.
  • The device is applied to an agent, and the game includes a plurality of virtual objects; the plurality of virtual objects include a target virtual object controlled by the agent.
  • the device includes: an acquisition part 401, a processing part 402, and a control part 403; wherein,
  • the acquisition part 401 is configured to acquire current game state information;
  • The current game state information includes at least one of the following: current state information of the target virtual object, current state information of non-target virtual objects, and current state information of the game scene;
  • the processing part 402 is configured to use the pre-trained target neural network to perform interactive action prediction processing based on the current game state information, and obtain the target time and target type of the target virtual object to perform the interactive action;
  • the control part 403 is configured to control the target virtual object to perform the interactive action of the target type in response to the current time reaching the target time.
  • The current state information of the target virtual object includes at least one of the following: the first virtual resource type, first virtual resource amount, first building state, first skill state, first position in the game scene, first life value, first mana value, first camp information, first gain data, and first debuff data of the target virtual object;
  • the current state information of the non-target virtual object includes at least one of the following: the second virtual resource type, second virtual resource amount, second building state, second skill state, second position in the game scene, second life value, second mana value, second camp information, second gain data, second debuff data, type information, and interactive action information of the non-target virtual object;
  • the current state information of the game scene includes at least one of the following: the visible area information of the target virtual object, the third virtual resource type, the virtual resource location, and the remaining time for refreshing virtual resources.
  • The interactive action includes at least one of the following: marking an action on the map of the game scene, sending voice interaction information to non-target virtual objects of the same camp, and sending text interaction information to non-target virtual objects of the same camp; the type of the interactive action includes at least one of the following: retreat, attack, defense, and request for support.
  • The processing part 402 is further configured to train the target neural network in the following manner: determining target game matches from first game matches in which multiple players respectively participate, based on at least one of the players' levels in the game matches, the game match duration, and the number of occurrences of interactive actions; and determining, based on the target game matches, first game data respectively corresponding to at least one type of interactive action to be predicted;
  • the first game data includes at least one frame of target game state information between a first time point and a second time point in the target game match, where the first time point is earlier than the occurrence time of the interactive action and the second time point is the same as or later than the occurrence time of the interactive action; the first game data is used as sample data to train the original neural network to obtain the target neural network.
  • Before determining the first game data respectively corresponding to at least one type of interactive action to be predicted based on the target game matches, the processing part 402 is further configured to: count the occurrences of the different types of interactive actions appearing in the target game matches to obtain the total number of occurrences corresponding to each type; and determine the types to be predicted based on these total numbers of occurrences.
  • When determining the first game data respectively corresponding to at least one type of interactive action to be predicted based on the target game matches, the processing part 402 is further configured to: for each target game match, determine the occurrence time of the interactive action of the type to be predicted in that match; determine the first time point and the second time point based on the occurrence time; and, based on the first time point and the second time point, intercept the target game state information of the interactive action of the type to be predicted in each target game match from the original game data corresponding to that match, where the original game data corresponding to a target game match includes multiple frames of game state information.
  • When using the first game data as sample data to train the original neural network to obtain the target neural network, the processing part 402 is further configured to: use the first game data as sample data and the to-be-predicted type of the interactive action corresponding to the first game data as supervision data, and perform supervised training on the original neural network to obtain a plurality of initialized neural networks, where different initialized neural networks use different training parameters; form different agents based on the different initialized neural networks and use them to control virtual characters to play games, obtaining second original game data; perform reinforcement training on the corresponding initialized neural networks using the second original game data, obtaining candidate neural networks respectively corresponding to the initialized neural networks; and determine at least one target neural network from the candidate neural networks based on performance information respectively corresponding to the candidate neural networks.
  • the performance information respectively corresponding to the candidate neural networks includes: game scores when using the initialized neural networks corresponding to the candidate neural networks to play games.
  • a "part" may be a part of a circuit, a part of a processor, a part of a program or software, etc., of course, it may also be a unit, and it may also be a module or a non-module of.
  • FIG. 5 is a schematic structural diagram of a computer device 500 provided by an embodiment of the present disclosure, including a processor 501, a memory 502, and a bus 503.
  • the memory 502 is used to store execution instructions, including a memory 5021 and an external memory 5022; the memory 5021 here is also called an internal memory, and is used to temporarily store calculation data in the processor 501 and exchange data with an external memory 5022 such as a hard disk.
  • the processor 501 exchanges data with the external memory 5022 through the memory 5021.
  • the processor 501 communicates with the memory 502 through the bus 503, so that the processor 501 executes the following instructions:
  • obtain current game state information, where the current game state information includes at least one of the following: the current state information of the target virtual object, the current state information of the non-target virtual object, and the current state information of the game scene;
  • use the pre-trained target neural network to perform interactive action prediction processing based on the current game state information, obtaining the target time at which the target virtual object performs an interactive action and the target type; and, in response to the current time reaching the target time, control the target virtual object to perform the interactive action of the target type.
  • An embodiment of the present disclosure provides a computer-readable storage medium on which a computer program is stored; when the computer program is executed, some or all of the steps of the above method are performed.
  • the storage medium may be a volatile or non-volatile computer-readable storage medium.
  • An embodiment of the present disclosure provides a computer program; the computer program includes computer-readable code, and when the computer-readable code runs in a computer device, a processor in the computer device executes some or all of the steps of the method described in the above embodiments.
  • An embodiment of the present disclosure provides a computer program product; the computer program product carries program code, and the instructions included in the program code can be used to execute the steps of the in-game interaction method described in the above method embodiments; refer to the above method embodiments for details.
  • the above-mentioned computer program product may be realized by hardware, software or a combination thereof.
  • the computer program product may be embodied as a computer storage medium, and in other embodiments, the computer program product may be embodied as a software product, such as a software development kit (Software Development Kit, SDK) and the like.
  • This disclosure relates to the field of augmented reality.
  • By acquiring image information of a target object in the real environment and then using various vision-related algorithms to detect or identify the relevant features, states, and attributes of the target object, an AR effect combining the virtual and the real that matches the specific application can be obtained.
  • The target object may involve faces, limbs, gestures, and actions related to the human body, identifiers and markers related to objects, or sand tables, display areas, or display items related to venues or places.
  • Vision-related algorithms can involve visual positioning, SLAM, 3D reconstruction, image registration, background segmentation, object key point extraction and tracking, object pose or depth detection, etc.
  • Specific applications may involve not only interactive scenes related to real scenes or objects, such as guided tours, navigation, explanation, reconstruction, and virtual effect overlay and display, but also special-effects processing related to people, such as interactive scenarios including makeup beautification, body beautification, special-effect display, and virtual model display.
  • the relevant features, states and attributes of the target object can be detected or identified through the convolutional neural network.
  • the above-mentioned convolutional neural network is a network model obtained by performing model training based on a deep learning framework.
  • the units described as separate components may or may not be physically separated, and the components shown as units may or may not be physical units, that is, they may be located in one place, or may be distributed to multiple network units. Part or all of the units can be selected according to actual needs to achieve the purpose of the solution of this embodiment.
  • each functional unit in each embodiment of the present disclosure may be integrated into one processing unit, each unit may exist separately physically, or two or more units may be integrated into one unit.
  • If the functions are realized in the form of software functional units and sold or used as independent products, they may be stored in a non-volatile computer-readable storage medium executable by a processor.
  • In essence, the technical solution of the present disclosure, or the part contributing to the prior art, or part of the technical solution, can be embodied in the form of a software product; the computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to execute all or part of the steps of the methods described in the embodiments of the present disclosure.
  • the aforementioned storage media include: U disk, mobile hard disk, read-only memory (Read-Only Memory, ROM), random access memory (Random Access Memory, RAM), magnetic disk or optical disc and other media that can store program codes. .
  • Embodiments of the present disclosure provide an interaction method, apparatus, computer device, storage medium, computer program, and computer program product in a game, applied to an agent. The game includes multiple virtual objects, among which is a target virtual object controlled by the agent. The method includes: obtaining current game-state information, which includes at least one of the following: current state information of the target virtual object, current state information of non-target virtual objects, and current state information of the game scene; using a pre-trained target neural network to perform interaction-action prediction based on the current game-state information, to obtain a target time and a target type for the target virtual object to perform an interaction action; and, in response to the current time reaching the target time, controlling the target virtual object to perform the interaction action of the target type.
  • In this way, the agent can inform other agents or players of its operational intent in the game, realizing cooperation between players and agents and between agents, and improving the degree of cooperation during play.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Multimedia (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Software Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Optics & Photonics (AREA)
  • Human Computer Interaction (AREA)
  • Processing Or Creating Images (AREA)
  • User Interface Of Digital Computer (AREA)

Abstract

Embodiments of the present disclosure provide an interaction method, apparatus, computer device, storage medium, computer program, and computer program product in a game, applied to an agent. The game includes multiple virtual objects, among which is a target virtual object controlled by the agent. The method includes: obtaining current game-state information, which includes at least one of the following: current state information of the target virtual object, current state information of non-target virtual objects, and current state information of the game scene; using a pre-trained target neural network to perform interaction-action prediction based on the current game-state information, to obtain a target time and a target type for the target virtual object to perform an interaction action; and, in response to the current time reaching the target time, controlling the target virtual object to perform the interaction action of the target type.

Description

Interaction method and apparatus in game, computer device, storage medium, computer program and computer program product
Cross-Reference to Related Applications
This disclosure is based on, and claims priority to, Chinese patent application No. 202111269017.3, filed on October 29, 2021 and entitled "Interaction method and apparatus in game, computer device and storage medium", the entire contents of which are incorporated herein by reference.
Technical Field
The present disclosure relates to, but is not limited to, the technical field of deep learning, and in particular to an interaction method, apparatus, computer device, storage medium, computer program, and computer program product in a game.
Background
With the development of computer technology, applying artificial intelligence (AI) to games has become a trend in the game industry. In the related art, some games use agents trained by deep reinforcement learning as players' game partners; such agents can usually only make corresponding operations according to the game situation and the players' operation information, so the degree of cooperation is low.
Summary
Embodiments of the present disclosure provide at least an interaction method, apparatus, computer device, storage medium, computer program, and computer program product in a game.
An embodiment of the present disclosure provides an in-game interaction method applied to an agent. The game includes multiple virtual objects, among which is a target virtual object controlled by the agent. The method includes: obtaining current game-state information, which includes at least one of the following: current state information of the target virtual object, current state information of non-target virtual objects, and current state information of the game scene; using a pre-trained target neural network to perform interaction-action prediction based on the current game-state information, to obtain a target time and a target type for the target virtual object to perform an interaction action; and, in response to the current time reaching the target time, controlling the target virtual object to perform the interaction action of the target type.
In this way, the pre-trained target neural network predicts, based on at least one of the current state information of the target virtual object, of non-target virtual objects, and of the game scene during play, the target time and target type of the interaction action to be performed by the target virtual object; once the current time reaches the target time, the target virtual object is controlled to perform the interaction action of the target type. The operational intent can thus be communicated to other agents or players, realizing cooperation between players and agents and between agents, and improving the degree of cooperation during play.
In some embodiments, the current state information of the target virtual object includes at least one of the following: a first virtual-resource type and a first virtual-resource amount possessed by the target virtual object, a first building state, a first skill state, a first position in the game scene, a first health value, a first magic value, first faction information, first buff data, and first debuff data. The current state information of the non-target virtual object includes at least one of the following: a second virtual-resource type and a second virtual-resource amount possessed by the non-target virtual object, a second building state, a second skill state, a second position in the game scene, a second health value, a second magic value, second faction information, second buff data, second debuff data, type information, and interaction-action information. The current state information of the game scene includes at least one of the following: visible-area information of the target virtual object, a third virtual-resource type, virtual-resource positions, and the remaining time until virtual resources respawn.
In some embodiments, the interaction action includes at least one of the following: marking an action on the map of the game scene, sending voice interaction information to non-target virtual objects of the same faction, and sending text interaction information to non-target virtual objects of the same faction. The type of the interaction action includes at least one of the following: retreat, attack, defend, and request for support.
In some embodiments, the target neural network is trained as follows: based on at least one of multiple players' battle levels in the game, game-match durations, and the number of occurrences of interaction actions, target game matches are determined from first game matches in which the multiple players participated; based on the target game matches, first game data corresponding to at least one to-be-predicted type of interaction action is determined, the first game data including at least one frame of target game-state information between a first time point and a second time point in a target game match, where the first time point is earlier than the occurrence time of the interaction action and the second time point is equal to or later than the occurrence time of the interaction action; and the first game data is used as sample data to train an original neural network, yielding the target neural network.
In this way, by selecting higher-quality target game matches from the first game matches based on at least one of the players' battle levels, match durations, and interaction-action counts, the quality of the sample data is improved, and hence so is the quality of the resulting target neural network.
In some embodiments, before determining, based on the target game matches, the first game data corresponding to the at least one to-be-predicted type of interaction action, the method further includes: counting the occurrences of the different types of interaction actions in the target game matches to obtain the total occurrence count of each type, and determining the to-be-predicted types based on these total counts.
In this way, the to-be-predicted types of interaction action can be conveniently determined among the different types appearing in the target game matches, alleviating the difficulty of analyzing the importance of interaction data.
In some embodiments, determining, based on the target game matches, the first game data corresponding to the at least one to-be-predicted type of interaction action includes: for each target game match, determining the occurrence time of the to-be-predicted type of interaction action in that match; determining the first time point and the second time point based on the occurrence time; and, based on the first and second time points, cutting out, from the raw game data corresponding to that match, the target game-state information of the to-be-predicted interaction action in that match, the raw game data of a target game match including multiple frames of game-state information.
In some embodiments, using the first game data as sample data to train the original neural network and obtain the target neural network includes: using the first game data as sample data and the to-be-recognized types of the interaction actions corresponding to the first game data as supervision data, performing supervised training on the original neural network to obtain multiple initialized neural networks, where different initialized neural networks are trained with different training parameters; forming different agents from the different initialized neural networks and having them control virtual characters in game matches to produce second raw game data; using the second raw game data to perform reinforcement training on the corresponding initialized neural networks, to obtain candidate neural networks corresponding to the initialized networks; and determining at least one target neural network from the candidate neural networks based on their respective performance information.
In this way, supervised learning lets the trained target neural network learn human players' different types of interaction actions, for example attack, retreat, defend, and request for support; in addition, multi-agent reinforcement learning strengthens the cooperative awareness of the final target neural network and gives it better combat capability.
In some embodiments, the performance information corresponding to each candidate neural network includes the game score obtained when the initialized neural network corresponding to that candidate neural network plays game matches.
An embodiment of the present disclosure provides an in-game interaction apparatus applied to an agent. The game includes multiple virtual objects, among which is a target virtual object controlled by the agent. The apparatus includes: an obtaining part configured to obtain current game-state information, which includes at least one of the following: current state information of the target virtual object, current state information of non-target virtual objects, and current state information of the game scene; a processing part configured to use a pre-trained target neural network to perform interaction-action prediction based on the current game-state information, to obtain a target time and a target type for the target virtual object to perform an interaction action; and a control part configured to, in response to the current time reaching the target time, control the target virtual object to perform the interaction action of the target type.
In some embodiments, the current state information of the target virtual object includes at least one of the following: a first virtual-resource type and a first virtual-resource amount possessed by the target virtual object, a first building state, a first skill state, a first position in the game scene, a first health value, a first magic value, first faction information, first buff data, and first debuff data. The current state information of the non-target virtual object includes at least one of the following: a second virtual-resource type and a second virtual-resource amount possessed by the non-target virtual object, a second building state, a second skill state, a second position in the game scene, a second health value, a second magic value, second faction information, second buff data, second debuff data, type information, and interaction-action information. The current state information of the game scene includes at least one of the following: visible-area information of the target virtual object, a third virtual-resource type, virtual-resource positions, and the remaining time until virtual resources respawn.
In some embodiments, the interaction action includes at least one of the following: marking an action on the map of the game scene, sending voice interaction information to non-target virtual objects of the same faction, and sending text interaction information to non-target virtual objects of the same faction. The type of the interaction action includes at least one of the following: retreat, attack, defend, and request for support.
In some embodiments, the processing part is further configured to train the target neural network as follows: based on at least one of multiple players' battle levels in the game, game-match durations, and interaction-action counts, determine target game matches from the first game matches in which the multiple players participated; based on the target game matches, determine the first game data corresponding to at least one to-be-predicted type of interaction action, the first game data including at least one frame of target game-state information between a first time point and a second time point in a target game match, where the first time point is earlier than the occurrence time of the interaction action and the second time point is equal to or later than it; and use the first game data as sample data to train an original neural network, obtaining the target neural network.
In some embodiments, before determining, based on the target game matches, the first game data corresponding to the at least one to-be-predicted type of interaction action, the processing part is further configured to: count the occurrences of the different types of interaction actions in the target game matches to obtain each type's total occurrence count, and determine the to-be-predicted types based on these totals.
In some embodiments, when determining, based on the target game matches, the first game data corresponding to the at least one to-be-predicted type of interaction action, the processing part is further configured to: for each target game match, determine the occurrence time of the to-be-predicted interaction action in that match; determine the first and second time points based on the occurrence time; and, based on the first and second time points, cut the target game-state information of the to-be-predicted interaction action in that match out of the raw game data corresponding to that match, the raw game data of a target game match including multiple frames of game-state information.
In some embodiments, when using the first game data as sample data to train the original neural network and obtain the target neural network, the processing part is further configured to: use the first game data as sample data and the to-be-recognized types of the corresponding interaction actions as supervision data to perform supervised training on the original neural network, obtaining multiple initialized neural networks trained with different training parameters; form different agents from the different initialized networks and have them control virtual characters in game matches, producing second raw game data; use the second raw game data to reinforcement-train the corresponding initialized networks, obtaining candidate neural networks; and determine at least one target neural network from the candidates based on their respective performance information.
In some embodiments, the performance information corresponding to each candidate neural network includes the game score obtained when the initialized neural network corresponding to that candidate neural network plays game matches.
An embodiment of the present disclosure provides a computer device including a processor and a memory, where the memory stores machine-readable instructions executable by the processor and the processor is configured to execute the machine-readable instructions stored in the memory; when the machine-readable instructions are executed by the processor, the processor performs some or all of the steps of the above method.
An embodiment of the present disclosure provides a computer-readable storage medium on which a computer program is stored; when the computer program is run, some or all of the steps of the above method are executed.
An embodiment of the present disclosure provides a computer program including computer-readable code; when the computer-readable code runs on a computer device, a processor in the computer device executes some or all of the steps of the above method.
An embodiment of the present disclosure provides a computer program product including a non-transitory computer-readable storage medium storing a computer program; when the computer program is read and executed by a computer, some or all of the steps of the above method are implemented.
For descriptions of the effects of the above in-game interaction apparatus, computer device, computer-readable storage medium, computer program, and computer program product, refer to the description of the above in-game interaction method.
To make the above objects, features, and advantages of the embodiments of the present disclosure more comprehensible, exemplary embodiments are described in detail below with reference to the accompanying drawings.
Brief Description of the Drawings
To explain the technical solutions of the embodiments of the present disclosure more clearly, the drawings needed in the embodiments are briefly introduced below. The drawings here are incorporated into and constitute a part of this specification; they illustrate embodiments consistent with the present disclosure and, together with the specification, serve to explain its technical solutions. It should be understood that the following drawings show only certain embodiments of the present disclosure and should not be regarded as limiting its scope; those of ordinary skill in the art can obtain other related drawings from these drawings without creative effort.
Fig. 1 is a schematic flowchart of an in-game interaction method provided by an embodiment of the present disclosure;
Fig. 2 is a schematic diagram of a display interface after an action is marked on the game-scene map in the in-game interaction method provided by an embodiment of the present disclosure;
Fig. 3 is a schematic flowchart of a method for training a target neural network in the in-game interaction method provided by an embodiment of the present disclosure;
Fig. 4 is a schematic structural diagram of an in-game interaction apparatus provided by an embodiment of the present disclosure;
Fig. 5 is a schematic structural diagram of a computer device provided by an embodiment of the present disclosure.
Detailed Description
To make the objects, technical solutions, and advantages of the embodiments of the present disclosure clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present disclosure. The components of the embodiments described and shown here can generally be arranged and designed in various different configurations. Therefore, the following detailed description of the embodiments is not intended to limit the claimed scope of the present disclosure but merely represents selected embodiments. All other embodiments obtained by those skilled in the art based on the embodiments of the present disclosure without creative effort fall within the protection scope of the present disclosure.
Research shows that agent design is an important part of designing multiplayer cooperative games: an agent possesses a certain decision-making intelligence with which it can control virtual objects and play alongside players. In multiplayer cooperative games, apart from real players teaming up with one another, existing game partners are usually hard-coded bots. A few real-time strategy (RTS) or multiplayer online battle arena (MOBA) games use agents trained by deep reinforcement learning as players' game partners. In actual play, players send interaction signals to one another so as to cooperate better; an agent, however, can usually only make corresponding operations according to the game situation and the players' operation information, and its operational intent usually cannot be communicated to other players or other agents, which leads to poor cooperation when playing with agents.
An agent's poor ability to interact with players degrades the game experience; moreover, in multiplayer cooperative games, understanding teammates' true intent is crucial to the course of the battle, so the impact on game performance of agents that cannot interact in real time cannot be ignored. Interactive agents have therefore become an important technology in multiplayer game design.
Based on the above research, embodiments of the present disclosure provide an in-game interaction method: a pre-trained target neural network predicts, based on at least one of the current state information of the target virtual object, of non-target virtual objects, and of the game scene during play, the target time and target type of the interaction action to be performed by the target virtual object; once the current time reaches the target time, the target virtual object is controlled to perform the interaction action of the target type, so the operational intent can be communicated to other agents or players, realizing cooperation between players and agents and between agents, and improving the degree of cooperation during play.
The defects of the solutions in the related art and the solutions proposed here are all results obtained by the inventors through practice and careful study; therefore, the discovery process of the above problems and the solutions proposed in the embodiments of the present disclosure below should all be regarded as the inventors' contributions to the present disclosure.
It should be noted that similar reference numerals and letters denote similar items in the following drawings; therefore, once an item is defined in one drawing, it need not be further defined or explained in subsequent drawings.
To facilitate understanding of the embodiments of the present disclosure, the in-game interaction method disclosed herein is first introduced in detail. The executor of the method is generally a computer device with certain computing capability, for example a terminal device, a server, or another processing device; the terminal device may be user equipment (UE), a mobile device, a user terminal, a terminal, a cellular phone, a cordless phone, a personal digital assistant (PDA), a handheld device, a computing device, a vehicle-mounted device, a wearable device, or the like. In some possible implementations, the method can be realized by a processor invoking computer-readable instructions stored in a memory.
The in-game interaction method provided by the embodiments of the present disclosure is described below.
Fig. 1 is a schematic flowchart of an in-game interaction method provided by an embodiment of the present disclosure. As shown in Fig. 1, the method is applied to an agent; the game includes multiple virtual objects, among which is a target virtual object controlled by the agent. The method includes steps S101 to S103:
S101: Obtain current game-state information.
Here, the agent is an artificial intelligence (AI) trained with deep learning, reinforcement learning, and similar techniques, capable of controlling the target virtual object in the game. The game contains multiple virtual objects, including the target virtual object and non-target virtual objects. The target virtual object is the virtual object controlled by the agent; non-target virtual objects may include, but are not limited to, virtual objects controlled by human players, virtual objects controlled by other agents, and non-player virtual characters in the game, such as non-player characters (NPCs) in the game scene.
In implementation, the current game-state information includes, but is not limited to, at least one of the following: current state information of the target virtual object, current state information of non-target virtual objects, and current state information of the game scene. The current state information of the target virtual object includes, but is not limited to, at least one of the following: the first virtual-resource type and first virtual-resource amount possessed by the target virtual object, a first building state, a first skill state, a first position in the game scene, a first health value, a first magic value, first faction information, first buff data, and first debuff data.
Here, the first virtual-resource type may include, for example, at least one of the gold the target virtual object owns for purchasing props and other items, the building materials it owns, the buildings it owns, and the kinds of combat units it owns. The first virtual-resource amount is the quantity of the corresponding resource, such as at least one of the amounts of gold, building materials, buildings, and combat units owned by the target virtual object. The first building state is the state of a building owned by the target virtual character, for example at least one of: construction completed, not yet built, under construction, damaged, and remaining hit points. The first skill state may include, for example, at least one of the number of skills the target virtual object has, the skill levels, whether a skill is on cooldown, and the skill types; if a skill is on cooldown, the first skill state also includes the remaining cooldown time, and the skill types may include, for example, at least one of attack skills and defensive skills. The first position in the game scene is the target virtual object's position information in the game scene. The first faction information may include, for example, the faction the target virtual object belongs to, such as a friendly, enemy, or neutral faction. The first buff data may include, for example, at least one of the target virtual object's health-value buffs, magic-value buffs, and status buffs; the first debuff data may include, for example, at least one of its health-value debuffs, magic-value debuffs, and status debuffs.
In addition, the current state information of the non-target virtual object may include, but is not limited to, at least one of the following: the second virtual-resource type and second virtual-resource amount possessed by the non-target virtual object, a second building state, a second skill state, a second position in the game scene, a second health value, a second magic value, second faction information, second buff data, second debuff data, type information, and interaction-action information. The second virtual-resource type, virtual-resource amount, building state, skill state, position, faction information, buff data, and debuff data are defined in the same way as the corresponding first items above, but for the non-target virtual object. The type information indicates at least one of a player-controlled game character, a creep in the game scene, and a neutral monster in the game scene; a player-controlled character and a creep may belong to the same or different factions, whereas a neutral monster belongs to a faction different from (i.e., hostile to) both. The interaction-action information may include, but is not limited to, at least one of the time at which an interaction action is issued and the type of the interaction action, where the type may include, for example, at least one of: a retreat action used to tell teammates to fall back, an attack action used to tell teammates to attack, a defend action used to tell teammates to defend, and a request-for-support action used to ask teammates for help.
In addition, the current state information of the game scene may include, but is not limited to, at least one of the following: visible-area information of the target virtual object, a third virtual-resource type, virtual-resource positions, and the remaining time until virtual resources respawn. The visible-area information may include the portion of the game map within the target virtual character's current field of view; the third virtual-resource type may include, for example, at least one of unoccupied buildings, unrecruited combat units, and ungathered resources in the game scene; the virtual-resource positions are the locations of such resources in the game scene, for example at least one of the positions of unoccupied buildings, unrecruited combat units, and ungathered resources; and the remaining respawn time may include, for example, at least one of the time at which gatherable resources become available and the respawn times of neutral and minor monsters in the game scene. A feature-encoding sketch of these three kinds of state information follows below.
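To make the three kinds of state information concrete, the following is a minimal Python sketch of how they could be flattened into one input vector for the target neural network. It is an illustration only, not part of the patent; the field names (gold, hp, position, and so on) are hypothetical placeholders for the enumerated state items.

```python
from dataclasses import dataclass
from typing import List, Tuple
import numpy as np

@dataclass
class UnitState:
    # Hypothetical fields standing in for the enumerated state items.
    gold: float                    # virtual-resource amount
    hp: float                      # health value
    mp: float                      # magic value
    position: Tuple[float, float]  # position in the game scene
    camp_id: int                   # faction information
    skills_ready: List[bool]       # skill state (off-cooldown flags)

def encode_state(target: UnitState, others: List[UnitState],
                 scene_feats: np.ndarray) -> np.ndarray:
    """Flatten target, non-target, and scene state into one feature vector."""
    def unit_vec(u: UnitState) -> np.ndarray:
        return np.array([u.gold, u.hp, u.mp, *u.position, float(u.camp_id),
                         *[float(s) for s in u.skills_ready]])
    parts = [unit_vec(target)] + [unit_vec(u) for u in others] + [scene_feats]
    return np.concatenate(parts)
```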
In some embodiments, after the current state information of the target virtual object, of the non-target virtual objects, and of the game scene at the current time is obtained in S101 during play, the embodiments of the present disclosure proceed to the next step, S102:
S102: Use the pre-trained target neural network to perform interaction-action prediction based on the current game-state information, obtaining the target time and target type for the target virtual object to perform an interaction action.
Here, the interaction action may include, but is not limited to, at least one of the following: marking an action on the map of the game scene, sending voice interaction information to non-target virtual objects of the same faction, and sending text interaction information to non-target virtual objects of the same faction; the voice or text interaction information can be sent by at least one of broadcasting and private chat. For example, Fig. 2 shows a display interface after an action mark has been placed on the game-scene map: in game scene 10, a defense mark point 11 is marked with a black dot to prompt the players and agents in the game to defend the position of mark point 11.
In some embodiments, the target neural network can be trained as follows: based on at least one of multiple players' battle levels in the game, game-match durations, and interaction-action counts, determine target game matches from the first game matches in which the multiple players participated; based on the target game matches, determine the first game data corresponding to at least one to-be-predicted type of interaction action; and use the first game data as sample data to train an original neural network, yielding the target neural network.
The first game data may include, but is not limited to, at least one frame of target game-state information between a first time point and a second time point in a target game match, where the first time point is earlier than the occurrence time of the interaction action and the second time point is equal to or later than it. The target game-state information may include, for example, at least one of the state information of the target virtual object, of non-target virtual objects, and of the game scene; for its description, refer to the description of the current game-state information in S101.
When training the target neural network, to improve its quality, high-quality matches need to be selected from a large amount of human match data (i.e., the first game matches in which multiple players participated) so that player behavior can be simulated effectively and without excessive bias. The first game matches can therefore be screened according to, but not limited to, at least one of the following rules A1 to A6 to select high-quality target game matches (a filtering sketch follows the list):
A1: Based on the battle levels of the first game matches, select the first game matches whose battle level is greater than a preset battle-level threshold as target game matches.
Here, the battle level of each first game match can be determined from the levels of the virtual objects controlled by the players in that match; the preset battle-level threshold can be set as needed and is not limited here. In some embodiments, the level of each player-controlled virtual object can be determined from the game score that virtual object obtained in the match.
For example, if the levels of the virtual objects controlled by the three players on our side in a match are 50, 60, and 66, and the levels of the virtual objects controlled by the three players on the enemy side are 55, 58, and 70, the battle level of that match determined from these players' levels is 60; since 60 exceeds a preset battle-level threshold of 50, the match is judged to be a high-quality match and can be taken as a target game match.
A2: Based on the battle levels of the first game matches, arrange the first game matches in descending order of battle level and, based on the arrangement, take the first game matches ranked before a first preset ranking position as target game matches.
The first preset ranking position can be set according to actual needs and is not limited here.
For example, if the first preset ranking position is 10, the first game matches whose battle levels rank in the top 9 can be taken as target game matches.
A3: Based on the match durations of the first game matches, select the first game matches whose duration is greater than a preset duration threshold as target game matches.
The preset duration threshold can be set as needed and is not limited here.
For example, if the preset duration threshold is 20 minutes, the first game matches lasting longer than 20 minutes can be taken as target game matches based on their durations.
A4: Based on the match durations, arrange the first game matches in descending order of duration and, based on the arrangement, take the first game matches ranked before a second preset ranking position as target game matches.
The second preset ranking position can be set according to actual needs and is not limited here.
For example, if the second preset ranking position is 8, the first game matches whose durations rank in the top 7 can be taken as target game matches.
A5: Based on the interaction-action counts of the first game matches, select the first game matches in which interaction actions occur more than a first preset count threshold as target game matches.
The first preset count threshold can be set according to actual needs and is not limited here.
For example, interaction-action recognition can be performed on each first game match to determine how many interaction actions occur in it; if the first preset count threshold is 5, the first game matches with more than 5 interaction actions can be taken as target game matches.
A6: Based on the interaction-action counts, arrange the first game matches in descending order of interaction-action count and, based on the arrangement, take the first game matches ranked before a third preset ranking position as target game matches.
The third preset ranking position can be set according to actual needs and is not limited here.
For example, if the third preset ranking position is 6, the first game matches whose interaction-action counts rank in the top 5 can be taken as target game matches.
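As a minimal sketch of rules A1 to A6 (not part of the patent text), the filter below assumes each match record is a dict with hypothetical keys player_levels, duration, and n_signals; the battle level is computed here as the average of the players' levels, which is one plausible reading of the example under A1.

```python
def select_target_matches(matches, level_thr=50, duration_thr=20 * 60,
                          signal_thr=5, top_k=None):
    """Screen first game matches into target game matches (A1/A3/A5),
    optionally keeping only the top-k by battle level (A2-style ranking)."""
    def battle_level(m):
        # One plausible reading of the A1 example: average player level.
        return sum(m['player_levels']) / len(m['player_levels'])

    kept = [m for m in matches
            if battle_level(m) > level_thr          # A1
            and m['duration'] > duration_thr        # A3
            and m['n_signals'] > signal_thr]        # A5
    if top_k is not None:                           # A2/A4/A6 ranking variants
        kept = sorted(kept, key=battle_level, reverse=True)[:top_k]
    return kept
```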
After the target game matches are determined from the first game matches in which the multiple players participated according to A1 to A6 above, the various types of interaction actions appearing in the target game matches can be screened: the most frequently occurring to-be-predicted types, such as attack, defend, request for support, and retreat, are selected, while less important interaction actions appearing in the matches, such as movement, production, and ordinary in-game chat, are discarded. In some embodiments, the occurrences of the different types of interaction actions in the target game matches can be counted to obtain the total occurrence count of each type, and the to-be-predicted types are determined based on these totals.
The different types of interaction actions may include, for example, but are not limited to: attack, defend, request for support, retreat, and rally.
In some embodiments, after the occurrences of the different types of interaction actions in the target game matches are counted to obtain each type's total occurrence count, the types whose total count is greater than a second preset count threshold can be selected as the to-be-predicted types.
The second preset count threshold can be set according to actual needs and is not limited here; the first and second preset count thresholds may be the same or different.
In some embodiments, after each type's total occurrence count is obtained, the different types of interaction actions can instead be arranged in descending order of total count, and the types ranked before a fourth preset ranking position can be taken as the to-be-predicted types (a counting sketch follows below).
The fourth preset ranking position can be set according to actual needs and is not limited here; note that the first, second, third, and fourth preset ranking positions may be the same or different.
In some embodiments, the importance of the different types of interaction actions appearing in the target game matches can also be analyzed based on the matches' game data, and the types whose importance exceeds a preset importance threshold are taken as the to-be-predicted types; the preset importance threshold can be set according to actual needs and is not limited here.
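The counting and thresholding/ranking variants just described can be sketched as follows (illustrative only; signal_logs is an assumed per-match list of interaction-action type labels):

```python
from collections import Counter

def pick_types_to_predict(signal_logs, min_total=None, top_k=None):
    """Tally each interaction type across the target matches, then keep
    the types above a total-count threshold or the top-k most frequent."""
    totals = Counter(t for log in signal_logs for t in log)
    if min_total is not None:
        return [t for t, n in totals.items() if n > min_total]
    return [t for t, _ in totals.most_common(top_k)]

# e.g. pick_types_to_predict([['attack', 'retreat'], ['attack']], min_total=1)
# -> ['attack']
```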
In some embodiments, after the to-be-predicted types of interaction action have been screened out of the target game matches, the occurrence time of the to-be-predicted interaction action in each target game match can be determined; the first and second time points are determined based on the occurrence time; and, based on the first and second time points, the target game-state information of the to-be-predicted interaction action in that match is cut out of the raw game data corresponding to that match.
The raw game data corresponding to a target game match includes multiple frames of game-state information.
For example, after the to-be-predicted types of interaction action are determined, the first and second time points around each occurrence of a to-be-predicted interaction action can be determined for each target game match, and the target game-state information of that action in the match is determined from the raw game data between the first and second time points, as sketched after this paragraph.
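A minimal sketch of this clipping step (illustrative only; frames is assumed to be a list of (timestamp, state) pairs for one match, and the offsets are hypothetical):

```python
def extract_clip(frames, t_signal, before=10.0, after=0.0):
    """Keep frames between the first time point (earlier than the signal)
    and the second time point (equal to or later than the signal)."""
    t1, t2 = t_signal - before, t_signal + after
    return [(ts, state) for ts, state in frames if t1 <= ts <= t2]
```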
After the target game-state information of each to-be-predicted type of interaction action in each target game match is determined, i.e., after the first game data corresponding to each to-be-predicted type is determined, the original neural network can be trained as follows to obtain the target neural network: use the first game data as sample data and the to-be-recognized types of the corresponding interaction actions as supervision data to perform supervised training on the original neural network, obtaining multiple initialized neural networks; form different agents from the different initialized neural networks and have them control virtual characters in game matches, producing second raw game data; use the second raw game data to perform reinforcement training on the corresponding initialized neural networks, obtaining candidate neural networks for the initialized networks; and determine at least one target neural network from the candidate neural networks based on their respective performance information.
The original neural network may include, but is not limited to, a network with a complex structure composed of convolutional and recurrent neural networks. Different initialized neural networks are trained with different training parameters. The second raw game data is the raw game data produced when agents formed from the different initialized neural networks control virtual objects in matches. The performance information of a candidate neural network may include, but is not limited to, the game score obtained when its corresponding initialized neural network plays game matches.
For example, the target game-state information of the at least one to-be-predicted type of interaction action in the target game matches is used as sample data, the interaction-action type corresponding to that state information is used as supervision data, and the original neural network is supervised-trained with different training parameters to obtain different initialized neural networks.
After multiple different initialized neural networks are obtained through supervised training, different agents can be formed from them; these agents control virtual characters in game matches to produce second raw game data, which is then used to reinforcement-train the initialized network that produced it, yielding a candidate neural network for each initialized network. After the candidate neural networks are obtained, the game scores obtained when their corresponding initialized neural networks control virtual objects in matches can be used to select, from the candidates, those whose game score exceeds a preset score threshold (i.e., the candidates better at cooperating during play) as target neural networks; the preset score threshold can be set according to actual needs and is not limited here. A pipeline skeleton follows below.
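A skeleton of the two-stage pipeline is sketched below, assuming PyTorch; make_model, loader, and game_score are hypothetical stand-ins, and the patent does not prescribe this particular code.

```python
import torch
import torch.nn as nn

def supervised_pretrain(model, loader, lr, epochs=5):
    """Supervised stage: first-game-data frames as samples,
    interaction-action type as the label."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        for x, y in loader:
            opt.zero_grad()
            loss_fn(model(x), y).backward()
            opt.step()
    return model

# Different training parameters yield different initialized networks:
# init_nets = [supervised_pretrain(make_model(), loader, lr)
#              for lr in (1e-3, 3e-4, 1e-4)]
# Each initialized network then drives an agent in self-play matches to
# produce second raw game data, is fine-tuned by reinforcement learning
# on that data, and the resulting candidates are ranked by game score:
# target_net = max(candidates, key=game_score)
```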
Another embodiment of the present disclosure further provides a method for training a target neural network. Fig. 3 is a schematic flowchart of such a training method; as shown in Fig. 3, the training method includes steps S301 to S304:
S301: Based on at least one of multiple players' battle levels in the game, game-match durations, and interaction-action counts, determine target game matches from the first game matches in which the multiple players participated.
S302: Count the occurrences of the different types of interaction actions in the target game matches to obtain each type's total occurrence count, and determine the to-be-predicted types based on these totals.
S303: For each target game match, determine the occurrence time of the to-be-predicted interaction action in that match; determine the first and second time points based on the occurrence time; and, based on the first and second time points, cut the target game-state information of the to-be-predicted interaction action in that match out of the raw game data corresponding to that match, so as to determine the first game data corresponding to at least one to-be-predicted type of interaction action.
S304: Use the first game data as sample data and the to-be-recognized types of the corresponding interaction actions as supervision data to perform supervised training on the original neural network, obtaining multiple initialized neural networks; form different agents from the different initialized networks and have them control virtual characters in matches, producing second raw game data; use the second raw game data to reinforcement-train the corresponding initialized networks, obtaining candidate neural networks; and determine at least one target neural network from the candidates based on their performance information.
For the implementation of S301 to S304, refer to the descriptions in the related embodiments above.
In some embodiments, after the target neural network is generated, the current game-state information can be input into it, and the target neural network predicts the target time and target type of the interaction action to be performed by the target virtual object.
After the target time and target type of the target virtual object's interaction action are predicted, the embodiments of the present disclosure further include:
S103: In response to the current time reaching the target time, control the target virtual object to perform the interaction action of the target type.
For example, if the current game-state information is input into the target neural network and the network predicts that the target time for the target virtual object to perform an interaction action is a first moment and that the target type is "defend", then, once the current time reaches the first moment, the target virtual object is controlled to place a defense mark on the map of the game scene (a control-loop sketch follows below).
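The S101–S103 loop at inference time might look like the following sketch (illustrative only; policy, get_state, and execute_action are assumed interfaces, not part of the patent):

```python
import time

def run_agent(policy, get_state, execute_action, poll=0.5):
    """Read the current game state, predict (target_time, target_type),
    and fire the interaction action once the target time is reached."""
    while True:
        state = get_state()                        # S101
        target_time, target_type = policy(state)   # S102
        if target_time is not None and time.time() >= target_time:
            execute_action(target_type)            # S103, e.g. mark "defend"
        time.sleep(poll)
```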
In the embodiments of the present disclosure, the pre-trained target neural network predicts, based on at least one of the current state information of the target virtual object, of non-target virtual objects, and of the game scene during play, the target time and target type of the interaction action to be performed by the target virtual object; once the current time reaches the target time, the target virtual object is controlled to perform the interaction action of the target type. The operational intent can thus be communicated to other agents or players, realizing cooperation between players and agents and between agents, and improving the degree of cooperation during play. In addition, training the target neural network by supervised learning allows a simpler target neural network to improve the efficiency with which agents learn to communicate.
Those skilled in the art can understand that, in the above methods of the specific implementations, the order in which the steps are written does not imply a strict execution order or constitute any limitation on the implementation; the specific execution order of the steps should be determined by their functions and possible internal logic.
Based on the same technical concept, embodiments of the present disclosure further provide an in-game interaction apparatus corresponding to the in-game interaction method; since the principle by which the apparatus solves the problem is similar to that of the above method, the implementation of the apparatus can refer to the implementation of the method.
Fig. 4 is a schematic diagram of an in-game interaction apparatus provided by an embodiment of the present disclosure. The apparatus is applied to an agent; the game includes multiple virtual objects, among which is a target virtual object controlled by the agent. The apparatus includes an obtaining part 401, a processing part 402, and a control part 403, where:
the obtaining part 401 is configured to obtain current game-state information, which includes at least one of the following: current state information of the target virtual object, current state information of non-target virtual objects, and current state information of the game scene; the processing part 402 is configured to use a pre-trained target neural network to perform interaction-action prediction based on the current game-state information, obtaining a target time and a target type for the target virtual object to perform an interaction action; and the control part 403 is configured to, in response to the current time reaching the target time, control the target virtual object to perform the interaction action of the target type.
In some embodiments, the current state information of the target virtual object includes at least one of the following: a first virtual-resource type and a first virtual-resource amount possessed by the target virtual object, a first building state, a first skill state, a first position in the game scene, a first health value, a first magic value, first faction information, first buff data, and first debuff data. The current state information of the non-target virtual object includes at least one of the following: a second virtual-resource type and a second virtual-resource amount possessed by the non-target virtual object, a second building state, a second skill state, a second position in the game scene, a second health value, a second magic value, second faction information, second buff data, second debuff data, type information, and interaction-action information. The current state information of the game scene includes at least one of the following: visible-area information of the target virtual object, a third virtual-resource type, virtual-resource positions, and the remaining time until virtual resources respawn.
In some embodiments, the interaction action includes at least one of the following: marking an action on the map of the game scene, sending voice interaction information to non-target virtual objects of the same faction, and sending text interaction information to non-target virtual objects of the same faction. The type of the interaction action includes at least one of the following: retreat, attack, defend, and request for support.
In some embodiments, the processing part 402 is further configured to train the target neural network as follows: based on at least one of multiple players' battle levels in the game, game-match durations, and interaction-action counts, determine target game matches from the first game matches in which the multiple players participated; based on the target game matches, determine the first game data corresponding to at least one to-be-predicted type of interaction action, the first game data including at least one frame of target game-state information between a first time point and a second time point in a target game match, where the first time point is earlier than the occurrence time of the interaction action and the second time point is equal to or later than it; and use the first game data as sample data to train an original neural network, obtaining the target neural network.
In some embodiments, before determining, based on the target game matches, the first game data corresponding to the at least one to-be-predicted type of interaction action, the processing part 402 is further configured to: count the occurrences of the different types of interaction actions in the target game matches to obtain each type's total occurrence count, and determine the to-be-predicted types based on these totals.
In some embodiments, when determining, based on the target game matches, the first game data corresponding to the at least one to-be-predicted type of interaction action, the processing part 402 is further configured to: for each target game match, determine the occurrence time of the to-be-predicted interaction action in that match; determine the first and second time points based on the occurrence time; and, based on the first and second time points, cut the target game-state information of the to-be-predicted interaction action in that match out of the raw game data corresponding to that match, the raw game data of a target game match including multiple frames of game-state information.
In some embodiments, when using the first game data as sample data to train the original neural network and obtain the target neural network, the processing part 402 is further configured to: use the first game data as sample data and the to-be-recognized types of the corresponding interaction actions as supervision data to perform supervised training on the original neural network, obtaining multiple initialized neural networks trained with different training parameters; form different agents from the different initialized networks and have them control virtual characters in game matches, producing second raw game data; use the second raw game data to reinforcement-train the corresponding initialized networks, obtaining candidate neural networks; and determine at least one target neural network from the candidates based on their respective performance information.
In some embodiments, the performance information corresponding to each candidate neural network includes the game score obtained when the initialized neural network corresponding to that candidate neural network plays game matches.
For descriptions of the processing flow of each part in the apparatus and the interaction flow between the parts, refer to the related descriptions in the above method embodiments.
It should be noted that, in the embodiments of the present disclosure and other embodiments, a "part" may be part of a circuit, part of a processor, part of a program or software, etc.; it may also be a unit, and may be modular or non-modular.
Based on the same technical concept, an embodiment of the present application further provides a computer device. Fig. 5 is a schematic structural diagram of a computer device 500 provided by an embodiment of the present application, including a processor 501, a memory 502, and a bus 503. The memory 502 is used to store execution instructions and includes an internal memory 5021 and an external memory 5022; the internal memory 5021 temporarily stores the operation data in the processor 501 and the data exchanged with the external memory 5022 such as a hard disk, and the processor 501 exchanges data with the external memory 5022 through the internal memory 5021. When the computer device 500 runs, the processor 501 communicates with the memory 502 through the bus 503, causing the processor 501 to execute the following instructions:
obtain current game-state information, which includes at least one of the following: current state information of the target virtual object, current state information of non-target virtual objects, and current state information of the game scene; use a pre-trained target neural network to perform interaction-action prediction based on the current game-state information, obtaining a target time and a target type for the target virtual object to perform an interaction action; and, in response to the current time reaching the target time, control the target virtual object to perform the interaction action of the target type.
For the processing flow of the processor 501, refer to the descriptions of the above method embodiments.
An embodiment of the present disclosure provides a computer-readable storage medium on which a computer program is stored; when the computer program is run by a processor, the steps of the in-game interaction method described in the above method embodiments are executed. The storage medium may be a volatile or non-volatile computer-readable storage medium.
An embodiment of the present disclosure provides a computer program including computer-readable code; when the computer-readable code runs on a computer device, a processor in the computer device executes some or all of the steps of the above method embodiments.
An embodiment of the present disclosure provides a computer program product carrying program code; the instructions included in the program code can be used to execute the steps of the in-game interaction method described in the above method embodiments; refer to the above method embodiments.
The above computer program product may be implemented in hardware, software, or a combination of the two. In some embodiments, the computer program product is embodied as a computer storage medium; in other embodiments, it is embodied as a software product, such as a software development kit (SDK).
It should be pointed out here that the above descriptions of the computer device, storage medium, computer program, and computer program product embodiments are similar to the description of the above method embodiments and have similar beneficial effects; for technical details not disclosed in these embodiments, refer to the description of the method embodiments of the present disclosure.
The present disclosure relates to the field of augmented reality: by acquiring image information of a target object in a real environment and then applying various vision-related algorithms to detect or identify the target object's relevant features, states, and attributes, an AR effect combining the virtual and the real that matches the specific application is obtained. For example, the target object may involve faces, limbs, gestures, or actions related to the human body; markers or signs related to objects; or sand tables, display areas, or display items related to venues or places. Vision-related algorithms may involve visual positioning, SLAM, 3D reconstruction, image registration, background segmentation, object key-point extraction and tracking, and object pose or depth detection. Specific applications may involve not only interactive scenarios such as guided tours, navigation, explanation, reconstruction, and virtual-effect overlay and display related to real scenes or objects, but also person-related special-effect processing such as makeup beautification, body beautification, special-effect display, and virtual model display. The detection or identification of the target object's relevant features, states, and attributes can be realized by a convolutional neural network, which is a network model obtained through model training based on a deep learning framework.
Those skilled in the art can clearly understand that, for convenience and brevity of description, the working processes of the systems and apparatus described above can refer to the corresponding processes in the foregoing method embodiments. In the several embodiments provided in the present disclosure, it should be understood that the disclosed systems, apparatus, and methods can be implemented in other ways. The apparatus embodiments described above are merely illustrative: for example, the division of the units is only a logical functional division, and there may be other divisions in actual implementation; for another example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed. Furthermore, the mutual couplings, direct couplings, or communication connections shown or discussed may be indirect couplings or communication connections through some communication interfaces, apparatuses, or units, and may be electrical, mechanical, or in other forms.
The units described as separate components may or may not be physically separated, and the components shown as units may or may not be physical units; they may be located in one place or distributed over multiple network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of the present disclosure may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit.
If the functions are realized in the form of software functional units and sold or used as independent products, they can be stored in a non-volatile computer-readable storage medium executable by a processor. On this understanding, the technical solution of the present disclosure, in essence, or the part that contributes to the prior art, or a part of the technical solution, can be embodied in the form of a software product; the computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to execute all or part of the steps of the methods described in the various embodiments of the present disclosure. The aforementioned storage media include media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
Finally, it should be noted that the above embodiments are merely exemplary implementations of the present disclosure, used to illustrate rather than limit its technical solutions, and the protection scope of the present disclosure is not limited thereto. Although the present disclosure has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that anyone familiar with this technical field can still, within the technical scope disclosed herein, modify the technical solutions recorded in the foregoing embodiments, readily conceive of changes, or make equivalent substitutions for some of their technical features; such modifications, changes, or substitutions do not depart the essence of the corresponding technical solutions from the spirit and scope of the technical solutions of the embodiments of the present disclosure and shall all be covered by the protection scope of the present disclosure.
Industrial Applicability
Embodiments of the present disclosure provide an interaction method, apparatus, computer device, storage medium, computer program, and computer program product in a game, applied to an agent. The game includes multiple virtual objects, among which is a target virtual object controlled by the agent. The method includes: obtaining current game-state information, which includes at least one of the following: current state information of the target virtual object, current state information of non-target virtual objects, and current state information of the game scene; using a pre-trained target neural network to perform interaction-action prediction based on the current game-state information, to obtain a target time and a target type for the target virtual object to perform an interaction action; and, in response to the current time reaching the target time, controlling the target virtual object to perform the interaction action of the target type. According to the embodiments of the present disclosure, the agent can inform other agents or players of its operational intent in the game, realizing cooperation between players and agents and between agents, and improving the degree of cooperation during play.

Claims (20)

  1. An interaction method in a game, applied to an agent; the game comprises multiple virtual objects, and the multiple virtual objects include a target virtual object controlled by the agent; the method comprises:
    obtaining current game-state information, wherein the current game-state information comprises at least one of the following: current state information of the target virtual object, current state information of non-target virtual objects, and current state information of a game scene;
    using a pre-trained target neural network to perform interaction-action prediction based on the current game-state information, to obtain a target time and a target type for the target virtual object to perform an interaction action;
    in response to a current time reaching the target time, controlling the target virtual object to perform the interaction action of the target type.
  2. The method according to claim 1, wherein the current state information of the target virtual object comprises at least one of the following: a first virtual-resource type and a first virtual-resource amount possessed by the target virtual object, a first building state, a first skill state, a first position in the game scene, a first health value, a first magic value, first faction information, first buff data, and first debuff data;
    the current state information of the non-target virtual object comprises at least one of the following: a second virtual-resource type and a second virtual-resource amount possessed by the non-target virtual object, a second building state, a second skill state, a second position in the game scene, a second health value, a second magic value, second faction information, second buff data, second debuff data, type information, and interaction-action information;
    the current state information of the game scene comprises at least one of the following: visible-area information of the target virtual object, a third virtual-resource type, virtual-resource positions, and the remaining time until virtual resources respawn.
  3. The method according to claim 1 or 2, wherein the interaction action comprises at least one of the following: marking an action on a map of the game scene, sending voice interaction information to non-target virtual objects of the same faction, and sending text interaction information to non-target virtual objects of the same faction;
    the type of the interaction action comprises at least one of the following: retreat, attack, defend, and request for support.
  4. The method according to any one of claims 1 to 3, wherein the target neural network is trained as follows:
    based on at least one of multiple players' battle levels in the game, game-match durations, and the number of occurrences of interaction actions, determining target game matches from first game matches in which the multiple players participated;
    based on the target game matches, determining first game data corresponding to at least one to-be-predicted type of interaction action, the first game data comprising at least one frame of target game-state information between a first time point and a second time point in a target game match, wherein the first time point is a time point earlier than the occurrence time of the interaction action and the second time point is a time point equal to or later than the occurrence time of the interaction action;
    using the first game data as sample data to train an original neural network, to obtain the target neural network.
  5. The method according to claim 4, wherein before determining, based on the target game matches, the first game data corresponding to the at least one to-be-predicted type of interaction action, the method further comprises:
    counting the occurrences of different types of interaction actions in the target game matches, to obtain the total occurrence count corresponding to each type of interaction action;
    determining the to-be-predicted type based on the total occurrence counts corresponding to the different types of interaction actions.
  6. The method according to claim 4 or 5, wherein determining, based on the target game matches, the first game data corresponding to the at least one to-be-predicted type of interaction action comprises:
    for each target game match, determining the occurrence time of the to-be-predicted type of interaction action in said target game match;
    determining the first time point and the second time point based on the occurrence time;
    based on the first time point and the second time point, cutting out, from raw game data corresponding to said target game match, the target game-state information of the to-be-predicted type of interaction action in said target game match;
    wherein the raw game data corresponding to a target game match comprises multiple frames of game-state information.
  7. The method according to any one of claims 4 to 6, wherein using the first game data as sample data to train the original neural network, to obtain the target neural network, comprises:
    using the first game data as sample data and the to-be-recognized types of the interaction actions corresponding to the first game data as supervision data, performing supervised training on the original neural network to obtain multiple initialized neural networks, wherein different initialized neural networks are trained with different training parameters;
    forming different agents from the different initialized neural networks, and having the different agents control virtual characters in game matches, to obtain second raw game data;
    using the second raw game data to perform reinforcement training on the corresponding initialized neural networks, to obtain candidate neural networks corresponding to the multiple initialized neural networks;
    determining at least one target neural network from the candidate neural networks based on performance information corresponding to each candidate neural network.
  8. The method according to claim 7, wherein the performance information corresponding to each candidate neural network comprises: a game score obtained when the initialized neural network corresponding to the candidate neural network plays game matches.
  9. An interaction apparatus in a game, applied to an agent; the game comprises multiple virtual objects, and the multiple virtual objects include a target virtual object controlled by the agent; the apparatus comprises:
    an obtaining part configured to obtain current game-state information, wherein the current game-state information comprises at least one of the following: current state information of the target virtual object, current state information of non-target virtual objects, and current state information of a game scene;
    a processing part configured to use a pre-trained target neural network to perform interaction-action prediction based on the current game-state information, to obtain a target time and a target type for the target virtual object to perform an interaction action;
    a control part configured to, in response to a current time reaching the target time, control the target virtual object to perform the interaction action of the target type.
  10. The apparatus according to claim 9, wherein the current state information of the target virtual object comprises at least one of the following: a first virtual-resource type and a first virtual-resource amount possessed by the target virtual object, a first building state, a first skill state, a first position in the game scene, a first health value, a first magic value, first faction information, first buff data, and first debuff data; the current state information of the non-target virtual object comprises at least one of the following: a second virtual-resource type and a second virtual-resource amount possessed by the non-target virtual object, a second building state, a second skill state, a second position in the game scene, a second health value, a second magic value, second faction information, second buff data, second debuff data, type information, and interaction-action information; and the current state information of the game scene comprises at least one of the following: visible-area information of the target virtual object, a third virtual-resource type, virtual-resource positions, and the remaining time until virtual resources respawn.
  11. The apparatus according to claim 9 or 10, wherein the interaction action comprises at least one of the following: marking an action on a map of the game scene, sending voice interaction information to non-target virtual objects of the same faction, and sending text interaction information to non-target virtual objects of the same faction; and the type of the interaction action comprises at least one of the following: retreat, attack, defend, and request for support.
  12. The apparatus according to any one of claims 9 to 11, wherein the processing part is further configured to train the target neural network as follows: based on at least one of multiple players' battle levels in the game, game-match durations, and the number of occurrences of interaction actions, determining target game matches from first game matches in which the multiple players participated; based on the target game matches, determining first game data corresponding to at least one to-be-predicted type of interaction action, the first game data comprising at least one frame of target game-state information between a first time point and a second time point in a target game match, wherein the first time point is a time point earlier than the occurrence time of the interaction action and the second time point is a time point equal to or later than the occurrence time of the interaction action; and using the first game data as sample data to train an original neural network, to obtain the target neural network.
  13. The apparatus according to claim 12, wherein before determining, based on the target game matches, the first game data corresponding to the at least one to-be-predicted type of interaction action, the processing part is further configured to: count the occurrences of different types of interaction actions in the target game matches, to obtain the total occurrence count corresponding to each type of interaction action; and determine the to-be-predicted type based on the total occurrence counts corresponding to the different types of interaction actions.
  14. The apparatus according to claim 12 or 13, wherein, when determining, based on the target game matches, the first game data corresponding to the at least one to-be-predicted type of interaction action, the processing part is further configured to: for each target game match, determine the occurrence time of the to-be-predicted type of interaction action in said target game match; determine the first time point and the second time point based on the occurrence time; and, based on the first time point and the second time point, cut out, from raw game data corresponding to said target game match, the target game-state information of the to-be-predicted type of interaction action in said target game match, wherein the raw game data corresponding to a target game match comprises multiple frames of game-state information.
  15. The apparatus according to any one of claims 12 to 14, wherein, when using the first game data as sample data to train the original neural network to obtain the target neural network, the processing part is further configured to: use the first game data as sample data and the to-be-recognized types of the interaction actions corresponding to the first game data as supervision data, performing supervised training on the original neural network to obtain multiple initialized neural networks, wherein different initialized neural networks are trained with different training parameters; form different agents from the different initialized neural networks and have the different agents control virtual characters in game matches, to obtain second raw game data; use the second raw game data to perform reinforcement training on the corresponding initialized neural networks, to obtain candidate neural networks corresponding to the multiple initialized neural networks; and determine at least one target neural network from the candidate neural networks based on performance information corresponding to each candidate neural network.
  16. The apparatus according to claim 15, wherein the performance information corresponding to each candidate neural network comprises: a game score obtained when the initialized neural network corresponding to the candidate neural network plays game matches.
  17. A computer device, comprising a processor and a memory, wherein the memory stores machine-readable instructions executable by the processor and the processor is configured to execute the machine-readable instructions stored in the memory; when the machine-readable instructions are executed by the processor, the processor performs the steps of the interaction method in a game according to any one of claims 1 to 8.
  18. A computer-readable storage medium having a computer program stored thereon, wherein, when the computer program is run by a computer device, the computer device performs the steps of the interaction method in a game according to any one of claims 1 to 8.
  19. A computer program, comprising computer-readable code, wherein, when the computer-readable code runs on a computer device, a processor in the computer device executes the steps of the method according to any one of claims 1 to 8.
  20. A computer program product, comprising a non-transitory computer-readable storage medium storing a computer program, wherein, when the computer program is read and executed by a computer, the steps of the method according to any one of claims 1 to 8 are implemented.
PCT/CN2022/098707 2021-10-29 2022-06-14 Interaction method and apparatus in game, computer device, storage medium, computer program and computer program product WO2023071221A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202111269017.3A CN113952723A (zh) 2021-10-29 2021-10-29 Interaction method and apparatus in game, computer device and storage medium
CN202111269017.3 2021-10-29

Publications (1)

Publication Number Publication Date
WO2023071221A1 true WO2023071221A1 (zh) 2023-05-04

Family

ID=79468233

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/098707 WO2023071221A1 (zh) 2021-10-29 2022-06-14 Interaction method and apparatus in game, computer device, storage medium, computer program and computer program product

Country Status (2)

Country Link
CN (1) CN113952723A (zh)
WO (1) WO2023071221A1 (zh)


Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113952723A (zh) * 2021-10-29 2022-01-21 北京市商汤科技开发有限公司 一种游戏中的交互方法、装置、计算机设备及存储介质
CN114146420B (zh) * 2022-02-10 2022-04-22 中国科学院自动化研究所 一种资源分配方法、装置及设备
CN116999823A (zh) * 2022-06-23 2023-11-07 腾讯科技(成都)有限公司 信息显示方法、装置和存储介质及电子设备
CN116808590B (zh) * 2023-08-25 2023-11-10 腾讯科技(深圳)有限公司 一种数据处理方法和相关装置
CN117839224A (zh) * 2024-01-10 2024-04-09 广州市光合未来科技文化传媒有限公司 一种ai虚拟人的交互方法及装置

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110930483A (zh) * 2019-11-20 2020-03-27 腾讯科技(深圳)有限公司 Character control method, model training method, and related apparatus
CN111589166A (zh) * 2020-05-15 2020-08-28 深圳海普参数科技有限公司 Interactive task control and intelligent decision model training method, device, and medium
US20210245056A1 (en) * 2020-02-06 2021-08-12 Nhn Corporation Method and apparatus for predicting game difficulty by using deep-learning based game play server
CN113952723A (zh) * 2021-10-29 2022-01-21 北京市商汤科技开发有限公司 Interaction method and apparatus in game, computer device and storage medium


Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117555815A (zh) * 2024-01-11 2024-02-13 腾讯科技(深圳)有限公司 Parameter prediction method, model training method, and related apparatus
CN117555815B (zh) * 2024-01-11 2024-04-30 腾讯科技(深圳)有限公司 Parameter prediction method, model training method, and related apparatus
CN117899483A (zh) * 2024-03-19 2024-04-19 腾讯科技(深圳)有限公司 Data processing method, apparatus, device, and storage medium
CN117899483B (zh) * 2024-03-19 2024-05-28 腾讯科技(深圳)有限公司 Data processing method, apparatus, device, and storage medium

Also Published As

Publication number Publication date
CN113952723A (zh) 2022-01-21

Similar Documents

Publication Publication Date Title
WO2023071221A1 (zh) Interaction method and apparatus in game, computer device, storage medium, computer program and computer program product
WO2022222597A1 (zh) Game process control method and apparatus, electronic device, and storage medium
WO2023071854A1 (zh) Method and apparatus for controlling a virtual character in a game, computer device, storage medium, and program
CN106267822A (zh) Method and apparatus for testing game performance
Brondi et al. Evaluating the impact of highly immersive technologies and natural interaction on player engagement and flow experience in games
US20220280870A1 (en) Method, apparatus, device, and storage medium, and program product for displaying voting result
CN111841018B (zh) 模型训练方法、模型使用方法、计算机设备及存储介质
Fizek et al. 13. Laborious playgrounds: Citizen science games as new modes of work/play in the digital age
KR20220113905A (ko) 게임 애플리케이션의 사용자 인터페이스 요소를 햅틱 피드백으로 트랜스크라이빙하기 위한 시스템 및 방법
Nakamura et al. Constructing a human-like agent for the werewolf game using a psychological model based multiple perspectives
CN114307160A (zh) 训练智能体的方法
CN115888119A (zh) 一种游戏ai训练方法、装置、电子设备及存储介质
Youssef et al. Building your kingdom imitation learning for a custom gameplay using unity ml-agents
Ağıl et al. A group‐based approach for gaze behavior of virtual crowds incorporating personalities
Cybulski Enclosures at play: Surveillance in the code and culture of videogames
CN113018862A (zh) Virtual object control method and apparatus, electronic device, and storage medium
CN116943204A (zh) Virtual object control method and apparatus, storage medium, and electronic device
Wittmann et al. What do games teach us about designing effective human-AI cooperation?-A systematic literature review and thematic synthesis on design patterns of non-player characters.
CN112870727B (zh) Method for training and controlling an agent in a game
Lankoski et al. Gameplay design patterns for social networks and conflicts
Drachen Analyzing player communication in multi-player games
Hingston et al. Mobile games with intelligence: A killer application?
Baby et al. Implementing artificial intelligence agent within connect 4 using unity3d and machine learning concepts
Miller et al. Panoptyk: information driven mmo engine
CN113769396B (zh) Interaction processing method, apparatus, device, medium, and program product for virtual scenes

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22885128

Country of ref document: EP

Kind code of ref document: A1