WO2022213702A1 - Method and apparatus for configuring game inference service on cloud platform, and related device

Info

Publication number
WO2022213702A1
Authority
WO
WIPO (PCT)
Prior art keywords
game
model
training
inference
cloud platform
Application number
PCT/CN2022/072425
Inventor
伍丝琪
邵坤
朱疆成
白小龙
戴宗宏
Original Assignee
华为云计算技术有限公司
Priority claimed from CN202110742189.1A (patent CN115193053A)
Application filed by 华为云计算技术有限公司
Publication of WO2022213702A1

Classifications

    • A: HUMAN NECESSITIES
    • A63: SPORTS; GAMES; AMUSEMENTS
    • A63F: CARD, BOARD, OR ROULETTE GAMES; INDOOR GAMES USING SMALL MOVING PLAYING BODIES; VIDEO GAMES; GAMES NOT OTHERWISE PROVIDED FOR
    • A63F13/00: Video games, i.e. games using an electronically generated display having two or more dimensions
    • A63F13/55: Controlling game characters or game objects based on the game progress
    • A63F13/58: Controlling game characters or game objects based on the game progress by computing conditions of game characters, e.g. stamina, strength, motivation or energy level

Definitions

  • the present application relates to the technical field of artificial intelligence, and in particular, to a method, apparatus and related equipment for configuring a reasoning service of a game on a cloud platform.
  • AI artificial intelligence
  • game developers (such as game manufacturers) can use AI technology to implement functions such as non-player character (NPC) training, player behavior prediction, and character battle strategies in game scenarios; that is, AI technology is used to infer the battle actions or battle states of objects such as non-player characters during a game battle.
  • NPC non-player character
  • the embodiments of the present application provide a method for configuring a game reasoning service on a cloud platform, so as to reduce the cost and difficulty of providing a game reasoning service by a cloud service provider.
  • the present application also provides corresponding apparatuses, computing device clusters, computer-readable storage media, and computer program products.
  • an embodiment of the present application provides a method for configuring an inference service for a game on a cloud platform. Specifically, when an inference service for a first game is configured for a game developer on a cloud platform, a first configuration file that includes configuration information for the first game is obtained, and the inference service of the first game is then configured on the cloud platform based on the game algorithm framework of the cloud platform and the obtained first configuration file.
  • the general game algorithm framework and the corresponding configuration file can be used to automatically configure, on the cloud platform, the inference service for the specific game required by the game developer, so that the configured inference service can then be used to perform the corresponding inference for the game, such as inferring the action and/or state of any character in the game.
  • the cloud service provider does not need to carry out specialized design of the reasoning service for the game, so that the difficulty and cost of providing and using the reasoning service of the game can be effectively reduced, and the efficiency of providing and using the reasoning service can also be effectively improved.
  • the application of AI technology is inseparable from the training algorithm represented by reinforcement learning (RL) and the computing resources required to support the operation of the training algorithm.
  • RL reinforcement learning
  • the difficulty for game developers of applying AI technology to obtain game inference results can be effectively reduced.
  • not only the reasoning service corresponding to the first game but also the reasoning service corresponding to the second game may be configured on the cloud platform.
  • a second configuration file including configuration information for the second game may be obtained, so that the inference service of the second game is configured on the cloud platform based on the game algorithm framework of the cloud platform and the obtained second configuration file.
  • the inference service of the first game can be used to respond to the inference request sent by the game terminal, where the game terminal is a device that runs a game application instance of the first game, such as a terminal and/or a server.
  • the inference request includes the data to be processed for the target object in the game application instance of the first game, for example a picture containing information about the target object.
  • the response may include indication information on the action and/or state of the target object; for example, the indication information may indicate the action that the target object will perform (such as attack or long jump), and/or may indicate the future state of the target object (such as emotion or attack speed).
  • the game developer can realize the reasoning about the action and/or state of the target object through the reasoning service configured on the cloud platform, which can effectively reduce the difficulty of the game developer applying AI technology to obtain the game reasoning result.
  • the first configuration file includes one or more of the following configuration information: the action space of the target object in the game application instance of the first game, the state space of the target object in the game application instance of the first game, a target training algorithm of a first type, an artificial intelligence (AI) model of a second type, a reward function, the training mode of the AI model, the inference mode of the AI model, the storage address of the AI model, and the specifications of the computing resources used for training and inference of the AI model, so that the inference service can be configured on the cloud platform using this configuration information.
  • the implementation manner of the second configuration file is similar to that of the first configuration file, which can be understood by reference, and will not be repeated here.
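As an illustration of how such a configuration file might be represented, the following is a minimal sketch. The class name `GameInferenceConfig`, the field names, and the example values are all assumptions made for illustration; the application does not specify a concrete file schema.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class GameInferenceConfig:
    """Hypothetical container for the configuration items listed above."""
    action_space: List[str]          # actions the target object may take
    state_space: List[str]           # state variables of the target object
    training_algorithm: str          # the "first type" of target training algorithm
    model_type: str                  # the "second type" of AI model
    reward_function: str             # name of the reward function to use
    model_storage_address: str = ""  # where trained AI models are stored
    compute_spec: dict = field(default_factory=dict)  # resources for training/inference


def load_config(raw: dict) -> GameInferenceConfig:
    """Build a config from a plain dict, rejecting empty action or state spaces."""
    cfg = GameInferenceConfig(**raw)
    if not cfg.action_space or not cfg.state_space:
        raise ValueError("action_space and state_space must be non-empty")
    return cfg


# Example values are invented for illustration only.
first_game_config = load_config({
    "action_space": ["attack", "jump", "retreat"],
    "state_space": ["health", "attack_speed"],
    "training_algorithm": "ppo",
    "model_type": "dnn",
    "reward_function": "aggressive",
})
```

A second game would supply its own dictionary with the same keys, which is what lets one framework serve several games.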
  • when configuring the inference service of the first game, at least one AI model can be trained according to the first configuration file and the game algorithm framework of the cloud platform, and the inference service of the first game is then configured according to the at least one trained AI model.
  • when using the inference service to infer the action and/or state of the target object in the game application instance of the first game, the inference may specifically be performed by using the trained AI model.
  • when training the at least one AI model, multiple training requests can be received from the game terminal; the multiple training requests come from multiple game application instances of the first game, and different training requests include different training data for the same target object in the multiple game application instances, so that at least one AI model can be trained by using the training data in the multiple training requests.
  • multiple copies of training data can be generated in parallel, which can effectively improve the efficiency of generating training data, that is, the efficiency of training AI models.
  • the game terminal may also run only one game application instance in the same time period, so that one or more AI models are obtained by training using the training data generated by the single game application instance.
  • the at least one trained AI model includes a first AI model and a second AI model; in this case, the hyperparameters of the first AI model and the second AI model are different, and/or the reward functions corresponding to the first AI model and the second AI model are different.
  • in this way, hyperparameters that give the AI model a better inference effect can be identified, so that the AI model based on those hyperparameters is of higher quality, and the inference service configured based on that AI model is of higher quality as well.
  • AI models of different inference types can be trained, for example, AI models of various inference styles can be obtained by training with multiple reward functions, etc., so as to realize the diversification of inference.
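To make the link between reward functions and inference styles concrete, here is a minimal sketch with two reward functions. The function names, the weights, and the `damage_dealt` / `damage_taken` inputs are illustrative assumptions, not values taken from the application.

```python
def aggressive_reward(damage_dealt: float, damage_taken: float) -> float:
    """Weights damage dealt far more than damage taken (assumed weights)."""
    return 2.0 * damage_dealt - 0.5 * damage_taken


def defensive_reward(damage_dealt: float, damage_taken: float) -> float:
    """Penalizes damage taken far more than it rewards damage dealt (assumed weights)."""
    return 0.5 * damage_dealt - 2.0 * damage_taken


# Hypothetical lookup table keyed by the reward function name in a configuration file.
REWARD_FUNCTIONS = {"aggressive": aggressive_reward, "defensive": defensive_reward}

# The same battle step yields different training signals under each reward function.
step = {"damage_dealt": 10.0, "damage_taken": 4.0}
rewards = {name: fn(**step) for name, fn in REWARD_FUNCTIONS.items()}
```

Training one model per reward function on the same game data would then be expected to produce models with different fighting styles.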
  • when the AI models to be trained include the first AI model and the second AI model, multiple processes may run on the cloud platform. Taking a first process and a second process as an example, the training data in the multiple training requests are sent to the first process and the second process (the training data received by the two processes may be different); the first AI model is trained using the first process and the training data received by the first process, and the second AI model is trained using the second process and the training data received by the second process.
  • multiple AI models can be trained in parallel on the cloud platform, so that the training efficiency of the AI model can be improved, that is, the efficiency of configuring the reasoning service of the first game can be improved.
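The distribution of training data to the two processes can be sketched as follows. This is a simplified illustration: the round-robin assignment, the worker names, and the request layout are assumptions, and plain in-memory lists stand in for real operating-system processes.

```python
from collections import defaultdict
from typing import Dict, List


def dispatch_training_data(requests: List[dict], workers: List[str]) -> Dict[str, List[dict]]:
    """Assign the training data of each request to one worker in round-robin order."""
    assigned: Dict[str, List[dict]] = defaultdict(list)
    for i, request in enumerate(requests):
        assigned[workers[i % len(workers)]].append(request["training_data"])
    return dict(assigned)


# Four training requests, one per (hypothetical) game application instance.
requests = [
    {"instance_id": i, "training_data": {"state": [i], "action": "attack", "reward": 1.0}}
    for i in range(4)
]
batches = dispatch_training_data(requests, ["process_1", "process_2"])
```

Each worker ends up with its own batch, so the two AI models are trained on different data, as the text allows.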
  • one or more types of training algorithms and AI models may be predefined in the game algorithm framework, so that when configuring the inference service of the first game, the target training algorithm of the first type and at least one AI model of the second type can be called from the game algorithm framework according to the first type of target training algorithm and the second type of AI model specified in the first configuration file.
  • the target training algorithm can be, for example, any one of a deep reinforcement learning algorithm, a proximal policy optimization (PPO) algorithm, a soft actor-critic (SAC) algorithm, a deep deterministic policy gradient (DDPG) algorithm, a twin delayed deep deterministic policy gradient (TD3) algorithm, and the Rainbow algorithm, or another applicable algorithm.
  • the AI model for example, can be any one of a deep neural network model, a recurrent neural network model, and a convolutional neural network model, or can also be other applicable models.
  • before responding to the inference request sent by the game terminal, the format of the data in the inference request may first be processed to obtain data in a data format that the cloud platform can recognize. This avoids the situation in which the cloud platform has difficulty recognizing the inference request sent by the game terminal because the deployment environments of the game terminal and the cloud platform differ.
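A format-processing step of this kind might look like the sketch below. The JSON encoding, the field names (`target_id`, `targetId`, `observation`, `obs`), and the internal layout are all assumptions for illustration; the application does not state which formats the game terminal or the cloud platform use.

```python
import json
from typing import Union


def normalize_request(raw: Union[bytes, str, dict]) -> dict:
    """Convert a terminal-side payload into the internal layout assumed here."""
    if isinstance(raw, bytes):
        raw = raw.decode("utf-8")
    if isinstance(raw, str):
        raw = json.loads(raw)
    # Accept either of two (hypothetical) field-naming conventions from the terminal.
    target = raw.get("target_id", raw.get("targetId"))
    observation = raw.get("observation", raw.get("obs"))
    if target is None or observation is None:
        raise ValueError("request must identify a target object and carry its data")
    return {"target_id": str(target), "observation": list(observation)}


normalized = normalize_request(b'{"targetId": 7, "obs": [0.5, 1.0]}')
```

Keeping the conversion in one function means the inference logic only ever sees a single layout, whatever the terminal sends.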
  • a persistent connection may be maintained between the cloud platform and the game terminal, and the cloud platform may receive and respond to inference requests sent by the game terminal through the persistent connection.
  • when acquiring the first configuration file, the first configuration file may specifically be acquired based on the configuration information selected by the game developer.
  • the cloud platform can provide the game developer with a corresponding configuration interface on which multiple configuration information items are presented for selection, so that the game developer can choose from these items and the cloud platform can automatically generate the corresponding first configuration file based on the game developer's selection. In this way, the configuration efficiency of the game developer can be effectively improved, and the configuration experience can be improved.
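Generating a configuration file from interface selections can be sketched as a validation step followed by serialization. The available items, their option values, and the JSON output format are assumptions; the application does not describe the file format.

```python
import json

# Hypothetical configuration items and the options offered for each on the interface.
AVAILABLE_ITEMS = {
    "training_algorithm": ["ppo", "sac", "ddpg"],
    "model_type": ["dnn", "rnn", "cnn"],
}


def generate_config_file(selection: dict) -> str:
    """Validate the developer's selections and serialize them as JSON text."""
    for item, value in selection.items():
        if item not in AVAILABLE_ITEMS:
            raise KeyError(f"unknown configuration item: {item}")
        if value not in AVAILABLE_ITEMS[item]:
            raise ValueError(f"unsupported value for {item}: {value}")
    return json.dumps(selection, indent=2, sort_keys=True)


config_text = generate_config_file({"training_algorithm": "ppo", "model_type": "cnn"})
```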
  • the present application provides an apparatus for configuring an inference service of a game on a cloud platform. The apparatus includes: a communication module, configured to acquire a first configuration file, where the first configuration file includes configuration information for the first game; and a configuration module, configured to configure the inference service of the first game on the cloud platform based on the game algorithm framework of the cloud platform and the first configuration file.
  • the communication module is further configured to acquire a second configuration file, where the second configuration file includes configuration information for the second game; the configuration module is further configured to configure the inference service of the second game on the cloud platform based on the game algorithm framework of the cloud platform and the second configuration file.
  • the apparatus further includes: an inference module, configured to use the inference service of the first game to respond to an inference request sent by the game terminal, wherein the game terminal includes a device running a game application instance of the first game, the inference request includes data to be processed for the target object in the game application instance of the first game, and the response includes indication information for the action and/or state of the target object.
  • the first configuration file includes one or more of the following configuration information: the action space of the target object in the game application instance of the first game, the state space of the target object in the game application instance of the first game, a target training algorithm of a first type, an artificial intelligence (AI) model of a second type, a reward function, the training mode of the AI model, the inference mode of the AI model, the storage address of the AI model, and the specifications of the computing resources used for training and inference of the AI model.
  • the configuration module is specifically configured to: train at least one AI model based on the first configuration file and the game algorithm framework; configure the reasoning service of the first game according to the trained at least one AI model.
  • the configuration module is specifically configured to: receive multiple training requests from the game terminal, where the multiple training requests come from multiple game application instances of the first game, and different training requests include different training data for the same target object in the multiple game application instances; and train at least one AI model according to the training data in the multiple training requests.
  • when the at least one AI model includes the first AI model and the second AI model, the hyperparameters of the first AI model and the second AI model are different, and/or the reward functions corresponding to the first AI model and the second AI model are different.
  • the cloud platform runs a first process and a second process, and the configuration module is specifically configured to: send the training data in the multiple training requests to the first process and the second process according to the port number and/or IP address of the first process and the port number and/or IP address of the second process; train the first AI model using the first process and the training data received by the first process; and train the second AI model using the second process and the training data received by the second process.
  • the configuration module is specifically configured to: call, from the game algorithm framework, the target training algorithm of the first type and at least one AI model of the second type according to the first type of target training algorithm and the second type of AI model specified in the first configuration file; and train the at least one AI model of the second type based on the called target training algorithm of the first type.
  • before responding to the inference request sent by the game terminal, the format of the data in the inference request is processed to obtain data in a data format that the cloud platform can recognize.
  • a persistent connection is maintained between the cloud platform and the game terminal, and the cloud platform receives and responds to inference requests sent by the game terminal through the persistent connection.
  • the communication module is specifically configured to acquire the first configuration file based on the configuration information item selected by the game developer.
  • the present application provides a computing device cluster, where the computing device cluster includes at least one computing device, wherein each computing device includes a processor and a memory.
  • the processor is configured to execute instructions stored in the memory, so that the at least one computing device executes the method for configuring an inference service of a game on a cloud platform as in the first aspect or any implementation manner of the first aspect.
  • the present application provides a computer-readable storage medium, where instructions are stored in the computer-readable storage medium; when the instructions are run on a computing device, the computing device is caused to perform the method described in the first aspect or any implementation manner of the first aspect.
  • the present application provides a computer program product containing instructions which, when run on a computing device, cause the computing device to execute the method for configuring an inference service of a game on a cloud platform described in the first aspect or any implementation manner of the first aspect.
  • on the basis of the implementation manners provided in the above aspects, the present application may further combine them to provide more implementation manners.
  • FIG. 1 is a schematic diagram of an exemplary application scenario provided by an embodiment of the present application
  • FIG. 2 is a schematic structural diagram of a game reasoning device provided by an embodiment of the present application.
  • FIG. 3 is a schematic flowchart of a method for configuring an inference service of a game on a cloud platform according to an embodiment of the present application
  • FIG. 4 is a schematic diagram of a configuration interface provided by an embodiment of the present application.
  • FIG. 5 is a schematic diagram of the transmission of training data from the game terminal 200 to the game inference device 300 according to an embodiment of the present application;
  • FIG. 6 is a schematic flowchart of a method for configuring an inference service of a game on a cloud platform in combination with a specific scenario provided by an embodiment of the present application;
  • FIG. 7 is a schematic structural diagram of another game reasoning apparatus provided by an embodiment of the present application.
  • FIG. 8 is a schematic diagram of the remaining health (blood volume) of character A and character B at the end of a battle according to an embodiment of the present application.
  • FIG. 9 is a schematic diagram of the victory rate of character A and character B provided in an embodiment of the present application.
  • FIG. 10 is a schematic diagram of the victory rate of character A of three different fighting styles provided in the embodiment of the present application.
  • FIG. 11 is a schematic diagram of a hardware structure of a computing device cluster according to an embodiment of the present application.
  • FIG. 1 is a schematic diagram of an exemplary application scenario.
  • the game developer 101 can develop the game through the game terminal 200; for example, on the game terminal 200 the game developer 101 can develop the game environment and design the game map and the actions, states, and so on of each character in the game.
  • the game developed by the game developer 101 on the game terminal 200 can provide the game player 102 with a game experience.
  • the developed game can be run on the game terminal 200, so that the game player 102 can trigger the game operation for game experience.
  • the game developer 101 in this application refers to the subject who develops and designs the game or develops the game inference capability. It should be understood that the game terminal 200 may include a server for game development, and may also include a device running the game (e.g., a background server of the game developer 101, and/or a terminal device of the game player 102 on which the game is installed). It should also be understood that FIG. 1 shows one game terminal 200 only as an example; in actual application, there may be multiple game terminals 200.
  • the game terminal 200 may send an inference request to the cloud platform to request an inference service for the game.
  • the game on the game terminal 200 may support a "human-machine battle" mode. That is, when the player character controlled by the game player 102 competes with a non-player character controlled by the machine, the game terminal 200 may request from the cloud platform information such as the non-player character's battle action (such as attack or jump) and/or battle state (such as an attack speed increase or a movement speed decrease), so that the game terminal 200 can, according to the battle information provided by the cloud platform, control the non-player character to fight against the player character of the game player 102. As shown in FIG.
  • the cloud platform may include a game inference apparatus 300, and the game inference apparatus 300 includes an inference service for the game.
  • the game terminal 200 may specifically request an inference service from the game inference device 300 on the cloud platform, and the game inference device 300 performs corresponding game inference for the game terminal 200 .
  • the game terminal 200 may specifically include the client 201 and the server 202 shown in FIG. 2 , where the client 201 is used to interact with the game player 102 , and the core content of the developed game is deployed in the server 202 .
  • the client 201 sends an inference request to the server 202 while receiving the game operation of the game player 102 .
  • the server 202 may request the game reasoning service from the game reasoning device 300 when some information such as the battle action and/or battle state of the non-player characters is required.
  • the server 202 may forward the battle action and/or state of the non-player characters fed back by the game inference device 300 to the client 201 , so that the client 201 controls the non-player characters to interact with the game player 102 in the game.
  • the server 202 may request game inference services for multiple clients 201 at the same time.
  • the game terminal 200 may only include the client 201 .
  • a complete game is deployed in the client 201 , and the client 201 can send an inference request to the game inference apparatus 300 .
  • the game inference device 300 feeds back the corresponding inference information to the client 201, so that the client 201 can interact with the game player 102 based on the inference information.
  • the game inference device 300 may include a communication module 301 and an inference module 302 as shown in FIG. 2 .
  • the communication module 301 can communicate with the game terminal 200 through a layer-7 protocol (that is, an application-layer protocol) of the Open Systems Interconnection (OSI) model; that is, the communication module 301 can receive, based on this protocol, the inference request sent by the game terminal 200, and send the inference request or the to-be-processed data carried in the inference request to the inference module 302.
  • the reasoning module 302 can use the corresponding AI model to execute the corresponding reasoning process for the game terminal 200 , and send the result obtained by the inference to the game terminal 200 through the communication module 301 .
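The division of labor between the communication module 301 and the inference module 302 can be sketched as two small classes. The class and method names, the request layout, and the toy decision rule are assumptions for illustration; a real inference module would call a trained AI model instead.

```python
from typing import Callable


class InferenceModule:
    """Runs a model on the to-be-processed data (stands in for module 302)."""

    def __init__(self, model: Callable[[dict], str]):
        self.model = model

    def infer(self, data: dict) -> dict:
        return {"action": self.model(data)}


class CommunicationModule:
    """Receives requests and returns responses (stands in for module 301)."""

    def __init__(self, inference_module: InferenceModule):
        self.inference_module = inference_module

    def handle(self, request: dict) -> dict:
        # Forward only the carried data, then attach the request id to the result.
        result = self.inference_module.infer(request["data"])
        return {"request_id": request.get("request_id"), **result}


def toy_model(observation: dict) -> str:
    """Hypothetical rule used in place of a trained AI model."""
    return "attack" if observation["enemy_distance"] < 2.0 else "jump"


communication = CommunicationModule(InferenceModule(toy_model))
response = communication.handle({"request_id": 1, "data": {"enemy_distance": 1.0}})
```

Because the two classes only share the `infer` call, they could be deployed on different servers, as the text notes for edge and cloud centers.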
  • the game inference apparatus 300 may be deployed on a server in a cloud center, or the game inference apparatus 300 may also be a server deployed in an edge center.
  • the communication module 301 and the inference module 302 on the game inference device 300 may be deployed separately, for example, the communication module 301 may be deployed in a server in an edge center, and the inference module 302 may be deployed in a cloud center server etc.
  • the deployment manner of the game inference apparatus 300 is not limited.
  • the cloud center in this application refers to a set of devices established by a cloud service provider and used to provide services for cloud tenants in a region (eg, East China region).
  • the cloud center usually includes a large number of resources, and can provide basic resource services and/or software application services for cloud tenants in various areas within the region.
  • the cloud center includes multiple computing devices (such as servers). The hardware resources in each computing device, and the virtual resources abstracted from those hardware resources, may be used to deploy the aforementioned game inference device 300 (hardware resources and virtual resources can also be called computing nodes; for example, a computing node can be a container or a virtual machine).
  • cloud centers may not meet the latency requirements. Therefore, cloud service providers also set up edge centers.
  • the edge center in this application refers to a set of devices established by a cloud service provider in at least one specific area in an area, and the edge center also includes multiple computing devices, which can be used to provide services for tenants in a specific area in an area. Because edge centers are geographically deployed closer to cloud tenants in a specific region than cloud centers, edge centers can provide faster service responses.
  • Cloud service providers can provide cloud services through cloud platforms.
  • Cloud platforms include software resources and hardware resources owned by cloud service providers.
  • cloud platforms include software systems that interact with cloud tenants to sell, configure, and run cloud services.
  • the cloud platform can display at least one cloud service to the user. After the cloud tenant purchases and configures a cloud service on the cloud platform, when the cloud tenant uses the cloud service, the cloud platform can call the device corresponding to the cloud service on a node in the cloud center or the edge center to respond to the service. For example, in this application, after the user configures the game inference service on the graphical user interface of the cloud platform, the game inference device 300 deployed in the cloud center or the edge center can process the inference request from the game terminal 200 and return the response.
  • the game terminal 200 may complete the configuration of the inference service on the cloud platform in advance before requesting the inference service of the game.
  • the game inference device 300 may be a device for configuring the inference service of the game on the cloud platform.
  • the game inference apparatus 300 may further include a configuration module 303 .
  • the game developer 101 may provide a configuration file for a specific game to the cloud platform.
  • the configuration module 303 in the game inference device 300 can configure the game inference service for the game developer 101 on the cloud platform based on the game algorithm framework of the cloud platform and the configuration file provided by the game developer 101, so that the inference module 302 can use the successfully configured inference service to perform game inference for the game terminal 200.
  • multiple training algorithms, multiple AI models, and so on can be predefined in the game algorithm framework, so that the configuration module 303 can select the training algorithm, the AI model, and other content from the game algorithm framework according to the configuration file to configure the inference service.
  • the game inference device 300 can use the general game algorithm framework and the configuration file provided by the game developer 101 to automatically configure, on the cloud platform, the inference service required by the game developer 101. For example, when the game inference device 300 obtains the first configuration file including the configuration information of the first game, it can configure the inference service of the first game on the cloud platform based on the game algorithm framework on the cloud platform and the first configuration file; and when the game inference device 300 obtains the second configuration file including the configuration information of the second game, it can configure the inference service of the second game on the cloud platform based on the game algorithm framework and the second configuration file.
  • the first game and the second game are different games.
  • the game inference device 300 can subsequently use the configured inference service to perform game inference for the game terminal 200. As a result, the cloud service provider does not need to specially design an inference service for each game, which effectively reduces the difficulty and cost for the cloud service provider of providing inference services and also effectively improves the efficiency with which the cloud service provider provides them.
  • the application of AI technology is inseparable from the training algorithm represented by reinforcement learning (RL) and the computing resources required to support the operation of the training algorithm.
  • the game terminal 200 uses the inference service on the cloud platform to perform game inference, which can effectively reduce the difficulty for the game developer 101 of applying AI technology to obtain game inference results.
  • the structure of the game reasoning apparatus 300 shown in FIG. 2 is only used as an exemplary illustration, and other possible implementations may also be adopted for the structure of the game reasoning apparatus 300 in practical application.
  • the game inference device 300 can also be divided into a configuration device and an inference device, wherein the configuration device is used to configure the inference service of the game on the cloud platform, and the inference device is used to perform the corresponding game inference for the game terminal 200 using the configured inference service.
  • the game inference apparatus 300 may further include an online service module, which may be configured to publish the inference service as an online cloud service on the cloud platform after the configuration module 303 completes the configuration of the inference service of the game.
  • the configuration module 303 and the inference module 302 may be integrated into one module or the like. This embodiment does not limit the specific implementation of the game inference apparatus 300 .
  • FIG. 3 is a schematic flowchart of a method for configuring a game inference service on a cloud platform according to an embodiment of the present application.
  • This method can be applied to the game reasoning apparatus 300 shown in FIG. 2 above.
  • the game reasoning device 300 can configure the corresponding reasoning services of various games on the cloud platform according to the configuration files of various games.
  • For ease of description, configuring the inference service of the first game on the cloud platform is taken as an example below.
  • The process by which the game inference apparatus 300 configures the inference service of the second game (and other games) according to the second configuration file corresponding to the second game (and other games) can be understood with reference to the specific implementation of the following embodiments. This embodiment does not limit this.
  • the method for configuring the inference service of the game on the cloud platform shown in FIG. 3 may specifically include:
  • the game inference apparatus 300 acquires a first configuration file, where the first configuration file includes configuration information for the first game.
  • The game inference apparatus 300 may be requested to configure an inference service for the first game, so that the stronger data processing capability of the cloud platform and a higher-quality AI model can be used to perform inference for the target object in the game application instance of the first game (such as a game character in the game application instance). The inference service can be used to infer actions and/or states of one or more game characters in the game.
  • The game developer 101 may provide the game inference apparatus 300 with a first configuration file, so that the game inference apparatus 300 configures the inference service of the first game on the cloud platform according to the first configuration file. It should be understood that the game developer 101 may send the first configuration file to the game inference apparatus 300 through the game terminal 200, as shown in FIG. 3. In other embodiments, the game developer 101 may also use any other device to provide the first configuration file to the game inference apparatus 300. For example, the game developer 101 may use any terminal device to log in to the web page of the cloud platform and perform a configuration operation on the web interface to provide the first configuration file to the game inference apparatus 300.
  • The first configuration file may include one or more of the following: the action space of the target object in the game application instance of the first game, the state space of the target object in the game application instance of the first game, the reward function, the type of AI model, the type of target training algorithm used to train the AI model, the training method of the AI model, the inference method of the AI model, the storage address of the AI model, and the specifications of the computing resources used for AI model training and inference.
  • When the first configuration file includes only part of the above information, the remaining information can be determined by the game inference apparatus 300 itself; for example, it can be preset on the cloud platform.
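  • As a rough illustration of the configuration items listed above, the first configuration file could be represented as a mapping in which any item the developer omits falls back to a platform preset. The application does not specify a file format; the field names, the default values and the `complete_config` helper below are all hypothetical.

```python
# Hypothetical sketch: field names and preset values are illustrative only.
PLATFORM_PRESETS = {
    "model_type": "DNN",
    "training_algorithm": "PPO",
    "training_method": "self_play",
    "inference_method": "single_style",
    "model_save_path": "models/first_game/",
    "compute_spec": {"cpu_cores": 8, "gpus": 1},
}


def complete_config(developer_config: dict) -> dict:
    """Fill items the developer left out with presets held on the platform.

    Values supplied by the developer take precedence over the presets.
    """
    return {**PLATFORM_PRESETS, **developer_config}


first_config = complete_config({
    "action_space": ["forward", "backward", "left", "right", "jump", "attack", "dodge"],
    "state_space": ["hp", "mana", "attack_power", "defense_power", "skill_cooldown"],
    "reward_function": "damage_dealt - damage_taken",
    "training_algorithm": "DQN",  # overrides the preset "PPO"
})
```

  • In this sketch the developer overrides only the training algorithm, so the model type, save path and compute specification come from the presets.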
  • the action space of the target object includes at least one action of the target object, such as the target object moving forward, backward, left, right, jumping, attacking, dodging and other actions in the first game.
  • the action deduced by the game inference device 300 for the target object is an action in the action space.
  • The target object may be a non-player character in the game application instance of the first game running on the game terminal 200, such as a "bot" (computer-controlled opponent) or "wild monster" controlled by the game terminal 200 that fights against the player in a multiplayer online battle arena (MOBA) game.
  • The target object can also be an object whose actions or states in a future time period (or another time period) need to be predicted, for example a player character. In that case, the process in which game player A predicts the behavior and/or state of game player B can be regarded as part of game player A experiencing the game.
  • The state space of the target object includes at least one state of the target object, such as the non-player character's HP, mana, attack power, defense power, skill cooldown time, and emotion in the game scene.
  • the state deduced by the game inference device 300 for the target object is a state in the state space.
  • the type of target training algorithm refers to the type of algorithm used to train the AI model used to infer the action and/or state of the target object.
  • The type of target training algorithm can be specified by the game developer 101 before the inference service is configured; it may also be specified by the game inference apparatus 300 or the like.
  • The target training algorithm may be, for example, any reinforcement learning algorithm among a deep Q-network (DQN) algorithm, a proximal policy optimization (PPO) algorithm, a soft actor-critic (SAC) algorithm, a deep deterministic policy gradient (DDPG) algorithm, a twin delayed deep deterministic policy gradient (TD3) algorithm, and a Rainbow algorithm, or may be another applicable algorithm.
  • The type of AI model refers to the type of the AI model used to infer the action and/or state of the target object.
  • The AI model may be any one of a deep neural network (DNN) model, a recurrent neural network (RNN) model, and a convolutional neural network (CNN) model, or another applicable model.
  • When the AI model is trained, the training data can be input into the AI model, and the parameters and/or network structure of the AI model can be adjusted according to the difference between the results output by the AI model and the actual results in the training data, thereby implementing the training of the AI model.
  • the reward function refers to the function used to control the adjustment direction of the parameters in the AI model in the process of training the AI model.
  • During training, the game inference apparatus 300 can use the reward function to calculate a reward value for the result output by the AI model, and determine the adjustment direction of the parameters in the AI model by comparing the reward value with a preset threshold, so that model training is completed through multiple iterative adjustments.
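  • The role of the reward function and the preset threshold described above can be sketched as follows. This is only an illustration under assumed names: the state fields (`own_hp`, `enemy_hp`), the reward formula and the `adjustment_direction` helper are hypothetical and are not taken from the application.

```python
def reward(prev_state: dict, next_state: dict) -> float:
    """Toy reward: damage dealt to the opponent minus damage taken."""
    damage_dealt = prev_state["enemy_hp"] - next_state["enemy_hp"]
    damage_taken = prev_state["own_hp"] - next_state["own_hp"]
    return damage_dealt - damage_taken


def adjustment_direction(reward_value: float, threshold: float = 0.0) -> int:
    """Return +1 to make the chosen output more likely, -1 to make it less likely."""
    return 1 if reward_value >= threshold else -1


before = {"own_hp": 100, "enemy_hp": 100}
after = {"own_hp": 90, "enemy_hp": 70}
value = reward(before, after)            # 30 dealt - 10 taken
direction = adjustment_direction(value)  # reward is above the threshold
```

  • A real training algorithm would scale its parameter update by the reward rather than use only its sign; the sign is shown here because the text speaks only of an adjustment direction.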
  • the training method of the AI model can be the training method of self-play.
  • self-play refers to the game between the model itself and a virtual opponent.
  • the virtual opponent can be the model itself with past experience, or it can be an agent trained by other models.
  • The game inference apparatus 300 may acquire the training data for object A generated by one or more game application instances and input the training data of object A into the AI model, so that the AI model is trained according to the training data of object A.
  • the reasoning method can be a method of inferring actions and/or states of one style for a target object, or a method of inferring actions and/or states of multiple styles in parallel.
  • the save address of the AI model is used to indicate the save location of the AI model on the cloud platform after the training of the AI model is completed.
  • The computing resource specification used for AI model training and inference refers to the specification of the computing resources that the game inference apparatus 300 relies on when training the AI model or using the AI model to perform inference, and can be defined in the first configuration file.
  • the first configuration file may also include other information, such as a heuristic algorithm, etc.
  • The heuristic algorithm may be used to impose rationality constraints on the actions and/or states inferred by the AI model trained with the target training algorithm.
  • the specific implementation of the first configuration file is not limited.
  • The game terminal 200 may present a configuration interface as shown in FIG. 4 to the user. The configuration interface may present information prompting the game developer 101 to input the relevant configuration, so that the game developer 101 can configure the action space, state space, reward function, target training algorithm, AI model, etc. of the target object on the configuration interface (the configuration of the remaining content is not shown).
  • The game terminal 200 may present configuration information items that can be selected by the game developer 101 on the configuration interface. For example, the candidate items for different types of target training algorithms and AI models shown on the configuration interface can be provided to the game terminal 200 in advance by the game inference apparatus 300, so that the game developer 101 can directly select the training algorithm and AI model on the configuration interface without inputting a specific training algorithm file or AI model file, which further facilitates configuration by the game developer 101. The game terminal 200 can then automatically generate the corresponding first configuration file according to the configuration information items selected by the game developer 101 and send it to the game inference apparatus 300.
  • the game developer 101 may also provide the first configuration file to the game inference apparatus 300 in other manners, which is merely an exemplary description, and its specific implementation manner is not limited.
  • the game inference apparatus 300 configures the inference service of the first game on the cloud platform based on the game algorithm framework of the cloud platform and the acquired first configuration file.
  • the game algorithm framework may be an algorithm library in which one or more different types of training algorithms and AI models are pre-defined.
  • Various training algorithms and AI models in the game algorithm framework can be pre-built.
  • the L2 regularization term and/or the Dropout layer can be used to improve the generalization performance of the AI model, so as to avoid the low universality of the inference actions output by the trained AI model as much as possible.
  • the number of network layers and/or the number of neurons in the AI model can also be adaptively adjusted according to actual application requirements.
  • The game inference apparatus 300 may call the training algorithm and AI model in the game algorithm framework through an application programming interface (API).
  • the game developer 101 may subscribe to the game algorithm framework on the cloud platform, so that the game reasoning apparatus 300 configures the reasoning service of the first game based on the game algorithm framework subscribed to.
  • the game reasoning apparatus 300 may include a communication module 301 and a configuration module 303 as shown in FIG. 2 .
  • the communication module 301 may send the first configuration file to the configuration module 303 .
  • The configuration module 303 can call the target training algorithm of the corresponding type in the game algorithm framework according to the type of target training algorithm defined in the first configuration file, and call one or more AI models of the corresponding type in the game algorithm framework according to the type of AI model defined in the first configuration file.
  • The configuration module 303 can then use the called target training algorithm to train the AI model, and configure the inference service required by the game developer 101 according to the trained AI model.
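  • One way to picture how a configuration module might look up a training algorithm and an AI model by the type names in the configuration file is a pair of registries. The sketch below assumes this design; the registries, the `register` decorator, `TinyModel` and `train_dqn` are hypothetical placeholders and not the framework's actual API.

```python
from typing import Callable, Dict

# Hypothetical registries standing in for the game algorithm framework.
ALGORITHMS: Dict[str, Callable] = {}
MODELS: Dict[str, Callable] = {}


def register(registry: Dict[str, Callable], name: str):
    """Decorator that stores a class or function under a type name."""
    def decorator(obj):
        registry[name] = obj
        return obj
    return decorator


@register(MODELS, "DNN")
class TinyModel:
    def __init__(self):
        self.trained = False


@register(ALGORITHMS, "DQN")
def train_dqn(model, training_data):
    # Placeholder: a real algorithm would update model parameters here.
    model.trained = True
    return model


def configure_inference_service(config: dict, training_data: list):
    """Look up algorithm and model by the types named in the config, then train."""
    algorithm = ALGORITHMS[config["training_algorithm"]]
    model = MODELS[config["model_type"]]()
    return algorithm(model, training_data)


service_model = configure_inference_service(
    {"training_algorithm": "DQN", "model_type": "DNN"}, training_data=[])
```

  • With such a lookup, supporting another game only requires a different configuration, not new service code.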
  • the specification of computing resources required by the configuration module 303 to implement AI model training and the subsequent reasoning module 302 to use the AI model to perform game reasoning can be determined according to the first configuration file provided by the game developer 101 .
  • the configuration module 303 can train the AI model by means of hyperparameter search.
  • the hyperparameters in the AI model refer to the parameters preset by the AI model before model training, and other parameters remaining in the AI model can be determined through the subsequent model training process.
  • The preset hyperparameters may not necessarily enable the inference effect of the AI model for the target object to reach a higher or the highest level. Therefore, in this embodiment, the game inference apparatus 300 can train the AI model by means of hyperparameter search, so as to determine the hyperparameters that enable the trained AI model to achieve a better inference effect.
  • The configuration module 303 can predetermine multiple sets of possible hyperparameter values, so as to construct multiple AI models based on the different values. The construction of a first AI model and a second AI model is taken as an example (in practice, a larger number of AI models can be constructed based on the multiple possible hyperparameter values).
  • the first AI model and the second AI model have different hyperparameters, but both the first AI model and the second AI model belong to the type of AI model defined in the first configuration file. In this way, the configuration module 303 can use the pre-acquired training data to separately train the first AI model and the second AI model.
  • After training, the configuration module 303 can use the reward function defined in the first configuration file to calculate the reward value corresponding to the first AI model and the reward value corresponding to the second AI model respectively.
  • When the reward value corresponding to the first AI model is greater than that corresponding to the second AI model, the configuration module 303 uses the hyperparameters of the first AI model as the hyperparameters found by the search, and uses the first AI model as the AI model with the better training effect.
  • the reward value corresponding to the AI model can be calculated by the reward function preconfigured by the game developer 101 .
  • the configuration module 303 can use the hyperparameter of the AI model with the largest reward value among the multiple AI models as the hyperparameter to be searched out.
  • the AI model with this hyperparameter is the AI model with the highest training effect. It should be understood that the hyperparameter search is performed according to the reward value as an example for illustrative description. In practical application, the hyperparameter search may also be completed in other possible ways, which is not limited in this embodiment.
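  • The reward-based hyperparameter search described above can be sketched as a grid search that keeps the candidate with the largest reward value. The hyperparameter names and the toy `train_and_evaluate` scoring function are assumptions made for illustration; in practice that function would train an AI model and score it with the developer's reward function.

```python
import itertools


def train_and_evaluate(hyperparams: dict) -> float:
    """Stand-in for training one AI model and computing its reward value.

    Toy score that peaks at learning_rate=0.01 and hidden_units=64.
    """
    lr_penalty = abs(hyperparams["learning_rate"] - 0.01) * 100
    size_penalty = abs(hyperparams["hidden_units"] - 64) / 64
    return 10.0 - lr_penalty - size_penalty


def hyperparameter_search(grid: dict):
    """Build one candidate per combination and keep the highest-reward one."""
    names = list(grid)
    candidates = [dict(zip(names, values))
                  for values in itertools.product(*grid.values())]
    scored = [(train_and_evaluate(c), c) for c in candidates]
    best_reward, best = max(scored, key=lambda pair: pair[0])
    return best, best_reward


best_hyperparams, best_reward = hyperparameter_search(
    {"learning_rate": [0.1, 0.01, 0.001], "hidden_units": [32, 64, 128]})
```

  • Here nine candidate models are scored one after another; the same loop could be spread across several processes to shorten the search.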
  • The configuration module 303 can create one process on the cloud platform and use it to train the AI models with different hyperparameters serially, wherein the same reward function is used when training each AI model with different hyperparameters.
  • the configuration module 303 may create multiple processes on the cloud platform, each process may be responsible for training at least one AI model, and the AI models trained by different processes have different hyperparameters.
  • the configuration module 303 can create the first process and the second process on the cloud platform, where the first process is used to train the first AI model , and the second process is used to train the second AI model.
  • the process created by the configuration module 303 can be represented as an executor (worker), and each executor can be implemented by a process or the like. In this way, the configuration module 303 can use multiple processes to train multiple AI models in parallel, thereby effectively improving the efficiency of hyperparameter search.
  • the processes and AI models can be in one-to-one correspondence, so that the configuration module 303 can train the AI models corresponding to each process in parallel through each process.
  • When there are more AI models than processes, the configuration module 303 can use the multiple processes to first train some of the AI models in parallel, and after completing the training of these AI models, continue to train the remaining AI models by using the multiple processes.
  • Hyperparameter data can be exchanged between the multiple processes. In this way, the hyperparameters of well-performing AI models can be reused between processes, and the remaining hyperparameters can be randomly selected to explore new hyperparameters, which can reduce the computational overhead required to train multiple AI models.
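  • The exchange of hyperparameters between processes can be pictured as a "reuse then explore" step. The sketch below is one possible reading of the text, not the method of the application: the worst worker copies the best worker's hyperparameters and then resamples one of them at random. The search space, the worker records and `exchange_hyperparameters` are hypothetical.

```python
import random

# Hypothetical search space for two hyperparameters.
SEARCH_SPACE = {
    "learning_rate": [0.1, 0.01, 0.001],
    "discount": [0.9, 0.95, 0.99],
}


def exchange_hyperparameters(workers: list, rng: random.Random) -> list:
    """Copy the best worker's hyperparameters to the worst worker (reuse),
    then resample one hyperparameter at random (explore)."""
    best = max(workers, key=lambda w: w["reward"])
    worst = min(workers, key=lambda w: w["reward"])
    if best is worst:
        return workers
    worst["hyperparams"] = dict(best["hyperparams"])
    name = rng.choice(sorted(SEARCH_SPACE))
    worst["hyperparams"][name] = rng.choice(SEARCH_SPACE[name])
    return workers


workers = [
    {"id": 1, "hyperparams": {"learning_rate": 0.01, "discount": 0.99}, "reward": 8.5},
    {"id": 2, "hyperparams": {"learning_rate": 0.1, "discount": 0.9}, "reward": 2.0},
]
exchange_hyperparameters(workers, random.Random(0))
```

  • After the call, worker 2 differs from worker 1 in at most one hyperparameter, so it continues from a setting that is already known to perform well.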
  • an optimal AI model trained by the configuration module 303 by means of hyperparameter search can be used to infer a style of action/state.
  • the configuration module 303 can also obtain an AI model capable of inferring a variety of styles of different actions/states by means of population based training (PBT).
  • Different reward functions are used when training AI models of different styles.
  • For example, the configuration module 303 can build the first AI model and the second AI model on the cloud platform, wherein the first AI model corresponds to reward function 1, and the configuration module 303 can use reward function 1 to train the first AI model to infer actions/states of a first style.
  • The second AI model corresponds to reward function 2, and the configuration module 303 can use reward function 2 to train the second AI model to infer actions/states of a second style.
  • Population evolution is an asynchronous automatic hyperparameter tuning and optimization method that combines parallelized search and sequential optimization.
  • The configuration module 303 can obtain AI models for inferring different styles by training with different predefined reward functions; that is, each reward function can correspond to an AI model of one inference style. For example, in a battle game scenario, for the non-player character A in the battle game, the configuration module 303 can train AI models of three combat styles for the non-player character A, namely "aggressive", "conservative" and "balanced", through the population evolution process.
  • The AI models of the three combat styles can correspond to three different reward functions respectively; that is, the AI model of the "aggressive" combat style can correspond to reward function 1, the AI model of the "conservative" combat style can correspond to reward function 2, and the AI model of the "balanced" combat style can correspond to reward function 3.
  • For each combat style, the configuration module 303 can train an AI model that belongs to that combat style and has a better inference effect (such as the AI model with the largest reward value) through the above-mentioned hyperparameter search.
  • the different reward functions used in the population evolution process may be determined by the configuration module 303 according to the first configuration file, or may be set by the configuration module 303 itself.
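  • The correspondence between combat styles and reward functions can be illustrated with three toy reward functions that weight the same battle outcome differently. The weights and the `damage_dealt` / `damage_taken` inputs are assumptions chosen for illustration only.

```python
def aggressive_reward(damage_dealt: float, damage_taken: float) -> float:
    """Reward function 1: values dealing damage far more than avoiding it."""
    return 2.0 * damage_dealt - 0.5 * damage_taken


def conservative_reward(damage_dealt: float, damage_taken: float) -> float:
    """Reward function 2: values avoiding damage far more than dealing it."""
    return 0.5 * damage_dealt - 2.0 * damage_taken


def balanced_reward(damage_dealt: float, damage_taken: float) -> float:
    """Reward function 3: weighs both equally."""
    return damage_dealt - damage_taken


STYLE_REWARDS = {
    "aggressive": aggressive_reward,
    "conservative": conservative_reward,
    "balanced": balanced_reward,
}

# The same episode is scored differently depending on the style being trained.
episode = {"damage_dealt": 40, "damage_taken": 30}
scores = {style: fn(**episode) for style, fn in STYLE_REWARDS.items()}
```

  • Because the same trade of 40 damage dealt for 30 taken is rewarded under the "aggressive" function and penalized under the "conservative" one, models trained with them would be pushed toward different behaviors.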
  • During population evolution, a new AI model may be evolved through iterative training.
  • Each AI model, including the new AI model and the old AI model, corresponds to a port for receiving training data.
  • When the configuration module 303 uses a process to iteratively train the AI model, it can replace the old AI model with the new AI model, so that the process can reuse the port corresponding to the old AI model to receive training data and use that training data to train the new AI model. This effectively avoids the situation in which too many AI models exist before and after population evolution and the process therefore occupies too many ports in the game inference apparatus 300.
  • the AI model can be trained based on the training method of self-play defined in the first configuration file.
  • the training data required for training the AI model may be provided by the game terminal 200 to the game inference apparatus 300 .
  • the communication module 301 in the game inference device 300 can send the training data to the configuration module 303 .
  • The training data can be generated by the game application instance of the first game running on the game terminal 200; for example, it can be game data such as the attack power, defense power, and HP of the non-player character and the player character at different battle moments in the game application instance of the first game.
  • the game terminal 200 may send multiple training requests to the communication module 301, and different training requests include different training data of the target object.
  • the multiple training requests come from multiple game application instances of the first game running on the game terminal 200 .
  • multiple game application instances can generate multiple pieces of data about the target object at runtime.
  • multiple game application instances on the game terminal 200 can generate more training data per unit time, thereby speeding up the training process of the AI model by the game inference apparatus 300 .
  • The above takes multiple game application instances on the game terminal 200 generating training data in parallel as an example.
  • In other embodiments, the game terminal 200 may also run only one game application instance of the first game to generate the training data.
  • The training data for the target object generated by the game application instance running on the game terminal 200 may contain some invalid data, such as data used to mark the data generation time, data size, etc. Such data has no guiding significance for the configuration module 303 in training the AI model; therefore, the game terminal 200 can filter out this part of the data as invalid data. In this way, the amount of training data sent by the game terminal 200 to the game inference apparatus 300 can be reduced, thereby reducing the time delay caused by data communication between the game terminal 200 and the game inference apparatus 300 and improving model training efficiency.
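  • The filtering of invalid data on the terminal side can be sketched as dropping bookkeeping fields before a record is sent. The record layout and the field names `generated_at` and `size_bytes` are hypothetical.

```python
# Hypothetical bookkeeping fields that carry no training signal.
INVALID_FIELDS = {"generated_at", "size_bytes"}


def filter_training_record(record: dict) -> dict:
    """Return a copy of the record without the bookkeeping fields."""
    return {key: value for key, value in record.items()
            if key not in INVALID_FIELDS}


raw_record = {
    "generated_at": "2021-06-30T12:00:00",
    "size_bytes": 128,
    "npc_hp": 80,
    "player_hp": 65,
    "npc_attack": 12,
}
clean_record = filter_training_record(raw_record)
```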
  • The configuration module 303 can use the first process and the training data received by the first process to train the first AI model, and use the second process and the training data received by the second process to train the second AI model.
  • The process of training any one of the multiple AI models by a single process is similar to the above process and is not repeated here.
  • the training data used by different processes when training their corresponding AI models may be derived from one or more game application instances of the first game.
  • The communication module 301 can send the training data to one or more processes according to the pre-configuration. For example, as shown in FIG. 5, a process 1 and a process 2 may be created on the game inference apparatus 300, and game application instance 1, game application instance 2 and game application instance n of the first game may run on the game terminal 200.
  • The communication module 301 can send the training data 1 of the game application instance 1 and the training data 2 of the game application instance 2 to the process 1, and send the training data n of the game application instance n to the process 2.
  • the communication module 301 may be preconfigured with a corresponding relationship between the process and the game application instance, so that the communication module 301 can send the training data generated by the game application instance to the corresponding process according to the corresponding relationship.
  • The communication module 301 may also send the training data 1 of the game application instance 1 to the process 1 and the process 2 at the same time, which is not limited in this embodiment.
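  • The preconfigured correspondence between game application instances and processes can be sketched as a routing table consulted for each incoming batch. The identifiers and the `dispatch` helper are hypothetical; the mapping mirrors the FIG. 5 example in which instances 1 and 2 feed process 1 and instance n feeds process 2.

```python
from collections import defaultdict

# Hypothetical routing table: game application instance -> receiving processes.
ROUTES = {
    "instance_1": ["process_1"],
    "instance_2": ["process_1"],
    "instance_n": ["process_2"],
}


def dispatch(batches: list, routes: dict) -> dict:
    """Deliver each (instance_id, data) pair to every process mapped to it."""
    inbox = defaultdict(list)
    for instance_id, data in batches:
        for process_id in routes[instance_id]:
            inbox[process_id].append(data)
    return dict(inbox)


inboxes = dispatch(
    [("instance_1", "training data 1"),
     ("instance_2", "training data 2"),
     ("instance_n", "training data n")],
    ROUTES,
)
```

  • To send the training data of one instance to both processes at the same time, its entry would simply list both, for example `["process_1", "process_2"]`.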
  • the game inference apparatus 300 can use the training data generated by different game application instances on the game terminal 200 to simultaneously train multiple AI models.
  • The game terminal 200 can provide training data to the communication module 301 through one output port, instead of providing training data for AI models with different inference styles through multiple output ports, thereby reducing the port requirements on the game terminal 200.
  • the game inference device 300 may also generate a notification message, and send the notification message to the game terminal 200 through the communication module 301 to notify the game terminal 200 of the training process .
  • the game reasoning apparatus 300 may also feed back the training results (eg, including game results, etc.) for the target object in each training process to the game terminal 200 through the communication module 301 . In this way, the game developer 101 can view the relevant data in the model training process on the game terminal 200 through a corresponding interface or window.
  • The above takes the game inference apparatus 300 creating and training a new AI model as an example.
  • In other embodiments, an existing AI model can also be reused and trained by means of "surgery"; that is, a new network layer can be added to the network structure of the existing AI model, so that when the AI model is trained, mainly the hyperparameters of the newly added network layer are searched and the parameters in that network layer are trained, which can improve the efficiency of model training and reduce the amount of computation required for model training.
  • After the inference service is configured, the game inference apparatus 300 can use it to respond to inference requests for the first game sent by the game terminal 200, so as to perform corresponding game inference for the game terminal 200.
  • this embodiment may further include:
  • the game terminal 200 sends an inference request to the game inference apparatus 300, where the inference request includes the data to be processed of the target object in the game application instance of the first game.
  • The data to be processed carried in the inference request may be, for example, data indicating the state of the non-player character (i.e., the target object) and the opposing player character in the first game, such as a game screen of the non-player character and the opposing player character, which may include information such as the HP, magic power, attack power, defense power, and skill status of the non-player character and the opposing player character, or text information describing the battle status of the non-player character and the opposing player character.
  • the data to be processed may be, for example, video or picture data including the user's past actions. In this embodiment, the specific implementation manner of the data to be processed is not limited.
  • The game inference apparatus 300 can provide an API to the game terminal 200, so that the game terminal 200 can send multiple inference requests to the communication module 301 in the game inference apparatus 300 through the API, to request the inferred actions and/or states of the target object at multiple moments in the future.
  • If the game terminal 200 had to establish a new connection with the communication module 301 for every request, the connection establishment process would increase the delay of each request for the inference service. Based on this, in a possible implementation, the game terminal 200 can establish a long connection with the communication module 301.
  • For example, the game terminal 200 can use Hypertext Transfer Protocol Version 1.1 (HTTP/1.1) to establish a long connection with the communication module 301.
  • After the game terminal 200 successfully establishes the connection with the communication module 301, it does not need to establish a connection again each time it requests the inference service from the game inference apparatus 300; instead, it can directly send the inference request to the game inference apparatus 300 over the established long connection, which effectively reduces the delay for the game terminal 200 to obtain the inference result for the target object.
  • the game reasoning apparatus 300 may also receive the training data sent by the game terminal 200 through a pre-established persistent connection.
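  • The effect of a long (persistent) HTTP/1.1 connection can be demonstrated with Python's standard library alone. The sketch below starts a stand-in inference endpoint on the local machine and sends several requests over a single `http.client.HTTPConnection`; the `/infer` path, the JSON payload and the fixed "attack" reply are hypothetical and only illustrate connection reuse.

```python
import http.client
import json
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

connections_seen = set()  # distinct client sockets observed by the server


class InferenceHandler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # enables persistent connections

    def do_POST(self):
        connections_seen.add(self.client_address)
        length = int(self.headers.get("Content-Length", 0))
        request = json.loads(self.rfile.read(length))
        body = json.dumps({"action": "attack", "step": request["step"]}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet


def run_demo(num_requests: int = 3) -> list:
    """Send several inference requests over one persistent connection."""
    server = ThreadingHTTPServer(("127.0.0.1", 0), InferenceHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    conn = http.client.HTTPConnection("127.0.0.1", server.server_address[1])
    replies = []
    try:
        for step in range(num_requests):
            conn.request("POST", "/infer", body=json.dumps({"step": step}),
                         headers={"Content-Type": "application/json"})
            replies.append(json.loads(conn.getresponse().read()))
    finally:
        conn.close()
        server.shutdown()
        server.server_close()
    return replies


replies = run_demo()
```

  • Each response must be read in full before the next request is sent on the same connection; the server side records the client address so that reuse of a single connection can be checked.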
  • The game terminal 200 and the game inference apparatus 300 may have different deployment environments; for example, the game terminal 200 is deployed in an environment corresponding to a Windows operating system, while the game inference apparatus 300 is deployed in an environment corresponding to a Linux operating system. In this case, it may be difficult for the game inference apparatus 300 deployed in the environment corresponding to the Linux operating system to directly perform inference based on the data to be processed generated in the environment corresponding to the Windows operating system. Therefore, the communication module 301 can first determine whether the format of the data to be processed is the target format.
  • When the data to be processed is in a first format that is not the target format, the communication module 301 can decode the data to obtain data to be processed in the target format that can be recognized by the game inference apparatus 300 (i.e., the cloud platform), and provide it to the inference module 302.
  • the specific implementation of decoding data in one format to obtain data in another format has related applications in actual scenarios, and the process will not be repeated in this embodiment.
  • When the data to be processed is already in the target format, the communication module 301 can directly send the data to be processed to the inference module 302.
  • When the game terminal 200 provides the training data to the game inference apparatus 300, it may also process the format of the training data accordingly to obtain data in a format conforming to the deployment environment of the cloud platform.
  • The communication module 301 may also detect whether a second format of the data to be processed in an inference request sent by another client is the same as the target format, and when the second format is inconsistent with the target format, the communication module 301 may convert the data to be processed in the second format into data to be processed in the target format.
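  • The format check and conversion performed by the communication module can be sketched as a small dispatcher. The application does not name concrete formats, so the format labels below ("json-utf8" as the target format, "json-utf16le" as a client-side format) and the `to_target_format` helper are purely hypothetical.

```python
import json

TARGET_FORMAT = "json-utf8"  # hypothetical format understood by the platform

# Hypothetical decoders for formats produced by other client environments.
DECODERS = {
    "json-utf16le": lambda raw: json.loads(raw.decode("utf-16-le")),
}


def to_target_format(raw: bytes, fmt: str) -> dict:
    """Parse the payload directly if it is in the target format,
    otherwise decode it with the converter registered for its format."""
    if fmt == TARGET_FORMAT:
        return json.loads(raw.decode("utf-8"))
    if fmt not in DECODERS:
        raise ValueError(f"unsupported format: {fmt}")
    return DECODERS[fmt](raw)


state = {"npc_hp": 80, "player_hp": 65}
from_first_format = to_target_format(json.dumps(state).encode("utf-16-le"), "json-utf16le")
from_target_format = to_target_format(json.dumps(state).encode("utf-8"), "json-utf8")
```

  • Supporting another client format in this sketch only requires registering one more decoder.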
  • the game inference device 300 uses the inference service of the configured first game to infer the data to be processed in the inference request, and obtains indication information for the action and/or state of the target object.
  • The inference module 302 can call the pre-configured inference service, which relies on the AI model trained in advance by the target training algorithm. The inference module 302 can therefore first input the data to be processed into the AI model and obtain the indication information, output by the AI model, of the action (such as attack or escape) and/or state (such as increased defense or emotion) inferred for the target object.
  • the reasoning module 302 may also use a heuristic algorithm to constrain the indication information output by the AI model.
  • a heuristic algorithm is an algorithm constructed based on intuition or experience, which can give a feasible solution to each instance of the combinatorial optimization problem to be solved within acceptable time and space complexity. The feasible solution may or may not be the optimal solution, and its degree of deviation from the optimal solution is usually difficult to predict.
  • constraint rules can be predefined in the heuristic algorithm.
  • the action inferred by the reasoning module 302 for the non-player character is "moving backward", that is, "running away" in the direction away from the player character in battle, but there is currently an insurmountable obstacle behind the non-player character's position on the battle map and no obstacle in the remaining directions, so the non-player character cannot "move backward" on the battle map.
  • the inference module 302 can use a heuristic algorithm to constrain the inferred action output by the AI model, specifically constraining the inferred action "move backward" to "move left" or "move right", so as to improve the plausibility of the actions that the inference module 302 infers for non-player characters.
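The obstacle example above can be sketched as a predefined constraint rule applied to the model output. The grid coordinates, the `MOVES` table, and the fallback order are assumptions made for this illustration; the application does not specify them.

```python
# Hypothetical grid offsets: action name -> (dx, dy) on the battle map.
MOVES = {
    "move backward": (0, -1),
    "move left": (-1, 0),
    "move right": (1, 0),
    "move forward": (0, 1),
}


def constrain_action(action, position, obstacles):
    """Return the model's action if feasible, otherwise a feasible sideways move."""

    def blocked(name):
        dx, dy = MOVES[name]
        return (position[0] + dx, position[1] + dy) in obstacles

    if action not in MOVES or not blocked(action):
        return action  # non-movement actions and unblocked moves pass through
    # Assumed constraint rule: replace a blocked move with a free sideways move.
    for fallback in ("move left", "move right"):
        if fallback != action and not blocked(fallback):
            return fallback
    return action  # no feasible alternative found; keep the original output
```

With an obstacle directly behind the character, `constrain_action("move backward", (2, 2), {(2, 1)})` yields `"move left"`.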
  • the game reasoning apparatus 300 returns the indication information for the action and/or state of the target object to the game terminal 200.
  • the communication module 301 encodes the indication information of the action and/or state of the target object output by the inference module 302 into indication information in the first format that the game terminal 200 can recognize, and sends the indication information in the first format to the game terminal 200, so that the game terminal 200 can recognize the indication information and set the target object to perform the corresponding action, or show the corresponding state, at the next moment.
  • because the game reasoning apparatus 300 can provide action reasoning services for game terminals 200 deployed in various environments, the requirements on the deployment environment of the game terminal 200 are reduced, and the universality of the game inference service that the game reasoning apparatus 300 provides to the game terminal 200 is improved.
  • the process of training the AI model by the game reasoning apparatus 300 and using the trained AI model to infer the action and/or state of the target object is described.
  • the game developer expects the game reasoning device 300 to provide a series of inferred actions for non-player character A (referred to as character A) in the game application instance, so that character A can defeat non-player character B (referred to as character B) in the game, and so that character A can defeat character B using series of inferred actions in three different combat styles: "aggressive", "conservative" and "balanced".
  • the initial HP (blood volume) of character A and character B is the same, their attack power and defense power are equal, and the actions they may perform are also the same.
  • if character A attacks character B within the specified number of steps and reduces character B's HP to 0 while character A still has HP remaining, then character A wins.
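The victory condition above can be written as a small predicate. This is a minimal sketch; the function and parameter names are hypothetical.

```python
def character_a_wins(hp_a, hp_b, steps_used, max_steps):
    """True if B's HP reached 0 within the step limit while A still has HP."""
    return steps_used <= max_steps and hp_b <= 0 and hp_a > 0
```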
  • Set the strength of character A and character B to be the same, and the available moves are also the same.
  • referring to FIG. 6, a schematic flowchart of a method for configuring a game reasoning service on a cloud platform, combined with a specific game scene and provided by an embodiment of the present application, is shown.
  • the method can be applied to the game reasoning apparatus 300 shown in FIG. 7 .
  • the game inference apparatus 300 shown in FIG. 7 may further include an object storage service module 304 , a deployment module 305 and an online cloud service module 306 .
  • the method may specifically include:
  • the object storage service module 304 prestores program codes for implementing the communication module 301, the game algorithm framework, the reasoning module 302 and the configuration module 303.
  • the program codes of the communication module 301 , the game algorithm framework, the reasoning module 302 and the configuration module 303 can be developed in advance by a technician and stored in the object storage service module 304 .
  • the communication module 301 may be implemented by, for example, the Flask framework or the like.
  • the deployment module 305 deploys the program code stored by the object storage service module 304 on the cloud platform, and publishes it as an online service.
  • the communication module 301 and the reasoning module 302 for providing game inference services for the game terminal 200 may form a cloud service and be deployed on a cloud platform, etc., to support online provision of services for game developers.
  • the game inference device 300 (or the communication module 301, the game algorithm framework, the inference module 302, and the configuration module 303) can be deployed in a cloud center or an edge center as a game AI framework.
  • the online cloud service module 306 pulls, from the storage node, the program code implementing the communication module 301, the game algorithm framework, the reasoning module 302 and the configuration module 303, and deploys it on computing nodes in the cloud center or the edge center, so that the computing resources on those computing nodes support the training of AI models and the action reasoning process.
  • after the deployment module 305 publishes the program code for realizing the inference service of the game as an online service, the game developer 101 can subscribe to the online service on the cloud platform through the game terminal 200, which triggers the game inference device 300 to configure the inference service of the game on the cloud platform for the game developer 101.
  • the communication module 301 can provide an API to the game terminal 200, so that the game terminal 200 can establish a communication connection with the game reasoning device 300 through the API. Exemplarily, a long-lived connection can be established between the game terminal 200 and the communication module 301 based on protocols such as HTTP 1.1.
  • the game terminal 200 receives the configuration file provided by the game developer 101.
  • the configuration file defines the action space and state space of character A and character B, the type of training algorithm, the type of AI model trained by the training algorithm, the reward function used to train the AI model, the environment variable indicating the storage address of the AI model, and the specification of the computing resource; the game terminal 200 forwards the configuration file to the game inference device 300.
  • the configuration module 303 in the game inference device 300 invokes the corresponding type of training algorithm and AI model from the game algorithm framework according to the configuration file received by the communication module 301.
  • the action space of character A and character B can include walking forward, walking backward, walking left, walking right, attack 1, attack 2, attack 3, attack 4, attack 5, running forward, jumping backward, running left, running right, etc.
  • the state space includes the health, position, orientation, mana, attack, defense and other states of character A and character B.
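A configuration file of this kind might look like the following sketch, shown as a Python dictionary with a simple completeness check. All key names, type labels, the environment variable name, and the resource specification are hypothetical; the application does not fix a file format.

```python
REQUIRED_KEYS = {
    "action_space", "state_space", "training_algorithm",
    "model_type", "reward_function", "model_save_env_var", "compute_spec",
}

EXAMPLE_CONFIG = {
    "action_space": [
        "walk_forward", "walk_backward", "walk_left", "walk_right",
        "attack_1", "attack_2", "attack_3", "attack_4", "attack_5",
        "run_forward", "jump_backward", "run_left", "run_right",
    ],
    "state_space": ["health", "position", "orientation", "mana", "attack", "defense"],
    "training_algorithm": "PPO",         # hypothetical type label
    "model_type": "DNN",                 # hypothetical type label
    "reward_function": "aggressive",     # hypothetical reward preset name
    "model_save_env_var": "MODEL_DIR",   # hypothetical environment variable name
    "compute_spec": {"cpu_cores": 8, "gpus": 1},  # hypothetical specification
}


def missing_keys(config):
    """Return the required configuration keys that the config does not define."""
    return sorted(REQUIRED_KEYS - set(config))
```

Checking for missing keys before training starts lets the platform reject an incomplete configuration early instead of failing partway through model training.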
  • hp represents blood volume
  • t and t-1 represent different moments
  • is a preset coefficient value
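The symbols described above (HP at moments t and t-1, and a preset coefficient) suggest a reward built from HP changes between consecutive moments. The following sketch shows one possible form only; it is not the formula of this application, and the coefficient name `alpha` and the sign convention are assumptions.

```python
def hp_delta_reward(hp_a_prev, hp_a_now, hp_b_prev, hp_b_now, alpha=1.0):
    """Reward for character A from HP changes between moment t-1 and moment t."""
    damage_dealt = hp_b_prev - hp_b_now  # HP lost by character B
    damage_taken = hp_a_prev - hp_a_now  # HP lost by character A
    # Assumed form: reward damage dealt, penalize damage taken scaled by alpha.
    return damage_dealt - alpha * damage_taken
```

Under this assumed form, a smaller `alpha` penalizes lost HP less and would tend toward a more aggressive style, while a larger `alpha` would tend toward a more conservative one.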
  • the configuration module 303 in the game inference device 300 can allocate computing resources on the computing nodes that meet the specification to the game inference device 300 according to the computing resource specification in the configuration file, so that the game inference device 300 can configure and run the game's inference service using those computing resources.
  • the configuration module 303 can call the corresponding type of training algorithm and AI model from the game algorithm framework according to the type of training algorithm and the type of AI model specified in the configuration file, so as to train at least one AI model with the selected training algorithm.
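Selecting a training algorithm and an AI model by type can be sketched with two registries keyed by type label. The class names, labels, and configuration keys below are hypothetical placeholders, not components of the game algorithm framework itself.

```python
ALGORITHMS = {}  # type label -> training algorithm class
MODELS = {}      # type label -> AI model class


def register(registry, name):
    """Class decorator that records a class in a registry under a type label."""
    def decorator(cls):
        registry[name] = cls
        return cls
    return decorator


@register(MODELS, "DNN")
class DenseModel:
    def __init__(self, n_actions):
        self.n_actions = n_actions  # size of the action space the model outputs


@register(ALGORITHMS, "PPO")
class PPOTrainer:
    def __init__(self, model):
        self.model = model  # the AI model this trainer will update


def build_from_config(config, n_models=1):
    """Instantiate n_models trainers, each with its own model, by type label."""
    trainer_cls = ALGORITHMS[config["training_algorithm"]]
    model_cls = MODELS[config["model_type"]]
    n_actions = len(config["action_space"])
    return [trainer_cls(model_cls(n_actions)) for _ in range(n_models)]
```

A registry keeps the framework open to new algorithm and model types without changing the lookup code; an unknown label raises `KeyError`.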
  • the network structure of the AI model in the game algorithm framework may include a Dropout layer and an L2 regularization term, so as to improve the generalization performance of the AI model.
  • the game reasoning apparatus 300 may further perform step S606 to obtain training data for training the AI model.
  • S606: Start multiple instances of the same game application on the game terminal 200, and send multiple copies of training data to the communication module 301 by running a preset script.
  • each game application instance includes character A and character B, and each game application instance generates a piece of training data for character A and character B.
  • the script on the game terminal 200 can be developed by technical personnel in advance, and the script can support the communication between the game terminal 200 and the game reasoning device 300 when running.
  • multiple game application instances are simultaneously run on the game terminal 200 and multiple copies of training data are generated in parallel, which can speed up the acquisition of training data by the game inference device 300 , thereby speeding up the model training process of the game inference device 300 .
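Generating several pieces of training data in parallel can be sketched with a thread pool, where each worker stands in for one game application instance. The data layout and function names are assumptions, and the random values merely replace real game telemetry.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def run_game_instance(instance_id, n_steps=5):
    """Stand-in for one game application instance; returns one piece of training data."""
    rng = random.Random(instance_id)  # per-instance seed keeps the sketch reproducible
    steps = [
        {"hp_a": rng.randint(0, 100), "hp_b": rng.randint(0, 100)}
        for _ in range(n_steps)
    ]
    return {"instance": instance_id, "steps": steps}


def collect_training_data(n_instances):
    """Run all instances concurrently and return their data in instance order."""
    with ThreadPoolExecutor(max_workers=n_instances) as pool:
        return list(pool.map(run_game_instance, range(n_instances)))
```

In a real deployment each instance would send its data to the communication module 301 rather than return it locally.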
  • the communication module 301 decodes the training data sent by the game terminal 200 to obtain the training data in the target format that can be recognized by the inference module 302 , and provides the training data to the configuration module 303 .
  • the game terminal 200 and the game inference device 300 may be deployed in different environments. Therefore, after receiving the training data sent by the game terminal 200, the communication module 301 can decode the training data into training data in the target format that the game inference device 300 can recognize.
  • the communication module 301 may further preprocess the training data.
  • the communication module 301 can standardize information such as game maps and location coordinates in each piece of training data, and add corresponding features for describing information such as character distance and orientation.
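The preprocessing described above can be sketched as coordinate normalization plus two derived features. The field names and the choice of Euclidean distance and bearing angle are assumptions for this illustration.

```python
import math


def preprocess(sample, map_width, map_height):
    """Normalize positions to [0, 1] and add distance and bearing features."""
    ax, ay = sample["pos_a"]
    bx, by = sample["pos_b"]
    # Scale by the map size so that maps of different sizes are comparable.
    norm_a = (ax / map_width, ay / map_height)
    norm_b = (bx / map_width, by / map_height)
    dx, dy = norm_b[0] - norm_a[0], norm_b[1] - norm_a[1]
    return {
        "pos_a": norm_a,
        "pos_b": norm_b,
        "distance": math.hypot(dx, dy),  # distance between the two characters
        "bearing": math.atan2(dy, dx),   # angle from A to B, in radians
    }
```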
  • the configuration module 303 runs multiple processes, and uses the training data forwarded by the communication module 301 to train multiple AI models in parallel.
  • the configuration module 303 may train an AI model in a distributed manner. Specifically, the configuration module 303 may include multiple processes, and each process may train an AI model based on one or more pieces of training data. For each AI model, the configuration module 303 may input the training data for character A into the AI model and obtain the inferred action of character A output by the AI model; the inferred action of character A is then used to play against character B, and the resulting game outcome is fed back to adjust the parameters of the AI model. In this way, the efficiency with which the configuration module 303 trains multiple AI models can be improved.
  • the process of model training for character B is similar to the process of model training for character A; refer to the description above, which is not repeated here.
  • at the start of training, the remaining HP of character A at the end of the battle is lower than that of character B; however, as the number of training iterations of the AI model increases, once it reaches 100, the remaining HP of character A at the end of the battle begins to exceed that of character B, that is, character A can defeat character B. This is also reflected in the win-rate curves of both sides shown in FIG. 9.
  • the winning rate of character A is close to 100%.
  • when the AI model makes the winning rate of character A reach the preset value (e.g., 98%) during the most recent preset number of iterations (e.g., 20), the configuration module 303 can go on to train the AI model for character B in a similar manner.
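The stopping criterion above can be sketched with a fixed-size window over recent win rates. The class name is hypothetical, and requiring every iteration in the window to meet the threshold is an assumption; the text could also be read as an average over the window.

```python
from collections import deque


class WinRateMonitor:
    """Tracks the win rates of the most recent training iterations."""

    def __init__(self, window=20, threshold=0.98):
        self.threshold = threshold
        self.recent = deque(maxlen=window)  # old entries drop out automatically

    def record(self, win_rate):
        self.recent.append(win_rate)

    def converged(self):
        """True once the window is full and every entry meets the threshold."""
        window_full = len(self.recent) == self.recent.maxlen
        return window_full and min(self.recent) >= self.threshold
```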
  • the configuration module 303 may save the AI model at the storage address specified in the configuration file.
  • the configuration module 303 feeds back a notification of the completion of the AI model training to the game terminal 200 through the communication module 301.
  • when data communication is performed between the game terminal 200 and the game inference device 300, the communication module 301 can complete the format conversion of the communication data, so that each communicating party can recognize the communication data sent by the other.
  • the game developer can view the data generated in the process of training the AI model by the game reasoning apparatus 300 through the game terminal 200 .
  • the game developer 101 can view the training effect and the like on the interface of the cloud platform through the game terminal 200 .
  • the game developer 101 can view, on the interface of the cloud platform, the AI models of the three combat styles obtained based on the population evolution method and the change curve of the winning rate of character A during model training; alternatively, the game developer 101 can view the HP change curves of the two sides as shown in FIG. 8, or the win-rate change curves of the two sides as shown in FIG. 9, on the interface of the cloud platform.
  • the game terminal 200 sends an action inference request to the game inference device 300 for requesting action inference for the character A, where the action inference request includes the game screen of the character A and the character B and the identification of the fighting style.
  • the reasoning module 302 uses the AI model corresponding to the identifier of the fighting style to infer the action indication information of character A from the game screen in the target format.
  • the reasoning module 302 sends the action indication information of the character A to the communication module 301.
  • after completing the format conversion of the action indication information, the communication module 301 sends the action indication information, in a format that can be recognized by the game terminal 200, to the game terminal 200.
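The request flow above (decode, select a model by fighting-style identifier, infer, encode) can be sketched end to end. The stub models, JSON encoding, and field names are assumptions; in particular, a plain `game_state` dictionary stands in for the game screen carried by the real request.

```python
import json


def make_stub_model(default_action):
    """Build a placeholder model that always returns the same action."""
    def model(game_state):
        return default_action
    return model


# Hypothetical mapping from fighting-style identifier to a trained model.
MODELS_BY_STYLE = {
    "aggressive": make_stub_model("attack_1"),
    "conservative": make_stub_model("walk_backward"),
    "balanced": make_stub_model("walk_left"),
}


def handle_inference_request(raw_request):
    request = json.loads(raw_request)          # decode into the target format
    model = MODELS_BY_STYLE[request["style"]]  # select the model by style identifier
    action = model(request["game_state"])      # infer the action for character A
    return json.dumps({"character": "A", "action": action})  # encode the reply
```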
  • Figure 11 provides a computing device cluster. As shown in FIG. 11 , the computing device cluster 1100 can be specifically used to implement the functions of the game reasoning apparatus 300 shown in FIG. 3 .
  • Computing device cluster 1100 includes at least one computing device, where each computing device may include a bus 1101 , a processor 1102 and a memory 1103 .
  • the processor 1102 and the memory 1103 communicate through the bus 1101 .
  • the bus 1101 may be a peripheral component interconnect (PCI) bus or an extended industry standard architecture (EISA) bus or the like.
  • the bus can be divided into address bus, data bus, control bus and so on. For ease of presentation, only one thick line is used in FIG. 11, but it does not mean that there is only one bus or one type of bus.
  • the processor 1102 can be any one or more of processors such as a central processing unit (CPU), a graphics processing unit (GPU), a microprocessor (MP), a digital signal processor (DSP), or a neural network processing unit (NPU).
  • the memory 1103 may include volatile memory, such as random access memory (RAM).
  • the memory 1103 may also include non-volatile memory, such as read-only memory (ROM), flash memory, a hard disk drive (HDD), or a solid state drive (SSD).
  • Executable program codes are stored in the memory 1103, and the processor 1102 executes the executable program codes to execute the aforementioned method for configuring an inference service of a game on a cloud platform executed by the game inference device 300.
  • Embodiments of the present application also provide a computer-readable storage medium.
  • the computer-readable storage medium may be any available medium that a computing device can store, or a data storage device such as a data center that contains one or more available media.
  • the usable media may be magnetic media (eg, floppy disks, hard disks, magnetic tapes), optical media (eg, DVDs), or semiconductor media (eg, solid state drives), and the like.
  • the computer-readable storage medium includes instructions, and the instructions instruct the computing device to execute the method for configuring an inference service for a game on a cloud platform, which is executed by the game inference apparatus 300 described above.
  • the embodiments of the present application also provide a computer program product.
  • the computer program product includes one or more computer instructions. When the computer instructions are loaded and executed on the computing device, all or part of the processes or functions described in the embodiments of the present application are generated.
  • the computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, the computer instructions may be transmitted from one website, computer, or data center to another website, computer, or data center by wire (e.g., coaxial cable, optical fiber, or digital subscriber line (DSL)) or wirelessly (e.g., infrared, radio, or microwave).
  • the computer program product can be a software installation package, which can be downloaded and executed on a computing device when any of the aforementioned methods for configuring a game inference service on a cloud platform needs to be used.

Abstract

The present application provides a method for configuring a game inference service on a cloud platform, comprising: when configuring, on the cloud platform, the game inference service for a game developer, acquiring a first configuration file that comprises configuration information for a first game, to configure, on the cloud platform, an inference service for the first game on the basis of a game algorithm framework of the cloud platform and the acquired first configuration file. Furthermore, an inference service for a second game can further be configured on the cloud platform on the basis of the game algorithm framework of the cloud platform and a second configuration file corresponding to the second game. In this way, for different games, inference services required by game developers for one or more games can all be configured on the cloud platform, such that cloud service providers do not need to perform specialized design of the inference services. Moreover, the difficulty for small and medium-sized game manufacturers to apply AI technology to obtain game inference results is also effectively reduced. In addition, a corresponding apparatus and a related device are further provided.

Description

Method, apparatus, and related device for configuring a game inference service on a cloud platform
Technical Field
The present application relates to the technical field of artificial intelligence, and in particular, to a method, an apparatus, and a related device for configuring an inference service for a game on a cloud platform.
Background
At present, artificial intelligence (AI) technology is widely used in many fields. In the game field, for example, game developers (such as game manufacturers) can use AI technology to implement functions such as non-player character (NPC) training, player behavior prediction, and character battle strategies in game scenarios. Specifically, AI technology can be used to infer the battle actions or battle states of objects such as non-player characters during a game battle.
During game development, a game developer who wants AI-based game inference capabilities must either build the entire game-specific inference service independently according to its specific requirements, or submit its inference service requirements for the game to a service provider (for example, a cloud service provider), which then customizes a specific inference service for that game for the developer. Both ways of obtaining a game inference service make obtaining an AI-based game inference service costly and difficult.
Summary
In view of this, the embodiments of the present application provide a method for configuring an inference service for a game on a cloud platform, so as to reduce the cost and difficulty for a cloud service provider of providing game inference services. The present application also provides a corresponding apparatus, computing device cluster, computer-readable storage medium, and computer program product.
In a first aspect, an embodiment of the present application provides a method for configuring an inference service for a game on a cloud platform. Specifically, when an inference service for a first game is configured on the cloud platform for a game developer, a first configuration file that includes configuration information for the first game can be acquired, and the inference service of the first game is then configured on the cloud platform based on the game algorithm framework of the cloud platform and the acquired first configuration file.
In this way, for a game developed by a game developer, a general game algorithm framework and a corresponding configuration file can be used to automatically configure, on the cloud platform, the game-specific inference service that the developer requires, so that the configured inference service can perform inference for the game, such as inferring the action and/or state of any character in the game. The cloud service provider therefore does not need to design a specialized inference service for the game, which effectively reduces the difficulty and cost of providing and using game inference services and improves the efficiency of providing and using them.
Moreover, in practice, applying AI technology depends on training algorithms, typified by reinforcement learning (RL), and on the computing resources needed to run those algorithms. Game developers such as small and medium-sized game manufacturers may find it difficult to obtain high-quality AI models and sufficient computing resources for game inference. Configuring, on the cloud platform, an inference service that infers the actions and/or states of characters in the game effectively reduces the difficulty for game developers of applying AI technology to obtain game inference results.
In a possible implementation, not only the inference service corresponding to the first game but also an inference service corresponding to a second game can be configured on the cloud platform. Specifically, a second configuration file that includes configuration information for the second game can be acquired, and the inference service of the second game is then configured on the cloud platform based on the game algorithm framework of the cloud platform and the acquired second configuration file. In this way, for multiple different games, the general game algorithm framework can be used to configure the inference service corresponding to each game on the cloud platform, which improves the universality of the solution.
In a possible implementation, after the inference service of the first game is configured, it can be used to respond to an inference request sent by a game terminal. The game terminal can be a device that runs a game application instance of the first game, such as a terminal and/or a server. The inference request includes data to be processed for a target object in the game application instance of the first game, such as an information picture that includes the target object. The response of the inference service can include indication information for the action and/or state of the target object; for example, the indication information can indicate an action that the target object performs at a future moment (such as an attack or a long jump), and/or a state that the target object has at a future moment (such as emotion or attack speed). In this way, the game developer can infer the action and/or state of the target object through the inference service configured on the cloud platform, which effectively reduces the difficulty for the game developer of applying AI technology to obtain game inference results.
In a possible implementation, the first configuration file includes one or more of the following items of configuration information: the action space of the target object in the game application instance of the first game, the state space of the target object in the game application instance of the first game, a first type of the target training algorithm, a second type of the AI model, a reward function, the training mode of the AI model, the inference mode of the AI model, the storage address of the AI model, and the specification of the computing resources used for training and inference of the AI model, so that the inference service can be configured on the cloud platform using this configuration information. The second configuration file is implemented in a similar way to the first configuration file and can be understood by reference to it; details are not repeated here.
In a possible implementation, configuring the inference service of the first game can specifically involve training at least one AI model according to the first configuration file and the game algorithm framework of the cloud platform, and then configuring the inference service of the first game according to the at least one trained AI model. Correspondingly, when the inference service is used to infer the action and/or state of the target object in the game application instance of the first game, the inference can specifically be performed by the trained AI model.
In a possible implementation, training the at least one AI model can specifically involve receiving multiple training requests from the game terminal. The multiple training requests come from multiple game application instances of the first game, and different training requests include different training data for the same target object in the multiple game application instances, so that the training data in the multiple training requests can be used to train the at least one AI model. By running multiple game application instances on the game terminal at the same time, multiple pieces of training data can be generated in parallel, which effectively improves the efficiency of generating training data and thus the efficiency of training the AI model.
Optionally, the game terminal can also run only one game application instance during a given time period, so that one or more AI models are trained using the training data generated by that single game application instance.
在一种可能的实施方式中,所训练的至少一个AI模型中包括第一AI模型以及第二AI模型,此时,该第一AI模型与第二AI模型的超参数不同,和/或,第一AI模型和第二AI模型对应的奖励函数不同。如此,通过训练具有不同超参数的AI模型,可以从中确定出能够使得AI模型的推理效果较高的超参数,进而基于该超参数的AI模型的质量较高,也即基于该AI模型所配置的推理服务的质量较高。并且,通过利用不同奖励函数对不同的AI模型进行训练,可以训练得到不同推理类型的AI模型,如可以利用多个奖励函数训练得到多种推理风格的AI模型等,实现推理的多样化。In a possible implementation manner, the trained at least one AI model includes a first AI model and a second AI model, in this case, the hyperparameters of the first AI model and the second AI model are different, and/or, The reward functions corresponding to the first AI model and the second AI model are different. In this way, by training AI models with different hyperparameters, it is possible to determine the hyperparameters that can make the inference effect of the AI model higher, and then the quality of the AI model based on the hyperparameters is higher, that is, the configuration based on the AI model The quality of the inference service is high. In addition, by using different reward functions to train different AI models, AI models of different inference types can be trained, for example, AI models of various inference styles can be obtained by training with multiple reward functions, etc., so as to realize the diversification of inference.
In a possible implementation, when the AI models to be trained include a first AI model and a second AI model, multiple processes may run on the cloud platform; a first process and a second process are used as an example here. When the first AI model and the second AI model are trained, the training data in the multiple training requests may specifically be sent to the first process and the second process according to the port number and/or IP address of the first process and the port number and/or IP address of the second process, where the training data received by the first process and by the second process may differ. The first AI model is then trained using the first process and the training data received by the first process, and the second AI model is trained using the second process and the training data received by the second process. In this way, multiple AI models can be trained in parallel on the cloud platform, which improves the training efficiency of the AI models and therefore the efficiency of configuring the inference service of the first game.
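The distribution of training data to training processes by address can be sketched as follows. Round-robin assignment is an assumption chosen for simplicity; the application only states that data is sent according to each process's port number and/or IP address. The addresses are placeholders.

```python
from itertools import cycle


def dispatch(batches, endpoints):
    """Assign each training batch to a process endpoint (ip, port) in round-robin order."""
    assignment = {ep: [] for ep in endpoints}
    for batch, ep in zip(batches, cycle(endpoints)):
        assignment[ep].append(batch)
    return assignment


first_process = ("10.0.0.1", 5001)   # placeholder address of the first process
second_process = ("10.0.0.1", 5002)  # placeholder address of the second process

plan = dispatch(["batch-0", "batch-1", "batch-2"], [first_process, second_process])
```

Each process then trains its own AI model on the batches it received, so the two models can be trained in parallel on different data.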
In a possible implementation, one or more different types of training algorithms and AI models may be predefined in the game algorithm framework of the cloud platform. When the inference service of the first game is configured, the target training algorithm of the first type and at least one AI model of the second type can then be called from the game algorithm framework according to the first type of the target training algorithm and the second type of the AI model specified in the first configuration file.
For example, the target training algorithm may be any one of a deep reinforcement learning algorithm, a proximal policy optimization (PPO) algorithm, a soft actor-critic (SAC) algorithm, a deep deterministic policy gradient (DDPG) algorithm, a twin delayed deep deterministic policy gradient (TD3) algorithm, and the Rainbow algorithm, or it may be another applicable algorithm. The AI model may be, for example, any one of a deep neural network model, a recurrent neural network model, and a convolutional neural network model, or it may be another applicable model.
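One way to picture a framework with predefined algorithm and model types is a pair of lookup tables keyed by the types named in the configuration file. The sketch below is an assumption about how such a lookup could work; the key names (`algorithm_type`, `model_type`) and the registry contents are hypothetical placeholders rather than the framework's actual interface.

```python
# Hypothetical registries standing in for the game algorithm framework.
ALGORITHMS = {
    "ppo": "ProximalPolicyOptimization",
    "sac": "SoftActorCritic",
    "ddpg": "DeepDeterministicPolicyGradient",
    "td3": "TwinDelayedDDPG",
    "rainbow": "Rainbow",
}
MODELS = {
    "dnn": "DeepNeuralNetwork",
    "rnn": "RecurrentNeuralNetwork",
    "cnn": "ConvolutionalNeuralNetwork",
}


def resolve(config):
    """Return the (algorithm, model) pair selected by a configuration dict."""
    try:
        return ALGORITHMS[config["algorithm_type"]], MODELS[config["model_type"]]
    except KeyError as exc:
        raise ValueError(f"unsupported or missing configuration item: {exc}") from exc


algorithm, model = resolve({"algorithm_type": "ppo", "model_type": "rnn"})
```

Because the selection is driven entirely by the configuration, supporting another game would only require a different configuration dict, not new framework code.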
In a possible implementation, when the game terminal and the cloud platform use different data formats, the data in the inference request sent by the game terminal may first be converted into a data format that the cloud platform can recognize, before the inference service of the first game responds to that request. This prevents differences between the deployment environments of the game terminal and the cloud platform from making the inference request difficult for the cloud platform to recognize.
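A minimal sketch of such a format conversion is shown below. It assumes, purely for illustration, that the game terminal sends JSON with camelCase keys and a comma-separated state string, while the cloud platform expects snake_case keys and a list of floats; neither format is specified in the application.

```python
import json


def normalize_request(raw: bytes) -> dict:
    """Convert a game-side payload (assumed format) into the platform-side format."""
    payload = json.loads(raw.decode("utf-8"))
    return {
        "target_object": payload["targetObject"],
        "state": [float(x) for x in payload["state"].split(",")],
    }


normalized = normalize_request(b'{"targetObject": "npc_a", "state": "0.5,1,2"}')
```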
In a possible implementation, a persistent connection may be maintained between the cloud platform and the game terminal, and the cloud platform may receive and respond to inference requests sent by the game terminal over that persistent connection. Once the persistent connection has been established, the cloud platform and the game terminal can communicate multiple times without re-establishing a connection for each data exchange, which effectively reduces the communication latency between them.
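The benefit of a persistent connection can be sketched with a connected socket pair that is created once and reused for several request/response exchanges. The newline-delimited message framing and the `action-for:` reply are assumptions for illustration only; a real deployment would use whatever application-layer protocol the platform defines.

```python
import socket


def serve_one(reader, conn):
    """Cloud side: read one newline-terminated request and reply on the same connection."""
    line = reader.readline()
    conn.sendall(b"action-for:" + line)


# One connection is established once and then reused for every request.
game_sock, cloud_sock = socket.socketpair()
cloud_reader = cloud_sock.makefile("rb")
game_reader = game_sock.makefile("rb")

responses = []
for request in (b"state-1\n", b"state-2\n", b"state-3\n"):
    game_sock.sendall(request)
    serve_one(cloud_reader, cloud_sock)
    responses.append(game_reader.readline().strip())

for handle in (cloud_reader, game_reader, game_sock, cloud_sock):
    handle.close()
```

Three inference requests are answered without any additional connection setup, which is the source of the latency saving described above.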
In a possible implementation, the first configuration file may specifically be acquired based on configuration information items selected by the game developer. For example, the cloud platform may provide the game developer with a configuration interface that presents multiple selectable configuration information items, so that the game developer can choose among them and the cloud platform can automatically generate the corresponding first configuration file based on those choices. This effectively improves the game developer's configuration efficiency and configuration experience.
In a second aspect, the present application provides an apparatus for configuring an inference service of a game on a cloud platform. The apparatus includes: a communication module, configured to acquire a first configuration file, where the first configuration file includes configuration information for a first game; and a configuration module, configured to configure an inference service of the first game on the cloud platform based on the game algorithm framework of the cloud platform and the first configuration file.
In a possible implementation, the communication module is further configured to acquire a second configuration file, where the second configuration file includes configuration information for a second game; and the configuration module is further configured to configure an inference service of the second game on the cloud platform based on the game algorithm framework of the cloud platform and the second configuration file.
In a possible implementation, the apparatus further includes an inference module, configured to respond, by using the inference service of the first game, to an inference request sent by the game terminal, where the game terminal includes a device running a game application instance of the first game, the inference request includes to-be-processed data for a target object in the game application instance of the first game, and the response includes indication information about an action and/or a state of the target object.
In a possible implementation, the first configuration file includes one or more of the following configuration information: the action space of the target object in the game application instance of the first game, the state space of the target object in the game application instance of the first game, the first type of the target training algorithm, the second type of the artificial intelligence (AI) model, the reward function, the training mode of the AI model, the inference mode of the AI model, the storage address of the AI model, and the specification of the computing resources used for training and inference of the AI model.
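For illustration, the configuration items listed above could be expressed as a small structured document. The key names, values, and the JSON serialization in the sketch below are all assumptions; the application does not define a concrete file format or schema.

```python
import json

# The nine configuration items named in the text, under hypothetical key names.
KNOWN_ITEMS = {
    "action_space", "state_space", "algorithm_type", "model_type",
    "reward_function", "training_mode", "inference_mode",
    "model_save_path", "compute_spec",
}

first_config = {
    "action_space": ["attack", "jump", "move_left", "move_right"],
    "state_space": {"hp": [0, 100], "enemy_distance": [0.0, 50.0]},
    "algorithm_type": "ppo",
    "model_type": "dnn",
    "reward_function": "hp_delta",
    "training_mode": "parallel",
    "inference_mode": "online",
    "model_save_path": "models/first_game/",
    "compute_spec": {"cpu_cores": 8, "gpus": 1},
}


def unknown_items(config):
    """Return configuration keys that are not among the recognized items."""
    return sorted(set(config) - KNOWN_ITEMS)


serialized = json.dumps(first_config, indent=2)
```

Since the text says the file includes "one or more" of these items, the sketch checks only for unrecognized keys and does not treat any item as mandatory.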
In a possible implementation, the configuration module is specifically configured to: train at least one AI model based on the first configuration file and the game algorithm framework; and configure the inference service of the first game according to the at least one trained AI model.
In a possible implementation, the configuration module is specifically configured to: receive multiple training requests from the game terminal, where the multiple training requests come from multiple game application instances of the first game and different training requests include different training data for the same target object in the multiple game application instances; and train the at least one AI model according to the training data in the multiple training requests.
In a possible implementation, when the at least one AI model includes a first AI model and a second AI model, the first AI model and the second AI model have different hyperparameters, and/or the first AI model and the second AI model correspond to different reward functions.
In a possible implementation, when the at least one AI model includes a first AI model and a second AI model, a first process and a second process run on the cloud platform, and the configuration module is specifically configured to: send the training data in the multiple training requests to the first process and the second process according to the port number and/or IP address of the first process and the port number and/or IP address of the second process; train the first AI model using the first process and the training data received by the first process; and train the second AI model using the second process and the training data received by the second process.
In a possible implementation, the configuration module is specifically configured to: call, from the game algorithm framework, the target training algorithm of the first type and at least one AI model of the second type according to the first type of the target training algorithm and the second type of the AI model in the first configuration file; and train the at least one AI model of the second type based on the called target training algorithm of the first type.
In a possible implementation, when the game terminal and the cloud platform use different data formats, before the inference service of the first game is used to respond to the inference request sent by the game terminal, the method further includes: processing the format of the data in the inference request sent by the game terminal to obtain data in a data format that the cloud platform can recognize.
In a possible implementation, a persistent connection is maintained between the cloud platform and the game terminal, and the cloud platform receives and responds to inference requests sent by the game terminal over the persistent connection.
In a possible implementation, the communication module is specifically configured to acquire the first configuration file based on configuration information items selected by the game developer.
In a third aspect, the present application provides a computing device cluster. The computing device cluster includes at least one computing device, and each computing device includes a processor and a memory. The processor is configured to execute instructions stored in the memory, so that the at least one computing device performs the method for configuring an inference service of a game on a cloud platform according to the first aspect or any implementation of the first aspect.
In a fourth aspect, the present application provides a computer-readable storage medium. The computer-readable storage medium stores instructions that, when run on a computing device, cause the computing device to perform the method for configuring an inference service of a game on a cloud platform according to the first aspect or any implementation of the first aspect.
In a fifth aspect, the present application provides a computer program product containing instructions that, when run on a computing device, cause the computing device to perform the method for configuring an inference service of a game on a cloud platform according to the first aspect or any implementation of the first aspect.
The implementations provided in the foregoing aspects of the present application may be further combined to provide additional implementations.
Description of Drawings
To describe the technical solutions in the embodiments of the present application more clearly, the accompanying drawings used in the description of the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present application, and a person of ordinary skill in the art may derive other drawings from them.
FIG. 1 is a schematic diagram of an exemplary application scenario according to an embodiment of the present application;
FIG. 2 is a schematic structural diagram of a game inference apparatus according to an embodiment of the present application;
FIG. 3 is a schematic flowchart of a method for configuring an inference service of a game on a cloud platform according to an embodiment of the present application;
FIG. 4 is a schematic diagram of a configuration interface according to an embodiment of the present application;
FIG. 5 is a schematic diagram of training data being transmitted from the game terminal 200 to the game inference apparatus 300 according to an embodiment of the present application;
FIG. 6 is a schematic flowchart of a method for configuring an inference service of a game on a cloud platform in a specific scenario according to an embodiment of the present application;
FIG. 7 is a schematic structural diagram of another game inference apparatus according to an embodiment of the present application;
FIG. 8 is a schematic diagram of the health points of character A and character B at the end of a battle according to an embodiment of the present application;
FIG. 9 is a schematic diagram of the win rates of character A against character B according to an embodiment of the present application;
FIG. 10 is a schematic diagram of the win rates of character A with three different fighting styles according to an embodiment of the present application;
FIG. 11 is a schematic diagram of the hardware structure of a computing device cluster according to an embodiment of the present application.
Detailed Description
The solutions in the embodiments provided in this application are described below with reference to the accompanying drawings.
The terms "first", "second", and the like in the specification, the claims, and the accompanying drawings of the present application are used to distinguish between similar objects and do not necessarily describe a specific order or sequence. It should be understood that terms used in this way are interchangeable where appropriate; they are merely a way of distinguishing objects that have the same attributes when the embodiments of the present application are described.
FIG. 1 is a schematic diagram of an exemplary application scenario. In this scenario, a game developer 101 can develop a game through the game terminal 200; for example, the game developer 101 can develop the game environment, design the game map, and design the actions and states of each character in the game on the game terminal 200. The game developed on the game terminal 200 can provide a game experience for a game player 102; for example, the developed game can run on the game terminal 200 so that the game player 102 can trigger game operations to play it. It should be understood that the game developer 101 in this application denotes the party that develops and designs the game or develops the game's inference capability. It should also be understood that the game terminal 200 in FIG. 1 may include a device used for game development as well as a device on which the game is deployed. For example, the game terminal 200 may include a server used for game development and may also include a device that runs the game (such as a backend server of the game developer 101 and/or a terminal device of the game player 102 on which the game is installed). FIG. 1 shows one game terminal 200 as an example; in practice, there may be multiple game terminals 200.
While the game is running, the game terminal 200 may send an inference request to the cloud platform to request an inference service for the game. For example, the game on the game terminal 200 may support a "player versus machine" mode. When a player character controlled by the game player 102 competes against a machine-controlled non-player character, the game terminal 200 may request from the cloud platform information such as the non-player character's battle actions (for example, attacking or jumping) and/or battle states (for example, increased attack speed or reduced movement speed), so that the game terminal 200 can control the non-player character to fight the player character controlled by the game player 102 according to the battle information provided by the cloud platform. As shown in FIG. 1, the cloud platform may include a game inference apparatus 300, which includes the inference service for the game. In this case, the game terminal 200 may specifically request the inference service from the game inference apparatus 300 on the cloud platform, and the game inference apparatus 300 performs the corresponding game inference for the game terminal 200.
For example, the game terminal 200 may specifically include the client 201 and the server 202 shown in FIG. 2, where the client 201 interacts with the game player 102 and the core content of the developed game is deployed on the server 202. In this case, the client 201 sends an inference request to the server 202 while receiving game operations from the game player 102. When the server 202, in the course of responding to the inference request, needs information such as the battle actions and/or battle states of non-player characters, it may request the inference service of the game from the game inference apparatus 300. The server 202 may then forward the battle actions and/or battle states of the non-player characters returned by the game inference apparatus 300 to the client 201, so that the client 201 controls the non-player characters to interact with the game player 102 in the game. In an actual application scenario, the server 202 may request the inference service of the game for multiple clients 201 at the same time.
In another embodiment, the game terminal 200 may include only the client 201. As shown in FIG. 2, the complete game is then deployed on the client 201, and the client 201 may send an inference request to the game inference apparatus 300. The game inference apparatus 300 then returns the resulting inference information to the client 201, so that the client 201 interacts with the game player 102 in the game based on that inference information.
The game inference apparatus 300 may include the communication module 301 and the inference module 302 shown in FIG. 2. While the game inference apparatus 300 provides the inference service, the communication module 301 may communicate with the game terminal 200 through a layer 7 protocol (that is, an application layer protocol) of the Open Systems Interconnection (OSI) model. In other words, the communication module 301 may receive an inference request that the game terminal 200 sends based on that protocol, and pass the inference request, or the to-be-processed data carried in it, to the inference module 302. The inference module 302 may use the corresponding AI model to perform the inference for the game terminal 200 and send the inference result to the game terminal 200 through the communication module 301.
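The division of work between the communication module and the inference module can be sketched as two small classes. The class and method names, the request layout, and the rule-based `toy_policy` that stands in for a trained AI model are all assumptions made for illustration.

```python
class InferenceModule:
    def __init__(self, policy):
        # `policy` is any callable mapping a state to an action; it stands in for an AI model.
        self.policy = policy

    def infer(self, pending_data):
        return {"action": self.policy(pending_data["state"])}


class CommunicationModule:
    def __init__(self, inference_module):
        self.inference_module = inference_module

    def handle(self, inference_request):
        # Pass only the to-be-processed data to the inference module,
        # then wrap the result in a response for the game terminal.
        result = self.inference_module.infer(inference_request["data"])
        return {"request_id": inference_request["request_id"], **result}


def toy_policy(state):
    # Hypothetical rule: attack when the opponent is close, otherwise approach.
    return "attack" if state["enemy_distance"] < 2.0 else "move_closer"


comm = CommunicationModule(InferenceModule(toy_policy))
response = comm.handle({"request_id": 7, "data": {"state": {"enemy_distance": 1.5}}})
```

Keeping the two roles separate mirrors the possibility, described in this embodiment, of deploying the communication module and the inference module on different servers.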
The game inference apparatus 300 may be deployed on a server in a cloud center, or on a server in an edge center. In yet another possible implementation, the communication module 301 and the inference module 302 of the game inference apparatus 300 may be deployed separately; for example, the communication module 301 may be deployed on a server in an edge center while the inference module 302 is deployed on a server in a cloud center. This embodiment does not limit how the game inference apparatus 300 is deployed. It should be understood that a cloud center in this application denotes a set of devices that a cloud service provider sets up to serve the cloud tenants of a region (for example, the East China region). A cloud center usually includes a large amount of resources and can provide basic resource services and/or software application services for cloud tenants in the various areas of the region. The cloud center includes multiple computing devices (such as servers), and the hardware resources in each computing device, together with the virtual resources abstracted from them, can be used to deploy the game inference apparatus 300. Hardware resources and virtual resources may also be referred to as computing nodes; for example, a computing node may specifically be a container or a virtual machine.
However, in some application scenarios with strict latency requirements, serving certain cloud tenants from a cloud center may not meet the latency requirements, so cloud service providers also set up edge centers.
An edge center in this application denotes a set of devices that a cloud service provider sets up in at least one specific area of a region. An edge center also includes multiple computing devices and can be used to serve the tenants in a specific area of the region. Because an edge center is geographically closer than a cloud center to the cloud tenants in that specific area, it can respond to service requests more quickly.
A cloud service provider can provide cloud services through a cloud platform. The cloud platform includes the software resources and hardware resources owned by the cloud service provider; specifically, it includes the software system that interacts with cloud tenants to sell, configure, and run cloud services, as well as the cloud service provider's cloud centers or edge centers. The cloud platform can present at least one cloud service to users. After a cloud tenant purchases and configures a cloud service on the cloud platform, and while the cloud tenant uses that cloud service, the cloud platform can call the apparatus for that cloud service deployed on nodes in a cloud center or an edge center to respond to service requests. For example, in this application, after a user configures the inference service of a game in the graphical user interface of the cloud platform, the game inference apparatus 300 deployed in a cloud center or an edge center can process inference requests from the game terminal 200 and return responses.
Before the game terminal 200 requests the inference service of a game, that inference service may be configured on the cloud platform in advance. In this case, the game inference apparatus 300 may be an apparatus for configuring the inference service of a game on the cloud platform. In a specific implementation, as shown in FIG. 2, the game inference apparatus 300 may further include a configuration module 303. When the inference service is configured, the game developer 101 may provide the cloud platform with a configuration file for a specific game. The configuration module 303 in the game inference apparatus 300 can then configure the inference service of that game for the game developer 101 on the cloud platform, based on the game algorithm framework of the cloud platform and the configuration file provided by the game developer 101, so that the inference module 302 can use the successfully configured inference service to perform game inference for the game terminal 200. The game algorithm framework may predefine multiple training algorithms, multiple AI models, and the like, so that the configuration module 303 can select from the framework, according to the configuration file, the training algorithm, AI model, and other content needed to implement the inference service for the game.
In this way, for each game developed by different game developers 101, the game inference apparatus 300 can use the common game algorithm framework and the configuration file provided by the game developer 101 to automatically configure, on the cloud platform, the inference service that the game developer 101 needs. For example, when the game inference apparatus 300 acquires a first configuration file that includes configuration information of a first game, it can configure the inference service of the first game on the cloud platform based on the game algorithm framework on the cloud platform and the first configuration file; and when the game inference apparatus 300 acquires a second configuration file that includes configuration information of a second game, it can configure the inference service of the second game on the cloud platform based on the game algorithm framework and the second configuration file. The first game and the second game are different games. The game inference apparatus 300 can subsequently use the configured inference service to perform game inference for the game terminal 200. The cloud service provider therefore does not need to design a dedicated inference service for each game, which effectively reduces the difficulty and cost of providing inference services and improves the efficiency with which the cloud service provider provides them.
Moreover, in practice, applying AI technology depends on training algorithms, typified by reinforcement learning (RL), and on the computing resources needed to run them. Game developers such as small and medium-sized game studios may find it difficult to obtain high-quality AI models and enough computing resources to implement game inference. By using the inference service on the cloud platform to implement game inference, the game terminal 200 effectively lowers the difficulty for the game developer 101 of applying AI technology to obtain game inference results.
It should be understood that the structure of the game inference apparatus 300 shown in FIG. 2 is only an example, and other implementations are possible in practice. For example, in a possible implementation, the game inference apparatus 300 may be split into a configuration apparatus and an inference apparatus, where the configuration apparatus configures the inference service of a game on the cloud platform and the inference apparatus uses the configured inference service to perform the corresponding game inference for the game terminal 200. Alternatively, the game inference apparatus 300 may further include an online service module, which publishes the inference service as an online cloud service on the cloud platform after the configuration module 303 finishes configuring the inference service of the game. Alternatively, in other game inference apparatuses 300, the configuration module 303 and the inference module 302 may be integrated into one module. This embodiment does not limit the specific implementation of the game inference apparatus 300.
Next, various non-limiting specific implementations of the method for configuring an inference service of a game on a cloud platform are described in detail.
FIG. 3 is a schematic flowchart of a method for configuring an inference service of a game on a cloud platform according to an embodiment of this application. The method may be applied to the game inference apparatus 300 shown in FIG. 2. In this embodiment, the game inference apparatus 300 may configure, on the cloud platform, inference services for multiple games according to the respective configuration files of those games. For ease of understanding and description, the following uses an example in which the game inference apparatus 300 configures the inference service of a first game on the cloud platform according to a first configuration file corresponding to the first game. The process by which the game inference apparatus 300 configures the inference service of a second game (and other games) according to a second configuration file corresponding to that game can be understood by reference to the following embodiments, and is not limited in this embodiment. The method shown in FIG. 3 for configuring an inference service of a game on a cloud platform may specifically include:
S301: The game inference apparatus 300 obtains a first configuration file, where the first configuration file includes configuration information for the first game.
Typically, the game terminal 200 has limited data processing capability and AI models of limited quality, which makes it difficult to infer information such as the actions and/or states of a target object in real time. In this embodiment, therefore, while developing the first game, the game developer 101 may request the game inference apparatus 300 to configure an inference service for the first game, so that the stronger data processing capability and higher-quality AI models of the cloud platform can be used to perform inference for a target object in a game application instance of the first game (such as a game character in the instance). The inference service may be used to infer the actions performed by, and/or the states of, one or more game characters in the game.
As an implementation example, the game developer 101 may provide the first configuration file to the game inference apparatus 300, so that the game inference apparatus 300 configures the inference service of the first game on the cloud platform according to the first configuration file. It should be understood that the game developer 101 may send the first configuration file to the game inference apparatus 300 through the game terminal 200, as shown in FIG. 3. In other embodiments, the game developer 101 may use any other device to provide the first configuration file. For example, the game developer 101 may use any terminal device to log in to the web page of the cloud platform and perform configuration operations in the web interface, thereby providing the first configuration file to the game inference apparatus 300.
As an example, the first configuration file may include one or more of the following: the action space of the target object in the game application instance of the first game, the state space of the target object in the game application instance of the first game, a reward function, the type of the AI model, the type of the target training algorithm used to train the AI model, the training mode of the AI model, the inference mode of the AI model, the storage address of the AI model, and the specification of the computing resources used for AI model training and inference. Optionally, when the first configuration file includes only some of this information, the rest may be determined by the game inference apparatus 300 itself; for example, it may be preset on the cloud platform.
The action space of the target object includes at least one action of the target object, such as moving forward, moving backward, moving left, moving right, jumping, attacking, or dodging in the first game. Correspondingly, the action that the game inference apparatus 300 infers for the target object is one of the actions in the action space. In this embodiment, the target object may be a non-player character in the game application instance of the first game running on the game terminal 200, such as a "bot" or "jungle monster" that is controlled by the game terminal 200 and fights against players in a multiplayer online battle arena (MOBA) game. Alternatively, the target object may be an object whose actions or states in a future time period (or another time period) need to be predicted. For example, game player A may use the inference service of the first game to predict the behavior of game player B at multiple future moments; in this case, the process in which game player A predicts the behavior and/or state of game player B can be regarded as part of game player A's experience of the game.
The state space of the target object includes at least one state of the target object, such as the health, mana, attack power, defense power, skill cooldown time, or mood of a non-player character in the game scene. Correspondingly, the state that the game inference apparatus 300 infers for the target object is one of the states in the state space.
The type of the target training algorithm is the type of algorithm used to train the AI model that infers the actions and/or states of the target object. It may, for example, be specified by the game developer 101 before the inference service is configured, or it may be determined by the game inference apparatus 300. For example, the target training algorithm may be any one of the following reinforcement learning algorithms: the deep Q-network (DQN) algorithm, the proximal policy optimization (PPO) algorithm, the soft actor-critic (SAC) algorithm, the deep deterministic policy gradient (DDPG) algorithm, the twin delayed deep deterministic policy gradient (TD3) algorithm, or the Rainbow algorithm; it may also be another applicable algorithm.
The type of the AI model is the type of AI model used to infer the actions and/or states of the target object. For example, the AI model may be any one of the following neural network models: a deep neural network (DNN) model, a recurrent neural network (RNN) model, or a convolutional neural network (CNN) model; it may also be another applicable model.
While running, the target training algorithm may input training data into the AI model and adjust the parameters and/or network structure of the AI model according to the difference between the result output by the AI model and the actual result in the training data, thereby training the AI model.
The reward function is a function used during training to control the direction in which the parameters of the AI model are adjusted. Specifically, during model training, the game inference apparatus 300 may use the reward function to calculate a reward value for the result output by the AI model, and may determine the adjustment direction of the parameters according to how the reward value compares with a preset threshold; model training is completed through multiple such iterative adjustments.
The training mode of the AI model may be, for example, self-play. In self-play, the model plays against a virtual opponent, which may be the model itself with its past experience or an agent trained from another model. Specifically, the game inference apparatus 300 may obtain training data for object A generated by one or more game application instances and input it into the AI model. The AI model outputs an inferred action for object A based on that training data, and the inferred action is used to play against object B, so that the parameters of the AI model can be adjusted according to the outcome of the game between object A and object B.
The inference mode may be, for example, inferring actions and/or states of a single style for the target object, or inferring actions and/or states of multiple styles in parallel.
The storage address of the AI model indicates where on the cloud platform the AI model is saved after its training is completed.
The specification of the computing resources used for AI model training and inference is the specification of the computing resources that the game inference apparatus 300 relies on when training the AI model or using it for inference. The size of this specification may, for example, be defined by the game developer 101 in the first configuration file.
In practice, the first configuration file may also include other information, such as a heuristic algorithm, which may be used to impose plausibility constraints on the actions and/or states that the target training algorithm infers using the AI model. This embodiment does not limit the specific implementation of the first configuration file.
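The items that the first configuration file may carry can be pictured with a minimal sketch. The key names, default values, and the `complete_config` helper below are all hypothetical; the application lists the kinds of information but does not define a file format. The sketch only illustrates the idea that items omitted by the game developer may be filled in from presets on the cloud platform.

```python
# Hypothetical key names and presets; the application does not define them.
PLATFORM_PRESETS = {
    "model_type": "DNN",
    "training_algorithm": "PPO",
    "training_mode": "self_play",
    "inference_mode": "single_style",
    "model_save_path": None,
    "compute_spec": None,
}


def complete_config(developer_config: dict) -> dict:
    """Fill items the developer omitted with platform-side presets."""
    merged = dict(PLATFORM_PRESETS)
    merged.update(developer_config)  # developer-supplied values take priority
    return merged


first_config = complete_config({
    "action_space": ["forward", "backward", "left", "right",
                     "jump", "attack", "dodge"],
    "state_space": ["health", "mana", "attack_power",
                    "defense_power", "skill_cooldown"],
    "training_algorithm": "SAC",
})
```

Here the developer supplies only the action space, the state space, and the algorithm type; every other item falls back to the assumed preset.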
In one possible implementation, the game terminal 200 may present a configuration interface such as that shown in FIG. 4. The configuration interface may display information prompting the game developer 101 to enter the relevant configuration, so that the game developer 101 can configure the action space, state space, reward function, target training algorithm, AI model, and so on of the target object in the interface (configuration of the remaining items is not shown). The game terminal 200 may present selectable configuration items in the interface, such as the candidate types of target training algorithm and AI model shown in FIG. 4. These configuration items may be provided to the game terminal 200 in advance by the game inference apparatus 300, so that the game developer 101 can select a training algorithm and an AI model directly in the interface without supplying specific training algorithm files or AI model files, which further simplifies configuration for the game developer 101. The game terminal 200 may then automatically generate the corresponding first configuration file from the configuration items selected by the game developer 101 and send it to the game inference apparatus 300. In practice, the game developer 101 may also provide the first configuration file to the game inference apparatus 300 in other ways; the above is merely an example, and the specific implementation is not limited.
S302: The game inference apparatus 300 configures the inference service of the first game on the cloud platform based on the game algorithm framework of the cloud platform and the obtained first configuration file.
The game algorithm framework may be an algorithm library in which one or more different types of training algorithms and AI models are predefined; the training algorithms and AI models in the framework may be built in advance. When an AI model in the game algorithm framework is built in advance, it may be built on a fully connected network structure or a convolutional neural network structure commonly used in reinforcement learning. An L2 regularization term and/or a Dropout layer may also be added to the network architecture of the AI model to improve its generalization performance and, as far as possible, to prevent the inferred actions output by the trained AI model from generalizing poorly. In practical scenarios, the number of network layers and/or neurons in the AI model may also be adjusted to suit application requirements. When configuring the inference service of the first game, the game inference apparatus 300 may invoke the training algorithms and AI models in the game algorithm framework through an application programming interface (API). In practice, the game developer 101 may subscribe to the game algorithm framework on the cloud platform, so that the game inference apparatus 300 configures the inference service of the first game based on the subscribed framework.
As an implementation example of configuring the inference service, the game inference apparatus 300 may include the communication module 301 and the configuration module 303 shown in FIG. 2. After obtaining the first configuration file provided by the game developer 101, the communication module 301 may send it to the configuration module 303. The configuration module 303 may invoke, from the game algorithm framework, a target training algorithm of the type defined in the first configuration file, and one or more AI models of the type defined in the first configuration file. Because an AI model in the game algorithm framework usually infers poorly before it is trained, the configuration module 303 may train the AI model with the invoked target training algorithm and then configure the inference service required by the game developer 101 based on the trained AI model. The specification of the computing resources that the configuration module 303 needs to train the AI model, and that the inference module 302 subsequently needs to perform game inference with it, may be determined according to the first configuration file provided by the game developer 101.
For example, the configuration module 303 may train the AI model by means of hyperparameter search. The hyperparameters of an AI model are parameters that are set before model training, whereas the remaining parameters of the AI model are determined through the subsequent training process. In practice, preset hyperparameters do not necessarily bring the AI model's inference performance for the target object to a high, or the highest, level. In this embodiment, therefore, the game inference apparatus 300 may train the AI model by means of hyperparameter search to determine hyperparameters with which the trained AI model achieves better inference performance.
In a specific implementation, the configuration module 303 may determine multiple candidate sets of hyperparameter values in advance and build multiple AI models based on the different values. The following takes a first AI model and a second AI model as an example (in practice, more AI models may be built from the various possible hyperparameter values). The first AI model and the second AI model have different hyperparameters, but both are of the AI model type defined in the first configuration file. The configuration module 303 may then train the first AI model and the second AI model separately using training data obtained in advance. Usually, at least one of the candidate sets of hyperparameter values yields a trained AI model with better inference performance, for example one whose inferred actions and/or states are more applicable. In this embodiment, the inference performance of an AI model may be measured by a reward value. Specifically, when training the first AI model and the second AI model, the configuration module 303 may use the reward function defined in the first configuration file, that is, the reward function preconfigured by the game developer 101, to calculate a reward value for each model. If the reward value of the first AI model is greater than that of the second AI model, the configuration module 303 takes the hyperparameters of the first AI model as the hyperparameters found by the search and takes the first AI model as the better-trained AI model. In practice, when more than two AI models are built, the configuration module 303 may take the hyperparameters of the AI model with the largest reward value as the hyperparameters found by the search; correspondingly, the AI model with those hyperparameters is the best-trained AI model. It should be understood that completing the hyperparameter search according to reward values is merely an example; in practice, the hyperparameter search may be completed in other possible ways, which is not limited in this embodiment.
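The reward-based selection described above can be sketched as follows. This is only an illustration under stated assumptions: `search_hyperparameters` and `toy_train_and_evaluate` are hypothetical names, and the toy evaluation function stands in for actually training an AI model and scoring it with the developer's reward function.

```python
def search_hyperparameters(candidates, train_and_evaluate):
    """Return the candidate whose trained model earns the largest reward."""
    best_params, best_reward = None, float("-inf")
    for params in candidates:
        # One model is trained and scored per candidate set of values.
        reward = train_and_evaluate(params)
        if reward > best_reward:
            best_params, best_reward = params, reward
    return best_params, best_reward


def toy_train_and_evaluate(params):
    # Stand-in for real training: the reward peaks at learning_rate = 0.01.
    return -abs(params["learning_rate"] - 0.01)


candidates = [{"learning_rate": lr} for lr in (0.1, 0.01, 0.001)]
best_params, best_reward = search_hyperparameters(
    candidates, toy_train_and_evaluate)
```

With two candidates this reduces to the first-model versus second-model comparison in the text; with more candidates it picks the model with the largest reward value.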
During the hyperparameter search, the configuration module 303 may create a single process on the cloud platform and use it to train the AI models with different hyperparameters serially, where the same reward function is used to train each of these AI models.
Alternatively, the configuration module 303 may create multiple processes on the cloud platform. Each process may be responsible for training at least one AI model, and the AI models trained by different processes have different hyperparameters. For example, when the configuration module 303 has built the first AI model and the second AI model, it may create a first process and a second process on the cloud platform, where the first process trains the first AI model and the second process trains the second AI model. A process created by the configuration module 303 may take the form of a worker, and each worker may be implemented as a process or in a similar manner. The configuration module 303 can thus use multiple processes to train multiple AI models in parallel, which effectively improves the efficiency of the hyperparameter search. In practical scenarios, when the number of processes equals the number of AI models, processes and AI models may correspond one to one, and the configuration module 303 can train the AI model of each process in parallel. When the number of processes is smaller than the number of AI models (with different AI models having different hyperparameters), one process may correspond to multiple AI models; the configuration module 303 may then first train some of the AI models in parallel with the processes and, once those are trained, continue to train the remaining AI models with the same processes. Optionally, after some of the AI models have been trained, the processes may exchange hyperparameter data, so that processes can reuse the hyperparameters of well-performing AI models and randomly explore new hyperparameters from among the remaining ones, which reduces the computational overhead of training multiple AI models.
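The case in which there are fewer processes than AI models can be illustrated with a small scheduling sketch. The function name `plan_training_waves` and the model identifiers are hypothetical, and splitting the models into consecutive batches is just one assumed way of deciding which models are trained first.

```python
def plan_training_waves(model_ids, num_workers):
    """Group models into waves; each wave is trained in parallel,
    one model per worker, and waves run one after another."""
    if num_workers < 1:
        raise ValueError("at least one worker is required")
    return [model_ids[i:i + num_workers]
            for i in range(0, len(model_ids), num_workers)]


# Five models with different hyperparameters, but only two workers.
waves = plan_training_waves(["m1", "m2", "m3", "m4", "m5"], num_workers=2)
```

When the number of workers equals the number of models, the plan collapses to a single wave, which corresponds to the one-to-one case.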
In the foregoing implementation, the single best AI model that the configuration module 303 trains by means of hyperparameter search can be used to infer actions/states of one style. In a further implementation, the configuration module 303 may also use approaches such as population based training (PBT) to obtain AI models that can infer actions/states of multiple different styles, where the configuration module 303 uses different reward functions to train AI models of different styles. For example, the configuration module 303 may build a first AI model and a second AI model on the cloud platform, where the first AI model corresponds to reward function 1 and is trained with reward function 1 to infer actions/states of a first style, while the second AI model corresponds to reward function 2 and is trained with reward function 2 to infer actions/states of a second style. PBT is an asynchronous automatic hyperparameter tuning method that combines parallel search with sequential optimization. During PBT, the configuration module 303 may use different predefined reward functions to train AI models that infer in different styles; that is, each reward function may correspond to an AI model of one inference style. For example, in a battle game, for non-player character A, the configuration module 303 may use PBT to train AI models of three combat styles, "aggressive", "conservative", and "balanced", each corresponding to its own reward function: the "aggressive" AI model to reward function 1, the "conservative" AI model to reward function 2, and the "balanced" AI model to reward function 3. For each combat style, the configuration module 303 may use the hyperparameter search described above to train an AI model that belongs to that style and has better inference performance (such as the AI model with the largest reward value). The different reward functions used during PBT may be determined by the configuration module 303 according to the first configuration file, or may be set by the configuration module 303 itself.
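One way to picture per-style reward functions is sketched below. The weights, the two inputs (damage dealt and damage taken), and the names `make_style_reward` and `STYLE_REWARDS` are assumptions made purely for illustration; the application does not specify how reward functions 1 to 3 are defined.

```python
def make_style_reward(dealt_weight, taken_weight):
    """Build a reward function that trades damage dealt against damage taken."""
    def reward(damage_dealt, damage_taken):
        return dealt_weight * damage_dealt - taken_weight * damage_taken
    return reward


# Hypothetical weights: one reward function per combat style.
STYLE_REWARDS = {
    "aggressive": make_style_reward(2.0, 0.5),    # favors dealing damage
    "conservative": make_style_reward(0.5, 2.0),  # favors avoiding damage
    "balanced": make_style_reward(1.0, 1.0),
}

# The same battle outcome is scored differently under each style.
rewards = {style: fn(10.0, 10.0) for style, fn in STYLE_REWARDS.items()}
```

Because the same outcome earns different rewards, models trained against different reward functions are pushed toward different behavior, which is the effect the text attributes to using one reward function per style.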
During population based training (PBT), an AI model may evolve into a new AI model through iterative training. If a port were allocated to every AI model (both new and old), the large number of AI models produced would cause the processes to occupy too many ports. Therefore, when iteratively training an AI model with a process, the configuration module 303 may replace the old AI model with the new one. The process can then reuse the port of the old AI model to receive training data and use that data to train the new AI model, which effectively prevents the processes in the game inference apparatus 300 from occupying too many ports because of the number of AI models produced by PBT.
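The port reuse described above can be sketched with a small bookkeeping class. `ModelPortTable`, the model identifiers, and the port number are hypothetical; the sketch shows only that replacing an old model with its successor keeps the set of occupied ports unchanged.

```python
class ModelPortTable:
    """Tracks which port each live AI model receives training data on."""

    def __init__(self):
        self._port_by_model = {}

    def assign(self, model_id, port):
        self._port_by_model[model_id] = port

    def replace(self, old_model_id, new_model_id):
        # The new model inherits the old model's port; no new port is opened.
        port = self._port_by_model.pop(old_model_id)
        self._port_by_model[new_model_id] = port
        return port

    def ports_in_use(self):
        return set(self._port_by_model.values())


table = ModelPortTable()
table.assign("model_v1", 50051)
reused_port = table.replace("model_v1", "model_v2")
```

After the replacement, only one port is in use even though two model generations have existed.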
When iteratively training the AI model by means of hyperparameter search and/or population based training (PBT), the configuration module 303 may train the AI model in the self-play training mode defined in the first configuration file.
In this embodiment, the training data needed to train the AI model may be provided to the game inference apparatus 300 by the game terminal 200. After receiving the training data, the communication module 301 in the game inference apparatus 300 may send it to the configuration module 303. For example, the training data may be generated by a game application instance of the first game running on the game terminal 200; it may be game data such as the attack power, defense power, and health of non-player characters and player characters at different moments of a battle in that instance.
As an implementation example, the game terminal 200 may send multiple training requests to the communication module 301, where different training requests include different training data for the target object. As shown in FIG. 5, the training requests come from multiple game application instances of the first game running on the game terminal 200. For the same target object, the game application instances can each generate data about the target object while running. The game application instances on the game terminal 200 can therefore generate more training data per unit time, which speeds up the training of the AI model by the game inference apparatus 300. This embodiment uses multiple game application instances generating training data in parallel as an example; in other embodiments, the game terminal 200 may run only one game application instance of the first game to generate training data.
Further, the training data generated for the target object by a game application instance running on the game terminal 200 may contain some invalid data, such as data marking the data generation time or the data size. Such data provides no guidance for the configuration module 303 in training the AI model, so the game terminal 200 may filter it out as invalid data. This reduces the amount of training data that the game terminal 200 sends to the game inference apparatus 300, which in turn reduces the latency caused by data communication between the two and improves model training efficiency.
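The filtering step on the game terminal side can be sketched as follows. This is a minimal illustration only: the field names (`generated_at`, `payload_size`, `npc_hp`, and so on) are hypothetical, since the description says only that fields such as the generation time and data size carry no training signal.

```python
# Hypothetical field names; the description only says that fields such as
# the generation time and the data size do not help train the AI model.
INVALID_FIELDS = {"generated_at", "payload_size"}


def filter_training_record(record: dict) -> dict:
    """Return a copy of the record without fields that do not help training."""
    return {key: value for key, value in record.items() if key not in INVALID_FIELDS}


record = {
    "npc_hp": 80,
    "player_hp": 65,
    "npc_attack": 12,
    "npc_defense": 7,
    "generated_at": "2022-01-17T10:00:00Z",
    "payload_size": 412,
}
filtered = filter_training_record(record)
```

Filtering before transmission, rather than on the cloud side, is what reduces the amount of data sent over the connection.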
When multiple processes are used to train multiple AI models in parallel, take as an example a first process responsible for training a first AI model and a second process responsible for training a second AI model. After receiving the multiple training requests, the communication module 301 may send the training data in those requests to the first process and the second process according to the port number and/or IP address of the first process and the port number and/or IP address of the second process. In this way, when training multiple AI models in parallel, the configuration module 303 can use the first process and the training data it receives to train the first AI model, and use the second process and the training data it receives to train the second AI model. Similarly, when one process is responsible for training multiple AI models (for example, when there are fewer processes than AI models), the way a single process trains any one of those AI models is similar to the above process and is not repeated here.
Optionally, the training data that different processes use to train their corresponding AI models may come from one or more game application instances of the first game. After obtaining the training data generated by different game application instances of the first game, the communication module 301 may send that training data to one or more processes according to a pre-configuration. For example, as shown in FIG. 5, process 1 and process 2 may be created on the game inference apparatus 300, and game application instance 1, game application instance 2, and game application instance n of the first game may run on the game terminal 200. After receiving the training data generated by these three game application instances, the communication module 301 may send training data 1 of game application instance 1 and training data 2 of game application instance 2 to process 1, and send training data n of game application instance n to process 2. The communication module 301 may be preconfigured with a correspondence between processes and game application instances, so that it can send the training data generated by each game application instance to the corresponding process. In other possible examples, the communication module 301 may also send training data 1 of game application instance 1 to both process 1 and process 2; this embodiment does not limit how the communication module 301 distributes the training data. In this way, the game inference apparatus 300 can use the training data generated by different game application instances on the game terminal 200 to train multiple AI models at the same time. Moreover, the game terminal 200 can provide training data to the communication module 301 through a single output port, rather than providing training data for AI models of different inference styles through multiple output ports, which lowers the port requirements on the game terminal 200.
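The preconfigured correspondence between game application instances and training processes can be sketched as a simple lookup table. The mapping, the process names, and the use of instance id 3 to stand for instance n are all assumptions made for illustration; a real deployment would address processes by port number and/or IP address.

```python
from collections import defaultdict

# Hypothetical static mapping from game application instance id to the
# training processes that receive its data (instance "n" is shown as 3).
ROUTES = {1: ["process_1"], 2: ["process_1"], 3: ["process_2"]}


def dispatch(requests, routes=ROUTES):
    """Group training data by destination process according to the mapping."""
    inbox = defaultdict(list)
    for instance_id, training_data in requests:
        for process in routes.get(instance_id, []):
            inbox[process].append(training_data)
    return dict(inbox)


inbox = dispatch([(1, "data_1"), (2, "data_2"), (3, "data_n")])
```

Listing two processes for one instance, for example `{1: ["process_1", "process_2"]}`, corresponds to the variant in which the same training data is sent to both processes.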
After the game inference apparatus 300 has successfully configured the inference service of the first game, it may further generate a notification message and send it to the game terminal 200 through the communication module 301 to inform the game terminal 200 about the training process. In practice, the game inference apparatus 300 may also feed back to the game terminal 200, through the communication module 301, the training results for the target object in each round of training (for example, the game outcomes). In this way, the game developer 101 can view the data relating to model training through a corresponding interface or window on the game terminal 200.
This embodiment takes as an example the case in which the game inference apparatus 300 creates a new AI model and trains it. In other practical approaches, an existing AI model may be reused and trained by means of "surgery": a new network layer is added to the network structure of the existing AI model, so that training mainly searches the hyperparameters of the newly added layer and trains the parameters of that layer. This improves the efficiency of model training and reduces the amount of computation it requires.
In this embodiment, after the inference service for the first game has been configured on the cloud platform, the game inference apparatus 300 can use that inference service to respond to inference requests for the first game sent by the game terminal 200, thereby performing the corresponding game inference for the game terminal 200. Accordingly, this embodiment may further include:
S303: While running a game application instance, the game terminal 200 sends an inference request to the game inference apparatus 300, the inference request including the data to be processed of the target object in the game application instance of the first game.
For example, in a game battle scenario, the data to be processed carried in the inference request may be data indicating the states, in the first game, of the non-player character (that is, the target object) and of the opposing player character. It may be a game screen showing the non-player character and the opposing player character, whose content may include their health, mana, attack power, defense power, skill status, and similar information; or it may be text describing their battle states. In a user behavior prediction scenario, the data to be processed may be, for example, video or image data containing the user's past actions. This embodiment does not limit the specific form of the data to be processed.
In practice, the game inference apparatus 300 may provide an API to the game terminal 200, through which the game terminal 200 can send multiple inference requests to the communication module 301 in the game inference apparatus 300. The multiple inference requests respectively request the inferred actions and/or states of the target object at multiple future moments.
Successfully establishing a communication connection between the game terminal 200 and the communication module 301 (for example, a TCP connection established through a three-way handshake) usually takes some time, so the latency of each request by the game terminal 200 for the action inference service may be increased by the connection establishment process. Accordingly, in a possible implementation, the game terminal 200 may establish a persistent connection with the communication module 301, for example by using Hypertext Transfer Protocol Version 1.1 (HTTP/1.1). After the game terminal 200 has successfully established a connection with the communication module 301 once, each subsequent request to the game inference apparatus 300 for the inference service can skip connection establishment and send the inference request directly over the established persistent connection, which effectively reduces the latency with which the game terminal 200 obtains the inference result for the target object. Correspondingly, the game inference apparatus 300 may also receive the training data sent by the game terminal 200 over a pre-established persistent connection.
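A client that reuses one HTTP/1.1 connection for many inference requests might look like the sketch below. It uses Python's standard `http.client` module; the host name, port, `/infer` path, and JSON payload shape are placeholders chosen for illustration and do not come from the description.

```python
import http.client
import json


class InferenceClient:
    """Reuses one HTTP/1.1 connection for many inference requests."""

    def __init__(self, host: str, port: int, path: str = "/infer"):
        # host, port and path are placeholders, not values from the description.
        self.path = path
        # http.client speaks HTTP/1.1 and opens the socket lazily on first use.
        self.conn = http.client.HTTPConnection(host, port, timeout=5)

    @staticmethod
    def build_request(observation: dict):
        """Serialize the data to be processed and ask to keep the connection open."""
        body = json.dumps(observation).encode("utf-8")
        headers = {"Content-Type": "application/json", "Connection": "keep-alive"}
        return body, headers

    def infer(self, observation: dict) -> dict:
        body, headers = self.build_request(observation)
        self.conn.request("POST", self.path, body=body, headers=headers)
        response = self.conn.getresponse()
        # The body must be read in full before the connection can be reused.
        return json.loads(response.read())


client = InferenceClient("inference.example.invalid", 8080)
```

Calling `client.infer(...)` repeatedly sends each request over the same underlying socket as long as the server keeps the connection open, so the TCP handshake cost is paid once rather than per request.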
In a further possible implementation, the game terminal 200 and the game inference apparatus 300 may have different deployment environments; for example, the game terminal 200 is deployed in a Windows environment while the game inference apparatus 300 is deployed in a Linux environment. In that case, the game inference apparatus 300 deployed in the Linux environment may have difficulty performing action inference directly on data to be processed that was generated in the Windows environment. The communication module 301 may therefore first determine whether the format of the data to be processed is the target format. If it is not, which indicates that the game terminal 200 and the game inference apparatus 300 may be deployed in different environments, the communication module 301 may decode the data to be processed from its first format into the target format that the game inference apparatus 300 (that is, the cloud platform) can recognize, and provide the result to the inference module 302. Decoding data of one format into data of another format is well established in practice and is not described further in this embodiment. If the data to be processed in the inference request is already in the target format, the communication module 301 may send it directly to the inference module 302. Correspondingly, when providing training data to the game inference apparatus 300, the game terminal 200 may also process the format of the training data to obtain data in a format that matches the deployment environment of the cloud platform.
When another client sends an inference request to the game inference apparatus 300, the communication module 301 may likewise check whether the second format of the data to be processed in that request is the same as the target format and, when the two differ, convert the data to be processed from the second format into the target format.
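The format check performed by the communication module can be sketched as follows. The description does not say what the first, second, or target formats are, so this example assumes, purely for illustration, that the target format is UTF-8 encoded JSON, that a client declares its format with a tag, and that one non-target format is UTF-16-LE encoded JSON.

```python
import json

# Assumed target format on the cloud side; not specified in the description.
TARGET_FORMAT = "utf-8-json"

# Hypothetical source encodings keyed by the format tag a client declares.
SOURCE_ENCODINGS = {"utf-16-le-json": "utf-16-le"}


def to_target_format(raw: bytes, declared_format: str) -> bytes:
    """Forward target-format data unchanged; transcode anything else."""
    if declared_format == TARGET_FORMAT:
        return raw
    if declared_format in SOURCE_ENCODINGS:
        text = raw.decode(SOURCE_ENCODINGS[declared_format])
        return text.encode("utf-8")
    raise ValueError(f"unsupported format: {declared_format}")


client_payload = json.dumps({"hp": 42}).encode("utf-16-le")
converted = to_target_format(client_payload, "utf-16-le-json")
```

Supporting a further client format then only requires adding an entry to the lookup table; the same table can be used in the reverse direction to encode responses for the client.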
S304: The game inference apparatus 300 uses the configured inference service of the first game to perform inference on the data to be processed in the inference request, obtaining indication information for the action and/or state of the target object.
In a specific implementation, after receiving the data to be processed from the communication module 301, the inference module 302 may invoke the pre-configured inference service, which relies on an AI model trained in advance with the target training algorithm. The inference module 302 may input the data to be processed into the AI model and obtain the indication information that the AI model outputs for the action (such as attacking or fleeing) and/or state (such as increased defense, or mood) inferred for the target object.
Optionally, before obtaining the final indication information, the inference module 302 may also use a heuristic algorithm to constrain the indication information output by the AI model. A heuristic algorithm is one constructed from intuition or experience that gives a feasible solution for each instance of the combinatorial optimization problem to be solved within acceptable time and space complexity; the feasible solution may or may not be optimal, and its deviation from the optimal solution usually cannot be predicted. In practice, constraint rules may be predefined in the heuristic algorithm. For example, in a battle game scenario, suppose the action inferred by the inference module 302 for the non-player character (that is, the target object) is "move backward", that is, flee in the direction away from the opposing player character, but an impassable obstacle lies behind the non-player character's current position on the battle map while the other directions are clear, so the non-player character cannot move backward on that map. The inference module 302 may then use the heuristic algorithm to constrain the inferred action output by the AI model, specifically changing "move backward" to "move left" or "move right", which makes the action inferred for the non-player character more reasonable.
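A predefined constraint rule of this kind can be sketched as a fallback table. The action names, the fallback order, and the `stay` no-op are assumptions made for illustration; the description gives only the "move backward" to "move left" or "move right" example.

```python
# Hypothetical action names and fallback order for blocked movements.
FALLBACKS = {
    "move_backward": ["move_left", "move_right"],
    "move_forward": ["move_left", "move_right"],
    "move_left": ["move_backward", "move_forward"],
    "move_right": ["move_backward", "move_forward"],
}


def constrain_action(action: str, blocked: set) -> str:
    """Replace a movement that the map forbids with the first allowed fallback."""
    if action not in blocked:
        return action
    for candidate in FALLBACKS.get(action, []):
        if candidate not in blocked:
            return candidate
    return "stay"  # assumed no-op when every alternative is blocked too


chosen = constrain_action("move_backward", blocked={"move_backward"})
```

The rule is applied after the AI model has produced its output, so the model itself does not need to be retrained when the map layout changes.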
S305: The game inference apparatus 300 returns the indication information for the action and/or state of the target object to the game terminal 200.
In practice, if the game terminal 200 and the game inference apparatus 300 are deployed in different environments, the communication module 301 encodes the indication information for the action and/or state of the target object output by the inference module 302 into the first format that the game terminal 200 can recognize, and sends the indication information in the first format to the game terminal 200. The game terminal 200 can then recognize the indication information and have the target object, at the next moment, perform the corresponding action or present the corresponding state. Because the game inference apparatus 300 can thus provide the action inference service to game terminals 200 deployed in a variety of environments, the requirements on the deployment environment of the game terminal 200 are relaxed, and the inference service that the game inference apparatus 300 provides becomes more widely applicable.
The above embodiment describes how the game inference apparatus 300 trains the AI model and uses the trained AI model to infer the action and/or state of the target object. To aid understanding of the technical solutions of the embodiments of this application, an application example in a game scenario is described below. In this scenario, the game developer expects the game inference apparatus 300 to provide a series of inferred actions for non-player character A (character A) in a game application instance, so that character A can defeat non-player character B (character B) in the game, and so that character A can do so with series of inferred actions in three different fighting styles: "aggressive", "conservative", and "balanced". Character A and character B have the same initial health, the same strength in terms of attack power and defense power, and the same set of available actions. Character A wins when, within a specified number of steps, it attacks character B so that character B's health drops to 0 while character A still has health remaining.
Referring to FIG. 6, a schematic flowchart is shown of a method, provided by an embodiment of this application, for configuring an inference service of a game on a cloud platform in a specific game scenario. The method can be applied to the game inference apparatus 300 shown in FIG. 7. In addition to the components of the game inference apparatus 300 shown in FIG. 2, the game inference apparatus 300 shown in FIG. 7 may further include an object storage service module 304, a deployment module 305, and an online cloud service module 306.
As shown in FIG. 6, the method may specifically include:
S601: The object storage service module 304 stores in advance the program code implementing the communication module 301, the game algorithm framework, the inference module 302, and the configuration module 303.
The program code of the communication module 301, the game algorithm framework, the inference module 302, and the configuration module 303 may be developed in advance by technical personnel and saved in the object storage service module 304. In practice, the communication module 301 may be implemented, for example, with the Flask framework.
S602: The deployment module 305 deploys the program code stored in the object storage service module 304 on the cloud platform and publishes it as an online service.
In this embodiment, the communication module 301 and the inference module 302, which provide the game inference service to the game terminal 200, may form a cloud service deployed on the cloud platform that serves game developers online. In practice, the game inference apparatus 300 (or the communication module 301, the game algorithm framework, the inference module 302, and the configuration module 303) may be deployed in a cloud center or an edge center as a game AI framework.
S603: When the game developer 101 subscribes to the online service, the online service module 306 pulls from the storage node the program code implementing the communication module 301, the game algorithm framework, the inference module 302, and the configuration module 303, and deploys it on a computing node in the cloud center or edge center, so that the computing resources of the computing node support AI model training and action inference.
In practice, after the deployment module 305 has published the program code implementing the game inference service as an online service, the game developer 101 may subscribe to that online service on the cloud platform through the game terminal 200, which triggers the game inference apparatus 300 to configure the game inference service on the cloud platform for the game developer 101.
Further, after the online service module 306 has successfully deployed the program code implementing the communication module 301, the game algorithm framework, the inference module 302, and the configuration module 303 on the computing node, the communication module 301 may provide an API to the game terminal 200 so that the game terminal 200 can establish a communication connection with the game inference apparatus 300 through it. For example, a persistent connection may be established between the game terminal 200 and the communication module 301 based on a protocol such as HTTP/1.1.
S604: The game terminal 200 receives a configuration file provided by the game developer 101 and forwards it to the game inference apparatus 300. The configuration file defines the action space and state space of character A and character B, the type of training algorithm, the type of AI model trained by the training algorithm, the reward function used to train the AI model, an environment variable indicating the storage address of the AI model, and the specification of the computing resources.
S605: The configuration module 303 in the game inference apparatus 300 invokes the training algorithm and AI model of the corresponding types from the game algorithm framework according to the configuration file received by the communication module 301.
The action space of character A and character B may include walk forward, walk backward, walk left, walk right, attack 1, attack 2, attack 3, attack 4, attack 5, run forward, jump backward, run left, run right, and other actions. The state space includes states of character A and character B such as health, position, orientation, mana, attack power, and defense power.
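To make the content of such a configuration file concrete, the sketch below expresses it as a Python dictionary with a simple completeness check. The key names, the algorithm and model type strings, the environment variable name, and the resource figures are all placeholders invented for illustration; only the action and state names follow the description.

```python
# Illustrative configuration; every key name and every value other than the
# action and state names is a placeholder, not taken from the description.
CONFIG = {
    "action_space": [
        "walk_forward", "walk_backward", "walk_left", "walk_right",
        "attack_1", "attack_2", "attack_3", "attack_4", "attack_5",
        "run_forward", "jump_backward", "run_left", "run_right",
    ],
    "state_space": ["hp", "position", "orientation", "mana", "attack", "defense"],
    "training_algorithm": "example_algorithm",
    "model_type": "example_model",
    "reward_function": "balanced",
    "model_dir_env": "MODEL_DIR",  # name of the variable holding the storage address
    "resources": {"cpu_cores": 8, "memory_gb": 32},
}

REQUIRED_KEYS = {
    "action_space", "state_space", "training_algorithm", "model_type",
    "reward_function", "model_dir_env", "resources",
}


def missing_keys(config: dict) -> list:
    """List the required entries that the configuration does not define."""
    return sorted(REQUIRED_KEYS - config.keys())


problems = missing_keys(CONFIG)
```

Checking the file for completeness before any computing resources are allocated lets the configuration module reject an incomplete file early.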
Different reward functions are used to train AI models that infer different fighting styles. As an example, when training an AI model of the "balanced" fighting style, the reward function defined by the game developer may be as shown in formula (1):
Figure PCTCN2022072425-appb-000001 (formula (1))
where the symbol shown in Figure PCTCN2022072425-appb-000002 represents the reward value calculated by the reward function, hp represents health, t and t-1 represent different moments, and α is a preset coefficient.
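Because formula (1) survives only as an image reference, the sketch below is not that formula. It shows one common shape for a health-based reward that is consistent with the symbols named above (hp, the moments t and t-1, and a coefficient α): the damage dealt to the opponent between t-1 and t, minus α times the damage taken. The function name and parameter names are hypothetical.

```python
def shaped_reward(self_hp_prev, self_hp_now, enemy_hp_prev, enemy_hp_now, alpha=1.0):
    """Assumed reward shape: damage dealt minus alpha times damage taken.

    This is an illustrative form only, not the formula (1) of the description.
    """
    damage_dealt = enemy_hp_prev - enemy_hp_now   # opponent's HP lost from t-1 to t
    damage_taken = self_hp_prev - self_hp_now     # own HP lost from t-1 to t
    return damage_dealt - alpha * damage_taken


balanced = shaped_reward(100, 90, 100, 70, alpha=1.0)
cautious = shaped_reward(100, 90, 100, 70, alpha=2.0)
```

Under this assumed form, a larger α penalizes lost health more heavily and so would favor more cautious play, while a smaller α would favor more aggressive play; whether the described reward functions differ in exactly this way is not stated.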
After receiving the configuration file, the configuration module 303 in the game inference apparatus 300 may, according to the computing resource specification in the configuration file, allocate computing resources on the computing node that meet the specification to the game inference apparatus 300, so that the game inference service it configures runs on those resources. The configuration module 303 may also invoke, from the game algorithm framework, the training algorithm and AI model of the types specified in the configuration file, so as to train at least one AI model with the selected training algorithm. For example, the network structure of an AI model in the game algorithm framework may include a Dropout layer and an L2 regularization term, which improves the generalization performance of the AI model.
Because the AI model invoked from the game algorithm framework has not yet been trained, its inference performance is poor. The game inference apparatus 300 may therefore go on to perform step S606 to obtain training data for training the AI model.
S606: Multiple instances of the same game application are started on the game terminal 200, and multiple pieces of training data are sent to the communication module 301 by running a preset script. Each game application instance includes character A and character B, and each generates one piece of training data for character A and character B.
The script on the game terminal 200 may be developed in advance by technical personnel and, when run, supports communication between the game terminal 200 and the game inference apparatus 300.
In this embodiment, running multiple game application instances simultaneously on the game terminal 200 and generating multiple pieces of training data in parallel lets the game inference apparatus 300 obtain training data faster, which speeds up its model training.
S607: The communication module 301 decodes the training data sent by the game terminal 200 to obtain training data in the target format that the inference module 302 can recognize, and provides it to the configuration module 303.
In practice, the game terminal 200 and the game inference apparatus 300 may be deployed in different environments. After receiving the training data sent by the game terminal 200, the communication module 301 may therefore decode it into training data in the target format that the game inference apparatus 300 can recognize.
In a possible implementation, the communication module 301 may also preprocess the training data before sending it to the configuration module 303. For example, it may normalize information such as the game map and position coordinates in each piece of training data, and add corresponding features describing information such as the distance and bearing between characters.
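One way to carry out this preprocessing is sketched below. It assumes, for illustration only, that each sample stores the two characters' positions as (x, y) pairs under the hypothetical keys `a_pos` and `b_pos`, that normalization means scaling by the map size, and that the added features are the Euclidean distance and the bearing angle from character A to character B.

```python
import math


def preprocess(sample: dict, map_width: float, map_height: float) -> dict:
    """Scale positions to the range [0, 1] and add distance and bearing features."""
    ax, ay = sample["a_pos"]
    bx, by = sample["b_pos"]
    a = (ax / map_width, ay / map_height)
    b = (bx / map_width, by / map_height)
    dx, dy = b[0] - a[0], b[1] - a[1]
    return {
        **sample,
        "a_pos": a,
        "b_pos": b,
        "distance": math.hypot(dx, dy),   # distance from A to B in map units
        "bearing": math.atan2(dy, dx),    # direction from A to B in radians
    }


features = preprocess({"a_pos": (100, 100), "b_pos": (400, 500)}, 1000, 1000)
```

Scaling by the map size keeps samples from maps of different dimensions on a comparable scale before they reach the training processes.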
S608: The configuration module 303 runs multiple processes and uses the training data forwarded by the communication module 301 to train multiple AI models in parallel.
As an implementation example, the configuration module 303 may train the AI models in a distributed manner. Specifically, the configuration module 303 may include multiple processes, and each process may train one AI model based on one or more pieces of training data. For each AI model, the configuration module 303 may input the data of character A in the training data into the AI model to obtain the inferred action of character A output by the AI model, play the inferred action of character A against character B, and then use the resulting game outcome as feedback to adjust the parameters of the AI model. In this way, the efficiency with which the configuration module 303 trains multiple AI models can be improved. The process of training a model for character B is similar to that for character A; refer to the related description, which is not repeated here.
As shown in FIG. 8, in the initial training stage of the AI model, when the AI model has been iteratively trained about 200 times for character A, the remaining HP of character A at the end of the battle is lower than that of character B. However, as the number of training iterations increases, when the iterative training reaches 100 times, the remaining HP of character A at the end of the battle begins to exceed that of character B, that is, character A can defeat character B. This is also reflected in the win-rate curves of both sides shown in FIG. 9: when the iterative training reaches 100 times, the win rate of character A is close to 100%. When the iterative training of the AI model meets the convergence condition, for example, when the AI model makes the win rate of character A reach a preset value (such as 98%) in each of the most recent preset number (such as 20) of training iterations, the configuration module 303 may continue to train the AI model for character B in a similar manner.
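The convergence condition in the example above (a win rate of at least 98% in each of the most recent 20 training iterations) can be sketched as a sliding-window check. The class name `ConvergenceMonitor` and its interface are hypothetical. The text does not say how the configuration module 303 tracks win rates, so this is only one plausible way to do it.

```python
from collections import deque

class ConvergenceMonitor:
    """Tracks per-iteration win rates and reports convergence.

    Converged means: the last `window` iterations each reached
    `target_win_rate` (for example, 20 iterations at 0.98).
    """

    def __init__(self, window=20, target_win_rate=0.98):
        self.window = window
        self.target = target_win_rate
        self.recent = deque(maxlen=window)  # only the latest `window` values are kept

    def update(self, win_rate):
        """Record the win rate of one training iteration and return the status."""
        self.recent.append(win_rate)
        return self.converged()

    def converged(self):
        # Require a full window so that early lucky iterations do not count.
        return (len(self.recent) == self.window
                and all(r >= self.target for r in self.recent))
```

A training loop would call `update` once per iteration with the measured win rate of character A, and switch to training for character B when it returns `True`.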
Then, through hyperparameter search, population evolution, and similar techniques, AI models of the three combat styles described above, namely "aggressive", "conservative", and "balanced", can be obtained through training. For the specific implementation of hyperparameter search and population evolution, refer to the related descriptions in the foregoing embodiments; details are not repeated here.
The configuration module 303 may save each trained AI model at the storage address specified in the configuration file.
S609: The configuration module 303 sends, through the communication module 301, a notification to the game terminal 200 that AI model training is complete.
In this embodiment, when data is communicated between the game terminal 200 and the game inference apparatus 300, the communication module 301 may convert the format of the communication data so that each party can recognize the data sent by the other.
In practice, the game developer can use the game terminal 200 to view data generated while the game inference apparatus 300 trains the AI models. For example, the game developer 101 can view the training effect on the interface of the cloud platform through the game terminal 200. As shown in FIG. 10, the game developer 101 can view, on the interface of the cloud platform, the win-rate curves of character A during training for the AI models of the three combat styles obtained through population evolution. Alternatively, the game developer 101 can view, on the interface of the cloud platform, the curves of both sides' HP shown in FIG. 8 or the curves of both sides' win rates shown in FIG. 9.
S610: The game terminal 200 sends an action inference request to the game inference apparatus 300 to request action inference for character A. The action inference request includes the game screen of character A and character B and an identifier of the combat style.
S611: The communication module 301 converts the game screen and the combat style identifier into the target format and then sends them to the inference module 302.
S612: The inference module 302 uses the AI model corresponding to the combat style identifier to infer action indication information for character A from the game screen in the target format.
S613: The inference module 302 sends the action indication information of character A to the communication module 301.
S614: After converting the format of the action indication information, the communication module 301 sends the action indication information, in a format that the game terminal 200 can recognize, to the game terminal 200.
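Steps S610 to S614 can be sketched as a single request handler. Everything concrete here is an assumption for illustration: the JSON wire format, the field names `frame` and `style_id`, the class name `InferenceGateway`, and the use of plain callables in place of trained AI models. The text specifies only that the request carries a game screen and a combat style identifier, and that formats are converted in both directions.

```python
import json

class InferenceGateway:
    """Hypothetical stand-in for communication module 301 plus inference module 302."""

    def __init__(self, models):
        # Maps a combat style identifier to a callable: observation -> action.
        self.models = models

    @staticmethod
    def _to_platform(frame):
        # S611: convert game-side data into the platform's target format
        # (here simply a list of floats; the real format is not specified).
        return [float(v) for v in frame]

    def handle(self, raw_request):
        request = json.loads(raw_request)                  # S610: request from the game terminal
        observation = self._to_platform(request["frame"])  # S611: format conversion
        model = self.models[request["style_id"]]           # S612: select the model by style identifier
        action = model(observation)                        # S612: infer the action for character A
        # S613-S614: convert the result back into a format the game terminal understands.
        return json.dumps({"character": "A", "action": action})
```

Keeping format conversion at the boundary means the models only ever see the platform's format, so a game with a different data format would need a different converter but no change to the inference step.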
The method for configuring an inference service for a game on a cloud platform provided in the embodiments of this application has been described above with reference to FIG. 1 to FIG. 10. The following describes, with reference to the accompanying drawings, a computing device provided in the embodiments of this application for implementing the functions of the game inference apparatus 300 in the foregoing method embodiments.
FIG. 11 shows a computing device cluster. As shown in FIG. 11, the computing device cluster 1100 may be used to implement the functions of the game inference apparatus 300 shown in FIG. 3.
The computing device cluster 1100 includes at least one computing device, and each computing device may include a bus 1101, a processor 1102, and a memory 1103. The processor 1102 and the memory 1103 communicate with each other through the bus 1101.
The bus 1101 may be a peripheral component interconnect (PCI) bus, an extended industry standard architecture (EISA) bus, or the like. Buses may be classified into address buses, data buses, control buses, and so on. For ease of illustration, only one thick line is used in FIG. 11, but this does not mean that there is only one bus or only one type of bus.
The processor 1102 may be any one or more of processors such as a central processing unit (CPU), a graphics processing unit (GPU), a microprocessor (MP), a digital signal processor (DSP), or a neural network processing unit (NPU).
The memory 1103 may include volatile memory, such as random access memory (RAM). The memory 1103 may also include non-volatile memory, such as read-only memory (ROM), flash memory, a hard disk drive (HDD), or a solid state drive (SSD).
The memory 1103 stores executable program code, and the processor 1102 executes the executable program code to perform the foregoing method, performed by the game inference apparatus 300, for configuring an inference service for a game on a cloud platform.
An embodiment of this application further provides a computer-readable storage medium. The computer-readable storage medium may be any available medium that a computing device can store data on, or a data storage device, such as a data center, that contains one or more available media. The available medium may be a magnetic medium (for example, a floppy disk, a hard disk, or a magnetic tape), an optical medium (for example, a DVD), a semiconductor medium (for example, a solid state drive), or the like. The computer-readable storage medium includes instructions that instruct a computing device to perform the foregoing method, performed by the game inference apparatus 300, for configuring an inference service for a game on a cloud platform.
An embodiment of this application further provides a computer program product. The computer program product includes one or more computer instructions. When the computer instructions are loaded and executed on a computing device, the procedures or functions described in the embodiments of this application are generated in whole or in part.
The computer instructions may be stored in a computer-readable storage medium, or transmitted from one computer-readable storage medium to another. For example, the computer instructions may be transmitted from one website, computer, or data center to another website, computer, or data center in a wired manner (for example, over coaxial cable, optical fiber, or a digital subscriber line (DSL)) or a wireless manner (for example, by infrared, radio, or microwave).
The computer program product may be a software installation package. When any of the foregoing methods needs to be used, the computer program product can be downloaded and executed on a computing device.
The descriptions of the procedures or structures corresponding to the foregoing drawings each have their own emphasis. For a part that is not described in detail in one procedure or structure, refer to the related descriptions of other procedures or structures.

Claims (27)

  1. A method for configuring an inference service for a game on a cloud platform, wherein the method comprises:
    obtaining a first configuration file, wherein the first configuration file comprises configuration information for a first game; and
    configuring an inference service for the first game on the cloud platform based on a game algorithm framework of the cloud platform and the first configuration file.
  2. The method according to claim 1, wherein the method further comprises:
    obtaining a second configuration file, wherein the second configuration file comprises configuration information for a second game; and
    configuring an inference service for the second game on the cloud platform based on the game algorithm framework of the cloud platform and the second configuration file.
  3. The method according to claim 1 or 2, wherein the method further comprises:
    responding, by using the inference service for the first game, to an inference request sent by a game terminal, wherein the game terminal comprises a device running a game application instance of the first game, the inference request comprises to-be-processed data of a target object in the game application instance of the first game, and the response comprises indication information for an action and/or a state of the target object.
  4. The method according to any one of claims 1 to 3, wherein the first configuration file comprises one or more of the following configuration information: an action space of a target object in a game application instance of the first game, a state space of the target object in the game application instance of the first game, a first type of a target training algorithm, a second type of an artificial intelligence (AI) model, a reward function, a training mode of the AI model, an inference mode of the AI model, a storage address of the AI model, and a specification of computing resources used for training and inference of the AI model.
  5. The method according to any one of claims 1 to 4, wherein the configuring an inference service for the first game on the cloud platform based on a game algorithm framework of the cloud platform and the first configuration file comprises:
    training at least one AI model based on the first configuration file and the game algorithm framework; and
    configuring the inference service for the first game according to the at least one trained AI model.
  6. The method according to claim 5, wherein the training at least one AI model comprises:
    receiving a plurality of training requests from a game terminal, wherein the plurality of training requests come from a plurality of game application instances of the first game, and different training requests comprise different training data for a same target object in the plurality of game application instances; and
    training the at least one AI model according to the training data in the plurality of training requests.
  7. The method according to claim 5 or 6, wherein when the at least one AI model comprises a first AI model and a second AI model, hyperparameters of the first AI model and the second AI model are different, and/or reward functions corresponding to the first AI model and the second AI model are different.
  8. The method according to any one of claims 5 to 7, wherein when the at least one AI model comprises a first AI model and a second AI model, a first process and a second process run on the cloud platform, and the training the at least one AI model according to the training data in the plurality of training requests comprises:
    sending the training data in the plurality of training requests to the first process and the second process according to a port number and/or an IP address of the first process and a port number and/or an IP address of the second process; and
    training the first AI model by using the first process and the training data received by the first process, and training the second AI model by using the second process and the training data received by the second process.
  9. The method according to any one of claims 5 to 8, wherein the training at least one AI model based on the first configuration file and the game algorithm framework comprises:
    invoking, in the game algorithm framework, a target training algorithm of a first type and the at least one AI model of a second type according to the first type of the target training algorithm and the second type of the AI model in the first configuration file; and
    training the at least one AI model of the second type based on the invoked target training algorithm of the first type.
  10. The method according to any one of claims 1 to 9, wherein when data formats of a game terminal and the cloud platform are different, before responding, by using the inference service for the first game, to an inference request sent by the game terminal, the method further comprises:
    processing the format of data in the inference request sent by the game terminal to obtain data in a data format that the cloud platform can recognize.
  11. The method according to any one of claims 1 to 10, wherein a persistent connection is maintained between the cloud platform and the game terminal, and the cloud platform receives and responds to, through the persistent connection, the inference request sent by the game terminal.
  12. The method according to any one of claims 1 to 11, wherein the obtaining a first configuration file comprises:
    obtaining the first configuration file based on a configuration information item selected by a game developer.
  13. An apparatus for configuring an inference service for a game, wherein the apparatus comprises:
    a communication module, configured to obtain a first configuration file, wherein the first configuration file comprises configuration information for a first game; and
    a configuration module, configured to configure an inference service for the first game on a cloud platform based on a game algorithm framework of the cloud platform and the first configuration file.
  14. The apparatus according to claim 13, wherein:
    the communication module is further configured to obtain a second configuration file, wherein the second configuration file comprises configuration information for a second game; and
    the configuration module is further configured to configure an inference service for the second game on the cloud platform based on the game algorithm framework of the cloud platform and the second configuration file.
  15. The apparatus according to claim 13 or 14, wherein the apparatus further comprises:
    an inference module, configured to respond, by using the inference service for the first game, to an inference request sent by a game terminal, wherein the game terminal comprises a device running a game application instance of the first game, the inference request comprises to-be-processed data of a target object in the game application instance of the first game, and the response comprises indication information for an action and/or a state of the target object.
  16. The apparatus according to any one of claims 13 to 15, wherein the first configuration file comprises one or more of the following configuration information: an action space of a target object in a game application instance of the first game, a state space of the target object in the game application instance of the first game, a first type of a target training algorithm, a second type of an artificial intelligence (AI) model, a reward function, a training mode of the AI model, an inference mode of the AI model, a storage address of the AI model, and a specification of computing resources used for training and inference of the AI model.
  17. The apparatus according to any one of claims 13 to 16, wherein the configuration module is specifically configured to:
    train at least one AI model based on the first configuration file and the game algorithm framework; and
    configure the inference service for the first game according to the at least one trained AI model.
  18. The apparatus according to claim 17, wherein the configuration module is specifically configured to:
    receive a plurality of training requests from a game terminal, wherein the plurality of training requests come from a plurality of game application instances of the first game, and different training requests comprise different training data for a same target object in the plurality of game application instances; and
    train the at least one AI model according to the training data in the plurality of training requests.
  19. The apparatus according to claim 17 or 18, wherein when the at least one AI model comprises a first AI model and a second AI model, hyperparameters of the first AI model and the second AI model are different, and/or reward functions corresponding to the first AI model and the second AI model are different.
  20. The apparatus according to any one of claims 17 to 19, wherein when the at least one AI model comprises a first AI model and a second AI model, a first process and a second process run on the cloud platform, and the configuration module is specifically configured to:
    send the training data in the plurality of training requests to the first process and the second process according to a port number and/or an IP address of the first process and a port number and/or an IP address of the second process; and
    train the first AI model by using the first process and the training data received by the first process, and train the second AI model by using the second process and the training data received by the second process.
  21. The apparatus according to any one of claims 17 to 20, wherein the configuration module is specifically configured to:
    invoke, in the game algorithm framework, a target training algorithm of a first type and the at least one AI model of a second type according to the first type of the target training algorithm and the second type of the AI model in the first configuration file; and
    train the at least one AI model of the second type based on the invoked target training algorithm of the first type.
  22. The apparatus according to any one of claims 13 to 21, wherein when data formats of a game terminal and the cloud platform are different, before responding, by using the inference service for the first game, to an inference request sent by the game terminal, the method further comprises:
    processing the format of data in the inference request sent by the game terminal to obtain data in a data format that the cloud platform can recognize.
  23. The apparatus according to any one of claims 13 to 22, wherein a persistent connection is maintained between the cloud platform and a game terminal, and the cloud platform receives and responds to, through the persistent connection, the inference request sent by the game terminal.
  24. The apparatus according to any one of claims 13 to 23, wherein the communication module is specifically configured to obtain the first configuration file based on a configuration information item selected by a game developer.
  25. A computing device cluster, wherein the computing device cluster comprises at least one computing device, and each computing device comprises a processor and a memory;
    the processor is configured to execute instructions stored in the memory, so that the at least one computing device performs the method according to any one of claims 1 to 12.
  26. A computer-readable storage medium, comprising instructions that, when run on a computing device, cause the computing device to perform the method according to any one of claims 1 to 12.
  27. A computer program product comprising instructions that, when run on a computing device, cause the computing device to perform the method according to any one of claims 1 to 12.
PCT/CN2022/072425 2021-04-09 2022-01-17 Method and apparatus for configuring game inference service on cloud platform, and related device WO2022213702A1 (en)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
CN202110379975.X 2021-04-09
CN202110379975 2021-04-09
CN202110742189.1 2021-06-30
CN202110742189.1A CN115193053A (en) 2021-04-09 2021-06-30 Method and device for configuring inference service of game on cloud platform and related equipment

Publications (1)

Publication Number Publication Date
WO2022213702A1 true WO2022213702A1 (en) 2022-10-13

Family

ID=83545057

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/072425 WO2022213702A1 (en) 2021-04-09 2022-01-17 Method and apparatus for configuring game inference service on cloud platform, and related device

Country Status (1)

Country Link
WO (1) WO2022213702A1 (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090149246A1 (en) * 2007-12-05 2009-06-11 Verizon Laboratories, Inc. Method and apparatus for providing customized games
CN108701265A (en) * 2016-03-14 2018-10-23 欧姆龙株式会社 Learning Service provides device
US10272341B1 (en) * 2016-12-20 2019-04-30 Amazon Technologies, Inc. Procedural level generation for games
US20190197402A1 (en) * 2017-04-19 2019-06-27 AIBrain Corporation Adding deep learning based ai control

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22783773

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 22783773

Country of ref document: EP

Kind code of ref document: A1