CN113648658A

CN113648658A - Game data processing method, system, device, computer equipment and storage medium

Info

Publication number: CN113648658A
Application number: CN202110837834.8A
Authority: CN
Inventors: 刘舟; 杨帆; 黎广璘
Original assignee: Guangzhou Sanqi Mutual Entertainment Technology Co ltd
Current assignee: Guangzhou Sanqi Mutual Entertainment Technology Co ltd
Priority date: 2021-07-23
Filing date: 2021-07-23
Publication date: 2021-11-16
Anticipated expiration: 2041-07-23
Also published as: CN113648658B

Abstract

The application relates to a game data processing method, a game data processing device, a computer device and a storage medium. The method comprises: obtaining game environment data corresponding to a current game from a first server, wherein the first server is used for collecting the game environment data of a plurality of games and the current game is any one of the plurality of games; converting the game environment data to obtain game training interaction data; and sending the game training interaction data to a second server, so that the second server trains a neural network model according to the game training interaction data to obtain a current game action and a current game reward value output by the neural network model, stops training when the current game reward value reaches a preset game reward threshold value, outputs the current game action, and generates a game model file corresponding to the current game. With this method, different games can access the same AI training scheme, AI for different game scenes does not need to be written separately, the development cost is reduced, and the development efficiency is improved.

Description

Game data processing method, system, device, computer equipment and storage medium

Technical Field

The present application relates to the field of computer technologies, and in particular, to a game data processing method and apparatus, a computer device, and a storage medium.

Background

As games develop, a game needs a large amount of AI (Artificial Intelligence) to make its world more realistic and varied. Traditional AI relies on developers manually writing behavior logic. However, different game scenes require different AI to be written and trained, and manually written behavior logic easily leads to high development cost and low development efficiency.

Disclosure of Invention

Therefore, in view of the above technical problems, it is necessary to provide a game data processing method, system, device, computer device, and storage medium that can perform standardized format conversion on game environment data corresponding to different games and train the same neural network model with the converted data, so that different games can access the same AI training scheme and AI for different game scenes does not need to be written separately, thereby reducing development cost and improving development efficiency.

A game data processing method, the method comprising:

acquiring game environment data corresponding to a current game from a first server, wherein the first server is used for acquiring the game environment data corresponding to a plurality of games, and the current game is any one of the plurality of games;

carrying out data type conversion on the game environment data to obtain game training interactive data in a standard data format;

and sending the game training interaction data to a second server so that the second server trains the neural network model according to the game training interaction data to obtain the current game action and the current game reward value which are output by the neural network model and correspond to the current game, stopping the training of the neural network model when the current game reward value reaches a preset game reward threshold value, outputting the current game action, and generating a game model file corresponding to the current game.

In one embodiment, the game data processing method further includes: receiving a game environment data acquisition instruction sent by the second server, wherein the game environment data acquisition instruction is generated when the current game reward value does not reach the preset game reward threshold value; acquiring, according to the game environment data acquisition instruction, game environment data corresponding to the next game scene of the current game from the first server; and returning to the step of performing data type conversion on the game environment data to obtain game training interaction data in a standard data format, until the current game reward value output by the neural network model in the second server reaches the preset game reward threshold value, at which point the training of the neural network model is stopped.

In one embodiment, before obtaining the game environment data corresponding to the current game from the first server, the method further includes: when a connection with the first server is successfully established through a preset communication protocol, receiving initialization parameters sent by the first server; sending the initialization parameters to the second server based on the preset communication protocol, so that the second server configures the neural network model with the corresponding initial training interaction parameters according to the initialization parameters to obtain an initialized neural network model; and receiving a current game resetting instruction sent by the second server and sending the current game resetting instruction to the first server, so that the first server resets the game scene of the current game according to the current game resetting instruction.

In one embodiment, the data type conversion of the game environment data to obtain the game training interaction data in the standard data format includes: extracting game environment key parameters from the game environment data, obtaining a corresponding game environment data matrix according to the game environment parameters and preset matrix dimension information, and generating game training interactive data according to the game environment key parameters and the game environment data matrix.

In one embodiment, generating game training interaction data according to the game environment key parameters and the game environment data matrix comprises: determining a corresponding first information type according to the game environment key parameters; determining a corresponding second information type according to the game environment data matrix; passing the game environment key parameters into a standard parameter whose data type, corresponding to the first information type, is a struct, to obtain first game training interaction data; passing the game environment data matrix into a standard parameter whose data type, corresponding to the second information type, is a struct, to obtain second game training interaction data; and obtaining the game training interaction data according to the first game training interaction data and the second game training interaction data.

In one embodiment, the game data processing method further includes: obtaining target game environment data corresponding to a target game from the first server; performing standardized data format conversion on the target game environment data to obtain corresponding target game interaction data; sending the target game interaction data to the second server, so that the second server calls a matched target game model file according to the target game interaction data and predicts a target game action based on the target game interaction data through a target neural network model corresponding to the target game model file; and sending the target game action returned by the second server to the first server, so that the first server controls a target object in the target game to execute the next action according to the target game action.

A game data processing system, the system comprising:

the game data processing device is used for sending a current game environment data acquisition request to the first server, wherein the current game environment data acquisition request carries a current game identifier;

the first server is used for acquiring game environment data corresponding to the current game identifier from the game environment data corresponding to the multiple games according to the current game environment data acquisition request and returning the game environment data to the game data processing device;

the game data processing device is used for receiving the game environment data, performing data type conversion on the game environment data to obtain game training interaction data in a standard data format, and sending the game training interaction data to the second server;

and the second server is used for receiving the game training interaction data, training the neural network model according to the game training interaction data to obtain the current game action and the current game reward value which are output by the neural network model and correspond to the current game, stopping the training of the neural network model when the current game reward value reaches a preset game reward threshold value, outputting the current game action and generating a game model file corresponding to the current game.

A game data processing apparatus, the apparatus comprising:

the acquisition module is used for acquiring game environment data corresponding to a current game from a first server, wherein the first server is used for collecting the game environment data corresponding to a plurality of games, and the current game is any one of the plurality of games;

the conversion module is used for carrying out data type conversion on the game environment data to obtain game training interactive data in a standard data format;

and the sending module is used for sending the game training interaction data to the second server so that the second server trains the neural network model according to the game training interaction data to obtain the current game action and the current game reward value which are output by the neural network model and correspond to the current game, when the current game reward value reaches a preset game reward threshold value, the training of the neural network model is stopped, the current game action is output, and a game model file corresponding to the current game is generated.

A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the following steps when executing the computer program:

A computer-readable storage medium, on which a computer program is stored which, when executed by a processor, carries out the steps of:

converting the data type of the game environment data to obtain game training interactive data in a standard data format;

According to the game data processing method, system, device, computer device and storage medium described above, game environment data corresponding to the current game is acquired from the first server, wherein the first server is used for collecting the game environment data corresponding to a plurality of games and the current game is any one of the plurality of games. Data type conversion is performed on the game environment data to obtain game training interaction data in a standard data format, and the game training interaction data is sent to the second server, so that the second server trains the neural network model according to the game training interaction data to obtain the current game action and the current game reward value corresponding to the current game output by the neural network model. When the current game reward value reaches a preset game reward threshold value, the training of the neural network model is stopped, the current game action is output, and a game model file corresponding to the current game is generated.

Therefore, the game environment data corresponding to different games can be subjected to standardized format conversion, and the same neural network model is trained with the converted data, so that different games can access the same AI training scheme, AI for different game scenes does not need to be written separately, the development cost is reduced, and the development efficiency is improved.

Drawings

FIG. 1 is a diagram of an application environment of a game data processing method according to an embodiment;

FIG. 2 is a flow diagram illustrating a method of processing game data according to one embodiment;

FIG. 3 is a flow diagram illustrating a method of processing game data according to one embodiment;

FIG. 4 is a flowchart illustrating a game data processing method according to another embodiment;

FIG. 5 is a flowchart illustrating a game environment data conversion step according to an embodiment;

FIG. 6 is a schematic flow chart diagram illustrating the game training interaction data generation step in one embodiment;

FIG. 7 is a flow diagram illustrating a method of processing game data according to one embodiment;

FIG. 8 is a block diagram of a game data processing system in one embodiment;

FIG. 9 is a block diagram showing the construction of a game data processing device according to an embodiment;

FIG. 10 is a diagram showing an internal structure of a computer device in one embodiment;

FIG. 11 is a diagram illustrating an internal structure of a computer device in one embodiment.

Detailed Description

In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.

The game data processing method provided by the application can be applied to the application environment shown in FIG. 1, in which the game data processing device 102 communicates with the first server 104 via a network and with the second server 106 via a network. The game data processing device 102 may be, but is not limited to, a personal computer, a notebook computer, a smart phone, a tablet computer, or a portable wearable device, and may also be an independent server or a server cluster composed of a plurality of servers. The first server 104 and the second server 106 may each be implemented by an independent server or a server cluster composed of a plurality of servers. The game data processing device 102 may also be embedded as an SDK (Software Development Kit) in the first server or the second server.

Specifically, the game data processing device 102 sends a current game environment data acquisition request to the first server, where the current game environment data acquisition request carries a current game identifier, and the first server 104 acquires, according to the current game environment data acquisition request, game environment data corresponding to the current game identifier from game environment data corresponding to a plurality of games, and returns the game environment data to the game data processing device 102. Further, the game data processing device 102 receives the game environment data, performs data type conversion on the game environment data to obtain game training interaction data in a standard data format, sends the game training interaction data to the second server 106, the second server 106 receives the game training interaction data, trains the neural network model according to the game training interaction data to obtain a current game action and a current game reward value corresponding to a current game output by the neural network model, stops training the neural network model when the current game reward value reaches a preset game reward threshold value, outputs the current game action, and generates a game model file corresponding to the current game.

In one embodiment, as shown in FIG. 2, a game data processing method is provided. The method is described as applied to the game data processing device in FIG. 1 and comprises the following steps:

Step 202, obtaining game environment data corresponding to a current game from a first server, where the first server is used to collect the game environment data corresponding to a plurality of games, and the current game is any one of the plurality of games.

The first server may be a service-related server, such as a game server, and is configured to collect game environment data corresponding to a plurality of games. The current game is any one of the plurality of games; it may be the game currently being processed, or may be determined according to actual service requirements, product requirements, or application scenarios.

The game environment data is environment data related to the current game, and may be, but is not limited to, scene parameters of the current game in the running process, control object data corresponding to the current game, and the like.

Specifically, the current game environment data acquisition request is sent to the first server, the current game environment data acquisition request carries the current game identifier, and the first server searches for the game environment data corresponding to the current game identifier according to the current game environment data acquisition request.
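The lookup in step 202 can be sketched as follows. This is only an illustration under stated assumptions: the `FirstServer` class, the `game_id` request field, and the sample environment dictionaries are hypothetical names introduced here, and the network transport is replaced by a direct method call.

```python
from typing import Any, Dict


class FirstServer:
    """Hypothetical stand-in for the first (game) server."""

    def __init__(self, env_by_game: Dict[str, Dict[str, Any]]) -> None:
        # Environment data collected for several games, keyed by game identifier.
        self._env_by_game = env_by_game

    def handle_request(self, request: Dict[str, str]) -> Dict[str, Any]:
        # Look up the environment data matching the identifier in the request.
        game_id = request["game_id"]
        if game_id not in self._env_by_game:
            raise KeyError(f"no environment data for game {game_id!r}")
        return self._env_by_game[game_id]


def acquire_environment_data(server: FirstServer, game_id: str) -> Dict[str, Any]:
    # The acquisition request carries the current game identifier.
    request = {"game_id": game_id}
    return server.handle_request(request)


server = FirstServer({
    "balance_ball": {"position": [0.0, 1.0, 0.0], "direction": [0.0, 0.0, 1.0]},
    "ping_pong": {"ball": [0.5, 0.5], "ball_speed": [1.0, -1.0]},
})
env_data = acquire_environment_data(server, "balance_ball")
```

Keying the store by game identifier is what lets one first server serve several games through the same request path.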

Step 204, performing data type conversion on the game environment data to obtain game training interaction data in a standard data format.

The game training interaction data is training interaction data in a standard data format and is used for training the neural network model. Game environment data corresponding to different games can each be converted into game training interaction data in the standard data format, so that different games can access the training scheme of the same neural network model.

The conversion of the game environment data can be data format conversion of the game environment data, and the same set of standard data format conversion rules can be used for converting the game environment data of different games to obtain game training interactive data in a standard data format.

Step 206, sending the game training interaction data to the second server so that the second server trains the neural network model according to the game training interaction data to obtain the current game action and the current game reward value corresponding to the current game output by the neural network model, stopping training the neural network model when the current game reward value reaches a preset game reward threshold value, outputting the current game action, and generating a game model file corresponding to the current game.

Here, the second server is the server where the neural network model is located, and may be an AI server. Specifically, the game training interaction data is sent to the second server through a network. After receiving the game training interaction data, the second server calls the neural network model, uses the game training interaction data as input parameters of the neural network model, and obtains a current game action and a current game reward value corresponding to the current game through calculation by the neural network model. The current game action is the action that the current control object in the current game is controlled to execute. The current game reward value is the current incentive signal of the current game, and whether the neural network model achieves the training purpose can be determined according to the current game reward value.

The specific step of determining whether the neural network model achieves the training purpose through the current game reward value may be to obtain a preset game reward threshold value, and determine whether the neural network model achieves the training purpose according to the current game reward value and the preset game reward threshold value. Wherein, the game model file comprises a neural network model trained by the current game.

In another embodiment, whether the neural network model achieves the training purpose may instead be determined from the total number of training iterations of the neural network model: when the total number of training iterations reaches a threshold number, the neural network model is determined to have achieved the training purpose, the training of the neural network model is stopped, the current game action is output, and a game model file corresponding to the current game is stored.
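The two stopping conditions described above (reward threshold, or total number of training iterations) can be sketched as a single check. The `StopCriteria` class and its field names are assumptions made for illustration, and "reaches" is interpreted here as greater than or equal to.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StopCriteria:
    """Hypothetical container for the two stopping conditions."""

    reward_threshold: Optional[float] = None    # preset game reward threshold
    max_training_steps: Optional[int] = None    # threshold number of iterations

    def should_stop(self, current_reward: float, total_steps: int) -> bool:
        # Embodiment 1: the current game reward value reaches the threshold.
        if self.reward_threshold is not None and current_reward >= self.reward_threshold:
            return True
        # Embodiment 2: the total number of training iterations reaches the threshold.
        if self.max_training_steps is not None and total_steps >= self.max_training_steps:
            return True
        return False


by_reward = StopCriteria(reward_threshold=0.9)
by_count = StopCriteria(max_training_steps=100)
```

Leaving a field as `None` disables that condition, so each embodiment can be used on its own.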

The game data processing method includes the steps of obtaining game environment data corresponding to a current game from a first server, wherein the first server is used for collecting the game environment data corresponding to a plurality of games, the current game is any one of the plurality of games, conducting data type conversion on the game environment data to obtain game training interaction data in a standard data format, sending the game training interaction data to a second server to enable the second server to train a neural network model according to the game training interaction data to obtain a current game action and a current game reward value, corresponding to the current game and output by the neural network model, stopping training of the neural network model when the current game reward value reaches a preset game reward threshold value, outputting the current game action, and generating a game model file corresponding to the current game.

In one embodiment, as shown in FIG. 3, the game data processing method further includes:

Step 302, receiving a game environment data acquisition instruction sent by the second server, where the game environment data acquisition instruction is generated when the current game reward value does not reach the preset game reward threshold value.

Step 304, acquiring, according to the game environment data acquisition instruction, game environment data corresponding to the next game scene of the current game from the first server, and returning to the step of performing data type conversion on the game environment data to obtain game training interaction data in a standard data format, until the current game reward value output by the neural network model in the second server reaches the preset game reward threshold value, at which point the training of the neural network model is stopped.

When the current game reward value does not reach the preset game reward threshold value, the neural network model is determined not to have achieved the training purpose and needs to be trained further. The second server therefore generates a game environment data acquisition instruction, which the executing entity, the game data processing device, forwards to the first server.

After receiving the game environment data acquisition instruction, the first server acquires the game environment data corresponding to the next game scene of the current game. The current game can comprise a plurality of game scenes; because the previous game environment data did not enable the neural network model to achieve the training purpose, the game environment data corresponding to the next game scene of the current game needs to be acquired. The process then returns to the step of performing data type conversion on the game environment data to obtain game training interaction data in a standard data format, and so on, until the current game reward value output by the neural network model in the second server reaches the preset game reward threshold value and the training of the neural network model is stopped.
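The scene-by-scene loop of steps 302 and 304 can be sketched as below. The function `train_until_threshold`, the callables `convert` and `train_step`, and the toy reward rule are all hypothetical; a real second server would update a neural network model rather than echo an input value.

```python
from typing import Callable, Iterable, List, Tuple


def train_until_threshold(
    scenes: Iterable[List[float]],
    convert: Callable[[List[float]], List[float]],
    train_step: Callable[[List[float]], Tuple[str, float]],
    reward_threshold: float,
) -> Tuple[str, float, int]:
    """Feed one scene at a time until the reward reaches the threshold.

    Returns the last action, the last reward and the number of scenes used.
    """
    for count, env_data in enumerate(scenes, start=1):
        interaction = convert(env_data)            # standard-format conversion
        action, reward = train_step(interaction)   # one training round on the second server
        if reward >= reward_threshold:
            return action, reward, count           # training purpose achieved, stop
        # Otherwise the loop requests the next game scene.
    raise RuntimeError("all scenes consumed before the reward threshold was reached")


# Toy stand-ins: the reward is simply the first value of each scene.
scenes = [[0.1], [0.5], [0.9]]
result = train_until_threshold(
    scenes,
    convert=lambda env: [float(v) for v in env],
    train_step=lambda data: ("move", data[0]),
    reward_threshold=0.5,
)
```

In this toy run the second scene already meets the threshold, so the third scene is never requested.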

In one embodiment, as shown in FIG. 4, before obtaining the game environment data corresponding to the current game from the first server, the method further includes:

Step 402, receiving initialization parameters sent by a first server when a connection is successfully established with the first server through a predetermined communication protocol.

Step 404, sending the initialization parameters to the second server based on the predetermined communication protocol, so that the second server configures the neural network model with the corresponding initial training interaction parameters according to the initialization parameters and obtains the initialized neural network model.

Step 406, receiving a current game resetting instruction sent by the second server, and sending the current game resetting instruction to the first server, so that the first server resets the game scene of the current game according to the current game resetting instruction.

The predetermined communication protocol is a self-developed general protocol. The executing entity, the game data processing device, establishes connections with the first server and the second server respectively through the predetermined communication protocol, and the game data processing device can be embedded in the first server or the second server in the form of an SDK.

Specifically, after a connection is successfully established with the first server through the predetermined communication protocol, initialization parameters sent by the first server are received, wherein the initialization parameters include a seed value, a communication version, a package version and training capability parameters. The training capability parameters include basic reinforcement learning capability, concatenated PNG observation settings, compressed channel mapping, hybrid actions, training analytics, variable-length observation settings, whether multiple agent groups exist, and the like. The initialization parameters are used for standardizing the game logic server and the training end. The seed value, the communication version and the package version are the basis for training interaction between the logic server and the training end, and are mainly used for verifying the legitimacy of the logic server and the training end. The training capability parameters specify some basic configurations of the training end, so that the training end performs data communication interaction in the specified mode.

Further, after the connection with the second server is successfully established based on the predetermined communication protocol, the initialization parameters are sent to the second server. The second server configures the neural network model with the corresponding initial training interaction parameters according to the initialization parameters to obtain the initialized neural network model, and then sends a current game resetting instruction, wherein the current game resetting instruction is used for resetting the scene of the current game in the first server.

Finally, the executing entity forwards the current game resetting instruction sent by the second server to the first server, and the first server resets the game scene of the current game according to the current game resetting instruction.
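The relay role of the game data processing device in steps 402 to 406 can be sketched as follows. The class names, the `InitParams` fields, the capability keys and the `"RESET"` instruction string are assumptions for illustration only; the actual self-developed protocol is not specified in the text.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class InitParams:
    """Hypothetical shape of the initialization parameters."""

    seed: int
    communication_version: str
    package_version: str
    capabilities: Dict[str, bool] = field(default_factory=dict)


class StubGameServer:
    """Stand-in for the first server."""

    def __init__(self, params: InitParams) -> None:
        self._params = params
        self.reset_count = 0

    def connect(self) -> InitParams:
        # On a successful connection the first server sends its initialization parameters.
        return self._params

    def reset_scene(self) -> None:
        self.reset_count += 1


class StubTrainingServer:
    """Stand-in for the second server."""

    def __init__(self) -> None:
        self.initialized_with: Optional[InitParams] = None

    def initialize(self, params: InitParams) -> str:
        # Configure the model from the parameters, then ask for a scene reset.
        self.initialized_with = params
        return "RESET"


def handshake(game_server: StubGameServer, training_server: StubTrainingServer) -> List[str]:
    log: List[str] = []
    params = game_server.connect()                   # step 402
    log.append("init_received")
    instruction = training_server.initialize(params)  # step 404
    log.append("init_forwarded")
    if instruction == "RESET":                       # step 406
        game_server.reset_scene()
        log.append("reset_forwarded")
    return log


params = InitParams(seed=42, communication_version="1.0", package_version="1.0",
                    capabilities={"hybrid_actions": True})
game_server = StubGameServer(params)
training_server = StubTrainingServer()
events = handshake(game_server, training_server)
```

The device only forwards messages in both directions, which is consistent with embedding it as an SDK on either server.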

In one embodiment, as shown in FIG. 5, the data type conversion of the game environment data to obtain the game training interaction data in the standard data format includes:

Step 502, extracting the game environment key parameters from the game environment data.

Step 504, obtaining a corresponding game environment data matrix according to the game environment key parameters and the preset matrix dimension information.

Step 506, generating game training interactive data according to the game environment key parameters and the game environment data matrix.

The game environment key parameters are extracted from the game environment data, and the game environment key parameters can be observed values, wherein the observed values are key parameters transmitted by the first server through a preset communication protocol.

The preset matrix dimension information is dimension information set in advance. It indicates the matrix dimension of the transmitted data format, which may be a one-dimensional matrix, a two-dimensional matrix or an N-dimensional matrix, and can be set according to actual business requirements, product requirements or actual application scenes. A corresponding game environment data matrix can be obtained according to the game environment key parameters and the preset matrix dimension information. Specifically, the number of game environment key parameters can be obtained, and the corresponding game environment data matrix can be generated according to that number and the preset matrix dimension information.

For example, the game scene is a balance ball. The game environment key parameters to be transmitted are 6 pieces of information, such as the coordinates and direction of the balance ball, the preset matrix dimension information is a one-dimensional matrix, and the finally generated game environment data matrix is a 1×6 matrix.
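The balance-ball example can be reproduced with a small helper. This sketch assumes that the preset matrix dimension information is represented as a row count (a hypothetical simplification), so that the one-dimensional case corresponds to `rows=1`; the function name and the sample values are likewise illustrative.

```python
from typing import List, Sequence


def build_environment_matrix(key_params: Sequence[float], rows: int = 1) -> List[List[float]]:
    """Arrange the key parameters into a rows x cols matrix (list of rows)."""
    if rows <= 0 or len(key_params) % rows != 0:
        raise ValueError("the number of parameters must be divisible by the row count")
    cols = len(key_params) // rows
    # Slice the flat parameter list into consecutive rows of equal length.
    return [[float(v) for v in key_params[r * cols:(r + 1) * cols]] for r in range(rows)]


# Six key parameters of the balance ball (coordinates and direction, sample values).
balance_ball_params = [0.0, 1.0, 0.0, 0.0, 0.0, 1.0]
matrix = build_environment_matrix(balance_ball_params)
shape = (len(matrix), len(matrix[0]))
```

Here `shape` describes a single row of six values, matching the 1×6 matrix of the example; other row counts are only a guess at how higher-dimensional presets might be expressed.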

And finally, generating game training interaction data through the game environment key parameters and the game environment data matrix. Specifically, the game training interaction data may be obtained by transmitting the game environment key parameters and the information types corresponding to the game environment data matrix into corresponding structure standard parameters.

In one embodiment, as shown in FIG. 6, generating game training interaction data from the game environment key parameters and the game environment data matrix comprises:

Step 602, determining a corresponding first information type according to the game environment key parameters.

Step 604, determining a corresponding second information type according to the game environment data matrix.

Step 606, passing the game environment key parameters into a standard parameter whose data type, corresponding to the first information type, is a struct, to obtain first game training interaction data.

Step 608, passing the game environment data matrix into a standard parameter whose data type, corresponding to the second information type, is a struct, to obtain second game training interaction data, and obtaining the game training interaction data according to the first game training interaction data and the second game training interaction data.

The first information type is the information type corresponding to the game environment key parameters, and the second information type is the information type corresponding to the game environment data matrix. Specifically, the first information type is determined from the obtained game environment key parameters, and the second information type is determined from the game environment data matrix.

Further, a standard parameter whose data type, corresponding to the first information type, is a struct is obtained, and the game environment key parameters are passed into this standard parameter to obtain the first game training interaction data.

Similarly, a standard parameter whose data type, corresponding to the second information type, is a struct is obtained, and the game environment data matrix is passed into this standard parameter to obtain the second game training interaction data.

For example, when the game scene is a balance ball, the key parameters to be transmitted are 6 items of information, such as the coordinates and direction of the balance ball. The preset matrix dimension information specifies a one-dimensional matrix, so the resulting game environment data matrix is a 1×6 matrix. Because FloatData (floating-point data) is a collection of Float-type values, the 6 parameters are passed into FloatData, while the 1×6 data matrix is passed into Observation. When the game scene is ping-pong, the key parameters to be transmitted are the coordinates and speed of the ball, the positions of the players, and so on, 8 items in total. The preset matrix dimension information again specifies a one-dimensional matrix, so the resulting game environment data matrix is a 1×8 matrix. The 8 parameters are passed into FloatData, while the 1×8 game environment data matrix is passed into Observation. Here, Observation is a key parameter of the game environment.
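The two examples above can be sketched in Python as follows. This is only an illustrative sketch, not the implementation of the present application: the dataclass layout of `FloatData` and `Observation`, the helper `build_interaction_data`, and all sample values are assumptions. Only the names FloatData and Observation and the 1×6 and 1×8 shapes come from the description.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class FloatData:
    """Struct-like container for the key parameters (a collection of floats)."""
    values: List[float]


@dataclass
class Observation:
    """Struct-like container for the game environment data matrix."""
    shape: Tuple[int, int]
    matrix: List[List[float]]


def build_interaction_data(key_params: List[float]) -> Tuple[FloatData, Observation]:
    """Pack key parameters into FloatData and a 1 x N matrix into Observation.

    The preset matrix dimension is assumed to be one-dimensional (one row),
    as in the balance-ball and ping-pong examples.
    """
    floats = [float(v) for v in key_params]
    first = FloatData(values=floats)  # first game training interaction data
    second = Observation(shape=(1, len(floats)), matrix=[list(floats)])  # second
    return first, second


# Balance ball: 6 illustrative values (coordinates and directions).
ball_fd, ball_obs = build_interaction_data([0.1, 0.2, 0.3, 0.0, 1.0, 0.0])
# Ping-pong: 8 illustrative values (ball coordinates, speed, player positions).
pong_fd, pong_obs = build_interaction_data([0.5, 0.4, 0.0, 1.2, -0.3, 0.0, 0.1, 0.9])

print(ball_obs.shape, pong_obs.shape)
```

Because both games are reduced to the same two containers, the training side only needs to read the shape to size its input, which is the point of the standard data format.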

In one embodiment, as shown in FIG. 7, the game data processing method further includes:

Step 702, obtaining target game environment data corresponding to the target game from the first server.

Step 704, converting the standardized data format of the target game environment data to obtain corresponding target game interaction data.

Step 706, sending the target game interaction data to the second server, so that the second server calls the matched target game model file according to the target game interaction data, and predicting the target game action based on the target game interaction data through the target neural network model corresponding to the target game model file.

Step 708, the target game action returned by the second server is sent to the first server, so that the first server controls the target object in the target game to execute the next action according to the target game action.

After the game model files corresponding to the respective games are obtained, the game action can be predicted by the game model files. The target game is a game needing game action prediction, and can be determined according to actual business requirements, product requirements or actual application scenes.

Specifically, the target game environment data corresponding to the target game may be acquired by sending a request to the first server, which then returns the target game environment data corresponding to the target game according to the request. Alternatively, the target game environment data corresponding to the target game may be acquired from the first server automatically at a preset time point.

Further, the target game environment data is converted into a standardized data format, that is, the target game environment data is unified, to obtain the corresponding target game interaction data, which is then sent to the second server.

On receiving the target game interaction data, the second server first calls the matching target game model file (different games correspond to different game model files) and then uses the target neural network model corresponding to that file to predict the target game action from the target game interaction data. The target game action is forwarded to the first server by the execution subject.

Finally, after receiving the target game action, the first server controls the target object in the target game to execute the next action according to the target game action.
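Steps 702 to 708 can be sketched as a single relay function. The sketch below is hypothetical: the classes `FirstServer` and `SecondServer`, their methods, the game identifiers, the action names, and the lambda "models" are stand-ins chosen for illustration, and a real second server would load a trained neural network from the game model file instead.

```python
from typing import Callable, Dict, List, Tuple

Policy = Callable[[List[float]], str]


class FirstServer:
    """Stand-in for the game-side server (hypothetical interface)."""

    def __init__(self) -> None:
        self._states: Dict[str, List[float]] = {
            "balance_ball": [0.3, -0.1, 0.0, 0.0, 1.0, 0.0],
            "ping_pong": [0.5, 0.2, 0.0, 1.2, -0.3, 0.0, 0.1, 0.9],
        }
        self.executed: List[Tuple[str, str]] = []

    def get_environment(self, game_id: str) -> dict:
        return {"game_id": game_id, "state": self._states[game_id]}

    def apply_action(self, game_id: str, action: str) -> None:
        # The target object in the game would execute the action here.
        self.executed.append((game_id, action))


class SecondServer:
    """Stand-in for the AI-side server; one 'model file' per game."""

    def __init__(self) -> None:
        self._models: Dict[str, Policy] = {
            "balance_ball": lambda obs: "tilt_left" if obs[0] > 0 else "tilt_right",
            "ping_pong": lambda obs: "move_up" if obs[1] > obs[7] else "move_down",
        }

    def predict(self, game_id: str, observation: List[float]) -> str:
        policy = self._models[game_id]  # call the matching model file
        return policy(observation)


def predict_step(game_id: str, first: FirstServer, second: SecondServer) -> str:
    raw = first.get_environment(game_id)            # step 702
    observation = [float(v) for v in raw["state"]]  # step 704
    action = second.predict(game_id, observation)   # step 706
    first.apply_action(game_id, action)             # step 708
    return action


first_server = FirstServer()
second_server = SecondServer()
ball_action = predict_step("balance_ball", first_server, second_server)
pong_action = predict_step("ping_pong", first_server, second_server)
```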

In a specific embodiment, a game data processing method is provided, which specifically includes the following steps:

1. Receiving the initialization parameters sent by the first server when a connection with the first server is successfully established through a preset communication protocol.

2. Sending the initialization parameters to the second server based on the preset communication protocol, so that the second server sets the corresponding initial training interaction parameters of the neural network model according to the initialization parameters to obtain an initialized neural network model.

3. Receiving a current game resetting instruction sent by the second server, and sending the current game resetting instruction to the first server so that the first server resets the game scene of the current game according to the current game resetting instruction.

4. Obtaining game environment data corresponding to the current game from the first server, wherein the first server is used for collecting the game environment data corresponding to a plurality of games, and the current game is any one of the plurality of games.

5. Performing data type conversion on the game environment data to obtain game training interaction data in a standard data format.

5-1. Extracting the game environment key parameters from the game environment data.

5-2. Obtaining a corresponding game environment data matrix according to the game environment key parameters and the preset matrix dimension information.

5-3. Generating the game training interaction data according to the game environment key parameters and the game environment data matrix.

5-3-1. Determining a corresponding first information type according to the game environment key parameters.

5-3-2. Determining a corresponding second information type according to the game environment data matrix.

5-3-3. Passing the game environment key parameters into the standard parameter whose data type is a struct and which corresponds to the first information type, to obtain first game training interaction data.

5-3-4. Passing the game environment data matrix into the standard parameter whose data type is a struct and which corresponds to the second information type, to obtain second game training interaction data, and obtaining the game training interaction data from the first game training interaction data and the second game training interaction data.

6. Sending the game training interaction data to the second server, so that the second server trains the neural network model according to the game training interaction data to obtain the current game action and the current game reward value that are output by the neural network model and correspond to the current game, stops the training of the neural network model when the current game reward value reaches a preset game reward threshold, outputs the current game action, and generates a game model file corresponding to the current game.

7. Receiving a game environment data acquisition instruction sent by the second server, wherein the game environment data acquisition instruction is triggered when the current game reward value has not reached the preset game reward threshold.

8. Obtaining game environment data corresponding to the next game scene of the current game from the first server according to the game environment data acquisition instruction, and returning to the step of performing data type conversion on the game environment data to obtain game training interaction data in a standard data format, until the current game reward value output by the neural network model in the second server reaches the preset game reward threshold and the training of the neural network model stops.

9. Acquiring target game environment data corresponding to the target game from the first server.

10. Performing standardized data format conversion on the target game environment data to obtain corresponding target game interaction data.

11. Sending the target game interaction data to the second server, so that the second server calls the matching target game model file according to the target game interaction data and predicts the target game action based on the target game interaction data through the target neural network model corresponding to the target game model file.

12. Sending the target game action returned by the second server to the first server, so that the first server controls the target object in the target game to execute the next action according to the target game action.
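The training portion of the flow above (steps 1 to 8) amounts to a relay loop that runs until the reward threshold is reached. The following is a minimal sketch under stated assumptions: `StubGameServer`, `StubTrainingServer`, `run_training`, the fixed reward increment of 0.25, and the `.model` file name are all hypothetical placeholders, and no real neural network is trained.

```python
from typing import Dict, List, Tuple


class StubGameServer:
    """Stand-in for the first server (hypothetical interface)."""

    def __init__(self) -> None:
        self.step = 0

    def init_params(self) -> Dict[str, int]:
        return {"observation_size": 6, "action_size": 2}

    def reset(self) -> None:
        self.step = 0

    def next_environment(self) -> Dict[str, List[float]]:
        self.step += 1
        return {"state": [float(self.step)] * 6}


class StubTrainingServer:
    """Stand-in for the second server; the reward rises by a fixed amount."""

    def __init__(self, reward_threshold: float) -> None:
        self.reward_threshold = reward_threshold
        self.reward = 0.0
        self.params: Dict[str, int] = {}

    def initialize(self, params: Dict[str, int]) -> None:
        self.params = dict(params)

    def train(self, observation: List[float]) -> Tuple[int, float]:
        self.reward += 0.25  # placeholder for a real training update
        action = int(sum(observation)) % self.params["action_size"]
        return action, self.reward

    def export_model(self, game_id: str) -> str:
        return f"{game_id}.model"  # placeholder for writing a model file


def run_training(game_id: str, game: StubGameServer,
                 trainer: StubTrainingServer, max_steps: int = 1000):
    trainer.initialize(game.init_params())  # steps 1-2: relay init parameters
    game.reset()                            # step 3: relayed reset instruction
    for _ in range(max_steps):
        raw = game.next_environment()                    # steps 4 and 8
        observation = [float(v) for v in raw["state"]]   # step 5: conversion
        action, reward = trainer.train(observation)      # step 6
        if reward >= trainer.reward_threshold:
            return trainer.export_model(game_id), action, reward
        # step 7: threshold not reached, so fetch the next game scene
    raise RuntimeError("reward threshold not reached within max_steps")


game_server = StubGameServer()
model_file, last_action, last_reward = run_training(
    "balance_ball", game_server, StubTrainingServer(reward_threshold=1.0)
)
```

The `max_steps` guard is an addition of this sketch, included so the loop cannot run forever if the threshold is never reached.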

It should be understood that, although the steps in the above flowcharts are shown sequentially as indicated by the arrows, they are not necessarily performed in that order. Unless explicitly stated herein, the steps are not strictly limited to the order shown and may be performed in other orders. Moreover, at least some of the steps in the above flowcharts may include multiple sub-steps or stages, which are not necessarily performed at the same time but may be performed at different times; nor are these sub-steps or stages necessarily performed in sequence, as they may be performed in turn or alternately with other steps or with at least some of the sub-steps or stages of other steps.

In one embodiment, as shown in FIG. 8, there is provided a game data processing system comprising:

The game data processing device 802 is configured to send a current game environment data acquisition request to the first server, where the current game environment data acquisition request carries a current game identifier.

The first server 804 is configured to obtain, according to the current game environment data acquisition request, game environment data corresponding to the current game identifier from the game environment data corresponding to the multiple games, and return the game environment data to the game data processing device.

The game data processing device 802 is further configured to receive the game environment data, perform data type conversion on the game environment data to obtain game training interaction data in a standard data format, and send the game training interaction data to the second server.

The second server 806 is configured to receive the game training interaction data, train the neural network model according to the game training interaction data, obtain a current game action and a current game reward value that are output by the neural network model and correspond to the current game, stop training the neural network model when the current game reward value reaches a preset game reward threshold, output the current game action, and generate a game model file corresponding to the current game.
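The role of the current game identifier in the acquisition request can be illustrated with a short sketch. The request type `EnvDataRequest`, the class `MultiGameFirstServer`, and the sample data are hypothetical; the description only states that the request carries a game identifier and that the first server selects the matching environment data from those of multiple games.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class EnvDataRequest:
    """Acquisition request carrying the current game identifier."""
    game_id: str


class MultiGameFirstServer:
    """Holds environment data for several games and answers by identifier."""

    def __init__(self, env_by_game: Dict[str, List[float]]) -> None:
        self._env_by_game = env_by_game

    def handle(self, request: EnvDataRequest) -> List[float]:
        if request.game_id not in self._env_by_game:
            raise KeyError(f"unknown game identifier: {request.game_id}")
        # Return a copy so callers cannot mutate the server's stored data.
        return list(self._env_by_game[request.game_id])


env_server = MultiGameFirstServer({
    "balance_ball": [0.1, 0.2, 0.3, 0.0, 1.0, 0.0],
    "ping_pong": [0.5, 0.4, 0.0, 1.2, -0.3, 0.0, 0.1, 0.9],
})
ball_env = env_server.handle(EnvDataRequest(game_id="balance_ball"))
```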

In one embodiment, as shown in FIG. 9, there is provided a game data processing apparatus 900 including: an obtaining module 902, a conversion module 904, and a sending module 906, wherein:

The obtaining module 902 is configured to obtain game environment data corresponding to a current game from a first server, where the first server is configured to collect game environment data corresponding to a plurality of games, and the current game is any one of the plurality of games.

The conversion module 904 is configured to perform data type conversion on the game environment data to obtain game training interaction data in a standard data format.

The sending module 906 is configured to send the game training interaction data to the second server, so that the second server trains the neural network model according to the game training interaction data to obtain a current game action and a current game reward value that are output by the neural network model and correspond to the current game, stops the training of the neural network model when the current game reward value reaches a preset game reward threshold, outputs the current game action, and generates a game model file corresponding to the current game.

In one embodiment, the game data processing apparatus 900 receives a game environment data acquisition instruction sent by the second server, where the instruction is triggered when the current game reward value has not reached the preset game reward threshold. According to the instruction, the apparatus acquires from the first server the game environment data corresponding to the next game scene of the current game, and the conversion module 904 again performs data type conversion on the game environment data to obtain game training interaction data in the standard data format. This repeats until the current game reward value output by the neural network model in the second server reaches the preset game reward threshold, at which point the training of the neural network model stops.

In one embodiment, when the game data processing apparatus 900 successfully establishes a connection with the first server through a preset communication protocol, it receives the initialization parameters sent by the first server and sends them to the second server based on the preset communication protocol, so that the second server sets the corresponding initial training interaction parameters of the neural network model according to the initialization parameters to obtain an initialized neural network model. The apparatus then receives a current game resetting instruction sent by the second server and sends it to the first server, so that the first server resets the game scene of the current game according to the current game resetting instruction.

In one embodiment, the conversion module 904 extracts the game environment key parameters from the game environment data, obtains the corresponding game environment data matrix according to the game environment parameters and the preset matrix dimension information, and generates the game training interaction data according to the game environment key parameters and the game environment data matrix.

In one embodiment, the conversion module 904 determines a corresponding first information type according to the game environment key parameter, determines a corresponding second information type according to the game environment data matrix, transmits the game environment key parameter to a standard parameter with a data type corresponding to the first information type as a structural body to obtain first game training interactive data, transmits the game environment data matrix to a standard parameter with a data type corresponding to the second information type as a structural body to obtain second game training interactive data, and obtains the game training interactive data according to the first game training interactive data and the second game training interactive data.

In one embodiment, the game data processing device 900 obtains target game environment data corresponding to a target game from a first server, performs standardized data format conversion on the target game environment data to obtain corresponding target game interaction data, sends the target game interaction data to a second server, so that the second server calls a matched target game model file according to the target game interaction data, predicts a target game action based on the target game interaction data through a target neural network model corresponding to the target game model file, and sends the target game action returned by the second server to the first server, so that the first server controls a target object in the target game to execute a next action according to the target game action.

For the specific definition of the game data processing device, reference may be made to the above definition of the game data processing method, which is not repeated here. The modules in the above game data processing device may be implemented wholly or partially in software, in hardware, or in a combination thereof. The modules may be embedded in, or independent of, a processor in the computer device in hardware form, or may be stored in a memory of the computer device in software form, so that the processor can call them and execute the operations corresponding to each module.
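When the modules are implemented in software, the division into obtaining, conversion, and sending modules might look like the sketch below. All class names, method names, and the callables injected for the two servers are assumptions made for illustration; the description does not prescribe any particular code structure.

```python
from typing import Callable, Dict, List


class ObtainingModule:
    """Fetches game environment data from the first server."""

    def __init__(self, fetch: Callable[[str], Dict[str, List[float]]]) -> None:
        self._fetch = fetch

    def obtain(self, game_id: str) -> Dict[str, List[float]]:
        return self._fetch(game_id)


class ConversionModule:
    """Converts raw environment data into a flat list of floats."""

    def convert(self, env_data: Dict[str, List[float]]) -> List[float]:
        return [float(v) for v in env_data["state"]]


class SendingModule:
    """Sends the converted data to the second server."""

    def __init__(self, send: Callable[[List[float]], str]) -> None:
        self._send = send

    def send(self, interaction_data: List[float]) -> str:
        return self._send(interaction_data)


class GameDataProcessingApparatus:
    """Chains the three modules in the order described above."""

    def __init__(self, obtaining: ObtainingModule,
                 conversion: ConversionModule, sending: SendingModule) -> None:
        self.obtaining = obtaining
        self.conversion = conversion
        self.sending = sending

    def process(self, game_id: str) -> str:
        env_data = self.obtaining.obtain(game_id)
        interaction_data = self.conversion.convert(env_data)
        return self.sending.send(interaction_data)


# Hypothetical stand-ins for the two servers.
sent_payloads: List[List[float]] = []


def fake_fetch(game_id: str) -> Dict[str, List[float]]:
    return {"state": [1, 2, 3]}


def fake_send(data: List[float]) -> str:
    sent_payloads.append(data)
    return "ack"


apparatus = GameDataProcessingApparatus(
    ObtainingModule(fake_fetch), ConversionModule(), SendingModule(fake_send)
)
result = apparatus.process("balance_ball")
```

Injecting the server calls as plain callables keeps each module testable in isolation, which is one reasonable way to realize the "wholly or partially in software" option.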

In one embodiment, a computer device is provided, which may be a server, and its internal structure diagram may be as shown in fig. 10. The computer device includes a processor, a memory, a network interface, and a database connected by a system bus. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device comprises a nonvolatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, a computer program, and a database. The internal memory provides an environment for the operation of an operating system and computer programs in the non-volatile storage medium. The database of the computer device is used to store game training interaction data. The network interface of the computer device is used for communicating with an external terminal through a network connection. The computer program is executed by a processor to implement a game data processing method.

In one embodiment, a computer device is provided, which may be a terminal, and its internal structure diagram may be as shown in fig. 11. The computer device includes a processor, a memory, a network interface, a display screen, and an input device connected by a system bus. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device comprises a nonvolatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program. The internal memory provides an environment for the operation of an operating system and computer programs in the non-volatile storage medium. The network interface of the computer device is used for communicating with an external terminal through a network connection. The computer program is executed by a processor to implement a game data processing method. The display screen of the computer equipment can be a liquid crystal display screen or an electronic ink display screen, and the input device of the computer equipment can be a touch layer covered on the display screen, a key, a track ball or a touch pad arranged on the shell of the computer equipment, an external keyboard, a touch pad or a mouse and the like.

Those skilled in the art will appreciate that the configurations shown in FIG. 10 or FIG. 11 are merely block diagrams of some configurations relevant to the present disclosure and do not constitute a limitation on the computing devices to which the present disclosure may be applied; a particular computing device may include more or fewer components than those shown, may combine certain components, or may have a different arrangement of components.

In one embodiment, a computer device is provided, comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, the processor implementing the following steps when executing the computer program: acquiring game environment data corresponding to a current game from a first server, wherein the first server is used for acquiring the game environment data corresponding to a plurality of games, and the current game is any one of the plurality of games; carrying out data type conversion on the game environment data to obtain game training interactive data in a standard data format; and sending the game training interaction data to a second server so that the second server trains the neural network model according to the game training interaction data to obtain the current game action and the current game reward value which are output by the neural network model and correspond to the current game, stopping the training of the neural network model when the current game reward value reaches a preset game reward threshold value, outputting the current game action, and generating a game model file corresponding to the current game.

In one embodiment, the processor, when executing the computer program, further performs the steps of: receiving a game environment data acquisition instruction sent by a second server, wherein the game environment data acquisition instruction is generated by triggering when the current game reward value does not reach a preset game reward threshold value, acquiring game environment data corresponding to the next game scene of the current game from the first server according to the game environment data acquisition instruction, returning to the execution step to perform data type conversion on the game environment data to obtain game training interactive data in a standard data format, and stopping the training of the neural network model until the current game reward value output by the neural network model in the second server reaches the preset game reward threshold value.

In one embodiment, the processor, when executing the computer program, further performs the steps of: when the connection with the first server is successfully established through the preset communication protocol, the initialization parameters sent by the first server are received, the initialization parameters are sent to the second server based on the preset communication protocol, so that the second server sets the neural network model as corresponding initial training interaction parameters according to the initialization parameters to obtain the initialized neural network model, a current game resetting instruction sent by the second server is received, and the current game resetting instruction is sent to the first server, so that the first server resets the game scene of the current game according to the current game resetting instruction.

In one embodiment, the processor, when executing the computer program, further performs the steps of: extracting game environment key parameters from the game environment data, obtaining a corresponding game environment data matrix according to the game environment parameters and preset matrix dimension information, and generating game training interactive data according to the game environment key parameters and the game environment data matrix.

In one embodiment, the processor, when executing the computer program, further performs the steps of: determining a corresponding first information type according to the game environment key parameters, determining a corresponding second information type according to the game environment data matrix, transmitting the game environment key parameters into standard parameters of which the data type corresponding to the first information type is a structural body to obtain first game training interactive data, transmitting the game environment data matrix into standard parameters of which the data type corresponding to the second information type is the structural body to obtain second game training interactive data, and obtaining the game training interactive data according to the first game training interactive data and the second game training interactive data.

In one embodiment, the processor, when executing the computer program, further performs the steps of: the method comprises the steps of obtaining target game environment data corresponding to a target game from a first server, carrying out standardized data format conversion on the target game environment data to obtain corresponding target game interaction data, sending the target game interaction data to a second server to enable the second server to call a matched target game model file according to the target game interaction data, predicting to obtain a target game action based on the target game interaction data through a target neural network model corresponding to the target game model file, and sending the target game action returned by the second server to the first server to enable the first server to control a target object in the target game to execute the next action according to the target game action.

In one embodiment, a computer-readable storage medium is provided, having a computer program stored thereon, which when executed by a processor, performs the steps of: acquiring game environment data corresponding to a current game from a first server, wherein the first server is used for acquiring the game environment data corresponding to a plurality of games, and the current game is any one of the plurality of games; carrying out data type conversion on the game environment data to obtain game training interactive data in a standard data format; and sending the game training interaction data to a second server so that the second server trains the neural network model according to the game training interaction data to obtain the current game action and the current game reward value which are output by the neural network model and correspond to the current game, stopping the training of the neural network model when the current game reward value reaches a preset game reward threshold value, outputting the current game action, and generating a game model file corresponding to the current game.

It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program instructing the relevant hardware. The computer program can be stored in a non-volatile computer-readable storage medium and, when executed, can include the processes of the embodiments of the methods described above. Any reference to memory, storage, database, or other medium used in the embodiments provided herein may include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).

The technical features of the above embodiments can be combined arbitrarily. For the sake of brevity, not all possible combinations of the technical features in the above embodiments are described; however, as long as a combination of these technical features involves no contradiction, it should be considered to fall within the scope of the present specification.

The above embodiments express only several implementations of the present application, and although their description is relatively specific and detailed, they should not be construed as limiting the scope of the invention. It should be noted that a person skilled in the art can make several variations and modifications without departing from the concept of the present application, all of which fall within the scope of protection of the present application. Therefore, the protection scope of the present patent shall be subject to the appended claims.

Claims

1. A game data processing method, the method comprising:

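acquiring game environment data corresponding to a current game from a first server, wherein the first server is used for acquiring the game environment data corresponding to a plurality of games, and the current game is any one of the games;

carrying out data type conversion on the game environment data to obtain game training interaction data in a standard data format;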

and sending the game training interaction data to a second server so that the second server trains a neural network model according to the game training interaction data to obtain a current game action and a current game reward value which are output by the neural network model and correspond to the current game, stopping the training of the neural network model when the current game reward value reaches a preset game reward threshold value, outputting the current game action, and generating a game model file corresponding to the current game.

2. The method of claim 1, further comprising:

receiving a game environment data acquisition instruction sent by the second server, wherein the game environment data acquisition instruction is generated by triggering when the current game reward value does not reach the preset game reward threshold value;

and obtaining game environment data corresponding to the next game scene of the current game from the first server according to a game environment data acquisition instruction, returning to the execution step to perform data type conversion on the game environment data to obtain game training interactive data in a standard data format, and stopping the training of the neural network model until the current game reward value output by the neural network model in the second server reaches a preset game reward threshold value.

3. The method of claim 1, wherein before obtaining the game environment data corresponding to the current game from the first server, the method further comprises:

when the connection with the first server is successfully established through a preset communication protocol, receiving initialization parameters sent by the first server;

sending the initialization parameters to the second server based on the preset communication protocol so that the second server sets the neural network model as corresponding initial training interaction parameters according to the initialization parameters to obtain an initialized neural network model;

and receiving a current game resetting instruction sent by the second server, and sending the current game resetting instruction to the first server so that the first server resets the game scene of the current game according to the current game resetting instruction.

4. The method of claim 1, wherein the converting the game environment data into the game training interaction data in a standard data format comprises:

extracting game environment key parameters from the game environment data;

obtaining a corresponding game environment data matrix according to the game environment parameters and preset matrix dimension information;

and generating game training interactive data according to the game environment key parameters and the game environment data matrix.

5. The method of claim 4, wherein generating game training interaction data from the game environment key parameters and the game environment data matrix comprises:

determining a corresponding first information type according to the game environment key parameters;

determining a corresponding second information type according to the game environment data matrix;

transmitting the game environment key parameters into standard parameters of which the data type corresponding to the first information type is a structural body to obtain first game training interactive data;

and transmitting the game environment data matrix into a standard parameter with a data type corresponding to the second information type as a structural body to obtain second game training interactive data, and obtaining the game training interactive data according to the first game training interactive data and the second game training interactive data.

6. The method of claim 1, further comprising:

acquiring target game environment data corresponding to a target game from the first server;

carrying out standardized data format conversion on the target game environment data to obtain corresponding target game interaction data;

sending the target game interaction data to the second server so that the second server calls a matched target game model file according to the target game interaction data, and predicting to obtain a target game action based on the target game interaction data through a target neural network model corresponding to the target game model file;

and sending the target game action returned by the second server to the first server so that the first server controls a target object in the target game to execute the next action according to the target game action.
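The round trip in claim 6 can be sketched with two in-process stand-ins for the servers. Everything named here is hypothetical: `FirstServer`, `SecondServer`, `standardize`, the `demo_game` identifier, and the lambda that plays the role of a loaded game model file. A real deployment would use network calls and a trained model.

```python
from typing import Any, Callable, Dict, List, Tuple


class FirstServer:
    """Stand-in for the game server that holds environment data and executes actions."""

    def __init__(self) -> None:
        self.env: Dict[str, Dict[str, float]] = {"demo_game": {"enemy_distance": 2.0}}
        self.executed: List[Tuple[str, str]] = []

    def get_environment(self, game_id: str) -> Dict[str, float]:
        return dict(self.env[game_id])

    def apply_action(self, game_id: str, action: str) -> None:
        self.executed.append((game_id, action))


class SecondServer:
    """Stand-in for the AI server; a callable per game replaces the model file."""

    def __init__(self) -> None:
        self.models: Dict[str, Callable[[List[float]], str]] = {
            "demo_game": lambda features: "attack" if features[0] < 3.0 else "move",
        }

    def predict(self, interaction: Dict[str, Any]) -> str:
        model = self.models[interaction["game_id"]]   # pick the matching model
        return model(interaction["features"])


def standardize(game_id: str, env: Dict[str, float]) -> Dict[str, Any]:
    """Assumed standard format: game id plus features in sorted key order."""
    return {"game_id": game_id, "features": [env[k] for k in sorted(env)]}


def step(game_id: str, first: FirstServer, second: SecondServer) -> str:
    env = first.get_environment(game_id)          # acquire target environment data
    interaction = standardize(game_id, env)       # convert to the standard format
    action = second.predict(interaction)          # second server predicts an action
    first.apply_action(game_id, action)           # first server executes the action
    return action
```

Calling `step("demo_game", FirstServer(), SecondServer())` walks through the four steps once; the game data processing device corresponds to the `step` function that sits between the two servers.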

7. A game data processing system, the system comprising:

a first server, configured to acquire, according to a current game environment data acquisition request, the game environment data corresponding to a current game identifier from game environment data corresponding to a plurality of games, and to return the game environment data to a game data processing device;

the game data processing device, configured to receive the game environment data, perform data type conversion on the game environment data to obtain game training interaction data in a standard data format, and send the game training interaction data to a second server;

and the second server, configured to receive the game training interaction data, train a neural network model according to the game training interaction data to obtain a current game action and a current game reward value that are output by the neural network model and correspond to the current game, and, when the current game reward value reaches a preset game reward threshold, stop training the neural network model, output the current game action, and generate a game model file corresponding to the current game.
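The stopping rule on the second server (train until the reward reaches a preset threshold, then emit the action and a model file) can be shown with a toy loop. This is not a neural network: a single weight trained by gradient ascent stands in for the model, and the reward function, learning rate, `target` value, and JSON file layout are all assumptions chosen to keep the sketch runnable.

```python
import json
import os
import tempfile
from typing import Dict


def train_until_threshold(reward_threshold: float, max_steps: int = 1000,
                          lr: float = 0.1) -> Dict[str, float]:
    """Train a one-weight stand-in model until its reward reaches the threshold."""
    w = 0.0          # single trainable weight (stand-in for network parameters)
    target = 1.0     # hypothetical best action for this toy environment
    for step_no in range(1, max_steps + 1):
        action = w                            # current game action output by the model
        reward = -(action - target) ** 2      # current game reward value (assumed form)
        if reward >= reward_threshold:        # preset game reward threshold reached
            return {"weight": w, "action": action, "reward": reward, "steps": step_no}
        w += lr * 2.0 * (target - w)          # gradient ascent on the reward
    raise RuntimeError("reward threshold not reached within max_steps")


def save_model_file(result: Dict[str, float], game_id: str, directory: str) -> str:
    """Write the trained parameters to a per-game file (JSON is an assumed format)."""
    path = os.path.join(directory, f"{game_id}.model.json")
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(result, fh)
    return path
```

A caller might run `train_until_threshold(-0.01)` and pass the result to `save_model_file` with a game identifier, producing one model file per game as the claim describes.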

8. A game data processing apparatus, characterized in that the apparatus comprises:

an acquisition module, configured to acquire game environment data corresponding to a current game from a first server, wherein the first server is configured to collect game environment data corresponding to a plurality of games, and the current game is any one of the plurality of games;

and a sending module, configured to send the game training interaction data to a second server, so that the second server trains a neural network model according to the game training interaction data to obtain a current game action and a current game reward value that are output by the neural network model and correspond to the current game, and, when the current game reward value reaches a preset game reward threshold, stops training the neural network model, outputs the current game action, and generates a game model file corresponding to the current game.

9. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the steps of the method of any of claims 1 to 6 are implemented when the computer program is executed by the processor.

10. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the method of any one of claims 1 to 6.