CN113648658B

CN113648658B - Game data processing method, system, device, computer equipment and storage medium

Info

Publication number: CN113648658B
Application number: CN202110837834.8A
Authority: CN
Inventors: 刘舟; 杨帆; 黎广璘
Original assignee: Guangzhou Sanqi Mutual Entertainment Technology Co ltd
Current assignee: Guangzhou Sanqi Mutual Entertainment Technology Co ltd
Priority date: 2021-07-23
Filing date: 2021-07-23
Publication date: 2023-10-20
Anticipated expiration: 2041-07-23
Also published as: CN113648658A

Abstract

The application relates to a game data processing method, a game data processing device, computer equipment and a storage medium. The method comprises the following steps: game environment data corresponding to a current game is acquired from a first server, where the first server is used for collecting game environment data of a plurality of games and the current game is any one of those games; the game environment data is converted to obtain game training interaction data; and the game training interaction data is sent to a second server, so that the second server trains a neural network model according to the game training interaction data to obtain a current game action and a current game reward value output by the neural network model, stops training when the current game reward value reaches a preset game reward threshold, outputs the current game action, and generates a game model file corresponding to the current game. With this method, different games can access the same AI training scheme, so AI does not need to be written separately for each game scene, which reduces development cost and improves development efficiency.

Description

Game data processing method, system, device, computer equipment and storage medium

Technical Field

The present application relates to the field of computer technologies, and in particular, to a game data processing method, apparatus, computer device, and storage medium.

Background

As games develop, they need many AIs (Artificial Intelligence) to make the game world more realistic and varied. Traditional AI has to be written manually by developers and then trained, and this manual work has to be repeated for each different game scene, which tends to make development cost high and development efficiency low.

Disclosure of Invention

In view of the above, it is necessary to provide a game data processing method, system, device, computer equipment and storage medium that can convert the game environment data corresponding to different games into a standardized format and use the converted data to train the same neural network model, so that different games can access the same AI training scheme, AI does not need to be written separately for each game scene, development cost is reduced, and development efficiency is improved.

A game data processing method, the method comprising:

Acquiring game environment data corresponding to a current game from a first server, wherein the first server is used for acquiring the game environment data corresponding to a plurality of games, and the current game is any one of the plurality of games;

performing data type conversion on the game environment data to obtain game training interaction data in a standard data format;

and sending the game training interaction data to a second server so that the second server trains the neural network model according to the game training interaction data to obtain a current game action and a current game rewarding value corresponding to the current game output by the neural network model, stopping training the neural network model when the current game rewarding value reaches a preset game rewarding threshold value, outputting the current game action, and generating a game model file corresponding to the current game.

In one embodiment, the game data processing method further includes: receiving a game environment data acquisition instruction sent by the second server, the instruction being generated when the current game reward value has not reached the preset game reward threshold; acquiring, according to the instruction, game environment data corresponding to the next game scene of the current game from the first server; and returning to the step of performing data type conversion on the game environment data to obtain game training interaction data in a standard data format, until the current game reward value output by the neural network model in the second server reaches the preset game reward threshold, at which point training of the neural network model is stopped.

In one embodiment, before obtaining game environment data corresponding to the current game from the first server, the method further includes: when a connection is successfully established with the first server through a predetermined communication protocol, receiving initialization parameters sent by the first server; sending the initialization parameters to the second server based on the predetermined communication protocol, so that the second server sets the corresponding initial training interaction parameters of the neural network model according to the initialization parameters and obtains an initialized neural network model; and receiving a current game reset instruction sent by the second server and sending it to the first server, so that the first server resets the game scene of the current game according to the current game reset instruction.

In one embodiment, performing data type conversion on the game environment data to obtain game training interaction data in a standard data format includes: extracting game environment key parameters from the game environment data; obtaining a corresponding game environment data matrix according to the game environment parameters and preset matrix dimension information; and generating the game training interaction data according to the game environment key parameters and the game environment data matrix.

In one embodiment, generating game training interaction data from the game environment key parameters and the game environment data matrix includes: determining a corresponding first information type according to the game environment key parameters; determining a corresponding second information type according to the game environment data matrix; passing the game environment key parameters into a standard parameter whose data type is the structure corresponding to the first information type, to obtain first game training interaction data; passing the game environment data matrix into a standard parameter whose data type is the structure corresponding to the second information type, to obtain second game training interaction data; and obtaining the game training interaction data from the first game training interaction data and the second game training interaction data.

In one embodiment, the game data processing method further includes: obtaining target game environment data corresponding to a target game from a first server, performing standardized data format conversion on the target game environment data to obtain corresponding target game interaction data, sending the target game interaction data to a second server, enabling the second server to call a matched target game model file according to the target game interaction data, predicting a target game action based on the target game interaction data through a target neural network model corresponding to the target game model file, and sending the target game action returned by the second server to the first server, so that the first server controls a target object in the target game to execute a next action according to the target game action.

A game data processing system, the system comprising:

the game data processing equipment is used for sending a current game environment data acquisition request to the first server, wherein the current game environment data acquisition request carries a current game identifier;

the first server is used for acquiring game environment data corresponding to the current game identifier from the game environment data corresponding to the plurality of games according to the current game environment data acquisition request and returning the game environment data to the game data processing equipment;

the game data processing equipment is used for receiving game environment data, performing data type conversion on the game environment data to obtain game training interaction data in a standard data format, and sending the game training interaction data to the second server;

the second server is used for receiving game training interaction data, training the neural network model according to the game training interaction data to obtain current game actions and current game rewards corresponding to the current game output by the neural network model, stopping training the neural network model when the current game rewards reach a preset game rewards threshold, outputting the current game actions, and generating a game model file corresponding to the current game.

A game data processing apparatus, the apparatus comprising:

the acquisition module is used for acquiring game environment data corresponding to a current game from a first server, wherein the first server is used for acquiring the game environment data corresponding to a plurality of games, and the current game is any one of the plurality of games;

the conversion module is used for carrying out data type conversion on the game environment data to obtain game training interaction data in a standard data format;

the sending module is used for sending the game training interaction data to the second server so that the second server trains the neural network model according to the game training interaction data to obtain the current game action and the current game rewarding value corresponding to the current game output by the neural network model, and when the current game rewarding value reaches the preset game rewarding threshold value, the training of the neural network model is stopped, the current game action is output, and the game model file corresponding to the current game is generated.

A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the steps of:

A computer readable storage medium having stored thereon a computer program which when executed by a processor performs the steps of:

converting the data type of the game environment data to obtain game training interaction data in a standard data format;

The game data processing method, the game data processing system, the game data processing device, the computer equipment and the storage medium acquire game environment data corresponding to a current game from a first server, the first server is used for acquiring the game environment data corresponding to a plurality of games, the current game is any one of the games, data type conversion is carried out on the game environment data to obtain game training interaction data in a standard data format, the game training interaction data are sent to a second server, so that the second server trains a neural network model according to the game training interaction data to obtain current game actions and current game rewarding values corresponding to the current game output by the neural network model, training of the neural network model is stopped when the current game rewarding values reach a preset game rewarding threshold value, and the current game actions are output to generate game model files corresponding to the current game.

Therefore, the game environment data corresponding to different games can be subjected to standardized format conversion, the converted data are used for training the same set of neural network model in a butt joint mode, the purpose that different games can be connected with the same set of AI training schemes is achieved, AI corresponding to different game scenes do not need to be written, development cost is reduced, and development efficiency is improved.

Drawings

FIG. 1 is an application environment diagram of a game data processing method in one embodiment;

FIG. 2 is a flow chart of a method of processing game data according to one embodiment;

FIG. 3 is a flow chart of a method of processing game data according to one embodiment;

FIG. 4 is a flow chart of a game data processing method according to another embodiment;

FIG. 5 is a flow chart illustrating the steps of converting game environment data according to one embodiment;

FIG. 6 is a flow chart of game training interaction data generation steps in one embodiment;

FIG. 7 is a flow chart of a method of processing game data in one embodiment;

FIG. 8 is a block diagram of a game data processing system in one embodiment;

FIG. 9 is a block diagram of a game data processing device in one embodiment;

FIG. 10 is an internal block diagram of a computer device in one embodiment;

FIG. 11 is an internal block diagram of a computer device in one embodiment.

Detailed Description

The present application will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present application more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the application.

The game data processing method provided by the application can be applied to the application environment shown in FIG. 1, in which the game data processing device 102 communicates with the first server 104 via a network and with the second server 106 via a network. The game data processing device 102 may be, but is not limited to, a personal computer, notebook computer, smartphone, tablet computer, portable wearable device, independent server, or server cluster composed of a plurality of servers. The first server 104 and the second server 106 may each be implemented as an independent server or as a server cluster composed of a plurality of servers. The game data processing device 102 may also be embedded in the first server or the second server in the form of an SDK (Software Development Kit).

Specifically, the game data processing device 102 sends a current game environment data collection request to the first server, where the current game environment data collection request carries a current game identifier, and the first server 104 obtains, according to the current game environment data collection request, game environment data corresponding to the current game identifier from game environment data corresponding to a plurality of games, and returns the game environment data to the game data processing device 102. Further, the game data processing device 102 receives game environment data, performs data type conversion on the game environment data to obtain game training interaction data in a standard data format, sends the game training interaction data to the second server 106, the second server 106 receives the game training interaction data, trains the neural network model according to the game training interaction data to obtain a current game action and a current game reward value corresponding to a current game output by the neural network model, stops training the neural network model when the current game reward value reaches a preset game reward threshold, outputs the current game action, and generates a game model file corresponding to the current game.

In one embodiment, as shown in FIG. 2, a game data processing method is provided. Taking the application of the method to the game data processing device in FIG. 1 as an example, the method includes the following steps:

step 202, obtaining game environment data corresponding to a current game from a first server, where the first server is configured to collect game environment data corresponding to a plurality of games, and the current game is any one of the plurality of games.

The first server may be a service-related server, for example a game server. The current game may be the game currently being processed, or it may be determined according to actual service requirements, product requirements, or the application scenario. The first server collects game environment data corresponding to a plurality of games, and the current game is any one of them.

The game environment data is environment data related to the current game, which can be but is not limited to scene parameters of the current game in the running process, control object data corresponding to the current game, and the like.

Specifically, a current game environment data acquisition request carrying the current game identifier can be sent to the first server, and the first server looks up the game environment data corresponding to the current game identifier according to the request.
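The lookup by game identifier can be illustrated with a minimal sketch. The class names `AcquisitionRequest` and `FirstServerStub`, the identifiers, and the field names in the sample data are all hypothetical; the text only states that the request carries a game identifier and that the first server returns the matching environment data.

```python
from dataclasses import dataclass
from typing import Any, Dict


@dataclass(frozen=True)
class AcquisitionRequest:
    """Hypothetical request message; the text only says it carries a game identifier."""
    game_id: str


class FirstServerStub:
    """In-memory stand-in for the first server, which holds data for several games."""

    def __init__(self, environments: Dict[str, Dict[str, Any]]) -> None:
        self._environments = environments

    def handle(self, request: AcquisitionRequest) -> Dict[str, Any]:
        # Return the environment data registered under the requested identifier.
        if request.game_id not in self._environments:
            raise LookupError(f"no environment data for game {request.game_id!r}")
        return self._environments[request.game_id]


# Sample data for two games; the field names are assumptions for illustration.
server = FirstServerStub({
    "balance_ball": {"position": [0.0, 1.0, 0.0], "direction": [0.0, 0.0, 1.0]},
    "table_tennis": {"ball_position": [0.5, 0.2], "ball_velocity": [1.0, -0.3]},
})
balance_ball_env = server.handle(AcquisitionRequest(game_id="balance_ball"))
```

In a real deployment the request would travel over the network; the in-memory dictionary here only stands in for that exchange.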

Step 204, performing data type conversion on the game environment data to obtain game training interaction data in a standard data format.

The game training interaction data is training interaction data in a standard data format and is used for training the neural network model. Game environment data corresponding to different games can all be converted into game training interaction data in this standard data format, so that different games can access the training scheme of the same neural network model.

The conversion of the game environment data can be data format conversion of the game environment data, and the game environment data of different games can be converted by using the same set of standard data format conversion rules to obtain game training interaction data in a standard data format.

Step 206, sending the game training interaction data to a second server, so that the second server trains the neural network model according to the game training interaction data to obtain the current game action and the current game reward value corresponding to the current game output by the neural network model, stops training the neural network model when the current game reward value reaches a preset game reward threshold, outputs the current game action, and generates a game model file corresponding to the current game.

The second server here is the server where the neural network is located, and may be an AI server. Specifically, the game training interaction data is sent to the second server through a network. After receiving the game training interaction data, the second server calls the neural network model, uses the game training interaction data as the input parameters of the model, and obtains the current game action and the current game reward value corresponding to the current game through the model's computation. The current game action is an action used for controlling the current control object in the current game. The current game reward value can be regarded as the current incentive signal of the current game, and it can be used to determine whether the neural network model has achieved the training purpose.

Determining whether the neural network model has achieved the training purpose through the current game reward value may specifically include obtaining the preset game reward threshold and comparing the current game reward value with it. When the current game reward value reaches the preset game reward threshold, it is determined that the neural network model has achieved the training purpose; training of the neural network model is stopped, the current game action is output, and the current game action is stored as a game model file corresponding to the current game. The game model file contains the neural network model trained on the current game.

In another embodiment, determining whether the neural network model reaches the training purpose through the current game reward value may further include obtaining a total training frequency of the neural network model, determining that the neural network model reaches the training purpose when the total training frequency reaches a training frequency threshold, stopping training of the neural network model, outputting a current game action, and storing the current game action as a game model file corresponding to the current game.
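The two stopping conditions described above can be sketched in one helper. Combining them in a single function is an assumption made for illustration (the text presents them as separate embodiments), and the function and parameter names are hypothetical.

```python
from typing import Optional


def training_finished(reward: float,
                      reward_threshold: float,
                      training_count: int = 0,
                      training_count_threshold: Optional[int] = None) -> bool:
    """Return True when either stopping condition described in the text holds."""
    # Condition 1: the current game reward value reaches the preset threshold.
    if reward >= reward_threshold:
        return True
    # Condition 2 (alternative embodiment): the total training count reaches
    # the training-count threshold, if such a threshold is configured.
    if training_count_threshold is not None and training_count >= training_count_threshold:
        return True
    return False
```

Treating "reaches" as greater than or equal is also an assumption; the text does not say whether the comparison is strict.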

In the game data processing method, game environment data corresponding to a current game are acquired from the first server, the first server is used for acquiring the game environment data corresponding to a plurality of games, the current game is any one of the games, data type conversion is carried out on the game environment data to obtain game training interaction data in a standard data format, the game training interaction data are sent to the second server, so that the second server trains a neural network model according to the game training interaction data to obtain current game actions and current game rewarding values corresponding to the current game output by the neural network model, training of the neural network model is stopped when the current game rewarding values reach a preset game rewarding threshold value, and a game model file corresponding to the current game is generated.

In one embodiment, as shown in fig. 3, the game data processing method further includes:

step 302, receiving a game environment data acquisition instruction sent by the second server, wherein the game environment data acquisition instruction is triggered and generated when the current game reward value does not reach a preset game reward threshold value.

Step 304, acquiring game environment data corresponding to the next game scene of the current game from the first server according to the game environment data acquisition instruction, and returning to the step of performing data type conversion on the game environment data to obtain game training interaction data in a standard data format, until the current game reward value output by the neural network model in the second server reaches the preset game reward threshold, at which point training of the neural network model is stopped.

When the current game reward value does not reach the preset game reward threshold, it is determined that the neural network model has not achieved the training purpose, so training needs to continue. The second server therefore generates a game environment data acquisition instruction, which the executing entity, the game data processing device, forwards to the first server.

After receiving the game environment data acquisition instruction, the first server acquires the game environment data corresponding to the next game scene of the current game. The current game can include a plurality of game scenes; because the previous game environment data did not enable the neural network model to achieve the training purpose, the game environment data corresponding to the next game scene is needed. The process then returns to the step of performing data type conversion on the game environment data to obtain game training interaction data in a standard data format, and repeats in this way until the current game reward value output by the neural network model in the second server reaches the preset game reward threshold, at which point training of the neural network model is stopped.
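The scene-by-scene loop can be sketched as follows. Everything here is a stand-in: `run_training_loop`, `fake_scenes`, and `FakeTrainer` are hypothetical names, the reward simply grows by 0.25 per round so the loop terminates, and the `max_rounds` guard is an added safety assumption rather than something the text requires.

```python
import itertools
from typing import Any, Callable, Dict, Iterable, List, Optional, Tuple


def run_training_loop(scenes: Iterable[Dict[str, Any]],
                      convert: Callable[[Dict[str, Any]], List[float]],
                      train_step: Callable[[List[float]], Tuple[str, float]],
                      reward_threshold: float,
                      max_rounds: int = 1000) -> Tuple[Optional[str], float, int]:
    """Feed one scene per round to the trainer until the reward threshold is reached."""
    action, reward, rounds = None, float("-inf"), 0
    for env_data in scenes:
        rounds += 1
        interaction = convert(env_data)            # standard-format conversion
        action, reward = train_step(interaction)   # second server trains one round
        if reward >= reward_threshold or rounds >= max_rounds:
            break
        # Otherwise the second server would issue an acquisition instruction,
        # and the next iteration fetches the next scene from the first server.
    return action, reward, rounds


def fake_scenes():
    """Endless stream of scenes standing in for the first server."""
    for i in itertools.count():
        yield {"scene": i, "values": [float(i), 0.0]}


class FakeTrainer:
    """Stand-in for the second server; its reward grows by 0.25 per round."""

    def __init__(self) -> None:
        self.calls = 0

    def step(self, interaction: List[float]) -> Tuple[str, float]:
        self.calls += 1
        return "noop", 0.25 * self.calls


trainer = FakeTrainer()
action, reward, rounds = run_training_loop(
    fake_scenes(), lambda env: env["values"], trainer.step, reward_threshold=1.0)
```

With the stand-in trainer above, the loop stops on the fourth round, when the reward first equals the threshold of 1.0.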

In one embodiment, as shown in fig. 4, before obtaining game environment data corresponding to the current game from the first server, the method further includes:

Step 402, when a connection is successfully established with the first server through the predetermined communication protocol, receiving initialization parameters sent by the first server.

Step 404, sending the initialization parameters to the second server based on the predetermined communication protocol, so that the second server sets the corresponding initial training interaction parameters of the neural network model according to the initialization parameters and obtains the initialized neural network model.

Step 406, receiving the current game reset instruction sent by the second server, and sending the current game reset instruction to the first server, so that the first server resets the game scene of the current game according to the current game reset instruction.

The predetermined communication protocol is a self-developed general-purpose protocol. The executing entity, the game data processing device, establishes connections with the first server and the second server through this protocol, and it can be embedded in the first server or the second server in the form of an SDK.

Specifically, after the connection is successfully established with the first server through the predetermined communication protocol, the initialization parameters sent by the first server are received. The initialization parameters include a seed value, a communication version, a package version, and training capability parameters. The training capability parameters include basic reinforcement learning capability, a concatenated PNG observation setting, compressed channel transmission mapping, hybrid actions, training analytics, a variable-length observation setting, whether multi-agent groups are needed, and the like. The initialization parameters are used to standardize the interaction between the game logic server and the training end: the seed value, the communication version, and the package version form the basis for training interaction between the logic server and the training end and are mainly used to verify the legitimacy of both parties, while the training capability parameters specify the basic configuration of the training end so that it can exchange data in the specified manner.
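One way to picture the initialization parameters is as a pair of small record types. This is only a sketch: the field names are my own reading of the items listed above, and the `is_compatible` check (matching communication versions) is a hypothetical stand-in, since the text does not say how legitimacy is actually verified.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class TrainingCapabilities:
    """Flags for the training capability parameters; names are illustrative."""
    base_rl: bool = True
    concatenated_png_observations: bool = False
    compressed_channel_mapping: bool = False
    hybrid_actions: bool = False
    training_analytics: bool = False
    variable_length_observations: bool = False
    multi_agent_groups: bool = False


@dataclass(frozen=True)
class InitParams:
    """Initialization parameters sent by the first server after connecting."""
    seed: int
    communication_version: str
    package_version: str
    capabilities: TrainingCapabilities = field(default_factory=TrainingCapabilities)


def is_compatible(game_side: InitParams, trainer_side: InitParams) -> bool:
    # Assumption: both parties are accepted only if they use the same
    # communication version. The source does not specify the actual rule.
    return game_side.communication_version == trainer_side.communication_version


game_params = InitParams(seed=7, communication_version="1.0", package_version="2.1")
trainer_params = InitParams(seed=7, communication_version="1.0", package_version="2.3")
```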

Further, after the connection is successfully established with the second server based on the predetermined communication protocol, the initialization parameter is sent to the second server, the second server sets the neural network model to the corresponding initial training interaction parameter according to the initialization parameter, the initialized neural network model is obtained, and then the second server sends a current game reset instruction which is used for resetting the scene of the current game in the first server.

Finally, the executing entity forwards the current game reset instruction sent by the second server to the first server, and the first server resets the game scene of the current game according to the current game reset instruction.
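Steps 402 to 406 amount to a relay performed by the game data processing device. The sketch below replaces both servers with in-memory stubs; `GameServerStub`, `TrainerStub`, the `"RESET"` token, and the parameter values are all hypothetical, and the real exchange would run over the predetermined communication protocol.

```python
from typing import Any, Dict, Optional


class GameServerStub:
    """Stand-in for the first server (game side)."""

    def __init__(self) -> None:
        self.reset_count = 0

    def send_init_params(self) -> Dict[str, Any]:
        # Illustrative values only.
        return {"seed": 42, "communication_version": "1.0", "package_version": "1.0"}

    def reset_scene(self) -> None:
        self.reset_count += 1


class TrainerStub:
    """Stand-in for the second server (training side)."""

    def __init__(self) -> None:
        self.init_params: Optional[Dict[str, Any]] = None

    def initialize(self, params: Dict[str, Any]) -> str:
        # Configure the model from the parameters, then ask for a scene reset.
        self.init_params = dict(params)
        return "RESET"


def handshake(game_server: GameServerStub, trainer: TrainerStub) -> str:
    """Relay performed by the game data processing device."""
    params = game_server.send_init_params()    # step 402: receive init parameters
    instruction = trainer.initialize(params)   # step 404: forward to second server
    if instruction == "RESET":                 # step 406: forward reset instruction
        game_server.reset_scene()
    return instruction


game_server, trainer_stub = GameServerStub(), TrainerStub()
result = handshake(game_server, trainer_stub)
```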

In one embodiment, as shown in fig. 5, performing data type conversion on game environment data to obtain game training interaction data in a standard data format, including:

step 502, extracting key parameters of the game environment from the game environment data.

Step 504, obtaining a corresponding game environment data matrix according to the game environment parameters and the dimension information of the preset matrix.

Step 506, game training interaction data is generated according to the game environment key parameters and the game environment data matrix.

Wherein the game environment key parameter is extracted from the game environment data, and the game environment key parameter may be an observed value, which is a key parameter transmitted by the first server through a predetermined communication protocol.

The preset matrix dimension information is dimension information set in advance. It indicates the matrix dimension of the transmitted data format, that is, whether the data is a one-dimensional matrix, a two-dimensional matrix, or an N-dimensional matrix, and it is determined according to actual service requirements, product requirements, or the actual application scenario. A corresponding game environment data matrix can be obtained from the game environment parameters and the preset matrix dimension information. Specifically, the number of game environment parameters is obtained, and the game environment data matrix is generated according to that number and the preset matrix dimension information.

For example, when the game scene is a balance ball, the game environment key parameters to be transferred are 6 pieces of information, such as the coordinates and direction of the balance ball. If the preset matrix dimension information specifies a one-dimensional matrix, the finally generated game environment data matrix is a 1×6 matrix.

Finally, game training interaction data can be generated from the game environment key parameters and the game environment data matrix. Specifically, according to the information types corresponding to the game environment key parameters and the game environment data matrix, each is passed into the corresponding standard structure parameter to obtain the game training interaction data.
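Steps 502 to 506 can be sketched for the one-dimensional case. The function names, the dictionary layout of the result, and the split of the balance ball's 6 values into a 3-component position and a 3-component direction are assumptions; the text only says there are 6 pieces of information such as coordinates and direction. Higher-dimensional presets are not specified in the text, so the sketch rejects them.

```python
from typing import Any, Dict, List, Sequence


def extract_key_parameters(env_data: Dict[str, Any], keys: Sequence[str]) -> List[float]:
    """Step 502: pull the selected fields out and flatten them into floats."""
    values: List[float] = []
    for key in keys:
        item = env_data[key]
        if isinstance(item, (list, tuple)):
            values.extend(float(v) for v in item)
        else:
            values.append(float(item))
    return values


def build_data_matrix(params: List[float], preset_dims: int) -> List[List[float]]:
    """Step 504: shape the parameters according to the preset dimension info."""
    if preset_dims != 1:
        raise ValueError("this sketch only covers the one-dimensional preset")
    return [list(params)]  # one row with len(params) columns, e.g. 1x6


def convert(env_data: Dict[str, Any], keys: Sequence[str],
            preset_dims: int = 1) -> Dict[str, Any]:
    """Step 506: bundle parameters and matrix as training interaction data."""
    params = extract_key_parameters(env_data, keys)
    matrix = build_data_matrix(params, preset_dims)
    return {"key_parameters": params, "data_matrix": matrix}


balance_ball_env = {"position": [0.0, 1.0, 0.0], "direction": [0.0, 0.0, 1.0], "name": "ball"}
interaction = convert(balance_ball_env, keys=["position", "direction"])
```

Because the same `convert` function accepts any game's environment data together with a list of keys, a second game only needs a different key list, which is the point of the standard format.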

In one embodiment, as shown in FIG. 6, generating game training interaction data from a game environment key parameter and a game environment data matrix includes:

step 602, determining a corresponding first information type according to the game environment key parameters.

Step 604, determining a corresponding second information type according to the game environment data matrix.

Step 606, passing the game environment key parameters into the standard parameter whose data type is the structure corresponding to the first information type, to obtain the first game training interaction data.

Step 608, passing the game environment data matrix into the standard parameter whose data type is the structure corresponding to the second information type, to obtain the second game training interaction data, and obtaining the game training interaction data from the first game training interaction data and the second game training interaction data.

The first information type is the information type corresponding to the game environment key parameters, and the second information type is the information type corresponding to the game environment data matrix. Specifically, the game environment key parameters are obtained and their first information type is determined, and the second information type corresponding to the game environment data matrix is obtained.

Further, the standard parameter whose data type is the structure corresponding to the first information type is obtained, and the game environment key parameters are passed into it to obtain the first game training interaction data.

Similarly, the standard parameter whose data type is the structure corresponding to the second information type is obtained, and the game environment data matrix is passed into it to obtain the second game training interaction data.

For example, when the game scene is a balance ball, the game scene key parameters to be transferred are 6 pieces of information, such as the coordinates and direction of the balance ball. If the preset matrix dimension information specifies a one-dimensional matrix, the finally generated game environment data matrix is a 1×6 matrix. Since FloatData is a collection of Float (floating-point) values, the 6 parameters are passed into FloatData, and the 1×6 data matrix is passed into Observation. When the game scene is table tennis, the game scene key parameters to be transferred are 8 pieces of information, such as the coordinates and speed of the ball and the positions of the two players. With a one-dimensional preset matrix, the finally generated game environment data matrix is a 1×8 matrix; the 8 parameters are passed into FloatData, and the 1×8 game environment data matrix is passed into Observation. Here, FloatData (floating-point data) is a collection of Float-type data, and Observation (observed value) is a game environment key parameter.
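The balance ball and table tennis examples can be sketched with two small structures. `FloatData` and `Observation` are named in the text, but their fields here (`data`, `matrix`), the wrapper `TrainingInteractionData`, and the sample values are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class FloatData:
    """A collection of float values (first game training interaction data)."""
    data: List[float]


@dataclass
class Observation:
    """Holds the game environment data matrix (second interaction data)."""
    matrix: List[List[float]]

    @property
    def shape(self) -> Tuple[int, int]:
        return (len(self.matrix), len(self.matrix[0]) if self.matrix else 0)


@dataclass
class TrainingInteractionData:
    """Hypothetical wrapper combining the first and second interaction data."""
    float_data: FloatData
    observation: Observation


def to_interaction_data(key_parameters: List[float]) -> TrainingInteractionData:
    floats = [float(v) for v in key_parameters]
    first = FloatData(data=floats)                 # key parameters -> FloatData
    second = Observation(matrix=[list(floats)])    # 1 x N matrix -> Observation
    return TrainingInteractionData(first, second)


# Balance ball: 6 values; table tennis: 8 values (all values are made up).
balance_ball = to_interaction_data([0.0, 1.0, 0.0, 0.0, 0.0, 1.0])
table_tennis = to_interaction_data([0.5, 0.2, 1.0, -0.3, -1.0, 0.0, 1.0, 0.0])
```

Both games end up in the same pair of structures and differ only in the number of columns, so the training side can consume either without game-specific code.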

In one embodiment, as shown in FIG. 7, the game data processing method further includes:

Step 702, obtaining target game environment data corresponding to a target game from the first server.

Step 704, performing standardized data format conversion on the target game environment data to obtain corresponding target game interaction data.

Step 706, sending the target game interaction data to the second server, so that the second server calls the matching target game model file according to the target game interaction data and predicts a target game action based on the target game interaction data through the target neural network model corresponding to the target game model file.

Step 708, sending the target game action returned by the second server to the first server, so that the first server controls the target object in the target game to execute the next action according to the target game action.

After the game model files corresponding to the games are obtained, the game actions can be predicted through the game model files. The target game is a game requiring game action prediction, and can be determined according to actual service requirements, product requirements or actual application scenes.

Specifically, to obtain the target game environment data corresponding to the target game, a request may be sent to the first server, and the first server obtains the target game environment data corresponding to the target game according to the request. Alternatively, the first server may be triggered automatically at a preset time point to acquire the target game environment data corresponding to the target game.

Further, standardized data format conversion is performed on the target game environment data, that is, the target game environment data is processed into a unified format to obtain the corresponding target game interaction data, which is then sent to the second server.

After receiving the target game interaction data, the second server first calls the matching target game model file, since different games correspond to different game model files. It then performs prediction on the target game interaction data through the target neural network model corresponding to the target game model file to obtain the corresponding target game action, and the target game action is forwarded to the first server through the execution subject of the method.

Finally, after the first server receives the target game action, the first server controls the target object in the target game to execute the next action according to the target game action.
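
The prediction flow of steps 702 to 708 can be sketched as follows. This is only an illustrative sketch: the `MODEL_FILES` registry, the game identifiers, the hand-written rules standing in for trained neural network models, and all function names are assumptions. In particular, the description says the matching model file is selected according to the target game interaction data, whereas this sketch passes an explicit game identifier for simplicity.

```python
from typing import Callable, Dict, List

Policy = Callable[[List[float]], str]

# Hypothetical registry standing in for the game model files on the second server.
# Each entry is a simple rule in place of a trained neural network model.
MODEL_FILES: Dict[str, Policy] = {
    "balance_ball": lambda obs: "tilt_left" if obs[0] > 0 else "tilt_right",
    "table_tennis": lambda obs: "move_up" if obs[1] > obs[4] else "move_down",
}


def standardize(raw: Dict[str, float], keys: List[str]) -> List[float]:
    """Step 704: flatten raw environment data into a fixed-order float vector."""
    return [float(raw[k]) for k in keys]


def second_server_predict(game_id: str, interaction_data: List[float]) -> str:
    """Step 706: select the matching model and predict the target game action."""
    policy = MODEL_FILES[game_id]
    return policy(interaction_data)


def run_inference_step(
    game_id: str,
    raw_env: Dict[str, float],
    keys: List[str],
    apply_action: Callable[[str], None],
) -> str:
    """Steps 702-708 as seen from the game data processing device."""
    interaction_data = standardize(raw_env, keys)
    action = second_server_predict(game_id, interaction_data)
    apply_action(action)  # Step 708: forward the action to the first server.
    return action


executed: List[str] = []
action = run_inference_step("balance_ball", {"x": 0.3, "y": 0.0}, ["x", "y"], executed.append)
print(action, executed)  # tilt_left ['tilt_left']
```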

In a specific embodiment, there is provided a game data processing method, specifically including the steps of:

1. Upon successfully establishing a connection with the first server through a preset communication protocol, receiving initialization parameters sent by the first server.

2. Sending the initialization parameters to the second server based on the preset communication protocol, so that the second server sets the neural network model to the corresponding initial training interaction parameters according to the initialization parameters, thereby obtaining an initialized neural network model.

3. Receiving a current game reset instruction sent by the second server, and sending the current game reset instruction to the first server, so that the first server resets the game scene of the current game according to the current game reset instruction.

4. Acquiring game environment data corresponding to the current game from the first server, wherein the first server is used for collecting game environment data corresponding to a plurality of games, and the current game is any one of the plurality of games.

5. Performing data type conversion on the game environment data to obtain game training interaction data in a standard data format.

5-1. Extracting the game environment key parameters from the game environment data.

5-2. Obtaining a corresponding game environment data matrix according to the game environment key parameters and the preset matrix dimension information.

5-3. Generating game training interaction data according to the game environment key parameters and the game environment data matrix.

5-3-1. Determining a corresponding first information type according to the game environment key parameters.

5-3-2. Determining a corresponding second information type according to the game environment data matrix.

5-3-3. Passing the game environment key parameters into a standard parameter whose data type is a structural body and which corresponds to the first information type, to obtain first game training interaction data.

5-3-4. Passing the game environment data matrix into a standard parameter whose data type is a structural body and which corresponds to the second information type, to obtain second game training interaction data, and obtaining the game training interaction data according to the first game training interaction data and the second game training interaction data.

6. Sending the game training interaction data to the second server, so that the second server trains the neural network model according to the game training interaction data to obtain a current game action and a current game reward value corresponding to the current game output by the neural network model, stops training the neural network model when the current game reward value reaches a preset game reward threshold, outputs the current game action, and generates a game model file corresponding to the current game.

7. Receiving a game environment data acquisition instruction sent by the second server, wherein the game environment data acquisition instruction is triggered and generated when the current game reward value does not reach the preset game reward threshold.

8. Acquiring, according to the game environment data acquisition instruction, game environment data corresponding to the next game scene of the current game from the first server, and returning to the step of performing data type conversion on the game environment data to obtain game training interaction data in a standard data format, until the current game reward value output by the neural network model in the second server reaches the preset game reward threshold, at which point training of the neural network model is stopped.

9. Acquiring target game environment data corresponding to the target game from the first server.

10. Performing standardized data format conversion on the target game environment data to obtain corresponding target game interaction data.

11. Sending the target game interaction data to the second server, so that the second server calls the matching target game model file according to the target game interaction data and predicts a target game action based on the target game interaction data through the target neural network model corresponding to the target game model file.

12. Sending the target game action returned by the second server to the first server, so that the first server controls the target object in the target game to execute the next action according to the target game action.
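
The training part of the steps above (steps 1 to 8) can be sketched as a loop that keeps requesting environment data until the reward threshold is reached. Everything in this sketch is a hypothetical stand-in: `FirstServerStub` is a toy one-axis scene, `SecondServerStub` uses a fixed rule instead of a neural network, the returned dictionary stands in for the game model file, and applying the action to the scene between iterations is an assumption about how the next game scene arises.

```python
from typing import Dict, List, Tuple


class FirstServerStub:
    """Hypothetical game side: a toy scene where an object moves along one axis."""

    def __init__(self) -> None:
        self.position = 0.0

    def init_params(self) -> Dict[str, int]:
        return {"observation_size": 1, "action_count": 2}

    def reset(self) -> None:
        self.position = 0.0

    def environment_data(self) -> Dict[str, float]:
        return {"position": self.position}

    def apply(self, action: int) -> None:
        self.position += 1.0 if action == 1 else -1.0


class SecondServerStub:
    """Hypothetical training side; a fixed rule stands in for the neural network."""

    def __init__(self, reward_threshold: float) -> None:
        self.reward_threshold = reward_threshold
        self.params: Dict[str, int] = {}

    def initialize(self, params: Dict[str, int]) -> None:
        self.params = dict(params)

    def train_step(self, observation: List[float]) -> Tuple[int, float]:
        # Toy rule: always choose action 1; the reward is the current position.
        return 1, observation[0]


def train(first: FirstServerStub, second: SecondServerStub, max_steps: int = 1000) -> Dict[str, float]:
    second.initialize(first.init_params())      # steps 1-2: relay initialization parameters
    first.reset()                               # step 3: relay the reset instruction
    for step in range(1, max_steps + 1):
        raw = first.environment_data()          # steps 4 and 8: acquire environment data
        observation = [float(raw["position"])]  # step 5: convert to the standard format
        action, reward = second.train_step(observation)  # step 6: get action and reward
        if reward >= second.reward_threshold:
            # Threshold reached: stop training; this dict stands in for the model file.
            return {"action": action, "reward": reward, "steps": step}
        first.apply(action)                     # step 7: threshold not reached, advance the scene
    raise RuntimeError("reward threshold not reached within max_steps")


result = train(FirstServerStub(), SecondServerStub(reward_threshold=5.0))
print(result)  # {'action': 1, 'reward': 5.0, 'steps': 6}
```

The `max_steps` guard is not part of the described method; it is added here only so the sketch cannot loop forever if the threshold is never reached.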

It should be understood that, although the steps in the above flowcharts are shown in sequence as indicated by the arrows, these steps are not necessarily performed in that sequence. Unless explicitly stated herein, the execution of these steps is not strictly limited in order, and the steps may be performed in other orders. Moreover, at least some of the steps in the above flowcharts may include a plurality of sub-steps or stages, which are not necessarily performed at the same moment but may be performed at different moments; nor are these sub-steps or stages necessarily performed sequentially, but may be performed in turn or alternately with other steps or with at least some of the sub-steps or stages of other steps.

In one embodiment, as shown in FIG. 8, there is provided a game data processing system comprising:

The game data processing device 802 is configured to send a current game environment data acquisition request to the first server, where the current game environment data acquisition request carries a current game identifier.

The first server 804 is configured to obtain, according to the current game environment data acquisition request, game environment data corresponding to the current game identifier from the game environment data corresponding to the plurality of games, and return the game environment data to the game data processing device.

The game data processing device 802 is further configured to receive the game environment data, perform data type conversion on the game environment data to obtain game training interaction data in a standard data format, and send the game training interaction data to the second server.

The second server 806 is configured to receive the game training interaction data, train the neural network model according to the game training interaction data, obtain a current game action and a current game reward value corresponding to the current game output by the neural network model, stop training the neural network model when the current game reward value reaches a preset game reward threshold, output the current game action, and generate a game model file corresponding to the current game.
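
How the first server might select environment data by the game identifier carried in the request can be sketched as follows. The request layout (a dictionary with `type` and `game_id` keys), the class name `FirstServer` and the in-memory storage are all assumptions for illustration; the description does not specify the message format or the preset communication protocol.

```python
from typing import Dict, List


class FirstServer:
    """Hypothetical first server holding environment data for several games."""

    def __init__(self, data_by_game: Dict[str, List[float]]) -> None:
        self._data_by_game = data_by_game

    def handle_request(self, request: Dict[str, str]) -> List[float]:
        """Return the environment data matching the game identifier in the request."""
        game_id = request["game_id"]
        if game_id not in self._data_by_game:
            raise KeyError(f"unknown game identifier: {game_id}")
        return self._data_by_game[game_id]


def build_request(game_id: str) -> Dict[str, str]:
    """Build a current game environment data acquisition request (assumed layout)."""
    return {"type": "get_environment_data", "game_id": game_id}


server = FirstServer({"balance_ball": [0.0] * 6, "table_tennis": [0.0] * 8})
print(len(server.handle_request(build_request("table_tennis"))))  # 8
```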

In one embodiment, as shown in FIG. 9, there is provided a game data processing apparatus 900 comprising: an acquisition module 902, a conversion module 904, and a transmission module 906, wherein:

The acquisition module 902 is configured to obtain game environment data corresponding to a current game from a first server, where the first server is configured to collect game environment data corresponding to a plurality of games, and the current game is any one of the plurality of games.

The conversion module 904 is configured to perform data type conversion on the game environment data to obtain game training interaction data in a standard data format.

The sending module 906 is configured to send the game training interaction data to the second server, so that the second server trains the neural network model according to the game training interaction data, obtains a current game action and a current game reward value corresponding to the current game output by the neural network model, stops training the neural network model when the current game reward value reaches a preset game reward threshold, outputs the current game action, and generates a game model file corresponding to the current game.

In one embodiment, the game data processing device 900 receives a game environment data acquisition instruction sent by the second server, where the instruction is triggered and generated when the current game reward value does not reach the preset game reward threshold; acquires, according to the instruction, game environment data corresponding to the next game scene of the current game from the first server; and returns to the conversion module 904 to perform data type conversion on the game environment data to obtain game training interaction data in a standard data format. This is repeated until the current game reward value output by the neural network model in the second server reaches the preset game reward threshold, at which point training of the neural network model is stopped.

In one embodiment, the game data processing apparatus 900 receives an initialization parameter sent by the first server when a connection is successfully established with the first server through a predetermined communication protocol, sends the initialization parameter to the second server based on the predetermined communication protocol, so that the second server sets the neural network model to a corresponding initial training interaction parameter according to the initialization parameter, obtains an initialized neural network model, receives a current game reset instruction sent by the second server, and sends the current game reset instruction to the first server, so that the first server resets a game scene of the current game according to the current game reset instruction.

In one embodiment, the conversion module 904 extracts the game environment key parameters from the game environment data, obtains a corresponding game environment data matrix according to the game environment key parameters and the preset matrix dimension information, and generates game training interaction data according to the game environment key parameters and the game environment data matrix.

In one embodiment, the conversion module 904 determines a corresponding first information type according to the game environment key parameter, determines a corresponding second information type according to the game environment data matrix, transmits the game environment key parameter to a standard parameter with a data type corresponding to the first information type as a structural body to obtain first game training interaction data, transmits the game environment data matrix to a standard parameter with a data type corresponding to the second information type as a structural body to obtain second game training interaction data, and obtains game training interaction data according to the first game training interaction data and the second game training interaction data.

In one embodiment, the game data processing device 900 obtains the target game environment data corresponding to the target game from the first server, performs standardized data format conversion on the target game environment data to obtain corresponding target game interaction data, sends the target game interaction data to the second server, so that the second server calls the matched target game model file according to the target game interaction data, predicts the target game action based on the target game interaction data through the target neural network model corresponding to the target game model file, and sends the target game action returned by the second server to the first server, so that the first server controls the target object in the target game to execute the next action according to the target game action.

The specific limitations of the game data processing device can be found in the limitations of the game data processing method above and are not repeated here. Each module in the above game data processing device may be implemented in whole or in part by software, hardware, or a combination thereof. The above modules may be embedded in or independent of a processor in the computer device in hardware form, or may be stored in a memory in the computer device in software form, so that the processor can call and execute the operations corresponding to the above modules.
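
As one possible software form of these modules, the acquisition, conversion and sending modules can be written as small classes composed by the device. This is a sketch only; the class and method names, the callable-based transport, and the key ordering used for conversion are assumptions and are not taken from the description.

```python
from typing import Callable, Dict, List, Sequence


class AcquisitionModule:
    """Obtains raw game environment data (here via an injected callable)."""

    def __init__(self, fetch: Callable[[], Dict[str, float]]) -> None:
        self._fetch = fetch

    def acquire(self) -> Dict[str, float]:
        return self._fetch()


class ConversionModule:
    """Converts raw data into a fixed-order float vector (assumed standard format)."""

    def __init__(self, keys: Sequence[str]) -> None:
        self._keys = list(keys)

    def convert(self, raw: Dict[str, float]) -> List[float]:
        return [float(raw[k]) for k in self._keys]


class SendingModule:
    """Sends converted data onward (here via an injected callable)."""

    def __init__(self, send: Callable[[List[float]], None]) -> None:
        self._send = send

    def send(self, data: List[float]) -> None:
        self._send(data)


class GameDataProcessingDevice:
    """Composes the three modules into one acquire-convert-send pass."""

    def __init__(self, acquisition: AcquisitionModule, conversion: ConversionModule,
                 sending: SendingModule) -> None:
        self.acquisition = acquisition
        self.conversion = conversion
        self.sending = sending

    def process_once(self) -> List[float]:
        raw = self.acquisition.acquire()
        data = self.conversion.convert(raw)
        self.sending.send(data)
        return data


outbox: List[List[float]] = []
device = GameDataProcessingDevice(
    AcquisitionModule(lambda: {"x": 1, "y": 2}),
    ConversionModule(["x", "y"]),
    SendingModule(outbox.append),
)
print(device.process_once(), outbox)  # [1.0, 2.0] [[1.0, 2.0]]
```

Injecting the fetch and send callables keeps the modules independent of any particular transport, so the same composition could sit in front of different games.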

In one embodiment, a computer device is provided, which may be a server, and the internal structure of which may be as shown in fig. 10. The computer device includes a processor, a memory, a network interface, and a database connected by a system bus. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, computer programs, and a database. The internal memory provides an environment for the operation of the operating system and computer programs in the non-volatile storage media. The database of the computer device is for storing game training interaction data. The network interface of the computer device is used for communicating with an external terminal through a network connection. The computer program is executed by a processor to implement a game data processing method.

In one embodiment, a computer device is provided, which may be a terminal, and the internal structure thereof may be as shown in fig. 11. The computer device includes a processor, a memory, a network interface, a display screen, and an input device connected by a system bus. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program. The internal memory provides an environment for the operation of the operating system and computer programs in the non-volatile storage media. The network interface of the computer device is used for communicating with an external terminal through a network connection. The computer program is executed by a processor to implement a game data processing method. The display screen of the computer equipment can be a liquid crystal display screen or an electronic ink display screen, and the input device of the computer equipment can be a touch layer covered on the display screen, can also be keys, a track ball or a touch pad arranged on the shell of the computer equipment, and can also be an external keyboard, a touch pad or a mouse and the like.

It will be appreciated by those skilled in the art that the structures shown in fig. 10 or 11 are merely block diagrams of portions of structures associated with aspects of the application and are not intended to limit the computer device to which aspects of the application may be applied, and that a particular computer device may include more or fewer components than those shown, or may combine certain components, or may have a different arrangement of components.

In one embodiment, a computer device is provided, comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, the processor implementing the following steps when executing the computer program: acquiring game environment data corresponding to a current game from a first server, wherein the first server is used for collecting game environment data corresponding to a plurality of games, and the current game is any one of the plurality of games; performing data type conversion on the game environment data to obtain game training interaction data in a standard data format; and sending the game training interaction data to a second server, so that the second server trains the neural network model according to the game training interaction data to obtain a current game action and a current game reward value corresponding to the current game output by the neural network model, stops training the neural network model when the current game reward value reaches a preset game reward threshold, outputs the current game action, and generates a game model file corresponding to the current game.

In one embodiment, the processor when executing the computer program further performs the steps of: receiving a game environment data acquisition instruction sent by the second server, wherein the game environment data acquisition instruction is triggered and generated when the current game reward value does not reach the preset game reward threshold; acquiring, according to the game environment data acquisition instruction, game environment data corresponding to the next game scene of the current game from the first server; and returning to the step of performing data type conversion on the game environment data to obtain game training interaction data in a standard data format, until the current game reward value output by the neural network model in the second server reaches the preset game reward threshold, at which point training of the neural network model is stopped.

In one embodiment, the processor when executing the computer program further performs the steps of: when connection is successfully established with the first server through a preset communication protocol, receiving initialization parameters sent by the first server, sending the initialization parameters to the second server based on the preset communication protocol, so that the second server sets the neural network model as corresponding initial training interaction parameters according to the initialization parameters, obtains an initialized neural network model, receives a current game reset instruction sent by the second server, and sends the current game reset instruction to the first server, so that the first server resets a game scene of a current game according to the current game reset instruction.

In one embodiment, the processor when executing the computer program further performs the steps of: extracting the game environment key parameters from the game environment data, obtaining a corresponding game environment data matrix according to the game environment key parameters and the preset matrix dimension information, and generating game training interaction data according to the game environment key parameters and the game environment data matrix.

In one embodiment, the processor when executing the computer program further performs the steps of: determining a corresponding first information type according to the game environment key parameters, determining a corresponding second information type according to the game environment data matrix, transmitting the game environment key parameters into standard parameters with the data type corresponding to the first information type as a structural body to obtain first game training interaction data, transmitting the game environment data matrix into standard parameters with the data type corresponding to the second information type as a structural body to obtain second game training interaction data, and obtaining game training interaction data according to the first game training interaction data and the second game training interaction data.

In one embodiment, the processor when executing the computer program further performs the steps of: obtaining target game environment data corresponding to a target game from a first server, performing standardized data format conversion on the target game environment data to obtain corresponding target game interaction data, sending the target game interaction data to a second server, enabling the second server to call a matched target game model file according to the target game interaction data, predicting a target game action based on the target game interaction data through a target neural network model corresponding to the target game model file, and sending the target game action returned by the second server to the first server, so that the first server controls a target object in the target game to execute a next action according to the target game action.

In one embodiment, a computer readable storage medium is provided having a computer program stored thereon, which when executed by a processor, performs the steps of: acquiring game environment data corresponding to a current game from a first server, wherein the first server is used for acquiring the game environment data corresponding to a plurality of games, and the current game is any one of the plurality of games; performing data type conversion on the game environment data to obtain game training interaction data in a standard data format; and sending the game training interaction data to a second server so that the second server trains the neural network model according to the game training interaction data to obtain a current game action and a current game rewarding value corresponding to the current game output by the neural network model, stopping training the neural network model when the current game rewarding value reaches a preset game rewarding threshold value, outputting the current game action, and generating a game model file corresponding to the current game.

Those skilled in the art will appreciate that all or part of the above-described methods may be implemented by a computer program stored on a non-transitory computer readable storage medium, and that the computer program, when executed, may include the steps of the method embodiments described above. Any reference to memory, storage, database, or other medium used in the embodiments provided herein may include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM), among others.

The technical features of the above embodiments may be combined arbitrarily. For brevity, not all possible combinations of these technical features are described; however, as long as a combination of technical features involves no contradiction, it should be considered to fall within the scope of this description.

The above examples illustrate only a few embodiments of the application. They are described specifically and in detail, but are not thereby to be construed as limiting the scope of the application. It should be noted that those skilled in the art can make several variations and modifications without departing from the spirit of the application, all of which fall within the scope of the application. Accordingly, the scope of protection of the present application is to be determined by the appended claims.

Claims

1. A method of game data processing, the method comprising:

acquiring game environment data corresponding to a current game from a first server, wherein the first server is used for acquiring game environment data corresponding to a plurality of games, and the current game is any one of the plurality of games;

performing data type conversion on the game environment data to obtain game training interaction data in a standard data format;

and sending the game training interaction data to a second server, so that the second server trains a neural network model according to the game training interaction data to obtain a current game action and a current game reward value corresponding to the current game output by the neural network model, stops training the neural network model when the current game reward value reaches a preset game reward threshold, outputs the current game action, and generates a game model file corresponding to the current game, wherein each game is connected to the same set of neural network models for training.

2. The method according to claim 1, wherein the method further comprises:

receiving a game environment data acquisition instruction sent by the second server, wherein the game environment data acquisition instruction is triggered and generated when the current game rewarding value does not reach the preset game rewarding threshold value;

and acquiring, according to the game environment data acquisition instruction, game environment data corresponding to the next game scene of the current game from the first server, and returning to the step of performing data type conversion on the game environment data to obtain game training interaction data in a standard data format, until the current game reward value output by the neural network model in the second server reaches the preset game reward threshold, at which point training of the neural network model is stopped.

3. The method of claim 1, wherein before the obtaining the game environment data corresponding to the current game from the first server, further comprises:

receiving initialization parameters sent by the first server when connection is successfully established with the first server through a preset communication protocol;

transmitting the initialization parameters to the second server based on the preset communication protocol, so that the second server sets the neural network model as corresponding initial training interaction parameters according to the initialization parameters to obtain an initialized neural network model;

receiving a current game reset instruction sent by the second server, and sending the current game reset instruction to the first server, so that the first server resets the game scene of the current game according to the current game reset instruction.

4. The method of claim 1, wherein said performing data type conversion on said game environment data to obtain game training interaction data in a standard data format comprises:

extracting key parameters of the game environment from the game environment data;

obtaining a corresponding game environment data matrix according to the game environment key parameters and the preset matrix dimension information;

and generating game training interaction data according to the game environment key parameters and the game environment data matrix.

5. The method of claim 4, wherein generating game training interaction data from the game environment key parameters and the game environment data matrix comprises:

determining a corresponding first information type according to the game environment key parameters;

determining a corresponding second information type according to the game environment data matrix;

transmitting the game environment key parameters into standard parameters with the data type corresponding to the first information type as a structural body to obtain first game training interaction data;

and transmitting the game environment data matrix into standard parameters with the data type corresponding to the second information type as a structural body to obtain second game training interaction data, and obtaining game training interaction data according to the first game training interaction data and the second game training interaction data.

6. The method according to claim 1, wherein the method further comprises:

acquiring target game environment data corresponding to a target game from the first server;

performing standardized data format conversion on the target game environment data to obtain corresponding target game interaction data;

sending the target game interaction data to the second server, so that the second server calls a matched target game model file according to the target game interaction data, and predicting a target game action based on the target game interaction data through a target neural network model corresponding to the target game model file;

and sending the target game action returned by the second server to the first server so that the first server controls a target object in the target game to execute a next action according to the target game action.

7. A game data processing system, the system comprising:

the game data processing equipment is used for sending a current game environment data acquisition request to a first server, wherein the current game environment data acquisition request carries a current game identifier;

the first server is used for acquiring game environment data corresponding to the current game identifier from game environment data corresponding to a plurality of games according to the current game environment data acquisition request and returning the game environment data to the game data processing equipment;

the game data processing equipment is further used for receiving the game environment data, performing data type conversion on the game environment data to obtain game training interaction data in a standard data format, and sending the game training interaction data to a second server;

the second server is used for receiving the game training interaction data, training a neural network model according to the game training interaction data to obtain a current game action and a current game rewarding value corresponding to the current game output by the neural network model, stopping training the neural network model when the current game rewarding value reaches a preset game rewarding threshold value, outputting the current game action, and generating a game model file corresponding to the current game, wherein each game is connected into the same set of neural network model for training.

8. A game data processing device, the device comprising:

the acquisition module is used for acquiring game environment data corresponding to a current game from a first server, wherein the first server is used for collecting game environment data corresponding to a plurality of games, and the current game is any one of the plurality of games;

the conversion module is used for performing data type conversion on the game environment data to obtain game training interaction data in a standard data format;

the sending module is used for sending the game training interaction data to a second server so that the second server trains a neural network model according to the game training interaction data to obtain a current game action and a current game rewarding value corresponding to the current game output by the neural network model, stopping training the neural network model when the current game rewarding value reaches a preset game rewarding threshold value, outputting the current game action, and generating a game model file corresponding to the current game, wherein each game is connected with the same set of neural network model for training.

9. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor implements the steps of the method according to any one of claims 1 to 6 when the computer program is executed by the processor.

10. A computer readable storage medium, on which a computer program is stored, characterized in that the computer program, when being executed by a processor, implements the steps of the method of any of claims 1 to 6.