CN113599802A

CN113599802A - Data processing method, device and system

Info

Publication number: CN113599802A
Application number: CN202110837835.2A
Authority: CN
Inventors: 刘舟; 杨帆; 黎广璘
Original assignee: Anhui Sanqi Jiyu Network Technology Co ltd
Current assignee: Anhui Sanqi Jiyu Network Technology Co ltd
Priority date: 2021-07-23
Filing date: 2021-07-23
Publication date: 2021-11-05
Anticipated expiration: 2041-07-23
Also published as: CN113599802B

Abstract

The present application relates to the field of machine learning, and in particular to a data processing method, apparatus, and system. The method comprises: collecting environment data related to a target object when a first server is to control the target object to execute a next action, the target object being a proxy object in any scene; performing data type conversion on the environment data to obtain standard format data, and transmitting the standard format data to a second server, the standard format data comprising at least matrix dimension information and environment parameter information; and, when action data returned by the second server is received, controlling the target object to execute the next action according to the action data. The action data is obtained by the second server parsing the environment parameter information according to the matrix dimension information to obtain the series of environment parameters it contains, and performing inference on that series of environment parameters. Embodiments of the invention allow a single solution to be used to train AI in different scenes, thereby reducing the development cost of AI.

Description

Data processing method, device and system

Technical Field

The present application relates to the field of machine learning, and in particular, to a data processing method, apparatus, and system.

Background

In game scenes, players are frequently disconnected because of factors beyond their control, such as network failures and device crashes, which degrades the game experience of other players. This is especially true for MOBA (Multiplayer Online Battle Arena) games, where an offline hosting AI (Artificial Intelligence) is needed to solve the problem.

Traditional offline hosting AI depends on developers manually writing behavior logic. AI developed with state machines, behavior trees, rule scripts, and the like often has simple behaviors that loop in a fixed pattern and cannot adjust its strategy to the situation at hand; for complex scenes, developing AI in the traditional way is often very costly. AI trained by reinforcement learning, by contrast, adapts flexibly to the environment, behaves in varied ways, responds to changing circumstances, and appears more realistic and human-like. Its training requires computing power such as GPUs and CPUs but no human intervention, so research and development costs can be reduced by comparison.

However, when AI is currently trained by reinforcement learning, the training solution can only be developed case by case. If AI must be trained for different game scenes, a corresponding solution has to be developed for the requirements of each scene, for example a communication standard code library, a serialization scheme for the communication entities defined by the training interaction, and a standard code library for the AI training end, which increases the development cost of AI.

Disclosure of Invention

In view of the above defects or shortcomings, the invention provides a data processing method, device, and system.

The present invention provides a data processing method according to a first aspect, which in one embodiment is applied to a first server for controlling proxy objects in a plurality of scenes to perform actions; the method comprises the following steps:

collecting environment data related to a target object when the first server is to control the target object to execute a next action; the target object refers to a proxy object in any scene;

performing data type conversion on the environment data to obtain standard format data, and transmitting the standard format data to a second server; the standard format data at least comprises matrix dimension information and environment parameter information;

when action data returned by the second server are received, controlling the target object to execute the next action according to the action data; the action data is obtained by the second server analyzing the environment parameter information according to the matrix dimension information to obtain a series of environment parameters included in the environment parameter information and deducing according to the series of environment parameters.

In one embodiment, the action data is obtained by the second server analyzing the environment parameter information according to the matrix dimension information to obtain a series of environment parameters included in the environment parameter information, converting the series of environment parameters into model input data meeting the input data format requirement of the target model by using an analysis tool, and performing inference based on the model input data by using the target model; the target model is used to infer the next action of the target object.

In one embodiment, the step of transmitting the standard format data to the second server comprises:

serializing the standard format data into binary data;

transmitting the binary data to the second server based on a predetermined communication protocol; the predetermined communication protocol is the only communication protocol adopted by the communication between the first server and the second server;

correspondingly, after receiving the binary data, the second server performs deserialization on the binary data to obtain data in a standard format.

In one embodiment, the environment data includes a plurality of parameters; the step of performing data type conversion on the environment data to obtain standard format data comprises:

determining information types corresponding to all parameters in the environment data; the information type at least comprises a matrix dimension and an environment parameter;

passing each parameter in the environment data into the standard parameter corresponding to its information type to obtain the standard format data, the data type of each standard parameter being a struct; the number of standard parameters is the same as the number of information types.

The present invention provides a data processing method according to a second aspect, which in one embodiment is applied to a second server; the method comprises the following steps:

acquiring standard format data from a first server; the standard format data at least comprises matrix dimension information and environment parameter information; the standard format data is obtained by acquiring environment data related to a target object and performing data type conversion on the environment data when a first server controls the target object to execute a next action, the first server is used for controlling proxy objects in a plurality of scenes to execute the action, and the target object refers to a proxy object in any scene;

analyzing the environment parameter information according to the matrix dimension information to obtain a series of environment parameters included in the environment parameter information, and deducing to obtain action data according to the series of environment parameters;

and transmitting the action data to the first server, so that the first server controls the target object to execute the next action according to the action data when receiving the action data returned by the second server.

In one embodiment, the step of inferring action data from the series of environment parameters comprises:

converting the series of environmental parameters into model input data meeting the input data format requirements of the target model by using an analysis tool; the target model is used for deducing the next action of the target object;

inferring the action data based on the model input data using the target model.

In one embodiment, the step of obtaining the standard format data from the first server is preceded by:

receiving binary data transmitted by a first server based on a predetermined communication protocol;

deserializing the binary data to obtain data in a standard format; the predetermined communication protocol is the only communication protocol employed for communication between the first server and the second server.

The present invention provides according to a third aspect a system comprising, in one embodiment, a first server for controlling proxy objects in a plurality of scenarios to perform an action;

when the first server is to control a target object to execute a next action, the first server collects environment data related to the target object, the target object being a proxy object in any scene; performs data type conversion on the environment data to obtain standard format data; and transmits the standard format data to a second server; the standard format data at least comprises matrix dimension information and environment parameter information;

the second server analyzes the environment parameter information according to the matrix dimension information to obtain a series of environment parameters included by the environment parameter information, deduces action data according to the series of environment parameters, and transmits the action data to the first server;

and when the first server receives the action data returned by the second server, controlling the target object to execute the next action according to the action data.

The present invention provides, in accordance with a fourth aspect, a data processing apparatus for, in one embodiment, controlling proxy objects in a plurality of scenes to perform actions; the device comprises:

the environment data acquisition module is used for acquiring environment data related to the target object when the target object is controlled to execute the next action; the target object refers to a proxy object in any scene;

the data processing module is used for converting the data type of the environment data to obtain standard format data and transmitting the standard format data to the second server; the standard format data at least comprises matrix dimension information and environment parameter information;

the control module is used for controlling the target object to execute the next action according to the action data when the action data returned by the second server are received; the action data is obtained by the second server analyzing the environment parameter information according to the matrix dimension information to obtain a series of environment parameters included in the environment parameter information and deducing according to the series of environment parameters.

The present invention provides according to a fifth aspect a data processing apparatus, which in one embodiment comprises:

the data acquisition module is used for acquiring standard format data from the first server; the standard format data at least comprises matrix dimension information and environment parameter information; the standard format data is obtained by acquiring environment data related to a target object and performing data type conversion on the environment data when a first server controls the target object to execute a next action, the first server is used for controlling proxy objects in a plurality of scenes to execute the action, and the target object refers to a proxy object in any scene;

the data processing module is used for analyzing the environment parameter information according to the matrix dimension information to obtain a series of environment parameters included by the environment parameter information, and deducing action data according to the series of environment parameters;

and the transmission module is used for transmitting the action data to the first server, so that the first server controls the target object to execute the next action according to the action data when receiving the action data returned by the second server.

In the embodiments of the invention, when the first server is to control a target object, that is, a proxy object of any one of a plurality of scenes, to execute a next action, it collects the environment data related to the target object, performs data type conversion on the environment data to obtain standard format data, and then transmits the standard format data to a second server; the standard format data at least comprises matrix dimension information and environment parameter information. After receiving the standard format data, the second server parses the environment parameter information according to the matrix dimension information to obtain the series of environment parameters included in the environment parameter information, infers action data from that series of environment parameters, and returns the action data to the first server.

Drawings

FIG. 1 is a diagram of an application environment of a data processing method in one embodiment;

FIG. 2 is a flow diagram illustrating a data processing method according to one embodiment;

FIG. 3 is a flow diagram illustrating data type conversion performed by the first server in one embodiment;

FIG. 4 is a flow chart illustrating a data processing method according to another embodiment;

FIG. 5 is a flow diagram that illustrates interaction between a first server and a second server in a system, according to one embodiment;

FIG. 6 is a block diagram of a data processing apparatus according to one embodiment;

FIG. 7 is a block diagram showing a configuration of a data processing apparatus according to another embodiment;

FIG. 8 is a diagram illustrating an internal structure of a computer device according to an embodiment.

Detailed Description

In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.

The invention provides a data processing method. In this embodiment, the data processing method can be applied to the application environment shown in FIG. 1. The first server controls agent objects (agents) in different scenes. When the first server is to control the agent object in any scene to perform its next action, it collects environment data related to the agent object, converts the environment data into standard format data, and sends the standard format data to the second server. After receiving the standard format data, the second server parses the environment parameter information according to the matrix dimension information in the standard format data to obtain the series of environment parameters contained in the environment parameter information, uses a neural network to infer action data from that series of environment parameters, and returns the action data to the first server. From the action data, the first server knows how to control the agent object in its next step.
In this embodiment, the environment parameters of different scenes and their format are abstracted: the first server converts the environment parameters of any scene into standard format data before transmitting them to the second server. A separate communication framework therefore does not need to be developed for the characteristics of the environment parameters in each scene (such as the number and types of parameters). The relevant personnel only need to set matrix dimensions that match the environment parameters of the scene; the second server can then parse the environment parameter information according to the matrix dimension information in the standard format data, obtain the series of environment parameters it contains, and perform action inference on them. The AI described in this embodiment may be game AI, such as monster AI, NPC (non-player character) AI, PVP (player versus player) AI, PVE (player versus environment) AI, or pet/follower AI, used to control a virtual object in a game scene to perform a corresponding action; it may be robot AI used to control a robot to perform a corresponding action; or it may be AI in other fields.

Further, the relevant process of the first server using the second server to infer the next action of the proxy object in any scenario may occur in a training scenario or an inference scenario of the AI (i.e., a scenario in which the trained AI is used for inference), that is, the data processing method provided by the embodiment may be applied in the training scenario or the inference scenario of the AI.

For example, in one embodiment, in a training scenario of an AI (hereinafter, also referred to as a model), the second server is used for training the AI of the agent object of each scenario, wherein the second server may train each AI in sequence or simultaneously, for example, in the same reinforcement learning training period, the second server trains each AI using the same neural network (which may also be different neural networks). The first server converts environment data related to the proxy objects in different scenes into standard format data and then sends the standard format data to the second server, and the second server uses the corresponding neural network to deduce according to the standard format data related to the proxy objects in different scenes. The neural network refers to a simulated neural network model constructed by using a reinforcement learning framework in reinforcement learning.

The process of the first server and the second server interacting to train different AIs is similar, and the following describes the process of the first server and the second server interacting to train one AI:

First, a connection is established between the first server and the second server. The first server then sends initialization parameters to the second server to instruct it to set the corresponding training interaction parameters for the trainer. After that, the interaction process described below is repeated between the first server and the second server to train the AI until a predetermined training end condition is met, for example when the reward (reward value) or max step (maximum number of training steps) reaches a set value. When the condition is met, the second server saves the training result of the AI as a model file. The AIs of different scenes correspond to different model files. The training result may be saved in various formats common in the industry, such as ".onnx", ".pth", and ".ckpt", or uniformly in a single format such as ".onnx"; this can be adjusted flexibly according to actual needs and is not limited in this embodiment. Furthermore, this interaction process is repeated millions of times or more when training an AI. For example, the well-known Go AI AlphaGo was trained by playing billions of games.

The initialization parameters may include a seed value, a communication version, a package version, and training capability parameters. The training capability parameters may include basic reinforcement learning capability, a concatenated PNG observation setting, compressed channel mapping, hybrid actions, training analytics, a variable-length observation setting, and whether multiple agent groups exist. The initialization parameters serve to standardize the first server and the second server: the seed value, communication version, package version, and similar information form the basis of the training interaction between the two servers and are used mainly to verify the legitimacy of both parties, while the training capability parameters agree on some basic configuration of the training end (in the AI training scenario, the second server may also be called the training end) so that the training end communicates in the agreed manner. The training interaction parameters mentioned above may include an environment parameter configuration, an action space configuration, and the like. The action space configuration may include configuration information such as the action dimension, the action values, a representation name for distinguishing different model configurations, and a training agent group ID for grouping different agent objects of the first server.
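As a minimal sketch of how such initialization parameters might be represented and checked, the Python below groups the listed items into two data classes. The names `InitParams`, `TrainingCapabilities`, and `validate_init`, every field name, the version strings, and the rule "accept only a supported communication version" are assumptions made for illustration; this embodiment does not specify them.

```python
from dataclasses import dataclass, field


@dataclass
class TrainingCapabilities:
    # Field names are illustrative stand-ins for the capabilities listed in the text.
    base_rl: bool = True
    concatenated_png_observations: bool = False
    compressed_channel_mapping: bool = False
    hybrid_actions: bool = False
    training_analytics: bool = False
    variable_length_observation: bool = False
    multi_agent_groups: bool = False


@dataclass
class InitParams:
    seed: int
    communication_version: str
    package_version: str
    capabilities: TrainingCapabilities = field(default_factory=TrainingCapabilities)


def validate_init(params: InitParams, supported_versions: set) -> bool:
    """Accept the peer only if its communication version is supported (assumed rule)."""
    return params.communication_version in supported_versions


# Hypothetical values sent by the first server when the connection is set up.
params = InitParams(seed=42, communication_version="1.0.0", package_version="0.1.0")
accepted = validate_init(params, {"1.0.0"})
rejected = validate_init(params, {"2.0.0"})
```

Keeping the capability flags in their own structure lets the training end read its basic configuration separately from the fields used to verify the peer.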

The above interaction process is as follows:

Step 1: when the first server is to control an agent object (namely an agent) in a target scene to execute the next action, the first server collects environment data related to the agent object, converts the environment data into standard format data, and sends the standard format data to the second server;

Step 2: the second server uses the neural network to infer action data from the standard format data and sends the action data to the first server;

Step 3: the first server controls the agent object to execute the next action according to the action data.

In each iteration of the interaction, after receiving the standard format data the second server extracts at least a number of environment parameters from it (the types and number of environment parameters may differ between scenes) and calculates the current reward from them. If the current reward reaches the preset maximum reward, or the current number of training steps reaches max step, training is judged complete and the training result is exported. Otherwise, the environment parameters are converted into model input data, and the neural network infers action data from the model input data and returns it to the first server, so that the first server can control the agent object to execute the next action according to the action data.
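The per-iteration decision on the second server can be sketched as follows. This is an illustration only: `training_step`, the callables passed to it, and the toy reward and policy functions are hypothetical, and returning `None` is simply this sketch's way of signalling that training is judged complete.

```python
from typing import Callable, List, Optional


def training_step(
    env_params: List[float],
    step_count: int,
    compute_reward: Callable[[List[float]], float],
    infer_action: Callable[[List[float]], List[float]],
    max_reward: float,
    max_step: int,
) -> Optional[List[float]]:
    """Return action data, or None when training is judged complete."""
    reward = compute_reward(env_params)
    if reward >= max_reward or step_count >= max_step:
        return None  # the caller would export the training result here
    model_input = [float(p) for p in env_params]  # stand-in for model input conversion
    return infer_action(model_input)


# Toy stand-ins: the reward is the first parameter, the "network" negates its input.
def reward_fn(params: List[float]) -> float:
    return params[0]


def policy_fn(model_input: List[float]) -> List[float]:
    return [-v for v in model_input]


action = training_step([0.2, 1.0], step_count=3, compute_reward=reward_fn,
                       infer_action=policy_fn, max_reward=1.0, max_step=100)
finished = training_step([1.5, 1.0], step_count=3, compute_reward=reward_fn,
                         infer_action=policy_fn, max_reward=1.0, max_step=100)
```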

In another embodiment, in an AI inference scenario, trained AIs for multiple scenes may be deployed on the second server. When the first server needs to control the agent object of any scene to perform a next action, it converts the environment data related to that agent object into standard format data and sends the standard format data to the second server. The second server extracts the environment parameters from the standard format data, converts the extracted environment parameters into model input data, and then invokes the model file corresponding to the agent object to infer action data from the model input data and return the action data to the first server, so that the first server can control the agent object to perform the next action according to the action data.
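One way to picture the inference scenario is a registry that maps each scene to its trained model. The sketch below is hypothetical throughout: `ModelRegistry`, the scene identifiers, and the lambda "models" are placeholders, and loading a real model file is left out.

```python
from typing import Callable, Dict, List

Model = Callable[[List[float]], List[float]]


class ModelRegistry:
    """Maps a scene identifier to the trained model deployed for that scene."""

    def __init__(self) -> None:
        self._models: Dict[str, Model] = {}

    def register(self, scene_id: str, model: Model) -> None:
        self._models[scene_id] = model

    def infer(self, scene_id: str, env_params: List[float]) -> List[float]:
        if scene_id not in self._models:
            raise KeyError(f"no model deployed for scene {scene_id!r}")
        model_input = [float(p) for p in env_params]  # stand-in for model input conversion
        return self._models[scene_id](model_input)


registry = ModelRegistry()
registry.register("game_a", lambda x: [sum(x)])  # placeholder for a loaded model file
registry.register("game_b", lambda x: [max(x)])

action_a = registry.infer("game_a", [1.0, 2.0, 3.0])
action_b = registry.infer("game_b", [1.0, 2.0, 3.0])
```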

The first server and the second server may be implemented by independent servers or a server cluster composed of a plurality of servers.

Example one

In this embodiment, a data processing method provided by the present invention includes the steps shown in fig. 2, and the following description will take the application of this method to the first server in fig. 1 as an example.

S110: environmental data associated with the target object is collected while the target object is controlled to perform the next action.

The first server is used for controlling proxy objects in a plurality of scenes to execute actions, and the target object refers to a proxy object in any scene. The multiple scenes refer to multiple scenes in the same field (such as a game field, a robot field, and the like).

For example, in the field of gaming, the plurality of scenes may refer to a plurality of games, such as game A, game B, and game C. The proxy objects in the plurality of scenes may then be monster A in game A, monster B in game B, and monster C in game C (each monster being a proxy object), which perform each action (such as sprinting, jumping, lying down, or attacking forward, backward, left, or right) under the control of the first server. When the first server is to control any monster to perform its next action, it can collect the environment data related to that monster. Note that the environment data to be collected for a monster may differ between games. Suppose, for instance, that the environment data collected for monster A in game A includes at least the game identifier of game A, the monster identifier of monster A, and the attack power and position of monster A, while the environment data collected for monster B in game B includes at least the game identifier of game B, the monster identifier of monster B, and the health, attack power, and position of monster B.

For example, in the robot field, a scene may be an application scene of a robot, and the proxy objects in the multiple scenes may refer to robots in multiple application scenes, such as robot A in a home scene and robot B in an office scene. The first server can control each robot to execute actions (for example, emitting various sounds, shaking its head, kicking its legs, and the like), and the first server can collect the current environment data of each robot (for example, volume, light intensity, voice, and other data collected by the sensors built into the robot and uploaded by it).

S120: performing data type conversion on the environment data to obtain standard format data, and transmitting the standard format data to a second server, wherein the standard format data at least comprises matrix dimension information and environment parameter information.

The first server can use a general communication framework to convert the collected environment data into standard format data and transmit it to the second server. The communication framework can be understood as a code implementation of a defined communication standard, which can be embedded in software code in the form of a software dependency library (also called a function dependency library), that is, an SDK (Software Development Kit).

In an embodiment, as shown in FIG. 3, the step of performing data type conversion on the environment data to obtain standard format data includes:

S121: determining the information type corresponding to each parameter in the environment data.

S122: passing each parameter in the environment data into the standard parameter corresponding to its information type to obtain the standard format data, the data type of each standard parameter being a struct.

The environment data comprises a plurality of parameters, and the number of parameters may differ between scenes. The information types at least comprise the matrix dimension and the environment parameter: the matrix dimension information is the standard parameter whose information type is the matrix dimension, and the environment parameter information is the standard parameter whose information type is the environment parameter. Note that the kinds and number of information types may be defined according to the requirements of the actual scene. The standard parameters correspond one to one with the information types, that is, the number of standard parameters equals the number of information types, and during conversion the first server stores all parameters of the environment data that belong to the same information type in one standard parameter. The data type of a standard parameter is a structure (i.e., struct), so a standard parameter can store a group of data of different or identical types.

For example, suppose AI must be trained for multiple games. The basic data types of each game in the training scenario and in the inference scenario can be identified first, and their common points can then be found according to the usage scenario in order to define the information types. For instance, the defined information types may be the environment parameter and the matrix dimension, with FloatData and Observation as the corresponding standard parameters, where:

FloatData is a collection of Float-type data. Because the parameters of a game environment can all be represented as floating point numbers, FloatData can be defined as a variable-length collection of Float values. Different games differ only in the number of game parameters, all of which can be stored in FloatData in preparation for subsequent transmission;

Observation contains attributes such as a collection data type attribute of String type and a variable-dimension attribute of tuple type. Observation is used to store the matrix dimensions of the environment parameters, and different games can set its value according to their own environment parameters.

In another example, in addition to the above two information types, the defined information types may include the vector sensor and proxy object information, with VectorSensor and AgentInfo as the corresponding standard parameters, where:

VectorSensor contains attributes such as a data collection of Double type, a proxy object name of String type, and an Observation structure;

AgentInfo contains attributes such as a reward value of Float type, a done flag (marking the end of the current round) of Boolean type, an id of String type, a groupId of String type, and a maxStepReached flag of Boolean type.
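The attributes listed for VectorSensor and AgentInfo could be modelled as plain data classes. The following is a sketch, not the embodiment's definition: the Python field names (`data_type`, `shape`, `agent_name`, `group_id`, `max_step_reached`) and the sample values are assumptions chosen to mirror the attributes named above.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Observation:
    data_type: str          # String-type attribute describing the collection
    shape: Tuple[int, ...]  # variable-dimension tuple, e.g. (1, 3)


@dataclass
class VectorSensor:
    data: List[float]       # a Python float is a double-precision value
    agent_name: str
    observation: Observation


@dataclass
class AgentInfo:
    reward: float
    done: bool              # marks the end of the current round
    id: str
    group_id: str
    max_step_reached: bool


# Hypothetical sample values for one agent object.
sensor = VectorSensor(
    data=[0.5, -1.0, 2.0],
    agent_name="monster_a",
    observation=Observation(data_type="float", shape=(1, 3)),
)
info = AgentInfo(reward=0.0, done=False, id="agent-1", group_id="group-1",
                 max_step_reached=False)
```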

During conversion, the first server passes each parameter in the environment data into the standard parameter corresponding to its information type, the data type of each standard parameter being a struct. The conversion process is described below by taking as an example the conversion of the parameters belonging to the two information types, environment parameter and matrix dimension, into the standard parameters FloatData and Observation (the conversion into other standard parameters works in the same way).

For example, suppose game scene A is a balance ball game in which the environment parameters to be transmitted are 6 values such as the coordinates and direction of the balance ball, so the data matrix has dimensions 1×6. During conversion, the 6 environment parameters are passed into the standard parameter FloatData and the matrix dimensions 1×6 are passed into the standard parameter Observation. As another example, if game scene B is ping-pong, the environment parameters to be transmitted are 8 values such as the coordinates and speed of the ball and the positions of both players, so the data matrix has dimensions 1×8; during conversion, the 8 environment parameters are passed into FloatData and the matrix dimensions 1×8 into Observation.
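The two conversions just described can be sketched with one function that works for both games. The class `StandardFormatData`, the function `to_standard_format`, and the individual parameter names and values in the two dictionaries are hypothetical; the text only states that there are 6 and 8 parameters respectively.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class FloatData:
    values: List[float]     # variable-length collection of float values


@dataclass
class Observation:
    shape: Tuple[int, ...]  # matrix dimensions of the environment parameters


@dataclass
class StandardFormatData:
    float_data: FloatData
    observation: Observation


def to_standard_format(env_data: Dict[str, float]) -> StandardFormatData:
    """Flatten scene-specific parameters into FloatData and record a 1 x N shape."""
    values = [float(v) for v in env_data.values()]
    return StandardFormatData(FloatData(values), Observation((1, len(values))))


# Assumed breakdown of the 6 balance ball parameters and the 8 ping-pong parameters.
balance_ball = {"x": 0.1, "y": 0.5, "z": -0.2, "dir_x": 0.0, "dir_y": 1.0, "dir_z": 0.0}
ping_pong = {"ball_x": 0.3, "ball_y": 0.7, "ball_vx": -1.0, "ball_vy": 0.5,
             "p1_x": 0.0, "p1_y": 0.4, "p2_x": 1.0, "p2_y": 0.6}

ball_msg = to_standard_format(balance_ball)
pong_msg = to_standard_format(ping_pong)
```

Only the number of values and the recorded shape differ between the two games, which is why the same conversion code can serve both scenes.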

Further, the aforementioned general communication framework is also embedded in the second server. After receiving the standard format data sent by the first server through the general communication framework, the second server extracts the matrix dimension information and the environment parameter information from the standard format data, and analyzes the environment parameter information according to the matrix dimension information to obtain the series of environment parameters it contains (for example, the 6 environment parameters of the balance ball in the example above). The second server then infers the action data from the series of environment parameters and returns the action data to the first server.

When inferring from the series of environment parameters, the second server uses an analysis tool to convert the series of environment parameters into model input data that meets the input data format requirement of a target model, the target model being used for inferring the next action of the target object; the target model then infers based on the model input data to obtain the action data.
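The parsing and inference steps on the second server can be sketched as below. This is a sketch under assumptions, not the analysis tool or target model of the application: parse_environment, infer_action and toy_model are hypothetical names, the model input is assumed to be a two-dimensional list of rows, and the toy model is a placeholder for a trained model.

```python
from typing import Callable, List, Sequence

Matrix = List[List[float]]


def parse_environment(float_data: Sequence[float], observation: Sequence[int]) -> Matrix:
    """Use the matrix dimension to split the flat parameter list into rows."""
    rows, cols = observation
    if rows * cols != len(float_data):
        raise ValueError("environment parameter count does not match the matrix dimension")
    return [list(float_data[r * cols:(r + 1) * cols]) for r in range(rows)]


def infer_action(float_data: Sequence[float], observation: Sequence[int],
                 model: Callable[[Matrix], List[float]]) -> List[float]:
    """Convert the parsed parameters into model input, then let the model infer."""
    model_input = parse_environment(float_data, observation)
    return model(model_input)


def toy_model(batch: Matrix) -> List[float]:
    """Hypothetical stand-in for the target model: one action value per row."""
    return [1.0 if sum(row) > 0 else -1.0 for row in batch]


action = infer_action([0.5, -0.2, 0.1, 0.0, 0.0, 0.0], [1, 6], toy_model)
```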

Further, in order to better transmit data in the network, the step of the first server transmitting the standard format data to the second server comprises: serializing the standard format data into binary data; transmitting the binary data to the second server based on a predetermined communication protocol; the predetermined communication protocol is the only communication protocol employed for communication between the first server and the second server.

Protobuf may be used for the serialization and deserialization. Serialization refers to converting structured data or objects into a format that can be stored and transmitted (e.g., over a network), while ensuring that the serialized result can later be reconstructed into the original structured data or objects (possibly in another computing environment).

Accordingly, in this case the second server receives not the standard format data itself but the binary data obtained by serializing it, so after receiving the binary data the second server needs to deserialize it to recover the standard format data.
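The serialize-then-deserialize round trip can be illustrated as follows. Note the substitution: the text names protobuf, but protobuf requires a schema file and generated classes, so this sketch uses Python's standard struct module instead. The byte layout (two little-endian counts, then the dimensions, then the values) is an assumption made for illustration and is not the protobuf wire format.

```python
import struct
from typing import List, Sequence, Tuple

HEADER = "<II"  # assumed header: number of dimensions, number of values


def serialize(float_data: Sequence[float], observation: Sequence[int]) -> bytes:
    """Turn the two standard parameters into one binary payload."""
    header = struct.pack(HEADER, len(observation), len(float_data))
    dims = struct.pack(f"<{len(observation)}I", *observation)
    values = struct.pack(f"<{len(float_data)}d", *float_data)
    return header + dims + values


def deserialize(payload: bytes) -> Tuple[List[float], List[int]]:
    """Rebuild the two standard parameters from the binary payload."""
    n_dims, n_values = struct.unpack_from(HEADER, payload, 0)
    offset = struct.calcsize(HEADER)
    observation = list(struct.unpack_from(f"<{n_dims}I", payload, offset))
    offset += struct.calcsize(f"<{n_dims}I")
    float_data = list(struct.unpack_from(f"<{n_values}d", payload, offset))
    return float_data, observation


original = ([0.1, 0.2, 0.3, 0.0, 1.0, 0.0], [1, 6])
payload = serialize(*original)
restored = deserialize(payload)
```

Because the counts travel inside the payload, the same pair of functions handles a 1X6 scene and a 1X8 scene without any per-scene code.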

S130: and when the action data returned by the second server is received, controlling the target object to execute the next action according to the action data.

When the first server receives the action data returned by the second server, the target object can be controlled to execute the next action according to the action data.

In this embodiment, the first server converts the environment data into standard format data before transmitting it to the second server over the network. Whatever the scene, the environment data of the proxy object can ultimately be converted into standard format data composed of several standard parameters, so there is no need to develop (or define) a separate communication protocol for each scene; a single communication protocol suffices for transmitting the standard format data. In the prior art, by contrast, if the environment data related to the proxy object of scene A has 8 parameters, a communication protocol a must be developed to transmit those 8 parameters; if the environment data related to the proxy object of scene B has 6 parameters, communication protocol a cannot be reused, and a communication protocol b must be developed to transmit those 6 parameters. The present scheme instead develops one communication protocol c for transmitting 4 standard parameters; however many parameters the environment data contains, it is ultimately converted into these 4 standard parameters, so the environment data of proxy objects in different scenes can all be transmitted through communication protocol c. Previously, communication protocol a had to be used when transmitting the environment data of the proxy object of scene A and communication protocol b when transmitting that of scene B, and communication adaptation was required to switch between the two, which incurred a large development cost.

Example two

Based on the same inventive concept as the first embodiment, the present invention provides another data processing method, which is applied to the second server shown in fig. 1 in this embodiment. As shown in fig. 4, the method includes:

S210: obtaining standard format data from a first server.

The standard format data at least comprises matrix dimension information and environment parameter information. The standard format data is obtained by acquiring environment data related to the target object and performing data type conversion on the environment data when the first server controls the target object to execute the next action. The first server is used for controlling the proxy objects in a plurality of scenes to execute actions, and the target object refers to the proxy object in any scene.

S220: analyzing the environment parameter information according to the matrix dimension information to obtain a series of environment parameters included in the environment parameter information, and deducing to obtain action data according to the series of environment parameters;

S230: transmitting the action data to the first server, so that the first server controls the target object to execute the next action according to the action data when receiving the action data returned by the second server.

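In one embodiment, the step of the second server inferring the action data according to the series of environment parameters comprises:

converting the series of environment parameters into model input data meeting the input data format requirement of a target model by using an analysis tool; the target model is used for inferring the next action of the target object;

inferring based on the model input data by using the target model to obtain the action data.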

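In one embodiment, the step of the second server obtaining the standard format data from the first server is preceded by:

receiving binary data transmitted by the first server based on a predetermined communication protocol; the predetermined communication protocol is the only communication protocol employed for communication between the first server and the second server;

deserializing the binary data to obtain the standard format data.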

The second server may implement the foregoing steps S210 and S220 through a general communication framework and perform deserialization on the binary data to obtain data in a standard format. For a description of the general communication framework, please refer to the related contents in the first embodiment.

For specific limitations of the first server and the second server in the data processing method provided in this embodiment, reference may be made to limitations of the data processing method in the first embodiment, and details are not described here again.

Example three

Based on the same inventive concept as the first embodiment, the present invention further provides a system, which in this embodiment includes the first server and the second server shown in fig. 1, wherein the first server is configured to control the proxy objects in the plurality of scenes to perform the action. The interaction between the first server and the second server is shown in fig. 5, and includes:

S310: when controlling a target object to execute a next action, the first server collects environment data related to the target object, the target object being a proxy object in any scene; the first server performs data type conversion on the environment data to obtain standard format data, and transmits the standard format data to the second server; the standard format data at least comprises matrix dimension information and environment parameter information;

S320: the second server analyzes the environment parameter information according to the matrix dimension information to obtain a series of environment parameters included in the environment parameter information, infers action data according to the series of environment parameters, and transmits the action data to the first server;

S330: when the first server receives the action data returned by the second server, it controls the target object to execute the next action according to the action data.
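The interaction in steps S310 to S330 can be sketched end to end as below. This is a minimal in-process sketch under several assumptions: JSON stands in for the wire format, a direct function call stands in for the network, the key "Action" and all function names are hypothetical, and the policy inside second_server_handle is a placeholder rather than a trained target model.

```python
import json
from typing import Callable, List, Sequence


def first_server_encode(env_params: Sequence[float], shape: Sequence[int]) -> bytes:
    """S310: convert environment data to standard format data and encode it."""
    message = {"FloatData": [float(v) for v in env_params], "Observation": list(shape)}
    return json.dumps(message).encode("utf-8")


def second_server_handle(payload: bytes) -> bytes:
    """S320: parse FloatData by Observation, infer an action, and encode the reply."""
    message = json.loads(payload.decode("utf-8"))
    rows, cols = message["Observation"]
    values = message["FloatData"]
    if rows * cols != len(values):
        raise ValueError("FloatData does not match Observation")
    matrix = [values[r * cols:(r + 1) * cols] for r in range(rows)]
    # Placeholder policy: act against the mean of the observed values.
    action = [-sum(row) / len(row) for row in matrix]
    return json.dumps({"Action": action}).encode("utf-8")


def first_server_step(env_params: Sequence[float], shape: Sequence[int],
                      transport: Callable[[bytes], bytes],
                      apply_action: Callable[[List[float]], None]) -> List[float]:
    """S310 then S330: send the observation, then apply the returned action."""
    reply = transport(first_server_encode(env_params, shape))
    action = json.loads(reply.decode("utf-8"))["Action"]
    apply_action(action)
    return action


applied: List[List[float]] = []
# Two scenes with different parameter counts share the same message layout.
first_server_step([1.0, 2.0, 3.0, 0.0, 0.0, 0.0], (1, 6), second_server_handle, applied.append)
first_server_step([0.5] * 8, (1, 8), second_server_handle, applied.append)
```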

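In one embodiment, the step of the first server transmitting the standard format data to the second server comprises:

the first server serializes the standard format data into binary data;

the first server transmits the binary data to the second server based on a predetermined communication protocol; the predetermined communication protocol is the only communication protocol employed for communication between the first server and the second server.

In one embodiment, the environment data includes a plurality of parameters; the step of the first server performing data type conversion on the environment data to obtain the standard format data comprises:

the first server determines the information type corresponding to each parameter in the environment data; the information types at least comprise a matrix dimension and an environment parameter;

the first server passes each parameter in the environment data into the standard parameter that corresponds to its information type and whose data type is a structure, to obtain the standard format data; the number of standard parameters is the same as the number of information types.

In one embodiment, the step of the second server inferring the action data according to the series of environment parameters comprises:

the second server converts the series of environment parameters into model input data meeting the input data format requirement of the target model by using an analysis tool; the target model is used for inferring the next action of the target object;

the second server infers based on the model input data by using the target model to obtain the action data.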

For specific limitations of the first server and the second server in the system provided in this embodiment, reference may be made to limitations of the data processing method in the first embodiment, which is not described herein again.

Figs. 2-5 are flow diagrams of the data processing method in the embodiments. It should be understood that although the steps in the flow charts of figs. 2-5 are shown in the order indicated by the arrows, they are not necessarily performed in that order. Unless explicitly stated otherwise, the steps are not strictly limited to the order shown and described, and may be performed in other orders. Moreover, at least some of the steps in figs. 2-5 may include multiple sub-steps or stages, which are not necessarily performed at the same time but may be performed at different times, and which are not necessarily performed sequentially but may be performed in turn or alternately with other steps or with at least some of the sub-steps or stages of other steps.

Example four

Based on the same inventive concept as the first embodiment, the invention also provides a data processing device, which is used for controlling the proxy objects in a plurality of scenes to execute actions in the embodiment; as shown in fig. 6, the apparatus includes:

an environment data collection module 110, configured to collect environment data related to the target object when the target object is controlled to perform a next action; the target object refers to a proxy object in any scene;

the data processing module 120 is configured to perform data type conversion on the environment data to obtain standard format data, and transmit the standard format data to the second server; the standard format data at least comprises matrix dimension information and environment parameter information;

the control module 130 is configured to, when receiving the action data returned by the second server, control the target object to execute a next action according to the action data; the action data is obtained by the second server analyzing the environment parameter information according to the matrix dimension information to obtain a series of environment parameters included in the environment parameter information and deducing according to the series of environment parameters.

In one embodiment, the data processing module comprises:

the serialization submodule is used for serializing the standard format data into binary data;

a transmission sub-module for transmitting the binary data to the second server based on a predetermined communication protocol; the predetermined communication protocol is the only communication protocol employed for communication between the first server and the second server.

In one embodiment, the environment data includes a plurality of parameters; the data processing module comprises:

the information type determining submodule is used for determining the information types corresponding to all the parameters in the environment data; the information type at least comprises a matrix dimension and an environment parameter;

the conversion submodule is used for transmitting each parameter in the environment data into a standard parameter of which the data type corresponding to the information type is a structural body, so as to obtain data in a standard format; the number of standard parameters is the same as the number of types of information.

For specific limitations of the data processing apparatus provided in this embodiment, reference may be made to limitations of the data processing method in the first embodiment, which is not described herein again. The various modules in the data processing apparatus described above may be implemented in whole or in part by software, hardware, and combinations thereof. The modules can be embedded in a hardware form or independent from a processor in the computer device, and can also be stored in a memory in the computer device in a software form, so that the processor can call and execute operations corresponding to the modules.

Example five

Based on the same inventive concept as the second embodiment, the present invention also provides a data processing apparatus, which, in one embodiment, as shown in fig. 7, includes:

a data obtaining module 210, configured to obtain standard format data from a first server; the standard format data at least comprises matrix dimension information and environment parameter information; the standard format data is obtained by acquiring environment data related to a target object and performing data type conversion on the environment data when a first server controls the target object to execute a next action, the first server is used for controlling proxy objects in a plurality of scenes to execute the action, and the target object refers to a proxy object in any scene;

the data processing module 220 is configured to analyze the environment parameter information according to the matrix dimension information to obtain a series of environment parameters included in the environment parameter information, and infer to obtain action data according to the series of environment parameters;

the transmission module 230 is configured to transmit the action data to the first server, so that the first server controls the target object to execute a next action according to the action data when receiving the action data returned by the second server.

In one embodiment, the data processing module comprises:

the conversion sub-module is used for converting the series of environmental parameters into model input data meeting the input data format requirement of the target model by using an analysis tool; the target model is used for deducing the next action of the target object;

and the inference submodule is used for inferring on the basis of the model input data by using the target model to obtain the action data.

In one embodiment, before obtaining the standard format data from the first server, the data obtaining module is further configured to receive binary data transmitted by the first server based on a predetermined communication protocol, and to deserialize the binary data to obtain the standard format data; the predetermined communication protocol is the only communication protocol employed for communication between the first server and the second server.

For specific limitations of the data processing apparatus provided in this embodiment, reference may be made to limitations of the data processing method in embodiment two, which is not described herein again. The various modules in the data processing apparatus described above may be implemented in whole or in part by software, hardware, and combinations thereof. The modules can be embedded in a hardware form or independent from a processor in the computer device, and can also be stored in a memory in the computer device in a software form, so that the processor can call and execute operations corresponding to the modules.

Example six

In the present embodiment, a computer device is provided, and its internal structure diagram may be as shown in fig. 8.

The computer device includes a processor, a memory, a network interface, and a database connected by a system bus. The processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device comprises a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, a computer program, and a database. The internal memory provides an environment for running the operating system and the computer program stored in the non-volatile storage medium. The database of the computer device is used for storing data such as environment data; for the specific data stored, reference may be made to the definitions in the method embodiments above. The network interface of the computer device is used for communicating with an external terminal through a network connection. The computer program, when executed by the processor, implements a data processing method.

Those skilled in the art will appreciate that the architecture shown in fig. 8 is merely a block diagram of some of the structures associated with the disclosed aspects and does not limit the computing devices to which the disclosed aspects apply; a particular computing device may include more or fewer components than those shown, may combine certain components, or may have a different arrangement of components.

The present embodiment further provides a computer device, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor, and when the processor executes the computer program, the steps included in the above-mentioned first embodiment or second embodiment are implemented.

Example seven

In this embodiment, a computer-readable storage medium is provided, on which a computer program is stored, which, when being executed by a processor, carries out the steps included in the above-mentioned first or second embodiment.

It will be understood by those skilled in the art that all or part of the processes of the method embodiments described above can be implemented by a computer program, which can be stored in a non-volatile computer-readable storage medium and which, when executed, can include the processes of the method embodiments described above. Any reference to memory, storage, database, or other medium used in the embodiments provided herein may include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), Programmable ROM (PROM), Electrically Programmable ROM (EPROM), Electrically Erasable Programmable ROM (EEPROM), or flash memory. Volatile memory can include Random Access Memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as Static RAM (SRAM), Dynamic RAM (DRAM), Synchronous DRAM (SDRAM), Double Data Rate SDRAM (DDR SDRAM), Enhanced SDRAM (ESDRAM), Synchronous Link DRAM (SLDRAM), Rambus Direct RAM (RDRAM), Direct Rambus Dynamic RAM (DRDRAM), and Rambus Dynamic RAM (RDRAM).

The technical features of the above embodiments can be combined arbitrarily. For the sake of brevity, not all possible combinations of the technical features in the above embodiments are described; however, as long as a combination of technical features involves no contradiction, it should be considered as falling within the scope of the present specification.

The above embodiments express only several implementations of the present application, and although their description is relatively specific and detailed, they should not be construed as limiting the scope of the invention. It should be noted that a person skilled in the art can make several variations and modifications without departing from the concept of the present application, all of which fall within the scope of protection of the present application. Therefore, the protection scope of the present patent shall be subject to the appended claims.

Claims

1. A data processing method is applied to a first server, wherein the first server is used for controlling proxy objects in a plurality of scenes to execute actions; the method comprises the following steps:

when action data returned by the second server are received, controlling the target object to execute a next action according to the action data; the action data is obtained by the second server analyzing the environment parameter information according to the matrix dimension information to obtain a series of environment parameters included in the environment parameter information and deducing according to the series of environment parameters.

2. The method of claim 1, wherein the action data is obtained by the second server parsing the environment parameter information according to the matrix dimension information to obtain a series of environment parameters included in the environment parameter information, converting the series of environment parameters into model input data meeting an input data format requirement of a target model by using a parsing tool, and performing inference based on the model input data by using the target model; the target model is used to infer a next action of the target object.

3. The method of claim 1, wherein the step of transmitting the standard format data to a second server comprises:

serializing the standard format data into binary data;

transmitting the binary data to the second server based on a predetermined communication protocol; the predetermined communication protocol is the only communication protocol employed for communication between the first server and the second server;

correspondingly, after receiving the binary data, the second server performs deserialization on the binary data to obtain the standard format data.

4. The method of claim 1, wherein the environmental data comprises a plurality of parameters; and converting the data type of the environment data to obtain standard format data, wherein the step comprises the following steps of:

transmitting each parameter in the environment data into a standard parameter of which the data type corresponding to the information type is a structural body to obtain standard format data; the number of the standard parameters is the same as the number of the types of the information.

5. A data processing method, characterized in that the method is applied to a second server; the method comprises the following steps:

acquiring standard format data from a first server; the standard format data at least comprises matrix dimension information and environment parameter information; the standard format data is obtained by acquiring environment data related to a target object and performing data type conversion on the environment data when the first server controls the target object to execute a next action, wherein the first server is used for controlling proxy objects in a plurality of scenes to execute the action, and the target object refers to a proxy object in any scene;

analyzing the environment parameter information according to the matrix dimension information to obtain a series of environment parameters included in the environment parameter information, and deducing action data according to the series of environment parameters;

6. The method of claim 1, wherein the step of inferring motion data from the set of environmental parameters comprises:

inferring motion data based on the model input data using the target model.

7. The method of claim 5, wherein the step of obtaining the standard-format data from the first server is preceded by:

receiving binary data transmitted by the first server based on a predetermined communication protocol;

deserializing the binary data to obtain the standard format data; the predetermined communication protocol is the only communication protocol employed for communication between the first server and the second server.

8. A system comprising a first server and a second server, the first server for controlling proxy objects in a plurality of scenarios to perform actions;

the first server collects environment data related to a target object when controlling the target object to execute a next action, wherein the target object refers to a proxy object in any scene; performing data type conversion on the environment data to obtain standard format data, and transmitting the standard format data to a second server; the standard format data at least comprises matrix dimension information and environment parameter information;

the second server analyzes the environment parameter information according to the matrix dimension information to obtain a series of environment parameters included in the environment parameter information, deduces action data according to the series of environment parameters, and transmits the action data to the first server;

9. A data processing apparatus for controlling proxy objects in a plurality of scenes to perform actions; the device comprises:

the environment data acquisition module is used for acquiring environment data related to a target object when the target object is controlled to execute the next action; the target object refers to a proxy object in any scene;

the data processing module is used for converting the data type of the environment data to obtain standard format data and transmitting the standard format data to a second server; the standard format data at least comprises matrix dimension information and environment parameter information;

10. A data processing apparatus, characterized in that the apparatus comprises:

the data acquisition module is used for acquiring standard format data from the first server; the standard format data at least comprises matrix dimension information and environment parameter information; the standard format data is obtained by acquiring environment data related to a target object and performing data type conversion on the environment data when the first server controls the target object to execute a next action, wherein the first server is used for controlling proxy objects in a plurality of scenes to execute the action, and the target object refers to a proxy object in any scene;

the data processing module is used for analyzing the environment parameter information according to the matrix dimension information to obtain a series of environment parameters included in the environment parameter information, and deducing action data according to the series of environment parameters;