CN115358404A - Data processing method, device and equipment based on machine learning model inference - Google Patents

Data processing method, device and equipment based on machine learning model inference

Info

Publication number
CN115358404A
CN115358404A (application CN202211074172.4A)
Authority
CN
China
Prior art keywords
data
model
inference
cache space
reasoning
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211074172.4A
Other languages
Chinese (zh)
Inventor
蒋煜襄
郭聪
于万金
王林芳
陈�峰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jingdong Technology Holding Co Ltd
Original Assignee
Jingdong Technology Holding Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jingdong Technology Holding Co Ltd filed Critical Jingdong Technology Holding Co Ltd
Priority to CN202211074172.4A
Publication of CN115358404A
Status: Pending

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00Computing arrangements using knowledge-based models
    • G06N5/04Inference or reasoning models
    • G06N5/041Abduction
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02Addressing or allocation; Relocation
    • G06F12/0223User address space allocation, e.g. contiguous or non contiguous base addressing
    • G06F12/023Free address space management
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21Design, administration or maintenance of databases
    • G06F16/211Schema design and management
    • G06F16/212Schema design and management with details for data modelling support
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44Arrangements for executing specific programs
    • G06F9/445Program loading or initiating
    • G06F9/44505Configuring for program initiating, e.g. using registry, configuration files

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Evolutionary Computation (AREA)
  • Computational Linguistics (AREA)
  • Artificial Intelligence (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The disclosure provides a data processing method, device, and equipment based on machine learning model inference, and relates to the technical field of artificial intelligence. The method comprises the following steps: acquiring platform equipment information according to a preset configuration file, initializing the platform equipment information, and determining a cache space; preprocessing original input data to be processed to obtain preprocessed data stored in the cache space; performing model inference on the preprocessed data to obtain model inference data stored in the cache space; and post-processing the model inference data to obtain inference data stored in the cache space, and storing a target pointer corresponding to a third position of the inference data in the cache space into a target data dictionary. Because the data of the entire inference pipeline is handled by means of pointers and a data dictionary, both data processing efficiency and model inference efficiency are improved.

Description

Data processing method, device and equipment based on machine learning model inference
Technical Field
The present disclosure relates to the field of artificial intelligence technologies, and in particular, to a data processing method, device, and apparatus based on machine learning model inference.
Background
With the development of big data and artificial intelligence technologies, more and more business scenarios, such as financial risk control, online advertising, commodity recommendation, and smart cities, adopt machine learning on a large scale to improve service quality and the level of intelligent decision-making. For a specific task, after a model is obtained through training in its designated training environment, the model needs to be packaged and then deployed as an online inference service; the inference service can only be used when the runtime environment is the same as the training environment.
In the related art, model inference is usually performed on dedicated inference devices, and each dedicated inference device has different inference performance. Therefore, when multiple dedicated inference devices corresponding to different backends execute the full inference pipeline of a model, data corresponding to multiple inference stages is inevitably generated, and the large amount of data makes model inference inefficient.
Therefore, how to improve the efficiency of model inference is a technical problem that urgently needs to be solved.
Disclosure of Invention
The data processing method, device, and equipment based on machine learning model inference provided by the present disclosure handle the data of the entire inference pipeline by means of pointers and a data dictionary, which improves both data processing efficiency and model inference efficiency.
In a first aspect, the present disclosure provides a data processing method based on machine learning model inference, including:
acquiring platform equipment information according to a preset configuration file, initializing the platform equipment information, and determining a cache space corresponding to the platform equipment information;
preprocessing original input data to be processed to obtain preprocessed data stored in the cache space, and storing an input pointer corresponding to a first position of the preprocessed data in the cache space into a target data dictionary;
performing model inference processing on the preprocessed data acquired from the cache space according to the input pointer to obtain model inference data, and storing an output pointer corresponding to a second position of the model inference data in the cache space into the target data dictionary;
and post-processing the model inference data acquired from the cache space according to the output pointer to obtain inference data stored in the cache space, and storing a target pointer corresponding to a third position of the inference data in the cache space into the target data dictionary.
Preferably, according to the data processing method based on machine learning model inference provided by the present disclosure, the configuration file at least includes: configuring platform parameters and model path parameters;
the method for acquiring the platform equipment information according to the preset configuration file, initializing the platform equipment information, and determining the cache space corresponding to the platform equipment information includes:
screening out a backend interface according to the configuration platform parameters;
processing a preset model file based on the model path parameters to obtain model information generated in inference equipment corresponding to the backend interface;
and according to the model information, creating the cache space in the inference equipment corresponding to the platform equipment information.
Preferably, according to the data processing method based on machine learning model inference provided by the present disclosure, the screening out the backend interface according to the configuration platform parameters includes:
establishing a model class corresponding to the inference interface according to a preset inference interface;
and determining the back-end interface which is registered in the configuration file according to the model class and the configuration platform parameters.
Preferably, according to the data processing method based on machine learning model inference provided by the present disclosure, processing a preset model file based on the model path parameter to obtain model information generated in inference equipment corresponding to the backend interface includes:
acquiring a model file stored in a host memory according to the model path parameters;
and transmitting the model file in the host memory to inference equipment corresponding to the backend interface, so that the inference equipment parses the model file to obtain the model information.
Preferably, according to a data processing method based on machine learning model inference provided by the present disclosure, before the raw input data to be processed is preprocessed to obtain preprocessed data stored in the cache space, the method includes:
initializing a preset data dictionary to obtain the target data dictionary;
and storing the original input data in a host memory, and storing an original data pointer corresponding to the storage position of the original input data in the host memory into the target data dictionary so as to call the original input data from the host memory by using the original data pointer.
Preferably, according to the data processing method based on machine learning model inference provided by the present disclosure, the performing model inference processing on the preprocessed data acquired from the cache space according to the input pointer to obtain model inference data includes:
acquiring a running function in the inference interface;
and loading a reasoning function in the model class based on the operation function, and performing model reasoning on the preprocessed data acquired by using the input pointer to obtain the model reasoning data.
In a second aspect, the present disclosure also provides a data processing apparatus based on machine learning model inference, including:
the initialization module is used for acquiring platform equipment information according to a preset configuration file, initializing the platform equipment information and determining a cache space corresponding to the platform equipment information;
the preprocessing module is used for preprocessing original input data to be processed to obtain preprocessed data stored in the cache space, and storing an input pointer corresponding to a first position of the preprocessed data in the cache space into a target data dictionary;
the model inference module is used for performing model inference processing on the preprocessed data acquired from the cache space according to the input pointer to obtain model inference data, and storing an output pointer corresponding to a second position of the model inference data in the cache space into the target data dictionary;
and the post-processing module is used for post-processing the model inference data acquired from the cache space according to the output pointer to obtain inference data stored in the cache space, and storing a target pointer corresponding to a third position of the inference data in the cache space into the target data dictionary.
In a third aspect, the present disclosure also provides an electronic device, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor implements the steps of the data processing method based on machine learning model inference as described in any one of the above when executing the program.
In a fourth aspect, the present disclosure also provides a non-transitory computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the data processing method based on machine learning model inference as described in any one of the above.
In a fifth aspect, the present disclosure also provides a computer program product comprising a computer program which, when executed by a processor, implements the steps of the data processing method based on machine learning model inference as described in any one of the above.
According to the data processing method, the data processing device and the data processing equipment based on the machine learning model inference, platform equipment information is obtained according to a preset configuration file, initialization processing is carried out on the platform equipment information, and a cache space corresponding to the platform equipment information is determined; preprocessing original input data to be processed to obtain preprocessed data stored in the cache space, and storing an input pointer corresponding to a first position of the preprocessed data in the cache space into a target data dictionary; performing model inference processing on the preprocessed data acquired from the cache space according to the input pointer to obtain model inference data, and storing an output pointer corresponding to a second position of the model inference data in the cache space into the target data dictionary; and post-processing the model inference data acquired from the cache space according to the output pointer to obtain inference data stored in the cache space, and storing a target pointer corresponding to a third position of the inference data in the cache space into the target data dictionary. The data in the whole reasoning process is processed in a pointer and data dictionary mode, so that the data processing efficiency is improved, and the model reasoning efficiency is also improved.
Drawings
In order to more clearly illustrate the technical solutions of the present disclosure or the prior art, the drawings needed for the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present disclosure, and other drawings can be obtained by those skilled in the art without creative efforts.
Fig. 1 is a schematic flow chart of a data processing method based on machine learning model inference according to an embodiment of the present disclosure;
FIG. 2 is a diagram of interfaces and model classes of a standard inference interface in accordance with an embodiment of the present disclosure;
FIG. 3 is a schematic flowchart of step S100 in FIG. 1 according to an embodiment of the disclosure;
FIG. 4 is a schematic flowchart of step S310 in FIG. 3 according to an embodiment of the present disclosure;
FIG. 5 is a schematic flowchart of step S320 in FIG. 3 according to an embodiment of the disclosure;
fig. 6 is a schematic structural diagram of a data processing apparatus based on machine learning model inference provided in an embodiment of the present disclosure;
fig. 7 is a schematic structural diagram of an electronic device provided in an embodiment of the present disclosure.
Detailed Description
To make the objects, technical solutions and advantages of the present disclosure more apparent, the technical solutions of the present disclosure will be clearly and completely described below with reference to the accompanying drawings in the present disclosure, and it is obvious that the described embodiments are some, but not all embodiments of the present disclosure. All other embodiments, which can be derived by a person skilled in the art from the embodiments disclosed herein without inventive step, are intended to be within the scope of the present disclosure.
Standard inference interface (InferDK): the standard inference interface is obtained by encapsulating the low-level inference capability provided by the backend. It is used to load and initialize the model and to control data transfer between a device and the host or between devices, and it includes a model-driven execution interface, so that the thread safety of model inference can be guaranteed and an access interface to the basic parameters of the model is provided. Because the standard inference interface is encapsulated, the upper-layer stage processes are unaware of the underlying hardware devices when executing operations such as model inference, and more model inference capabilities can be supported by extending the backends of the standard inference interface. Here, an upper-layer stage process is the logic layer that sits directly above the logic layer of the standard inference interface and corresponds to the service and process framework.
Backend (backend): the backend is typically encapsulated using the native runtime interface (runtime API) provided by the hardware device vendor.
Image processing interface (ProcessDK): based on the image (and similar) processing units provided by the hardware devices, the processing capabilities of different hardware are encapsulated uniformly, so that the upper-layer stages are shielded from the various interfaces exposed by each hardware processing unit. For inference on image input data, typical atomic capabilities such as cropping and scaling are provided.
Data dictionary: a data dictionary is a catalog of database and application metadata that users can access; in the embodiments of the present disclosure, it is used to store multiple pieces of data-pointer information so that data in the cache space can be retrieved quickly.
Pointer: in computer science, a pointer is a programming-language object whose value is an address that points directly to a value stored elsewhere in computer memory; in the embodiments of the present disclosure, it points to the location of data stored in the cache space.
Cache space (buffer): in computing, the cache space here refers to a buffer that temporarily stores data sent from an external device until the processor takes it away, or temporarily stores data sent from the processor to an external device (a backend inference device). With the cache space, a high-speed CPU and lower-speed peripherals can be coordinated, and data transfers can be synchronized.
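To make the cooperation between the data dictionary, pointers, and cache space concrete, a minimal C++ sketch is given below. It is an illustration only: the type and member names (BufferSlot, DataDict, the stage keys) are assumptions introduced here and are not part of the patent.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>

// One entry of the hypothetical data dictionary: a raw pointer into the cache
// space on the inference device (or into host memory), plus the data size.
struct BufferSlot {
    void*       ptr  = nullptr;   // location of the data in the cache space
    std::size_t size = 0;         // number of bytes stored at that location
};

// The data dictionary maps a stage name ("raw", "preprocessed",
// "model_output", "result") to the slot describing where that stage's data
// lives, so later stages fetch data by pointer instead of copying it.
using DataDict = std::unordered_map<std::string, BufferSlot>;
```

A preprocessing stage would then record its output as `dict["preprocessed"] = {device_ptr, nbytes};`, and the inference stage would look the same entry up again instead of receiving a copy of the data.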
The following describes a data processing method, apparatus, and device based on machine learning model inference, which are provided in the embodiments of the present disclosure, with reference to fig. 1 to fig. 7, and can process data in the entire inference process in a pointer and data dictionary manner, so that not only the data processing efficiency is improved, but also the efficiency of model inference is improved.
Specifically, the following embodiments are used to explain, and first, a data processing method based on machine learning model inference in the embodiments of the present disclosure is described.
As shown in fig. 1, which is a schematic flow chart of an implementation of a data processing method based on machine learning model inference according to an embodiment of the present disclosure, the data processing method based on machine learning model inference may include, but is not limited to, steps S100 to S400.
S100, acquiring platform equipment information according to a preset configuration file, initializing the platform equipment information, and determining a cache space corresponding to the platform equipment information;
s200, preprocessing original input data to be processed to obtain preprocessed data stored in the cache space, and storing an input pointer corresponding to a first position of the preprocessed data in the cache space into a target data dictionary;
s300, performing model inference processing on the preprocessed data acquired from the cache space according to the input pointer to obtain model inference data, and storing an output pointer corresponding to a second position of the model inference data in the cache space into the target data dictionary;
s400, performing post-processing on the model inference data acquired from the cache space according to the output pointer to obtain inference data stored in the cache space, and storing a target pointer corresponding to a third position of the inference data in the cache space into the target data dictionary.
In step S100 of some embodiments, platform device information is obtained according to a preset configuration file, and the platform device information is initialized to determine a cache space corresponding to the platform device information. It can be understood that, the platform device information is obtained through the configuration file written by the user, and the platform device information may be obtained in the configuration form of YAML or JSON.
Further, YAML is a highly readable format for expressing data serialization.
JSON (JavaScript Object Notation) is a lightweight data exchange format.
Further, the platform device information acquired from the configuration file is initialized and a cache space is determined. The specific implementation may be as follows: a model class corresponding to the inference interface is created according to a preset inference interface, and the backend interface that has been registered in the configuration file is determined according to the model class and the configuration platform parameters. Then a model file stored in the host memory is acquired according to the model path parameters, and the model file in the host memory is transmitted to the inference equipment corresponding to the backend interface, so that the inference equipment parses the model file to obtain the model information. Finally, the cache space is created, according to the model information, in the inference equipment corresponding to the platform device information.
It should be noted that the configuration file at least includes: the configuration platform parameter (Platform) and the model path parameter (ModelPath).
Model path parameter (ModelPath): the model repository where the model file is located, or the path where the model file is stored, which may be cloud storage or a locally stored model file; the type is string.
Configuration platform parameter (Platform): the inference backend on which the model file runs, for example tensorrt, cann, openvino, mnn, etc.; the type is string.
Of course, as shown with reference to fig. 2, the configuration file may also include, but is not limited to:
model name (ModelName): is a unique identifier of the model, is unique in the same software development tool (SDK), and has a character string type.
Multiplexing memory function (Use _ device _ buffer): whether the memory of the inference device is multiplexed or not, wherein the type is Boolean type (boolean).
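Putting these fields together, a configuration file of the kind described above might look like the following hypothetical YAML example; the concrete values (model name, path, and backend) are placeholders rather than content taken from the patent.

```yaml
# Hypothetical inference configuration using the fields described above
ModelName: demo_classifier          # unique model identifier within one SDK
ModelPath: /data/models/demo.plan   # model repository path (local or cloud storage)
Platform: tensorrt                  # inference backend: tensorrt / cann / openvino / mnn
Use_device_buffer: true             # reuse device memory instead of copying (boolean)
```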
Further, referring to the interface and model class diagram of the standard inference interface (InferDK) shown in FIG. 2, the inference interface may include at least, but is not limited to, the following:
a Run function and a model class (ModelLoader).
The model class (ModelLoader) may include at least, but is not limited to, the following functions:
an Inference function, a Transfer function, a data dimension function (GetDims), and a registration function (Register).
The model class corresponds to a plurality of backend interfaces, each of which may include at least, but is not limited to, the following: an Inference function, a Transfer function, a data dimension function (GetDims), and a device identifier (DeviceId).
It can be understood that, by invoking the Run function in the inference interface, the model class and the Inference functions of the backend interfaces can be invoked to carry out model inference.
The Transfer function is used to transfer data between the host side and the inference device side.
The data dimension function (GetDims) is used to control the dimensions and the size of input or output data.
The registration function is used to identify whether a backend interface has been registered, so as to screen out the available backend interfaces.
Optionally, the backend object names corresponding to the plurality of backend interfaces may be TrtModelLoader, CannModelLoader, and CpuModelLoader.
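The relationship between the inference interface, the model class, and the backend loaders in FIG. 2 can be sketched roughly as the following C++ interfaces. The signatures are assumptions made for illustration and do not reproduce the patent's actual code.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

struct DimInfo { std::vector<int> dims; std::size_t bytes; };  // shape and byte size

// Backend-side model loader: one concrete subclass per inference backend
// (e.g. TrtModelLoader, CannModelLoader, CpuModelLoader).
class ModelLoader {
public:
    virtual ~ModelLoader() = default;
    virtual void    Inference(void* input, void* output) = 0;      // run the model
    virtual void    Transfer(void* dst, const void* src,
                             std::size_t bytes, int direction) = 0; // H2D/D2D/D2H/H2H
    virtual DimInfo GetDims(const std::string& tensorName) = 0;     // input/output dims
    virtual int     DeviceId() const = 0;                           // backend device id
};

// Standard inference interface (InferDK): hides the backend behind Run().
class InferDK {
public:
    explicit InferDK(std::unique_ptr<ModelLoader> loader)
        : loader_(std::move(loader)) {}

    // Run() drives the backend's Inference() on data already placed in the
    // cache space, so upper layers never touch the hardware directly.
    void Run(void* input, void* output) { loader_->Inference(input, output); }

private:
    std::unique_ptr<ModelLoader> loader_;
};
```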
In step S200 of some embodiments, original input data to be processed is preprocessed to obtain preprocessed data stored in the cache space, and an input pointer corresponding to a first position of the preprocessed data in the cache space is stored in a target data dictionary. It can be understood that, after the step S100 is executed, platform device information is obtained according to a preset configuration file, the platform device information is initialized, a cache space corresponding to the platform device information is determined, the raw input data to be processed is preprocessed, preprocessed data is obtained after preprocessing, the preprocessed data is stored in the cache space created in the step S100, and an input pointer corresponding to a first position of the preprocessed data in the cache space is stored in the target data dictionary.
For example, the cache space includes at least, but is not limited to, a first position, a second position, a third position, and a fourth position. The preprocessed data is stored at the first position, and an input pointer pointing to the first position is stored in the target data dictionary, so that the preprocessed data can be quickly obtained from the cache space through the input pointer in the target data dictionary.
In step S300 of some embodiments, a model inference process is performed on the preprocessed data obtained from the cache space according to the input pointer to obtain model inference data, and an output pointer corresponding to a second position of the model inference data in the cache space is stored in the target data dictionary. It can be understood that, after the step S200 is performed to preprocess the original input data to be processed to obtain the preprocessed data stored in the cache space, and the input pointer corresponding to the first position of the preprocessed data in the cache space is stored in the target data dictionary, the preprocessed data is first quickly acquired from the cache space according to the input pointer in the target data dictionary, and then the preprocessed data is subjected to model inference processing, specifically, the preprocessed data is asynchronously transmitted to the inference device corresponding to the standard inference interface (InferDK) to load the inference model for inference processing, so as to obtain the model inference data.
And further, the model inference data is stored in a second position in the cache space, and an output pointer pointing to the second position is stored in the target data dictionary and used for rapidly acquiring the model inference data from the cache space through the output pointer in the target data dictionary, so that the inference efficiency of the model inference is improved.
In step S400 in some embodiments, the model inference data acquired from the cache space according to the output pointer is post-processed to obtain inference data stored in the cache space, and a target pointer corresponding to a third position of the inference data in the cache space is stored in the target data dictionary. It can be understood that, after the step S300 is executed to perform the model inference process on the preprocessed data obtained from the cache space according to the input pointer to obtain the model inference data, and store the output pointer corresponding to the second position of the model inference data in the cache space into the target data dictionary, the specific execution step may be to quickly obtain the model inference data from the cache space according to the output pointer in the target data dictionary, and perform the post-processing on the model inference data to obtain the inference data.
Optionally, the inference data is stored in a third position in the cache space, and a target pointer pointing to the third position is stored in the target data dictionary, so that the inference data can be quickly acquired from the cache space through the target pointer in the target data dictionary, and the multiplexing efficiency of the inference data after model inference is improved.
Optionally, the specific step of post-processing the model inference data to obtain the inference data may be to segment each piece of model inference data by using a preset segmentation rule, where the segmentation rule is to segment each piece of model inference data into a number of pieces of inference data equal to a preset number threshold.
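Steps S200 to S400 can be summarized by the following sketch, which chains the three stages together through the data dictionary introduced earlier. The helper functions (Preprocess, AllocateOutputSlot, PostProcess) are hypothetical stand-ins, declared but not defined here, for whatever the concrete pipeline provides; the sketch reuses the DataDict, BufferSlot, and InferDK types from the earlier sketches.

```cpp
// Hypothetical stage helpers assumed to exist elsewhere in the pipeline.
BufferSlot Preprocess(const BufferSlot& raw);
BufferSlot AllocateOutputSlot();
BufferSlot PostProcess(const BufferSlot& modelOutput);

// End-to-end flow of steps S200-S400: each stage writes only a pointer into
// the dictionary, so no stage copies another stage's payload.
void RunPipeline(InferDK& infer, DataDict& dict,
                 void* rawHostData, std::size_t rawBytes) {
    // Raw input stays in host memory; only its pointer enters the dictionary.
    dict["raw"] = {rawHostData, rawBytes};

    // S200: preprocess into the first cache-space position (input pointer).
    BufferSlot pre = Preprocess(dict["raw"]);
    dict["preprocessed"] = pre;

    // S300: model inference reads via the input pointer and writes to the
    // second position; the output pointer is recorded in the dictionary.
    BufferSlot out = AllocateOutputSlot();
    infer.Run(pre.ptr, out.ptr);
    dict["model_output"] = out;

    // S400: post-processing reads via the output pointer and stores the
    // final result's pointer (third position) as the target pointer.
    dict["result"] = PostProcess(dict["model_output"]);
}
```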
In some embodiments, as shown with reference to fig. 3, step S100 may also include, but is not limited to, steps S310 to S330.
S310, screening out a backend interface according to the configuration platform parameters;
s320, processing a preset model file based on the model path parameter to obtain model information generated in inference equipment corresponding to the backend interface;
s330, according to the model information, the cache space is created in the inference equipment corresponding to the platform equipment information.
In step S310 of some embodiments, a backend interface is screened out according to the configuration platform parameters. It can be understood that, screening out the backend interface according to the configuration platform parameter (platform), the specific implementation steps may be: and establishing a model class corresponding to the inference interface according to a preset inference interface, and determining a rear-end interface which is registered in the configuration file according to the model class and the parameters of the configuration platform.
Optionally, the inference interface may include at least, but is not limited to: standard inference interface (InferDK).
A corresponding model class is established according to the standard inference interface, and the backend interfaces that have been registered in the configuration file are determined according to the configuration platform parameter in the model class. For example, multiple backend interfaces are connected to the model class, and only the backend interfaces that have been registered by using the registration function can be used.
In step S320 of some embodiments, a preset model file is processed based on the model path parameter, so as to obtain model information generated in the inference device corresponding to the backend interface. It can be understood that after the step of screening out the backend interface according to the configuration platform parameters in step S310 is performed, the model file is processed based on the model path parameters to obtain the model information.
Optionally, the model information is generated in the inference device corresponding to the backend interface. Each backend interface corresponds to one backend inference device; the backend interface is provided with the backend inference device when it leaves the manufacturer's factory, and the inference capability of the backend inference device can be invoked through the backend interface.
In step S330 of some embodiments, the cache space is created in the inference device corresponding to the platform device information according to the model information. It can be understood that after the step S320 of processing the preset model file based on the model path parameters to obtain the model information generated in the inference device corresponding to the backend interface is completed, the specific steps may be: according to the model information analyzed in step S320, a cache space is created in the inference device for temporarily storing relevant data generated in the whole process of model inference, where the relevant data at least includes: preprocessing data, model inference data, and inference data.
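One way to realize step S330, creating the cache space from the parsed model information, is sketched below; the deviceMalloc call is a placeholder for whichever device-memory allocator the concrete backend exposes and is not an API named in the patent. The sketch reuses the ModelLoader, DimInfo, and DataDict types from the earlier sketches.

```cpp
// Hypothetical allocation of the cache-space positions once the model file
// has been parsed on the inference device (step S330).
void* deviceMalloc(std::size_t bytes);   // assumed backend allocator (declaration only)

void CreateCacheSpace(ModelLoader& loader, DataDict& dict) {
    // Size the buffers from the model information reported by the backend.
    DimInfo in  = loader.GetDims("input");
    DimInfo out = loader.GetDims("output");

    dict["preprocessed"] = {deviceMalloc(in.bytes),  in.bytes};   // first position
    dict["model_output"] = {deviceMalloc(out.bytes), out.bytes};  // second position
    dict["result"]       = {deviceMalloc(out.bytes), out.bytes};  // third position
}
```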
Optionally, the data cached in the cache space may be transferred between the host memory and the inference device through the Transfer function.
Furthermore, when data is passed between stages, the data itself does not need to be transmitted in bulk; only the pointer information pointing to the storage location of the data needs to be passed. Since this pointer information is stored in the target data dictionary, only the target data dictionary needs to be passed, which reduces data transfer, allows the related data to be retrieved quickly, and greatly improves the efficiency of model inference.
In some embodiments, as shown with reference to fig. 4, step S310 may also include, but is not limited to, steps S410 to S420.
S410, creating a model class corresponding to a preset reasoning interface according to the reasoning interface;
s420, determining the back-end interface registered in the configuration file according to the model class and the configuration platform parameters.
In step S410 of some embodiments, a model class corresponding to the inference interface is created according to a preset inference interface. It is to be understood that a model class corresponding to the inference interface is created from the inference interface for determining backend interfaces corresponding to a plurality of backend (backends) from the model class.
In step S420 of some embodiments, the backend interface that has been registered in the configuration file is determined according to the model class and the configuration platform parameters. It can be understood that, after the step S410 of creating the model class corresponding to the inference interface according to the preset inference interface is performed, the specific execution step may be to create the corresponding model class according to the standard inference interface, and determine the backend interfaces that have been registered in the configuration file according to the configuration platform parameters in the model class, for example, there are multiple backend interfaces connected with the model class, and only the backend interfaces that have been registered by using the registration function can be used.
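The registration-and-screening logic of steps S410 to S420 amounts to a registry keyed by the Platform string. A minimal sketch is shown below, assuming a factory map that the patent does not spell out; it reuses the ModelLoader type from the earlier sketch.

```cpp
#include <functional>
#include <memory>
#include <string>
#include <unordered_map>

// Hypothetical backend registry: each backend registers a factory under its
// platform name ("tensorrt", "cann", ...); the model class then screens out
// the backend interface named by the configuration file.
class BackendRegistry {
public:
    using Factory = std::function<std::unique_ptr<ModelLoader>()>;

    // Called once per backend, e.g. from that backend's Register function.
    static void Register(const std::string& platform, Factory factory) {
        Table()[platform] = std::move(factory);
    }

    // Screen out the registered backend that matches the Platform parameter;
    // an unregistered platform yields nullptr and cannot be used.
    static std::unique_ptr<ModelLoader> Create(const std::string& platform) {
        auto it = Table().find(platform);
        return it == Table().end() ? nullptr : it->second();
    }

private:
    static std::unordered_map<std::string, Factory>& Table() {
        static std::unordered_map<std::string, Factory> table;
        return table;
    }
};
```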
In some embodiments, as shown with reference to fig. 5, step S320 may also include, but is not limited to, steps S510 to S520.
S510, obtaining a model file stored in a host memory according to the model path parameters;
s520, the model file in the host memory is transmitted to the reasoning equipment corresponding to the back-end interface, so that the reasoning equipment is used for analyzing the model file to obtain the model information.
In step S510 of some embodiments, the model file stored in the host memory is acquired according to the model path parameters. It can be understood that, after the step of determining the backend interface registered in the configuration file according to the model class and the configuration platform parameters in step S420 is completed, the specific execution steps may be: a model file is acquired from a cloud database or a local database according to the model path parameter and stored in the host memory, to be transmitted to the inference equipment for parsing.
In step S520 of some embodiments, the model file in the host memory is transmitted to the inference device corresponding to the backend interface, so that the inference device is used to parse the model file to obtain the model information. It can be understood that, after step S510 is executed, the specific execution step may be to transmit the model file in the host memory to the inference device corresponding to the backend interface, and to parse the model file by using the inference device to obtain the model information.
In some embodiments, before preprocessing the raw input data to be processed in step S200 to obtain the preprocessed data stored in the cache space, the method includes:
initializing a preset data dictionary to obtain the target data dictionary;
and storing the original input data in a host memory, and storing an original data pointer corresponding to the storage position of the original input data in the host memory into the target data dictionary so as to call the original input data from the host memory by using the original data pointer.
It can be understood that, before the raw input data to be processed is preprocessed to obtain the preprocessed data stored in the cache space, a preset data dictionary is initialized to obtain a target data dictionary, and the target data dictionary is used for storing pointer information pointing to the related data stored in the cache space in the target data dictionary.
The original input data are stored in the host memory, and the original data pointers corresponding to the storage positions of the original input data in the host memory are stored in the target data dictionary, so that the original input data can be quickly called from the host memory by using the original data pointers, and the efficiency of model reasoning is improved.
In some embodiments, performing model inference processing on the preprocessed data obtained from the cache space according to the input pointer to obtain model inference data includes:
acquiring a running function in the inference interface;
and loading a reasoning function in the model class based on the operation function, and performing model reasoning on the preprocessed data acquired by using the input pointer to obtain the model reasoning data.
It can be understood that obtaining the running function in the inference interface may be to obtain a Run function, load an inference function in the model class by using the Run function, then obtain the preprocessed data from the cache space by using the inference function in the model class and the inference function in the backend interface, and then transmit the preprocessed data asynchronously to the inference device corresponding to the standard inference interface (InferDK) to load the inference model for inference processing, so as to obtain the model inference data, which greatly improves the efficiency of model inference.
Preferably, in some embodiments, the standard inference interface further initializes the Transfer function according to whether the device-side memory needs to be reused (Use_device_buffer). The Transfer function is implemented on top of the transport capability interfaces provided by the different backend devices: if the original input data to be processed is memory data provided by non-inference hardware, a data copy (H2D) is required; otherwise the pointer is used directly for memory reuse (D2D).
Optionally, the supported transfer directions include data transfer from the host side to the device side (H2D), from the device side to the device side (D2D), from the device side to the host side (D2H), and from the host side to the host side (H2H).
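A Transfer helper that follows the Use_device_buffer logic described above might look like the following sketch. The isDeviceMemory query and backendCopy call are assumptions standing in for the backend vendor's real transport interface; they are declared here but not defined.

```cpp
#include <cstddef>

// Hypothetical transfer-direction selection for the preprocessed input.
enum class CopyKind { H2D, D2D, D2H, H2H };

bool isDeviceMemory(const void* ptr);                 // assumed backend query
void backendCopy(void* dst, const void* src,
                 std::size_t bytes, CopyKind kind);   // assumed vendor copy call

// If the data already lives in device memory, the pointer is reused (D2D);
// otherwise it is copied from host to device (H2D) into a scratch buffer.
void* PrepareDeviceInput(void* src, std::size_t bytes,
                         void* deviceScratch, bool useDeviceBuffer) {
    if (useDeviceBuffer && isDeviceMemory(src)) {
        return src;                                   // memory reuse, no copy
    }
    backendCopy(deviceScratch, src, bytes, CopyKind::H2D);
    return deviceScratch;
}
```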
Preferably, regarding the multiple backends of the image processing interface (ProcessDK): an inference device can be composed of multiple dedicated chips, each of which performs a specific processing task; for example, video decoding (from an H.264 video stream to images) is performed by a dedicated chip, which is more efficient than a general-purpose chip.
The image processing interface (ProcessDK) can offload computation tasks that suit a dedicated chip from the host control chip to the more efficient dedicated chip, thereby optimizing the entire data processing flow.
Preferably, the data processing flow at least comprises preprocessing, model inference, and post-processing. The host CPU controls the overall inference process, operations such as encoding, decoding, and rendering are handled by a codec chip, and model inference is handled by a GPU/NPU suited to matrix operations.
According to the data processing method based on machine learning model reasoning, platform equipment information is obtained according to a preset configuration file, initialization processing is carried out on the platform equipment information, and a cache space corresponding to the platform equipment information is determined; preprocessing original input data to be processed to obtain preprocessed data stored in the cache space, and storing an input pointer corresponding to a first position of the preprocessed data in the cache space into a target data dictionary; performing model inference processing on the preprocessed data acquired from the cache space according to the input pointer to obtain model inference data, and storing an output pointer corresponding to a second position of the model inference data in the cache space into the target data dictionary; and post-processing the model inference data acquired from the cache space according to the output pointer to obtain inference data stored in the cache space, and storing a target pointer corresponding to a third position of the inference data in the cache space into the target data dictionary. The data in the whole reasoning process is processed in a pointer and data dictionary mode, so that the data processing efficiency is improved, and the model reasoning efficiency is also improved.
The following describes the data processing apparatus based on machine learning model inference provided by the present disclosure, and the data processing apparatus based on machine learning model inference described below and the data processing method based on machine learning model inference described above may be referred to in correspondence with each other.
Referring to fig. 6, a data processing apparatus based on machine learning model inference includes:
the initialization module 610 is configured to acquire platform device information according to a preset configuration file, perform initialization processing on the platform device information, and determine a cache space corresponding to the platform device information;
the preprocessing module 620 is configured to preprocess original input data to be processed to obtain preprocessed data stored in the cache space, and store an input pointer corresponding to a first position of the preprocessed data in the cache space in a target data dictionary;
the model inference module 630 is configured to perform model inference processing on the preprocessed data obtained from the cache space according to the input pointer to obtain model inference data, and store an output pointer corresponding to a second position of the model inference data in the cache space in the target data dictionary;
and the post-processing module 640 is configured to perform post-processing on the model inference data acquired from the cache space according to the output pointer to obtain inference data stored in the cache space, and store a target pointer corresponding to a third position of the inference data in the cache space into the target data dictionary.
According to the data processing device based on machine learning model reasoning provided by the present disclosure, the configuration file at least comprises: a configuration platform parameter and a model path parameter, and an initialization module 610, specifically configured to screen out a backend interface according to the configuration platform parameter;
processing a preset model file based on the model path parameters to obtain model information generated in inference equipment corresponding to the backend interface;
and according to the model information, creating the cache space in the inference equipment corresponding to the platform equipment information.
According to the data processing apparatus based on machine learning model inference provided by the present disclosure, the initialization module 610 is specifically configured to create a model class corresponding to an inference interface according to a preset inference interface;
and determining the back-end interface which is registered in the configuration file according to the model class and the configuration platform parameters.
According to the data processing apparatus based on machine learning model inference provided by the present disclosure, the initialization module 610 is specifically configured to obtain a model file stored in a host memory according to the model path parameter;
and transmitting the model file in the host memory to inference equipment corresponding to the backend interface, so that the inference equipment parses the model file to obtain the model information.
According to the present disclosure, a data processing apparatus based on machine learning model inference is provided, including:
the storage module is used for initializing a preset data dictionary to obtain the target data dictionary before preprocessing the original input data to be processed to obtain preprocessed data stored in the cache space;
and storing the original input data in a host memory, and storing an original data pointer corresponding to the storage position of the original input data in the host memory into the target data dictionary so as to call the original input data from the host memory by using the original data pointer.
According to the data processing apparatus based on machine learning model inference provided by the present disclosure, the model inference module 630 is specifically configured to obtain an operation function in the inference interface;
and loading a reasoning function in the model class based on the running function, and performing model reasoning processing on the preprocessed data acquired by using the input pointer to obtain the model reasoning data.
According to the data processing device based on machine learning model inference, platform equipment information is obtained according to a preset configuration file, initialization processing is carried out on the platform equipment information, and a cache space corresponding to the platform equipment information is determined; preprocessing original input data to be processed to obtain preprocessed data stored in the cache space, and storing an input pointer corresponding to a first position of the preprocessed data in the cache space into a target data dictionary; performing model inference processing on the preprocessed data acquired from the cache space according to the input pointer to obtain model inference data, and storing an output pointer corresponding to a second position of the model inference data in the cache space into the target data dictionary; and post-processing the model inference data acquired from the cache space according to the output pointer to obtain inference data stored in the cache space, and storing a target pointer corresponding to a third position of the inference data in the cache space into the target data dictionary. The data in the whole reasoning process is processed in a pointer and data dictionary mode, so that the data processing efficiency is improved, and the model reasoning efficiency is also improved.
Fig. 7 illustrates a physical structure diagram of an electronic device, which may include, as shown in fig. 7: a processor (processor) 710, a communication Interface (Communications Interface) 720, a memory (memory) 730, and a communication bus 740, wherein the processor 710, the communication Interface 720, and the memory 730 communicate with each other via the communication bus 740. Processor 710 may invoke logic instructions in memory 730 to perform a method of machine learning model inference based data processing, the method comprising: acquiring platform equipment information according to a preset configuration file, initializing the platform equipment information, and determining a cache space corresponding to the platform equipment information; preprocessing original input data to be processed to obtain preprocessed data stored in the cache space, and storing an input pointer corresponding to a first position of the preprocessed data in the cache space into a target data dictionary; performing model inference processing on the preprocessed data acquired from the cache space according to the input pointer to obtain model inference data, and storing an output pointer corresponding to a second position of the model inference data in the cache space into the target data dictionary; and post-processing the model inference data acquired from the cache space according to the output pointer to obtain inference data stored in the cache space, and storing a target pointer corresponding to a third position of the inference data in the cache space into the target data dictionary.
In addition, the logic instructions in the memory 730 can be implemented in the form of software functional units and stored in a computer readable storage medium when the software functional units are sold or used as independent products. Based on such understanding, the technical solution of the present disclosure may be embodied in the form of a software product, which is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present disclosure. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes.
In another aspect, the present disclosure also provides a computer program product, the computer program product includes a computer program, the computer program can be stored on a non-transitory computer readable storage medium, when the computer program is executed by a processor, a computer can execute the data processing method based on machine learning model inference provided by the above methods, the method includes: acquiring platform equipment information according to a preset configuration file, initializing the platform equipment information, and determining a cache space corresponding to the platform equipment information; preprocessing original input data to be processed to obtain preprocessed data stored in the cache space, and storing an input pointer corresponding to a first position of the preprocessed data in the cache space into a target data dictionary; performing model inference processing on the preprocessed data acquired from the cache space according to the input pointer to obtain model inference data, and storing an output pointer corresponding to a second position of the model inference data in the cache space into the target data dictionary; and post-processing the model inference data acquired from the cache space according to the output pointer to obtain inference data stored in the cache space, and storing a target pointer corresponding to a third position of the inference data in the cache space into the target data dictionary.
In yet another aspect, the present disclosure also provides a non-transitory computer-readable storage medium having stored thereon a computer program, which when executed by a processor, implements a method of data processing based on machine learning model inference provided by the above methods, the method comprising: acquiring platform equipment information according to a preset configuration file, initializing the platform equipment information, and determining a cache space corresponding to the platform equipment information; preprocessing original input data to be processed to obtain preprocessed data stored in the cache space, and storing an input pointer corresponding to a first position of the preprocessed data in the cache space into a target data dictionary; performing model inference processing on the preprocessed data acquired from the cache space according to the input pointer to obtain model inference data, and storing an output pointer corresponding to a second position of the model inference data in the cache space into the target data dictionary; and post-processing the model inference data acquired from the cache space according to the output pointer to obtain inference data stored in the cache space, and storing a target pointer corresponding to a third position of the inference data in the cache space into the target data dictionary.
The above-described embodiments of the apparatus are merely illustrative, and the units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.
Through the above description of the embodiments, those skilled in the art will clearly understand that each embodiment can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware. With this understanding in mind, the above-described technical solutions may be embodied in the form of a software product, which can be stored in a computer-readable storage medium such as ROM/RAM, magnetic disk, optical disk, etc., and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the methods described in the embodiments or some parts of the embodiments.
Finally, it should be noted that: the above examples are only intended to illustrate the technical solutions of the present disclosure, not to limit them; although the present disclosure has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present disclosure.

Claims (10)

1. A data processing method based on machine learning model reasoning is characterized by comprising the following steps:
acquiring platform equipment information according to a preset configuration file, initializing the platform equipment information, and determining a cache space corresponding to the platform equipment information;
preprocessing original input data to be processed to obtain preprocessed data stored in the cache space, and storing an input pointer corresponding to a first position of the preprocessed data in the cache space into a target data dictionary;
performing model reasoning on the preprocessed data acquired from the cache space according to the input pointer to obtain model reasoning data, and storing an output pointer corresponding to a second position of the model reasoning data in the cache space into the target data dictionary;
and post-processing the model inference data acquired from the cache space according to the output pointer to obtain inference data stored in the cache space, and storing a target pointer corresponding to a third position of the inference data in the cache space into the target data dictionary.
2. The data processing method based on machine learning model inference according to claim 1, wherein the configuration file comprises at least: configuration platform parameters and model path parameters;
and the acquiring platform equipment information according to a preset configuration file, initializing the platform equipment information, and determining a cache space corresponding to the platform equipment information comprises the following steps:
screening out a back-end interface according to the configuration platform parameters;
processing a preset model file based on the model path parameters to obtain model information generated in an inference device corresponding to the back-end interface;
and creating, according to the model information, the cache space in the inference device corresponding to the platform equipment information.
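As a concrete (and purely illustrative) reading of this initialization flow, the sketch below selects a back-end from the configuration platform parameter, "parses" the model, and creates the cache space. Every name in it (Config, Backend, CpuBackend, select_backend) is a hypothetical stand-in, and the inference device is simulated with host memory rather than a vendor runtime.

```cpp
// Illustrative initialization flow: read the configuration platform parameter
// and model path, pick a back-end interface, parse the model, and create the
// cache space. Names are assumptions, not the patent's API.
#include <cstddef>
#include <iostream>
#include <memory>
#include <stdexcept>
#include <string>
#include <vector>

struct Config {                       // parsed from the preset configuration file
    std::string platform;             // configuration platform parameter, e.g. "cpu"
    std::string model_path;           // model path parameter
};

struct ModelInfo {                    // model information produced on the device
    std::size_t input_bytes;
    std::size_t output_bytes;
};

class Backend {                       // stand-in for a back-end interface
public:
    virtual ~Backend() = default;
    virtual ModelInfo parse_model(const std::vector<char>& bytes) = 0;
    virtual std::vector<char> create_cache(const ModelInfo& info) = 0;
};

class CpuBackend : public Backend {   // trivial host-memory back-end
public:
    ModelInfo parse_model(const std::vector<char>& bytes) override {
        return ModelInfo{bytes.size(), bytes.size()};    // fake "parsing"
    }
    std::vector<char> create_cache(const ModelInfo& info) override {
        return std::vector<char>(info.input_bytes + info.output_bytes);
    }
};

std::unique_ptr<Backend> select_backend(const Config& cfg) {
    if (cfg.platform == "cpu") return std::make_unique<CpuBackend>();
    throw std::runtime_error("unknown platform: " + cfg.platform);
}

int main() {
    Config cfg{"cpu", "model.bin"};                        // hypothetical config values
    auto backend = select_backend(cfg);                    // screen out back-end interface

    std::vector<char> model_bytes(128, 0);                 // pretend model file contents
    ModelInfo info = backend->parse_model(model_bytes);    // model information
    std::vector<char> cache = backend->create_cache(info); // cache space per model info

    std::cout << "cache bytes: " << cache.size() << '\n';
}
```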
3. The data processing method based on machine learning model inference according to claim 2, wherein the screening out a back-end interface according to the configuration platform parameters comprises:
establishing, according to a preset inference interface, a model class corresponding to the inference interface;
and determining, according to the model class and the configuration platform parameters, the back-end interface registered in the configuration file.
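This claim reads naturally as a registry pattern: model classes register themselves against the preset inference interface, and the configuration platform parameters select which registered back-end is used. The sketch below assumes that interpretation; the registry, factory, and CpuModel names are illustrative and not the disclosed implementation.

```cpp
// Hypothetical registry keyed by platform name: model classes register
// themselves against the inference interface, and the configuration platform
// parameter picks the registered back-end at run time.
#include <functional>
#include <iostream>
#include <map>
#include <memory>
#include <string>

struct InferenceInterface {                       // preset inference interface
    virtual ~InferenceInterface() = default;
    virtual std::string name() const = 0;
};

using Factory = std::function<std::unique_ptr<InferenceInterface>()>;

std::map<std::string, Factory>& registry() {      // platform name -> factory
    static std::map<std::string, Factory> r;
    return r;
}

struct CpuModel : InferenceInterface {            // a registered model class
    std::string name() const override { return "cpu model class"; }
};

// Registration would normally happen once per back-end at start-up.
bool cpu_registered = [] {
    registry()["cpu"] = [] { return std::make_unique<CpuModel>(); };
    return true;
}();

int main() {
    std::string platform = "cpu";                 // configuration platform parameter
    auto it = registry().find(platform);
    if (it == registry().end()) { std::cerr << "not registered\n"; return 1; }
    auto model = it->second();                    // back-end registered in the config
    std::cout << model->name() << '\n';
}
```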
4. The data processing method based on machine learning model inference according to claim 2, wherein the processing a preset model file based on the model path parameters to obtain model information generated in an inference device corresponding to the back-end interface comprises:
acquiring a model file stored in a host memory according to the model path parameters;
and transmitting the model file in the host memory to the inference device corresponding to the back-end interface, so as to parse the model file by using the inference device to obtain the model information.
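The host-to-device handoff above can be sketched as follows, with the inference device stubbed out as a second host buffer; load_to_host, transfer_to_device, parse_on_device, and the file name "model.bin" are hypothetical, and a real back-end would replace the copy with its runtime's own upload call (for example a cudaMemcpy on CUDA devices).

```cpp
// Stubbed flow: host memory load -> transfer to "device" -> parse on device.
// The device is simulated with a plain buffer; all names are assumptions.
#include <cstddef>
#include <cstdint>
#include <fstream>
#include <iostream>
#include <iterator>
#include <string>
#include <vector>

// Read the model file from disk into host memory (first step).
std::vector<std::uint8_t> load_to_host(const std::string& model_path) {
    std::ifstream f(model_path, std::ios::binary);
    return std::vector<std::uint8_t>(std::istreambuf_iterator<char>(f),
                                     std::istreambuf_iterator<char>());
}

struct DeviceBuffer { std::vector<std::uint8_t> bytes; };   // stand-in for device memory

// Transfer host memory to the inference device (here: a memcpy-like copy).
DeviceBuffer transfer_to_device(const std::vector<std::uint8_t>& host) {
    return DeviceBuffer{host};
}

struct ModelInfo { std::size_t size_bytes; };

// "Parse" the model on the device to obtain the model information.
ModelInfo parse_on_device(const DeviceBuffer& dev) {
    return ModelInfo{dev.bytes.size()};
}

int main() {
    auto host_model = load_to_host("model.bin");      // hypothetical model path parameter
    auto dev_model = transfer_to_device(host_model);  // host -> inference device
    ModelInfo info = parse_on_device(dev_model);      // model information
    std::cout << "model bytes on device: " << info.size_bytes << '\n';
}
```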
5. The data processing method based on machine learning model inference according to claim 1, wherein before the preprocessing original input data to be processed to obtain preprocessed data stored in the cache space, the method further comprises:
initializing a preset data dictionary to obtain the target data dictionary;
storing the original input data in a host memory, and storing an original data pointer corresponding to the storage position of the original input data in the host memory into the target data dictionary, so as to call the original input data from the host memory by using the original data pointer.
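A minimal sketch of that pre-step follows, under the assumption that the target data dictionary maps stage names to untyped pointers so that host and device buffers can be registered uniformly; the dictionary keys and types are illustrative only.

```cpp
// Sketch: initialize the target data dictionary, keep the original input in
// host memory, and register only a pointer to it. Names are assumptions.
#include <iostream>
#include <string>
#include <unordered_map>
#include <vector>

// Stage name -> untyped pointer, so host and device buffers share one table.
using DataDictionary = std::unordered_map<std::string, void*>;

int main() {
    DataDictionary dict;                       // initialize the preset data dictionary
    dict.reserve(8);                           // e.g. raw / input / output / result slots

    std::vector<float> raw_input = {0.1f, 0.2f, 0.3f};   // original input in host memory
    dict["raw"] = raw_input.data();            // original data pointer into the dictionary

    // Later stages retrieve the original input through the stored pointer.
    auto* raw = static_cast<float*>(dict["raw"]);
    std::cout << "first raw value: " << raw[0] << '\n';
}
```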
6. The data processing method based on machine learning model inference according to claim 3, wherein the performing model inference on the preprocessed data acquired from the cache space according to the input pointer to obtain model inference data comprises:
acquiring a run function in the inference interface;
and loading an inference function in the model class based on the run function, and performing model inference on the preprocessed data acquired by using the input pointer, to obtain the model inference data.
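One possible shape for that dispatch is a thin forwarding layer from the inference interface's run function to the model class's inference function, reading the preprocessed data through the input pointer. The doubling "model" and all names below are placeholders; the claim does not prescribe this mechanism.

```cpp
// Hypothetical dispatch: the inference interface's run function forwards to
// the model class's inference function, using the stored input pointer.
#include <cstddef>
#include <iostream>
#include <vector>

struct ModelClass {                                   // model class (see claim 3)
    // Inference function: doubles each value as a stand-in for a real model.
    std::vector<float> infer(const float* input, std::size_t n) const {
        std::vector<float> out(n);
        for (std::size_t i = 0; i < n; ++i) out[i] = 2.0f * input[i];
        return out;
    }
};

struct InferenceInterface {                           // exposes the run function
    ModelClass model;
    std::vector<float> run(const float* input_ptr, std::size_t n) const {
        return model.infer(input_ptr, n);             // load and call inference function
    }
};

int main() {
    std::vector<float> preprocessed = {1.f, 2.f, 3.f};
    const float* input_ptr = preprocessed.data();     // input pointer from the dictionary

    InferenceInterface iface;
    std::vector<float> model_output = iface.run(input_ptr, preprocessed.size());
    std::cout << "model inference data[2] = " << model_output[2] << '\n';
}
```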
7. A data processing apparatus based on machine learning model inference, comprising:
the initialization module is used for acquiring platform equipment information according to a preset configuration file, initializing the platform equipment information and determining a cache space corresponding to the platform equipment information;
the preprocessing module is used for preprocessing original input data to be processed to obtain preprocessed data stored in the cache space, and storing an input pointer corresponding to a first position of the preprocessed data in the cache space into a target data dictionary;
the model inference module is used for performing model inference on the preprocessed data acquired from the cache space according to the input pointer to obtain model inference data, and storing an output pointer corresponding to a second position of the model inference data in the cache space into the target data dictionary;
and the post-processing module is used for post-processing the model inference data acquired from the cache space according to the output pointer to obtain inference data stored in the cache space, and storing a target pointer corresponding to a third position of the inference data in the cache space into the target data dictionary.
8. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor when executing the program implements the steps of the data processing method based on machine learning model inference as claimed in any one of claims 1 to 6.
9. A non-transitory computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the data processing method based on machine learning model inference as claimed in any one of claims 1 to 6.
10. A computer program product comprising a computer program, wherein the computer program, when executed by a processor, performs the steps of the method for data processing based on machine learning model inference as claimed in any of claims 1 to 6.
CN202211074172.4A 2022-09-02 2022-09-02 Data processing method, device and equipment based on machine learning model reasoning Pending CN115358404A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211074172.4A CN115358404A (en) 2022-09-02 2022-09-02 Data processing method, device and equipment based on machine learning model reasoning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211074172.4A CN115358404A (en) 2022-09-02 2022-09-02 Data processing method, device and equipment based on machine learning model reasoning

Publications (1)

Publication Number Publication Date
CN115358404A true CN115358404A (en) 2022-11-18

Family

ID=84006223

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211074172.4A Pending CN115358404A (en) 2022-09-02 2022-09-02 Data processing method, device and equipment based on machine learning model reasoning

Country Status (1)

Country Link
CN (1) CN115358404A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115964181A (en) * 2023-03-10 2023-04-14 之江实验室 Data processing method and device, storage medium and electronic equipment
CN116362336A (en) * 2023-06-02 2023-06-30 之江实验室 Model reasoning interaction method, electronic equipment and readable storage medium
CN116362336B (en) * 2023-06-02 2023-08-22 之江实验室 Model reasoning interaction method, electronic equipment and readable storage medium

Similar Documents

Publication Publication Date Title
CN110058922B (en) Method and device for extracting metadata of machine learning task
KR102225822B1 (en) Apparatus and method for generating learning data for artificial intelligence performance
CN115358404A (en) Data processing method, device and equipment based on machine learning model reasoning
US20230126597A1 (en) Container orchestration framework
AU2008332701A1 (en) Templating system and method for updating content in real time
JP7012689B2 (en) Command execution method and device
WO2022042113A1 (en) Data processing method and apparatus, and electronic device and storage medium
CN110249312B (en) Method and system for converting data integration jobs from a source framework to a target framework
US10719303B2 (en) Graphics engine and environment for encapsulating graphics libraries and hardware
US11321090B2 (en) Serializing and/or deserializing programs with serializable state
CN114911465B (en) Method, device and equipment for generating operator and storage medium
CN112784989A (en) Inference system, inference method, electronic device, and computer storage medium
CN111814959A (en) Model training data processing method, device and system and storage medium
CN113449841A (en) Method and device for inserting conversion operator
CN114490116B (en) Data processing method and device, electronic equipment and storage medium
US11409564B2 (en) Resource allocation for tuning hyperparameters of large-scale deep learning workloads
CN116932147A (en) Streaming job processing method and device, electronic equipment and medium
CN112308573A (en) Intelligent customer service method and device, storage medium and computer equipment
CN116226850A (en) Method, device, equipment, medium and program product for detecting virus of application program
CN112230911B (en) Model deployment method, device, computer equipment and storage medium
CN117296042A (en) Application management platform for super-fusion cloud infrastructure
CN112114931B (en) Deep learning program configuration method and device, electronic equipment and storage medium
EP2972837B1 (en) Dynamic memory management for a virtual supercomputer
CN111401560A (en) Inference task processing method, device and storage medium
US20210055971A1 (en) Method and node for managing a request for hardware acceleration by means of an accelerator device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination