CN115358404A - Data processing method, device and equipment based on machine learning model inference - Google Patents

Data processing method, device and equipment based on machine learning model inference

Info

Publication number
CN115358404A
CN115358404A (application CN202211074172.4A)
Authority
CN
China
Prior art keywords
data
model
inference
cache space
reasoning
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211074172.4A
Other languages
Chinese (zh)
Inventor
蒋煜襄
郭聪
于万金
王林芳
陈�峰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jingdong Technology Holding Co Ltd
Original Assignee
Jingdong Technology Holding Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jingdong Technology Holding Co Ltd filed Critical Jingdong Technology Holding Co Ltd
Priority to CN202211074172.4A
Publication of CN115358404A
Status: Pending

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00Computing arrangements using knowledge-based models
    • G06N5/04Inference or reasoning models
    • G06N5/041Abduction
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02Addressing or allocation; Relocation
    • G06F12/0223User address space allocation, e.g. contiguous or non contiguous base addressing
    • G06F12/023Free address space management
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21Design, administration or maintenance of databases
    • G06F16/211Schema design and management
    • G06F16/212Schema design and management with details for data modelling support
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44Arrangements for executing specific programs
    • G06F9/445Program loading or initiating
    • G06F9/44505Configuring for program initiating, e.g. using registry, configuration files

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Evolutionary Computation (AREA)
  • Computational Linguistics (AREA)
  • Artificial Intelligence (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The disclosure provides a data processing method, device, and equipment based on machine learning model inference, and relates to the technical field of artificial intelligence. The method comprises the following steps: acquiring platform equipment information according to a preset configuration file, initializing the platform equipment information, and determining a cache space; preprocessing original input data to be processed to obtain preprocessed data stored in the cache space; performing model inference on the preprocessed data to obtain model inference data stored in the cache space; and post-processing the model inference data to obtain inference data stored in the cache space, and storing a target pointer corresponding to a third position of the inference data in the cache space into a target data dictionary. Because the data of the entire inference pipeline is handled by means of pointers and a data dictionary, both data processing efficiency and model inference efficiency are improved.

Description

Data processing method, device and equipment based on machine learning model inference
Technical Field
The present disclosure relates to the field of artificial intelligence technologies, and in particular, to a data processing method, device, and apparatus based on machine learning model inference.
Background
With the development of big data and artificial intelligence technologies, more and more business scenarios, such as financial risk control, online advertising, commodity recommendation, and smart cities, adopt machine learning on a large scale to improve service quality and the level of intelligent decision-making. For a specific task, after a model is obtained through training in its designated training environment, the model needs to be packaged and then deployed as an online inference service; the inference service can only be used when the runtime environment is the same as the training environment.
In the related art, model inference is usually performed on dedicated inference devices, and each dedicated inference device has different inference performance. Therefore, when multiple dedicated inference devices corresponding to different backends execute the full inference pipeline of a model, data corresponding to multiple inference stages is inevitably generated, and the large amount of data makes model inference inefficient.
Therefore, how to improve the efficiency of model inference is a technical problem that urgently needs to be solved.
Disclosure of Invention
The data processing method, device, and equipment based on machine learning model inference provided by the present disclosure handle the data of the entire inference pipeline by means of pointers and a data dictionary, which improves both data processing efficiency and model inference efficiency.
In a first aspect, the present disclosure provides a data processing method based on machine learning model inference, including:
acquiring platform equipment information according to a preset configuration file, initializing the platform equipment information, and determining a cache space corresponding to the platform equipment information;
preprocessing original input data to be processed to obtain preprocessed data stored in the cache space, and storing an input pointer corresponding to a first position of the preprocessed data in the cache space into a target data dictionary;
performing model inference processing on the preprocessed data acquired from the cache space according to the input pointer to obtain model inference data, and storing an output pointer corresponding to a second position of the model inference data in the cache space into the target data dictionary;
and post-processing the model inference data acquired from the cache space according to the output pointer to obtain inference data stored in the cache space, and storing a target pointer corresponding to a third position of the inference data in the cache space into the target data dictionary.
Preferably, according to the data processing method based on machine learning model inference provided by the present disclosure, the configuration file at least includes: configuring platform parameters and model path parameters;
the method for acquiring the platform equipment information according to the preset configuration file, initializing the platform equipment information, and determining the cache space corresponding to the platform equipment information includes:
screening out a backend interface according to the configuration platform parameters;
processing a preset model file based on the model path parameters to obtain model information generated in inference equipment corresponding to the backend interface;
and according to the model information, creating the cache space in the inference equipment corresponding to the platform equipment information.
Preferably, according to the data processing method based on machine learning model inference provided by the present disclosure, the screening out the backend interface according to the configuration platform parameters includes:
establishing a model class corresponding to the inference interface according to a preset inference interface;
and determining the back-end interface which is registered in the configuration file according to the model class and the configuration platform parameters.
Preferably, according to the data processing method based on machine learning model inference provided by the present disclosure, processing a preset model file based on the model path parameter to obtain model information generated in inference equipment corresponding to the backend interface includes:
acquiring a model file stored in a host memory according to the model path parameters;
and transmitting the model file in the host memory to inference equipment corresponding to the backend interface, so that the inference equipment parses the model file to obtain the model information.
Preferably, according to a data processing method based on machine learning model inference provided by the present disclosure, before the raw input data to be processed is preprocessed to obtain preprocessed data stored in the cache space, the method includes:
initializing a preset data dictionary to obtain the target data dictionary;
and storing the original input data in a host memory, and storing an original data pointer corresponding to the storage position of the original input data in the host memory into the target data dictionary so as to call the original input data from the host memory by using the original data pointer.
Preferably, according to the data processing method based on machine learning model inference provided by the present disclosure, the performing model inference processing on the preprocessed data acquired from the cache space according to the input pointer to obtain model inference data includes:
acquiring a running function in the inference interface;
and loading a reasoning function in the model class based on the operation function, and performing model reasoning on the preprocessed data acquired by using the input pointer to obtain the model reasoning data.
In a second aspect, the present disclosure also provides a data processing apparatus based on machine learning model inference, including:
the initialization module is used for acquiring platform equipment information according to a preset configuration file, initializing the platform equipment information and determining a cache space corresponding to the platform equipment information;
the preprocessing module is used for preprocessing original input data to be processed to obtain preprocessed data stored in the cache space, and storing an input pointer corresponding to a first position of the preprocessed data in the cache space into a target data dictionary;
the model inference module is used for performing model inference processing on the preprocessed data acquired from the cache space according to the input pointer to obtain model inference data, and storing an output pointer corresponding to a second position of the model inference data in the cache space into the target data dictionary;
and the post-processing module is used for post-processing the model inference data acquired from the cache space according to the output pointer to obtain inference data stored in the cache space, and storing a target pointer corresponding to a third position of the inference data in the cache space into the target data dictionary.
In a third aspect, the present disclosure also provides an electronic device, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor implements the steps of the data processing method based on machine learning model inference as described in any one of the above when executing the program.
In a fourth aspect, the present disclosure also provides a non-transitory computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the data processing method based on machine learning model inference as described in any one of the above.
In a fifth aspect, the present disclosure also provides a computer program product comprising a computer program which, when executed by a processor, implements the steps of the data processing method based on machine learning model inference as described in any one of the above.
According to the data processing method, the data processing device and the data processing equipment based on the machine learning model inference, platform equipment information is obtained according to a preset configuration file, initialization processing is carried out on the platform equipment information, and a cache space corresponding to the platform equipment information is determined; preprocessing original input data to be processed to obtain preprocessed data stored in the cache space, and storing an input pointer corresponding to a first position of the preprocessed data in the cache space into a target data dictionary; performing model inference processing on the preprocessed data acquired from the cache space according to the input pointer to obtain model inference data, and storing an output pointer corresponding to a second position of the model inference data in the cache space into the target data dictionary; and post-processing the model inference data acquired from the cache space according to the output pointer to obtain inference data stored in the cache space, and storing a target pointer corresponding to a third position of the inference data in the cache space into the target data dictionary. The data in the whole reasoning process is processed in a pointer and data dictionary mode, so that the data processing efficiency is improved, and the model reasoning efficiency is also improved.
Drawings
In order to more clearly illustrate the technical solutions of the present disclosure or the prior art, the drawings needed for the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present disclosure, and other drawings can be obtained by those skilled in the art without creative efforts.
Fig. 1 is a schematic flow chart of a data processing method based on machine learning model inference according to an embodiment of the present disclosure;
FIG. 2 is a diagram of interfaces and model classes of a standard inference interface in accordance with an embodiment of the present disclosure;
FIG. 3 is a schematic flowchart of step S100 in FIG. 1 according to an embodiment of the disclosure;
FIG. 4 is a schematic flowchart of step S310 in FIG. 3 according to an embodiment of the present disclosure;
FIG. 5 is a schematic flowchart of step S320 in FIG. 3 according to an embodiment of the disclosure;
fig. 6 is a schematic structural diagram of a data processing apparatus based on machine learning model inference provided in an embodiment of the present disclosure;
fig. 7 is a schematic structural diagram of an electronic device provided in an embodiment of the present disclosure.
Detailed Description
To make the objects, technical solutions and advantages of the present disclosure more apparent, the technical solutions of the present disclosure will be clearly and completely described below with reference to the accompanying drawings in the present disclosure, and it is obvious that the described embodiments are some, but not all embodiments of the present disclosure. All other embodiments, which can be derived by a person skilled in the art from the embodiments disclosed herein without inventive step, are intended to be within the scope of the present disclosure.
Standard inference interface (InferDK): the standard inference interface is obtained by encapsulating the low-level inference capability provided by the backend. It is used to load and initialize the model and to control data transfer between a device and the host or between devices, and it includes a model-driven execution interface, so that the thread safety of model inference can be guaranteed and an access interface to the basic parameters of the model is provided. Because the standard inference interface is encapsulated, the upper-layer stage processes are unaware of the underlying hardware devices when executing operations such as model inference, and more model inference capabilities can be supported by extending the backends of the standard inference interface. Here, an upper-layer stage process is the logic layer that sits directly above the logic layer of the standard inference interface and corresponds to the service and process framework.
Backend (backend): the backend is typically encapsulated using the native runtime interface (runtime API) provided by the hardware device vendor.
Image processing interface (ProcessDK): based on the image (and similar) processing units provided by the hardware devices, the processing capabilities of different hardware are encapsulated uniformly, so that the upper-layer stages are shielded from the various interfaces exposed by each hardware processing unit. For inference on image input data, typical atomic capabilities such as cropping and scaling are provided.
Data dictionary: a data dictionary is a catalog of database and application metadata that users can access; in the embodiments of the present disclosure, it is used to store multiple pieces of data-pointer information so that data in the cache space can be retrieved quickly.
Pointer: in computer science, a pointer is a programming-language object whose value is an address that points directly to a value stored elsewhere in computer memory; in the embodiments of the present disclosure, it points to the location of data stored in the cache space.
Cache space (buffer): in computing, the cache space here refers to a buffer that temporarily stores data sent from an external device until the processor takes it away, or temporarily stores data sent from the processor to an external device (a backend inference device). With the cache space, a high-speed CPU and lower-speed peripherals can be coordinated, and data transfers can be synchronized.
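To make the cooperation between the data dictionary, pointers, and cache space concrete, a minimal C++ sketch is given below. It is an illustration only: the type and member names (BufferSlot, DataDict, the stage keys) are assumptions introduced here and are not part of the patent.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>

// One entry of the hypothetical data dictionary: a raw pointer into the cache
// space on the inference device (or into host memory), plus the data size.
struct BufferSlot {
    void*       ptr  = nullptr;   // location of the data in the cache space
    std::size_t size = 0;         // number of bytes stored at that location
};

// The data dictionary maps a stage name ("raw", "preprocessed",
// "model_output", "result") to the slot describing where that stage's data
// lives, so later stages fetch data by pointer instead of copying it.
using DataDict = std::unordered_map<std::string, BufferSlot>;
```

A preprocessing stage would then record its output as `dict["preprocessed"] = {device_ptr, nbytes};`, and the inference stage would look the same entry up again instead of receiving a copy of the data.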
The following describes a data processing method, apparatus, and device based on machine learning model inference, which are provided in the embodiments of the present disclosure, with reference to fig. 1 to fig. 7, and can process data in the entire inference process in a pointer and data dictionary manner, so that not only the data processing efficiency is improved, but also the efficiency of model inference is improved.
Specifically, the following embodiments are used to explain, and first, a data processing method based on machine learning model inference in the embodiments of the present disclosure is described.
As shown in fig. 1, which is a schematic flow chart of an implementation of a data processing method based on machine learning model inference according to an embodiment of the present disclosure, the data processing method based on machine learning model inference may include, but is not limited to, steps S100 to S400.
S100, acquiring platform equipment information according to a preset configuration file, initializing the platform equipment information, and determining a cache space corresponding to the platform equipment information;
s200, preprocessing original input data to be processed to obtain preprocessed data stored in the cache space, and storing an input pointer corresponding to a first position of the preprocessed data in the cache space into a target data dictionary;
s300, performing model inference processing on the preprocessed data acquired from the cache space according to the input pointer to obtain model inference data, and storing an output pointer corresponding to a second position of the model inference data in the cache space into the target data dictionary;
s400, performing post-processing on the model inference data acquired from the cache space according to the output pointer to obtain inference data stored in the cache space, and storing a target pointer corresponding to a third position of the inference data in the cache space into the target data dictionary.
In step S100 of some embodiments, platform device information is obtained according to a preset configuration file, and the platform device information is initialized to determine a cache space corresponding to the platform device information. It can be understood that, the platform device information is obtained through the configuration file written by the user, and the platform device information may be obtained in the configuration form of YAML or JSON.
Further, YAML is a highly readable format for expressing data serialization.
JSON (JavaScript Object Notation) is a lightweight data exchange format.
Further, the platform device information acquired from the configuration file is initialized and a cache space is determined. The specific implementation may be as follows: a model class corresponding to the inference interface is created according to a preset inference interface, and the backend interface that has been registered in the configuration file is determined according to the model class and the configuration platform parameters. Then a model file stored in the host memory is acquired according to the model path parameters, and the model file in the host memory is transmitted to the inference equipment corresponding to the backend interface, so that the inference equipment parses the model file to obtain the model information. Finally, the cache space is created, according to the model information, in the inference equipment corresponding to the platform device information.
It should be noted that the configuration file at least includes: the configuration platform parameter (Platform) and the model path parameter (ModelPath).
Model path parameter (ModelPath): the model repository where the model file is located, or the path where the model file is stored, which may be cloud storage or a locally stored model file; the type is string.
Configuration platform parameter (Platform): the inference backend on which the model file runs, for example tensorrt, cann, openvino, mnn, etc.; the type is string.
Of course, as shown with reference to fig. 2, the configuration file may also include, but is not limited to:
model name (ModelName): is a unique identifier of the model, is unique in the same software development tool (SDK), and has a character string type.
Multiplexing memory function (Use _ device _ buffer): whether the memory of the inference device is multiplexed or not, wherein the type is Boolean type (boolean).
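Putting these fields together, a configuration file of the kind described above might look like the following hypothetical YAML example; the concrete values (model name, path, and backend) are placeholders rather than content taken from the patent.

```yaml
# Hypothetical inference configuration using the fields described above
ModelName: demo_classifier          # unique model identifier within one SDK
ModelPath: /data/models/demo.plan   # model repository path (local or cloud storage)
Platform: tensorrt                  # inference backend: tensorrt / cann / openvino / mnn
Use_device_buffer: true             # reuse device memory instead of copying (boolean)
```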
Further, referring to the interface and model class diagram of the standard inference interface (InferDK) shown in FIG. 2, the inference interface may include at least, but is not limited to, the following:
a Run function and a model class (ModelLoader).
The model class (ModelLoader) may include at least, but is not limited to, the following functions:
an Inference function, a Transfer function, a data dimension function (GetDims), and a registration function (Register).
The model class corresponds to a plurality of backend interfaces, each of which may include at least, but is not limited to, the following: an Inference function, a Transfer function, a data dimension function (GetDims), and a device identifier (DeviceId).
It can be understood that, by invoking the Run function in the inference interface, the model class and the Inference functions of the backend interfaces can be invoked to carry out model inference.
The Transfer function is used to transfer data between the host side and the inference device side.
The data dimension function (GetDims) is used to control the dimensions and the size of input or output data.
The registration function is used to identify whether a backend interface has been registered, so as to screen out the available backend interfaces.
Optionally, the backend object names corresponding to the plurality of backend interfaces may be TrtModelLoader, CannModelLoader, and CpuModelLoader.
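The relationship between the inference interface, the model class, and the backend loaders in FIG. 2 can be sketched roughly as the following C++ interfaces. The signatures are assumptions made for illustration and do not reproduce the patent's actual code.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

struct DimInfo { std::vector<int> dims; std::size_t bytes; };  // shape and byte size

// Backend-side model loader: one concrete subclass per inference backend
// (e.g. TrtModelLoader, CannModelLoader, CpuModelLoader).
class ModelLoader {
public:
    virtual ~ModelLoader() = default;
    virtual void    Inference(void* input, void* output) = 0;      // run the model
    virtual void    Transfer(void* dst, const void* src,
                             std::size_t bytes, int direction) = 0; // H2D/D2D/D2H/H2H
    virtual DimInfo GetDims(const std::string& tensorName) = 0;     // input/output dims
    virtual int     DeviceId() const = 0;                           // backend device id
};

// Standard inference interface (InferDK): hides the backend behind Run().
class InferDK {
public:
    explicit InferDK(std::unique_ptr<ModelLoader> loader)
        : loader_(std::move(loader)) {}

    // Run() drives the backend's Inference() on data already placed in the
    // cache space, so upper layers never touch the hardware directly.
    void Run(void* input, void* output) { loader_->Inference(input, output); }

private:
    std::unique_ptr<ModelLoader> loader_;
};
```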
In step S200 of some embodiments, original input data to be processed is preprocessed to obtain preprocessed data stored in the cache space, and an input pointer corresponding to a first position of the preprocessed data in the cache space is stored in a target data dictionary. It can be understood that, after the step S100 is executed, platform device information is obtained according to a preset configuration file, the platform device information is initialized, a cache space corresponding to the platform device information is determined, the raw input data to be processed is preprocessed, preprocessed data is obtained after preprocessing, the preprocessed data is stored in the cache space created in the step S100, and an input pointer corresponding to a first position of the preprocessed data in the cache space is stored in the target data dictionary.
For example, the cache space includes at least, but is not limited to, a first position, a second position, a third position, and a fourth position. The preprocessed data is stored at the first position, and an input pointer pointing to the first position is stored in the target data dictionary, so that the preprocessed data can be quickly obtained from the cache space through the input pointer in the target data dictionary.
In step S300 of some embodiments, a model inference process is performed on the preprocessed data obtained from the cache space according to the input pointer to obtain model inference data, and an output pointer corresponding to a second position of the model inference data in the cache space is stored in the target data dictionary. It can be understood that, after the step S200 is performed to preprocess the original input data to be processed to obtain the preprocessed data stored in the cache space, and the input pointer corresponding to the first position of the preprocessed data in the cache space is stored in the target data dictionary, the preprocessed data is first quickly acquired from the cache space according to the input pointer in the target data dictionary, and then the preprocessed data is subjected to model inference processing, specifically, the preprocessed data is asynchronously transmitted to the inference device corresponding to the standard inference interface (InferDK) to load the inference model for inference processing, so as to obtain the model inference data.
And further, the model inference data is stored in a second position in the cache space, and an output pointer pointing to the second position is stored in the target data dictionary and used for rapidly acquiring the model inference data from the cache space through the output pointer in the target data dictionary, so that the inference efficiency of the model inference is improved.
In step S400 in some embodiments, the model inference data acquired from the cache space according to the output pointer is post-processed to obtain inference data stored in the cache space, and a target pointer corresponding to a third position of the inference data in the cache space is stored in the target data dictionary. It can be understood that, after the step S300 is executed to perform the model inference process on the preprocessed data obtained from the cache space according to the input pointer to obtain the model inference data, and store the output pointer corresponding to the second position of the model inference data in the cache space into the target data dictionary, the specific execution step may be to quickly obtain the model inference data from the cache space according to the output pointer in the target data dictionary, and perform the post-processing on the model inference data to obtain the inference data.
Optionally, the inference data is stored in a third position in the cache space, and a target pointer pointing to the third position is stored in the target data dictionary, so that the inference data can be quickly acquired from the cache space through the target pointer in the target data dictionary, and the multiplexing efficiency of the inference data after model inference is improved.
Optionally, the specific step of post-processing the model inference data to obtain the inference data may be to segment each piece of model inference data by using a preset segmentation rule, where the segmentation rule is to segment each piece of model inference data into a number of pieces of inference data equal to a preset number threshold.
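Steps S200 to S400 can be summarized by the following sketch, which chains the three stages together through the data dictionary introduced earlier. The helper functions (Preprocess, AllocateOutputSlot, PostProcess) are hypothetical stand-ins, declared but not defined here, for whatever the concrete pipeline provides; the sketch reuses the DataDict, BufferSlot, and InferDK types from the earlier sketches.

```cpp
// Hypothetical stage helpers assumed to exist elsewhere in the pipeline.
BufferSlot Preprocess(const BufferSlot& raw);
BufferSlot AllocateOutputSlot();
BufferSlot PostProcess(const BufferSlot& modelOutput);

// End-to-end flow of steps S200-S400: each stage writes only a pointer into
// the dictionary, so no stage copies another stage's payload.
void RunPipeline(InferDK& infer, DataDict& dict,
                 void* rawHostData, std::size_t rawBytes) {
    // Raw input stays in host memory; only its pointer enters the dictionary.
    dict["raw"] = {rawHostData, rawBytes};

    // S200: preprocess into the first cache-space position (input pointer).
    BufferSlot pre = Preprocess(dict["raw"]);
    dict["preprocessed"] = pre;

    // S300: model inference reads via the input pointer and writes to the
    // second position; the output pointer is recorded in the dictionary.
    BufferSlot out = AllocateOutputSlot();
    infer.Run(pre.ptr, out.ptr);
    dict["model_output"] = out;

    // S400: post-processing reads via the output pointer and stores the
    // final result's pointer (third position) as the target pointer.
    dict["result"] = PostProcess(dict["model_output"]);
}
```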
In some embodiments, as shown with reference to fig. 3, step S100 may also include, but is not limited to, steps S310 to S330.
S310, screening out a backend interface according to the configuration platform parameters;
s320, processing a preset model file based on the model path parameter to obtain model information generated in inference equipment corresponding to the backend interface;
s330, according to the model information, the cache space is created in the inference equipment corresponding to the platform equipment information.
In step S310 of some embodiments, a backend interface is screened out according to the configuration platform parameters. It can be understood that, screening out the backend interface according to the configuration platform parameter (platform), the specific implementation steps may be: and establishing a model class corresponding to the inference interface according to a preset inference interface, and determining a rear-end interface which is registered in the configuration file according to the model class and the parameters of the configuration platform.
Optionally, the inference interface may include at least, but is not limited to: standard inference interface (InferDK).
A corresponding model class is established according to the standard inference interface, and the backend interfaces that have been registered in the configuration file are determined according to the configuration platform parameter in the model class. For example, multiple backend interfaces are connected to the model class, and only the backend interfaces that have been registered by using the registration function can be used.
In step S320 of some embodiments, a preset model file is processed based on the model path parameter, so as to obtain model information generated in the inference device corresponding to the backend interface. It can be understood that after the step of screening out the backend interface according to the configuration platform parameters in step S310 is performed, the model file is processed based on the model path parameters to obtain the model information.
Optionally, the model information is generated in the inference device corresponding to the backend interface. Each backend interface corresponds to one backend inference device; the backend interface is provided with the backend inference device when it leaves the manufacturer's factory, and the inference capability of the backend inference device can be invoked through the backend interface.
In step S330 of some embodiments, the cache space is created in the inference device corresponding to the platform device information according to the model information. It can be understood that after the step S320 of processing the preset model file based on the model path parameters to obtain the model information generated in the inference device corresponding to the backend interface is completed, the specific steps may be: according to the model information analyzed in step S320, a cache space is created in the inference device for temporarily storing relevant data generated in the whole process of model inference, where the relevant data at least includes: preprocessing data, model inference data, and inference data.
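One way to realize step S330, creating the cache space from the parsed model information, is sketched below; the deviceMalloc call is a placeholder for whichever device-memory allocator the concrete backend exposes and is not an API named in the patent. The sketch reuses the ModelLoader, DimInfo, and DataDict types from the earlier sketches.

```cpp
// Hypothetical allocation of the cache-space positions once the model file
// has been parsed on the inference device (step S330).
void* deviceMalloc(std::size_t bytes);   // assumed backend allocator (declaration only)

void CreateCacheSpace(ModelLoader& loader, DataDict& dict) {
    // Size the buffers from the model information reported by the backend.
    DimInfo in  = loader.GetDims("input");
    DimInfo out = loader.GetDims("output");

    dict["preprocessed"] = {deviceMalloc(in.bytes),  in.bytes};   // first position
    dict["model_output"] = {deviceMalloc(out.bytes), out.bytes};  // second position
    dict["result"]       = {deviceMalloc(out.bytes), out.bytes};  // third position
}
```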
Optionally, the data cached in the cache space may be transferred between the host memory and the inference device through the Transfer function.
Furthermore, when data is passed between stages, the data itself does not need to be transmitted in bulk; only the pointer information pointing to the storage location of the data needs to be passed. Since this pointer information is stored in the target data dictionary, only the target data dictionary needs to be passed, which reduces data transfer, allows the related data to be retrieved quickly, and greatly improves the efficiency of model inference.
In some embodiments, as shown with reference to fig. 4, step S310 may also include, but is not limited to, steps S410 to S420.
S410, creating a model class corresponding to a preset reasoning interface according to the reasoning interface;
s420, determining the back-end interface registered in the configuration file according to the model class and the configuration platform parameters.
In step S410 of some embodiments, a model class corresponding to the inference interface is created according to a preset inference interface. It is to be understood that a model class corresponding to the inference interface is created from the inference interface for determining backend interfaces corresponding to a plurality of backend (backends) from the model class.
In step S420 of some embodiments, the backend interface that has been registered in the configuration file is determined according to the model class and the configuration platform parameters. It can be understood that, after the step S410 of creating the model class corresponding to the inference interface according to the preset inference interface is performed, the specific execution step may be to create the corresponding model class according to the standard inference interface, and determine the backend interfaces that have been registered in the configuration file according to the configuration platform parameters in the model class, for example, there are multiple backend interfaces connected with the model class, and only the backend interfaces that have been registered by using the registration function can be used.
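The registration-and-screening logic of steps S410 to S420 amounts to a registry keyed by the Platform string. A minimal sketch is shown below, assuming a factory map that the patent does not spell out; it reuses the ModelLoader type from the earlier sketch.

```cpp
#include <functional>
#include <memory>
#include <string>
#include <unordered_map>

// Hypothetical backend registry: each backend registers a factory under its
// platform name ("tensorrt", "cann", ...); the model class then screens out
// the backend interface named by the configuration file.
class BackendRegistry {
public:
    using Factory = std::function<std::unique_ptr<ModelLoader>()>;

    // Called once per backend, e.g. from that backend's Register function.
    static void Register(const std::string& platform, Factory factory) {
        Table()[platform] = std::move(factory);
    }

    // Screen out the registered backend that matches the Platform parameter;
    // an unregistered platform yields nullptr and cannot be used.
    static std::unique_ptr<ModelLoader> Create(const std::string& platform) {
        auto it = Table().find(platform);
        return it == Table().end() ? nullptr : it->second();
    }

private:
    static std::unordered_map<std::string, Factory>& Table() {
        static std::unordered_map<std::string, Factory> table;
        return table;
    }
};
```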
In some embodiments, as shown with reference to fig. 5, step S320 may also include, but is not limited to, steps S510 to S520.
S510, obtaining a model file stored in a host memory according to the model path parameters;
s520, the model file in the host memory is transmitted to the reasoning equipment corresponding to the back-end interface, so that the reasoning equipment is used for analyzing the model file to obtain the model information.
In step S510 of some embodiments, the model file stored in the host memory is acquired according to the model path parameters. It can be understood that, after the step of determining the backend interface registered in the configuration file according to the model class and the configuration platform parameters in step S420 is completed, the specific execution steps may be: a model file is acquired from a cloud database or a local database according to the model path parameter and stored in the host memory, to be transmitted to the inference equipment for parsing.
In step S520 of some embodiments, the model file in the host memory is transmitted to the inference device corresponding to the backend interface, so that the inference device is used to parse the model file to obtain the model information. It can be understood that, after step S510 is executed, the specific execution step may be to transmit the model file in the host memory to the inference device corresponding to the backend interface, and to parse the model file by using the inference device to obtain the model information.
In some embodiments, before preprocessing the raw input data to be processed in step S200 to obtain the preprocessed data stored in the cache space, the method includes:
initializing a preset data dictionary to obtain the target data dictionary;
and storing the original input data in a host memory, and storing an original data pointer corresponding to the storage position of the original input data in the host memory into the target data dictionary so as to call the original input data from the host memory by using the original data pointer.
It can be understood that, before the raw input data to be processed is preprocessed to obtain the preprocessed data stored in the cache space, a preset data dictionary is initialized to obtain a target data dictionary, and the target data dictionary is used for storing pointer information pointing to the related data stored in the cache space in the target data dictionary.
The original input data are stored in the host memory, and the original data pointers corresponding to the storage positions of the original input data in the host memory are stored in the target data dictionary, so that the original input data can be quickly called from the host memory by using the original data pointers, and the efficiency of model reasoning is improved.
In some embodiments, performing model inference processing on the preprocessed data obtained from the cache space according to the input pointer to obtain model inference data includes:
acquiring a running function in the inference interface;
and loading a reasoning function in the model class based on the operation function, and performing model reasoning on the preprocessed data acquired by using the input pointer to obtain the model reasoning data.
It can be understood that obtaining the running function in the inference interface may be to obtain a Run function, load an inference function in the model class by using the Run function, then obtain the preprocessed data from the cache space by using the inference function in the model class and the inference function in the backend interface, and then transmit the preprocessed data asynchronously to the inference device corresponding to the standard inference interface (InferDK) to load the inference model for inference processing, so as to obtain the model inference data, which greatly improves the efficiency of model inference.
Preferably, in some embodiments, the standard inference interface further initializes the Transfer function according to whether the device-side memory needs to be reused (Use_device_buffer). The Transfer function is implemented on top of the transport capability interfaces provided by the different backend devices: if the original input data to be processed is memory data provided by non-inference hardware, a data copy (H2D) is required; otherwise the pointer is used directly for memory reuse (D2D).
Optionally, the supported transfer directions include data transfer from the host side to the device side (H2D), from the device side to the device side (D2D), from the device side to the host side (D2H), and from the host side to the host side (H2H).
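A Transfer helper that follows the Use_device_buffer logic described above might look like the following sketch. The isDeviceMemory query and backendCopy call are assumptions standing in for the backend vendor's real transport interface; they are declared here but not defined.

```cpp
#include <cstddef>

// Hypothetical transfer-direction selection for the preprocessed input.
enum class CopyKind { H2D, D2D, D2H, H2H };

bool isDeviceMemory(const void* ptr);                 // assumed backend query
void backendCopy(void* dst, const void* src,
                 std::size_t bytes, CopyKind kind);   // assumed vendor copy call

// If the data already lives in device memory, the pointer is reused (D2D);
// otherwise it is copied from host to device (H2D) into a scratch buffer.
void* PrepareDeviceInput(void* src, std::size_t bytes,
                         void* deviceScratch, bool useDeviceBuffer) {
    if (useDeviceBuffer && isDeviceMemory(src)) {
        return src;                                   // memory reuse, no copy
    }
    backendCopy(deviceScratch, src, bytes, CopyKind::H2D);
    return deviceScratch;
}
```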
Preferably, regarding the multiple backends of the image processing interface (ProcessDK): an inference device can be composed of multiple dedicated chips, each of which performs a specific processing task; for example, video decoding (from an H.264 video stream to images) is performed by a dedicated chip, which is more efficient than a general-purpose chip.
The image processing interface (ProcessDK) can offload computation tasks that suit a dedicated chip from the host control chip to the more efficient dedicated chip, thereby optimizing the entire data processing flow.
Preferably, the data processing flow at least comprises preprocessing, model inference, and post-processing. The host CPU controls the overall inference process, operations such as encoding, decoding, and rendering are handled by a codec chip, and model inference is handled by a GPU/NPU suited to matrix operations.
According to the data processing method based on machine learning model reasoning, platform equipment information is obtained according to a preset configuration file, initialization processing is carried out on the platform equipment information, and a cache space corresponding to the platform equipment information is determined; preprocessing original input data to be processed to obtain preprocessed data stored in the cache space, and storing an input pointer corresponding to a first position of the preprocessed data in the cache space into a target data dictionary; performing model inference processing on the preprocessed data acquired from the cache space according to the input pointer to obtain model inference data, and storing an output pointer corresponding to a second position of the model inference data in the cache space into the target data dictionary; and post-processing the model inference data acquired from the cache space according to the output pointer to obtain inference data stored in the cache space, and storing a target pointer corresponding to a third position of the inference data in the cache space into the target data dictionary. The data in the whole reasoning process is processed in a pointer and data dictionary mode, so that the data processing efficiency is improved, and the model reasoning efficiency is also improved.
The following describes the data processing apparatus based on machine learning model inference provided by the present disclosure, and the data processing apparatus based on machine learning model inference described below and the data processing method based on machine learning model inference described above may be referred to in correspondence with each other.
Referring to fig. 6, a data processing apparatus based on machine learning model inference includes:
the initialization module 610 is configured to acquire platform device information according to a preset configuration file, perform initialization processing on the platform device information, and determine a cache space corresponding to the platform device information;
the preprocessing module 620 is configured to preprocess original input data to be processed to obtain preprocessed data stored in the cache space, and store an input pointer corresponding to a first position of the preprocessed data in the cache space in a target data dictionary;
the model inference module 630 is configured to perform model inference processing on the preprocessed data obtained from the cache space according to the input pointer to obtain model inference data, and store an output pointer corresponding to a second position of the model inference data in the cache space in the target data dictionary;
and the post-processing module 640 is configured to perform post-processing on the model inference data acquired from the cache space according to the output pointer to obtain inference data stored in the cache space, and store a target pointer corresponding to a third position of the inference data in the cache space into the target data dictionary.
According to the data processing device based on machine learning model reasoning provided by the present disclosure, the configuration file at least comprises: a configuration platform parameter and a model path parameter, and an initialization module 610, specifically configured to screen out a backend interface according to the configuration platform parameter;
processing a preset model file based on the model path parameters to obtain model information generated in inference equipment corresponding to the backend interface;
and according to the model information, creating the cache space in the inference equipment corresponding to the platform equipment information.
According to the data processing apparatus based on machine learning model inference provided by the present disclosure, the initialization module 610 is specifically configured to create a model class corresponding to an inference interface according to a preset inference interface;
and determining the back-end interface which is registered in the configuration file according to the model class and the configuration platform parameters.
According to the data processing apparatus based on machine learning model inference provided by the present disclosure, the initialization module 610 is specifically configured to obtain a model file stored in a host memory according to the model path parameter;
and transmitting the model file in the host memory to inference equipment corresponding to the backend interface, so that the inference equipment parses the model file to obtain the model information.
According to the present disclosure, a data processing apparatus based on machine learning model inference is provided, including:
the storage module is used for initializing a preset data dictionary to obtain the target data dictionary before preprocessing the original input data to be processed to obtain preprocessed data stored in the cache space;
and storing the original input data in a host memory, and storing an original data pointer corresponding to the storage position of the original input data in the host memory into the target data dictionary so as to call the original input data from the host memory by using the original data pointer.
According to the data processing apparatus based on machine learning model inference provided by the present disclosure, the model inference module 630 is specifically configured to obtain an operation function in the inference interface;
and loading a reasoning function in the model class based on the running function, and performing model reasoning processing on the preprocessed data acquired by using the input pointer to obtain the model reasoning data.
According to the data processing device based on machine learning model inference, platform equipment information is obtained according to a preset configuration file, initialization processing is carried out on the platform equipment information, and a cache space corresponding to the platform equipment information is determined; preprocessing original input data to be processed to obtain preprocessed data stored in the cache space, and storing an input pointer corresponding to a first position of the preprocessed data in the cache space into a target data dictionary; performing model inference processing on the preprocessed data acquired from the cache space according to the input pointer to obtain model inference data, and storing an output pointer corresponding to a second position of the model inference data in the cache space into the target data dictionary; and post-processing the model inference data acquired from the cache space according to the output pointer to obtain inference data stored in the cache space, and storing a target pointer corresponding to a third position of the inference data in the cache space into the target data dictionary. The data in the whole reasoning process is processed in a pointer and data dictionary mode, so that the data processing efficiency is improved, and the model reasoning efficiency is also improved.
Fig. 7 illustrates a physical structure diagram of an electronic device, which may include, as shown in fig. 7: a processor (processor) 710, a communication Interface (Communications Interface) 720, a memory (memory) 730, and a communication bus 740, wherein the processor 710, the communication Interface 720, and the memory 730 communicate with each other via the communication bus 740. Processor 710 may invoke logic instructions in memory 730 to perform a method of machine learning model inference based data processing, the method comprising: acquiring platform equipment information according to a preset configuration file, initializing the platform equipment information, and determining a cache space corresponding to the platform equipment information; preprocessing original input data to be processed to obtain preprocessed data stored in the cache space, and storing an input pointer corresponding to a first position of the preprocessed data in the cache space into a target data dictionary; performing model inference processing on the preprocessed data acquired from the cache space according to the input pointer to obtain model inference data, and storing an output pointer corresponding to a second position of the model inference data in the cache space into the target data dictionary; and post-processing the model inference data acquired from the cache space according to the output pointer to obtain inference data stored in the cache space, and storing a target pointer corresponding to a third position of the inference data in the cache space into the target data dictionary.
In addition, the logic instructions in the memory 730 can be implemented in the form of software functional units and stored in a computer readable storage medium when the software functional units are sold or used as independent products. Based on such understanding, the technical solution of the present disclosure may be embodied in the form of a software product, which is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present disclosure. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes.
In another aspect, the present disclosure also provides a computer program product, the computer program product includes a computer program, the computer program can be stored on a non-transitory computer readable storage medium, when the computer program is executed by a processor, a computer can execute the data processing method based on machine learning model inference provided by the above methods, the method includes: acquiring platform equipment information according to a preset configuration file, initializing the platform equipment information, and determining a cache space corresponding to the platform equipment information; preprocessing original input data to be processed to obtain preprocessed data stored in the cache space, and storing an input pointer corresponding to a first position of the preprocessed data in the cache space into a target data dictionary; performing model inference processing on the preprocessed data acquired from the cache space according to the input pointer to obtain model inference data, and storing an output pointer corresponding to a second position of the model inference data in the cache space into the target data dictionary; and post-processing the model inference data acquired from the cache space according to the output pointer to obtain inference data stored in the cache space, and storing a target pointer corresponding to a third position of the inference data in the cache space into the target data dictionary.
In yet another aspect, the present disclosure also provides a non-transitory computer-readable storage medium having stored thereon a computer program, which when executed by a processor, implements a method of data processing based on machine learning model inference provided by the above methods, the method comprising: acquiring platform equipment information according to a preset configuration file, initializing the platform equipment information, and determining a cache space corresponding to the platform equipment information; preprocessing original input data to be processed to obtain preprocessed data stored in the cache space, and storing an input pointer corresponding to a first position of the preprocessed data in the cache space into a target data dictionary; performing model inference processing on the preprocessed data acquired from the cache space according to the input pointer to obtain model inference data, and storing an output pointer corresponding to a second position of the model inference data in the cache space into the target data dictionary; and post-processing the model inference data acquired from the cache space according to the output pointer to obtain inference data stored in the cache space, and storing a target pointer corresponding to a third position of the inference data in the cache space into the target data dictionary.
The above-described embodiments of the apparatus are merely illustrative, and the units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.
Through the above description of the embodiments, those skilled in the art will clearly understand that each embodiment can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware. With this understanding in mind, the above-described technical solutions may be embodied in the form of a software product, which can be stored in a computer-readable storage medium such as ROM/RAM, magnetic disk, optical disk, etc., and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the methods described in the embodiments or some parts of the embodiments.
Finally, it should be noted that: the above examples are only intended to illustrate the technical solutions of the present disclosure, not to limit them; although the present disclosure has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present disclosure.

Claims (10)

1. A data processing method based on machine learning model reasoning is characterized by comprising the following steps:
acquiring platform equipment information according to a preset configuration file, initializing the platform equipment information, and determining a cache space corresponding to the platform equipment information;
preprocessing original input data to be processed to obtain preprocessed data stored in the cache space, and storing an input pointer corresponding to a first position of the preprocessed data in the cache space into a target data dictionary;
performing model reasoning on the preprocessed data acquired from the cache space according to the input pointer to obtain model reasoning data, and storing an output pointer corresponding to a second position of the model reasoning data in the cache space into the target data dictionary;
and post-processing the model inference data acquired from the cache space according to the output pointer to obtain inference data stored in the cache space, and storing a target pointer corresponding to a third position of the inference data in the cache space into the target data dictionary.
2. The data processing method based on machine learning model inference according to claim 1, wherein the configuration file comprises at least: configuration platform parameters and model path parameters;
and the acquiring platform equipment information according to a preset configuration file, initializing the platform equipment information, and determining a cache space corresponding to the platform equipment information comprises the following steps:
screening out a back-end interface according to the configuration platform parameters;
processing a preset model file based on the model path parameters to obtain model information generated in an inference device corresponding to the back-end interface;
and creating, according to the model information, the cache space in the inference device corresponding to the platform equipment information.
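As a concrete (and purely illustrative) reading of this initialization flow, the sketch below selects a back-end from the configuration platform parameter, "parses" the model, and creates the cache space. Every name in it (Config, Backend, CpuBackend, select_backend) is a hypothetical stand-in, and the inference device is simulated with host memory rather than a vendor runtime.

```cpp
// Illustrative initialization flow: read the configuration platform parameter
// and model path, pick a back-end interface, parse the model, and create the
// cache space. Names are assumptions, not the patent's API.
#include <cstddef>
#include <iostream>
#include <memory>
#include <stdexcept>
#include <string>
#include <vector>

struct Config {                       // parsed from the preset configuration file
    std::string platform;             // configuration platform parameter, e.g. "cpu"
    std::string model_path;           // model path parameter
};

struct ModelInfo {                    // model information produced on the device
    std::size_t input_bytes;
    std::size_t output_bytes;
};

class Backend {                       // stand-in for a back-end interface
public:
    virtual ~Backend() = default;
    virtual ModelInfo parse_model(const std::vector<char>& bytes) = 0;
    virtual std::vector<char> create_cache(const ModelInfo& info) = 0;
};

class CpuBackend : public Backend {   // trivial host-memory back-end
public:
    ModelInfo parse_model(const std::vector<char>& bytes) override {
        return ModelInfo{bytes.size(), bytes.size()};    // fake "parsing"
    }
    std::vector<char> create_cache(const ModelInfo& info) override {
        return std::vector<char>(info.input_bytes + info.output_bytes);
    }
};

std::unique_ptr<Backend> select_backend(const Config& cfg) {
    if (cfg.platform == "cpu") return std::make_unique<CpuBackend>();
    throw std::runtime_error("unknown platform: " + cfg.platform);
}

int main() {
    Config cfg{"cpu", "model.bin"};                        // hypothetical config values
    auto backend = select_backend(cfg);                    // screen out back-end interface

    std::vector<char> model_bytes(128, 0);                 // pretend model file contents
    ModelInfo info = backend->parse_model(model_bytes);    // model information
    std::vector<char> cache = backend->create_cache(info); // cache space per model info

    std::cout << "cache bytes: " << cache.size() << '\n';
}
```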
3. The data processing method based on machine learning model inference according to claim 2, wherein the screening out a back-end interface according to the configuration platform parameters comprises:
establishing, according to a preset inference interface, a model class corresponding to the inference interface;
and determining, according to the model class and the configuration platform parameters, the back-end interface registered in the configuration file.
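This claim reads naturally as a registry pattern: model classes register themselves against the preset inference interface, and the configuration platform parameters select which registered back-end is used. The sketch below assumes that interpretation; the registry, factory, and CpuModel names are illustrative and not the disclosed implementation.

```cpp
// Hypothetical registry keyed by platform name: model classes register
// themselves against the inference interface, and the configuration platform
// parameter picks the registered back-end at run time.
#include <functional>
#include <iostream>
#include <map>
#include <memory>
#include <string>

struct InferenceInterface {                       // preset inference interface
    virtual ~InferenceInterface() = default;
    virtual std::string name() const = 0;
};

using Factory = std::function<std::unique_ptr<InferenceInterface>()>;

std::map<std::string, Factory>& registry() {      // platform name -> factory
    static std::map<std::string, Factory> r;
    return r;
}

struct CpuModel : InferenceInterface {            // a registered model class
    std::string name() const override { return "cpu model class"; }
};

// Registration would normally happen once per back-end at start-up.
bool cpu_registered = [] {
    registry()["cpu"] = [] { return std::make_unique<CpuModel>(); };
    return true;
}();

int main() {
    std::string platform = "cpu";                 // configuration platform parameter
    auto it = registry().find(platform);
    if (it == registry().end()) { std::cerr << "not registered\n"; return 1; }
    auto model = it->second();                    // back-end registered in the config
    std::cout << model->name() << '\n';
}
```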
4. The data processing method based on machine learning model inference according to claim 2, wherein the processing a preset model file based on the model path parameters to obtain model information generated in an inference device corresponding to the back-end interface comprises:
acquiring a model file stored in a host memory according to the model path parameters;
and transmitting the model file in the host memory to the inference device corresponding to the back-end interface, so as to parse the model file by using the inference device to obtain the model information.
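The host-to-device handoff above can be sketched as follows, with the inference device stubbed out as a second host buffer; load_to_host, transfer_to_device, parse_on_device, and the file name "model.bin" are hypothetical, and a real back-end would replace the copy with its runtime's own upload call (for example a cudaMemcpy on CUDA devices).

```cpp
// Stubbed flow: host memory load -> transfer to "device" -> parse on device.
// The device is simulated with a plain buffer; all names are assumptions.
#include <cstddef>
#include <cstdint>
#include <fstream>
#include <iostream>
#include <iterator>
#include <string>
#include <vector>

// Read the model file from disk into host memory (first step).
std::vector<std::uint8_t> load_to_host(const std::string& model_path) {
    std::ifstream f(model_path, std::ios::binary);
    return std::vector<std::uint8_t>(std::istreambuf_iterator<char>(f),
                                     std::istreambuf_iterator<char>());
}

struct DeviceBuffer { std::vector<std::uint8_t> bytes; };   // stand-in for device memory

// Transfer host memory to the inference device (here: a memcpy-like copy).
DeviceBuffer transfer_to_device(const std::vector<std::uint8_t>& host) {
    return DeviceBuffer{host};
}

struct ModelInfo { std::size_t size_bytes; };

// "Parse" the model on the device to obtain the model information.
ModelInfo parse_on_device(const DeviceBuffer& dev) {
    return ModelInfo{dev.bytes.size()};
}

int main() {
    auto host_model = load_to_host("model.bin");      // hypothetical model path parameter
    auto dev_model = transfer_to_device(host_model);  // host -> inference device
    ModelInfo info = parse_on_device(dev_model);      // model information
    std::cout << "model bytes on device: " << info.size_bytes << '\n';
}
```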
5. The data processing method based on machine learning model inference according to claim 1, wherein before the preprocessing original input data to be processed to obtain preprocessed data stored in the cache space, the method further comprises:
initializing a preset data dictionary to obtain the target data dictionary;
storing the original input data in a host memory, and storing an original data pointer corresponding to the storage position of the original input data in the host memory into the target data dictionary, so as to call the original input data from the host memory by using the original data pointer.
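A minimal sketch of that pre-step follows, under the assumption that the target data dictionary maps stage names to untyped pointers so that host and device buffers can be registered uniformly; the dictionary keys and types are illustrative only.

```cpp
// Sketch: initialize the target data dictionary, keep the original input in
// host memory, and register only a pointer to it. Names are assumptions.
#include <iostream>
#include <string>
#include <unordered_map>
#include <vector>

// Stage name -> untyped pointer, so host and device buffers share one table.
using DataDictionary = std::unordered_map<std::string, void*>;

int main() {
    DataDictionary dict;                       // initialize the preset data dictionary
    dict.reserve(8);                           // e.g. raw / input / output / result slots

    std::vector<float> raw_input = {0.1f, 0.2f, 0.3f};   // original input in host memory
    dict["raw"] = raw_input.data();            // original data pointer into the dictionary

    // Later stages retrieve the original input through the stored pointer.
    auto* raw = static_cast<float*>(dict["raw"]);
    std::cout << "first raw value: " << raw[0] << '\n';
}
```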
6. The data processing method based on machine learning model inference according to claim 3, wherein the performing model inference on the preprocessed data acquired from the cache space according to the input pointer to obtain model inference data comprises:
acquiring a run function in the inference interface;
and loading an inference function in the model class based on the run function, and performing model inference on the preprocessed data acquired by using the input pointer, to obtain the model inference data.
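One possible shape for that dispatch is a thin forwarding layer from the inference interface's run function to the model class's inference function, reading the preprocessed data through the input pointer. The doubling "model" and all names below are placeholders; the claim does not prescribe this mechanism.

```cpp
// Hypothetical dispatch: the inference interface's run function forwards to
// the model class's inference function, using the stored input pointer.
#include <cstddef>
#include <iostream>
#include <vector>

struct ModelClass {                                   // model class (see claim 3)
    // Inference function: doubles each value as a stand-in for a real model.
    std::vector<float> infer(const float* input, std::size_t n) const {
        std::vector<float> out(n);
        for (std::size_t i = 0; i < n; ++i) out[i] = 2.0f * input[i];
        return out;
    }
};

struct InferenceInterface {                           // exposes the run function
    ModelClass model;
    std::vector<float> run(const float* input_ptr, std::size_t n) const {
        return model.infer(input_ptr, n);             // load and call inference function
    }
};

int main() {
    std::vector<float> preprocessed = {1.f, 2.f, 3.f};
    const float* input_ptr = preprocessed.data();     // input pointer from the dictionary

    InferenceInterface iface;
    std::vector<float> model_output = iface.run(input_ptr, preprocessed.size());
    std::cout << "model inference data[2] = " << model_output[2] << '\n';
}
```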
7. A data processing apparatus based on machine learning model inference, comprising:
the initialization module is used for acquiring platform equipment information according to a preset configuration file, initializing the platform equipment information and determining a cache space corresponding to the platform equipment information;
the preprocessing module is used for preprocessing original input data to be processed to obtain preprocessed data stored in the cache space, and storing an input pointer corresponding to a first position of the preprocessed data in the cache space into a target data dictionary;
the model inference module is used for performing model inference on the preprocessed data acquired from the cache space according to the input pointer to obtain model inference data, and storing an output pointer corresponding to a second position of the model inference data in the cache space into the target data dictionary;
and the post-processing module is used for post-processing the model inference data acquired from the cache space according to the output pointer to obtain inference data stored in the cache space, and storing a target pointer corresponding to a third position of the inference data in the cache space into the target data dictionary.
8. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor when executing the program implements the steps of the data processing method based on machine learning model inference as claimed in any one of claims 1 to 6.
9. A non-transitory computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the data processing method based on machine learning model inference as claimed in any one of claims 1 to 6.
10. A computer program product comprising a computer program, wherein the computer program, when executed by a processor, performs the steps of the method for data processing based on machine learning model inference as claimed in any of claims 1 to 6.
CN202211074172.4A 2022-09-02 2022-09-02 Data processing method, device and equipment based on machine learning model reasoning Pending CN115358404A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211074172.4A CN115358404A (en) 2022-09-02 2022-09-02 Data processing method, device and equipment based on machine learning model reasoning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211074172.4A CN115358404A (en) 2022-09-02 2022-09-02 Data processing method, device and equipment based on machine learning model reasoning

Publications (1)

Publication Number Publication Date
CN115358404A true CN115358404A (en) 2022-11-18

Family

ID=84006223

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211074172.4A Pending CN115358404A (en) 2022-09-02 2022-09-02 Data processing method, device and equipment based on machine learning model reasoning

Country Status (1)

Country Link
CN (1) CN115358404A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115964181A (en) * 2023-03-10 2023-04-14 之江实验室 Data processing method and device, storage medium and electronic equipment
CN116362336A (en) * 2023-06-02 2023-06-30 之江实验室 Model reasoning interaction method, electronic equipment and readable storage medium
CN116362336B (en) * 2023-06-02 2023-08-22 之江实验室 Model reasoning interaction method, electronic equipment and readable storage medium

Similar Documents

Publication Publication Date Title
CN110058922B (en) Method and device for extracting metadata of machine learning task
KR102225822B1 (en) Apparatus and method for generating learning data for artificial intelligence performance
CN115358404A (en) Data processing method, device and equipment based on machine learning model reasoning
US20230126597A1 (en) Container orchestration framework
AU2008332701A1 (en) Templating system and method for updating content in real time
JP7012689B2 (en) Command execution method and device
WO2022042113A1 (en) Data processing method and apparatus, and electronic device and storage medium
CN110249312B (en) Method and system for converting data integration jobs from a source framework to a target framework
US10719303B2 (en) Graphics engine and environment for encapsulating graphics libraries and hardware
US11321090B2 (en) Serializing and/or deserializing programs with serializable state
CN114911465B (en) Method, device and equipment for generating operator and storage medium
CN112784989A (en) Inference system, inference method, electronic device, and computer storage medium
CN111814959A (en) Model training data processing method, device and system and storage medium
CN113449841A (en) Method and device for inserting conversion operator
CN114490116B (en) Data processing method and device, electronic equipment and storage medium
US11409564B2 (en) Resource allocation for tuning hyperparameters of large-scale deep learning workloads
CN116932147A (en) Streaming job processing method and device, electronic equipment and medium
CN112308573A (en) Intelligent customer service method and device, storage medium and computer equipment
CN116226850A (en) Method, device, equipment, medium and program product for detecting virus of application program
CN112230911B (en) Model deployment method, device, computer equipment and storage medium
CN117296042A (en) Application management platform for super-fusion cloud infrastructure
CN112114931B (en) Deep learning program configuration method and device, electronic equipment and storage medium
EP2972837B1 (en) Dynamic memory management for a virtual supercomputer
CN111401560A (en) Inference task processing method, device and storage medium
US20210055971A1 (en) Method and node for managing a request for hardware acceleration by means of an accelerator device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination