CN112906803B - Model integration method, device, server and computer readable storage medium - Google Patents


Info

Publication number
CN112906803B
Authority
CN
China
Prior art keywords
model
engine
operator
operator engine
video stream
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110227581.2A
Other languages
Chinese (zh)
Other versions
CN112906803A (en)
Inventor
杨杭
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chongqing Unisinsight Technology Co Ltd
Original Assignee
Chongqing Unisinsight Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chongqing Unisinsight Technology Co Ltd filed Critical Chongqing Unisinsight Technology Co Ltd
Priority to CN202110227581.2A
Publication of CN112906803A
Application granted
Publication of CN112906803B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/24 Classification techniques
    • G06F 18/40 Software arrangements specially adapted for pattern recognition, e.g. user interfaces or toolboxes therefor

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Human Computer Interaction (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The embodiment of the invention provides a model integration method, a model integration device, a server and a computer-readable storage medium, relating to the field of deep learning. The server stores a model file and a model file configuration table in advance; the model file contains a plurality of models, and the configuration table records the position of each model in the model file. Task parameters corresponding to a video stream analysis task are acquired, the task parameters specifying the target model(s) to be integrated. The target model is read from the model file according to the task parameters and the model file configuration table, loaded onto an inference card, and instantiated as a corresponding operator engine. An engine pipeline is then generated according to the task parameters and the operator engine, and input video stream data is processed through the engine pipeline. In this way, one or more models to be integrated can be flexibly selected according to the video stream analysis task and loaded on a single inference card, which effectively reduces the consumption of hardware resources.

Description

Model integration method, device, server and computer readable storage medium
Technical Field
The invention relates to the field of deep learning, and in particular to a model integration method, a model integration device, a server and a computer-readable storage medium.
Background
In the field of deep learning, an algorithm model is obtained by training on a batch of labeled material. Such a model can be used to detect or classify different targets in scenes requiring artificial-intelligence analysis. However, a model trained in isolation cannot be deployed in practice on its own. The usual approach is to first quantize the model, then develop an SDK (Software Development Kit) that can load the model and standardize the input/output interfaces, and finally provide the SDK for upper layers to call.
At present, one SDK contains only one model and is deployed on one inference card; the video streams handled by that inference card can only be analyzed for scenes based on that model, and other types of models cannot be loaded. If another type of model needs to be loaded, another inference card must be used, which wastes hardware resources.
Disclosure of Invention
In view of the above, the present invention provides a model integration method, apparatus, server and computer-readable storage medium, which can flexibly select one or more models to be integrated according to a video stream analysis task and reduce consumption of hardware resources.
In order to achieve the above purpose, the embodiment of the present invention adopts the following technical solutions:
in a first aspect, the present invention provides a model integration method, which is applied to a server, where the server stores a model file and a model file configuration table in advance, the model file includes multiple models, and the model file configuration table is used to record the position of each model in the model file; the method comprises the following steps:
acquiring task parameters corresponding to a video stream analysis task; the task parameters are used for specifying a target model to be integrated;
reading the target model from the model file according to the task parameters and the model file configuration table;
loading the target model to an inference card, and generating an operator engine corresponding to the target model;
and generating an engine pipeline according to the task parameters and the operator engine so as to process input video stream data through the engine pipeline.
In an optional embodiment, the model file configuration table includes a correspondence between model identifiers and model positions, the task parameters include a model identifier of the target model, and the step of reading the target model from the model file according to the task parameters and the model file configuration table includes:
searching a model position corresponding to the model identification of the target model in the corresponding relation;
and reading the target model from the model file according to the model position.
In an optional embodiment, the task parameters include a model identifier of the target model, and the generating an engine pipeline according to the task parameters and the operator engine includes:
associating the operator engines according to the sequence corresponding to each model identifier in the task parameters to obtain an engine pipeline; the previous operator engine in the engine pipeline registers the input interface of the current operator engine, and the current operator engine registers the input interface of the next operator engine.
In an optional implementation manner, an operator engine warehouse is also preset in the server, and the operator engine warehouse is used for storing the generated operator engine; after acquiring the task parameters corresponding to the video stream analysis task, the method further includes:
if an operator engine corresponding to the target model exists in the operator engine warehouse, acquiring the operator engine corresponding to the target model from the operator engine warehouse so as to generate an engine pipeline;
and if the operator engine corresponding to the target model does not exist in the operator engine warehouse, after the operator engine corresponding to the target model is generated, storing the generated operator engine into the operator engine warehouse.
In an alternative embodiment, the method further comprises:
sending the obtained video stream data into a corresponding engine pipeline, and processing the obtained video stream data by using an operator engine in the engine pipeline;
and after a processing result output by the last operator engine of the engine pipeline is obtained, the processing result is called back to an application program.
In an optional implementation manner, the operator engine maintains an operator routing table, where the operator routing table includes a routing index, the routing index includes a data identifier, a target operator engine, a next hop operator engine, and a data priority, and the operator engine determines, according to the operator routing table maintained by itself, a priority corresponding to received video stream data, the target operator engine, and the next hop operator engine; the method further comprises the following steps:
and updating the operator routing table maintained by the operator engine under the condition that a new video stream analysis task is created or the created video stream analysis task is deleted.
In an optional implementation manner, the operator engine includes an input interface, an analysis interface, and an output interface, where the input interface is configured to maintain input queues with multiple priorities and store received data in corresponding input queues according to priorities; the analysis interface is used for analyzing the data in the input queue; the output interface is used for maintaining output queues with various priorities and storing the received data to the corresponding output queues according to the priorities.
In a second aspect, the present invention provides a model integration apparatus, applied to a server, where the server stores a model file and a model file configuration table in advance, the model file includes a plurality of models, and the model file configuration table is used to record the position of each model in the model file; the device comprises:
the parameter acquisition module is used for acquiring task parameters corresponding to the video stream analysis task; the task parameters are used for specifying a target model to be integrated;
the model acquisition module is used for reading the target model from the model file according to the task parameters and the model file configuration table;
the model loading module is used for loading the target model onto an inference card and generating an operator engine corresponding to the target model;
and the engine pipeline generating module is used for generating an engine pipeline according to the task parameters and the operator engine so as to process the input video stream data through the engine pipeline.
In a third aspect, the present invention provides a server comprising a processor and a memory, the memory storing a computer program, the processor implementing the method of any one of the preceding embodiments when executing the computer program.
In a fourth aspect, the invention provides a computer-readable storage medium, on which a computer program is stored, which, when executed by a processor, implements the method of any of the preceding embodiments.
The server stores a model file and a model file configuration table in advance; the model file includes a plurality of models, and the configuration table records the position of each model in the model file. Task parameters corresponding to a video stream analysis task are obtained, the task parameters specifying the target model to be integrated; the target model is read from the model file according to the task parameters and the model file configuration table, loaded onto an inference card, and instantiated as a corresponding operator engine; and an engine pipeline is generated according to the task parameters and the operator engine to process input video stream data. Thus, one or more models to be integrated can be flexibly selected according to the video stream analysis task: different tasks can load different models on one inference card, or one task can load multiple models at the same time, and finally an engine pipeline is generated to process the video stream data. This yields richer analysis data, makes more flexible and fuller use of the inference capability of the inference card, and reduces the consumption of hardware resources.
In order to make the aforementioned and other objects, features and advantages of the present invention comprehensible, preferred embodiments accompanied with figures are described in detail below.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings needed in the embodiments are briefly described below. It should be understood that the following drawings illustrate only some embodiments of the present invention and therefore should not be considered as limiting its scope; for those skilled in the art, other related drawings can be obtained from these drawings without inventive effort.
Fig. 1 is a block diagram of a server according to an embodiment of the present invention;
FIG. 2 is a flow chart diagram illustrating a model integration method provided by an embodiment of the invention;
FIG. 3 is another schematic flow chart diagram of a model integration method provided by the embodiment of the invention;
FIG. 4 is a schematic flow chart diagram illustrating a model integration method provided by an embodiment of the invention;
FIG. 5 is a schematic flow chart diagram illustrating a model integration method according to an embodiment of the present invention;
FIG. 6 shows a schematic diagram of the connection of the operator engines in a plurality of pipelines;
FIG. 7 is a functional block diagram of a model integration apparatus provided in an embodiment of the present invention;
fig. 8 is a functional block diagram of a model integration apparatus according to an embodiment of the present invention.
Icon: 100-a server; 700-a model integrated device; 110-a memory; 120-a processor; 130-a communication module; 710-a parameter acquisition module; 720-model acquisition module; 730-a model loading module; 740-an engine pipeline generation module; 750-operator engine access module; 760-data processing module.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. The components of embodiments of the present invention generally described and illustrated in the figures herein may be arranged and designed in a wide variety of different configurations.
Thus, the following detailed description of the embodiments of the present invention, presented in the figures, is not intended to limit the scope of the invention, as claimed, but is merely representative of selected embodiments of the invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments of the present invention without making any creative effort, shall fall within the protection scope of the present invention.
It is noted that relational terms such as "first" and "second" may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element introduced by the phrase "comprising a ..." does not exclude the presence of another identical element in the process, method, article, or apparatus that comprises the element.
Fig. 1 is a block diagram of a server 100 according to an embodiment of the present invention. The server 100 may be, but is not limited to, a web server, a database server, a cloud server, and the like. The server 100 includes a memory 110, a processor 120, and a communication module 130. The memory 110, the processor 120, and the communication module 130 are electrically connected to each other directly or indirectly to enable data transmission or interaction. For example, the components may be electrically connected to each other via one or more communication buses or signal lines.
The memory 110 is used to store programs or data. The memory 110 may be, but is not limited to, a Random Access Memory (RAM), a Read-Only Memory (ROM), a Programmable Read-Only Memory (PROM), an Erasable Programmable Read-Only Memory (EPROM), an Electrically Erasable Programmable Read-Only Memory (EEPROM), and the like.
The processor 120 is used to read/write data or programs stored in the memory 110 and perform corresponding functions. In this embodiment, the processor 120 may include a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), an Artificial Intelligence (AI) inference card, and the like. The model integration method disclosed by the embodiments of the present invention may be implemented when the processor 120 executes a computer program stored in the memory 110.
The communication module 130 is used for establishing a communication connection between the server 100 and other communication terminals (such as video cameras, capturing cameras and other devices with video and image capturing functions) through a network, and is used for transceiving data through the network.
It should be understood that the configuration shown in fig. 1 is merely a schematic diagram of the configuration of the server 100, and that the server 100 may include more or fewer components than shown in fig. 1, or have a different configuration than shown in fig. 1. The components shown in fig. 1 may be implemented in hardware, software, or a combination thereof.
Embodiments of the present invention further provide a computer-readable storage medium, on which a computer program is stored, and when the computer program is executed by the processor 120, the computer program can implement the model integration method disclosed in the embodiments of the present invention.
Fig. 2 is a schematic flow chart of a model integration method according to an embodiment of the present invention. It should be noted that the model integration method according to the embodiment of the present invention is not limited by fig. 2 and the following specific sequence, and it should be understood that, in other embodiments, the sequence of some steps in the model integration method according to the present invention may be interchanged according to actual needs, or some steps in the model integration method may be omitted or deleted. The model integration method can be applied to the server 100 described above, and the specific process shown in fig. 2 will be described in detail below.
Step S201, acquiring task parameters corresponding to a video stream analysis task; the task parameters are used to specify the target model to be integrated.
In this embodiment, the task parameters may be configured when the video stream analysis task is created, i.e., they specify one or more models (the target models to be integrated) associated with the video stream analysis task. That is, the number of target models specified by the task parameters may be one or more, which this embodiment does not limit.
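For illustration only, the task parameters for one video stream analysis task might look like the following sketch; the field names are hypothetical, since the patent does not fix a concrete format:

```python
# Hypothetical task-parameter structure; field names are assumptions.
task_params = {
    "stream_id": "stream_1",  # data identifier: which channel's video stream
    "priority": 1,            # data priority, used later by the operator routing table
    "model_ids": ["A", "C"],  # model identifiers of the target models, in pipeline order
}
```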
Step S202, reading the target model from the model file according to the task parameters and the model file configuration table.
In the present embodiment, the server 100 stores a model file including a plurality of models and a model file configuration table in advance, where the model file configuration table is used to record the position of each model in the model file.
The model file and the model file configuration table may be obtained by quantizing a plurality of trained models, reading the binary data corresponding to each model, and writing the read binary data into a single file, thereby merging the plurality of models into one file (i.e., the above-mentioned model file). In the process of merging the models into one model file, a corresponding model file configuration table (which may be in json format) is generated, describing the relevant information of each model within the model file. In practical applications, in order to improve the security of the model file, it may be encrypted with an encryption algorithm after the models are merged, so that even if the model file is leaked, it cannot be loaded and used without the private key.
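A minimal sketch of this merging step is shown below; the file layout and json field names are assumptions, not the patent's exact format:

```python
import json
import os

def merge_models(model_paths, model_file, config_file):
    """Merge several quantized model binaries into one model file and write a
    json model file configuration table recording each model's name, index,
    offset and size."""
    table, offset = [], 0
    with open(model_file, "wb") as out:
        for index, path in enumerate(model_paths, start=1):
            with open(path, "rb") as f:
                data = f.read()                      # binary data of one model
            out.write(data)
            table.append({
                "name": os.path.splitext(os.path.basename(path))[0],
                "index": index,
                "offset": offset,                    # offset relative to file start
                "size": len(data),                   # data length in bytes
            })
            offset += len(data)
    with open(config_file, "w") as cfg:
        json.dump(table, cfg, indent=2)
    # In practice the merged file could now be encrypted so that it cannot
    # be loaded without the private key, as noted above.
```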
When the video stream analysis task is created, the target model to be integrated is specified through the task parameters, and the model file configuration table records the position of each model in the model file, so that the server 100 can obtain the position of the target model in the model file according to the task parameters and the model file configuration table, and further read the target model from the decrypted model file based on the position.
And step S203, loading the target model to the inference card, and generating an operator engine corresponding to the target model.
In this embodiment, the server 100 loads the read target model onto the inference card; after loading, the target model can be instantiated into a corresponding operator engine according to the algorithm type of the target model. Because the target model to be loaded is specified by the task parameters configured when the video stream analysis task is created, and the server 100 pre-stores a model file containing multiple models, different video stream analysis tasks can load different models on one inference card, or one video stream analysis task can load multiple models at once, so that when services are deployed, the inference card is not limited to supporting video stream analysis tasks for a single-model scene.
When the server 100 generates the operator engine corresponding to the target model, if the algorithm type of the target model is a detection algorithm, the target model may be instantiated as a detection operator engine; if the algorithm type is a classification algorithm, the target model may be instantiated as a classification operator engine. A detection operator engine can detect targets in a picture and obtain each target's type and coordinates; a classification operator engine can perform category analysis on the whole received picture and output confidence values for each category, thereby classifying the picture.
And step S204, generating an engine pipeline according to the task parameters and the operator engine so as to process the input video stream data through the engine pipeline.
In this embodiment, after instantiating the target model as the corresponding operator engine, the server 100 may generate an engine pipeline (pipeline) according to the model information of the target model and the operator engine specified in the current task parameter, so as to realize integration of one or more target models associated with the video stream analysis task, where the pipeline includes the operator engine corresponding to the target model, and then process the input video stream data through each operator engine in the pipeline. The video stream data may be acquired by a front-end device (e.g., a camera) connected to the server 100.
In the model integration method provided by the embodiment of the present invention, the server 100 stores a model file and a model file configuration table in advance; the model file includes a plurality of models, and the configuration table records the position of each model in the model file. Task parameters corresponding to a video stream analysis task are obtained, the task parameters specifying the target model to be integrated; the target model is read from the model file according to the task parameters and the model file configuration table, loaded onto an inference card, and instantiated as a corresponding operator engine; and an engine pipeline is generated according to the task parameters and the operator engine to process input video stream data. Thus, one or more models to be integrated can be flexibly selected according to the video stream analysis task: different tasks can load different models on one inference card, or one task can load multiple models at the same time, and finally an engine pipeline is generated to process the video stream data. This yields richer analysis data, makes more flexible and fuller use of the inference capability of the inference card, and reduces the consumption of hardware resources.
Optionally, the model file configuration table may include a corresponding relationship between a model identifier and a model position, and the task parameter may include a model identifier of the target model, and then step S202 may specifically include: searching a model position corresponding to the model identification of the target model in the corresponding relation; and reading the target model from the model file according to the model position.
In one example, assuming that the model file includes model A, model B and model C, Table 1 shows the distribution of model A, model B and model C in memory after merging, where the file start address of model A is 0xabcd and the data length is a bytes; the start address of model B is 0xabcd + a and the data length is b bytes; the start address of model C is 0xabcd + a + b and the data length is c bytes. The address range occupied by the model file is 0xabcd to 0xabcd + a + b + c, where 0xabcd is the start address of the model file and 0xabcd + a + b + c is its end address.
TABLE 1

| File start address | File data                       |
|--------------------|---------------------------------|
| 0xabcd             | Data of model A, length a bytes |
| 0xabcd + a         | Data of model B, length b bytes |
| 0xabcd + a + b     | Data of model C, length c bytes |
In the process of merging model A, model B, and model C into one model file, the model file configuration table shown in Table 2 may be generated. The model identifier may be information uniquely identifying the model, such as a model name (name), and the model position may include information for determining the location of the model within the model file, such as a model index (index), a model size (size), and the offset (offset) of the model within the entire model file. The size of the model can be understood as the data length of the model, and the offset of the model within the model file can be understood as the offset of the model's start address relative to the model file's start address.
TABLE 2

| name | index | offset | size |
|------|-------|--------|------|
| A    | 1     | 0      | a    |
| B    | 2     | a      | b    |
| C    | 3     | a + b  | c    |
Assuming that the task parameters corresponding to the currently created video stream analysis task include model identifier A and model identifier C, and that a model position is represented by the model size and the model's offset within the entire model file, then looking up model identifier A in the model file configuration table shown in Table 2 determines the model position of the corresponding target model as: model size a, offset 0. The start address of the target model corresponding to model identifier A is therefore 0xabcd and its data length is a bytes, and the data at the corresponding position in the model file is read based on this start address and data length to obtain model A. Similarly, looking up model identifier C in the model file configuration table shown in Table 2 determines the model position of the corresponding target model as: model size c, offset a + b. The start address of the target model corresponding to model identifier C is therefore 0xabcd + a + b and its data length is c bytes, and the data at the corresponding position is read to obtain model C. In this way, the target models (model A and model C) associated with the video stream analysis task are read from the model file according to the model identifiers in the task parameters and the model file configuration table.
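A sketch of this lookup-and-read step, assuming the json configuration-table layout sketched earlier:

```python
import json

def read_target_model(model_file, config_file, model_id):
    """Look up the model position (offset and size) for a model identifier
    in the model file configuration table, then read that many bytes from
    the (decrypted) model file."""
    with open(config_file) as cfg:
        table = json.load(cfg)
    entry = next(e for e in table if e["name"] == model_id)
    with open(model_file, "rb") as f:
        f.seek(entry["offset"])          # start address = file base + offset
        return f.read(entry["size"])     # data length = model size
```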
Optionally, the server 100 may generate the engine pipeline in sequence according to the model information of the target models specified in the task parameters corresponding to the current video stream analysis task. That is, step S204 may specifically include: associating the operator engines according to the sequence corresponding to each model identifier in the task parameters to obtain an engine pipeline, wherein the previous operator engine registers the input interface of the current operator engine, and the current operator engine registers the input interface of the next operator engine.
For example, if each model identifier in the task parameter corresponding to a certain video stream analysis task is A, B, C, the corresponding sequence is a → B → C, and the operator engines corresponding to model a, model B and model C are sequentially connected in series according to the sequence, so that the operator engines corresponding to model a, model B and model C are associated to obtain the engine pipeline.
In this embodiment, the operator engine may include the following interfaces: the system comprises an input interface, an analysis interface, an output interface, a creation interface and a destruction interface, wherein the input interface is used for maintaining input queues with various priorities and storing received data to corresponding input queues according to the priorities; the analysis interface is used for analyzing the data in the input queue; the output interface is used for maintaining output queues with various priorities and storing the received data to the corresponding output queues according to the priorities; the creation interface is used for initializing a creation operator engine according to the loaded model; the destroy interface is used for destroying the operator engine.
When the pipeline is constructed, a previous operator engine (namely, a previous operator engine adjacent to the current operator engine) is registered with an input interface of the current operator engine, and the current operator engine is registered with an input interface of a next operator engine (namely, a next operator engine adjacent to the current operator engine), so that the current operator engine in the pipeline consumes data produced by the previous operator engine, and the data is reprocessed to further produce the data for consumption by the next operator engine. That is to say, the former operator engine generates data and then notifies the current operator engine to consume the data through the callback function, and after the current operator engine consumes and analyzes the data, the latter operator engine is notified to consume the data through the callback function, so that the data series connection is realized.
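As a minimal sketch of this producer/consumer chaining (class and method names are assumptions, and the analysis step is a placeholder for real detection or classification):

```python
class OperatorEngine:
    """Sketch of an operator engine whose output is wired to the input
    interface of the next engine via a registered callback."""
    def __init__(self, name):
        self.name = name
        self._next_input = None              # input interface of the next engine, if any

    def register_next(self, next_engine):
        # The current engine registers the input interface of the next engine.
        self._next_input = next_engine.input

    def input(self, data):
        result = self.analyze(data)          # analysis interface (placeholder)
        if self._next_input is not None:
            self._next_input(result)         # notify the next engine to consume the data

    def analyze(self, data):
        return f"{self.name}({data})"        # stand-in for detection/classification

def build_pipeline(engines):
    """Associate the engines in model-identifier order to form a pipeline."""
    for prev, cur in zip(engines, engines[1:]):
        prev.register_next(cur)
    return engines
```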
In this embodiment, the input queue maintained by the input interface of the current operator engine can store the original video stream data and the processing result output by the previous operator engine (if the current operator engine has no previous operator engine, the input queue has no processing result output by the previous operator engine), and the input interface can maintain the input queues with three priorities, and the data in the input queue with the higher priority is consumed earlier.
The analysis interface of the current operator engine takes out data (for example, original video stream data) of the input queue for classification or detection analysis, serializes the analysis result, and finally synchronously outputs the result. The output queue maintained by the output interface of the current operator engine can store original video stream data, a processing result output by the previous operator engine and an analysis result output by the analysis interface of the current operator engine; the output interface maintains output queues with three priorities, and data in the output queues with higher priorities are output first.
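A sketch of an input interface maintaining queues of three priorities; the description states three priority levels but not their names, so the level names here are assumptions:

```python
import queue

class PriorityInputInterface:
    """Sketch: input queues of three priorities; data in a higher-priority
    queue is consumed earlier."""
    HIGH, MID, LOW = 0, 1, 2

    def __init__(self):
        self._queues = {p: queue.Queue() for p in (self.HIGH, self.MID, self.LOW)}

    def put(self, data, priority):
        self._queues[priority].put(data)        # store by priority

    def get(self):
        for p in (self.HIGH, self.MID, self.LOW):
            try:
                return self._queues[p].get_nowait()  # drain higher priorities first
            except queue.Empty:
                continue
        return None
```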
Alternatively, the server 100 may process the received video stream data by using the engine pipeline after generating the engine pipeline. As shown in fig. 3, the model integration method provided in the embodiment of the present invention may further include the following steps:
step S301, sending the obtained video stream data to a corresponding engine pipeline, and processing the obtained video stream data by using an operator engine in the engine pipeline.
In this embodiment, after sending video stream data to a corresponding engine pipeline, a first operator engine in the engine pipeline consumes original video stream data, a second operator engine consumes data produced by the first operator engine, and so on, each operator engine classifies or detects and analyzes received data and sequentially outputs the data in a streaming manner, and a processing result output by a last operator engine in the engine pipeline is a final result obtained by analyzing the video stream data.
Step S302, after obtaining the processing result output by the last operator engine of the engine pipeline, recalling the processing result to the application program.
In this embodiment, after obtaining the processing result output by the last operator engine of the engine pipeline, the server 100 calls back the processing result to the upper-layer application program for further application.
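Putting the earlier OperatorEngine sketch to use, a hypothetical end-to-end flow in which the last engine's output is called back to the application:

```python
class ResultSink:
    """Hypothetical sink: calls the final processing result back to the
    upper-layer application once the last operator engine produces it."""
    def __init__(self, callback):
        self.input = callback                  # looks like an engine's input interface

engines = build_pipeline([OperatorEngine("A"), OperatorEngine("C")])
engines[-1].register_next(ResultSink(lambda r: print("final result:", r)))
engines[0].input("decoded_frame_0001")         # prints: final result: C(A(decoded_frame_0001))
```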
Optionally, an operator engine warehouse is also preset in the server 100 and is used to store the generated operator engines. To facilitate management of the models and pipelines corresponding to different video stream analysis tasks, an engine management layer may further be encapsulated above the operator engines: when a video stream analysis task is received, the required target model is loaded and the corresponding operator engine is generated; the generated operator engine is placed into the operator engine warehouse, and the engine management layer records the information of the operator engines stored there. As shown in fig. 4, after step S201, the model integration method provided in the embodiment of the present invention may further include the following steps:
step S401, judging whether an operator engine corresponding to the target model exists in the operator engine warehouse or not.
In this embodiment, for a target model specified in the task parameters corresponding to the current video stream analysis task, if the engine management layer already holds a record of the operator engine corresponding to that target model, the corresponding operator engine already exists in the operator engine warehouse and need not be created again; that is, the target model is already loaded on the inference card and need not be loaded again, and step S402 is executed. If no operator engine corresponding to the target model is recorded in the engine management layer, the target model needs to be newly loaded: the target model is read from the model file and loaded onto the inference card, and after the corresponding operator engine is generated, step S403 is executed.
And S402, if the operator engine corresponding to the target model exists in the operator engine warehouse, acquiring the operator engine corresponding to the target model from the operator engine warehouse so as to generate an engine pipeline.
In this embodiment, when an operator engine of a target model to be integrated already exists in the operator engine warehouse, the operator engine can be directly obtained from the operator engine warehouse, and engine pipelines are generated in sequence, so that repeated creation of the operator engine is avoided, and resource overhead is reduced.
Step S403, if there is no operator engine corresponding to the target model in the operator engine warehouse, after generating the operator engine corresponding to the target model, storing the generated operator engine in the operator engine warehouse.
In this embodiment, when there is no operator engine of the target model that needs to be integrated in the operator engine repository, the target model is still read from the model file according to the relevant content of step S202 and step S203, and the target model is loaded onto the inference card, so as to generate the operator engine corresponding to the target model. And after generating an operator engine corresponding to the target model, storing the operator engine into an operator engine warehouse. When a new video stream analysis task is created subsequently, if the operator engine is also needed to be used for the video stream analysis task, repeated creation is not needed, resource expenditure is effectively reduced, and the operator engine can be shared by a plurality of video stream analysis tasks. It should be noted that step S403 may be executed before step S204, after step S204, or simultaneously with step S204, and this embodiment does not limit the execution sequence of these steps.
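A minimal sketch of the warehouse's get-or-create behavior, reusing the OperatorEngine sketch above; the loader function is a hypothetical stand-in for steps S202 and S203:

```python
class OperatorEngineRepository:
    """Sketch of the operator engine warehouse: engines are keyed by model
    identifier and shared across video stream analysis tasks."""
    def __init__(self):
        self._engines = {}

    def get_or_create(self, model_id, load_model):
        engine = self._engines.get(model_id)
        if engine is None:                    # no record: load model, create engine
            load_model(model_id)              # read from model file, load onto inference card
            engine = OperatorEngine(model_id)
            self._engines[model_id] = engine  # store for reuse by later tasks
        return engine
```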
In practical applications, since the operator engine may be reused by different video stream analysis tasks, it is necessary to distinguish different video stream data and specify the operator engine of the next hop. Therefore, an operator routing table is introduced in the embodiment, each operator engine maintains the operator routing table, the operator routing table includes a routing index, the routing index includes a data identifier, a target operator engine, a next hop operator engine and a data priority, and the operator engine determines the priority, the target operator engine and the next hop operator engine corresponding to the received video stream data according to the operator routing table maintained by the operator engine.
In this embodiment, a routing index may be understood as a routing table entry in the operator routing table; each entry includes a data identifier, a target operator engine, a next-hop operator engine, and a data priority. The data identifier indicates which channel's video stream the data belongs to and corresponds to the data index carried in the video stream data; the target operator engine is the operator engine that produces the final result for that channel's video stream data; the next-hop operator engine is the next operator engine to process that channel's video stream data; and the data priority is the priority of that channel's video stream data in the operator routing table, with higher-priority video stream data processed by the operator engine first.
For each operator engine in the pipeline, after it consumes input video stream data, it first queries its operator routing table according to the data index carried by the data (i.e., which channel's video stream it is) to determine the corresponding data priority, and then places the data into the input queue of that priority. When the analysis interface of the operator engine analyzes video stream data, it processes the higher-priority input queue first, takes the video stream data out of the input queue in turn for analysis, and then sends the analysis result to the output interface, which places the result into the output queue of the corresponding priority. If the operator engine is the same as the target operator engine determined by querying the operator routing table, the processing result it outputs is the final result of analyzing the video stream data; if not, the operator engine must notify the corresponding next-hop operator engine to consume the data in its output queue, until the last operator engine in the pipeline outputs the final result.
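A sketch of this per-engine routing decision; the table layout and names are illustrative assumptions:

```python
# Hypothetical per-engine operator routing table, keyed by data identifier
# (which channel's video stream the data belongs to).
routing_table = {
    "stream_1": {"target": "C", "next_hop": "C", "priority": 0},
    "stream_2": {"target": "C", "next_hop": "B", "priority": 1},
}

def dispatch(engine_name, stream_id, result):
    """After analysis, an engine consults its own routing table to decide
    whether its output is the final result or must be handed to the
    next-hop operator engine at the recorded priority."""
    index = routing_table[stream_id]                 # routing index for this channel
    if engine_name == index["target"]:
        return ("final_result", result)              # this engine produces the final result
    return ("notify_next_hop", index["next_hop"], index["priority"])
```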
As shown in fig. 5, the model integration method provided in the embodiment of the present invention may further include the following steps:
step S501, under the condition that a new video stream analysis task is created or the created video stream analysis task is deleted, an operator routing table maintained by an operator engine is updated.
In this embodiment, when a new video stream analysis task is created in the server 100, the engine management layer may recalculate and update the operator routing table of each operator engine according to the task parameters of that task. For example, the task parameters may further include which channel of video stream data needs to be analyzed (the data identifier), the priority corresponding to that channel, and so on. After the engine pipeline is generated, the target operator engine and the next-hop operator engine of each operator engine are determined from the connection relationship of the engines in the pipeline, and a routing index is added to the operator routing table based on the data identifier, priority, target operator engine, and next-hop operator engine, thereby updating the table. The engine management layer also needs to maintain a pipeline list recording the current connections of the operator engines and which video stream data each pipeline corresponds to. If a video stream analysis task is deleted, the engine management layer likewise updates the operator routing table of each operator engine, for example by deleting the routing indexes corresponding to that task.
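A sketch of how the engine management layer might apply these updates, assuming each engine object carries a routing_table dict shaped like the one above:

```python
def add_task_routes(engines, stream_id, priority, pipeline_order):
    """Hypothetical update on task creation: for each engine in the pipeline,
    record the next hop (the following engine) and the target (the last engine)."""
    target = pipeline_order[-1]
    hops = list(pipeline_order[1:]) + [None]
    for name, next_hop in zip(pipeline_order, hops):
        engines[name].routing_table[stream_id] = {
            "target": target, "next_hop": next_hop, "priority": priority}

def delete_task_routes(engines, stream_id):
    """Hypothetical update on task deletion: drop that channel's routing indexes."""
    for engine in engines.values():
        engine.routing_table.pop(stream_id, None)
```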
In this embodiment, the engine management layer exposes the external interfaces of the whole SDK, mainly including an initialization interface, a destruction interface, a picture input interface, and a callback registration interface. The initialization interface is responsible for initializing the whole SDK; the destruction interface is responsible for destroying the SDK; the picture input interface is responsible for receiving externally input decoded pictures; and the callback registration interface is responsible for registering the function by which an upper-layer application is notified when a final processing result is produced.
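A sketch of this SDK surface; the method names are illustrative assumptions, not the actual exported API:

```python
import queue

class ModelIntegrationSDK:
    """Sketch of the four interfaces the engine management layer exposes."""
    def __init__(self):
        self._pictures = queue.Queue()        # picture queue of the engine management layer
        self._callback = None

    def init(self, model_file, config_file):
        """Initialization interface: initialize the whole SDK."""

    def destroy(self):
        """Destruction interface: destroy the SDK and release engines."""

    def input_picture(self, stream_id, decoded_picture):
        """Picture input interface: receive an externally decoded picture."""
        self._pictures.put((stream_id, decoded_picture))

    def register_callback(self, fn):
        """Callback registration interface: fn is invoked when a final
        processing result is produced."""
        self._callback = fn
```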
As shown in fig. 6, assume that 3 pipelines are currently generated in the server 100: pipeline1, pipeline2, and pipeline3, where pipeline1 includes operator engine A and operator engine C, pipeline2 includes operator engine A, operator engine B, and operator engine C, and pipeline3 includes operator engine B. Thus pipeline1 and pipeline2 share operator engine A and operator engine C, pipeline2 and pipeline3 share operator engine B, and the engine management layer records which video stream data corresponds to each pipeline. After receiving video stream data, the server 100 decodes it to obtain decoded pictures frame by frame and inputs each decoded picture into the SDK, which places it into the picture queue of the engine management layer; a thread then continuously takes pictures out of the queue and sends them to the first operator engine of the corresponding pipeline for processing, and each operator engine analyzes the received data according to its own operator routing table and outputs analysis results in a streaming manner. For example, for the video stream data corresponding to pipeline1, a decoded picture is sent to the first operator engine of pipeline1 (i.e., operator engine A); after operator engine A finishes processing, it notifies operator engine C (the next-hop operator engine) to consume the data according to its operator routing table, and operator engine C, being the last operator engine of pipeline1, outputs the final analysis result. For the video stream data corresponding to pipeline2, a decoded picture is sent to the first operator engine of pipeline2 (i.e., operator engine A); after operator engine A finishes processing, it notifies operator engine B (the next-hop operator engine) to consume the data, and after operator engine B finishes processing, it notifies operator engine C (the next-hop operator engine) to consume the data; operator engine C, being the last operator engine of pipeline2, outputs the final analysis result. For the video stream data corresponding to pipeline3, a decoded picture is sent to the first operator engine of pipeline3 (i.e., operator engine B), which is also the last operator engine in pipeline3, so after processing the input decoded picture it outputs the final analysis result.
A specific application scenario is given below to describe the model integration method provided in the embodiment of the present invention in detail. Suppose the behavior of persons in a certain area needs to be analyzed, with an alarm raised if non-compliant behavior is detected. There are four behavior-analysis requirements (smoking, playing with mobile phones, dozing, and fighting), corresponding to 4 models. Suppose the area has 6 cameras: two cameras are needed for smoking detection, two for mobile phone detection, one for dozing detection, and one for fighting detection. With the current general approach, the analysis service on one inference card can only load one model; since this scene involves 4 models, 4 inference cards would be needed. If each inference card supports 8 video streams, the 4 cards support 32 streams in total, yet only 6 streams are needed in this scene, wasting hardware resources. With the model integration method provided by the embodiment of the present invention, the 6 video stream analysis tasks created for the 6 cameras can each load a different model, so a single inference card meets the requirement. Moreover, a single video stream analysis task can load all the models at once, so that smoking, mobile phone, dozing, and fighting behavior analysis can be performed simultaneously on each channel of video stream data. The method is flexible and can greatly reduce hardware consumption in practical application scenarios.
In order to perform the corresponding steps in the above embodiments and their various possible implementations, an implementation of the model integration apparatus is given below. Referring to fig. 7, a functional block diagram of a model integration apparatus 700 according to an embodiment of the present invention is shown. It should be noted that the basic principles and technical effects of the model integration apparatus 700 provided in this embodiment are the same as those of the above embodiments; for brevity, where this embodiment does not mention a detail, reference may be made to the corresponding content of the above embodiments. The model integration apparatus 700 includes: a parameter acquisition module 710, a model acquisition module 720, a model loading module 730, and an engine pipeline generation module 740.
Alternatively, the modules may be stored in the memory 110 shown in fig. 1 in the form of software or Firmware (Firmware) or be fixed in an Operating System (OS) of the server 100, and may be executed by the processor 120 in fig. 1. Meanwhile, data, codes of programs, etc. required to execute the above modules may be stored in the memory 110.
The parameter obtaining module 710 is configured to obtain task parameters corresponding to a video stream analysis task; the task parameters are used to specify the target model to be integrated.
It is understood that the parameter obtaining module 710 may perform the step S201.
The model obtaining module 720 is configured to read the target model from the model file according to the task parameters and the model file configuration table.
It is understood that the model obtaining module 720 can perform the step S202.
The model loading module 730 is configured to load the target model onto the inference card, and generate an operator engine corresponding to the target model.
It is understood that the model loading module 730 can perform the above step S203.
The engine pipeline generating module 740 is configured to generate an engine pipeline according to the task parameters and the operator engine, so as to process the input video stream data through the engine pipeline.
It is understood that the engine pipeline generation module 740 can perform the step S204.
Optionally, the model file configuration table includes a corresponding relationship between a model identifier and a model position, the task parameter includes a model identifier of the target model, and the model obtaining module 720 is specifically configured to search for the model position corresponding to the model identifier of the target model in the corresponding relationship; and reading the target model from the model file according to the model position.
Optionally, the engine pipeline generating module 740 is specifically configured to associate the operator engines according to the sequence corresponding to each model identifier in the task parameters to obtain an engine pipeline, wherein the previous operator engine registers the input interface of the current operator engine, and the current operator engine registers the input interface of the next operator engine.
Optionally, referring to fig. 8, the model integration apparatus 700 provided in the embodiment of the present invention may further include an operator engine access module 750 and a data processing module 760.
The operator engine access module 750 is configured to, if an operator engine corresponding to the target model exists in the operator engine repository, obtain an operator engine corresponding to the target model from the operator engine repository so as to generate an engine pipeline; and if the operator engine corresponding to the target model does not exist in the operator engine warehouse, storing the generated operator engine into the operator engine warehouse after the operator engine corresponding to the target model is generated.
It is understood that the operator engine access module 750 can perform the above steps S401 to S403.
Optionally, the operator engine maintains an operator routing table, where the operator routing table includes a routing index, the routing index includes a data identifier, a target operator engine, a next hop operator engine, and a data priority, and the operator engine determines, according to the operator routing table maintained by itself, a priority corresponding to the received video stream data, the target operator engine, and the next hop operator engine. The operator engine access module 750 is further configured to update the operator routing table maintained by the operator engine when a new video stream analysis task is created or a created video stream analysis task is deleted.
It is understood that the operator engine access module 750 can also execute the step S501.
Optionally, the data processing module 760 is configured to send the obtained video stream data to a corresponding engine pipeline, and process the obtained video stream data by using an operator engine in the engine pipeline; and after a processing result output by the last operator engine of the engine pipeline is obtained, the processing result is called back to the application program.
It is understood that the data processing module 760 may perform the above steps S301 to S302.
In the model integration apparatus 700 provided by this embodiment of the present invention, the server 100 stores a model file and a model file configuration table in advance; the model file includes a plurality of models, and the configuration table records the position of each model in the model file. The parameter obtaining module 710 obtains task parameters corresponding to a video stream analysis task, the task parameters specifying the target model to be integrated; the model obtaining module 720 reads the target model from the model file according to the task parameters and the model file configuration table; the model loading module 730 loads the target model onto an inference card and generates the corresponding operator engine; and the engine pipeline generating module 740 generates an engine pipeline according to the task parameters and the operator engine to process input video stream data. Thus, one or more models to be integrated can be flexibly selected according to the video stream analysis task: different tasks can load different models on one inference card, or one task can load multiple models at the same time, and finally an engine pipeline is generated to process the video stream data. This yields richer analysis data, makes more flexible and fuller use of the inference capability of the inference card, and reduces the consumption of hardware resources.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus and method can be implemented in other ways. The apparatus embodiments described above are merely illustrative, and for example, the flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of apparatus, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems that perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
In addition, the functional modules in the embodiments of the present invention may be integrated together to form an independent part, or each module may exist separately, or two or more modules may be integrated to form an independent part.
If the functions are implemented in the form of software functional modules and sold or used as a stand-alone product, they may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
The above description covers only preferred embodiments of the present invention and is not intended to limit the present invention; those skilled in the art may make various modifications and changes. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within the protection scope of the present invention.

Claims (9)

1. A model integration method applied to a server, wherein a model file and a model file configuration table are stored in the server in advance, the model file comprises a plurality of models, and the model file configuration table is used for recording the position of each model in the model file; the method comprises the following steps:
acquiring task parameters corresponding to a video stream analysis task; the task parameters are used for specifying a target model to be integrated;
reading the target model from the model file according to the task parameters and the model file configuration table;
loading the target model to an inference card, and generating an operator engine corresponding to the target model;
generating an engine pipeline according to the task parameters and the operator engine so as to process input video stream data through the engine pipeline;
an operator engine warehouse is also preset in the server and used for storing the generated operator engine; after acquiring the task parameters corresponding to the video stream analysis task, the method further comprises the following steps:
if an operator engine corresponding to the target model exists in the operator engine warehouse, acquiring the operator engine corresponding to the target model from the operator engine warehouse so as to generate an engine pipeline;
and if the operator engine corresponding to the target model does not exist in the operator engine warehouse, after the operator engine corresponding to the target model is generated, storing the generated operator engine into the operator engine warehouse.
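The operator engine warehouse recited in claim 1 is, in effect, a cache with get-or-create semantics. A minimal sketch, with all names assumed:

```python
class OperatorEngineWarehouse:
    """Caches generated operator engines, keyed by model identifier."""

    def __init__(self):
        self._engines = {}

    def get_or_create(self, model_id, generate_engine):
        engine = self._engines.get(model_id)
        if engine is None:
            # No engine for this model yet: generate one, then store it
            # in the warehouse so later tasks can reuse it directly.
            engine = generate_engine()
            self._engines[model_id] = engine
        return engine

# Illustrative use: the generator is only invoked on the first request.
warehouse = OperatorEngineWarehouse()
engine_a = warehouse.get_or_create("face_detect", lambda: object())
engine_b = warehouse.get_or_create("face_detect", lambda: object())
assert engine_a is engine_b   # the second task reuses the cached engine
```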
2. The method of claim 1, wherein the model file configuration table comprises a correspondence between model identifiers and model positions, the task parameters comprise a model identifier of the target model, and reading the target model from the model file according to the task parameters and the model file configuration table comprises:
searching the correspondence for the model position corresponding to the model identifier of the target model;
and reading the target model from the model file according to the model position.
3. The method of claim 1, wherein the task parameters comprise the model identifier of each target model, and generating an engine pipeline according to the task parameters and the operator engine comprises:
associating the operator engines according to the sequence of the model identifiers in the task parameters to obtain an engine pipeline, wherein the previous operator engine in the engine pipeline registers the input interface of the current operator engine, and the current operator engine registers the input interface of the next operator engine.
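To illustrate the registration mechanism recited in claim 3, the following sketch chains engines so that each one holds the input interface of the next; the class design is an assumption for illustration, not the claimed implementation:

```python
class OperatorEngine:
    """Sketch of an operator engine whose output feeds the next engine's
    input interface; the registration API here is an assumption."""

    def __init__(self, name, analyze):
        self.name = name
        self._analyze = analyze      # stands in for the loaded model
        self._next_input = None      # input interface of the next engine

    def register_next(self, next_engine):
        # The previous engine registers the input interface of the next one.
        self._next_input = next_engine.input

    def input(self, data):
        result = self._analyze(data)
        if self._next_input is not None:
            self._next_input(result)               # forward downstream
        else:
            print("pipeline output:", result)      # last engine in the chain

def build_engine_pipeline(engines):
    """Associate engines in the order given by the task parameters."""
    for prev, nxt in zip(engines, engines[1:]):
        prev.register_next(nxt)
    return engines[0]   # video stream data is fed to the head engine

# Illustrative use with two stand-in engines:
head = build_engine_pipeline([
    OperatorEngine("detect", lambda d: {**d, "faces": ["face_0"]}),
    OperatorEngine("classify", lambda d: {**d, "labels": ["hat"]}),
])
head.input({"frame_id": 1})
```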
4. The method of claim 1, further comprising:
sending the obtained video stream data into a corresponding engine pipeline, and processing the obtained video stream data by using an operator engine in the engine pipeline;
and after a processing result output by the last operator engine of the engine pipeline is obtained, the processing result is called back to an application program.
5. The method according to claim 1, wherein the operator engine maintains an operator routing table, the operator routing table includes a routing index, and the routing index includes a data identifier, a target operator engine, a next-hop operator engine, and a data priority; the operator engine determines, according to the operator routing table it maintains, the priority, the target operator engine, and the next-hop operator engine corresponding to the received video stream data; and the method further comprises:
and updating the operator routing table maintained by the operator engine under the condition that a new video stream analysis task is created or the created video stream analysis task is deleted.
6. The method according to any one of claims 1 to 5, wherein the operator engine comprises an input interface, an analysis interface, and an output interface; the input interface is used for maintaining input queues of various priorities and storing received data into the corresponding input queue according to its priority; the analysis interface is used for analyzing the data in the input queues; and the output interface is used for maintaining output queues of various priorities and storing received data into the corresponding output queue according to its priority.
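A minimal sketch of the priority-queue behaviour recited in claim 6, assuming a fixed number of priority levels and that priority 0 is the highest (both assumptions):

```python
import queue

class PriorityInterface:
    """Sketch of the input/output interface in claim 6: one queue per
    priority level. Three levels and "0 = highest" are assumed."""

    def __init__(self, levels: int = 3):
        self._queues = [queue.Queue() for _ in range(levels)]

    def put(self, data, priority: int) -> None:
        # Store received data into the queue matching its priority.
        self._queues[priority].put(data)

    def get(self):
        # Serve higher-priority queues first.
        for q in self._queues:
            try:
                return q.get_nowait()
            except queue.Empty:
                continue
        return None   # nothing pending at any priority

# An operator engine would pair one such interface for input (drained by
# the analysis interface) with another for output.
inbox = PriorityInterface()
inbox.put({"frame_id": 2}, priority=1)
inbox.put({"frame_id": 1}, priority=0)
print(inbox.get())   # -> {'frame_id': 1}, the higher-priority item
```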
7. A model integration device applied to a server, wherein the server stores a model file and a model file configuration table in advance, the model file comprises a plurality of models, and the model file configuration table is used for recording the position of each model in the model file; the device comprises:
the parameter acquisition module is used for acquiring task parameters corresponding to the video stream analysis task; the task parameters are used for specifying a target model to be integrated;
the model acquisition module is used for reading the target model from the model file according to the task parameters and the model file configuration table;
the model loading module is used for loading the target model onto an inference card and generating an operator engine corresponding to the target model;
the engine pipeline generation module is used for generating an engine pipeline according to the task parameters and the operator engine so as to process input video stream data through the engine pipeline;
an operator engine warehouse is also preset in the server and used for storing the generated operator engine; the model integration apparatus further includes:
the operator engine access module is used for acquiring an operator engine corresponding to the target model from the operator engine warehouse so as to generate an engine pipeline if the operator engine corresponding to the target model exists in the operator engine warehouse; and if the operator engine corresponding to the target model does not exist in the operator engine warehouse, after the operator engine corresponding to the target model is generated, storing the generated operator engine into the operator engine warehouse.
8. A server, comprising a processor and a memory, the memory storing a computer program which, when executed by the processor, implements the method of any one of claims 1-6.
9. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the method of any one of claims 1-6.
CN202110227581.2A 2021-03-01 2021-03-01 Model integration method, device, server and computer readable storage medium Active CN112906803B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110227581.2A CN112906803B (en) 2021-03-01 2021-03-01 Model integration method, device, server and computer readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110227581.2A CN112906803B (en) 2021-03-01 2021-03-01 Model integration method, device, server and computer readable storage medium

Publications (2)

Publication Number Publication Date
CN112906803A CN112906803A (en) 2021-06-04
CN112906803B (en) 2022-11-01

Family

ID=76108559

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110227581.2A Active CN112906803B (en) 2021-03-01 2021-03-01 Model integration method, device, server and computer readable storage medium

Country Status (1)

Country Link
CN (1) CN112906803B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113553910A (en) * 2021-06-09 2021-10-26 交控科技股份有限公司 Perception algorithm fusion platform
US11599338B2 (en) 2021-06-18 2023-03-07 Qingdao Pico Technology Co., Ltd. Model loading method and apparatus for head-mounted display device, and head-mounted display device
CN113485548B (en) * 2021-06-18 2024-02-23 青岛小鸟看看科技有限公司 Model loading method and device of head-mounted display equipment and head-mounted display equipment
CN113986396B (en) * 2021-11-10 2023-06-06 重庆紫光华山智安科技有限公司 Centralized configuration method, system, equipment and medium based on distributed service

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107454364A (en) * 2017-06-16 2017-12-08 国电南瑞科技股份有限公司 The distributed real time image collection and processing system of a kind of field of video monitoring
CN112001351A (en) * 2020-09-01 2020-11-27 城云科技(中国)有限公司 Method, system, computer device and storage medium for processing multiple video streams

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8478578B2 (en) * 2008-01-09 2013-07-02 Fluential, Llc Mobile speech-to-speech interpretation system
CN102456225B (en) * 2010-10-22 2014-07-09 深圳中兴力维技术有限公司 Video monitoring system and moving target detecting and tracking method thereof
CN103279742B (en) * 2013-05-24 2016-08-10 中国科学院自动化研究所 A kind of multi-resolution pedestrian detection method based on multi task model and device thereof
EP3073738A1 (en) * 2015-03-26 2016-09-28 Alcatel Lucent Methods and devices for video encoding
CN104765608A (en) * 2015-04-02 2015-07-08 广州力富视频科技有限公司 Implementation method and device of audio and video workflow application system
CN109803114A (en) * 2017-11-17 2019-05-24 中国电信股份有限公司 Front monitoring front-end control method, device and video monitoring system
US10918941B2 (en) * 2019-03-27 2021-02-16 Electronic Arts Inc. Predictive execution of distributed game engines

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107454364A (en) * 2017-06-16 2017-12-08 国电南瑞科技股份有限公司 The distributed real time image collection and processing system of a kind of field of video monitoring
CN112001351A (en) * 2020-09-01 2020-11-27 城云科技(中国)有限公司 Method, system, computer device and storage medium for processing multiple video streams

Also Published As

Publication number Publication date
CN112906803A (en) 2021-06-04

Similar Documents

Publication Publication Date Title
CN112906803B (en) Model integration method, device, server and computer readable storage medium
US20140033267A1 (en) Type mining framework for automated security policy generation
CN113377348A (en) Task adjustment method applied to task engine, related device and storage medium
CN110347399B (en) Data processing method, real-time computing system and information system
CN112464034A (en) User data extraction method and device, electronic equipment and computer readable medium
CN111291103A (en) Interface data analysis method and device, electronic equipment and storage medium
CN107526623B (en) Data processing method and device
CN113014623A (en) Method and device for processing real-time streaming data of embedded point, computer equipment and storage medium
CN113590199A (en) Instruction scheduling method, artificial intelligence chip, computer device and storage medium
CN112182295B (en) Service processing method and device based on behavior prediction and electronic equipment
US11163552B2 (en) Federated framework for container management
CN111782728A (en) Data synchronization method, device, electronic equipment and medium
US20130322682A1 (en) Profiling Activity Through Video Surveillance
CN107633080B (en) User task processing method and device
CN113326523A (en) Privacy calculation method and device and electronic equipment
CN111988429A (en) Algorithm scheduling method and system
CN111241823A (en) Dependency configuration management method and device, electronic equipment and storage medium
CN112241648A (en) Image processing system and image device
CN115145964A (en) Time sequence data integration method, device, equipment and medium
CN111737371B (en) Data flow detection classification method and device capable of dynamically predicting
CN112417259B (en) Media resource processing method, device, equipment and storage medium
CN110941683B (en) Method, device, medium and electronic equipment for acquiring object attribute information in space
US11044328B2 (en) Controlling content delivery
CN114020962A (en) Video system management method, electronic device, and computer-readable storage medium
CN113821514A (en) Data splitting method and device, electronic equipment and readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant