CN114513681A - Video processing system, method, device, electronic equipment and storage medium


Info

Publication number
CN114513681A
CN114513681A (application CN202210088178.0A)
Authority
CN
China
Prior art keywords
service unit
data
video data
image frame
application scene
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210088178.0A
Other languages
Chinese (zh)
Inventor
周海彬
周华兵
张彦铎
卢涛
鲁统伟
李迅
王燕燕
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuhan Yijin Technology Co ltd
Wuhan Institute of Technology
Original Assignee
Wuhan Yijin Technology Co ltd
Wuhan Institute of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wuhan Yijin Technology Co., Ltd. and Wuhan Institute of Technology
Priority to CN202210088178.0A
Publication of CN114513681A
Legal status: Pending

Classifications

    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 21/00 - Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N 21/20 - Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N 21/23 - Processing of content or additional data; Elementary server operations; Server middleware
    • H04N 21/231 - Content storage operation, e.g. caching movies for short term storage, replicating data over plural servers, prioritizing data for deletion
    • H04N 21/234 - Processing of video elementary streams, e.g. splicing of video streams, manipulating MPEG-4 scene graphs
    • H04N 21/23418 - Processing of video elementary streams involving operations for analysing video streams, e.g. detecting features or characteristics

Abstract

The application relates to a video processing system, method, apparatus, electronic device and storage medium, and belongs to the technical field of software monitoring. The system works as follows: the front-end service unit in the edge end acquires the initial video data collected by the application scene device end, performs a target object recognition operation on the initial video data to extract the image frames containing a target object, and sends those frames to the algorithm service unit; the algorithm service unit in the edge end determines, for each image frame containing a target object, the object extraction model corresponding to the frame's current application scene, and extracts the target object from the frame with that model to obtain the frame's target data; the cloud end stores the target data corresponding to the initial video data. The system provides efficient image processing, loads different object extraction models for different application scenes rather than binding one video processing system to a single application scene, and therefore has a wide application range.

Description

Video processing system, method, device, electronic equipment and storage medium
Technical Field
The present application relates to the field of software monitoring technologies, and in particular, to a video processing system, a method, an apparatus, an electronic device, and a storage medium.
Background
Video surveillance now reaches almost every corner of daily life: surveillance cameras are installed at intersections, in streets and shopping malls, on construction sites, and elsewhere. The wide deployment of video surveillance technology has produced a variety of application scene requirements, especially in safety protection, for example: a camera at an intersection monitors the electric bicycle riders crossing the road and further determines whether each rider is wearing a helmet, and a camera on a construction site monitors the work of operators and further determines whether each operator is wearing a safety helmet.
A common way to meet such requirements is to install a smart camera and process the monitoring data on the camera itself. On one hand, a smart camera usually has only basic computing power, cannot analyze large amounts of data in real time, and lacks more specialized image processing operations, so its image processing capability is weak; on the other hand, the camera processes images with a predefined algorithm, the application scene cannot be changed, and the application range is limited. Alternatively, an ordinary camera collects the monitoring data and uploads it to the cloud for further processing; this occupies a large amount of network and storage resources, and if the cloud is unstable, the use and storage of the data carry significant potential safety hazards.
Summary of the application
The present application provides a video processing system, method, apparatus, electronic device and storage medium that address at least one of the above shortcomings. The technical scheme is as follows:
in a first aspect, a video processing system is provided, the system comprising: the system comprises an application scene equipment end, an edge end and a cloud end, wherein the edge end comprises a front-end service unit and an algorithm service unit;
the front-end service unit is configured to acquire initial video data collected by the application scene device end, perform a target object recognition operation on the initial video data to extract the image frames containing a target object, and send the image frames containing the target object in the initial video data to the algorithm service unit;
the algorithm service unit is configured to determine, for each image frame containing the target object, the object extraction model corresponding to the current application scene of the frame, and to extract the target object from the frame with that model to obtain the target data of the frame;
and the cloud end is configured to store the target data corresponding to the initial video data.
The video processing system has the following beneficial effects:
according to the video data processing method and device, video data collection and video data processing are separated, the image frames containing the target object in the video data are efficiently processed and extracted through the front-end service unit in the edge end, the target object identification operation is carried out on the video data through the algorithm service unit in the edge end, based on the front-end service unit and the algorithm service unit, the real-time transmission of the video data can be met, the bandwidth pressure is relieved, and the high-efficiency image processing capacity can be achieved. According to the method and the device, different object extraction models are loaded according to different application scenes, one application scene does not correspond to one video processing system, and the application range is wide. A large amount of network resources are not occupied, the data storage pressure of the cloud is reduced, and the stability of the system is maintained.
On the basis of the above scheme, a video processing system of the present application may be further improved as follows.
Further, the cloud end comprises a data storage unit;
the front-end service unit is further configured to acquire the number corresponding to the initial video data and the extraction timestamp corresponding to each image frame containing the target object, and to send the extraction timestamps and the number to the data storage unit;
the data storage unit is configured to store the data sent by the front-end service unit.
The beneficial effect of adopting the further scheme is that: the data related to the video processing system is stored selectively, which further reduces the storage pressure on the cloud end, benefits the safety of data storage, and makes the stability of the system easier to maintain.
Further, the cloud end also comprises an edge control unit;
the object extraction model corresponding to the current application scene is a pre-trained first object extraction model or a second object extraction model uploaded by a user;
the edge control unit: the data storage unit is used for storing data, improving the first object extraction model to obtain a third object extraction model, and replacing the first object extraction model with the third object extraction model
The beneficial effect of adopting the further scheme is that: on the basis of the pre-trained object extraction model, the types of the object extraction models which can be loaded in the current application scene are further improved so as to replace the object extraction models with better model effect in a targeted manner, and meanwhile, different object extraction models can be loaded in different application scenes, so that the flexibility of the system is increased, the application range of the system is expanded, and the practicability of the system is improved.
Further, the edge end further comprises a streaming media service unit configured to generate, after obtaining the initial video data sent by the application scene device end, a first access link corresponding to the initial video data;
the front-end service unit is further configured to acquire the initial video data according to the first access link.
Further, the algorithm service unit is further configured to send the target data to the front-end service unit;
the front-end service unit is further configured to send the target data to the streaming media service unit;
the streaming media service unit is further configured to generate, after obtaining the target data, a second access link corresponding to the target data.
The beneficial effect of adopting the further scheme is that: through the first access link and the second access link alone, the initial video data and the target data can be obtained respectively, which simplifies the user's operation and provides a good user experience.
Further, the front-end service unit is further configured to:
before performing the target object recognition operation on the initial video data, perform a framing operation on the initial video data to obtain a plurality of initial image frames;
and perform at least one of sharpness processing, size adjustment and noise reduction on each initial image frame.
The beneficial effect of adopting the further scheme is that: through the front-end service unit, useful information in the video data is extracted preliminarily; although the front-end service unit and the algorithm service unit both belong to the edge end, their operation authorities are separated, so the front-end service unit can serve the algorithm service unit better.
In a second aspect, a video processing method is provided, which includes the following steps:
performing a target object recognition operation on initial video data to extract the image frames containing a target object, the initial video data being video data collected by an application scene device end;
for each image frame containing the target object, determining the object extraction model corresponding to the current application scene of the frame, and extracting the target object from the frame with that model to obtain the target data of the frame; and storing the target data corresponding to the initial video data through a cloud end.
The video processing method has the following beneficial effects:
the method and the device can meet the requirement of real-time transmission of video data, relieve bandwidth pressure and ensure high-efficiency image processing capacity. According to the method and the device, different object extraction models are loaded according to different application scenes, one application scene does not correspond to one video processing system, the application range is wide, a large amount of network resources do not need to be occupied, the data storage pressure of a cloud is reduced, and the stability of the system is maintained. .
In a third aspect, a video processing apparatus is provided, the apparatus comprising:
the target object recognition module is configured to perform a target object recognition operation on initial video data to extract the image frames containing a target object, the initial video data being video data collected by an application scene device end;
the target data determining module is configured to determine, for each image frame containing the target object, the object extraction model corresponding to the current application scene of the frame, and to extract the target object from the frame with that model to obtain the target data of the frame;
and the storage module is used for storing the target data corresponding to the initial video data through a cloud end.
The video processing device has the following beneficial effects:
according to the video data processing method and device, video data collection and video data processing are separated, the image frames containing the target objects in the video data are efficiently processed and extracted through the target object identification module, target object identification operation is performed on the video data through the target data determination module, and based on the preposed service unit and the algorithm service unit, real-time transmission of the video data can be met, bandwidth pressure is relieved, and high-efficiency image processing capacity can be achieved. According to the application, different object extraction models are loaded according to different application scenes, one application scene does not correspond to one video processing device, and the application range is wide. And a large amount of network resources are not required to be occupied, the data storage pressure of the storage module is reduced, and the stability of the system is maintained.
In a fourth aspect, an electronic device is provided, which includes a processor and a memory, wherein at least one instruction is stored in the memory, and the at least one instruction is loaded and executed by the processor to implement the method of the second aspect.
In a fifth aspect, a storage medium is provided, in which a computer program is stored; the computer program, when executed by a processor, implements the method of the second aspect.
Drawings
The present application is further described below with reference to the accompanying drawings and examples.
FIG. 1 is a schematic illustration of an implementation environment to which embodiments of the present application relate;
FIG. 2 is a first schematic structural diagram of a video processing system according to an embodiment of the present application;
FIG. 3 is a second schematic structural diagram of a video processing system according to an embodiment of the present application;
FIG. 4 is a third schematic structural diagram of a video processing system according to an embodiment of the present application;
FIG. 5 is a fourth schematic structural diagram of a video processing system according to an embodiment of the present application;
FIG. 6 is a fifth schematic structural diagram of a video processing system according to an embodiment of the present application;
FIG. 7 is a flowchart of a video processing method according to an embodiment of the present application;
FIG. 8 is a schematic structural diagram of a video processing apparatus according to an embodiment of the present application;
FIG. 9 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
To make the objects, technical solutions and advantages of the present application more clear, embodiments of the present application will be described in further detail below with reference to the accompanying drawings.
The video processing system provided by the embodiments of the present application can be realized jointly by an application scene device end, a computer and a cloud end.
The application scene device end in the embodiments of the present application may be installed on a campus to monitor whether students climb over the walls, at an intersection to monitor whether electric bicycle riders are wearing helmets, or on a construction site to monitor whether construction workers are wearing safety helmets.
In the embodiments of the present application, the computer device may be a server or a terminal. The terminal may be a desktop computer, a notebook computer, a mobile phone, a tablet computer, or the like. The server may be a background server of an application program, for example an application with an information push function. The server may be a single server or a server group: a single server is responsible for all of the processing in the following scheme, while in a server group different servers may be responsible for different parts of it; the specific allocation may be set by a technician according to actual needs and is not described again here.
Fig. 1 is a schematic diagram of an implementation environment provided by an embodiment of the present application. Referring to fig. 1, the implementation environment includes an application scene device end 210, a computer 100 and a cloud 230. The connections between the application scene device end 210 and the computer 100, and between the computer 100 and the cloud 230, may be wireless or wired. The application scene device end 210 transmits the initial video data it collects in the current application scene to the computer 100; after acquiring the initial video data, the computer 100 can process it with the video processing system provided by the embodiment of the present application to obtain the target data, and the cloud 230 acquires and stores the target data corresponding to the initial video data.
Fig. 2 shows a video processing system 200 according to an embodiment of the present disclosure. Referring to fig. 2, the system includes an application scene device end 210, an edge end 220 and a cloud 230, the edge end 220 comprising a front-end service unit 221 and an algorithm service unit 222. The application scene device end 210 is connected to the front-end service unit 221, the front-end service unit 221 to the algorithm service unit 222, and the cloud 230 to the algorithm service unit 222.
The front-end service unit 221 is configured to acquire the initial video data collected by the application scene device end 210, perform a target object recognition operation on the initial video data to extract the image frames containing a target object, and send the image frames containing the target object in the initial video data to the algorithm service unit 222.
The algorithm service unit 222 is configured to determine, for each image frame containing the target object, the object extraction model corresponding to the frame's current application scene, and to extract the target object from the frame with that model to obtain the target data of the frame.
The cloud 230 is configured to store the target data corresponding to the initial video data.
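The division of labour among the three units can be pictured with a short sketch. The following Python fragment is a minimal, hypothetical rendering of the pipeline in fig. 2; the helper names (contains_target, extract, send_to_cloud) are assumptions, since the patent specifies responsibilities rather than a concrete API.

```python
# Minimal sketch of the fig. 2 pipeline. All helper names are hypothetical;
# the patent specifies the division of labour, not a concrete API.

def front_service(initial_video_frames, detector, send_to_algorithm):
    """Front-end service unit 221: keep only frames that contain a target object."""
    for frame in initial_video_frames:
        if detector.contains_target(frame):      # target object recognition operation
            send_to_algorithm(frame)             # forward to algorithm service unit 222

def algorithm_service(frame, scene_id, models, send_to_cloud):
    """Algorithm service unit 222: pick the extraction model for the frame's scene."""
    model = models[scene_id]                     # object extraction model per application scene
    target_data = model.extract(frame)           # e.g. the frame with the target marked
    send_to_cloud(target_data)                   # cloud 230 stores the target data
```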
Optionally, as shown in fig. 3, the cloud 230 includes an edge control unit 231 and a data storage unit 232, wherein the edge control unit 231 is configured to control the application scene device end 210 and the edge end 220 to be turned on or off.
The front-end service unit 221 is further configured to obtain the number corresponding to the initial video data and the extraction timestamp corresponding to each image frame containing the target object, and to send the extraction timestamps and the number to the data storage unit 232. The data storage unit 232 is configured to store the data sent by the front-end service unit.
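As a sketch only, the metadata record that the front-end service unit 221 might send to the data storage unit 232 could look like the following; the field names and HTTP endpoint are assumptions, not part of the patent.

```python
# Hedged sketch: reporting the video number and extraction timestamps to the
# data storage unit over HTTP. Endpoint and field names are assumptions.
import json
import urllib.request

def report_metadata(storage_url, video_number, extraction_timestamps):
    record = {
        "video_number": video_number,          # distinct per piece of initial video data
        "timestamps": extraction_timestamps,   # one per extracted image frame
    }
    req = urllib.request.Request(
        storage_url,                           # e.g. "http://cloud/storage/records" (hypothetical)
        data=json.dumps(record).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    urllib.request.urlopen(req)                # the data storage unit persists the record
```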
The edge end 220 further includes a streaming media service unit 223, which is configured to generate a first access link corresponding to the initial video data after obtaining that data from the application scene device end 210. The front-end service unit 221 is further configured to acquire the initial video data according to the first access link.
The algorithm service unit 222 is further configured to send the target data to the front-end service unit 221. The front-end service unit 221 is further configured to send the target data to the streaming media service unit 223. The streaming media service unit 223 is further configured to generate a second access link corresponding to the target data after obtaining it.
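The first and second access links can be pictured as tokens minted by the streaming media service unit. The sketch below assumes a URL scheme of its own, since the patent does not fix one.

```python
# Sketch of first/second access link generation in the streaming media
# service unit 223. The URL scheme is an assumption.
import uuid

class StreamingMediaService:
    def __init__(self, base_url):
        self.base_url = base_url
        self.links = {}                           # token -> stored data reference

    def register(self, data_ref):
        """Mint an access link for initial video data or target data."""
        token = uuid.uuid4().hex
        self.links[token] = data_ref
        return f"{self.base_url}/stream/{token}"  # first or second access link

# Usage: register(initial_video) yields the first access link;
# register(target_data) yields the second access link.
```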
The cloud end may be a public cloud or an industry cloud. For each image frame containing the target object, the extraction timestamp of the frame is the timestamp at which the frame was extracted from the initial video data. The cloud end obtains the target data through the data storage unit and stores it there; a user can then obtain the target data through the second access link.
The data storage unit also stores the image frames containing the target object that were not extracted by the front-end service unit, other data related to the front-end service unit and the algorithm service unit in the edge end, users' personal information, and the like.
Specifically, with the timestamp technique, the unprocessed image frame corresponding to a given timestamp and its processed counterpart can be used as one set of comparison data; after data annotation, the comparison data can serve as a training data set for the object extraction model. Obtaining from the initial video data the timestamp corresponding to the target data also makes it possible to locate the position of the target data within that video. The distinct numbers assigned to different pieces of initial video data mark each piece and distinguish it from the rest: for example, initial video data collected by the same application scene device end at different times has different numbers, and so does initial video data collected by different application scene device ends at the same time.
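A minimal sketch of this timestamp pairing, assuming both frame sets are keyed by extraction timestamp:

```python
# Sketch: pair each unprocessed frame with its processed counterpart by
# extraction timestamp to form comparison data for later annotation.
def build_comparison_set(raw_frames, processed_frames):
    """Both arguments map extraction timestamp -> frame."""
    shared = raw_frames.keys() & processed_frames.keys()
    return {ts: (raw_frames[ts], processed_frames[ts]) for ts in shared}
```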
Specifically, the application scene device end communicates with the front-end service unit through a dedicated network, including but not limited to a local area network (LAN), a 4G/5G mobile network, and a passive optical network (PON). Correspondingly, the communication protocol between them includes but is not limited to the Real-Time Messaging Protocol (RTMP), the Real-Time Streaming Protocol (RTSP), and WebRTC web real-time communication. The application scene device end is any terminal device able to exchange data with the front-end service unit, including but not limited to a network camera, a vehicle-mounted camera, an ordinary camera and a smart camera, and may be chosen according to actual requirements, which is not limited here.
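Since RTSP is among the listed protocols, ingestion on the front-end service unit could look like the OpenCV sketch below; the camera URL and the process() hook are placeholders, not from the patent.

```python
# Sketch: the front-end service unit pulling a camera stream over RTSP
# with OpenCV. The URL is a placeholder; process() is a hypothetical hook
# into the target object recognition step.
import cv2

def ingest(rtsp_url, process):
    cap = cv2.VideoCapture(rtsp_url)    # e.g. "rtsp://camera-host:554/stream"
    while cap.isOpened():
        ok, frame = cap.read()
        if not ok:                      # stream ended or dropped
            break
        process(frame)                  # hand the frame to recognition
    cap.release()
```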
Specifically, the algorithm service unit may host an object extraction model built on the Flask web application framework, or on a deep-learning serving framework such as TensorFlow Serving or TensorRT Inference Server, and the model may extract a target image or target text; that is, the target data may take the form of an image or of text, which is not limited here. For example, the object extraction model can implement face recognition, human silhouette recognition, people counting, suspicious object marking and similar functions, and it may be an object detection algorithm, a semantic segmentation algorithm, or the like. An application programming interface (API) for data input and one for data output are defined to give the algorithm service unit a standardized interface, and the edge control unit in the cloud end can replace the object extraction model in the algorithm service unit at the edge end to realize different functions.
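A minimal Flask service with such standardized input/output APIs might look like this; the route, payload format and the stub model are assumptions for illustration, where a real unit would load an actual object extraction model.

```python
# Hedged sketch of a Flask-based algorithm service with standardized
# data-input/data-output APIs. Route, payload and the stub model are
# assumptions, not the patent's concrete interface.
from flask import Flask, request, jsonify

app = Flask(__name__)

class StubExtractionModel:
    def extract(self, frame_bytes):
        return {"boxes": [], "labels": []}   # placeholder target data

model = StubExtractionModel()

@app.route("/api/v1/extract", methods=["POST"])
def extract():
    frame = request.files["frame"].read()                   # data-input API
    return jsonify({"target_data": model.extract(frame)})   # data-output API

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5000)
```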
Specifically, the following examples take the case where the algorithm service unit extracts the target object from an image frame with the object extraction model corresponding to the current application scene, and the resulting target data is the image frame with the target object marked.
In practice there may be a plurality of application scene device ends. As shown in fig. 4, when the application scene device ends are cameras, the first edge end includes a first front-end service unit, a first algorithm service unit and a first streaming media service unit; the second edge end includes a second front-end service unit, a second algorithm service unit and a second streaming media service unit; and so on up to the nth edge end, which includes an nth front-end service unit, an nth algorithm service unit and an nth streaming media service unit. Cameras 1-1, 1-2, … send their initial video data to the first streaming media service unit; cameras 2-1, 2-2, … send theirs to the second streaming media service unit; likewise, cameras n-1, n-2, … send theirs to the nth streaming media service unit. When a user wants the target data of the first edge end, it is obtained through the second access link in the first streaming media service unit; for the second edge end, through the second access link in the second streaming media service unit; and so on for the nth edge end.
In implementation, the front-end service unit may also extract image frames by frame sampling and send the sampled frames to the algorithm service unit. For example, 1 frame out of every 3 frames of the initial video data is extracted and sent to the algorithm service unit; the algorithm service unit then performs the target object recognition operation on the frame it receives, matches the frame to its application scene, and determines the object extraction model accordingly.
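The frame-sampling variant reduces to a counter; a sketch under the 1-in-3 assumption above:

```python
# Sketch of the frame-sampling variant: forward 1 of every 3 frames of the
# initial video data to the algorithm service unit.
def sample_frames(frames, step=3):
    for index, frame in enumerate(frames):
        if index % step == 0:        # 1 frame kept out of every `step` frames
            yield frame
```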
Optionally, when there are multiple application scene device ends, the corresponding edge ends are divided according to the geographic positions of the device ends. One geographic position may correspond to one or more edge ends, the number of device ends assigned to an edge end depends on that edge end's computing capability, each application scene device end may be given a different number, and any application scene device end transmits its initial video data to the edge end closest to it.
Specifically, within each edge end all units adopt a cluster topology, and requests inside the edge end (for example, a front-end service unit's request for the initial video data collected by an application scene device end) are distributed uniformly through a gateway, as shown in fig. 5; a registration center is also provided to receive users' registration information.
In implementation, the edge control unit may be deployed as a multi-machine distributed service, with data interaction realized through ZooKeeper; the edge control unit exposes a standardized RESTful API, and frameworks such as TensorFlow and Caffe are integrated inside it. The front-end service unit uses the Flask web application framework and the Eureka service discovery framework, standardizes its interfaces on the basis of RESTful APIs, and distributes requests through a Zuul gateway.
On this basis, for example, as shown in fig. 6, there are 2 application scenes: an intersection A in Beijing and an intersection B in Xi'an. There are correspondingly 2 edge ends: the Beijing edge end (comprising front-end service unit BJ-1 and algorithm service unit BJ-2), deployed in Beijing, and the Xi'an edge end (comprising front-end service unit XA-1 and algorithm service unit XA-2), deployed in Xi'an. Intersection A is provided with 4 application scene device ends, ordinary cameras A1, A2, A3 and A4, and intersection B is provided with 5, ordinary cameras B1, B2, B3, B4 and B5. The edge control unit 231 in the cloud is connected to front-end service unit BJ-1 and algorithm service unit BJ-2 in the Beijing edge end, to front-end service unit XA-1 and algorithm service unit XA-2 in the Xi'an edge end, and to the ordinary cameras at intersections A and B; the ordinary cameras at intersection A are connected to front-end service unit BJ-1 in the Beijing edge end, and those at intersection B to front-end service unit XA-1 in the Xi'an edge end.
Ordinary cameras A1 to A4 can only transmit their initial video data to front-end service unit BJ-1 in the Beijing edge end, and ordinary cameras B1 to B5 only to front-end service unit XA-1 in the Xi'an edge end. The edge control unit 231 can turn on or off front-end service unit BJ-1 and algorithm service unit BJ-2 in the Beijing edge end, front-end service unit XA-1 and algorithm service unit XA-2 in the Xi'an edge end, and the ordinary cameras at intersections A and B.
Continuing this example, suppose the intersection videos between 8:00 and 8:20 on 12 January 2018 need to be analyzed to find which electric bicycle riders were not wearing a helmet during that period.
If the ordinary cameras at intersections A and B each collect a segment of video every 10 minutes, then camera A1 alone collects two segments of initial video data, A1-1 and A1-2. After obtaining them, the front-end service unit in the Beijing edge end performs recognition on A1-1 and A1-2 to determine whether an electric bicycle is being ridden, and extracts from each segment the image frames containing a ridden electric bicycle. The algorithm service unit in the Beijing edge end then determines the corresponding object extraction model for those frames, extracts from them the frames showing riders without a helmet, and marks them, obtaining the marked image frames of electric bicycle riders who are not wearing a helmet.
Optionally, the object extraction model corresponding to the current application scenario is a first object extraction model trained in advance, or a second object extraction model uploaded by a user.
The edge control unit is further configured to improve the first object extraction model by using the data stored in the data storage unit to obtain a third object extraction model, and to replace the first object extraction model with the third object extraction model.
Specifically, the edge control unit in the cloud acquires the data stored in the data storage unit and uses it to optimize the pre-trained first object extraction model, obtaining a third object extraction model with a more accurate extraction effect.
Generally, training an object extraction model with a good extraction effect requires a large amount of sample data; if sample data is lacking during training, target data of good quality cannot be obtained. The data in the data storage unit can therefore be screened at a later stage and annotated to form new training set samples, which improves the accuracy of the object extraction model. Through the edge control unit, the object extraction model can be updated quickly, so that the outdated first object extraction model is replaced by the updated third object extraction model, improving the accuracy of that kind of object extraction model.
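As a sketch only, and assuming a Keras model saved together with its optimizer and loss (the description names TensorFlow but does not prescribe this exact workflow), the edge control unit's improvement step could look like:

```python
# Hedged sketch: improving the first object extraction model with annotated
# data from the data storage unit to obtain the third model. Assumes a
# compiled tf.keras model; the patent does not prescribe this workflow.
import tensorflow as tf

def improve_model(first_model_path, train_dataset, third_model_path, epochs=3):
    model = tf.keras.models.load_model(first_model_path)  # first object extraction model
    model.fit(train_dataset, epochs=epochs)               # fine-tune on the new training set
    model.save(third_model_path)                          # saved as the third object extraction model
```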
In this embodiment, the algorithm service unit is implemented on the Flask application framework, and the object extraction model is deployed by packaging it behind an HTTP interface. In the model-replacement process, therefore, the edge control unit shuts down the algorithm service unit, calls a prepared shell script or automation tool to replace the object extraction model file of the algorithm service unit in each edge end in turn, and restarts the algorithm service unit only after the model has been replaced.
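The replacement flow could be scripted roughly as follows; the systemd service name and file paths are assumptions, standing in for the shell script or automation tool mentioned above.

```python
# Sketch of the model hot-swap: stop the algorithm service, replace the
# model file, restart. Service name and paths are assumptions.
import shutil
import subprocess

def replace_model(new_model_path, deployed_model_path, service="algorithm-service"):
    subprocess.run(["systemctl", "stop", service], check=True)   # edge control closes the unit
    shutil.copyfile(new_model_path, deployed_model_path)         # swap in the new model file
    subprocess.run(["systemctl", "start", service], check=True)  # restart only after replacement
```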
Optionally, the front-end service unit is further configured to, before the target object recognition operation is performed on the initial video data, carry out a framing operation on the initial video data to obtain a plurality of initial image frames, and to apply at least one of sharpness processing, size adjustment and noise reduction to each initial image frame.
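A sketch of these optional per-frame operations with OpenCV; the target resolution and filter parameters are arbitrary example values, not prescribed by the patent.

```python
# Sketch of the optional pre-processing on each initial image frame:
# size adjustment, noise reduction and a simple sharpening pass.
# Resolution and filter parameters are example values only.
import cv2
import numpy as np

def preprocess(frame):
    frame = cv2.resize(frame, (1280, 720))                               # size adjustment
    frame = cv2.fastNlMeansDenoisingColored(frame, None, 10, 10, 7, 21)  # noise reduction
    sharpen = np.array([[0, -1, 0], [-1, 5, -1], [0, -1, 0]])            # sharpening kernel
    return cv2.filter2D(frame, -1, sharpen)                              # sharpness processing
```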
It should be noted that the division of the video processing system into the functional units above when processing a video is only an example; in practical applications, the functions may be allocated to different functional units as needed, that is, the internal structure of the device may be divided into different functional units to complete all or part of the functions described above.
Fig. 7 is a flowchart of a video processing method according to an embodiment of the present application. The method runs in an environment comprising an application scene device end, an edge end and a cloud end, wherein the edge end comprises a front-end service unit and an algorithm service unit and the cloud end comprises an edge control unit. The method includes the following steps:
S1, perform a target object recognition operation on the initial video data to extract the image frames containing the target object, the initial video data being the video data collected by the application scene device end.
S2, for each image frame containing the target object, determine the object extraction model corresponding to the current application scene of the frame, and extract the target object from the frame with that model to obtain the target data of the frame.
S3, store the target data corresponding to the initial video data through the cloud end.
In the above embodiments, although the steps are numbered S1, S2, S3 and so on, the numbering is only a specific example given in this application; it should be understood that embodiments may include some or all of the above steps.
It should be noted that: the video processing method and the video processing system provided by the above embodiments belong to the same concept, and specific implementation processes thereof are described in the system embodiments and are not described herein again.
Based on the same technical concept, an embodiment of the present application further provides a video processing apparatus 800, as shown in fig. 8, the apparatus including:
a target object recognition module 810, configured to perform a target object recognition operation on the initial video data to extract the image frames containing the target object, where the initial video data is video data collected through the application scene device end;
a target data determining module 820, configured to determine, for each image frame containing the target object, the object extraction model corresponding to the current application scene of the frame, and to extract the target object from the frame with that model to obtain the target data of the frame; and
a storage module 830, configured to store the target data corresponding to the initial video data through a cloud end.
It should be noted that: the video processing apparatus and the video processing system provided in the above embodiments belong to the same concept, and specific implementation processes thereof are described in system embodiments and are not described herein again.
Based on the same technical concept, as shown in fig. 9, an embodiment of the present application further provides an electronic device 900, where the electronic device 900 includes a processor 910 and a memory 920, where the memory 920 stores at least one instruction 921, and the at least one instruction 921 is loaded and executed by the processor 910 to complete the video processing method in the foregoing embodiment.
Based on the same technical concept, the embodiment of the present application further provides a storage medium, in which a computer program is stored, and the computer program is executed by a processor to complete the video processing method in the foregoing embodiment. The storage medium may be non-transitory. For example, the storage medium may be a ROM (Read-Only Memory), a RAM (Random Access Memory), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, and the like.
The above description is only exemplary of the present application and should not be taken as limiting the present application, as any modification, equivalent replacement, or improvement made within the spirit and principle of the present application should be included in the protection scope of the present application.

Claims (10)

1. A video processing system, characterized by comprising an application scene device end, an edge end and a cloud end, wherein the edge end comprises a front-end service unit and an algorithm service unit;
the front-end service unit is configured to acquire initial video data collected by the application scene device end, perform a target object recognition operation on the initial video data to extract the image frames containing a target object, and send the image frames containing the target object in the initial video data to the algorithm service unit;
the algorithm service unit is configured to determine, for each image frame containing the target object, the object extraction model corresponding to the current application scene of the frame, and to extract the target object from the frame with that model to obtain the target data of the frame;
and the cloud end is configured to store the target data corresponding to the initial video data.
2. The video processing system of claim 1, wherein the cloud end comprises a data storage unit;
the front-end service unit is further configured to acquire the number corresponding to the initial video data and the extraction timestamp corresponding to each image frame containing the target object, and to send the extraction timestamps and the number to the data storage unit;
and the data storage unit is configured to store the data sent by the front-end service unit.
3. The video processing system of claim 2, wherein the cloud end further comprises an edge control unit;
the object extraction model corresponding to the current application scene is a pre-trained first object extraction model or a second object extraction model uploaded by a user;
and the edge control unit is configured to improve the first object extraction model by using the data stored in the data storage unit to obtain a third object extraction model, and to replace the first object extraction model with the third object extraction model.
4. A video processing system according to any one of claims 1 to 3, wherein the edge end further comprises a streaming media service unit configured to generate, after obtaining the initial video data sent by the application scene device end, a first access link corresponding to the initial video data;
and the front-end service unit is further configured to acquire the initial video data according to the first access link.
5. The video processing system of claim 4, wherein the algorithm service unit is further configured to send the target data to the front-end service unit;
the front-end service unit is further configured to send the target data to the streaming media service unit;
and the streaming media service unit is further configured to generate, after obtaining the target data, a second access link corresponding to the target data.
6. A video processing system according to any of claims 1 to 3, wherein the front-end service unit is further configured to:
before performing the target object recognition operation on the initial video data, perform a framing operation on the initial video data to obtain a plurality of initial image frames;
and perform at least one of sharpness processing, size adjustment and noise reduction on each initial image frame.
7. A video processing method, characterized by comprising the following steps:
performing a target object recognition operation on initial video data to extract the image frames containing a target object, the initial video data being video data collected by an application scene device end;
for each image frame containing the target object, determining the object extraction model corresponding to the current application scene of the frame, and extracting the target object from the frame with that model to obtain the target data of the frame;
and storing the target data corresponding to the initial video data through a cloud end.
8. A video processing apparatus, characterized by comprising:
a target object recognition module, configured to perform a target object recognition operation on initial video data to extract the image frames containing a target object, the initial video data being video data collected by an application scene device end;
a target data determining module, configured to determine, for each image frame containing the target object, the object extraction model corresponding to the current application scene of the frame, and to extract the target object from the frame with that model to obtain the target data of the frame;
and a storage module, configured to store the target data corresponding to the initial video data through a cloud end.
9. An electronic device comprising a processor and a memory, the memory having stored therein at least one instruction that is loaded and executed by the processor to implement the video processing method of claim 7.
10. A storage medium, characterized in that the storage medium has stored therein a computer program which, when executed by a processor, implements the video processing method according to claim 7.
CN202210088178.0A (filed 2022-01-25, priority 2022-01-25) - Video processing system, method, device, electronic equipment and storage medium - Pending - published as CN114513681A

Priority Applications (1)

Application Number: CN202210088178.0A; Priority Date: 2022-01-25; Filing Date: 2022-01-25; Title: Video processing system, method, device, electronic equipment and storage medium

Publications (1)

Publication Number: CN114513681A; Publication Date: 2022-05-17

Family

ID=81549900

Family Applications (1)

Application Number: CN202210088178.0A; Title: Video processing system, method, device, electronic equipment and storage medium; Priority Date: 2022-01-25; Filing Date: 2022-01-25; Status: Pending

Country Status (1)

Country: CN; Link: CN114513681A

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114782899A (en) * 2022-06-15 2022-07-22 浙江大华技术股份有限公司 Image processing method and device and electronic equipment

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109523344A (en) * 2018-10-16 2019-03-26 深圳壹账通智能科技有限公司 Product information recommended method, device, computer equipment and storage medium
CN110134830A (en) * 2019-04-15 2019-08-16 深圳壹账通智能科技有限公司 Video information data processing method, device, computer equipment and storage medium
CN111556294A (en) * 2020-05-11 2020-08-18 腾讯科技(深圳)有限公司 Safety monitoring method, device, server, terminal and readable storage medium
CN113963252A (en) * 2020-07-20 2022-01-21 南方电网深圳数字电网研究院有限公司 Safety helmet wearing early warning method and system based on image recognition
CN113361458A (en) * 2021-06-29 2021-09-07 北京百度网讯科技有限公司 Target object identification method and device based on video, vehicle and road side equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination