CN115661701A - Real-time image processing method and device, electronic equipment and readable storage medium

Publication number: CN115661701A
Application number: CN202211228426.3A
Authority: CN (China)
Prior art keywords: image, frame image, semantic, target, frame
Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Inventors: 李爽, 于丽娜, 李卫军, 宁欣
Original and current assignee: Institute of Semiconductors of CAS (the listed assignees may be inaccurate; Google has not performed a legal analysis)
Other languages: Chinese (zh)
Abstract

The invention provides a real-time image processing method, a real-time image processing device, an electronic device, and a readable storage medium. The method includes the following steps: acquiring a first frame image and a second frame image corresponding to a video to be processed, where the first frame image and the second frame image are consecutive frame images; determining a first target semantic region image corresponding to the first frame image and a second target semantic region image corresponding to the second frame image, where the first target semantic region image and the second target semantic region image contain the same target feature semantic information; and processing the second frame image according to the coincidence degree of the first target semantic region image and the second target semantic region image. The method addresses the defect that existing image processing methods prevent an electronic device from processing a video in real time: it processes consecutive frame images of the video to be processed, effectively reduces unnecessary computation, shortens the processing response time for the second frame image, and thereby achieves real-time processing of the second frame image.

Description

Real-time image processing method and device, electronic equipment and readable storage medium
Technical Field
The present invention relates to the field of image processing technologies, and in particular, to a real-time image processing method and apparatus, an electronic device, and a readable storage medium.
Background
There are three existing image processing methods: processing every frame image of the video to be processed, processing a specific region image of the video to be processed, and processing the video to be processed by frame skipping.
The first method performs the same processing operation on every frame image in the video to be processed; however, as the resolution of each frame increases and accuracy requirements rise, the algorithm performing the processing operation often cannot meet the real-time requirement. The second method selects a Region Of Interest (ROI) image from all frame images of the video to be processed and performs the processing operation only on the ROI image; however, this method lacks adaptability and flexibility across different tasks, which can cause problems such as regions being missed. The third method selects a certain number of frame images from all frame images of the video to be processed by frame skipping and processes only those; however, this prolongs the processing response time after the frame content changes, so processing may be delayed.
That is, every existing image processing method has certain limitations, so the electronic device cannot process the video to be processed in real time.
Disclosure of Invention
The invention provides a real-time image processing method, a real-time image processing device, an electronic device, and a readable storage medium, which address the defect that the limitations of existing image processing methods prevent an electronic device from processing a video to be processed in real time. The method processes consecutive frame images of the video to be processed, effectively reduces unnecessary computation, shortens the processing response time for the second frame image, and thereby achieves real-time processing of the second frame image.
The invention provides a real-time image processing method, which comprises the following steps:
acquiring a first frame image and a second frame image corresponding to a video to be processed, wherein the first frame image and the second frame image are consecutive frame images;
determining a first target semantic area image corresponding to the first frame image and a second target semantic area image corresponding to the second frame image, wherein the first target semantic area image and the second target semantic area image comprise the same target characteristic semantic information;
and correspondingly processing the second frame image according to the coincidence degree of the first target semantic area image and the second target semantic area image.
According to a real-time image processing method provided by the present invention, the determining a first target semantic area image corresponding to the first frame image and a second target semantic area image corresponding to the second frame image includes: performing semantic segmentation on the first frame image to obtain a plurality of first semantic region images; determining a first target semantic area image from the plurality of first semantic area images based on the target feature semantic information; performing semantic segmentation on the second frame image to obtain a plurality of second semantic region images; and determining a second target semantic region image from the plurality of second semantic region images based on the target characteristic semantic information.
According to a real-time image processing method provided by the present invention, processing the second frame image according to the coincidence degree of the first target semantic region image and the second target semantic region image includes: determining the coincidence degree of the first target semantic region image and the second target semantic region image; when the coincidence degree is greater than a preset coincidence degree threshold, determining a first processing result corresponding to the first frame image and synchronizing the first processing result to the second frame image; and when the coincidence degree is less than or equal to the preset coincidence degree threshold, reprocessing the second frame image and updating a second processing result corresponding to the second frame image.
According to a real-time image processing method provided by the present invention, when the coincidence degree is greater than a preset coincidence degree threshold, determining a first processing result corresponding to the first frame image and synchronizing the first processing result to the second frame image includes: when the coincidence degree is greater than the preset coincidence degree threshold, performing feature recognition on the first frame image to obtain a first feature recognition result, and synchronizing the first feature recognition result to the second frame image; and/or when the coincidence degree is greater than the preset coincidence degree threshold, determining first key point information corresponding to the first frame image by using a key point positioning algorithm, and synchronizing the first key point information to the second frame image.
According to a real-time image processing method provided by the present invention, reprocessing the second frame image and updating a second processing result corresponding to the second frame image when the coincidence degree is less than or equal to the preset coincidence degree threshold includes: when the coincidence degree is less than or equal to the preset coincidence degree threshold, performing feature recognition on the second frame image again and updating a second feature recognition result corresponding to the second frame image; and/or when the coincidence degree is less than or equal to the preset coincidence degree threshold, updating second key point information corresponding to the second frame image by using a key point positioning algorithm.
According to a real-time image processing method provided by the present invention, the updating of the second key point information corresponding to the second frame image includes: and using a preset key point positioning model to reposition the second target semantic region image to obtain new key point information corresponding to the second target semantic region image.
According to a real-time image processing method provided by the invention, the semantic segmentation is performed on the first frame image to obtain a plurality of first semantic region images, and the method comprises the following steps: and performing semantic segmentation on the first frame image by using a semantic segmentation algorithm to obtain a plurality of first semantic region images.
The present invention also provides an image processing apparatus comprising:
an acquisition module, configured to acquire a first frame image and a second frame image corresponding to a video to be processed, wherein the first frame image and the second frame image are consecutive frame images;
the processing module is used for determining a first target semantic area image corresponding to the first frame image and a second target semantic area image corresponding to the second frame image, and the first target semantic area image and the second target semantic area image comprise the same target characteristic semantic information; and correspondingly processing the second frame image according to the coincidence degree of the first target semantic area image and the second target semantic area image.
The invention also provides an electronic device comprising a memory, a processor and a computer program stored on the memory and capable of running on the processor, wherein the processor implements the real-time image processing method as described in any one of the above when executing the program.
The invention also provides a non-transitory computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements a real-time image processing method as described in any of the above.
The invention also provides a computer program product comprising a computer program which, when executed by a processor, implements a real-time image processing method as described in any one of the above.
According to the real-time image processing method, the real-time image processing device, the electronic device, and the readable storage medium, a first frame image and a second frame image corresponding to a video to be processed are obtained, where the first frame image and the second frame image are consecutive frame images; a first target semantic region image corresponding to the first frame image and a second target semantic region image corresponding to the second frame image are determined, where the two target semantic region images contain the same target feature semantic information; and the second frame image is processed according to the coincidence degree of the first target semantic region image and the second target semantic region image. The method addresses the defect that the limitations of existing image processing methods prevent an electronic device from processing a video in real time: it processes consecutive frame images of the video, effectively reduces unnecessary computation, shortens the processing response time for the second frame image, and thereby achieves real-time processing of the second frame image.
Drawings
In order to more clearly illustrate the present invention or the technical solutions in the prior art, the drawings used in the embodiments or the description of the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and other drawings can be obtained by those skilled in the art without creative efforts.
FIG. 1 is a schematic flow chart of a real-time image processing method provided by the present invention;
FIG. 2 is a schematic diagram of an image processing apparatus according to the present invention;
fig. 3 is a schematic structural diagram of an electronic device provided in the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention clearer, the technical solutions of the present invention will be clearly and completely described below with reference to the accompanying drawings, and it is obvious that the described embodiments are some, but not all embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It should be noted that the electronic device according to the embodiment of the present invention may include: computers, mobile terminals, wearable devices, and the like.
The execution subject according to the embodiment of the present invention may be an image processing apparatus or an electronic device, and the embodiment of the present invention will be further described below by taking the electronic device as an example.
As shown in fig. 1, which is a schematic flow chart of a real-time image processing method provided by the present invention, the method may include:
101. and acquiring a first frame image and a second frame image corresponding to the video to be processed.
The video to be processed refers to a series of consecutive still images captured by the electronic device with a camera; these still images are called frame images, so the video to be processed contains at least two frame images. The first frame image is any one of these frame images. The second frame image and the first frame image are consecutive frame images, that is, two adjacent frame images.
After the electronic device acquires the video to be processed, it can acquire a first frame image of the video and the frame image consecutive to it, that is, a second frame image.
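The patent does not specify an implementation; as a minimal sketch, the frame-pair acquisition of step 101 can be expressed in Python as a generator that pairs each decoded frame with its successor. The name `consecutive_pairs` and the plain-iterable interface are assumptions for illustration; in a real pipeline the frames would typically come from a video decoder such as `cv2.VideoCapture`.

```python
def consecutive_pairs(frames):
    """Pair each frame with its successor: (first frame image, second frame image).

    `frames` can be any iterable of decoded frames, e.g. images read
    one by one from a video decoder. Yields nothing for a single frame,
    matching the requirement that the video contain at least two frames.
    """
    it = iter(frames)
    try:
        prev = next(it)
    except StopIteration:
        return
    for curr in it:
        yield prev, curr  # prev = first frame image, curr = second frame image
        prev = curr
```

Each yielded pair then feeds the segmentation and coincidence-degree steps described below.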
102. And determining a first target semantic area image corresponding to the first frame image and a second target semantic area image corresponding to the second frame image.
Wherein, the target semantic region image can also be called a target ROI image;
the first target semantic area image and the second target semantic area image may include the same target feature semantic information;
the semantic information of the target feature may be a semantic meaning of the visual layer, a semantic meaning of the object layer, or a semantic meaning of the concept layer, and is not limited specifically here.
The number of the first target semantic area images is 1, and the number of the second target semantic area images is 1.
Optionally, the number of the target feature semantic information is not limited.
After the electronic device acquires the first frame image, it can accurately determine the corresponding first target semantic region image based on the target feature semantic information; after it acquires the second frame image, it can accurately determine the corresponding second target semantic region image based on the same target feature semantic information, so that the electronic device can subsequently and accurately determine the coincidence degree of the first target semantic region image and the second target semantic region image.
In some embodiments, the electronic device determining a first target semantic area image corresponding to the first frame image and a second target semantic area image corresponding to the second frame image may include: the electronic equipment carries out semantic segmentation on the first frame image to obtain a plurality of first semantic region images; the electronic equipment determines a first target semantic area image from a plurality of first semantic area images based on the target characteristic semantic information; the electronic equipment performs semantic segmentation on the second frame image to obtain a plurality of second semantic region images; the electronic device determines a second target semantic area image from the plurality of second semantic area images based on the target feature semantic information.
The first semantic region image can be called a first ROI image, and the second semantic region image can be called a second ROI image; the first semantic area image and the second semantic area image include the same target feature semantic information.
After the electronic equipment acquires the first frame image, semantic segmentation can be performed on semantic meanings expressed in the first frame image to obtain a plurality of first semantic region images; then, the electronic equipment can determine target characteristic semantic information from the semantic meanings and determine a first target semantic area image from a plurality of first semantic area images based on the target characteristic semantic information; then, after acquiring a plurality of second semantic region images corresponding to the second frame image, the electronic device may determine a second target semantic region image from the plurality of second semantic region images based on the previously acquired target feature semantic information.
That is to say, the first target semantic region image and the second target semantic region image include the same target feature semantic information, so that the electronic device can improve the efficiency of determining the coincidence degree of the first target semantic region image and the second target semantic region image in the subsequent process.
Optionally, regarding the semantic meanings referred to above: at the visual level (the bottom layer), semantic meaning covers commonly understood properties such as color, texture, and shape; at the object level (the middle layer), it usually covers attribute features, where an attribute feature is the state of an object at a certain moment; at the concept level, it is what the frame image expresses, which is closest to human understanding.
For example, assuming a first frame image contains sand, blue sky, sea water, and so on, its semantic meaning at the visual level is the distinction between blocks of pixels, at the object level it is sand, blue sky, and sea water, and at the concept level it is a beach; these are the semantics expressed by the pixels in the first frame image.
In some embodiments, the semantic segmentation, performed by the electronic device, of the first frame image to obtain a plurality of first semantic region images may include: the electronic equipment performs semantic segmentation on the first frame image by using a semantic segmentation algorithm to obtain a plurality of first semantic region images.
A semantic segmentation algorithm is one by which the electronic device groups or segments the pixels in a frame image that express different semantic meanings.
Optionally, the semantic segmentation algorithm may include, but is not limited to: a Fully Convolutional Network (FCN), a Pyramid Scene Parsing Network (PSPNet), and the like.
The electronic equipment can firstly determine the semantic meaning expressed by each pixel in the first frame image by using a semantic segmentation algorithm; then, the electronic equipment groups the pixels with different semantic meanings to accurately obtain a plurality of pixel groups, and each pixel group can correspond to one first semantic region image. That is to say, the electronic device performs semantic segmentation on the first frame image by using a semantic segmentation algorithm, so that a plurality of first semantic region images can be accurately obtained.
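The grouping of pixels by semantic meaning described above can be sketched in a few lines of Python. This is an illustrative reduction, not the patent's implementation: it assumes the segmentation model has already produced a per-pixel label map (a list of rows of class labels), and the hypothetical helper `group_pixels_by_label` simply collects pixel coordinates per label, so that each group corresponds to one semantic region image.

```python
from collections import defaultdict

def group_pixels_by_label(label_map):
    """Group pixel coordinates by the semantic label predicted for each pixel.

    `label_map` is a 2-D grid (list of rows) of per-pixel class labels,
    as output by a semantic segmentation model such as FCN or PSPNet.
    Returns {label: [(y, x), ...]}; each group is one semantic region.
    """
    groups = defaultdict(list)
    for y, row in enumerate(label_map):
        for x, label in enumerate(row):
            groups[label].append((y, x))
    return dict(groups)
```

The target semantic region image is then the group whose label matches the target feature semantic information.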
Optionally, the electronic device performing semantic segmentation on the second frame image to obtain a plurality of second semantic region images may include: the electronic device performs semantic segmentation on the second frame image by using a semantic segmentation algorithm to obtain a plurality of second semantic region images.
It should be noted that, the explanation for the electronic device to acquire the multiple second semantic area images is similar to the explanation for the electronic device to acquire the multiple first semantic area images, and details are not described here.
103. And correspondingly processing the second frame image according to the coincidence degree of the first target semantic region image and the second target semantic region image.
After the electronic device acquires the first target semantic region image and the second target semantic region image, the coincidence degree of the first target semantic region image and the second target semantic region image can be determined, and the second frame image is correspondingly processed according to the coincidence degree.
If the coincidence degree is high, the second target semantic region image has changed little relative to the first target semantic region image, so the electronic device need not perform any further processing on the second frame image; if the coincidence degree is low, the second target semantic region image has changed substantially relative to the first target semantic region image, so the electronic device needs to update the result for the second frame image to ensure the accuracy of its semantic expression.
In some embodiments, the electronic device processing the second frame image according to the coincidence degree of the first target semantic region image and the second target semantic region image may include: the electronic device determines the coincidence degree of the first target semantic region image and the second target semantic region image; when the coincidence degree is greater than a preset coincidence degree threshold, the electronic device determines a first processing result corresponding to the first frame image and synchronizes the first processing result to the second frame image; and when the coincidence degree is less than or equal to the preset coincidence degree threshold, the electronic device reprocesses the second frame image and updates a second processing result corresponding to the second frame image.
The preset coincidence degree threshold may be set before the electronic device leaves the factory, or may be user-defined, and is not specifically limited here;
the processing result may include a feature recognition result, key point information, and the like.
Illustratively, assume the preset coincidence degree threshold is 95%. If the electronic device determines that the coincidence degree of the first target semantic region image and the second target semantic region image is 98%, which is greater than the 95% threshold, the two images overlap to a high degree; the electronic device can then skip further processing of the second frame image and simply synchronize the first processing result of the first frame image to it. If the electronic device determines that the coincidence degree is 82%, which is less than the 95% threshold, the overlap is low; the electronic device then performs subsequent processing on the second frame image, that is, it updates the second processing result corresponding to the second frame image.
In this way, the electronic device can respond to the changed pixels in the second frame image in time when determining that the degree of overlap between the first target semantic region image and the second target semantic region image is small.
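The patent does not fix how the coincidence degree is computed; a common realization, shown here as an assumption, is intersection-over-union (IoU) of the two region masks, with the 0.95 threshold taken from the example above. The helper names and the NumPy-mask representation are illustrative, not the patent's specification.

```python
import numpy as np

def coincidence_degree(mask_a, mask_b):
    """Overlap of two binary region masks as intersection-over-union,
    one plausible realization of the patent's 'coincidence degree'."""
    inter = np.logical_and(mask_a, mask_b).sum()
    union = np.logical_or(mask_a, mask_b).sum()
    return float(inter) / float(union) if union else 1.0

def process_second_frame(mask1, mask2, first_result, reprocess, threshold=0.95):
    """Synchronize the first frame's result when overlap is high,
    otherwise call `reprocess()` to recompute for the second frame."""
    if coincidence_degree(mask1, mask2) > threshold:
        return first_result  # high overlap: reuse the first processing result
    return reprocess()       # low overlap: reprocess the second frame
```

`reprocess` stands in for whichever operation applies: feature recognition or key point positioning, as in the implementations below.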
In some embodiments, the electronic device determines a first processing result corresponding to the first frame image and synchronizes the first processing result to the second frame image when the degree of coincidence is greater than a preset degree of coincidence threshold, which may include but is not limited to at least one of the following implementations:
Implementation 1: when the coincidence degree is greater than a preset coincidence degree threshold, the electronic device performs feature recognition on the first frame image to obtain a first feature recognition result, and synchronizes the first feature recognition result to the second frame image.
The feature recognition refers to that the electronic device processes, analyzes and understands the first frame image to recognize and obtain feature recognition results corresponding to different target objects in the first frame image.
When it determines that the coincidence degree of the first target semantic region image and the second target semantic region image is high, the electronic device can directly take the first feature recognition result corresponding to the first frame image and synchronize it to the second frame image. This spares the second frame image a large amount of unnecessary feature recognition computation, improving real-time processing of the video to be processed.
Implementation 2: when the coincidence degree is greater than a preset coincidence degree threshold, the electronic device determines first key point information corresponding to the first frame image by using a key point positioning algorithm, and synchronizes the first key point information to the second frame image.
The key point positioning algorithm, also called a key point identification algorithm, is an algorithm by which the electronic device locates the first key point information corresponding to a target object in the first frame image.
Key point information is the position information of an easily recognizable key point on the target object, and there is at least one piece of key point information.
Alternatively, the target object may be a living body (e.g., a human body, an animal, and a plant) or a non-living body (e.g., a house, a vehicle, and a road), and is not particularly limited herein.
Illustratively, the key point information of the human body may include a wrist, an elbow, a navel, a crotch, a knee, an ankle, and the like.
When it determines that the coincidence degree of the first target semantic region image and the second target semantic region image is high, the electronic device can obtain the first key point information corresponding to the first frame image by using a key point positioning algorithm and synchronize it to the second frame image. This spares the second frame image a large amount of unnecessary key point computation, improving real-time processing of the video to be processed.
In some embodiments, the electronic device reprocesses the second frame image and updates the second processing result corresponding to the second frame image when the degree of coincidence is less than or equal to the preset degree of coincidence threshold, which may include but is not limited to at least one of the following implementations:
Implementation 1: when the coincidence degree is less than or equal to the preset coincidence degree threshold, the electronic device performs feature recognition on the second frame image again and updates the second feature recognition result corresponding to the second frame image.
When the coincidence degree of the first target semantic region image and the second target semantic region image is low, the second frame image has changed substantially, and the first feature recognition result corresponding to the first frame image no longer applies to it; the electronic device therefore performs feature recognition on the second frame image again to quickly obtain the second feature recognition result corresponding to the second frame image.
Implementation 2: when the coincidence degree is less than or equal to the preset coincidence degree threshold, the electronic device updates the second key point information corresponding to the second frame image by using a key point positioning algorithm.
When the coincidence degree of the first target semantic region image and the second target semantic region image is low, the second frame image has changed substantially, and the first key point information corresponding to the first frame image no longer applies to it; the electronic device therefore performs key point positioning on the second frame image again to accurately obtain the second key point information corresponding to the second frame image.
Optionally, the step in which the electronic device determines the first key point information corresponding to the first frame image may be executed after step 101 and before step 103, which is not described in detail here.
In some embodiments, the updating, by the electronic device, second keypoint information corresponding to the second frame image may include: and the electronic equipment repositions the second target semantic area image by using a preset key point positioning model to obtain new key point information corresponding to the second target semantic area image.
The preset key point positioning model is used for repositioning the key point information of other frame images based on the first key point information corresponding to the first frame image.
Using the preset key point positioning model, the electronic device can accurately reposition the current key point information in the second target semantic area image based on the first key point information corresponding to the first frame image, thereby obtaining the new key point information corresponding to the second target semantic area image.
Therefore, when the coincidence degree is less than or equal to the preset coincidence degree threshold, the electronic device can promptly process only the second target semantic area image, i.e., the part of the second frame image that has changed, which avoids a large amount of unnecessary computation and improves the real-time processing of the video to be processed.
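One plausible realization of processing only the changed second target semantic area image is to crop the frame to the region's bounding box before invoking the key point model, then map the detected key points back to full-frame coordinates. In the sketch below the preset key point positioning model is treated as an opaque callable, and the bounding-box cropping strategy is an assumption rather than something the disclosure specifies:

```python
import numpy as np

def relocate_keypoints(frame: np.ndarray, region_mask: np.ndarray, locate_keypoints):
    """Re-run key point localization only inside the changed target semantic
    region, then map the key points back to full-frame (x, y) coordinates."""
    ys, xs = np.nonzero(region_mask)
    if ys.size == 0:
        return np.empty((0, 2))            # no target region in this frame
    y0, y1 = ys.min(), ys.max() + 1        # bounding box of the region mask
    x0, x1 = xs.min(), xs.max() + 1
    crop = frame[y0:y1, x0:x1]
    kpts = locate_keypoints(crop)          # model runs on the crop only
    return kpts + np.array([x0, y0])       # shift back to full-frame coordinates
```

Restricting the model to the bounding box is what keeps the per-frame cost proportional to the changed region rather than to the whole frame.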
In the embodiment of the invention, a first frame image and a second frame image corresponding to a video to be processed are obtained, wherein the first frame image and the second frame image are consecutive frame images; a first target semantic area image corresponding to the first frame image and a second target semantic area image corresponding to the second frame image are determined; and the second frame image is processed correspondingly according to the coincidence degree of the first target semantic region image and the second target semantic region image. The method overcomes the limitation of existing image processing methods that prevents electronic devices from processing a video in real time: by processing the consecutive frame images in the video to be processed, unnecessary computation is effectively reduced, the processing response time of the second frame image is shortened, and real-time processing of the second frame image is achieved.
The following describes the image processing apparatus provided by the present invention, and the image processing apparatus described below and the real-time image processing method described above may be referred to in correspondence with each other.
As shown in fig. 2, which is a schematic structural diagram of the image processing apparatus provided by the present invention, the apparatus may include:
an obtaining module 201, configured to obtain a first frame image and a second frame image corresponding to a video to be processed, where the first frame image and the second frame image are continuous frame images;
a processing module 202, configured to determine a first target semantic area image corresponding to the first frame image and a second target semantic area image corresponding to the second frame image, where the first target semantic area image and the second target semantic area image include the same target feature semantic information; and correspondingly processing the second frame image according to the coincidence degree of the first target semantic area image and the second target semantic area image.
Optionally, the processing module 202 is specifically configured to perform semantic segmentation on the first frame image to obtain a plurality of first semantic region images; determining a first target semantic region image from the plurality of first semantic region images based on the target feature semantic information; performing semantic segmentation on the second frame image to obtain a plurality of second semantic region images; determining a second target semantic area image from the plurality of second semantic area images based on the target feature semantic information.
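If the semantic segmentation step is taken to produce a per-pixel integer label map, then determining a target semantic region image based on the target feature semantic information reduces to selecting the pixels that carry the target label. A minimal sketch under that assumed label encoding (the specific segmentation backend is left open by the disclosure):

```python
import numpy as np

def target_region_mask(label_map: np.ndarray, target_label: int) -> np.ndarray:
    """Select the target semantic region from the per-pixel label map
    produced by the semantic segmentation step."""
    return label_map == target_label

# Example: a 3x3 label map in which label 2 carries the target feature
# semantic information; labels 0 and 1 form other semantic regions.
labels = np.array([[0, 2, 2],
                   [0, 2, 2],
                   [1, 1, 1]])
mask = target_region_mask(labels, 2)   # boolean mask of the target region
```

Applying the same selection to the label maps of the first and second frame images yields the two masks whose coincidence degree is then compared.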
Optionally, the processing module 202 is specifically configured to determine the coincidence degree of the first target semantic region image and the second target semantic region image; when the coincidence degree is greater than a preset coincidence degree threshold, determine a first processing result corresponding to the first frame image and synchronize the first processing result to the second frame image; and when the coincidence degree is less than or equal to the preset coincidence degree threshold, reprocess the second frame image and update a second processing result corresponding to the second frame image.
Optionally, the processing module 202 is specifically configured to perform feature recognition on the first frame image to obtain a first feature recognition result and synchronize the first feature recognition result to the second frame image when the coincidence degree is greater than a preset coincidence degree threshold; and/or, when the coincidence degree is greater than the preset coincidence degree threshold, determine first key point information corresponding to the first frame image by using a key point positioning algorithm and synchronize the first key point information to the second frame image.
Optionally, the processing module 202 is specifically configured to perform feature recognition again on the second frame image and update a second feature recognition result corresponding to the second frame image when the coincidence degree is less than or equal to the preset coincidence degree threshold; and/or, when the coincidence degree is less than or equal to the preset coincidence degree threshold, update second key point information corresponding to the second frame image by using a key point positioning algorithm.
Optionally, the processing module 202 is specifically configured to reposition the second target semantic area image by using a preset key point positioning model to obtain new key point information corresponding to the second target semantic area image.
Optionally, the processing module 202 is specifically configured to perform semantic segmentation on the first frame image by using a semantic segmentation algorithm to obtain a plurality of first semantic region images.
As shown in fig. 3, which is a schematic structural diagram of an electronic device provided in the present invention, the electronic device may include: a processor (processor) 310, a communication Interface (communication Interface) 320, a memory (memory) 330 and a communication bus 340, wherein the processor 310, the communication Interface 320 and the memory 330 communicate with each other via the communication bus 340. The processor 310 may invoke logic instructions in the memory 330 to perform a real-time image processing method comprising: acquiring a first frame image and a second frame image corresponding to a video to be processed, wherein the first frame image and the second frame image are consecutive frame images; determining a first target semantic area image corresponding to the first frame image and a second target semantic area image corresponding to the second frame image, wherein the first target semantic area image and the second target semantic area image comprise the same target characteristic semantic information; and correspondingly processing the second frame image according to the coincidence degree of the first target semantic area image and the second target semantic area image.
In addition, the logic instructions in the memory 330 may be implemented in the form of software functional units and stored in a computer readable storage medium when the software functional units are sold or used as independent products. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. And the aforementioned storage medium includes: various media capable of storing program codes, such as a usb disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
In another aspect, the present invention also provides a computer program product, the computer program product comprising a computer program, the computer program being storable on a non-transitory computer-readable storage medium, the computer program, when executed by a processor, being capable of executing the real-time image processing method provided by the above methods, the method comprising: acquiring a first frame image and a second frame image corresponding to a video to be processed, wherein the first frame image and the second frame image are consecutive frame images; determining a first target semantic area image corresponding to the first frame image and a second target semantic area image corresponding to the second frame image, wherein the first target semantic area image and the second target semantic area image comprise the same target characteristic semantic information; and correspondingly processing the second frame image according to the coincidence degree of the first target semantic area image and the second target semantic area image.
In yet another aspect, the present invention also provides a non-transitory computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the real-time image processing method provided by the above methods, the method comprising: acquiring a first frame image and a second frame image corresponding to a video to be processed, wherein the first frame image and the second frame image are consecutive frame images; determining a first target semantic area image corresponding to the first frame image and a second target semantic area image corresponding to the second frame image, wherein the first target semantic area image and the second target semantic area image comprise the same target characteristic semantic information; and correspondingly processing the second frame image according to the coincidence degree of the first target semantic area image and the second target semantic area image.
The above-described embodiments of the apparatus are merely illustrative, and the units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.
Through the above description of the embodiments, those skilled in the art will clearly understand that each embodiment can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware. With this understanding in mind, the above-described technical solutions may be embodied in the form of a software product, which can be stored in a computer-readable storage medium such as ROM/RAM, magnetic disk, optical disk, etc., and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the methods described in the embodiments or some parts of the embodiments.
Finally, it should be noted that: the above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims (10)

1. A method of real-time image processing, comprising:
acquiring a first frame image and a second frame image corresponding to a video to be processed, wherein the first frame image and the second frame image are consecutive frame images;
determining a first target semantic area image corresponding to the first frame image and a second target semantic area image corresponding to the second frame image, wherein the first target semantic area image and the second target semantic area image comprise the same target feature semantic information;
and correspondingly processing the second frame image according to the coincidence degree of the first target semantic region image and the second target semantic region image.
2. The method according to claim 1, wherein the determining a first target semantic region image corresponding to the first frame image and a second target semantic region image corresponding to the second frame image comprises:
performing semantic segmentation on the first frame image to obtain a plurality of first semantic region images;
determining a first target semantic area image from the plurality of first semantic area images based on the target feature semantic information;
performing semantic segmentation on the second frame image to obtain a plurality of second semantic region images;
determining a second target semantic area image from the plurality of second semantic area images based on the target feature semantic information.
3. The method according to claim 1 or 2, wherein the performing corresponding processing on the second frame image according to the coincidence degree of the first target semantic region image and the second target semantic region image comprises:
determining the coincidence degree of the first target semantic region image and the second target semantic region image;
under the condition that the coincidence degree is larger than a preset coincidence degree threshold value, determining a first processing result corresponding to the first frame image, and synchronizing the first processing result to the second frame image;
and under the condition that the coincidence degree is less than or equal to the preset coincidence degree threshold value, reprocessing the second frame image, and updating a second processing result corresponding to the second frame image.
4. The method according to claim 3, wherein the determining a first processing result corresponding to the first frame image and synchronizing the first processing result to the second frame image when the coincidence degree is greater than a preset coincidence degree threshold value includes:
under the condition that the coincidence degree is larger than a preset coincidence degree threshold value, performing feature recognition on the first frame image to obtain a first feature recognition result, and synchronizing the first feature recognition result to the second frame image; and/or
And under the condition that the coincidence degree is greater than a preset coincidence degree threshold value, determining first key point information corresponding to the first frame image by using a key point positioning algorithm, and synchronizing the first key point information to the second frame image.
5. The method according to claim 3, wherein the reprocessing the second frame image and updating the second processing result corresponding to the second frame image when the coincidence degree is less than or equal to the preset coincidence degree threshold value comprises:
under the condition that the coincidence degree is less than or equal to the preset coincidence degree threshold value, performing feature recognition on the second frame image again, and updating a second feature recognition result corresponding to the second frame image; and/or
And under the condition that the coincidence degree is less than or equal to the preset coincidence degree threshold value, updating second key point information corresponding to the second frame image by using a key point positioning algorithm.
6. The method according to claim 5, wherein the updating the second key point information corresponding to the second frame image comprises:
and repositioning the second target semantic area image by using a preset key point positioning model to obtain new key point information corresponding to the second target semantic area image.
7. The method according to claim 2, wherein the semantically segmenting the first frame image to obtain a plurality of first semantic region images comprises:
and performing semantic segmentation on the first frame image by using a semantic segmentation algorithm to obtain a plurality of first semantic region images.
8. An image processing apparatus characterized by comprising:
the device comprises an acquisition module, a processing module and a processing module, wherein the acquisition module is used for acquiring a first frame image and a second frame image corresponding to a video to be processed, and the first frame image and the second frame image are continuous frame images;
a processing module, configured to determine a first target semantic area image corresponding to the first frame image and a second target semantic area image corresponding to the second frame image, wherein the first target semantic area image and the second target semantic area image comprise the same target feature semantic information; and to perform corresponding processing on the second frame image according to the coincidence degree of the first target semantic region image and the second target semantic region image.
9. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the real-time image processing method according to any one of claims 1 to 7 when executing the program.
10. A non-transitory computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, implements the real-time image processing method according to any one of claims 1 to 7.
CN202211228426.3A 2022-10-09 2022-10-09 Real-time image processing method and device, electronic equipment and readable storage medium Pending CN115661701A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211228426.3A CN115661701A (en) 2022-10-09 2022-10-09 Real-time image processing method and device, electronic equipment and readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211228426.3A CN115661701A (en) 2022-10-09 2022-10-09 Real-time image processing method and device, electronic equipment and readable storage medium

Publications (1)

Publication Number Publication Date
CN115661701A true CN115661701A (en) 2023-01-31

Family

ID=84988090

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211228426.3A Pending CN115661701A (en) 2022-10-09 2022-10-09 Real-time image processing method and device, electronic equipment and readable storage medium

Country Status (1)

Country Link
CN (1) CN115661701A (en)

Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109657715A (en) * 2018-12-12 2019-04-19 广东工业大学 A kind of semantic segmentation method, apparatus, equipment and medium
CN110032978A (en) * 2019-04-18 2019-07-19 北京字节跳动网络技术有限公司 Method and apparatus for handling video
CN111161314A (en) * 2019-12-17 2020-05-15 中国科学院上海微系统与信息技术研究所 Target object position area determining method and device, electronic equipment and storage medium
CN111464834A (en) * 2020-04-07 2020-07-28 腾讯科技(深圳)有限公司 Video frame processing method and device, computing equipment and storage medium
CN111862213A (en) * 2020-07-29 2020-10-30 Oppo广东移动通信有限公司 Positioning method and device, electronic equipment and computer readable storage medium
CN112101344A (en) * 2020-08-25 2020-12-18 腾讯科技(深圳)有限公司 Video text tracking method and device
CN112800850A (en) * 2020-12-31 2021-05-14 上海商汤智能科技有限公司 Video processing method and device, electronic equipment and storage medium
CN113038055A (en) * 2021-01-27 2021-06-25 维沃移动通信有限公司 Image processing method and device and electronic equipment
CN113438471A (en) * 2021-06-18 2021-09-24 京东科技控股股份有限公司 Video processing method and device, electronic equipment and storage medium
CN113938752A (en) * 2021-11-30 2022-01-14 联想(北京)有限公司 Processing method and device
CN114627345A (en) * 2022-03-15 2022-06-14 锐迪科创微电子(北京)有限公司 Face attribute detection method and device, storage medium and terminal


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
栾悉道 (Luan Xidao) et al.: "多媒体情报处理技术" (Multimedia Intelligence Processing Technology), Beijing: National Defense Industry Press, pages 203-204 *

Similar Documents

Publication Publication Date Title
CN110176027B (en) Video target tracking method, device, equipment and storage medium
US11908244B2 (en) Human posture detection utilizing posture reference maps
US11830230B2 (en) Living body detection method based on facial recognition, and electronic device and storage medium
CN110532897B (en) Method and device for recognizing image of part
CN109840477B (en) Method and device for recognizing shielded face based on feature transformation
CN112419170B (en) Training method of shielding detection model and beautifying processing method of face image
CN111415358B (en) Image segmentation method, device, electronic equipment and storage medium
CN112767468A (en) Self-supervision three-dimensional reconstruction method and system based on collaborative segmentation and data enhancement
CN108564579B (en) Concrete crack detection method and detection device based on time-space correlation
CN110569844B (en) Ship recognition method and system based on deep learning
CN112418195B (en) Face key point detection method and device, electronic equipment and storage medium
CN111383232B (en) Matting method, matting device, terminal equipment and computer readable storage medium
CN111696196A (en) Three-dimensional face model reconstruction method and device
CN112767294B (en) Depth image enhancement method and device, electronic equipment and storage medium
KR20230116735A (en) Method and device for adjusting three-dimensional attitude, electronic equipment and storage medium
CN114445651A (en) Training set construction method and device of semantic segmentation model and electronic equipment
JP2023131117A (en) Joint perception model training, joint perception method, device, and medium
CN111914796B (en) Human body behavior identification method based on depth map and skeleton points
CN110046623B (en) Image feature point extraction method and camera
CN115082966B (en) Pedestrian re-recognition model training method, pedestrian re-recognition method, device and equipment
CN111652181A (en) Target tracking method and device and electronic equipment
CN115661701A (en) Real-time image processing method and device, electronic equipment and readable storage medium
CN115631108A (en) RGBD-based image defogging method and related equipment
CN113537359A (en) Training data generation method and device, computer readable medium and electronic equipment
CN113724176A (en) Multi-camera motion capture seamless connection method, device, terminal and medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination