CN113723152A - Image processing method and device and electronic equipment - Google Patents

Image processing method and device and electronic equipment

Info

Publication number
CN113723152A
CN113723152A
Authority
CN
China
Prior art keywords
target object
image
sub-images
processed
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010455973.XA
Other languages
Chinese (zh)
Inventor
李海洋
王建国
汪彪
张超
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alibaba Group Holding Ltd
Original Assignee
Alibaba Group Holding Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alibaba Group Holding Ltd
Priority to CN202010455973.XA
Publication of CN113723152A
Legal status: Pending

Landscapes

  • Image Analysis (AREA)

Abstract

The application provides an image processing method, including the following steps: obtaining an image to be processed, where the image to be processed includes at least two target objects; identifying a plurality of position areas corresponding to the target objects based on the image to be processed to obtain a plurality of sub-images, where the sub-images contain feature information of the target objects; associating the sub-images corresponding to the same target object with that target object to obtain a plurality of sub-images corresponding to the target object; and performing recognition processing on the sub-images corresponding to the target object based on the feature information of the target object to obtain a recognition result for the target object. When an image is analyzed to obtain recognition results for the target objects it contains, the sub-images corresponding to the same target object are first associated with that object, which makes the subsequent recognition of each target object simple and convenient. This solves the problem that existing methods for identifying target objects in an image are complex.

Description

Image processing method and device and electronic equipment
Technical Field
The present application relates to the field of computer technologies, and in particular, to an image processing method and apparatus, an electronic device, and a computer storage medium.
Background
With the continuous development of digital technology, image capture devices are used ever more widely in daily life, and the volume of recorded video content keeps growing. Recorded video is mainly used to retrieve useful information about a monitored environment. For example, in home intelligent monitoring, video segments containing a target object of interest can be searched for in the recorded content; in public-environment monitoring, such segments can be found in video captured in real time in order to track the motion trajectory of the object of interest.
After video recording content is obtained, target objects in it are generally identified on a per-video-frame basis. If multiple parts of at least one target object in a video frame need to be identified, the prior art mostly identifies each part of the target object separately. However, because such a method recognizes the parts independently, when a video frame contains multiple target objects the recognition results of the different parts must then be matched back to their respective target objects, which makes the recognition process cumbersome.
Disclosure of Invention
The application provides an image processing method to solve the problem that existing methods for identifying a target object in a video frame are complex. The application also provides an image processing apparatus, and a corresponding electronic device and computer storage medium.
An embodiment of the present application provides an image processing method, including:
obtaining an image to be processed, wherein the image to be processed comprises at least two target objects;
identifying a plurality of position areas corresponding to the target object based on the image to be processed to obtain a plurality of sub-images, wherein the sub-images contain characteristic information of the target object;
associating a plurality of sub-images corresponding to the same target object with the target object to obtain a plurality of sub-images corresponding to the target object;
and performing recognition processing on the plurality of sub-images corresponding to the target object based on the characteristic information of the target object to obtain a recognition result aiming at the target object.
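The four steps above can be sketched as a minimal pipeline. The stub `match_fn`, `recognize_fn`, and `combine_fn` callables and the toy dictionary representation of sub-images below are illustrative assumptions, not the patent's implementation:

```python
from collections import defaultdict

def associate_sub_images(sub_images, match_fn):
    """Step 3: group sub-images by the target object each one matches."""
    groups = defaultdict(list)
    for sub in sub_images:
        groups[match_fn(sub)].append(sub)
    return dict(groups)

def recognize_objects(groups, recognize_fn, combine_fn):
    """Step 4: recognize every sub-image of an object, then combine the
    per-sub-image results into one result for that object."""
    return {obj: combine_fn([recognize_fn(s) for s in subs])
            for obj, subs in groups.items()}

# Toy data: each sub-image carries an object id and a part label.
subs = [
    {"object": "A", "part": "face"},
    {"object": "A", "part": "head"},
    {"object": "B", "part": "face"},
]
groups = associate_sub_images(subs, match_fn=lambda s: s["object"])
results = recognize_objects(groups,
                            recognize_fn=lambda s: s["part"] + "-result",
                            combine_fn=tuple)
print(results["A"])  # → ('face-result', 'head-result')
```

Because the grouping happens before recognition, each per-part result is already attached to its object, which is the simplification the method claims.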
Optionally, the identifying, based on the feature information of the target object, the multiple sub-images corresponding to the target object to obtain an identification result for the target object includes:
respectively identifying the plurality of sub-images corresponding to the target object based on the characteristic information of the target object to obtain a plurality of identification results;
and obtaining the identification result aiming at the target object according to the plurality of identification results.
Optionally, the image to be processed refers to a video frame image;
the video frame image is obtained by the following method:
sending a request for obtaining a video file to a video obtaining device;
and receiving the video file sent by the video acquisition device, and acquiring a video frame image according to the video file.
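Decoding the received video file into frames would typically use a library such as OpenCV; the index arithmetic of sampling frames at a fixed rate can be shown on its own. The function name and the 1 Hz default are illustrative assumptions:

```python
def frame_indices(total_frames, fps, sample_rate_hz=1.0):
    """Indices of the frames to keep when downsampling a video of
    `total_frames` frames recorded at `fps` to `sample_rate_hz`."""
    step = max(1, round(fps / sample_rate_hz))
    return list(range(0, total_frames, step))

# A 4-second clip at 25 fps, sampled at 1 frame per second:
print(frame_indices(100, 25))  # → [0, 25, 50, 75]
```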
Optionally, associating the multiple sub-images corresponding to the same target object with the target object to obtain multiple sub-images corresponding to the target object includes:
obtaining a current target object;
determining a plurality of sub-images matched with the current target object based on the characteristic information of the target object contained in the sub-images;
and establishing an incidence relation between the plurality of sub-images matched with the current target object and the current target object to obtain a plurality of sub-images corresponding to the target object.
Optionally, the determining, based on feature information of a target object included in the sub-images, a plurality of sub-images matched with the current target object includes:
traversing each sub-image in the plurality of sub-images based on the characteristic information of the target object contained in the sub-image to obtain the matching degree information of each sub-image and the current target object;
and determining a plurality of sub-images matched with the current target object according to the matching degree information of each sub-image and the current target object and a preset matching degree threshold condition.
Optionally, the determining, according to the matching degree information between each sub-image and the current target object and the matching degree threshold condition, a plurality of sub-images matched with the current target object includes:
judging whether the matching degree information of the sub-image and the current target object meets the matching degree threshold condition or not;
if the sub-image meets the matching degree threshold condition, determining the sub-image as a sub-image matched with the current target object; in the same way, a plurality of sub-images matching the current target object are determined.
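The text does not fix how "matching degree" is computed; one common choice, assumed here purely for illustration, is the intersection-over-union (IoU) between a sub-image's bounding box and the current target object's box, thresholded as described above:

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union else 0.0

def match_sub_images(sub_boxes, object_box, threshold=0.5):
    """Traverse every sub-image box and keep those whose matching degree
    with the current target object meets the threshold condition."""
    return [box for box in sub_boxes if iou(box, object_box) >= threshold]

print(match_sub_images([(0, 0, 2, 2), (8, 8, 9, 9)], (0, 0, 2, 2)))
# → [(0, 0, 2, 2)]
```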
Optionally, the identifying, based on the image to be processed, a plurality of location areas corresponding to the target object to obtain a plurality of sub-images includes:
taking the image to be processed as input data of a convolutional neural network and an image area candidate box generation network to obtain a partitioning result of the image to be processed, where the convolutional neural network and the image area candidate box generation network together serve as a neural network that partitions the image to be processed to obtain the image partitioning result;
and obtaining a plurality of sub-images according to the partition result of the image to be processed.
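In practice the candidate boxes would come from a detector in the Faster R-CNN family (a backbone CNN plus a region proposal network); once the partitioning result is available as a list of boxes, obtaining the sub-images is one crop per box. The row-major nested-list image below is a toy stand-in for a real image array:

```python
def crop(image, box):
    """Crop one sub-image from a row-major 2-D image; box = (x1, y1, x2, y2)."""
    x1, y1, x2, y2 = box
    return [row[x1:x2] for row in image[y1:y2]]

def sub_images_from_partition(image, boxes):
    """Obtain a sub-image for every box in the partitioning result."""
    return [crop(image, b) for b in boxes]

img = [[0, 1, 2],
       [3, 4, 5],
       [6, 7, 8]]
print(sub_images_from_partition(img, [(1, 0, 3, 2)]))  # → [[[1, 2], [4, 5]]]
```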
Optionally, the plurality of sub-images include a first sub-image and a second sub-image;
the identifying the plurality of sub-images corresponding to the target object based on the feature information of the target object to obtain a plurality of identifying results comprises:
performing quality evaluation on the first sub-image and the second sub-image, and performing first key point positioning and second key point positioning on feature information contained in the first sub-image and feature information contained in the second sub-image respectively to obtain first key feature information and second key feature information which meet preset image quality evaluation conditions;
and identifying the first key feature information and the second key feature information which meet the preset image quality evaluation condition to obtain a first identification result and a second identification result, and confirming the first identification result and the second identification result as the plurality of identification results.
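The "preset image quality evaluation condition" is not specified in the text; a widely used proxy for sharpness, assumed here as a stand-in, is the variance of the discrete Laplacian — blurry sub-images score near zero and can be rejected before key point positioning:

```python
def laplacian_variance(img):
    """Variance of the 4-neighbour discrete Laplacian over the interior
    of a row-major 2-D grayscale image; higher means sharper."""
    h, w = len(img), len(img[0])
    vals = [img[y - 1][x] + img[y + 1][x] + img[y][x - 1] + img[y][x + 1]
            - 4 * img[y][x]
            for y in range(1, h - 1) for x in range(1, w - 1)]
    mean = sum(vals) / len(vals)
    return sum((v - mean) ** 2 for v in vals) / len(vals)

flat = [[5] * 4 for _ in range(4)]          # featureless patch
edge = [[0, 0, 10, 10] for _ in range(4)]   # sharp vertical edge
print(laplacian_variance(flat), laplacian_variance(edge))  # → 0.0 100.0
```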
Optionally, the plurality of sub-images further includes a third sub-image;
the identifying the plurality of sub-images corresponding to the target object based on the feature information of the target object to obtain a plurality of identifying results comprises:
performing third key point positioning on the feature information contained in the third sub-image to obtain third key feature information;
classifying the feature information contained in the third sub-image to obtain the category of the feature information of the third sub-image;
and obtaining a third recognition result based on the third key feature information and the feature information category of the third sub-image, and confirming the first recognition result, the second recognition result and the third recognition result as the plurality of recognition results.
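The classification of the third sub-image's feature information is likewise unspecified; a minimal stand-in, assumed here for illustration only, is nearest-centroid classification of a feature vector against per-category centroids:

```python
def classify(features, centroids):
    """Return the category whose centroid is nearest (squared Euclidean
    distance) to the feature vector."""
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(centroids, key=lambda cat: dist2(features, centroids[cat]))

# Hypothetical 2-D feature vectors and two behaviour categories:
centroids = {"sitting": (0.0, 0.0), "climbing": (1.0, 1.0)}
print(classify((0.9, 0.8), centroids))  # → climbing
```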
Optionally, the method further includes: judging whether the target objects are overlapped;
and if the target objects are overlapped, performing segmentation processing on the image to be processed.
Optionally, the method further includes: judging whether a target object in the current image meets a preset condition or not;
if the target object in the current image does not meet the preset condition, the image of which the target object meets the preset condition is obtained again, and the image of which the target object meets the preset condition is taken as the image to be processed.
Optionally, the method further includes: displaying the recognition result aiming at the target object, and obtaining feedback information aiming at the recognition result;
and judging whether to re-identify the identification result or not according to the feedback information.
Correspondingly, an embodiment of the present application provides an image processing apparatus, including:
a to-be-processed image obtaining unit, configured to obtain an image to be processed, where the image to be processed includes at least two target objects;
the sub-image obtaining unit is used for identifying a plurality of position areas corresponding to the target object based on the image to be processed and obtaining a plurality of sub-images, wherein the plurality of sub-images contain characteristic information of the target object;
the association unit is used for associating a plurality of sub-images corresponding to the same target object with the target object to obtain a plurality of sub-images corresponding to the target object;
and the target object identification unit is used for identifying the plurality of sub-images corresponding to the target object based on the characteristic information of the target object to obtain an identification result of the target object.
An embodiment of the present application provides an image processing method, including:
obtaining an image to be processed, wherein the image to be processed comprises a target object;
identifying the image to be processed based on the image to be processed to obtain a plurality of sub-images, wherein the sub-images contain characteristic information of a target object;
associating a plurality of sub-images corresponding to the same target object to obtain a plurality of sub-images corresponding to the target object;
and performing recognition processing on the plurality of sub-images corresponding to the target object based on the characteristic information of the target object to obtain a recognition result aiming at the target object.
The embodiment of the application provides a traffic image processing method, which comprises the following steps:
obtaining a traffic image to be processed, wherein the traffic image to be processed comprises vehicles;
identifying the traffic image to be processed based on the traffic image to be processed to obtain a plurality of sub-images, wherein the sub-images contain characteristic information of vehicles;
associating a plurality of sub-images corresponding to the same vehicle to obtain a plurality of sub-images corresponding to the vehicle;
and performing recognition processing on the plurality of sub-images corresponding to the vehicle based on the characteristic information of the vehicle to obtain a recognition result of the vehicle.
An embodiment of the present application provides an image processing apparatus, including an acquisition device and a recognition result acquisition device;
the acquisition device is used for acquiring an image to be processed, wherein the image to be processed comprises a target object;
the recognition result acquisition device is used for recognizing the image to be processed based on the image to be processed to obtain a plurality of sub-images, wherein the sub-images contain characteristic information of a target object; associating a plurality of sub-images corresponding to the same target object to obtain a plurality of sub-images corresponding to the target object; and performing recognition processing on the plurality of sub-images corresponding to the target object based on the characteristic information of the target object to obtain a recognition result aiming at the target object.
Optionally, the device further comprises a display device;
the display device is used for displaying the identification result aiming at the target object.
Correspondingly, an embodiment of the present application provides an electronic device, including:
a processor;
a memory for storing a computer program, the computer program being executable by the processor for performing an image processing method, the method comprising the steps of:
obtaining an image to be processed, wherein the image to be processed comprises at least two target objects;
identifying a plurality of position areas corresponding to the target object based on the image to be processed to obtain a plurality of sub-images, wherein the sub-images contain characteristic information of the target object;
associating a plurality of sub-images corresponding to the same target object with the target object to obtain a plurality of sub-images corresponding to the target object;
and performing recognition processing on the plurality of sub-images corresponding to the target object based on the characteristic information of the target object to obtain a recognition result aiming at the target object.
Correspondingly, an embodiment of the present application provides a computer storage medium storing a computer program which, when executed by a processor, performs an image processing method, the method including:
obtaining an image to be processed, wherein the image to be processed comprises at least two target objects;
identifying a plurality of position areas corresponding to the target object based on the image to be processed to obtain a plurality of sub-images, wherein the sub-images contain characteristic information of the target object;
associating a plurality of sub-images corresponding to the same target object with the target object to obtain a plurality of sub-images corresponding to the target object;
and performing recognition processing on the plurality of sub-images corresponding to the target object based on the characteristic information of the target object to obtain a recognition result aiming at the target object.
Compared with the prior art, the method has the following advantages:
an embodiment of the present application provides an image processing method, including: obtaining an image to be processed, wherein the image to be processed comprises at least two target objects; identifying a plurality of position areas corresponding to the target object based on the image to be processed to obtain a plurality of sub-images, wherein the sub-images contain characteristic information of the target object; associating a plurality of sub-images corresponding to the same target object with the target object to obtain a plurality of sub-images corresponding to the target object; and performing recognition processing on the plurality of sub-images corresponding to the target object based on the characteristic information of the target object to obtain a recognition result for the target object. When an image containing target objects is analyzed to obtain recognition results for them, the sub-images corresponding to the same target object are associated with that object, so that when a given target object is recognized, its multiple pieces of feature information are already attached to it and the subsequent recognition is simple and convenient. This solves the problem that existing methods for identifying target objects in an image are complex.
Drawings
To illustrate the embodiments of the present application or the technical solutions in the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present application, and those skilled in the art can derive other drawings from them.
Fig. 1-a is a first schematic diagram of an application scenario of an image processing method provided in the present application.
Fig. 1-B is a second schematic diagram of an application scenario of the image processing method provided in the present application.
Fig. 2 is a flowchart of an image processing method according to a first embodiment of the present application.
Fig. 3 is a schematic diagram of a behavior recognition algorithm framework according to a first embodiment of the present application.
Fig. 4 is a schematic diagram of an image processing apparatus according to a second embodiment of the present application.
Fig. 5 is a flowchart of an image processing method according to a third embodiment of the present application.
Fig. 6 is a flowchart of a traffic image processing method according to a fourth embodiment of the present application.
Fig. 7 is a schematic diagram of an image processing electronic device according to a sixth embodiment of the present application.
Detailed Description
In the following description, numerous specific details are set forth to provide a thorough understanding of the present application. However, the present application can be implemented in many ways other than those described herein, and those skilled in the art can make similar modifications without departing from the spirit of the application; the application is therefore not limited to the specific implementations disclosed below.
To present the image processing method provided by the present application more clearly, an application scenario of the method is introduced first. The image processing method provided by the application can be applied to animal identification in a zoo, for example to help an animal manager or keeper manage or raise the animals.
Fig. 1-A and Fig. 1-B are schematic diagrams of application scenarios of the image processing method provided in the present application. Describing the application scenario with reference to Fig. 1-A and Fig. 1-B: first, an image to be processed is obtained. The image to be processed may refer to an image to be recognized in the present application; in Fig. 1-A and Fig. 1-B it is represented as the image to be analyzed. The image to be recognized may be a video frame taken from a video file captured by a video camera. The video camera may be a camera device installed in a given space to collect real-time behavior data of a target object. For example, in a monkey house in a zoo, camera devices are installed to observe the daily physical health and dietary habits of a number of monkeys; it is only necessary to ensure that the number of camera devices is sufficient to cover all monkeys in the house, and that the devices can collect audio data and image data of all of them. In this scenario, the target objects are all the monkeys in the monkey house. After a video file from the camera devices is obtained, it is split into frames to obtain video frames, i.e. images to be recognized. As shown in Fig. 1-A, an image to be recognized containing two monkeys is obtained, but an image containing one monkey, or more than two, could equally serve as the image to be recognized. After the image to be recognized is obtained, region recognition is performed on it to obtain a plurality of sub-images. Specifically, the regions containing the face, head, and upper body of each monkey in the image to be recognized may be recognized and marked to obtain the sub-images.
Meanwhile, the plurality of sub-images may be further classified, for example, a sub-image corresponding to the face of the monkey may be used as the first sub-image, a sub-image corresponding to the head of the monkey may be used as the second sub-image, and a sub-image corresponding to the upper body of the monkey may be used as the third sub-image. Here, the monkey head may include a monkey face, and the monkey upper body may include the monkey head and the monkey face.
After the plurality of sub-images are obtained, each sub-image is associated with the target object it corresponds to. That is, it is determined which monkey each sub-image belongs to, and the sub-image is associated with that monkey; in practice, the sub-images corresponding to the same monkey are associated with that monkey. As shown in Fig. 1-A, a first, second, and third sub-image of the first monkey are obtained from the image to be recognized, and likewise a first, second, and third sub-image of the second monkey. The three sub-images of the first monkey are all associated with the first monkey; similarly, the three sub-images of the second monkey are all associated with the second monkey.
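One plausible way to realize this association, assumed here for illustration since the scenario does not prescribe one, is spatial containment: a face or head box is assigned to the upper-body box that encloses it, so all sub-images of the same monkey end up in one group:

```python
def contains(outer, inner):
    """True if box `outer` fully contains box `inner`; boxes are (x1, y1, x2, y2)."""
    return (outer[0] <= inner[0] and outer[1] <= inner[1]
            and outer[2] >= inner[2] and outer[3] >= inner[3])

def group_parts(body_boxes, part_boxes):
    """Associate each part box (face, head) with the upper-body box
    that contains it, keyed by the body's index."""
    groups = {i: [body] for i, body in enumerate(body_boxes)}
    for part in part_boxes:
        for i, body in enumerate(body_boxes):
            if contains(body, part):
                groups[i].append(part)
                break
    return groups

bodies = [(0, 0, 10, 10), (20, 0, 30, 10)]   # two monkeys' upper bodies
faces = [(2, 2, 4, 4), (22, 1, 24, 3)]       # one face box each
print(group_parts(bodies, faces)[0])  # → [(0, 0, 10, 10), (2, 2, 4, 4)]
```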
Then, a plurality of sub-images associated with each monkey are identified. Specifically, taking the identification of a monkey in an image as an example, the identification is to perform identification processing on a plurality of sub-images corresponding to the monkey based on the characteristic information of the monkey in the sub-images, and obtain an identification result for the monkey. More specifically, the identification processing of a plurality of sub-images corresponding to the monkey to obtain the identification result for the monkey means: first, a plurality of sub-images corresponding to the monkey are respectively identified based on the characteristic information of the monkey, and a plurality of identification results are obtained. Then, an identification result for the monkey is obtained based on the plurality of identification results.
Specifically, based on the feature information of the monkey, the plurality of sub-images corresponding to the monkey are respectively recognized to obtain a plurality of recognition results in the following manner:
firstly, quality evaluation is carried out on a first subimage corresponding to the monkey and a second subimage corresponding to the monkey, and first key point positioning and second key point positioning are respectively carried out on feature information contained in the first subimage corresponding to the monkey and feature information contained in the second subimage corresponding to the monkey, so that first key feature information and second key feature information meeting preset image quality evaluation conditions are obtained.
And then, identifying the first key feature information and the second key feature information which meet the preset image quality evaluation condition to obtain a first identification result and a second identification result.
The third sub-image is recognized in parallel with the first and second sub-images, as follows: third key point positioning is performed on the feature information contained in the third sub-image to obtain third key feature information, and that feature information is classified to obtain the feature information category of the third sub-image.
And then, obtaining a third recognition result based on the third key characteristic information and the characteristic information category of the third sub-image. Finally, the first recognition result, the second recognition result, and the third recognition result are confirmed as a plurality of recognition results.
The first recognition result may be a result of an expression of the monkey recognized based on the monkey facial feature information, the second recognition result may be a result of an identification of the monkey recognized based on the monkey head feature information, and the third recognition result may be a result of an action of the monkey recognized based on the feature information of the upper half of the monkey. As shown in fig. 1-a, taking the first monkey as an example, the first recognition result of the first monkey, the second recognition result of the first monkey, and the third recognition result of the first monkey can be obtained by using the first sub-image of the first monkey, the second sub-image of the first monkey, and the third sub-image of the first monkey, and the first recognition result of the first monkey, the second recognition result of the first monkey, and the third recognition result of the first monkey are collectively used as the recognition result of the first monkey. Similarly, a first recognition result of the second monkey, a second recognition result of the second monkey, and a third recognition result of the second monkey can be obtained by the first sub-image of the second monkey, the second sub-image of the second monkey, and the third sub-image of the second monkey, and the first recognition result of the second monkey, the second recognition result of the second monkey, and the third recognition result of the second monkey are collectively used as a recognition result of the second monkey. Of course, no matter how many target objects are contained in the image to be recognized, the recognition result corresponding to the target object can be obtained in the manner shown in fig. 1-a and fig. 1-B.
The application scenario of the image processing method described above is only an embodiment of the application scenario of the image processing method provided in the present application, and the application scenario embodiment is provided to facilitate understanding of the image processing method provided in the present application, and is not intended to limit the image processing method provided in the present application. In the embodiment of the present application, description of other application scenarios of the image processing method is omitted.
The application provides an image processing method, an image processing device, an electronic device and a computer storage medium, and the following embodiments are provided.
Fig. 2 is a flowchart illustrating an embodiment of an image processing method according to a first embodiment of the present application. The method comprises the following steps.
Step S201: and obtaining an image to be processed, wherein the image to be processed comprises at least two target objects.
As the first step of the image processing method of the first embodiment, an image to be processed is obtained, where the image to be processed includes a target object. The image to be processed may refer to a video frame image, which can be obtained as follows: first, a request for obtaining a video file is sent to a video obtaining apparatus; then the video file sent by the video obtaining apparatus is received, and a video frame image is obtained from the video file.
Specifically, taking the application of the image processing method in a zoo as an example, the image to be recognized may be a video frame from a video file captured by a video camera. The video camera may be a camera device installed in a given space to collect real-time behavior data of a target object. For example, in a monkey house in a zoo, camera devices are installed to observe the daily physical health and dietary habits of a number of monkeys; it is only necessary to ensure that the number of camera devices is sufficient to cover all monkeys in the house, and that the devices can collect audio data and image data of all of them. In this scenario, the target objects are all the monkeys in the monkey house. After a video file from the camera devices is obtained, it is split into frames to obtain video frames, i.e. images to be recognized.
To send the health status and dietary habits of all monkeys in the monkey house to the keeper's client in time, the monkeys in the monkey house need to be monitored by camera in real time. Multiple camera devices can therefore be installed at multiple positions in the monkey house, ensuring that together they can monitor all the monkeys.
After a plurality of camera devices collect image data and audio data in the monkey house for a period of time, a plurality of stored video files of the camera devices are sent to a server. Specifically, the breeder may issue a request for obtaining a video file to each of the plurality of image capturing apparatuses by using the client, and each image capturing apparatus, after receiving the request, sends the respective video file to the client or the server, thereby further implementing obtaining the to-be-processed image in step S201.
When the video files are sent to the client, the client forwards them to the server. When the video files are sent directly to the server, the server obtains them directly. In either case, the video files must eventually reach the server, where they are processed.
Step S202: and identifying a plurality of position areas corresponding to the target object based on the image to be processed to obtain a plurality of sub-images, wherein the plurality of sub-images contain characteristic information of the target object.
After the image to be processed is obtained in step S201, a plurality of position areas corresponding to the target object in the image to be processed are identified, and a plurality of sub-images are obtained, where the plurality of sub-images include feature information of the target object.
Specifically, identifying the plurality of position areas corresponding to the target object in the image to be processed and obtaining the plurality of sub-images amounts to performing partition recognition on the image to be recognized and obtaining the plurality of sub-images from the partition recognition result.
As an embodiment of identifying the plurality of position areas corresponding to the target object in the image to be processed and obtaining the plurality of sub-images, the image to be processed may be used as input data of a convolutional neural network and an image area candidate frame generation network to obtain a partitioning result of the image to be processed. The convolutional neural network and the image area candidate frame generation network together serve as a neural network that partitions the image to be processed to obtain the image partitioning result.
The convolutional neural network extracts feature information of the target object in the image, and the image area candidate frame generation network partitions the image to be recognized based on the extracted feature information to obtain a partition result. For example, for an image to be recognized that includes a plurality of monkeys, the convolutional neural network may be used to extract the monkeys' face feature information, head feature information, and upper-body feature information. After the feature information of the target object is obtained, the image area candidate frame generation network partitions the image to be recognized according to this feature information to obtain the partition result of the image to be recognized, from which the plurality of sub-images are obtained. For example, after the faces, heads, and upper bodies of the plurality of monkeys in the image to be recognized are recognized, the recognized faces, heads, and upper bodies are marked in different ways to obtain the plurality of sub-images.
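The partitioning step above can be sketched as cropping one sub-image per detected position area. This is a minimal illustration only: the detection boxes are assumed to come from the CNN and candidate-frame network, and the `(part_label, box)` data shape is an assumption of this sketch, not part of the patent.

```python
import numpy as np

def crop_sub_images(image, detections):
    """Crop one sub-image per detected position area.

    `image` is an H x W x C array; `detections` is a list of
    (part_label, (x1, y1, x2, y2)) tuples, standing in for the output
    of the CNN + image-area-candidate-frame partitioning step.
    """
    sub_images = []
    for part_label, (x1, y1, x2, y2) in detections:
        crop = image[y1:y2, x1:x2]  # slice out the position area
        sub_images.append({"part": part_label, "pixels": crop})
    return sub_images

# Toy 8x8 single-channel "frame" with two marked areas (face / upper body).
frame = np.arange(64, dtype=np.float32).reshape(8, 8, 1)
dets = [("face", (0, 0, 3, 3)), ("upper_body", (4, 4, 8, 8))]
subs = crop_sub_images(frame, dets)
print([(s["part"], s["pixels"].shape) for s in subs])
```

Each entry keeps both the pixels and a part label, so later steps can associate sub-images with a target object and pick the right recognizer per part.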
Step S203: and associating a plurality of sub-images corresponding to the same target object with the target object to obtain a plurality of sub-images corresponding to the target object.
After the plurality of position areas corresponding to the target object are identified and the plurality of sub-images are obtained in step S202, for each target object, the plurality of sub-images corresponding to the same target object are associated with that target object to obtain the plurality of sub-images corresponding to the target object.
Specifically, as a way of associating the plurality of sub-images corresponding to the same target object with the target object to obtain the plurality of sub-images corresponding to the target object: first, the current target object is obtained. Then, the plurality of sub-images matching the current target object are determined based on the feature information of the target object contained in the sub-images. Finally, an association relationship is established between the plurality of sub-images matching the current target object and the current target object, to obtain the plurality of sub-images corresponding to the target object.
For example, when two monkeys are included in the image to be recognized, if only the face, head, and upper body of the monkey in the image are recognized, six sub-images are obtained, which are a sub-image corresponding to the face of the first monkey, a sub-image corresponding to the face of the second monkey, a sub-image corresponding to the head of the first monkey, a sub-image corresponding to the head of the second monkey, a sub-image corresponding to the upper body of the first monkey, and a sub-image corresponding to the upper body of the second monkey.
For each monkey, for example monkey A of the two monkeys, the plurality of sub-images corresponding to monkey A are associated with monkey A to obtain the plurality of sub-images corresponding to monkey A. Specifically, this process determines the plurality of sub-images matching monkey A based on the monkey feature information included in the sub-images, which may be done as follows.
First, each of the six sub-images is traversed based on the monkey feature information it contains, to obtain the matching degree information between each sub-image and monkey A. Of course, all the feature information of monkey A may be stored in a database in advance, and the monkey feature information in the plurality of sub-images may be matched against the feature information of monkey A stored in the database.
And then, determining a plurality of sub-images matched with the monkey A according to the matching degree information of each sub-image and the monkey A and a preset matching degree threshold condition.
As an embodiment of the foregoing determining a plurality of sub-images matching the monkey a according to the matching degree information of each sub-image with the monkey a and a preset matching degree threshold condition: and judging whether the matching degree information of the subimages and the monkey A meets the threshold condition of the matching degree. And if the sub-image meets the matching degree threshold condition, determining the sub-image as the sub-image matched with the monkey A, and determining a plurality of sub-images matched with the monkey A in the same way.
For example, among the six sub-images, by calculating the matching degrees, the matching degrees of the sub-image corresponding to the face of the first monkey, the sub-image corresponding to the head of the first monkey, and the sub-image corresponding to the upper body of the first monkey with monkey A all satisfy the matching degree threshold condition. These three sub-images are therefore determined to be the plurality of sub-images matching monkey A, while the sub-image corresponding to the face of the second monkey, the sub-image corresponding to the head of the second monkey, and the sub-image corresponding to the upper body of the second monkey are all sub-images that do not match monkey A.
Similarly, when monkey B is taken as the current target object, the calculated matching degrees of the sub-image corresponding to the face of the first monkey, the sub-image corresponding to the head of the first monkey, and the sub-image corresponding to the upper body of the first monkey with monkey B do not satisfy the matching degree threshold condition, so these three sub-images are determined not to match monkey B. The sub-image corresponding to the face of the second monkey, the sub-image corresponding to the head of the second monkey, and the sub-image corresponding to the upper body of the second monkey are all sub-images matching monkey B.
By the above method, for each target object, the plurality of sub-images corresponding to the same target object are associated with the target object to obtain the plurality of sub-images corresponding to the target object.
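The traverse-and-threshold association above can be sketched as follows. Cosine similarity as the "matching degree" and the 0.8 threshold are illustrative assumptions of this sketch; the patent does not fix a particular similarity measure or threshold value.

```python
import numpy as np

def match_sub_images(target_feature, sub_images, threshold=0.8):
    """Associate sub-images with the current target object by feature similarity.

    `target_feature` is a reference feature vector for the current target
    (e.g. monkey A's features pre-stored in a database); each entry of
    `sub_images` carries the feature vector extracted from that sub-image.
    """
    matched = []
    for sub in sub_images:
        f = sub["feature"]
        # cosine similarity plays the role of the "matching degree"
        sim = float(np.dot(target_feature, f) /
                    (np.linalg.norm(target_feature) * np.linalg.norm(f)))
        if sim >= threshold:  # matching degree threshold condition
            matched.append(sub["name"])
    return matched

monkey_a = np.array([1.0, 0.0, 0.0])
subs = [
    {"name": "face_1", "feature": np.array([0.95, 0.1, 0.0])},  # first monkey's face
    {"name": "face_2", "feature": np.array([0.0, 1.0, 0.2])},   # second monkey's face
]
print(match_sub_images(monkey_a, subs))  # → ['face_1']
```

Running the same function with monkey B's reference vector as `target_feature` would pick out the complementary set of sub-images, matching the monkey A / monkey B example above.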
Step S204: and performing recognition processing on a plurality of sub-images corresponding to the target object based on the characteristic information of the target object to obtain a recognition result aiming at the target object.
After obtaining a plurality of sub-images corresponding to each target object in step S203, the plurality of sub-images corresponding to the target object are subjected to recognition processing for each target object based on the feature information of the target object, and a recognition result for the target object is obtained.
Specifically, based on the feature information of the target object, for each target object, the recognition processing is performed on the plurality of sub-images corresponding to the target object, and the recognition result for the target object may be obtained as described below.
First, a plurality of sub-images corresponding to a target object are respectively subjected to recognition processing based on feature information of the target object, and a plurality of recognition results are obtained.
Since the plurality of sub-images corresponding to each target object includes the first sub-image and the second sub-image, when the first sub-image and the second sub-image corresponding to the target object are recognized, the following manner may be used.
And respectively identifying the first sub-image and the second sub-image corresponding to the target object based on the characteristic information of the target object to obtain a first identification result and a second identification result corresponding to the target object.
As an embodiment of recognizing the first sub-image and the second sub-image corresponding to the target object respectively based on the feature information of the target object to obtain the first recognition result and the second recognition result corresponding to the target object: first, quality evaluation is performed on the first sub-image and the second sub-image, and first key point positioning and second key point positioning are performed on the feature information contained in the first sub-image and the feature information contained in the second sub-image respectively, to obtain first key feature information and second key feature information that satisfy a preset image quality evaluation condition. Then, the first key feature information and the second key feature information that satisfy the preset image quality evaluation condition are recognized to obtain a first recognition result and a second recognition result, and the first recognition result and the second recognition result are confirmed as the plurality of recognition results.
For example, when a face sub-image and a head sub-image of a monkey are recognized, an expression recognition result and an identity recognition result of the monkey can be obtained through the face sub-image and the head sub-image of the monkey. In practice, after the face sub-image and the head sub-image of the monkey are obtained, the quality of the face sub-image and the head sub-image of the monkey is evaluated, and an image which is not suitable for expression recognition or identity recognition can be filtered out by adopting a quality evaluation mode. For example, some blurred images are not suitable for expression recognition or identity recognition, and the images can be filtered in advance in a quality evaluation manner to improve recognition accuracy.
When expression recognition or identity recognition is performed, the key point information in the sub-image, that is, the key feature information, is used for recognition. Therefore, after the quality evaluation is performed, the first key feature information and the second key feature information are obtained, and recognition is then performed based on them. For example, for facial expression recognition, the first key feature information may include key feature information of the eyes, mouth, and the like; for identity recognition, the second key feature information may include key feature information of the eyes, nose, ears, mouth, and the like. The expression recognition result of a monkey, such as an expression indicating that the monkey is happy, unhappy, or hungry, can be obtained from the first key feature information of the face. The identity recognition result of a monkey, for example whether it is monkey A or monkey B, can be obtained from the second key feature information of the head.
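The quality-evaluation step that filters out blurred sub-images before key point recognition can be sketched with a variance-of-Laplacian sharpness score. This is one common blur heuristic, not necessarily the evaluation the patent uses, and the 100.0 threshold is an illustrative assumption.

```python
import numpy as np

def laplacian_variance(gray):
    """Sharpness score: variance of a 4-neighbour Laplacian response.

    Blurred sub-images give a flat response and score low, so they can
    be filtered out before expression or identity recognition.
    """
    lap = (-4 * gray[1:-1, 1:-1]
           + gray[:-2, 1:-1] + gray[2:, 1:-1]
           + gray[1:-1, :-2] + gray[1:-1, 2:])
    return float(lap.var())

def passes_quality(gray, threshold=100.0):
    """Preset image quality evaluation condition (threshold is assumed)."""
    return laplacian_variance(gray) >= threshold

sharp = np.zeros((6, 6)); sharp[:, 3:] = 255.0  # hard edge -> strong response
blurred = np.full((6, 6), 128.0)                # flat image -> zero response
print(passes_quality(sharp), passes_quality(blurred))  # → True False
```

Only sub-images that pass this gate would go on to key point positioning and recognition, matching the filtering of blurred images described above.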
Since the plurality of sub-images corresponding to each target object include the third sub-image, when the third sub-image corresponding to the target object is recognized, the following manner may be used.
And identifying a third sub-image corresponding to the target object based on the characteristic information of the target object to obtain a third identification result corresponding to the target object.
As an embodiment of recognizing the third sub-image corresponding to the target object based on the feature information of the target object, obtaining a third recognition result corresponding to the target object: first, a third key point is positioned on the feature information contained in the third sub-image, and third key feature information is obtained. And meanwhile, classifying the feature information contained in the third sub-image to obtain the category of the feature information of the third sub-image, namely performing action classification. And finally, obtaining a third recognition result based on the third key characteristic information and the characteristic information category of the third sub-image.
This embodiment can simultaneously recognize the behavior of each of a plurality of target objects in a plurality of consecutive video frames. Fig. 3 shows a schematic diagram of an algorithm framework for recognizing the behavior of each of a plurality of target objects in a plurality of consecutive video frames; the recognition method can be applied in a scene where the upper-body behavior of each of a plurality of monkeys is recognized simultaneously. In practice, the behavior of each target object in a plurality of consecutive video frames is recognized simultaneously based on the video file. First, the plurality of consecutive video frames are input into the convolutional neural network and the region candidate frame generation network to obtain feature maps of the consecutive frames; the region features are then aligned, and target similarity matching is performed to obtain the target similarity. That is, the algorithm can obtain not only the static behavior of a monkey but also the dynamic upper-body action behavior of the monkey across consecutive video frames. In the recognition process, features are extracted by a feature extractor, and the features of the plurality of video frames are fused, so that the action classification of a monkey across consecutive video frames is obtained.
In this embodiment, the algorithm shown in fig. 3 can be used to classify the motion, and after the motion classification is performed, a third recognition result can be obtained based on the third key feature information.
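The fuse-then-classify step of fig. 3 can be sketched as follows. Mean pooling as the temporal fusion and a linear scorer are simplifying assumptions standing in for the feature extractor and fusion network; the class names are invented for illustration.

```python
import numpy as np

def classify_action(frame_features, class_weights, class_names):
    """Fuse per-frame features, then score action classes.

    `frame_features` is a T x D array of features extracted from T
    consecutive video frames; `class_weights` is a K x D matrix with one
    score vector per action class.
    """
    fused = frame_features.mean(axis=0)  # temporal feature fusion
    scores = class_weights @ fused       # one score per action class
    return class_names[int(np.argmax(scores))]

feats = np.array([[0.9, 0.1], [0.8, 0.2], [1.0, 0.0]])  # 3 frames, D=2
weights = np.array([[1.0, 0.0],   # "climbing" responds to dimension 0
                    [0.0, 1.0]])  # "eating" responds to dimension 1
print(classify_action(feats, weights, ["climbing", "eating"]))  # → climbing
```

Because the fused feature summarizes all T frames, the classifier can pick up dynamic upper-body actions that no single frame would reveal, which is the point of the multi-frame framework.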
After the first recognition result, the second recognition result, and the third recognition result are obtained, the first recognition result, the second recognition result, and the third recognition result are regarded as a plurality of recognition results.
When the image processing method of this embodiment is adopted, whether the target objects in the image to be processed overlap can be determined in advance; if the target objects overlap, the image to be processed is segmented. Specifically, the image to be processed may be segmented in a bounding-box regression manner to resolve the overlap of target objects in the image to be processed and obtain the plurality of sub-images.
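The overlap check can be sketched as a simple bounding-box intersection test. The patent does not specify the exact criterion, so a geometric intersection test is an assumption of this sketch.

```python
def boxes_overlap(box_a, box_b):
    """Return True when two (x1, y1, x2, y2) bounding boxes intersect.

    A test like this is one way to decide whether the overlap-handling
    segmentation step is needed for a pair of detected target objects.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Boxes intersect unless one lies entirely left/right/above/below the other
    return ax1 < bx2 and bx1 < ax2 and ay1 < by2 and by1 < ay2

print(boxes_overlap((0, 0, 10, 10), (5, 5, 15, 15)))   # → True  (overlapping)
print(boxes_overlap((0, 0, 10, 10), (20, 0, 30, 10)))  # → False (disjoint)
```

Only when such a pair of boxes intersects would the bounding-box-regression segmentation described above be triggered.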
In addition, before the image to be processed is obtained, whether the target object in the current image meets the preset condition or not can be judged in advance; and if the target object in the current image does not meet the preset condition, re-acquiring the image of which the target object meets the preset condition, and taking the image of which the target object meets the preset condition as the image to be processed.
Specifically, since the image to be processed in this embodiment may include at least two target objects, the number of target objects in the image may be taken as one case of the preset condition. It can therefore be determined in advance whether the number of target objects in the current image meets a preset number condition. If it does not, an image whose number of target objects meets the preset number condition is re-obtained, and that image is taken as the image to be processed. For example, when the number of target objects in the current image is one, the image is not processed in the manner of steps S201 to S204; processing in that manner begins only once an image including at least two target objects is obtained.
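The preset-number gate can be sketched as follows. The `(image_id, num_targets)` data shape is an illustrative assumption; in practice the count would come from the detection step.

```python
def select_image_to_process(candidate_images, min_targets=2):
    """Pick the first image whose detected target count meets the
    preset number condition; otherwise keep waiting.

    Each candidate is an (image_id, num_targets) pair.
    """
    for image_id, num_targets in candidate_images:
        if num_targets >= min_targets:
            return image_id  # becomes the image to be processed
    return None              # no qualifying frame yet

frames = [("frame_001", 1), ("frame_002", 0), ("frame_003", 3)]
print(select_image_to_process(frames))  # → frame_003
```

Frames with fewer than two detected targets are skipped, mirroring the example above where a single-monkey frame is not processed by steps S201 to S204.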
In addition, the definition (sharpness) information of the target object in the image may be used as the preset condition: when the definition of the target object meets a pixel condition that allows recognition, the image whose target-object definition meets that condition is taken as the image to be processed. Of course, it can be understood that the preset condition may also be other conditions related to the target object, all of which fall within the protection scope of the present application and are not described again here.
Meanwhile, in order to improve the accuracy of the recognition result, the recognition result for the target object may be displayed to the user, and feedback information of the user for the recognition result may be obtained. For example, when the feedback information of the user is that the recognition result is correct, the recognition result may be directly used as the recognition result for the target object in the image to be processed. And when the feedback information of the user is that the recognition result is wrong, the image to be processed can be re-recognized.
The embodiment of the present application provides an image processing method, which specifically includes: obtaining an image to be processed, where the image to be processed includes at least two target objects; identifying a plurality of position areas corresponding to the target object based on the image to be processed to obtain a plurality of sub-images, where the plurality of sub-images include feature information of the target object; associating the plurality of sub-images corresponding to the same target object with the target object to obtain the plurality of sub-images corresponding to the target object; and performing recognition processing on the plurality of sub-images corresponding to the target object based on the feature information of the target object to obtain a recognition result for the target object. When an image including at least two target objects is analyzed to obtain the recognition result of each target object in the image, the plurality of sub-images corresponding to the same target object are associated with that target object, so that when a certain target object is recognized, its multiple pieces of feature information already correspond to it, making the subsequent recognition of the target object simpler and more convenient. This solves the problem that existing methods for recognizing a target object in an image are complex.
In the first embodiment described above, an image processing method is provided, and correspondingly, the present application also provides an image processing apparatus. Fig. 4 is a schematic diagram of an image processing apparatus according to a second embodiment of the present application. Since the apparatus embodiments are substantially similar to the method embodiments, they are described in a relatively simple manner, and reference may be made to some of the descriptions of the method embodiments for relevant points. The device embodiments described below are merely illustrative.
The present embodiment provides an image processing apparatus including:
a to-be-processed image obtaining unit 401, configured to obtain an image to be processed, where the image to be processed includes at least two target objects;
a sub-image obtaining unit 402, configured to identify, based on the image to be processed, a plurality of location areas corresponding to the target object, and obtain a plurality of sub-images, where the plurality of sub-images include feature information of the target object;
an associating unit 403, configured to associate multiple sub-images corresponding to the same target object with the target object, so as to obtain multiple sub-images corresponding to the target object;
a target object recognition unit 404, configured to perform recognition processing on the plurality of sub-images corresponding to the target object based on feature information of the target object, and obtain a recognition result for the target object.
Optionally, the target object recognition unit is specifically configured to:
respectively identifying the plurality of sub-images corresponding to the target object based on the characteristic information of the target object to obtain a plurality of identification results;
and obtaining the identification result aiming at the target object according to the plurality of identification results.
Optionally, the image to be processed refers to a video frame image;
the to-be-processed image obtaining unit is specifically configured to:
sending a request for obtaining a video file to a video obtaining device;
and receiving the video file sent by the video acquisition device, and acquiring a video frame image according to the video file.
Optionally, the association unit is specifically configured to:
obtaining a current target object;
determining a plurality of sub-images matched with the current target object based on the characteristic information of the target object contained in the sub-images;
and establishing an association relationship between the plurality of sub-images matched with the current target object and the current target object to obtain a plurality of sub-images corresponding to the target object.
Optionally, the association unit is specifically configured to:
traversing each sub-image in the plurality of sub-images based on the characteristic information of the target object contained in the sub-image to obtain the matching degree information of each sub-image and the current target object;
and determining a plurality of sub-images matched with the current target object according to the matching degree information of each sub-image and the current target object and a preset matching degree threshold condition.
Optionally, the association unit is specifically configured to:
judging whether the matching degree information of the sub-image and the current target object meets the matching degree threshold condition or not;
if the sub-image meets the matching degree threshold condition, determining the sub-image as a sub-image matched with the current target object; in the same way, a plurality of sub-images matching the current target object are determined.
Optionally, the sub-image obtaining unit is specifically configured to:
taking the image to be processed as input data of a convolutional neural network and image area candidate box generation network to obtain a partitioning result of the image to be processed; the convolutional neural network and the image area candidate frame generating network are jointly used as a neural network for partitioning an image to be processed to obtain an image partitioning result;
and obtaining a plurality of sub-images according to the partition result of the image to be processed.
Optionally, the plurality of sub-images include a first sub-image and a second sub-image;
the target object recognition unit is specifically configured to:
performing quality evaluation on the first sub-image and the second sub-image, and performing first key point positioning and second key point positioning on feature information contained in the first sub-image and feature information contained in the second sub-image respectively to obtain first key feature information and second key feature information which meet preset image quality evaluation conditions;
and identifying the first key feature information and the second key feature information which meet the preset image quality evaluation condition to obtain a first identification result and a second identification result, and confirming the first identification result and the second identification result as the plurality of identification results.
Optionally, the plurality of sub-images further includes a third sub-image;
the target object recognition unit is specifically configured to:
performing third key point positioning on the feature information contained in the third sub-image to obtain third key feature information;
classifying the feature information contained in the third sub-image to obtain the category of the feature information of the third sub-image;
and obtaining a third recognition result based on the third key feature information and the feature information category of the third sub-image, and confirming the first recognition result, the second recognition result and the third recognition result as the plurality of recognition results.
Optionally, the image processing apparatus further comprises a first judging unit and an image segmentation unit; the first judging unit is used for judging whether the target objects overlap;
and the image segmentation unit is used for performing segmentation processing on the image to be processed if the target objects are overlapped.
Optionally, the image processing apparatus further comprises a second judging unit and an image selection unit; the second judging unit is used for judging whether the target object in the current image meets a preset condition;
and the image selection unit is used for re-obtaining the image of which the target object meets the preset condition if the target object in the current image does not meet the preset condition, and taking the image of which the target object meets the preset condition as the image to be processed.
Optionally, the image processing apparatus further comprises a feedback information obtaining unit and a re-recognition unit; the feedback information obtaining unit is used for displaying the recognition result for the target object and obtaining feedback information for the recognition result;
and the re-identification unit is used for judging whether to re-identify the identification result or not according to the feedback information.
The present application further provides an image processing method, as shown in fig. 5, which is a flowchart of an embodiment of an image processing method provided in a third embodiment of the present application. The method comprises the following steps.
Step S501: obtaining an image to be processed, wherein the image to be processed comprises a target object.
Step S502: and identifying the image to be processed based on the image to be processed to obtain a plurality of sub-images, wherein the plurality of sub-images contain characteristic information of the target object.
Step S503: and associating a plurality of sub-images corresponding to the same target object to obtain a plurality of sub-images corresponding to the target object.
Step S504: and performing recognition processing on a plurality of sub-images corresponding to the target object based on the characteristic information of the target object to obtain a recognition result aiming at the target object.
It should be noted that the steps in this embodiment are substantially similar to those in the first embodiment; the difference is only that step S502 directly recognizes the image to be processed to obtain a plurality of sub-images, instead of identifying a plurality of position areas corresponding to the target object to obtain the plurality of sub-images, and step S503 associates the plurality of sub-images corresponding to the same target object with one another, instead of associating them with the target object. For the parts of this embodiment that are the same as the first embodiment, please refer to the detailed description of the relevant parts of the first embodiment, which is not repeated here.
Based on the third embodiment, a traffic image processing method is provided in a fourth embodiment of the present application, and as shown in fig. 6, it is a flowchart of an embodiment of the traffic image processing method provided in the fourth embodiment of the present application. The method comprises the following steps.
Step S601: and obtaining a traffic image to be processed, wherein the traffic image to be processed comprises vehicles.
Step S602: and identifying the traffic image to be processed based on the traffic image to be processed to obtain a plurality of sub-images, wherein the plurality of sub-images contain characteristic information of the vehicle.
Step S603: and associating a plurality of sub-images corresponding to the same vehicle to obtain a plurality of sub-images corresponding to the vehicle.
Step S604: and performing recognition processing on a plurality of sub-images corresponding to the vehicle based on the characteristic information of the vehicle to obtain a recognition result of the vehicle.
This embodiment can identify vehicle information from a live traffic image obtained by a camera device in a traffic scene; for example, faulty vehicles or rule-violating vehicles can be identified.
Based on the third embodiment, a fifth embodiment of the present application provides an image processing device, comprising: a collecting device, a recognition result acquiring device, and a display device.
The acquisition device is used for acquiring an image to be processed, wherein the image to be processed comprises a target object.
The recognition result acquisition device is used for recognizing the image to be processed based on the image to be processed to obtain a plurality of sub-images, wherein the sub-images contain characteristic information of a target object; associating a plurality of sub-images corresponding to the same target object to obtain a plurality of sub-images corresponding to the target object; and performing recognition processing on the plurality of sub-images corresponding to the target object based on the characteristic information of the target object to obtain a recognition result aiming at the target object.
The display device is used for displaying the identification result aiming at the target object.
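The three devices of the fifth embodiment can be sketched as a small composition of callables. This is a hypothetical sketch only; the class and parameter names (`ImageProcessingApparatus`, `acquire`, `recognize`, `display`) are illustrative and not taken from the application.

```python
class ImageProcessingApparatus:
    """Sketch of the fifth embodiment: acquisition, recognition, display.

    Each constructor argument stands in for one of the three devices."""

    def __init__(self, acquire, recognize, display):
        self.acquire = acquire      # acquisition device: yields the image
        self.recognize = recognize  # recognition result acquisition device
        self.display = display      # display device: shows the result

    def run(self):
        image = self.acquire()          # obtain the image to be processed
        result = self.recognize(image)  # sub-images -> recognition result
        self.display(result)            # present the result to the user
        return result
```

The point of the split is that acquisition, recognition, and display are independent and can each be swapped out, mirroring how the apparatus is described as three separate devices.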
The first embodiment described above provides an image processing method; correspondingly, a sixth embodiment of the present application provides an electronic device for performing the method of the first embodiment. Fig. 7 is a schematic diagram of the electronic device provided in this embodiment.
A sixth embodiment of the present application provides an electronic device, including:
a processor 701;
a memory 702 for storing a computer program which, when executed by the processor 701, performs an image processing method comprising the steps of:
obtaining an image to be processed, wherein the image to be processed comprises at least two target objects;
identifying a plurality of position areas corresponding to the target object based on the image to be processed to obtain a plurality of sub-images, wherein the sub-images contain characteristic information of the target object;
associating a plurality of sub-images corresponding to the same target object with the target object to obtain a plurality of sub-images corresponding to the target object;
and performing recognition processing on the plurality of sub-images corresponding to the target object based on the characteristic information of the target object to obtain a recognition result for the target object.
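The final step, turning the recognition results of several sub-images into one result for the target object, can be illustrated with a simple fusion rule. Majority voting is an assumption made for this sketch; the application does not prescribe a particular fusion strategy, and the function name is a placeholder.

```python
from collections import Counter

def fuse_recognition_results(results):
    """Combine per-sub-image recognition results into a single result
    for the target object, here by majority vote (illustrative only)."""
    # Counter.most_common(1) returns the single most frequent result
    return Counter(results).most_common(1)[0][0]
```

For example, if three sub-images of one vehicle yield two identical plate readings and one outlier, the repeated reading is taken as the recognition result for that vehicle.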
The first embodiment described above provides an image processing method; correspondingly, a seventh embodiment of the present application provides a computer storage medium corresponding to the method of the first embodiment.
A seventh embodiment of the present application provides a computer storage medium storing a computer program which, when executed by a processor, performs an image processing method comprising the steps of:
obtaining an image to be processed, wherein the image to be processed comprises at least two target objects;
identifying a plurality of position areas corresponding to the target object based on the image to be processed to obtain a plurality of sub-images, wherein the sub-images contain characteristic information of the target object;
associating a plurality of sub-images corresponding to the same target object with the target object to obtain a plurality of sub-images corresponding to the target object;
and performing recognition processing on the plurality of sub-images corresponding to the target object based on the characteristic information of the target object to obtain a recognition result for the target object.
Although the present application has been described with reference to the preferred embodiments, these embodiments are not intended to limit the present application. Those skilled in the art can make variations and modifications without departing from the spirit and scope of the present application; therefore, the scope of protection of the present application should be determined by the appended claims.
In a typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory. The memory may include volatile memory in a computer-readable medium, such as random access memory (RAM), and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media, including both permanent and non-permanent, removable and non-removable media, may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, modules of a program, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory media (transitory media), such as modulated data signals and carrier waves.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, a system, or a computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, and optical storage) having computer-usable program code embodied therein.

Claims (19)

1. An image processing method, comprising:
obtaining an image to be processed, wherein the image to be processed comprises at least two target objects;
identifying a plurality of position areas corresponding to the target object based on the image to be processed to obtain a plurality of sub-images, wherein the sub-images contain characteristic information of the target object;
associating a plurality of sub-images corresponding to the same target object with the target object to obtain a plurality of sub-images corresponding to the target object;
and performing recognition processing on the plurality of sub-images corresponding to the target object based on the characteristic information of the target object to obtain a recognition result for the target object.
2. The method according to claim 1, wherein the identifying the plurality of sub-images corresponding to the target object based on the feature information of the target object to obtain an identification result for the target object comprises:
respectively identifying the plurality of sub-images corresponding to the target object based on the characteristic information of the target object to obtain a plurality of identification results;
and obtaining the identification result aiming at the target object according to the plurality of identification results.
3. The method according to claim 1, wherein the image to be processed is a video frame image;
the video frame image is obtained by the following method:
sending a request for obtaining a video file to a video acquisition device;
and receiving the video file sent by the video acquisition device, and acquiring a video frame image according to the video file.
4. The method of claim 1, wherein associating the plurality of sub-images corresponding to the same target object with the target object to obtain the plurality of sub-images corresponding to the target object comprises:
obtaining a current target object;
determining a plurality of sub-images matched with the current target object based on the characteristic information of the target object contained in the sub-images;
and establishing an incidence relation between the plurality of sub-images matched with the current target object and the current target object to obtain a plurality of sub-images corresponding to the target object.
5. The method according to claim 4, wherein the determining a plurality of sub-images matching the current target object based on the feature information of the target object contained in the sub-images comprises:
traversing each sub-image in the plurality of sub-images based on the characteristic information of the target object contained in the sub-image to obtain the matching degree information of each sub-image and the current target object;
and determining a plurality of sub-images matched with the current target object according to the matching degree information of each sub-image and the current target object and a preset matching degree threshold condition.
6. The method according to claim 5, wherein determining a plurality of sub-images matching the current target object according to the matching degree information of each sub-image with the current target object and a matching degree threshold condition comprises:
judging whether the matching degree information of the sub-image and the current target object meets the matching degree threshold condition or not;
if the sub-image meets the matching degree threshold condition, determining the sub-image as a sub-image matched with the current target object; in the same way, a plurality of sub-images matching the current target object are determined.
7. The method according to claim 1, wherein the identifying a plurality of position areas corresponding to the target object based on the image to be processed to obtain a plurality of sub-images comprises:
taking the image to be processed as input data of a convolutional neural network and an image region candidate box generation network to obtain a partitioning result of the image to be processed; wherein the convolutional neural network and the image region candidate box generation network jointly serve as a neural network for partitioning the image to be processed to obtain the image partitioning result;
and obtaining a plurality of sub-images according to the partition result of the image to be processed.
8. The method of claim 2, wherein the plurality of sub-images comprises a first sub-image and a second sub-image;
the identifying the plurality of sub-images corresponding to the target object based on the feature information of the target object to obtain a plurality of identifying results comprises:
performing quality evaluation on the first sub-image and the second sub-image, and performing first key point positioning and second key point positioning on feature information contained in the first sub-image and feature information contained in the second sub-image respectively to obtain first key feature information and second key feature information which meet preset image quality evaluation conditions;
and identifying the first key feature information and the second key feature information which meet the preset image quality evaluation condition to obtain a first identification result and a second identification result, and confirming the first identification result and the second identification result as the plurality of identification results.
9. The method of claim 8, wherein the plurality of sub-images further comprises a third sub-image;
the identifying the plurality of sub-images corresponding to the target object based on the feature information of the target object to obtain a plurality of identifying results comprises:
performing third key point positioning on the feature information contained in the third sub-image to obtain third key feature information;
classifying the feature information contained in the third sub-image to obtain the category of the feature information of the third sub-image;
and obtaining a third recognition result based on the third key feature information and the feature information category of the third sub-image, and confirming the first recognition result, the second recognition result and the third recognition result as the plurality of recognition results.
10. The method of claim 1, further comprising: judging whether the target objects overlap;
and if the target objects overlap, performing segmentation processing on the image to be processed.
11. The method of claim 1, further comprising: judging whether a target object in the current image meets a preset condition or not;
if the target object in the current image does not meet the preset condition, the image of which the target object meets the preset condition is obtained again, and the image of which the target object meets the preset condition is taken as the image to be processed.
12. The method of claim 1, further comprising: displaying the recognition result aiming at the target object, and obtaining feedback information aiming at the recognition result;
and judging whether to re-identify the identification result or not according to the feedback information.
13. An image processing apparatus characterized by comprising:
the device comprises a to-be-processed image obtaining unit, a processing unit and a processing unit, wherein the to-be-processed image obtaining unit is used for obtaining an image to be processed, and the image to be processed comprises at least two target objects;
the sub-image obtaining unit is used for identifying a plurality of position areas corresponding to the target object based on the image to be processed and obtaining a plurality of sub-images, wherein the plurality of sub-images contain characteristic information of the target object;
the association unit is used for associating a plurality of sub-images corresponding to the same target object with the target object to obtain a plurality of sub-images corresponding to the target object;
and the target object identification unit is used for identifying the plurality of sub-images corresponding to the target object based on the characteristic information of the target object to obtain an identification result of the target object.
14. An image processing method, comprising:
obtaining an image to be processed, wherein the image to be processed comprises a target object;
identifying the image to be processed based on the image to be processed to obtain a plurality of sub-images, wherein the sub-images contain characteristic information of a target object;
associating a plurality of sub-images corresponding to the same target object to obtain a plurality of sub-images corresponding to the target object;
and performing recognition processing on the plurality of sub-images corresponding to the target object based on the characteristic information of the target object to obtain a recognition result for the target object.
15. A traffic image processing method, characterized by comprising:
obtaining a traffic image to be processed, wherein the traffic image to be processed comprises vehicles;
identifying the traffic image to be processed based on the traffic image to be processed to obtain a plurality of sub-images, wherein the sub-images contain characteristic information of vehicles;
associating a plurality of sub-images corresponding to the same vehicle to obtain a plurality of sub-images corresponding to the vehicle;
and performing recognition processing on the plurality of sub-images corresponding to the vehicle based on the characteristic information of the vehicle to obtain a recognition result of the vehicle.
16. An image processing apparatus, characterized by comprising: an acquisition device and a recognition result acquisition device;
the acquisition device is used for acquiring an image to be processed, wherein the image to be processed comprises a target object;
the recognition result acquisition device is used for identifying the image to be processed to obtain a plurality of sub-images, wherein the sub-images contain characteristic information of the target object; associating a plurality of sub-images corresponding to the same target object to obtain a plurality of sub-images corresponding to the target object; and performing recognition processing on the plurality of sub-images corresponding to the target object based on the characteristic information of the target object to obtain a recognition result for the target object.
17. The apparatus according to claim 16, further comprising a display device;
the display device is used for displaying the identification result aiming at the target object.
18. An electronic device, comprising:
a processor;
a memory for storing a computer program, the computer program being executable by the processor for performing an image processing method, the method comprising the steps of:
obtaining an image to be processed, wherein the image to be processed comprises at least two target objects;
identifying a plurality of position areas corresponding to the target object based on the image to be processed to obtain a plurality of sub-images, wherein the sub-images contain characteristic information of the target object;
associating a plurality of sub-images corresponding to the same target object with the target object to obtain a plurality of sub-images corresponding to the target object;
and performing recognition processing on the plurality of sub-images corresponding to the target object based on the characteristic information of the target object to obtain a recognition result for the target object.
19. A computer storage medium storing a computer program to be executed by a processor to perform an image processing method, the method comprising:
obtaining an image to be processed, wherein the image to be processed comprises at least two target objects;
identifying a plurality of position areas corresponding to the target object based on the image to be processed to obtain a plurality of sub-images, wherein the sub-images contain characteristic information of the target object;
associating a plurality of sub-images corresponding to the same target object with the target object to obtain a plurality of sub-images corresponding to the target object;
and performing recognition processing on the plurality of sub-images corresponding to the target object based on the characteristic information of the target object to obtain a recognition result for the target object.
CN202010455973.XA 2020-05-26 2020-05-26 Image processing method and device and electronic equipment Pending CN113723152A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010455973.XA CN113723152A (en) 2020-05-26 2020-05-26 Image processing method and device and electronic equipment


Publications (1)

Publication Number Publication Date
CN113723152A true CN113723152A (en) 2021-11-30

Family

ID=78672088

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010455973.XA Pending CN113723152A (en) 2020-05-26 2020-05-26 Image processing method and device and electronic equipment

Country Status (1)

Country Link
CN (1) CN113723152A (en)


Patent Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2005266718A (en) * 2004-03-22 2005-09-29 Olympus Corp Microscopic image photographing system
WO2019174439A1 (en) * 2018-03-13 2019-09-19 腾讯科技(深圳)有限公司 Image recognition method and apparatus, and terminal and storage medium
CN110569795A (en) * 2018-03-13 2019-12-13 腾讯科技(深圳)有限公司 Image identification method and device and related equipment
CN108875667A (en) * 2018-06-27 2018-11-23 北京字节跳动网络技术有限公司 target identification method, device, terminal device and storage medium
CN108898134A (en) * 2018-06-27 2018-11-27 北京字节跳动网络技术有限公司 number identification method, device, terminal device and storage medium
CN109040588A (en) * 2018-08-16 2018-12-18 Oppo广东移动通信有限公司 Photographic method, device, storage medium and the terminal of facial image
WO2020052530A1 (en) * 2018-09-12 2020-03-19 腾讯科技(深圳)有限公司 Image processing method and device and related apparatus
CN109376409A (en) * 2018-09-30 2019-02-22 中民筑友科技投资有限公司 A kind of pre-buried map generalization method, apparatus of three-dimensional water power, equipment and storage medium
CN110175980A (en) * 2019-04-11 2019-08-27 平安科技(深圳)有限公司 Image definition recognition methods, image definition identification device and terminal device
CN110889437A (en) * 2019-11-06 2020-03-17 北京达佳互联信息技术有限公司 Image processing method and device, electronic equipment and storage medium

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Song Xiaoru; Zhao Nan; Gao Song; Chen Chaobo: "Research on a Vehicle Feature Extraction Algorithm Based on Wavelet Moments", Computer Measurement & Control, no. 09, 19 September 2018 (2018-09-19), pages 145-150 *
Hao Linqian; Huang Jinfeng: "Research on the Application of Image Understanding Technology in Traffic Video Analysis", Computer Programming Skills & Maintenance, no. 03, 18 March 2018 (2018-03-18), pages 146-149 *


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination