CN111783724A - Target object identification method and device

Publication number: CN111783724A (application CN202010674587.XA); granted as CN111783724B
Authority: CN (China)
Prior art keywords: target object, human body, image, local, local position
Legal status: Granted; Active
Inventor: 黄程 (Huang Cheng)
Assignee (current and original): Shanghai Yitu Network Science and Technology Co Ltd
Other languages: Chinese (zh)
Application filed by Shanghai Yitu Network Science and Technology Co Ltd

Classifications

    • G06V40/20: Movements or behaviour, e.g. gesture recognition (recognition of biometric, human-related or animal-related patterns in image or video data)
    • G06F18/22: Matching criteria, e.g. proximity measures (pattern recognition; analysing)
    • G06V10/26: Segmentation of patterns in the image field; cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; detection of occlusion (image preprocessing)
    • G06V10/44: Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; connectivity analysis, e.g. of connected components (extraction of image or video features)
    • G06V40/16: Human faces, e.g. facial parts, sketches or expressions (human or animal bodies; body parts)
    • G06V2201/07: Target detection (indexing scheme relating to image or video recognition or understanding)


Abstract

The application discloses a target object identification method and apparatus, belonging to image processing technology. The method includes: if an image to be identified is detected to contain a portrait, obtaining a human body detection frame, and segmenting a first image block from the image to be identified according to the enlarged detection frame; if the first image block is detected to contain a target object, extracting human body posture information of the portrait contained in the first image block, the posture information including human body actions and human body local positions; when the human body action meets a preset action condition, determining the target object local position corresponding to the posture information in the first image block according to the stored correspondence among human body actions, human body local positions, and target object local positions; and extracting local feature information of the target object according to the determined target object local position. In this way, the accuracy, efficiency, and effectiveness of target object identification are improved.

Description

Target object identification method and device
Technical Field
The present application relates to image processing technologies, and in particular, to a target object identification method and apparatus.
Background
With the popularization of video monitoring equipment, such equipment is generally deployed in key public areas of cities, such as major roads, key districts, and important places, to support crime fighting, social management, and the like.
In the prior art, a target object (such as a motorcycle or a bicycle) generally needs to be identified in order to determine its attribute information, such as the license plate and vehicle type.
However, both the accuracy and the efficiency of target object recognition are low, so how to improve them is a problem to be solved.
Disclosure of Invention
The embodiment of the application provides a target object identification method and device, which are used for improving the accuracy and the identification efficiency of target object identification when the target object is identified.
In one aspect, a target object identification method is provided, including:
receiving a target object identification request aiming at an image to be identified;
if the image to be recognized contains the portrait, obtaining a human body detection frame, and amplifying the human body detection frame according to a preset multiple;
according to the amplified human body detection frame, segmenting a first image block from an image to be recognized;
if the first image block is detected to contain the target object, extracting human body posture information of the portrait contained in the first image block, wherein the human body posture information comprises human body actions and human body local positions;
when the human body action meets the preset action condition, determining the local position of the target object corresponding to the human body posture information in the first image block according to the stored corresponding relation among the human body action, the local position of the human body and the local position of the target object;
and extracting local characteristic information of the target object according to the determined local position of the target object.
Preferably, if it is detected that the image to be recognized includes a portrait, obtaining a human detection frame includes:
carrying out human body detection on the image to be identified to obtain a human body detection result;
if the human body detection result represents that the image to be recognized contains the portrait, determining the coordinates of each human body key point contained in the image to be recognized;
and determining a human body detection frame according to the coordinates of each human body key point.
Preferably, if it is detected that the first image block includes the target object, extracting human body posture information of a human image included in the first image block includes:
carrying out target object detection on the first image block to obtain a target object detection result;
and if the target object detection result represents that the target object exists in the first image block, determining the human body local position and the human body action of each human body part according to the coordinates of each human body key point.
Preferably, extracting the local feature information of the target object according to the determined local position of the target object includes:
respectively segmenting corresponding local images of the target object from the first image block according to the local position of each target object;
respectively determining a feature recognition mode corresponding to each target object part according to the stored corresponding relation between the target object parts and the feature recognition modes;
and respectively identifying the corresponding local position of the target object according to the characteristic identification mode corresponding to each target object part to obtain corresponding local characteristic information.
Preferably, further comprising:
respectively determining the confidence degree of the local characteristic information of the local position of each target object;
adjusting the local position of the target object of which the confidence degree does not meet the preset identification condition;
and executing the step of extracting the local characteristic information of the target object according to the determined local position of the target object.
In one aspect, a target object recognition apparatus is provided, including:
a receiving unit configured to receive a target object identification request for an image to be identified;
the detection unit is used for obtaining a human body detection frame if the image to be identified contains a portrait, and amplifying the human body detection frame according to a preset multiple;
the segmentation unit is used for segmenting a first image block from the image to be recognized according to the amplified human body detection frame;
the first extraction unit is used for extracting human body posture information of a portrait contained in the first image block if the first image block is detected to contain the target object, wherein the human body posture information comprises human body actions and human body local positions;
the determining unit is used for determining the local position of the target object corresponding to the human body posture information in the first image block according to the stored corresponding relation among the human body action, the local position of the human body and the local position of the target object when the human body action meets the preset action condition;
and the second extraction unit is used for extracting the local characteristic information of the target object according to the determined local position of the target object.
Preferably, the detection unit is configured to:
carrying out human body detection on the image to be identified to obtain a human body detection result;
if the human body detection result represents that the image to be recognized contains the portrait, determining the coordinates of each human body key point contained in the image to be recognized;
and determining a human body detection frame according to the coordinates of each human body key point.
Preferably, the first extraction unit is configured to:
carrying out target object detection on the first image block to obtain a target object detection result;
and if the target object detection result represents that the target object exists in the first image block, determining the human body local position and the human body action of each human body part according to the coordinates of each human body key point.
Preferably, the second extraction unit is configured to:
respectively segmenting corresponding local images of the target object from the first image block according to the local position of each target object;
respectively determining a feature recognition mode corresponding to each target object part according to the stored corresponding relation between the target object parts and the feature recognition modes;
and respectively identifying the corresponding local position of the target object according to the characteristic identification mode corresponding to each target object part to obtain corresponding local characteristic information.
Preferably, the second extraction unit is configured to:
respectively determining the confidence degree of the local characteristic information of the local position of each target object;
adjusting the local position of the target object of which the confidence degree does not meet the preset identification condition;
and executing the step of extracting the local characteristic information of the target object according to the determined local position of the target object.
In one aspect, there is provided a control apparatus comprising:
at least one memory for storing program instructions;
at least one processor for calling the program instructions stored in the memory and executing the steps of any of the above-mentioned target object recognition methods according to the obtained program instructions.
In one aspect, a computer-readable storage medium is provided, on which a computer program is stored, which computer program, when being executed by a processor, carries out the steps of any of the above-mentioned target object identification methods.
With the target object identification method and apparatus of the present application, a target object identification request for an image to be identified is received; if the image is detected to contain a portrait, a human body detection frame is obtained and enlarged by a preset multiple; a first image block is segmented from the image according to the enlarged detection frame; if the first image block is detected to contain a target object, human body posture information of the portrait contained in the block is extracted, the posture information including human body actions and human body local positions; when the human body action meets a preset action condition, the target object local position corresponding to the posture information in the first image block is determined according to the stored correspondence among human body actions, human body local positions, and target object local positions; and local feature information of the target object is extracted according to the determined local position. In this way, the complex steps of target object identification are simplified, and the accuracy, the identification efficiency, and the effectiveness of target object identification are improved.
Additional features and advantages of the application will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by the practice of the application. The objectives and other advantages of the application may be realized and attained by the structure particularly pointed out in the written description and claims hereof as well as the appended drawings.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the application and together with the description serve to explain the application and not to limit the application. In the drawings:
fig. 1 is a flowchart of an implementation of a target object identification method in an embodiment of the present application;
FIG. 2 is a schematic diagram of a human body key point in an embodiment of the present application;
FIG. 3a is a diagram illustrating an example of a first image block according to an embodiment of the present application;
FIG. 3b is a bicycle riding example in the embodiment of the present application;
fig. 4 is a schematic view of a flow architecture for target object identification according to an embodiment of the present disclosure;
fig. 5 is a schematic structural diagram of a target object recognition apparatus according to an embodiment of the present application;
fig. 6 is a schematic structural diagram of a control device in an embodiment of the present application.
Detailed Description
In order to make the purpose, technical solution, and beneficial effects of the present application clearer, the application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit it.
The following explains some terms in the embodiments of the present application so as to be understood by those skilled in the art.
Terminal device: may be a mobile terminal, a fixed terminal, or a portable terminal, such as a mobile handset, station, unit, device, multimedia computer, multimedia tablet, internet node, communicator, desktop computer, laptop computer, notebook computer, netbook computer, tablet computer, personal communication system device, personal navigation device, personal digital assistant, audio/video player, digital camera/camcorder, positioning device, television receiver, radio broadcast receiver, electronic book device, or gaming device, including the accessories and peripherals of these devices, or any combination thereof. The terminal device can also support any type of user interface (e.g., a wearable device), and the like.
Server: may be an independent physical server, a server cluster or distributed system composed of multiple physical servers, or a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, big data, and artificial intelligence platforms.
In order to improve the accuracy and the recognition efficiency of target object recognition when the target object is recognized, the embodiment of the application provides a target object recognition method and a target object recognition device.
In the embodiments of the present application, two-wheeled and three-wheeled vehicles, such as motorcycles, bicycles, battery cars, and tricycles, are taken as examples of the target object; in practical applications, the target object may be another object, which is not limited herein. The execution subject is a control device. Optionally, the control device may be a terminal device or a server.
Fig. 1 is a flowchart illustrating an implementation of a target object identification method according to the present application. The specific implementation flow of the method is as follows:
step 100: the control device receives a target object recognition request for an image to be recognized.
Specifically, the control device receives a target object identification request for an image to be identified sent by another device, or generates such a request from a user instruction received through interaction with an application page.
The image to be identified is acquired by the image acquisition equipment. Optionally, the image to be recognized may be received by the control device from another device, may also be stored locally by the control device, may be a single picture, or may also be a video frame extracted from the surveillance video. The image capturing device may be a video camera or a still camera or the like for image capturing.
The target object identification request is used for requesting identification of a target object contained in the image to be identified and obtaining local characteristic information of the target object. One target object may exist in the image to be recognized, or a plurality of target objects may exist.
The local feature information may include any one or any combination of the following attributes: a license plate, a target object type, a target object brand, a target object model, and a target object color. In practical applications, the attribute information of the target object may set other attributes according to practical application scenarios, which is not limited herein.
Further, if the control device receives a target object identification request for a video, it extracts the video frames, performs portrait detection on them, takes each frame containing a portrait as an image to be identified according to the detection results, and performs target object identification on each such image in turn in the subsequent steps.
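As an illustration of this frame-filtering step, the following is a minimal sketch assuming OpenCV is available; the HOG pedestrian detector and the sampling stride are stand-ins for the unspecified portrait-detection model, not part of the application.

```python
import cv2

# Stand-in person detector: OpenCV's HOG pedestrian detector. Any
# human-detection model could be substituted here.
_hog = cv2.HOGDescriptor()
_hog.setSVMDetector(cv2.HOGDescriptor_getDefaultPeopleDetector())

def contains_portrait(frame) -> bool:
    rects, _ = _hog.detectMultiScale(frame)
    return len(rects) > 0

def frames_to_identify(video_path: str, stride: int = 5):
    """Yield (index, frame) for video frames that contain a portrait."""
    cap = cv2.VideoCapture(video_path)
    index = 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        # Sample every `stride`-th frame to bound the detection workload.
        if index % stride == 0 and contains_portrait(frame):
            yield index, frame
        index += 1
    cap.release()
```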
Step 101: and if the control equipment detects that the image to be recognized contains the portrait, segmenting a first image block containing the portrait from the image to be recognized.
Specifically, when step 101 is executed, the control device may adopt the following steps:
s1010: and if the image to be recognized contains the portrait, obtaining a human body detection frame.
Specifically, the control device performs human body detection on the image to be recognized to obtain a human body detection result, and if the human body detection result represents that the image to be recognized contains the human image, the control device determines the coordinates of each human body key point contained in the image to be recognized. And the control equipment determines the human body detection frame according to the coordinates of the key points of each human body.
And the human body detection result indicates whether a human image exists in the image to be identified.
Note that before performing S1010, the control device presets the human body key points, such as the top of the head, the sole of the foot, the knee, and the elbow. The control device then determines the position of each human body key point in the image to be identified and obtains its coordinates.
Fig. 2 is a schematic diagram of human body key points. The figure in Fig. 2 contains 14 human body key points: key point 1 (top of head), key point 2 (chin), ..., key point 14 (ankle).
Further, the control device may also screen the portrait included in the image to be recognized, that is, recognize only the specified target user, so that only the target object used by the target user is recognized in the subsequent step.
Specifically, the control device performs face recognition on the portrait contained in the image to be recognized, determines user information, and determines coordinates of human key points of the target user in the image to be recognized when the portrait contained in the image to be recognized is determined to be the target user according to the user information.
S1011: and the control equipment amplifies the human body detection frame according to a preset multiple.
In one embodiment, the control device determines the size of the human body detection frame according to the coordinates of each human body key point, and enlarges the size of the human body detection frame according to a preset multiple.
In one embodiment, before performing S1011, the control device obtains a plurality of image samples each containing a portrait and a target object, obtains for each sample the size of the human detection frame and the size of the target object, and determines the ratio between the target object size and the detection frame size. Finally, the control device takes the maximum of all these ratios as the preset multiple.
In practical applications, the preset multiple may be set according to a practical application scenario, for example, 2 times, which is not limited herein.
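For illustration, a minimal sketch of this calibration, under the assumption that each sample is annotated with a (width, height) size for the human detection frame and for the target object; the preset multiple is the largest per-dimension ratio observed:

```python
def preset_multiple(samples):
    """samples: iterable of ((body_w, body_h), (obj_w, obj_h)) pixel sizes
    measured on image samples containing both a portrait and a target
    object. Returns the maximum object-to-frame size ratio."""
    ratios = []
    for (bw, bh), (ow, oh) in samples:
        # The enlarged detection frame must cover the object in both
        # dimensions, so compare the width and height ratios separately.
        ratios.append(max(ow / bw, oh / bh))
    return max(ratios)

# Example: these three samples yield a preset multiple of 2.0.
print(preset_multiple([((100, 200), (150, 120)),
                       ((80, 160), (160, 100)),
                       ((90, 180), (120, 90))]))
```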
S1012: and the control equipment divides the first image block from the image to be recognized according to the amplified human body detection frame.
Wherein the first image block contains a portrait.
Fig. 3a is a diagram illustrating an example of a first image block. The control device obtains the size of a human body detection frame of a human body in the image to be recognized, and divides a first image block containing a portrait from the image to be recognized according to the enlarged human body detection frame.
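The following sketch combines S1011 and S1012, assuming the detection frame is the tight bounding box of the human key points and is enlarged about its centre (the application does not fix the enlargement anchor):

```python
import numpy as np

def crop_first_image_block(image, keypoints, multiple=2.0):
    """Bound the human key points, enlarge the frame about its centre by
    the preset `multiple`, clip to the image, and crop the first image
    block. `keypoints` is a sequence of (x, y) coordinates."""
    h, w = image.shape[:2]
    pts = np.asarray(keypoints, dtype=float)
    x0, y0 = pts.min(axis=0)            # tight human body detection frame
    x1, y1 = pts.max(axis=0)
    cx, cy = (x0 + x1) / 2, (y0 + y1) / 2
    half_w = (x1 - x0) * multiple / 2   # enlarged frame half-extents
    half_h = (y1 - y0) * multiple / 2
    xa, ya = max(0, int(cx - half_w)), max(0, int(cy - half_h))
    xb, yb = min(w, int(cx + half_w)), min(h, int(cy + half_h))
    return image[ya:yb, xa:xb]
```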
Step 102: and if the first image block is detected to contain the target object, extracting human body posture information of the portrait contained in the first image block.
In executing step 102, the control apparatus may adopt the following steps:
s1021: and the control equipment detects the target object of the first image block to obtain a target object detection result.
And the target object detection result indicates whether the target object exists in the first image block.
S1022: and if the target object detection result shows that the target object exists in the first image block, the control equipment determines the human body local position and the human body action of each human body part according to the coordinates of each human body key point.
Specifically, the control device obtains the stored correspondence between the human body parts and the human body key points, and determines the coordinates of each human body key point corresponding to each human body part respectively. The control equipment reconstructs the joints and limbs of the human body through the coordinates of key points of the human body, and determines the local position and the action of the human body according to the joints and the limbs of the human body.
Further, the control device may determine the human body orientation based on the coordinates of the respective human body key points and the relative positional relationship between the respective human body key points.
For example, the control device determines that the portrait faces the outside of the image to be recognized according to the coordinates of the key points of each human body, acquires the shooting direction of the image acquisition device, and determines the opposite direction of the shooting direction as the human body facing direction.
The human body posture information includes human body actions, human body local positions and human body orientations, and in practical application, the human body posture information can be set according to practical application scenes without limitation.
The target objects are mainly two-wheeled vehicles and three-wheeled vehicles, i.e., motorcycles, bicycles, battery cars and three-wheeled vehicles, and can also be other objects, which are not limited herein.
Therefore, the human body posture in the image to be recognized can be recognized according to the human body key points.
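Purely as an illustration of deriving human local positions from key points, the sketch below takes each part's region as a padded bounding box of its key points; the part-to-key-point grouping, the indices, and the margin are assumptions, not the stored correspondence of the application:

```python
import numpy as np

# Hypothetical grouping of parts to key-point indices (Fig. 2 numbers its
# key points 1-14; 0-based indices are used here, and the pairing of
# indices to left/right wrists, hips, and ankles is assumed).
PART_KEYPOINTS = {
    "hand": [7, 10],
    "hip":  [8, 11],
    "foot": [9, 13],
}

def part_regions(keypoints, margin=15):
    """keypoints: sequence of (x, y). Returns part -> (x0, y0, x1, y1),
    each part's local position in the first image block."""
    regions = {}
    for part, idxs in PART_KEYPOINTS.items():
        pts = np.asarray([keypoints[i] for i in idxs], dtype=float)
        x0, y0 = pts.min(axis=0) - margin
        x1, y1 = pts.max(axis=0) + margin
        regions[part] = (float(x0), float(y0), float(x1), float(y1))
    return regions
```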
Step 103: and when the human body action meets the preset action condition, the control equipment determines the local position of the target object corresponding to the human body posture information according to the stored corresponding relation among the human body action, the local position of the human body and the local position of the target object.
Specifically, when the human body action meets the preset action condition, the control device determines the target object local position corresponding to the human body posture information, and then, according to the stored correspondence between target object local positions and feature recognition modes, identifies that local position using the corresponding feature recognition mode to obtain the corresponding local feature information.
The preset action condition is determined according to a specified action when the user operates the target object.
In one embodiment, the predetermined movement condition is a human movement as a cycling movement or a pushing movement.
In practical application, the preset action condition may also be set according to a practical application scenario, which is not limited herein.
When step 103 is executed, the following steps may be adopted:
s1031: and when the human body action meets the preset action condition, the control equipment acquires the stored corresponding relation among the human body action, the human body local position and the target object local position.
For example, the human body movement is a standing movement, a riding movement, or the like. The local position of the human body is the area of the human body part in the first image block, and the local position of the target object is the area of the target object part in the first image block.
Optionally, the human body part may be any one or any combination of the following: the head, shoulders, hands, waist, hips, knees and feet of a human body. The target object part can be any one or any combination of the following: the bicycle comprises a bicycle head, a license plate, a bicycle seat, a bicycle handle, a bicycle logo and the like.
In practical application, the human body part and the target object part can be set according to a practical application scene, and are not limited herein.
Before executing S1031, the control device pre-determines a correspondence relationship among the human body motion, the local position of the human body, and the local position of the target object.
S1032: and the control equipment determines the local position of the human body and the local position of the target object correspondingly set by the human body action according to the corresponding relation.
For example, assuming the human body action is a pushing (cart) motion, the region of the person's hands in the image is obtained, and the region of the bicycle's handlebars in the image is determined according to the correspondence.
In one embodiment, referring to fig. 3b, which is a bicycle riding example, as can be seen from fig. 3b, if the human body movement is determined to be a bicycle riding movement according to the human body movement, the areas of the hands and the hips of the human body in the first image block (i.e., the local positions of the human body) can be obtained, and the areas of the seat and the handlebar in the first image block (i.e., the local positions of the target object) are determined according to the correspondence.
In this way, the regions of the respective parts of the target object in the image can be specified.
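A table-lookup sketch of S1031 and S1032 follows; the correspondence entries are illustrative assumptions (the embodiment names riding and pushing as preset actions), and anchoring each object part at its paired human part's region is a simplification:

```python
# Stored correspondence among human action, human local position, and
# target object local position (entries are illustrative).
CORRESPONDENCE = {
    "riding":  [("hand", "handlebar"), ("hip", "seat")],
    "pushing": [("hand", "handlebar")],
}
PRESET_ACTIONS = set(CORRESPONDENCE)  # riding or pushing, per the embodiment

def object_part_regions(action, human_part_regions):
    """human_part_regions: dict of human part -> (x0, y0, x1, y1) in the
    first image block. Returns the target object parts' regions, or {}
    when the action does not meet the preset action condition."""
    if action not in PRESET_ACTIONS:
        return {}
    regions = {}
    for human_part, object_part in CORRESPONDENCE[action]:
        if human_part in human_part_regions:
            # Simplification: seed the object part's local position with
            # the paired human part's region; a real system would offset
            # and scale this prior.
            regions[object_part] = human_part_regions[human_part]
    return regions
```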
Step 104: and the control equipment extracts the local characteristic information of the target object according to the determined local position of the target object.
Specifically, when step 104 is executed, the following steps may be adopted:
s1041: and the control equipment divides corresponding local images of the target object from the first image block according to the local position of each target object.
S1042: the control device acquires the stored correspondence between the target object portion and the feature recognition mode.
Before executing S1042, the control apparatus establishes a correspondence relationship between the target object portion and the feature recognition manner in advance.
The feature recognition mode may be: image matching recognition, image color recognition, or image character (number) recognition.
For example, the vehicle logo can be recognized by image matching, any target object part can be recognized by image color, and the license plate can be recognized by image character (number) recognition.
In practical application, the corresponding relationship between the target object location and the feature recognition mode may also be set according to a practical application scenario, which is not limited herein.
S1043: the control device determines the feature recognition mode corresponding to each target object part respectively according to the stored corresponding relation between the target object part and the feature recognition mode.
S1044: and the control equipment identifies the corresponding local position of the target object according to the characteristic identification mode corresponding to each target object part to obtain corresponding local characteristic information.
Specifically, the control device identifies the local position of the corresponding target object according to the feature identification mode corresponding to each target object part, and identifies the attribute value corresponding to each attribute of each target object part.
Wherein, the attribute may include any one or any combination of the following: target object color, license plate number, vehicle type, etc.
In practical application, the target object attribute may also be set according to a practical application scenario, which is not limited herein.
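As a sketch of S1041 through S1044, the dispatch below routes each target object part to a stored feature recognition mode; the part names and the placeholder recognizers are assumptions (per the text above, image color recognition serves as the fallback for any part):

```python
# Placeholder recognizers; real implementations would run image matching,
# color analysis, and character (number) recognition respectively.
def match_logo(patch):  return {"brand": "unknown"}
def read_color(patch):  return {"color": "unknown"}
def read_plate(patch):  return {"plate": "unknown"}

# Stored correspondence between target object parts and feature
# recognition modes (entries assumed for illustration).
RECOGNITION_MODE = {
    "logo": match_logo,
    "seat": read_color,
    "license_plate": read_plate,
}

def extract_local_features(first_block, object_part_regions):
    """Crop each target object part from the first image block and apply
    the feature recognition mode stored for that part."""
    features = {}
    for part, (x0, y0, x1, y1) in object_part_regions.items():
        patch = first_block[int(y0):int(y1), int(x0):int(x1)]
        recognize = RECOGNITION_MODE.get(part, read_color)  # color fallback
        features[part] = recognize(patch)
    return features
```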
Further, to improve the accuracy of the local feature information, the control device may further determine the confidence level of the local feature information of the local position of each target object, adjust the local position of the target object whose confidence level does not meet the preset identification condition, and execute step 104 according to the adjusted local position of the target object.
In one embodiment, the control device may further perform the following steps for each attribute of the local position of each target object, respectively:
step a: the control device determines a degree of confidence of the property value of the local position of the target object.
Step b: if the confidence level does not meet the preset recognition condition, the local position of the target object is adjusted, and step 104 is executed according to the adjusted local position of the target object.
Wherein the confidence level refers to the probability that the true parameter value falls within a certain interval around the sample statistic.
In one embodiment, the preset identification condition may be as follows: the confidence degrees of the attribute values are all higher than a preset confidence threshold value, or the attribute values are not higher than the preset confidence threshold value, and the variation amplitude of the attribute values is lower than a preset variation threshold value.
In practical application, the preset confidence threshold value and the preset variation threshold value may be set according to a practical application scenario, which is not limited herein.
In one embodiment, the variation amplitude of an attribute value of a target object part is determined by the following steps:
when the target object part does not meet the preset recognition condition, the local position of the target object is adjusted and step 104 is executed to obtain a new attribute value; the difference between the new attribute value and the historical attribute value is used as the variation amplitude.
In practical applications, the variation amplitude may also be a difference value or a difference value variation rate between multiple attribute values, and is not limited herein.
In one embodiment, the control device amplifies the local position of the target object according to a preset ratio.
This is because a confidence level that does not meet the preset recognition condition indicates that recognition of the target object part is inaccurate, i.e., the corresponding local position of the target object may be too small; therefore, the local position is enlarged.
In practical applications, the preset ratio may be set according to practical application scenarios, for example, 2, which is not limited herein.
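A minimal sketch of this confidence-driven adjustment loop, with assumed threshold and round-limit values; `recognize` stands for re-running step 104 on the adjusted local position:

```python
CONF_THRESHOLD = 0.8  # preset confidence threshold (assumed value)
RATIO = 2.0           # preset enlargement ratio (the example value above)

def enlarge(region, ratio=RATIO):
    """Scale an (x0, y0, x1, y1) region about its centre."""
    x0, y0, x1, y1 = region
    cx, cy = (x0 + x1) / 2, (y0 + y1) / 2
    hw, hh = (x1 - x0) * ratio / 2, (y1 - y0) * ratio / 2
    return (cx - hw, cy - hh, cx + hw, cy + hh)

def refine(region, recognize, max_rounds=3):
    """recognize(region) -> (attribute_value, confidence), i.e. step 104.
    Enlarge the local position and re-extract until the confidence meets
    the preset recognition condition or the value stops changing."""
    value, conf = recognize(region)
    for _ in range(max_rounds):
        if conf >= CONF_THRESHOLD:
            break                          # recognition condition met
        region = enlarge(region)           # adjust the local position
        new_value, new_conf = recognize(region)
        stabilised = (new_value == value)  # variation amplitude ~ zero
        value, conf = new_value, new_conf
        if stabilised:
            break                          # value no longer changing
    return value, conf
```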
Further, the control device may also obtain a comprehensive recognition result of the target object based on each local feature information.
Specifically, when determining the comprehensive identification result, the following steps may be adopted:
the method comprises the following steps: and the control equipment acquires the category corresponding to each attribute of the target object.
Specifically, the control device divides the attributes of each target object into a plurality of categories in advance.
In one embodiment, the control device classifies the license plate number, the brand of the vehicle, the model of the vehicle, and the like as attributes of a first type and the vehicle color as attributes of a second type.
In practical application, the first type attribute and the second type attribute may be set according to a practical application scenario, which is not limited herein.
Step two: the control device determines a confidence level for each attribute value of the first type of attribute.
The first type attribute may be one or more.
If the first type of attribute is multiple, the control equipment respectively determines the confidence degree of each attribute value of each first type of attribute.
Step three: the control device determines the maximum confidence level of the confidence levels corresponding to each first type of attribute respectively.
Step four: and the control equipment takes the attribute value corresponding to the maximum confidence degree of each first type of attribute as a corresponding target attribute value.
Step five: the control device determines a target object recognition result according to the attribute values of the second attributes of the target object parts and the target attribute value of the first attribute.
In one embodiment, the combination of the attribute values of each second type attribute and the target attribute value of the first type attribute are respectively used as the target object identification result.
Optionally, the second type of attribute may be a target object color. In this way, the target object color of each target object part of the target object can be determined.
For example, the head is silver, the seat is blue, the tail is black, the license plate is yellow, and the emblem is gold.
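For illustration, a sketch of steps one through five under assumed data shapes: the highest-confidence value is kept for each first type attribute, and second type values (e.g. per-part colors) are passed through:

```python
# Attributes assumed to be first type (single true value) per the text:
FIRST_TYPE = {"license_plate", "brand", "model"}

def comprehensive_result(observations, second_type_values):
    """observations: list of (attribute, value, confidence) tuples gathered
    from the local feature information of the target object parts.
    second_type_values: dict of second type attributes, e.g. per-part color."""
    best = {}
    for attr, value, conf in observations:
        # Steps two to four: keep the highest-confidence value per attribute.
        if attr in FIRST_TYPE and conf > best.get(attr, (None, -1.0))[1]:
            best[attr] = (value, conf)
    # Step five: combine target values with the second type attribute values.
    result = {attr: value for attr, (value, _) in best.items()}
    result.update(second_type_values)
    return result

# Example: the 0.9-confidence plate reading wins over the 0.7 one.
print(comprehensive_result(
    [("license_plate", "A12345", 0.7), ("license_plate", "A12845", 0.9),
     ("brand", "X", 0.6)],
    {"head_color": "silver", "seat_color": "blue"}))
```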
Further, if the control device receives a target object identification request for a video stream, it screens out the video frames containing the same person and the same target object, obtains the confidence level of each attribute value of the target object in each screened frame using steps 100 to 104 above, and, for each first type attribute, takes the attribute value with the highest confidence among those recognized across the frames as the target attribute value.
In one embodiment, for every two adjacent video frames, it is determined whether both frames contain a portrait; if so, whether the portraits in the different video frames are the same person may be determined by face recognition, face matching between the video frames, or human body recognition.
It should be noted that human body recognition cannot keep identifying a person as the same human body once the clothes and appearance change; therefore, when the figures in different video frames are matched as the same person through the human body recognition result, they are determined to be the same person only within a short time span.
In one embodiment, for every two adjacent video frames, when it is determined that the figures in the two video frames are the same person, the target object features in each video frame are extracted, and whether the target object in the two video frames is one or not is determined through feature matching according to the target object features.
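A hedged sketch of this feature-matching check, assuming the target object features are embedding vectors compared by cosine similarity; the similarity threshold is an assumption:

```python
import numpy as np

def same_target_object(feat_a, feat_b, threshold=0.85):
    """Return True if two frames' target object feature vectors match
    closely enough to be treated as one object (threshold assumed)."""
    a = np.asarray(feat_a, dtype=float)
    b = np.asarray(feat_b, dtype=float)
    cosine = float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
    return cosine >= threshold
```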
In the conventional technology, target object recognition is usually performed directly on target objects such as motorcycles. The recognition result depends on the capture quality of the image acquisition device, the recognition steps are complex, and both the recognition accuracy and the recognition efficiency are low. In addition, target objects such as motorcycles, battery cars, and bicycles parked at the roadside may be recognized repeatedly, and unattended ones are recognized needlessly; since such recognition serves no scenario of searching for vehicles or for people through target objects, a large amount of system resources is wasted.
In the embodiment of the application, whether a target object exists near a portrait is judged from the portrait in the picture to be identified, and the target object is identified quickly according to the user's operation action on it. This improves the accuracy and efficiency of identifying effective target objects, avoids extra hardware occupation and waste of system resources, and raises the peak load the system can accommodate. The method is mainly applied to identifying motorcycles, battery cars, bicycles, and the like that are in use, for example in vehicle-damage cases, vehicle-search cases, and scenarios of finding people through target objects. It thereby supports intelligent management of bicycles and motorcycles, recovery of lost or stolen bicycles, and linked searches across multiple image acquisition devices by human body features or target object features, so that the historical traces of a bicycle or motorcycle can be located quickly and more information support can be provided for enforcement personnel.
Fig. 4 is a schematic diagram of a flow architecture of target object recognition. As can be seen from fig. 4, the architecture flow of target object recognition is as follows:
step 400: the control device receives a target object identification request for a picture to be identified.
Step 401: the control device determines the size of a human detection frame in the image to be recognized.
Step 402: and the control equipment divides the first image block from the image to be recognized according to the size of the human body detection frame.
Step 403: and the control equipment determines the local position of the target object in the first image block according to the human body posture information of the first image block.
Step 404: and the control equipment extracts the local characteristic information of the target object according to the determined local position of the target object.
Step 405: and judging whether the local feature information does not accord with the preset identification condition, if so, executing step 406, and otherwise, executing step 407.
Step 406: the control device adjusts the local position of the target object and performs step 404.
Step 407: and ending the flow.
Based on the same inventive concept, an embodiment of the present application further provides a target object identification apparatus. Since the principle by which the apparatus solves the problem is similar to that of the target object identification method, the implementation of the apparatus can refer to the implementation of the method, and repeated details are not repeated.
As shown in fig. 5, which is a schematic structural diagram of a target object recognition apparatus provided in an embodiment of the present application, the target object recognition apparatus includes:
a receiving unit 501, configured to receive a target object identification request for an image to be identified;
the detection unit 502 is configured to obtain a human body detection frame if it is detected that the image to be recognized includes a portrait, and amplify the human body detection frame according to a preset multiple;
a dividing unit 503, configured to divide a first image block from the image to be recognized according to the amplified human body detection frame;
a first extracting unit 504, configured to extract human body posture information of a human figure included in the first image block if it is detected that the first image block includes the target object, where the human body posture information includes a human body action and a human body local position;
a determining unit 505, configured to determine, according to a correspondence relationship among the stored human body motion, the local position of the human body, and the local position of the target object, a local position of the target object corresponding to the human body posture information in the first image block when the human body motion meets a preset motion condition;
and a second extracting unit 506, configured to extract local feature information of the target object according to the determined local position of the target object.
Preferably, the detecting unit 502 is configured to:
carrying out human body detection on the image to be identified to obtain a human body detection result;
if the human body detection result represents that the image to be recognized contains the portrait, determining the coordinates of each human body key point contained in the image to be recognized;
and determining a human body detection frame according to the coordinates of each human body key point.
Preferably, the first extracting unit 504 is configured to:
carrying out target object detection on the first image block to obtain a target object detection result;
and if the target object detection result represents that the target object exists in the first image block, determining the human body local position and the human body action of each human body part according to the coordinates of each human body key point.
Preferably, the second extracting unit 506 is configured to:
respectively segmenting corresponding local images of the target object from the first image block according to the local position of each target object;
respectively determining a feature recognition mode corresponding to each target object part according to the stored corresponding relation between the target object parts and the feature recognition modes;
and respectively identifying the corresponding local position of the target object according to the characteristic identification mode corresponding to each target object part to obtain corresponding local characteristic information.
Preferably, the second extracting unit 506 is configured to:
respectively determining the confidence degree of the local characteristic information of the local position of each target object;
adjusting the local position of the target object of which the confidence degree does not meet the preset identification condition;
and executing the step of extracting the local characteristic information of the target object according to the determined local position of the target object.
With the target object identification method and apparatus of the present application, a target object identification request for an image to be identified is received; if the image is detected to contain a portrait, a human body detection frame is obtained and enlarged by a preset multiple; a first image block is segmented from the image according to the enlarged detection frame; if the first image block is detected to contain a target object, human body posture information of the portrait contained in the block is extracted, the posture information including human body actions and human body local positions; when the human body action meets a preset action condition, the target object local position corresponding to the posture information in the first image block is determined according to the stored correspondence among human body actions, human body local positions, and target object local positions; and local feature information of the target object is extracted according to the determined local position. In this way, the complex steps of target object identification are simplified, and the accuracy, the identification efficiency, and the effectiveness of target object identification are improved.
For convenience of description, the above parts are separately described as modules (or units) according to functional division. Of course, the functionality of the various modules (or units) may be implemented in the same one or more pieces of software or hardware when implementing the present application.
Based on the above embodiments, referring to fig. 6, in an embodiment of the present application, a structural schematic diagram of a control device is shown.
The embodiment of the present application provides a control device, which may include a processor 610 (Central Processing Unit, CPU) and a memory 620, and may further include an input device 630, an output device 640, and the like. The input device 630 may include a keyboard, a mouse, a touch screen, and the like; the output device 640 may include a display device, such as a Liquid Crystal Display (LCD) or a Cathode Ray Tube (CRT).
Memory 620 may include Read Only Memory (ROM) and Random Access Memory (RAM), and provides processor 610 with program instructions and data stored in memory 620. In the embodiment of the present application, the memory 620 may be used to store a program for target object identification in the embodiment of the present application.
The processor 610 is configured to call the program instructions stored in the memory 620 and execute the target object recognition method provided by the embodiment shown in fig. 1 according to the obtained program instructions.
In an embodiment of the present application, a computer-readable storage medium is further provided, on which a computer program is stored, and when the computer program is executed by a processor, the method for identifying a target object in any of the above-mentioned method embodiments is implemented.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
While the preferred embodiments of the present application have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. Therefore, it is intended that the appended claims be interpreted as including preferred embodiments and all alterations and modifications as fall within the scope of the application.
It will be apparent to those skilled in the art that various changes and modifications may be made in the present application without departing from the spirit and scope of the application. Thus, if such modifications and variations of the present application fall within the scope of the claims of the present application and their equivalents, the present application is intended to include such modifications and variations as well.

Claims (10)

1. A target object recognition method, comprising:
receiving a target object identification request aiming at an image to be identified;
if the image to be recognized contains the portrait, obtaining a human body detection frame, and amplifying the human body detection frame according to a preset multiple;
according to the amplified human body detection frame, segmenting a first image block from the image to be recognized;
if the first image block is detected to contain the target object, extracting human body posture information of a portrait contained in the first image block, wherein the human body posture information comprises human body actions and human body local positions;
when the human body action meets a preset action condition, determining a target object local position corresponding to the human body posture information in the first image block according to a stored corresponding relation among the human body action, the human body local position and the target object local position;
and extracting the local characteristic information of the target object according to the determined local position of the target object.
2. The method of claim 1, wherein if it is detected that the image to be recognized includes a portrait, obtaining a human detection frame comprises:
carrying out human body detection on the image to be identified to obtain a human body detection result;
if the human body detection result represents that the image to be recognized contains the human image, determining the coordinates of each human body key point contained in the image to be recognized;
and determining a human body detection frame according to the coordinates of each human body key point.
3. The method of claim 2, wherein extracting human pose information of a human image contained in the first image block if it is detected that the first image block contains a target object comprises:
performing target object detection on the first image block to obtain a target object detection result;
and if the target object detection result represents that the target object exists in the first image block, determining the human body local position and the human body action of each human body part according to the coordinates of each human body key point.
4. The method of any one of claims 1 to 3, wherein extracting the local feature information of the target object according to the determined target object local positions comprises:
segmenting the corresponding target object local image from the first image block according to each target object local position;
determining the feature recognition mode corresponding to each target object part according to a stored correspondence between target object parts and feature recognition modes; and
recognizing each corresponding target object local position according to the feature recognition mode corresponding to that target object part, to obtain the corresponding local feature information.
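
(Editorial illustration; not part of the claims.) Claim 4 is a dispatch pattern: each target object part looks up its stored feature recognition mode and is recognized with it. A sketch with stub recognizers; the part names and the two modes are invented placeholders:

```python
def recognize_color(patch):
    return {"dominant_color": "unknown"}  # stub recognizer

def recognize_shape(patch):
    return {"shape": "unknown"}  # stub recognizer

# Stored correspondence between target object parts and recognition modes.
FEATURE_MODES = {
    "body": recognize_color,
    "outline": recognize_shape,
}

def extract_local_features(first_block, target_local_positions):
    """Crop each target object local image from the first image block and
    run the feature recognition mode mapped to that target object part."""
    features = {}
    for part, (x1, y1, x2, y2) in target_local_positions.items():
        patch = first_block[int(y1):int(y2), int(x1):int(x2)]
        features[part] = FEATURE_MODES[part](patch)
    return features
```
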
5. The method of any one of claims 1 to 3, further comprising:
determining a confidence of the local feature information of each target object local position;
adjusting any target object local position whose confidence does not meet a preset recognition condition; and
re-executing the step of extracting the local feature information of the target object according to the determined target object local positions.
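
(Editorial illustration; not part of the claims.) Claim 5 closes a feedback loop: score the local feature information of each target object local position, adjust any position whose confidence fails the preset condition, and extract again. The claim leaves the adjustment policy open; the sketch below enlarges the region as one plausible policy, with the threshold, growth factor, and round limit all assumed:

```python
def refine_until_confident(first_block, positions, extract_fn, score_fn,
                           threshold=0.8, grow=1.1, max_rounds=3):
    """Re-extract local features, enlarging every target object local
    position whose confidence (per `score_fn`) falls below `threshold`,
    for at most `max_rounds` passes. `extract_fn` and `score_fn` are
    caller-supplied, e.g. extract_local_features from the claim-4 sketch."""
    features = extract_fn(first_block, positions)
    for _ in range(max_rounds):
        low = [p for p in positions if score_fn(features[p]) < threshold]
        if not low:
            break
        for p in low:  # one possible 'adjustment': grow the region
            x1, y1, x2, y2 = positions[p]
            cx, cy = (x1 + x2) / 2, (y1 + y2) / 2
            hw, hh = (x2 - x1) * grow / 2, (y2 - y1) * grow / 2
            positions[p] = (cx - hw, cy - hh, cx + hw, cy + hh)
        features = extract_fn(first_block, positions)
    return features
```
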
6. A target object recognition apparatus, comprising:
a receiving unit, configured to receive a target object recognition request for an image to be recognized;
a detection unit, configured to obtain a human body detection box if it is detected that the image to be recognized contains a human figure, and to enlarge the human body detection box by a preset multiple;
a segmentation unit, configured to segment a first image block from the image to be recognized according to the enlarged human body detection box;
a first extraction unit, configured to extract human body posture information of the human figure contained in the first image block if it is detected that the first image block contains a target object, wherein the human body posture information comprises a human body action and human body local positions;
a determining unit, configured to determine, in the first image block, the target object local positions corresponding to the human body posture information according to a stored correspondence among human body actions, human body local positions and target object local positions when the human body action meets a preset action condition; and
a second extraction unit, configured to extract local feature information of the target object according to the determined target object local positions.
7. The apparatus of claim 6, wherein the detection unit is configured to:
perform human body detection on the image to be recognized to obtain a human body detection result;
if the human body detection result indicates that the image to be recognized contains a human figure, determine the coordinates of each human body key point contained in the image to be recognized; and
determine the human body detection box according to the coordinates of each human body key point.
8. The apparatus of claim 7, wherein the first extraction unit is configured to:
perform target object detection on the first image block to obtain a target object detection result; and
if the target object detection result indicates that a target object exists in the first image block, determine the human body local position and the human body action of each human body part according to the coordinates of each human body key point.
9. The apparatus of any one of claims 6 to 8, wherein the second extraction unit is configured to:
segment the corresponding target object local image from the first image block according to each target object local position;
determine the feature recognition mode corresponding to each target object part according to a stored correspondence between target object parts and feature recognition modes; and
recognize each corresponding target object local position according to the feature recognition mode corresponding to that target object part, to obtain the corresponding local feature information.
10. The apparatus of any one of claims 6 to 8, wherein the second extraction unit is further configured to:
determine a confidence of the local feature information of each target object local position;
adjust any target object local position whose confidence does not meet a preset recognition condition; and
re-execute the step of extracting the local feature information of the target object according to the determined target object local positions.
Application CN202010674587.XA · Priority date 2020-07-14 · Filing date 2020-07-14 · Target object identification method and device · Status: Active · Granted as CN111783724B

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010674587.XA CN111783724B (en) 2020-07-14 2020-07-14 Target object identification method and device

Publications (2)

Publication Number Publication Date
CN111783724A 2020-10-16
CN111783724B CN111783724B (en) 2024-03-26

Family

ID=72768540

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010674587.XA Active CN111783724B (en) 2020-07-14 2020-07-14 Target object identification method and device

Country Status (1)

Country Link
CN (1) CN111783724B (en)

Patent Citations (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108229288A (en) * 2017-06-23 2018-06-29 北京市商汤科技开发有限公司 Neural metwork training and clothes method for detecting color, device, storage medium, electronic equipment
CN109241835A (en) * 2018-07-27 2019-01-18 上海商汤智能科技有限公司 Image processing method and device, electronic equipment and storage medium
CN108985259A (en) * 2018-08-03 2018-12-11 百度在线网络技术(北京)有限公司 Human motion recognition method and device
CN109740517A (en) * 2018-12-29 2019-05-10 上海依图网络科技有限公司 A kind of method and device of determining object to be identified
CN110047102A (en) * 2019-04-18 2019-07-23 北京字节跳动网络技术有限公司 Methods, devices and systems for output information
CN110188701A (en) * 2019-05-31 2019-08-30 上海媒智科技有限公司 Dress ornament recognition methods, system and terminal based on the prediction of human body key node
CN110458895A (en) * 2019-07-31 2019-11-15 腾讯科技(深圳)有限公司 Conversion method, device, equipment and the storage medium of image coordinate system
CN110717449A (en) * 2019-10-09 2020-01-21 上海眼控科技股份有限公司 Vehicle annual inspection personnel behavior detection method and device and computer equipment
CN111062239A (en) * 2019-10-15 2020-04-24 平安科技(深圳)有限公司 Human body target detection method and device, computer equipment and storage medium
CN110991261A (en) * 2019-11-12 2020-04-10 苏宁云计算有限公司 Interactive behavior recognition method and device, computer equipment and storage medium
CN111027412A (en) * 2019-11-20 2020-04-17 北京奇艺世纪科技有限公司 Human body key point identification method and device and electronic equipment
KR102095685B1 (en) * 2019-12-02 2020-04-01 주식회사 넥스파시스템 vehicle detection method and device

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
LUO, Huilan; FENG, Yujie; KONG, Fansheng: "Action Recognition Fusing Multiple Pose Estimation Features", Journal of Image and Graphics (中国图象图形学报), No. 11 *

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113632097A (en) * 2021-03-17 2021-11-09 商汤国际私人有限公司 Method, device, equipment and storage medium for predicting relevance between objects
US11941838B2 (en) 2021-03-17 2024-03-26 Sensetime International Pte. Ltd. Methods, apparatuses, devices and storage medium for predicting correlation between objects
CN113435333A (en) * 2021-06-28 2021-09-24 深圳市商汤科技有限公司 Data processing method and device, computer equipment and storage medium
CN114297534A (en) * 2022-02-28 2022-04-08 京东方科技集团股份有限公司 Method, system and storage medium for interactively searching target object
CN114743135A (en) * 2022-03-30 2022-07-12 阿里云计算有限公司 Object matching method, computer-readable storage medium and computer device

Also Published As

Publication number Publication date
CN111783724B (en) 2024-03-26

Similar Documents

Publication Publication Date Title
CN111783724B (en) Target object identification method and device
US20200167314A1 (en) System and method for concepts caching using a deep-content-classification (dcc) system
US10742340B2 (en) System and method for identifying the context of multimedia content elements displayed in a web-page and providing contextual filters respective thereto
CN111489378B (en) Video frame feature extraction method and device, computer equipment and storage medium
US20140161354A1 (en) Method and apparatus for semantic extraction and video remix creation
CN107481327A (en) On the processing method of augmented reality scene, device, terminal device and system
US9639532B2 (en) Context-based analysis of multimedia content items using signatures of multimedia elements and matching concepts
KR20160087222A (en) Method and Appratus For Creating Photo Story based on Visual Context Analysis of Digital Contents
CN113395542B (en) Video generation method and device based on artificial intelligence, computer equipment and medium
CN107766403B (en) Photo album processing method, mobile terminal and computer readable storage medium
US9904866B1 (en) Architectures for object recognition
CN109840885B (en) Image fusion method and related product
CN111985360A (en) Face recognition method, device, equipment and medium
CN111553327B (en) Clothing identification method, device, equipment and medium
CN110222572A (en) Tracking, device, electronic equipment and storage medium
JP7105309B2 (en) Video preprocessing method, device and computer program
CN113989858B (en) Work clothes identification method and system
US11416774B2 (en) Method for video recognition capable of encoding spatial and temporal relationships of concepts using contextual features
CN113822136A (en) Video material image selection method, device, equipment and storage medium
CN113160231A (en) Sample generation method, sample generation device and electronic equipment
US20150331930A1 (en) Method and apparatus for classification of media based on metadata
CN115002414A (en) Monitoring method, monitoring device, server and computer readable storage medium
CN112052911A (en) Method and device for identifying riot and terrorist content in image, electronic equipment and storage medium
WO2022068569A1 (en) Watermark detection method and apparatus, computer device and storage medium
CN112561084B (en) Feature extraction method and device, computer equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant