CN115439733A - Image processing method, image processing device, terminal equipment and computer readable storage medium - Google Patents

Info

Publication number
CN115439733A
Authority
CN
China
Prior art keywords
image
target object
processed
feature information
sub
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211025844.2A
Other languages
Chinese (zh)
Inventor
王侃
庞建新
谭欢
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ubtech Robotics Corp
Original Assignee
Ubtech Robotics Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ubtech Robotics Corp filed Critical Ubtech Robotics Corp
Priority to CN202211025844.2A
Publication of CN115439733A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/98 Detection or correction of errors, e.g. by rescanning the pattern or by human intervention; Evaluation of the quality of the acquired patterns
    • G06V 10/20 Image preprocessing
    • G06V 10/26 Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G06V 10/267 Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion by performing operations on regions, e.g. growing, shrinking or watersheds
    • G06V 10/40 Extraction of image or video features
    • G06V 10/42 Global feature extraction by analysis of the whole pattern, e.g. using frequency domain transformations or autocorrelation
    • G06V 10/44 Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G06V 2201/00 Indexing scheme relating to image or video recognition or understanding
    • G06V 2201/07 Target detection

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Quality & Reliability (AREA)
  • Image Analysis (AREA)

Abstract

The application is applicable to the technical field of image processing, and provides an image processing method, an image processing device, a terminal device and a computer-readable storage medium. The image processing method includes: dividing the area occupied by a target object in an image to be processed into a plurality of sub-images according to the positions of key points on the target object in the image to be processed; extracting local feature information of each sub-image; generating first global feature information of the image to be processed according to the local feature information of each sub-image; and identifying the target object in the image to be processed according to the first global feature information. By this method, the quality of the extracted image features can be effectively improved, and the accuracy of image recognition can be further improved.

Description

Image processing method, image processing device, terminal equipment and computer readable storage medium
Technical Field
The present application belongs to the field of image processing technologies, and in particular, to an image processing method, an image processing apparatus, a terminal device, and a computer-readable storage medium.
Background
Image recognition refers to a technique of processing, analyzing, and understanding an image with a computer to recognize targets and objects of various patterns. Image recognition technology is widely used in various fields, such as face recognition, vehicle recognition, license plate recognition, pedestrian re-identification, and the like. An image recognition task relies on high-quality feature information of the target object. Therefore, how to extract high-quality features of the target object has long been a difficult problem in image recognition technology.
Existing feature extraction methods cannot extract the feature information of a target object in a targeted manner. In particular, when the target object is not accurately detected due to changes in external illumination or in the posture of the target object, the feature information extracted by existing methods is of low quality, and the accuracy of the image recognition result is low.
Disclosure of Invention
The embodiment of the application provides an image processing method, an image processing device, terminal equipment and a computer readable storage medium, which can effectively improve the quality of extracted image features and further improve the accuracy of image recognition.
In a first aspect, an embodiment of the present application provides an image processing method, including:
dividing the area occupied by the target object in the image to be processed into a plurality of sub-images according to the position of a key point on the target object in the image to be processed;
extracting local feature information of each sub-image;
generating first global feature information of the image to be processed according to the local feature information of each sub-image;
and identifying the target object in the image to be processed according to the first global feature information.
In the embodiment of the application, in the process of dividing the image to be processed into a plurality of sub-images, the division is based on the positions of key points on the target object in the image to be processed, so that the feature information of the target object can be extracted in a targeted manner; in particular, even when the target object is not accurately detected, the method can still accurately separate the local images of the target object from the image to be processed, so that higher-quality feature information of the target object is extracted and the accuracy of image recognition is further improved.
In a possible implementation manner of the first aspect, the dividing, according to the position of a key point on a target object in an image to be processed, a region occupied by the target object in the image to be processed into a plurality of sub-images includes:
detecting the key points on the target object in the image to be processed;
determining a dividing line according to the position of the key point;
and dividing the area occupied by the target object in the image to be processed into a plurality of sub-images according to the dividing line.
In a possible implementation manner of the first aspect, the detecting the keypoint on the target object in the image to be processed includes:
generating second global feature information of the image to be processed;
and detecting the key points on the target object in the image to be processed according to the second global feature information.
In a possible implementation manner of the first aspect, the key points carry category labels, the key points carrying the same category label belong to the same component, and the component represents a local area of the target object;
the determining of the segmentation line according to the position of the key point comprises the following steps:
generating a target position corresponding to each component according to the position of each key point contained in each component;
and generating a dividing line according to the target position corresponding to each component.
In a possible implementation manner of the first aspect, the generating a target position corresponding to each component according to the position of each key point included in each component includes:
if the component is located in the non-edge area of the target object, calculating an average value of the positions of all key points contained in the component, wherein the target position corresponding to the component is the average value.
In a possible implementation manner of the first aspect, the generating a target position corresponding to each component according to the position of each key point included in each component includes:
if the component is located in the edge area of the target object, calculating the maximum value of the positions of all key points contained in the component, wherein the target position corresponding to the component is the maximum value.
In a possible implementation manner of the first aspect, the generating first global feature information of the image to be processed according to the local feature information of each sub-image includes:
and concatenating the local feature information of each sub-image into a vector, wherein the vector is the first global feature information.
In a second aspect, an embodiment of the present application provides an image processing apparatus, including:
the dividing unit is used for dividing the area occupied by the target object in the image to be processed into a plurality of sub-images according to the position of a key point on the target object in the image to be processed;
the extraction unit is used for extracting local feature information of each sub-image;
the generating unit is used for generating first global feature information of the image to be processed according to the local feature information of each sub-image;
and the identification unit is used for identifying the target object in the image to be processed according to the first global feature information.
In a third aspect, an embodiment of the present application provides a terminal device, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor, and when the processor executes the computer program, the image processing method according to any one of the first aspect is implemented.
In a fourth aspect, an embodiment of the present application provides a computer-readable storage medium, where a computer program is stored, and the computer program, when executed by a processor, implements the image processing method according to any one of the above first aspects.
In a fifth aspect, the present application provides a computer program product, which when run on a terminal device, causes the terminal device to execute the image processing method according to any one of the first aspect.
It can be understood that, for the beneficial effects of the second aspect to the fifth aspect, reference may be made to the related description of the first aspect; details are not repeated here.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description show only some embodiments of the present application, and those skilled in the art can obtain other drawings based on these drawings without inventive effort.
Fig. 1 is a schematic flowchart of an average partition-based pedestrian feature extraction algorithm provided in an embodiment of the present application;
FIG. 2 is a schematic diagram of a pedestrian detection image provided by an embodiment of the application;
FIG. 3 is a schematic flowchart of an image processing method according to an embodiment of the present application;
FIG. 4 is a schematic diagram of key points provided by an embodiment of the present application;
FIG. 5 is a schematic flowchart of a pedestrian feature extraction algorithm based on key points according to an embodiment of the present disclosure;
fig. 6 is a block diagram of an image processing apparatus according to an embodiment of the present application;
fig. 7 is a schematic structural diagram of a terminal device according to an embodiment of the present application.
Detailed Description
In the following description, for purposes of explanation and not limitation, specific details are set forth, such as particular system structures, techniques, etc. in order to provide a thorough understanding of the embodiments of the present application. It will be apparent, however, to one skilled in the art that the present application may be practiced in other embodiments that depart from these specific details. In other instances, detailed descriptions of well-known systems, devices, circuits, and methods are omitted so as not to obscure the description of the present application with unnecessary detail.
It will be understood that the terms "comprises" and/or "comprising," when used in this specification and the appended claims, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
It should also be understood that the term "and/or" as used in this specification and the appended claims refers to and includes any and all possible combinations of one or more of the associated listed items.
As used in this specification and the appended claims, the term "if" may be interpreted contextually as "when", "upon" or "in response to a determination" or "in response to a detection". Similarly, the phrase "if it is determined" or "if a [ described condition or event ] is detected" may be interpreted contextually to mean "upon determining" or "in response to determining" or "upon detecting [ described condition or event ]" or "in response to detecting [ described condition or event ]".
Furthermore, in the description of the present application and the appended claims, the terms "first," "second," "third," and the like are used only to distinguish between descriptions and are not to be understood as indicating or implying relative importance.
Reference throughout this specification to "one embodiment" or "some embodiments," or the like, means that a particular feature, structure, or characteristic described in connection with the embodiment is included in one or more embodiments of the present application. Thus, appearances of the phrases "in one embodiment," "in some embodiments," "in other embodiments," or the like, in various places throughout this specification are not necessarily all referring to the same embodiment, but rather "one or more but not all embodiments" unless specifically stated otherwise.
First, the technical background of the embodiments of the present application is described. As noted in the Background above, image recognition relies on high-quality feature information of the target object, but existing feature extraction methods cannot extract the feature information of a target object in a targeted manner; in particular, when the target object is not accurately detected due to changes in external illumination or in the posture of the target object, the extracted feature information is of low quality and the accuracy of the image recognition result is low.
Taking pedestrian re-identification as an example, the task is to determine whether a specific pedestrian exists in the image or video. In the prior art, an average division pedestrian feature extraction algorithm is generally adopted. Referring to fig. 1, a schematic flow chart of a pedestrian feature extraction algorithm based on average division according to an embodiment of the present application is shown. As shown in fig. 1, the pedestrian feature extraction algorithm based on average division is mainly divided into three steps: global feature map generation, component feature extraction and global feature synthesis.
Firstly, global feature map generation: for an input image I, a global feature map F is generated by using a deep neural network. The deep network used here can be any common deep learning network, such as heavyweight networks of the ResNet family and lightweight networks of the MobileNet family, the ShuffleNet family, and the like.
Secondly, component feature extraction: the global feature map F is divided evenly along the vertical direction into 6 local regions {F1, F2, F3, F4, F5, F6}, and the corresponding local features {f1, f2, f3, f4, f5, f6} are extracted from the 6 local regions.
Thirdly, global feature synthesis: the 6 local features {f1, f2, f3, f4, f5, f6} are concatenated into a vector f, namely the global feature of the pedestrian, where the dimension of f is the sum of the dimensions of the 6 pedestrian component features.
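For reference only, this baseline can be sketched as follows; the ResNet-50 backbone, the average pooling, the 256-dimensional per-part embedding, and the class name are illustrative assumptions rather than details fixed by the algorithm described above.

```python
import torch
import torch.nn as nn
import torchvision.models as models

class AveragePartitionExtractor(nn.Module):
    """Baseline: global feature map -> 6 equal vertical stripes -> concatenated feature."""

    def __init__(self, num_parts=6, embed_dim=256):
        super().__init__()
        resnet = models.resnet50(weights=None)
        # Keep only the convolutional trunk, so the output is the global feature map F.
        self.backbone = nn.Sequential(*list(resnet.children())[:-2])
        self.num_parts = num_parts
        self.embed = nn.Conv2d(2048, embed_dim, kernel_size=1)  # per-part embedding

    def forward(self, image):                                   # image: (B, 3, H, W)
        feature_map = self.backbone(image)                      # F: (B, 2048, h, w)
        # Divide F evenly along the vertical direction into F1..F6.
        stripes = torch.chunk(feature_map, self.num_parts, dim=2)
        local_feats = []
        for stripe in stripes:
            pooled = stripe.mean(dim=[2, 3], keepdim=True)      # pool each local region
            local_feats.append(self.embed(pooled).flatten(1))   # f1..f6
        return torch.cat(local_feats, dim=1)                    # global pedestrian feature f

extractor = AveragePartitionExtractor()
f = extractor(torch.randn(1, 3, 384, 128))                      # f.shape == (1, 6 * 256)
```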
In this method, the global feature map is divided evenly, so when pedestrian detection is inaccurate the even division often cannot accurately locate the human body components (namely the human body parts). Fig. 2 is a schematic diagram of a pedestrian detection image provided in the embodiment of the present application. As shown in fig. 2, the left image is a detection image obtained when pedestrian detection is accurate, and the right image is a detection image obtained when pedestrian detection is inaccurate. Comparing the two, it can be seen that if the right image is divided evenly into local regions along the vertical direction, the two uppermost local regions obtained after division contain no human body components. In this case, background features, not pedestrian features, are extracted from the two uppermost local regions of the right image, so that the background features are mixed into the global feature finally formed by concatenation, the quality of the global feature is reduced, and the accuracy of pedestrian recognition is further reduced.
In order to solve the above problem, an embodiment of the present application provides an image processing method. Referring to fig. 3, which is a schematic flowchart of an image processing method provided in an embodiment of the present application, by way of example and not limitation, the method may include the following steps:
s301, dividing the area occupied by the target object in the image to be processed into a plurality of sub-images according to the position of the key point on the target object in the image to be processed.
In one embodiment, one implementation of S301 may include: detecting key points on a target object in an image to be processed; determining the upper boundary and the lower boundary of the area occupied by the target object in the image to be processed according to the position of the key point; in a range between the upper boundary and the lower boundary, a region occupied by the target object is divided into a plurality of sub-images.
Specifically, only the boundary key points on the target object may be detected, and then the horizontal line in which the topmost key point is located in the image to be processed is determined as the upper boundary, and the horizontal line in which the bottommost key point is located in the image to be processed is determined as the lower boundary.
In this method, only the boundary key points on the target object need to be determined; key points need not be detected for every part of the target object, which saves key point detection time. However, because no key point is detected for each part of the target object, the area occupied by the target object can only be divided evenly or at arbitrary positions. For target objects whose local features differ greatly, such a division cannot highlight the features of each local part. For example, in human body recognition, the features of the head, the feet, and the waist differ greatly; with an even or arbitrary division, several parts are likely to fall into one sub-image. If the head and shoulders are divided into one sub-image and the waist, legs, and feet into another, the features of the head and shoulders are blended together and the features of the waist, legs, and feet are blended together. With this approach, the features of each part cannot be extracted in a targeted manner, and the quality of the extracted feature information remains low.
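Purely as a reference point, the boundary-based division just described might be sketched as follows; the function name, the even three-way split, and the image-array convention (row index increasing downward) are illustrative assumptions, not details fixed by this description.

```python
import numpy as np

def split_by_boundary_keypoints(image, keypoints, num_parts=3):
    """image: (H, W, C) array; keypoints: list of (x, y) pixel coordinates (row index = y)."""
    ys = [y for _, y in keypoints]
    upper = int(min(ys))   # horizontal line through the topmost boundary key point
    lower = int(max(ys))   # horizontal line through the bottommost boundary key point
    bounds = np.linspace(upper, lower, num_parts + 1).astype(int)
    # Even (non-targeted) division between the upper and lower boundaries.
    return [image[bounds[i]:bounds[i + 1]] for i in range(num_parts)]

sub_images = split_by_boundary_keypoints(
    np.zeros((480, 192, 3)), [(96, 40), (60, 430), (130, 430)])
```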
To further improve the quality of the features, in one embodiment, S301 may comprise the steps of:
I. Detecting the key points on the target object in the image to be processed.
Optionally, one implementation manner of detecting the key points is as follows:
generating second global feature information of the image to be processed; and detecting the key points on the target object in the image to be processed according to the second global feature information.
The image to be processed may be input into a trained feature extraction model to generate the second global feature information; the second global feature information is then input into a trained detection model, which outputs the position information of the key points.
The feature extraction model and the detection model may adopt a neural network model or another algorithm model with a detection function, such as heavyweight networks of the ResNet series and lightweight networks of the MobileNet series, the ShuffleNet series, and the like.
Optionally, the feature extraction model and the detection model may be integrated into one key point detection model. Specifically, the image to be processed is input into the trained key point detection model, which outputs the position information of each key point on the target object.
The key points in step I include key points on various parts/portions of the target object. For example, when the target object is a human body, the key points may include a vertex key point, a left shoulder key point, a right shoulder key point, a left hip key point, a right hip key point, a left foot key point, a right foot key point, and the like. When the target object is a human face, the key points may include a left eye key point, a right eye key point, a nose tip key point, a left mouth corner key point, a right mouth corner key point, a mandible key point, and the like.
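For illustration only, one possible wiring of such an integrated key point detection model is sketched below; the MobileNetV3 backbone, the heatmap head, the argmax decoding, and the choice of 7 key points are assumptions, and any trained model with an equivalent interface could be used instead.

```python
import torch
import torch.nn as nn
import torchvision.models as models

class KeypointDetector(nn.Module):
    """Backbone produces the second global feature information; a head decodes key points."""

    def __init__(self, num_keypoints=7):
        super().__init__()
        backbone = models.mobilenet_v3_small(weights=None)
        self.features = backbone.features                         # second global feature information
        self.head = nn.Conv2d(576, num_keypoints, kernel_size=1)  # one heatmap per key point

    def forward(self, image):                                     # image: (B, 3, H, W)
        feats = self.features(image)                              # (B, 576, h, w)
        heatmaps = self.head(feats)                               # (B, K, h, w)
        b, k, h, w = heatmaps.shape
        flat = heatmaps.flatten(2).argmax(dim=2)                  # peak index per key point
        xs, ys = flat % w, flat // w
        return torch.stack([xs, ys], dim=2)                       # (B, K, 2) positions on the feature grid

detector = KeypointDetector()
keypoints = detector(torch.randn(1, 3, 256, 128))                 # e.g. vertex/shoulder/hip/foot key points
```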
II. Determining a dividing line according to the position of the key point.
In this step, the direction of the dividing line may be determined according to the posture of the target object. For example, when the target object is a pedestrian, the target object in the image to be processed extends along the vertical direction of the image; in this case, the dividing line runs along the horizontal direction of the image, that is, the horizontal line where a key point is located in the image to be processed is determined as a dividing line. For another example, when the target object is a vehicle, the target object in the image to be processed extends along the horizontal direction of the image; in this case, the dividing line runs along the vertical direction of the image, that is, the vertical line where a key point is located in the image to be processed is determined as a dividing line.
In the embodiment of the application, the key points carry category labels, key points carrying the same category label belong to the same component, and a component represents a local area of the target object. Illustratively, when the target object is a human body, the components may include the head, shoulders, waist, knees, feet, and the like. When the target object is a vehicle, the components may include a front portion (the portion before the front axle), a middle portion (the portion between the front axle and the rear axle), and a rear portion (the portion behind the rear axle) of the vehicle. The above is merely an example and does not limit how the local regions of the target object are divided.
In one embodiment, step II may comprise:
generating a target position corresponding to each component according to the position of each key point contained in each component; and generating a dividing line according to the target position corresponding to each component.
In this embodiment, there may be one or more keypoints detected on each component in the target object. When only one key point is detected on a certain component, the position of the key point is determined as the target position. When there are a plurality of key points detected on a certain component, the target position is determined according to the positions of the plurality of key points. Optionally, the method for determining the target position according to the positions of the plurality of key points is as follows:
if the component is located in a non-edge area of the target object, calculating an average value of positions of all key points contained in the component, wherein a target position corresponding to the component is the average value;
if the component is located in the edge area of the target object, calculating the maximum value of the positions of all key points contained in the component, wherein the target position corresponding to the component is the maximum value.
Exemplarily, refer to fig. 4, which is a schematic diagram of key points provided in the embodiments of the present application. Fig. 4 (a) is a schematic diagram of face key points obtained using a 68-key-point detection method. The components contained in a human face are the eyebrows, eyes, nose, and mouth. The eyebrows are the uppermost face component and the mouth is the lowermost, so the eyebrows and the mouth lie in the edge region of the face, while the eyes and the nose lie in the non-edge region. The maximum of the positions of key points 18-27 on the eyebrows, that is, the maximum ordinate of key points 18-27, is calculated (the ordinate gradually decreases from top to bottom in the image); the maximum of the positions of key points 49-68 on the mouth, that is, the minimum ordinate of key points 49-68, is calculated; the average of the positions of key points 37-48 on the eyes, that is, the average ordinate of key points 37-48, is calculated; and the average of the positions of key points 28-36 on the nose, that is, the average ordinate of key points 28-36, is calculated. In this example, key points 1-17 on the face contour are not considered, since the face contour does not belong to a face component.
Fig. 4 (b) is a schematic diagram of key points of the human body. In this example, the defined body components include the head, shoulders, waist, and feet. The head is the uppermost body component and the feet are the lowermost, so the head and feet lie in the edge region of the body, while the shoulders and waist lie in the non-edge region. As shown in the figure, one key point 1 is detected on the head, and its position is determined as the target position. Two key points 2-3 are detected on the shoulders, and the average of the ordinates of key points 2-3 is calculated. Two key points 4-5 are detected at the waist, and the average of the ordinates of key points 4-5 is calculated. Two key points 6-7 are detected on the feet, and the maximum of their positions, that is, the minimum of the ordinates of key points 6-7, is calculated.
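A minimal sketch of step II might look as follows; the component names, the example row indices, and the image-array convention (row index grows downward, so the "maximum position" of a bottom edge component corresponds to the largest row index) are assumptions for illustration only.

```python
def component_target_rows(components):
    """components: dict name -> {'rows': [key point row indices], 'edge': None | 'top' | 'bottom'}."""
    targets = {}
    for name, info in components.items():
        rows = info['rows']
        if info['edge'] is None:          # non-edge component: average position
            targets[name] = sum(rows) / len(rows)
        elif info['edge'] == 'top':       # edge component at the top: outermost (smallest) row
            targets[name] = min(rows)
        else:                             # edge component at the bottom: outermost (largest) row
            targets[name] = max(rows)
    return targets

body = {
    'head':     {'rows': [35],       'edge': 'top'},
    'shoulder': {'rows': [110, 114], 'edge': None},
    'waist':    {'rows': [250, 246], 'edge': None},
    'foot':     {'rows': [452, 460], 'edge': 'bottom'},
}
dividing_lines = sorted(component_target_rows(body).values())   # l1..l4, from top to bottom
```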
III. Dividing the area occupied by the target object in the image to be processed into a plurality of sub-images according to the dividing line.
Specifically, the portion between each two dividing lines may be divided into one sub-image.
In the above embodiment, the area occupied by the target object in the image to be processed is divided according to the components of the target object, and the local features in the target object can be extracted in a targeted manner, so that the extraction of feature information with high quality is ensured.
S302, extracting local feature information of each sub-image.
In this step, the trained feature extraction model may be used to extract the local feature information of each sub-image. The feature extraction model may adopt a neural network model or an algorithm model with a feature extraction function, or the like.
And S303, generating first global feature information of the image to be processed according to the local feature information of each sub-image.
Optionally, an implementation manner of S303 is:
and concatenating the local feature information of each sub-image into a vector, wherein the vector is the first global feature information.
Of course, the local feature information of each sub-image may also be formed into a matrix, and each column/row in the matrix represents the local feature information of one sub-image.
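For illustration, assuming 256-dimensional local features for 3 sub-images, the two assembly options described above might look like this; the dimensions are an assumption, not a requirement of the method.

```python
import numpy as np

local_feats = [np.random.rand(256) for _ in range(3)]   # f1, f2, f3
global_vector = np.concatenate(local_feats)             # vector form, shape (768,)
global_matrix = np.stack(local_feats, axis=1)           # matrix form, one column per sub-image, shape (256, 3)
```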
S304, identifying the target object in the image to be processed according to the first global feature information.
In one implementation of this step, the first global feature information may be input into the trained recognition model, and the category information of the target object may be output.
In the image processing method, in the process of dividing the image to be processed into a plurality of sub-images, the division is based on the positions of key points on the target object in the image to be processed, so that the feature information of the target object can be extracted in a targeted manner; in particular, even when the target object is not accurately detected, the method can still accurately separate the local images of the target object from the image to be processed, so that higher-quality feature information of the target object is extracted and the accuracy of image recognition is further improved.
For example, refer to fig. 5, which is a schematic flowchart of a pedestrian feature extraction algorithm based on key points provided in an embodiment of the present application. As shown in fig. 5, a human body image T is input into the trained key point detection network, which outputs a detection image T' carrying the key point information; the key points detected on T' include a vertex key point, a left shoulder key point, a right shoulder key point, a left hip key point, a right hip key point, a left foot key point, and a right foot key point. The horizontal line where the vertex key point is located is determined as dividing line l1; the horizontal line at the average ordinate of the left shoulder key point and the right shoulder key point is determined as dividing line l2; the horizontal line at the average ordinate of the left hip key point and the right hip key point is determined as dividing line l3; and the horizontal line at the minimum ordinate of the left foot key point and the right foot key point is determined as dividing line l4. The human body image is then divided according to the determined dividing lines to obtain the segmentation image. Specifically, the area between l1 and l2 is divided into a sub-image F1, the area between l2 and l3 into a sub-image F2, and the area between l3 and l4 into a sub-image F3. The local feature information of each sub-image is extracted to obtain {f1, f2, f3}. Finally, the local feature information of the 3 sub-images is concatenated into the global feature information f, with which the pedestrian in the human body image T is identified.
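For illustration only, the fig. 5 pipeline might be sketched end to end as follows. The key point values and the mean-pooled stand-in for the local features are assumptions; in practice the key points would come from the trained detection network and the local features from a trained feature extraction model. Note also that the description's ordinate decreases from top to bottom, whereas the sketch uses array row indices that grow downward, so the "minimum ordinate" of the feet corresponds to max() here.

```python
import numpy as np

def pedestrian_feature(image, kpts):
    """image: (H, W, 3); kpts: pixel rows of the named key points (row index grows downward)."""
    l1 = kpts['head']                                                    # top of head
    l2 = int(np.mean([kpts['left_shoulder'], kpts['right_shoulder']]))   # shoulder line
    l3 = int(np.mean([kpts['left_hip'], kpts['right_hip']]))             # hip line
    l4 = max(kpts['left_foot'], kpts['right_foot'])                      # lowest foot row
    stripes = [image[l1:l2], image[l2:l3], image[l3:l4]]                 # F1, F2, F3; background above l1 is ignored
    local_feats = [s.reshape(-1, 3).mean(axis=0) for s in stripes]       # stand-in for f1, f2, f3
    return np.concatenate(local_feats)                                   # global feature information f

image = np.random.rand(480, 192, 3)
kpts = {'head': 30, 'left_shoulder': 105, 'right_shoulder': 110,
        'left_hip': 240, 'right_hip': 244, 'left_foot': 455, 'right_foot': 460}
f = pedestrian_feature(image, kpts)                                      # identification would then consume f
```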
In the segmentation image shown in fig. 5, only the region occupied by the human body is divided; the region above the vertex is background. During the division into sub-images, the background part is ignored and its feature information does not need to be extracted, which reduces the influence of the background image on the feature information and effectively improves the quality of the features.
In the above embodiment, the positions of the human body components can be obtained from the key point information of the human body, so that higher-quality human body component features can be extracted. Compared with the existing pedestrian feature extraction algorithm based on average division, the method provided by the embodiment of the application can still produce an accurate human body region when human body detection is inaccurate, and thus extracts high-quality pedestrian features. In addition, the key-point-based pedestrian feature extraction algorithm divides the pedestrian into three physically meaningful local areas according to the key point information of the human body structure, while preserving the integrity of the biological structures of the upper and lower halves of the body. Therefore, the method provided by the embodiment of the application can effectively improve the accuracy of image recognition.
It should be understood that, the sequence numbers of the steps in the foregoing embodiments do not imply an execution sequence, and the execution sequence of each process should be determined by its function and inherent logic, and should not constitute any limitation to the implementation process of the embodiments of the present application.
Fig. 6 is a block diagram of an image processing apparatus according to an embodiment of the present application, which corresponds to the image processing method described in the foregoing embodiment, and only a part related to the embodiment of the present application is shown for convenience of description.
Referring to fig. 6, the apparatus includes:
the dividing unit 61 is configured to divide an area occupied by a target object in an image to be processed into a plurality of sub-images according to a position of a key point on the target object in the image to be processed.
An extracting unit 62, configured to extract local feature information of each of the sub-images.
A generating unit 63, configured to generate first global feature information of the image to be processed according to the local feature information of each sub-image.
An identifying unit 64, configured to identify the target object in the image to be processed according to the first global feature information.
Optionally, the dividing unit 61 is further configured to:
detecting the key points on the target object in the image to be processed;
determining a dividing line according to the position of the key point;
and dividing the area occupied by the target object in the image to be processed into a plurality of sub-images according to the dividing line.
Optionally, the dividing unit 61 is further configured to:
generating second global feature information of the image to be processed;
and detecting the key points on the target object in the image to be processed according to the second global feature information.
Optionally, the key points carry category labels, the key points carrying the same category label belong to the same component, and the component represents a local area of the target object.
Correspondingly, the dividing unit 61 is further configured to:
generating a target position corresponding to each component according to the position of each key point contained in each component;
and generating a dividing line according to the target position corresponding to each component.
Optionally, the dividing unit 61 is further configured to:
if the component is located in a non-edge area of the target object, calculating an average value of the positions of all key points contained in the component, wherein the target position corresponding to the component is the average value;
if the component is located in the edge area of the target object, calculating the maximum value of the positions of all key points contained in the component, wherein the target position corresponding to the component is the maximum value.
Optionally, the generating unit 63 is further configured to:
and concatenating the local feature information of each sub-image into a vector, wherein the vector is the first global feature information.
It should be noted that, for the information interaction, execution process, and other contents between the above-mentioned devices/units, the specific functions and technical effects thereof are based on the same concept as those of the embodiment of the method of the present application, and specific reference may be made to the part of the embodiment of the method, which is not described herein again.
The image processing apparatus shown in fig. 6 may be a software unit, a hardware unit, or a combined software/hardware unit built into an existing terminal device, may be integrated into the terminal device as an independent add-on, or may exist as an independent terminal device.
It will be apparent to those skilled in the art that, for convenience and brevity of description, only the above-mentioned division of the functional units and modules is illustrated, and in practical applications, the above-mentioned function distribution may be performed by different functional units and modules according to needs, that is, the internal structure of the apparatus is divided into different functional units or modules to perform all or part of the above-mentioned functions. Each functional unit and module in the embodiments may be integrated in one processing unit, or each unit may exist alone physically, or two or more units are integrated in one unit, and the integrated unit may be implemented in a form of hardware, or in a form of software functional unit. In addition, specific names of the functional units and modules are only for convenience of distinguishing from each other, and are not used for limiting the protection scope of the present application. The specific working processes of the units and modules in the system may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
Fig. 7 is a schematic structural diagram of a terminal device according to an embodiment of the present application. As shown in fig. 7, the terminal device 7 of this embodiment includes: at least one processor 70 (only one shown in fig. 7), a memory 71, and a computer program 72 stored in the memory 71 and executable on the at least one processor 70, the processor 70 implementing the steps in any of the various image processing method embodiments described above when executing the computer program 72.
The terminal device may be a computing device such as a desktop computer, a notebook computer, a palmtop computer, or a cloud server. The terminal device may include, but is not limited to, a processor and a memory. Those skilled in the art will appreciate that fig. 7 is only an example of the terminal device 7 and does not constitute a limitation on the terminal device 7, which may include more or fewer components than shown, combine some components, or have different components; for example, it may further include input/output devices, network access devices, and the like.
The processor 70 may be a Central Processing Unit (CPU), or another general-purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor.
The memory 71 may, in some embodiments, be an internal storage unit of the terminal device 7, such as a hard disk or an internal memory of the terminal device 7. In other embodiments, the memory 71 may also be an external storage device of the terminal device 7, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, or a flash card provided on the terminal device 7. Further, the memory 71 may include both an internal storage unit and an external storage device of the terminal device 7. The memory 71 is used to store an operating system, application programs, a boot loader, data, and other programs, such as the program code of the computer program. The memory 71 may also be used to temporarily store data that has been output or is to be output.
The embodiments of the present application further provide a computer-readable storage medium, where a computer program is stored, and when the computer program is executed by a processor, the computer program implements the steps in the above-mentioned method embodiments.
The embodiments of the present application provide a computer program product, which when running on a terminal device, enables the terminal device to implement the steps in the above method embodiments when executed.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on such understanding, all or part of the processes in the methods of the above embodiments may be implemented by a computer program, which may be stored in a computer-readable storage medium and, when executed by a processor, implements the steps of the above method embodiments. The computer program comprises computer program code, which may be in the form of source code, object code, an executable file, some intermediate form, or the like. The computer-readable medium may include at least: any entity or device capable of carrying the computer program code to an apparatus/terminal device, a recording medium, a computer memory, a read-only memory (ROM), a random access memory (RAM), an electrical carrier signal, a telecommunications signal, and a software distribution medium, such as a USB flash disk, a removable hard disk, a magnetic disk, or an optical disk. In certain jurisdictions, computer-readable media may not include electrical carrier signals or telecommunications signals in accordance with legislation and patent practice.
In the above embodiments, the descriptions of the respective embodiments have respective emphasis, and reference may be made to the related descriptions of other embodiments for parts that are not described or illustrated in a certain embodiment.
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus/terminal device and method may be implemented in other ways. For example, the above-described embodiments of the apparatus/terminal device are merely illustrative, and for example, the division of the modules or units is only one type of logical function division, and other division manners may be available in actual implementation, for example, multiple units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
The above-mentioned embodiments are only used to illustrate the technical solutions of the present application, and not to limit the same; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; such modifications and substitutions do not substantially depart from the spirit and scope of the embodiments of the present application and are intended to be included within the scope of the present application.

Claims (10)

1. An image processing method, comprising:
dividing the area occupied by the target object in the image to be processed into a plurality of sub-images according to the position of a key point on the target object in the image to be processed;
extracting local feature information of each sub-image;
generating first global feature information of the image to be processed according to the local feature information of each sub-image;
and identifying the target object in the image to be processed according to the first global feature information.
2. The image processing method of claim 1, wherein dividing the region occupied by the target object in the image to be processed into a plurality of sub-images according to the position of the key point on the target object in the image to be processed comprises:
detecting the key points on the target object in the image to be processed;
determining a dividing line according to the position of the key point;
and dividing the area occupied by the target object in the image to be processed into a plurality of sub-images according to the dividing line.
3. The image processing method of claim 2, wherein the detecting the keypoint on the target object in the image to be processed comprises:
generating second global feature information of the image to be processed;
and detecting the key points on the target object in the image to be processed according to the second global feature information.
4. The image processing method according to claim 2, wherein the key points carry category labels, the key points carrying the same category label belong to the same component, and the component represents a local region of the target object;
the determining a segmentation line according to the position of the key point comprises:
generating a target position corresponding to each component according to the position of each key point contained in each component;
and generating a dividing line according to the target position corresponding to each component.
5. The image processing method according to claim 4, wherein the generating the target position corresponding to each component according to the position of each key point included in each component comprises:
if the component is located in the non-edge area of the target object, calculating an average value of the positions of all key points contained in the component, wherein the target position corresponding to the component is the average value.
6. The image processing method according to claim 4, wherein the generating the target position corresponding to each component according to the position of each key point included in each component comprises:
if the assembly is located in the edge area of the target object, calculating the maximum value of the positions of all key points contained in the assembly, wherein the target position corresponding to the assembly is the maximum value.
7. The image processing method according to claim 1, wherein the generating first global feature information of the image to be processed according to the local feature information of each of the sub-images comprises:
and concatenating the local feature information of each sub-image into a vector, wherein the vector is the first global feature information.
8. An image processing apparatus characterized by comprising:
the dividing unit is used for dividing the area occupied by the target object in the image to be processed into a plurality of sub-images according to the position of a key point on the target object in the image to be processed;
the extraction unit is used for extracting local feature information of each sub-image;
the generating unit is used for generating first global feature information of the image to be processed according to the local feature information of each sub-image;
and the identification unit is used for identifying the target object in the image to be processed according to the first global feature information.
9. A terminal device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, characterized in that the processor implements the method according to any of claims 1 to 7 when executing the computer program.
10. A computer-readable storage medium, in which a computer program is stored which, when being executed by a processor, carries out the method according to any one of claims 1 to 7.
CN202211025844.2A 2022-08-25 2022-08-25 Image processing method, image processing device, terminal equipment and computer readable storage medium Pending CN115439733A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211025844.2A CN115439733A (en) 2022-08-25 2022-08-25 Image processing method, image processing device, terminal equipment and computer readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211025844.2A CN115439733A (en) 2022-08-25 2022-08-25 Image processing method, image processing device, terminal equipment and computer readable storage medium

Publications (1)

Publication Number Publication Date
CN115439733A true CN115439733A (en) 2022-12-06

Family

ID=84244395

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211025844.2A Pending CN115439733A (en) 2022-08-25 2022-08-25 Image processing method, image processing device, terminal equipment and computer readable storage medium

Country Status (1)

Country Link
CN (1) CN115439733A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115830028A (en) * 2023-02-20 2023-03-21 阿里巴巴达摩院(杭州)科技有限公司 Image evaluation method, device, system and storage medium
CN115830028B (en) * 2023-02-20 2023-05-23 阿里巴巴达摩院(杭州)科技有限公司 Image evaluation method, device, system and storage medium

Similar Documents

Publication Publication Date Title
CN109684911B (en) Expression recognition method and device, electronic equipment and storage medium
CN109359538B (en) Training method of convolutional neural network, gesture recognition method, device and equipment
CN110020592B (en) Object detection model training method, device, computer equipment and storage medium
CN109117773B (en) Image feature point detection method, terminal device and storage medium
CN112528831B (en) Multi-target attitude estimation method, multi-target attitude estimation device and terminal equipment
CN111079785A (en) Image identification method and device and terminal equipment
CN112633084B (en) Face frame determining method and device, terminal equipment and storage medium
CN110852311A (en) Three-dimensional human hand key point positioning method and device
WO2019119396A1 (en) Facial expression recognition method and device
CN107895021B (en) image recognition method and device, computer device and computer readable storage medium
CN109740674A (en) A kind of image processing method, device, equipment and storage medium
CN111460910A (en) Face type classification method and device, terminal equipment and storage medium
CN110175500B (en) Finger vein comparison method, device, computer equipment and storage medium
CN115439733A (en) Image processing method, image processing device, terminal equipment and computer readable storage medium
Juang et al. Stereo-camera-based object detection using fuzzy color histograms and a fuzzy classifier with depth and shape estimations
CN112200004B (en) Training method and device for image detection model and terminal equipment
CN112560856A (en) License plate detection and identification method, device, equipment and storage medium
CN112364807A (en) Image recognition method and device, terminal equipment and computer readable storage medium
CN116129504A (en) Living body detection model training method and living body detection method
CN105224957A (en) A kind of method and system of the image recognition based on single sample
CN113743194B (en) Face silence living body detection method and device, electronic equipment and storage medium
CN111931794B (en) Sketch-based image matching method
Kulkarni Handwritten character recognition using HOG, COM by OpenCV & Python
CN112464753B (en) Method and device for detecting key points in image and terminal equipment
CN111931557A (en) Specification identification method and device for bottled drink, terminal equipment and readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination