CN113221751A - Method, device and equipment for detecting key points and storage medium - Google Patents

Method, device and equipment for detecting key points and storage medium

Info

Publication number
CN113221751A
Authority
CN
China
Prior art keywords
coordinates
candidate
key point
key points
keypoint
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110524390.2A
Other languages
Chinese (zh)
Other versions
CN113221751B (en)
Inventor
Yang Qiansheng (杨黔生)
Wang Jian (王健)
Shen Hui (沈辉)
Ding Errui (丁二锐)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd
Priority: CN202110524390.2A
Publication of CN113221751A
Application granted
Publication of CN113221751B
Current legal status: Active

Classifications

    • G06V40/103 — Static body considered as a whole, e.g. static pedestrian or occupant recognition
    • G06T7/66 — Analysis of geometric attributes of image moments or centre of gravity
    • G06T7/73 — Determining position or orientation of objects or cameras using feature-based methods
    • G06V10/44 — Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; connectivity analysis, e.g. of connected components
    • G06T2207/10004 — Still image; photographic image
    • G06T2207/30196 — Human being; person

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • Human Computer Interaction (AREA)
  • Geometry (AREA)
  • Image Analysis (AREA)

Abstract

The disclosure provides a method, an apparatus, a device and a storage medium for key point detection, relates to the technical field of artificial intelligence, in particular to computer vision and deep learning, and can be applied in robot or autonomous driving scenarios. The specific implementation scheme is as follows: determine the spatial coordinates of the central key point of the target object in an image; determine the spatial coordinates of the other key points by using the spatial coordinates of the central key point and the positional relationship between the central key point and the other key points of the target object; and determine the plane coordinates of the other key points and use them to correct the spatial coordinates of the other key points, obtaining the detection result. Because this process takes the depth of the target object into account, the relative position data of each target object is retained, particularly when the image contains multiple target objects. Moreover, determining the spatial coordinates by radiating outward from the central key point and correcting them with the plane coordinates ensures the accuracy of the spatial coordinates of each key point.

Description

Method, device and equipment for detecting key points and storage medium
Technical Field
The present disclosure relates to the field of artificial intelligence, and more particularly to the field of computer vision and deep learning, and can be applied in robotic or autonomous driving scenarios.
Background
With social progress and the rapid development of science and technology, industries such as short video, live streaming, online education and autonomous driving are continuously on the rise, and interactive scenarios of all kinds increasingly demand functionality based on key point information.
The related art relies on the output of a neural network when performing key point detection. When the image is occluded or the neural network is insufficiently trained, the key point detection results cannot meet requirements.
Disclosure of Invention
The disclosure provides a method, an apparatus, a device and a storage medium for key point detection.
According to an aspect of the present disclosure, there is provided a method of keypoint detection, which may include the steps of:
determining the space coordinates of the central key point of the target object in the image;
determining the spatial coordinates of other key points by using the spatial coordinates of the central key point and the position relationship between the central key point and other key points of the target object;
and determining the plane coordinates of other key points, and correcting the space coordinates of other key points by using the plane coordinates of other key points to obtain a detection result.
According to another aspect of the present disclosure, there is provided an apparatus for keypoint detection, which may include the following components:
the spatial coordinate determination module of the central key point is used for determining the spatial coordinate of the central key point of the target object in the image;
the spatial coordinate determination module of other key points is used for determining the spatial coordinates of other key points by utilizing the spatial coordinates of the central key point and the position relationship between the central key point and other key points of the target object;
and the space coordinate correction module is used for determining the plane coordinates of other key points and correcting the space coordinates of other key points by using the plane coordinates of other key points to obtain a detection result.
According to another aspect of the present disclosure, there is provided an electronic device including:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform a method according to any one of the embodiments of the present disclosure.
According to another aspect of the present disclosure, there is provided a non-transitory computer readable storage medium having stored thereon computer instructions for causing a computer to perform a method in any of the embodiments of the present disclosure.
According to another aspect of the present disclosure, a computer program product is provided, comprising a computer program which, when executed by a processor, implements the method in any of the embodiments of the present disclosure.
According to the disclosed technology, the spatial coordinates of the other key points are confirmed by radiating outward from the central key point of the target object. Because this process takes the target object's depth into account, the relative position data of each target object is retained, particularly when multiple target objects are present. Moreover, determining the spatial coordinates by radiation and correcting them with the plane coordinates ensures the accuracy of each key point's spatial coordinates.
It should be understood that the statements in this section do not necessarily identify key or critical features of the embodiments of the present disclosure, nor do they limit the scope of the present disclosure. Other features of the present disclosure will become apparent from the following description.
Drawings
The drawings are included to provide a better understanding of the present solution and are not to be construed as limiting the present disclosure. Wherein:
FIG. 1 is a flow chart of a method of keypoint detection according to the present disclosure;
FIG. 2 is a schematic illustration of a detection box of a target object and spatial coordinates of a center keypoint of the target object according to the present disclosure;
FIG. 3 is a schematic illustration of a positional relationship between key points of a target object according to the present disclosure;
FIG. 4 is a schematic illustration of the planar coordinates of key points of a target object according to the present disclosure;
FIG. 5 is a schematic illustration of feature recognition of an image according to the present disclosure;
FIG. 6 is a flow chart for determining the spatial coordinates of a central keypoint in accordance with the present disclosure;
FIG. 7 is a schematic illustration of depth values for keypoints of a target object according to the present disclosure;
FIG. 8 is a flow chart for determining spatial coordinates of other keypoints according to the present disclosure;
FIG. 9 is a schematic illustration of determining a first candidate keypoint according to the present disclosure;
FIG. 10 is a flow chart for determining spatial coordinates of other keypoints according to the present disclosure;
FIG. 11 is a flow chart of correcting the spatial coordinates of other keypoints according to the present disclosure;
FIG. 12 is a flow chart of correcting the spatial coordinates of other keypoints according to the present disclosure;
FIG. 13 is a schematic diagram of an apparatus for keypoint detection according to the present disclosure;
FIG. 14 is a block diagram of an electronic device used to implement the method of keypoint detection of an embodiment of the present disclosure.
Detailed Description
Exemplary embodiments of the present disclosure are described below with reference to the accompanying drawings, in which various details of the embodiments of the disclosure are included to assist understanding, and which are to be considered as merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present disclosure. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
As shown in fig. 1, the present disclosure relates to a method of keypoint detection, which may include the steps of:
s101: determining the space coordinates of the central key point of the target object in the image;
s102: determining the spatial coordinates of other key points by using the spatial coordinates of the central key point and the position relationship between the central key point and other key points of the target object;
s103: and determining the plane coordinates of other key points, and correcting the space coordinates of other key points by using the plane coordinates of other key points to obtain a detection result.
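The three steps S101–S103 can be sketched in miniature as follows. All helper inputs here (the offset dictionary, the plane coordinates, the key point name) are illustrative assumptions, not the patent's actual models:

```python
def detect_keypoints(center_xyz, offsets, plane_coords):
    """Toy sketch of S101-S103: S101 supplies the center key point's spatial
    coordinate, S102 radiates offset vectors outward to place the other key
    points, and S103 corrects each result with its 2D plane coordinate."""
    detections = {}
    for name, (dx, dy, dz) in offsets.items():
        # S102: spatial coordinate from center + positional relationship
        x, y, z = center_xyz[0] + dx, center_xyz[1] + dy, center_xyz[2] + dz
        # S103: trust the (more reliable) plane coordinate for x and y, keep z
        px, py = plane_coords[name]
        detections[name] = (px, py, z)
    return detections

center = (100.0, 80.0, 2.0)                       # from S101
offsets = {"left_shoulder": (-20.0, -30.0, 0.1)}  # from the second model
planes = {"left_shoulder": (81.0, 49.0)}          # from the third model
print(detect_keypoints(center, offsets, planes))
# {'left_shoulder': (81.0, 49.0, 2.1)}
```

The detail of each step is elaborated in the embodiments below.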
The above scheme of the present disclosure may be executed by an application installed on a smart device such as a mobile phone, a tablet computer or a smart speaker, or by a server of such an application. The image is received through the smart device so as to detect the key points of the target object in the image.
The target object may be a person, an animal, or an object such as a vehicle. The following embodiments are described taking as an example that the target object is a person.
The image may be input into a pre-trained model to obtain the spatial coordinates of the central key point of the target object; that is, the central key point may be a 3D key point. The pre-trained model may include a first model for detecting the target object's detection frame. For example, as shown in fig. 2, the first model receives an image and may output the detection frame of the target object in the image, the spatial coordinates of the central key point of the target object, and the like, and may also output the width, height, and so on of the detection frame.
Further, the pre-trained model may further include a second model for determining a feature vector of a positional relationship between key points of the target object. For example, as shown in fig. 3, the second model receives an image and may output a feature vector of a positional relationship between key points of a target object in the image. The positional relationship may be a spatial positional relationship.
According to the spatial coordinates of the central key point of the target object and the positional-relationship feature vectors between the key points of the target object, the spatial coordinates of the other key points around the central key point can be determined by diverging outward from the central key point in sequence.
Further, by continuing to diverge from those other key points whose spatial coordinates have been confirmed, the spatial coordinates of all key points of the target object in the detection frame can be determined.
In addition, the pre-trained model may further include a third model that determines the plane coordinates of the key points of the target object. For example, as shown in FIG. 4, the third model receives the image and may output the planar coordinates of the key points of the target object in the image.
Generally, since plane coordinates are two-dimensional data, their reliability is higher than that of three-dimensional data. Therefore, the plane coordinates of each key point can be used to correct the corresponding spatial coordinates and obtain the detection result.
As shown in fig. 5, image features may be extracted with a backbone network (Hourglass). The image features are then input into the first, second and third models respectively, yielding the detection frame of the corresponding target object, the spatial coordinates of the central key point of the target object, the positional-relationship feature vectors between the key points of the target object, the plane coordinates of the key points of the target object, and so on.
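The one-backbone, multi-head layout just described can be sketched as below. The shapes, head contents and dummy computations are all assumptions for illustration; the patent does not specify the network internals:

```python
import numpy as np

def backbone(image):
    """Stand-in for the Hourglass backbone: any function image -> feature map."""
    return image.mean(axis=2, keepdims=True)            # (H, W, 1)

def first_model(feat):
    """Detection frame plus spatial (3D) coordinate of the center key point."""
    h, w = feat.shape[:2]
    return {"box": (0, 0, w, h), "center_xyz": (w / 2.0, h / 2.0, 1.0)}

def second_model(feat, num_kp=17):
    """Feature vectors of the positional relationships between key points."""
    return np.zeros((num_kp, 3))

def third_model(feat, num_kp=17):
    """Per-key-point plane-coordinate (heatmap-style) output."""
    return np.zeros((num_kp,) + feat.shape[:2])

image = np.random.rand(64, 64, 3)
feat = backbone(image)
det, offsets, heatmaps = first_model(feat), second_model(feat), third_model(feat)
print(det["box"], offsets.shape, heatmaps.shape)  # (0, 0, 64, 64) (17, 3) (17, 64, 64)
```

The point of the structure is that all three heads share one feature extraction pass over the image.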
In addition, a fourth model may be further included, and the fourth model may be a model for determining the depth values of the keypoints, and the specific functions will be described in detail later.
The above-described first, second and third models (as well as the fourth model) may be trained in advance. For example, taking the target object as a person, key point annotation can be performed in advance on person image samples of different genders, body types and postures. The annotation results correspond to the outputs of the first, second and third models described above, and the resolution of the annotation results may be kept consistent with the resolution of those models' outputs.
The character image sample is input into the backbone network, and the image characteristics can be extracted. The prediction results can be obtained through the first model to be trained, the second model to be trained and the third model to be trained respectively. And respectively comparing the prediction result with the corresponding labeling result, and adjusting parameters in the first model to be trained, the second model to be trained and the third model to be trained by using the difference until the output result of the models is converged.
In addition, to enhance the generalization capability of the models and enrich the training data, the image samples may be scaled to different sizes, rotated by different angles and/or augmented with color-space perturbations before training.
Through the above scheme, the spatial coordinates of the other key points are confirmed by radiating outward from the central key point of the target object. Because this process takes the target object's depth into account, the relative position data of each target object is retained, particularly when multiple target objects are present. Moreover, determining the spatial coordinates in this successively diverging manner and correcting them with the plane coordinates ensures the accuracy of each key point's spatial coordinates.
As shown in fig. 6, in one embodiment, step S101 may further include the following sub-steps:
s601: determining a detection frame of a target object and a to-be-revised spatial coordinate of a central key point;
s602: determining key points appearing in the detection frame as key points of the target object;
s603: and revising the to-be-revised spatial coordinates of the central key points by using the depths of the key points of the target object to obtain the spatial coordinates of the central key points of the target object.
By using the first model, the detection frame of each target object in the image and the spatial coordinates of the central key point of the target object can be obtained. In the current embodiment, the spatial coordinates of the central key point of the target object obtained through the first model may be used as the spatial coordinates to be revised.
Using the fourth model, depth values for the key points of the target object may be determined. For example, as shown in FIG. 7, the fourth model receives an image and may output depth values for key points of a target object in the image.
For the same target object, the detection frame of the target object can be used for constraining each key point, so that each key point belonging to the same target object can be relatively accurately acquired.
For example, the average of the key point depth values of the same target object may be calculated, and the calculation result used as a revision parameter for the spatial coordinates to be revised. The spatial coordinates to be revised may be denoted (x, y, z), and the average of the key point depth values of the same target object denoted z'. The revised spatial coordinates can then be represented as (x, y, z').
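A minimal sketch of this revision step, using the averaging rule given above (function and variable names are assumptions):

```python
def revise_center_depth(center_to_revise, keypoint_depths):
    """Replace the center key point's z with the mean of the depth values of
    all key points of the same target object: (x, y, z) -> (x, y, z')."""
    x, y, _ = center_to_revise
    z_prime = sum(keypoint_depths) / len(keypoint_depths)
    return (x, y, z_prime)

center = (120.0, 85.0, 2.9)          # spatial coordinate to be revised
depths = [2.0, 2.5, 3.0, 2.5, 2.0]   # fourth-model depth values of the object's key points
print(revise_center_depth(center, depths))  # (120.0, 85.0, 2.4)
```

Any robust aggregate (e.g. a median) could serve as the revision parameter; the text only specifies an average-style calculation.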
By the scheme, a key point depth detection result is introduced. And correcting the space coordinates of the central key point of the target object by using the depth values of all key points of the target object so as to ensure that the accuracy of the corrected coordinates of the central key point is higher.
As shown in fig. 8, in an embodiment, step S102 may specifically include the following sub-steps:
s801: selecting at least one first candidate key point from other key points according to a preset rule, and forming a first key point set by the first candidate key point and the central key point;
s802: respectively determining the positional relationship between the central key point and each first candidate key point in the first key point set by using a space vector model;
s803: determining the spatial coordinates of the corresponding first candidate key points by using the spatial coordinates of the central key points and the position relationship between the central key points and any first candidate key point in the first key point set;
the spatial coordinates of the first candidate keypoint are taken as the spatial coordinates of the other keypoints.
The predetermined rule may be selected according to the distance from the central key point, according to the linkage relationship between the joints or parts represented by the key points, or may be selected randomly.
For example, when selecting by the linkage relationship between the joints represented by the key points, the key points corresponding to other joints or parts that have a linkage relationship with the joint or part represented by the central key point may be selected as first candidate key points, with the central key point as the origin. There may be one or more first candidate key points.
The first candidate keypoint and the central keypoint are grouped into a set, which may be referred to as a link structure composed of keypoints.
The spatial coordinates of each first candidate keypoint in the set can be obtained by using the spatial coordinates of the central keypoint and the position relationship among the keypoints obtained by the second model.
By the scheme, the accuracy of the space coordinates of each key point can be ensured by adopting a mode of taking the central key point as an origin and radiating the surrounding key points to determine the space coordinates of the surrounding key points.
In one embodiment, the number of the first keypoint sets is multiple, and the first candidate keypoints in each first keypoint set are different.
As shown in fig. 9, a variety of predetermined rules may be used to select first candidate key points associated with the central key point and form different sets. That is, different sets correspond to different link structures. The first key point sets may differ in the number of first candidate key points they contain, or in which key points they include. Taking the target object as a person, the person may include 17 key points, and the 16 key points other than the central key point may all serve as first candidate key points in a first key point set. Alternatively, by distance, the N key points closest to the central key point may be selected as the first candidate key points in a first key point set.
With the above scheme, each first candidate key point may receive several different spatial coordinates. In subsequent processing, these different spatial coordinates may be used to obtain a final calculated spatial coordinate for each first candidate key point. The final calculated spatial coordinate differs little from the true value and meets the precision requirement.
The specific process of obtaining the final calculation result of the spatial coordinates of each first candidate keypoint by using multiple different sets of spatial coordinates will be described in detail later.
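The radiation step under multiple link structures can be sketched as follows, assuming the second model's positional relationships are available as per-key-point offset vectors (an assumption; the patent calls them feature vectors, and all values below are made up):

```python
def radiate(center_xyz, offset_vectors):
    """Spatial coordinate of each first candidate key point = center
    coordinate + the offset vector encoding their positional relationship."""
    return {name: tuple(c + d for c, d in zip(center_xyz, off))
            for name, off in offset_vectors.items()}

center = (100.0, 80.0, 2.0)
# Two different link structures (first key point sets) covering the same key point:
set_a = {"neck": (0.0, -25.0, 0.0)}
set_b = {"neck": (1.0, -24.0, 0.0)}
candidates = [radiate(center, s)["neck"] for s in (set_a, set_b)]
print(candidates)  # [(100.0, 55.0, 2.0), (101.0, 56.0, 2.0)]
```

The same key point thus accumulates one candidate spatial coordinate per link structure, which is what the later fusion step consumes.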
As shown in fig. 10, in one embodiment, the method further includes:
s1001: selecting at least one second candidate key point according to a predetermined rule and the association between the first candidate key points whose spatial coordinates have been determined and other key points whose spatial coordinates have not been determined, and forming a second key point set from the first candidate key points with determined spatial coordinates and the second candidate key points;
s1002: respectively determining the positional relationship between the first candidate key points with determined spatial coordinates and each second candidate key point in the second key point set by using a space vector model;
s1003: determining the spatial coordinates of corresponding second candidate key points by using the spatial coordinates of the first candidate key points with the determined spatial coordinates and the position relationship between the first candidate key points with the determined spatial coordinates and any second candidate key point in the second key point set;
and taking the spatial coordinates of the second candidate key point as the spatial coordinates of other key points.
In the case where the first candidate keypoint cannot cover all the keypoints of the target object, the keypoint for which the spatial coordinate is not determined may be taken as the second candidate keypoint.
In determining the spatial coordinates of the second candidate keypoints, the first candidate keypoints for which spatial coordinates have been determined may be used. For example, for a first candidate keypoint for which spatial coordinates have been determined, a second candidate keypoint associated with the first candidate keypoint may be selected by the aforementioned predetermined rule, constituting a second keypoint set.
According to the output of the second model, the positional relationship between the first candidate key points and each second candidate key point in the second key point set can be obtained. The spatial coordinates of each second candidate key point can then be obtained by using this positional relationship.
Further, in the foregoing embodiment, when there are multiple first key point sets, multiple second key point sets may be obtained correspondingly. That is, multiple spatial coordinates may be obtained for each second candidate key point. Similarly, the final calculated spatial coordinate of each second candidate key point may also be obtained from these different spatial coordinates.
Since the spatial coordinates of the second keypoints are obtained by using the spatial coordinates of the first keypoints, the accuracy of the coordinates of the second keypoints can meet the requirement.
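This chained extension amounts to a breadth-first fill: key points whose coordinates are known seed their neighbors along the link structure. The link dictionary and offsets below are illustrative assumptions:

```python
def propagate(known, links):
    """`known`: {name: (x, y, z)} already-determined coordinates (center and
    first candidate key points). `links`: {(src, dst): (dx, dy, dz)} positional
    relationships. Repeatedly place any dst whose src is already known."""
    known = dict(known)
    changed = True
    while changed:
        changed = False
        for (src, dst), (dx, dy, dz) in links.items():
            if src in known and dst not in known:
                x, y, z = known[src]
                known[dst] = (x + dx, y + dy, z + dz)
                changed = True
    return known

known = {"center": (0.0, 0.0, 1.0), "shoulder": (0.0, -2.0, 1.0)}
links = {("shoulder", "elbow"): (3.0, 0.0, 0.0),
         ("elbow", "wrist"): (2.0, 1.0, 0.0)}
print(propagate(known, links)["wrist"])  # (5.0, -1.0, 1.0)
```

The wrist is reached in two hops: first the elbow is placed from the shoulder, then the wrist from the elbow, mirroring how second candidates are placed from first candidates.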
As shown in fig. 11, in one embodiment, step S103 may further include the following sub-steps:
s1101: determining the plane coordinates of each of the other key points by using a plane coordinate determination model;
s1102: and correcting the space coordinates of other key points by using the difference between the plane coordinates of other key points and the x-axis coordinates and the y-axis coordinates in the space coordinates.
In the current embodiment, the plane coordinate determination model may employ a Gaussian heatmap model.
For example, the image resolution is 640 × 640. The target object in the image is a person, and the person comprises 17 key points in total. Each (circular) keypoint occupies a portion of the pixels in the image in the form of a gaussian heat map. The coordinates at the center of the keypoint may be taken as the plane coordinates of the keypoint.
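Under the Gaussian-heatmap representation, a natural way to read off the plane coordinate is the heatmap peak; this is a common choice and an assumption here, as the text only says the coordinates at the center of the key point are used:

```python
import numpy as np

def plane_coordinate(heatmap):
    """Return the (x, y) pixel at the heatmap's maximum response,
    taken as the plane coordinate of the key point."""
    row, col = np.unravel_index(np.argmax(heatmap), heatmap.shape)
    return (int(col), int(row))

hm = np.zeros((640, 640))
hm[200, 320] = 1.0           # toy "Gaussian" whose center is at (x=320, y=200)
print(plane_coordinate(hm))  # (320, 200)
```

In practice the heatmap would contain a smooth Gaussian blob per key point, but the peak-reading logic is the same.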
The method for correcting the spatial coordinates of the key points may include the following steps:
and under the condition that the difference between the plane coordinates of the key points and the x-axis coordinates and the y-axis coordinates in the space coordinates is not larger than the corresponding threshold value, replacing the x-axis coordinates and the y-axis coordinates in the space coordinates with the plane coordinates.
And under the condition that the difference between the plane coordinate of the key point and the x-axis coordinate and the y-axis coordinate in the space coordinate is larger than the corresponding threshold value, the x-axis coordinate and the y-axis coordinate in the space coordinate can be reserved.
Alternatively, different weights may be set according to the difference. For example, a weight q1 may be set for the plane coordinates and, according to the difference, a weight q2 for the x-axis and y-axis coordinates of the spatial coordinate. The x-axis and y-axis coordinates of the spatial coordinate can then be corrected by computing the weighted sum.
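The two correction strategies just described can be sketched as below; the threshold and the weights q1, q2 are illustrative values, not prescribed by the text:

```python
def correct_xy_threshold(space_xyz, plane_xy, threshold=5.0):
    """Replace (x, y) with the plane coordinate when the per-axis difference is
    not larger than `threshold`; otherwise keep the spatial (x, y)."""
    x, y, z = space_xyz
    px, py = plane_xy
    if abs(px - x) <= threshold and abs(py - y) <= threshold:
        return (px, py, z)
    return (x, y, z)

def correct_xy_weighted(space_xyz, plane_xy, q1=0.7, q2=0.3):
    """Alternative: weighted sum of the plane coordinate (weight q1) and the
    spatial (x, y) (weight q2)."""
    x, y, z = space_xyz
    px, py = plane_xy
    return (q1 * px + q2 * x, q1 * py + q2 * y, z)

print(correct_xy_threshold((318.0, 203.0, 2.4), (320.0, 200.0)))  # (320.0, 200.0, 2.4)
print(correct_xy_weighted((318.0, 203.0, 2.4), (320.0, 200.0)))
```

In both variants the z-axis coordinate (depth) is left untouched, since it was corrected separately via the key point depth values.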
Through this scheme, the spatial coordinates can be corrected by means of the relatively accurate plane coordinates of the key points. In the present embodiment, the depth value (z-axis coordinate) of the spatial coordinate has already been corrected, and the x-axis and y-axis coordinates are corrected here, so the spatial coordinates achieve higher accuracy.
As shown in fig. 12, in an embodiment, in the case that the same other keypoint includes a plurality of spatial coordinates, step S103 may further include the following steps:
s1201: assigning a weight to each spatial coordinate according to the difference between the plane coordinates of the other key point and the x-axis and y-axis coordinates of that spatial coordinate;
s1202: calculating the weighted sum of the spatial coordinates to obtain a calculation result;
s1203: taking the calculation result as the correction result of the spatial coordinates of the other key point.
The other key points here may correspond to the aforementioned first candidate key points and second candidate key points. For any such key point, its plane coordinates may be taken as a reference value, and each of its spatial coordinates scored against that reference.
The scoring may be performed based on differences in x-axis coordinates, y-axis coordinates, and plane coordinates in the spatial coordinates. The smaller the difference, the higher the score.
For spatial coordinates with a high score, a high weight may be assigned. Correspondingly, for coordinates with a low score, a low weight may be assigned.
Further, a corresponding threshold value may also be set. In the event that the score is below the corresponding threshold, the spatial coordinates corresponding to the low score may be discarded.
For the remaining spatial coordinates, a weighted sum may be calculated as the final result of the coordinates.
By the scheme, a plurality of spatial coordinates of the same key point can be obtained according to different link structures, and a final result with higher accuracy can be obtained by utilizing the plurality of spatial coordinates to perform comprehensive calculation.
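A sketch of S1201–S1203 follows. The scoring rule (1 / (1 + distance)) is a made-up example; the text only requires that a smaller difference yield a higher score:

```python
import numpy as np

def fuse_candidates(candidate_xyzs, plane_xy, min_score=None):
    """Score each candidate spatial coordinate by how close its (x, y) lies to
    the plane coordinate, optionally discard low-scoring candidates, and return
    the score-weighted sum as the corrected spatial coordinate."""
    cands = np.asarray(candidate_xyzs, dtype=float)
    dists = np.linalg.norm(cands[:, :2] - np.asarray(plane_xy), axis=1)
    scores = 1.0 / (1.0 + dists)          # smaller difference -> higher score
    if min_score is not None:             # threshold: drop low-scoring candidates
        keep = scores >= min_score
        cands, scores = cands[keep], scores[keep]
    weights = scores / scores.sum()
    return tuple(float(v) for v in cands.T @ weights)

# Two candidates equidistant from the plane coordinate average out exactly.
result = fuse_candidates([(99.0, 80.0, 2.0), (101.0, 80.0, 2.2)], (100.0, 80.0))
print(result)  # (100.0, 80.0, 2.1)
```

With unequal distances, the candidate whose (x, y) better matches the plane coordinate dominates the weighted sum, which is the behavior the embodiment describes.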
As shown in fig. 13, the present disclosure relates to a device for keypoint detection, which may include the following components:
a spatial coordinate determination module 1301 of the central key point, configured to determine a spatial coordinate of the central key point of the target object in the image;
a spatial coordinate determination module 1302 for determining spatial coordinates of other key points by using the spatial coordinate of the central key point and the position relationship between the central key point and the other key points of the target object;
and the spatial coordinate correction module 1303 is configured to determine plane coordinates of other key points, and correct the spatial coordinates of the other key points by using the plane coordinates of the other key points to obtain a detection result.
In one embodiment, the module 1301 for determining the spatial coordinates of the central keypoint may further include:
the initial information determining submodule, used for determining a detection frame of the target object and to-be-revised spatial coordinates of the central keypoint;
the keypoint determining submodule of the target object, used for determining keypoints appearing inside the detection frame as keypoints of the target object;
and the spatial coordinate revision submodule, used for revising the to-be-revised spatial coordinates of the central keypoint by using the depths of the keypoints of the target object, so as to obtain the spatial coordinates of the central keypoint of the target object.
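The depth-revision step can be sketched as follows. The disclosure only states that the central keypoint's spatial coordinate is revised "using the depth of the key points of the target object"; averaging those depths, as done below, is one plausible realization and is an assumption, as is the function name.

```python
import numpy as np

def revise_center_depth(center_xyz, keypoints_xyz):
    """Revise the central keypoint's depth (z) using the depths of the
    keypoints detected inside the target's detection frame.

    Assumed realization: replace the to-be-revised z with the mean depth
    of the target's keypoints; x and y are left unchanged.
    """
    revised = np.asarray(center_xyz, dtype=float).copy()
    keypoints_xyz = np.asarray(keypoints_xyz, dtype=float)
    revised[2] = keypoints_xyz[:, 2].mean()  # mean depth of target keypoints
    return revised
```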
In one embodiment, the spatial coordinate determination module 1302 for other key points may further include:
the first key point set determining submodule is used for selecting at least one first candidate key point from other key points according to a preset rule and forming the first candidate key point and the central key point into a first key point set;
the position relation determining submodule is used for respectively determining the position relation between the central keypoint and each first candidate keypoint in the first keypoint set by using a space vector model;
the spatial coordinate determination submodule of the first candidate key point is used for determining the spatial coordinate of the corresponding first candidate key point by utilizing the spatial coordinate of the central key point and the position relation between the central key point and any first candidate key point in the first key point set;
the spatial coordinates of the first candidate keypoint are taken as the spatial coordinates of the other keypoints.
In one embodiment, there are a plurality of first keypoint sets, and the first candidate keypoints in each first keypoint set are different.
In an embodiment, the module 1302 for determining spatial coordinates of other key points may further include:
the second keypoint set determining submodule is used for selecting at least one second candidate keypoint from the other keypoints whose spatial coordinates have not been determined, according to a predetermined rule and an association relationship with the first candidate keypoints whose spatial coordinates have been determined, and for forming a second keypoint set from the first candidate keypoints with determined spatial coordinates and the second candidate keypoints;
the position relation determining submodule of the second candidate keypoint is used for respectively determining the position relation between the first candidate keypoint with determined spatial coordinates and each second candidate keypoint in the second keypoint set by using the space vector model;
the spatial coordinate determination submodule of the second candidate key point is used for determining the spatial coordinate of the corresponding second candidate key point by utilizing the spatial coordinate of the first candidate key point with the determined spatial coordinate and the position relation between the first candidate key point with the determined spatial coordinate and any second candidate key point in the second key point set;
and taking the spatial coordinates of the second candidate key point as the spatial coordinates of other key points.
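The two-stage propagation described above (first candidates hang off the central keypoint; second candidates hang off first candidates with determined coordinates) can be sketched as follows. The offset-vector representation is an assumption about the "space vector model", which the disclosure does not spell out, and the names `propagate_coordinates`, `offsets`, and `skeleton` are hypothetical.

```python
import numpy as np

def propagate_coordinates(center_xyz, offsets, skeleton):
    """Propagate spatial coordinates outward from the central keypoint.

    `offsets[child]` is the assumed 3D offset (space vector) from `parent`
    to `child`; `skeleton` lists (parent, child) pairs ordered so that
    every parent's coordinates are determined before its children's,
    mirroring the first-candidate / second-candidate stages in the text.
    """
    coords = {"center": np.asarray(center_xyz, dtype=float)}
    for parent, child in skeleton:
        # Child coordinates = parent coordinates + parent-to-child vector.
        coords[child] = coords[parent] + np.asarray(offsets[child], dtype=float)
    return coords
```

A "shoulder" first candidate is placed from the center, then an "elbow" second candidate is placed from the shoulder.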
In one embodiment, the spatial coordinate correcting module 1303 may further include:
the plane coordinate determination submodule is used for determining the plane coordinate of each other key point by utilizing the plane coordinate determination model for any other key point;
and the space coordinate correction execution submodule is used for correcting the space coordinates of other key points by utilizing the difference between the plane coordinates of other key points and the x-axis coordinates and the y-axis coordinates in the space coordinates.
In one embodiment, in the case that the same other keypoint includes multiple spatial coordinates, the spatial coordinate correction execution submodule includes:
the weight distribution unit, used for assigning a weight to each spatial coordinate according to the difference between the plane coordinates of the other keypoints and the x-axis and y-axis coordinates in each spatial coordinate;
the weighted sum calculation unit, used for calculating a weighted sum of the spatial coordinates to obtain a calculation result;
and taking the calculation result as the correction result of the spatial coordinates of the other keypoints.
The present disclosure also provides an electronic device, a readable storage medium, and a computer program product according to embodiments of the present disclosure.
FIG. 14 shows a schematic block diagram of an electronic device 1400 that may be used to implement embodiments of the present disclosure. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital assistants, cellular phones, smart phones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be examples only, and are not meant to limit implementations of the disclosure described and/or claimed herein.
As shown in fig. 14, the electronic device 1400 includes a computing unit 1410 that may perform various appropriate actions and processes in accordance with a computer program stored in a read-only memory (ROM) 1420 or a computer program loaded from a storage unit 1480 into a random access memory (RAM) 1430. In the RAM 1430, various programs and data required for the operation of the device 1400 may also be stored. The computing unit 1410, the ROM 1420, and the RAM 1430 are connected to each other by a bus 1440. An input/output (I/O) interface 1450 is also connected to the bus 1440.
Various components in electronic device 1400 are connected to I/O interface 1450, including: an input unit 1460 such as a keyboard, a mouse, or the like; an output unit 1470 such as various types of displays, speakers, and the like; a storage unit 1480 such as a magnetic disk, optical disk, or the like; and a communication unit 1490 such as a network card, modem, wireless communication transceiver, or the like. The communication unit 1490 allows the electronic device 1400 to exchange information/data with other devices via a computer network, such as the internet, and/or various telecommunications networks.
Computing unit 1410 may be any of a variety of general and/or special purpose processing components with processing and computing capabilities. Some examples of computing unit 1410 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various specialized Artificial Intelligence (AI) computing chips, various computing units running machine learning model algorithms, a Digital Signal Processor (DSP), and any suitable processor, controller, microcontroller, and so forth. The computing unit 1410 performs various methods and processes described above, such as a method of keypoint detection. For example, in some embodiments, the method of keypoint detection may be implemented as a computer software program tangibly embodied in a machine-readable medium, such as storage unit 1480. In some embodiments, part or all of the computer program may be loaded and/or installed onto the electronic device 1400 via the ROM 1420 and/or the communication unit 1490. When the computer program is loaded into RAM 1430 and executed by computing unit 1410, it may perform one or more steps of the above-described method of keypoint detection. Alternatively, in other embodiments, the computing unit 1410 may be configured by any other suitable means (e.g., by means of firmware) to perform the method of keypoint detection.
Various implementations of the systems and techniques described above may be realized in digital electronic circuitry, integrated circuitry, Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), systems on chips (SOCs), Complex Programmable Logic Devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include: implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special or general purpose, receiving data and instructions from, and transmitting data and instructions to, a storage system, at least one input device, and at least one output device.
Program code for implementing the methods of the present disclosure may be written in any combination of one or more programming languages. These program codes may be provided to a processor or controller of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the program codes, when executed by the processor or controller, cause the functions/operations specified in the flowchart and/or block diagram to be performed. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package partly on the machine and partly on a remote machine, or entirely on the remote machine or server.
In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. A machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic input, speech input, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: local Area Networks (LANs), Wide Area Networks (WANs), and the Internet.
The computer system may include clients and servers. A client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other.
It should be understood that various forms of the flows shown above may be used, with steps reordered, added, or deleted. For example, the steps described in the present disclosure may be executed in parallel or sequentially or in different orders, and are not limited herein as long as the desired results of the technical solutions disclosed in the present disclosure can be achieved.
The above detailed description should not be construed as limiting the scope of the disclosure. It should be understood by those skilled in the art that various modifications, combinations, sub-combinations and substitutions may be made in accordance with design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present disclosure should be included in the scope of protection of the present disclosure.

Claims (17)

1. A method of keypoint detection, comprising:
determining the space coordinates of the central key point of the target object in the image;
determining the spatial coordinates of other key points by using the spatial coordinates of the central key point and the position relationship between the central key point and the other key points of the target object;
and determining the plane coordinates of the other key points, and correcting the space coordinates of the other key points by using the plane coordinates of the other key points to obtain a detection result.
2. The method of claim 1, wherein determining spatial coordinates of a central keypoint of the target object in the image comprises:
determining a detection frame of the target object and a to-be-revised space coordinate of the central key point;
determining key points appearing inside the detection frame as key points of the target object;
and revising the to-be-revised spatial coordinates of the central key point by using the depth of the key point of the target object to obtain the spatial coordinates of the central key point of the target object.
3. The method of claim 1, wherein the determining spatial coordinates of other key points of the target object using the spatial coordinates of the central key point and the positional relationship of the central key point to the other key points comprises:
selecting at least one first candidate key point from the other key points according to a preset rule, and forming a first key point set by the first candidate key point and the central key point;
respectively determining the position relation between the central key point and each first candidate key point in the first key point set by using a space vector model;
determining the spatial coordinates of the corresponding first candidate key points by using the spatial coordinates of the central key points and the position relationship between the central key points and any first candidate key point in the first key point set;
and taking the spatial coordinates of the first candidate key point as the spatial coordinates of the other key points.
4. The method of claim 3, wherein there are a plurality of first keypoint sets, and the first candidate keypoints in each first keypoint set are different.
5. The method of claim 3 or 4, further comprising:
selecting at least one second candidate key point from the other key points whose spatial coordinates have not been determined, according to the predetermined rule and an association relation between the second candidate key point and the first candidate key point whose spatial coordinates have been determined, and forming a second key point set from the first candidate key point with determined spatial coordinates and the second candidate key point;
respectively determining the position relation between the first candidate key point with the determined space coordinates and each second candidate key point in the second key point set by using the space vector model;
determining the spatial coordinates of the corresponding second candidate key points by using the spatial coordinates of the first candidate key points of the determined spatial coordinates and the position relationship between the first candidate key points of the determined spatial coordinates and any second candidate key points in the second key point set;
and taking the space coordinates of the second candidate key points as the space coordinates of the other key points.
6. The method of claim 1, wherein the determining the plane coordinates of the other key points, and correcting the space coordinates of the other key points by using the plane coordinates of the other key points to obtain the detection result comprises:
for any other key point, determining the plane coordinate of each other key point by using a plane coordinate determination model;
and correcting the space coordinates of the other key points by using the difference between the plane coordinates of the other key points and the x-axis coordinates and the y-axis coordinates in the space coordinates.
7. The method according to claim 6, wherein in a case that the same other keypoint includes multiple spatial coordinates, the correcting the spatial coordinates of the other keypoint using the difference between the plane coordinates of the other keypoint and the x-axis coordinates and the y-axis coordinates in the spatial coordinates comprises:
according to the difference between the plane coordinates of the other key points and the x-axis coordinate and the y-axis coordinate in each spatial coordinate, assigning a weight to each spatial coordinate;
calculating a weighted sum of the spatial coordinates to obtain a calculation result;
and taking the calculation result as the correction result of the spatial coordinates of the other key points.
8. An apparatus for keypoint detection, comprising:
the spatial coordinate determination module of the central key point is used for determining the spatial coordinate of the central key point of the target object in the image;
the spatial coordinate determination module of other key points is used for determining the spatial coordinates of other key points by utilizing the spatial coordinates of the central key point and the position relationship between the central key point and other key points of the target object;
and the spatial coordinate correction module is used for determining the plane coordinates of the other key points and correcting the spatial coordinates of the other key points by using the plane coordinates of the other key points to obtain a detection result.
9. The apparatus of claim 8, wherein the means for determining spatial coordinates of the central keypoint comprises:
the initial information determining submodule is used for determining a detection frame of the target object and a to-be-modified space coordinate of the central key point;
the key point determining submodule of the target object is used for determining key points appearing in the detection frame as key points of the target object;
and the space coordinate revision submodule is used for revising the space coordinate to be revised of the central key point by using the depth of the key point of the target object so as to obtain the space coordinate of the central key point of the target object.
10. The apparatus of claim 8, wherein the means for determining spatial coordinates of the other keypoints comprises:
a first keypoint set determining submodule, configured to select at least one first candidate keypoint from the other keypoints according to a predetermined rule, and form a first keypoint set by using the first candidate keypoint and the central keypoint;
a position relation determining submodule, configured to determine, by using a space vector model, a position relation between the central keypoint and each of the first candidate keypoints in the first keypoint set;
a spatial coordinate determination submodule of the first candidate keypoint, configured to determine a spatial coordinate of the corresponding first candidate keypoint by using the spatial coordinate of the central keypoint and a position relationship between the central keypoint and any one of the first candidate keypoints in the first keypoint set;
and taking the spatial coordinates of the first candidate key point as the spatial coordinates of the other key points.
11. The apparatus of claim 10, wherein there are a plurality of first keypoint sets, and the first candidate keypoints in each first keypoint set are different.
12. The apparatus of claim 9 or 10, the spatial coordinate determination module of the other keypoints, further comprising:
a second keypoint set determining submodule, configured to select at least one second candidate keypoint from the other keypoints whose spatial coordinates have not been determined, according to the predetermined rule and an association relationship with the first candidate keypoints whose spatial coordinates have been determined, and to form a second keypoint set from the first candidate keypoints with determined spatial coordinates and the second candidate keypoints;
a position relation determining submodule of the second candidate keypoint, configured to respectively determine, by using the space vector model, a position relation between the first candidate keypoint with determined spatial coordinates and each second candidate keypoint in the second keypoint set;
a spatial coordinate determination submodule of a second candidate keypoint, configured to determine a spatial coordinate of a corresponding second candidate keypoint by using a spatial coordinate of a first candidate keypoint of the determined spatial coordinate and a position relationship between the first candidate keypoint of the determined spatial coordinate and any second candidate keypoint of the second keypoint set;
and taking the space coordinates of the second candidate key points as the space coordinates of the other key points.
13. The apparatus of claim 8, wherein the spatial coordinate correction module comprises:
the plane coordinate determination submodule is used for determining the plane coordinate of each other key point by utilizing a plane coordinate determination model for any other key point;
and the space coordinate correction execution submodule is used for correcting the space coordinates of the other key points by utilizing the difference between the plane coordinates of the other key points and the x-axis coordinates and the y-axis coordinates in the space coordinates.
14. The apparatus of claim 13, wherein in the case that the same other keypoint includes multiple spatial coordinates, the spatial coordinate modification performing submodule includes:
the weight distribution unit, used for assigning a weight to each spatial coordinate according to the difference between the plane coordinates of the other key points and the x-axis coordinate and the y-axis coordinate in each spatial coordinate;
the weighted sum calculation unit, used for calculating a weighted sum of the spatial coordinates to obtain a calculation result;
and taking the calculation result as the correction result of the spatial coordinates of the other key points.
15. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1 to 7.
16. A non-transitory computer readable storage medium storing computer instructions for causing a computer to perform the method of any one of claims 1 to 7.
17. A computer program product comprising a computer program which, when executed by a processor, implements the method according to any one of claims 1 to 7.
CN202110524390.2A 2021-05-13 2021-05-13 Method, device, equipment and storage medium for detecting key points Active CN113221751B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110524390.2A CN113221751B (en) 2021-05-13 2021-05-13 Method, device, equipment and storage medium for detecting key points


Publications (2)

Publication Number Publication Date
CN113221751A true CN113221751A (en) 2021-08-06
CN113221751B CN113221751B (en) 2024-01-12

Family

ID=77095672

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110524390.2A Active CN113221751B (en) 2021-05-13 2021-05-13 Method, device, equipment and storage medium for detecting key points

Country Status (1)

Country Link
CN (1) CN113221751B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2020029758A1 (en) * 2018-08-07 2020-02-13 北京市商汤科技开发有限公司 Object three-dimensional detection method and apparatus, intelligent driving control method and apparatus, medium, and device
CN112069988A (en) * 2020-09-04 2020-12-11 徐尔灵 Gun-ball linkage-based driver safe driving behavior detection method


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
JIA Haolong; BAO Qiliang; QIN Rui: "A keypoint detection algorithm for UAV targets based on cascaded neural networks", Optics & Optoelectronic Technology, no. 02 *

Also Published As

Publication number Publication date
CN113221751B (en) 2024-01-12

Similar Documents

Publication Publication Date Title
CN114186632B (en) Method, device, equipment and storage medium for training key point detection model
US10346996B2 (en) Image depth inference from semantic labels
EP3852008A2 (en) Image detection method and apparatus, device, storage medium and computer program product
CN113963110B (en) Texture map generation method and device, electronic equipment and storage medium
CN114549710A (en) Virtual image generation method and device, electronic equipment and storage medium
CN114723888B (en) Three-dimensional hair model generation method, device, equipment, storage medium and product
CN112784765A (en) Method, apparatus, device and storage medium for recognizing motion
CN113205041A (en) Structured information extraction method, device, equipment and storage medium
CN111652181B (en) Target tracking method and device and electronic equipment
CN112580666A (en) Image feature extraction method, training method, device, electronic equipment and medium
CN114792359A (en) Rendering network training and virtual object rendering method, device, equipment and medium
CN113591683A (en) Attitude estimation method, attitude estimation device, electronic equipment and storage medium
CN113724388A (en) Method, device and equipment for generating high-precision map and storage medium
CN115147831A (en) Training method and device of three-dimensional target detection model
CN108027647B (en) Method and apparatus for interacting with virtual objects
CN116092120B (en) Image-based action determining method and device, electronic equipment and storage medium
CN115809325A (en) Document processing model training method, document processing method, device and equipment
CN113808192B (en) House pattern generation method, device, equipment and storage medium
CN113221751B (en) Method, device, equipment and storage medium for detecting key points
CN113344200A (en) Method for training separable convolutional network, road side equipment and cloud control platform
CN113379592A (en) Method and device for processing sensitive area in picture and electronic equipment
CN112749978A (en) Detection method, apparatus, device, storage medium, and program product
CN111814865A (en) Image identification method, device, equipment and storage medium
CN114842122B (en) Model rendering method, device, equipment and storage medium
CN113239899B (en) Method for processing image and generating convolution kernel, road side equipment and cloud control platform

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant