CN113221751B

CN113221751B - Method, device, equipment and storage medium for detecting key points

Info

Publication number: CN113221751B
Application number: CN202110524390.2A
Authority: CN
Inventors: 杨黔生; 王健; 沈辉; 丁二锐
Original assignee: Beijing Baidu Netcom Science and Technology Co Ltd
Current assignee: Beijing Baidu Netcom Science and Technology Co Ltd
Priority date: 2021-05-13
Filing date: 2021-05-13
Publication date: 2024-01-12
Anticipated expiration: 2041-05-13
Also published as: CN113221751A

Abstract

The disclosure provides a method, a device, equipment and a storage medium for detecting key points. It relates to the technical field of artificial intelligence, in particular to the fields of computer vision and deep learning, and can be applied to robot or automatic driving scenes. The specific implementation scheme is as follows: determining the spatial coordinates of a central key point of a target object in an image; determining the spatial coordinates of the other key points by using the spatial coordinates of the central key point and the positional relationship between the central key point and the other key points of the target object; and determining plane coordinates of the other key points, and correcting the spatial coordinates of the other key points by using those plane coordinates to obtain a detection result. Because this process takes the depth factor of the target object into account, the relative position data of the respective target objects is retained, in particular when there are multiple target objects. Determining the spatial coordinates by radiating outward from the central key point and then correcting them with the plane coordinates helps ensure the accuracy of the spatial coordinates of each key point.

Description

Method, device, equipment and storage medium for detecting key points

Technical Field

The disclosure relates to the field of artificial intelligence technology, and in particular to the field of computer vision and deep learning, which can be applied to a robot or an automatic driving scene.

Background

With the progress of society and the rapid development of technology, industries such as short video, live broadcast, online education and automatic driving continue to emerge, and across their various interaction scenes there is a growing demand for interaction functions based on key point information.

The related art relies on the output of a neural network when performing key point detection. When the image is occluded or the neural network has not been trained to sufficient precision, the key point detection result cannot meet the requirement.

Disclosure of Invention

The present disclosure provides a method, apparatus, device, and storage medium for key point detection.

According to an aspect of the present disclosure, there is provided a method of keypoint detection, the method may include the steps of:

determining the space coordinates of a central key point of a target object in an image;

determining the space coordinates of other key points by utilizing the space coordinates of the central key point and the position relation between the central key point and other key points of the target object;

and determining plane coordinates of other key points, and correcting the space coordinates of other key points by using the plane coordinates of other key points to obtain a detection result.

According to another aspect of the present disclosure, there is provided an apparatus for keypoint detection, the apparatus may comprise:

the space coordinate determining module of the central key point is used for determining the space coordinate of the central key point of the target object in the image;

the space coordinate determining module of other key points is used for determining the space coordinates of other key points by utilizing the space coordinates of the central key point and the position relation between the central key point and other key points of the target object;

and the space coordinate correction module is used for determining the plane coordinates of other key points, and correcting the space coordinates of other key points by using the plane coordinates of other key points to obtain a detection result.

According to another aspect of the present disclosure, there is provided an electronic device including:

at least one processor; and

a memory communicatively coupled to the at least one processor; wherein,

the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of the embodiments of the present disclosure.

According to another aspect of the present disclosure, there is provided a non-transitory computer-readable storage medium storing computer instructions for causing a computer to perform the method of any of the embodiments of the present disclosure.

According to another aspect of the present disclosure, there is provided a computer program product comprising a computer program which, when executed by a processor, implements the method in any of the embodiments of the present disclosure.

According to the technology of the present disclosure, the central key point of the target object is used, and the spatial coordinates of the other key points are determined in a radiating manner. Because this process takes the depth factor of the target object into account, the relative position data of the respective target objects is retained, in particular when there are multiple target objects. In addition, determining the spatial coordinates by radiation and correcting them with plane coordinates helps ensure the accuracy of the spatial coordinates of each key point.

It should be understood that the description in this section is not intended to identify key or critical features of the embodiments of the disclosure, nor is it intended to be used to limit the scope of the disclosure. Other features of the present disclosure will become apparent from the following specification.

Drawings

The drawings are for a better understanding of the present solution and are not to be construed as limiting the present disclosure. Wherein:

FIG. 1 is a flow chart of a method of keypoint detection in accordance with the present disclosure;

FIG. 2 is a schematic illustration of a detection frame of a target object and spatial coordinates of a center keypoint of the target object according to the present disclosure;

FIG. 3 is a schematic diagram of a positional relationship between keypoints of a target object according to the present disclosure;

FIG. 4 is a schematic illustration of planar coordinates of key points of a target object according to the present disclosure;

FIG. 5 is a schematic illustration of feature recognition of an image according to the present disclosure;

FIG. 6 is a flow chart of determining spatial coordinates of a center keypoint in accordance with the present disclosure;

FIG. 7 is a schematic diagram of depth values for key points of a target object according to the present disclosure;

FIG. 8 is a flow chart for determining spatial coordinates of other keypoints according to the disclosure;

FIG. 9 is a schematic illustration of determining a first candidate keypoint in accordance with the present disclosure;

FIG. 10 is a flow chart for determining spatial coordinates of other keypoints according to the disclosure;

FIG. 11 is a flow chart for correcting spatial coordinates of other keypoints according to the disclosure;

FIG. 12 is a flow chart for correcting spatial coordinates of other keypoints according to the disclosure;

FIG. 13 is a schematic diagram of an apparatus for keypoint detection in accordance with the present disclosure;

FIG. 14 is a block diagram of an electronic device for implementing a method of keypoint detection of embodiments of the present disclosure.

Detailed Description

Exemplary embodiments of the present disclosure are described below in conjunction with the accompanying drawings, which include various details of the embodiments of the present disclosure to facilitate understanding, and should be considered as merely exemplary. Accordingly, one of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present disclosure. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.

As shown in fig. 1, the present disclosure relates to a method of keypoint detection, which may include the steps of:

S101: determining the space coordinates of a central key point of a target object in an image;

S102: determining the space coordinates of other key points by utilizing the space coordinates of the central key point and the position relation between the central key point and other key points of the target object;

S103: determining plane coordinates of other key points, and correcting the space coordinates of other key points by using the plane coordinates of other key points to obtain a detection result.

The execution subject of the above scheme may be an application installed on an intelligent device such as a mobile phone, a tablet computer or a smart speaker, or may be a server of that application. The image is received through the intelligent device, and the key points of the target object in the image are then detected.

The target object may be a person, an animal, or an object such as a vehicle. The following embodiment will be described taking the example in which the target object is a person.

The image may be input into a pre-trained model to obtain the spatial coordinates of the central key point of the target object. That is, the central key point may be a 3D key point. The pre-trained model may include a first model for detecting the detection frame of the target object. For example, as shown in fig. 2, the first model may output the detection frame of the target object in the image and the spatial coordinates of the central key point of the target object, and may also output the width value, the height value and the like of the detection frame.

Further, the pre-trained model may further include a second model for determining a positional relationship feature vector between keypoints of the target object. For example, as shown in fig. 3, the second model receives the image and may output a feature vector of the positional relationship between the keypoints of the target object in the image. The positional relationship may be a spatial positional relationship.

According to the spatial coordinates of the central key point of the target object and the position relation feature vectors among the key points of the target object, the spatial coordinates of other key points around the central key point can be determined by sequentially diverging from the central key point.

Further, by continuing to diverge from the other key points whose spatial coordinates have already been determined, the spatial coordinates of all the key points of the target object in the detection frame can be determined.

The pre-trained model may further include a third model that determines planar coordinates of each keypoint of the target object. For example, as shown in fig. 4, the third model receives the image and may output the plane coordinates of each key point of the target object in the image.

Generally, because plane coordinates are two-dimensional data, they are more reliable than three-dimensional data. Therefore, the plane coordinates of each key point can be used to correct the spatial coordinates of the corresponding key point to obtain the detection result.

As shown in fig. 5, a backbone network (Hourglass) may be used to extract image features. The image features are input into the first model, the second model and the third model respectively, to obtain the detection frame of the target object, the spatial coordinates of the central key point of the target object, the positional relationship feature vectors between the key points of the target object, the plane coordinates of the key points of the target object, and so on.

In addition, a fourth model may be included, which may be a model that determines the keypoint depth value, the specific function of which will be described in detail below.

The first model, the second model and the third model (as well as the fourth model) may be trained in advance. For example, taking a person as the target object, key points may be labeled in advance on person image samples covering different sexes, body types and postures. The labeling results correspond to the outputs expected from the first model, the second model and the third model, and their resolution can be kept consistent with the resolution of those models' outputs.

A person image sample is input into the backbone network to extract image features. Prediction results are then obtained from the first model to be trained, the second model to be trained and the third model to be trained respectively. Each prediction result is compared with the corresponding labeling result, and the difference is used to adjust the parameters of the three models to be trained until the model outputs converge.

In addition, to enrich the training data and enhance the generalization ability of the models, the image samples may be subjected to scaling at different ratios, rotation by different angles and/or perturbation enhancement of the color space before training.
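The augmentation step can be illustrated with a minimal sketch. It assumes that 2D key point labels must be transformed together with the image, and that color perturbation is a random per-channel offset; the function names `scale_and_rotate` and `jitter_color` and all parameter values are hypothetical and do not come from the disclosure.

```python
import math
import random


def scale_and_rotate(points, scale, angle_deg, center):
    """Apply one scaling and one rotation about `center` to 2D key point labels.

    The same transform would have to be applied to the image itself
    (not shown here) so that labels and pixels stay aligned.
    """
    cx, cy = center
    cos_a = math.cos(math.radians(angle_deg))
    sin_a = math.sin(math.radians(angle_deg))
    out = []
    for (x, y) in points:
        dx, dy = (x - cx) * scale, (y - cy) * scale
        out.append((cx + dx * cos_a - dy * sin_a, cy + dx * sin_a + dy * cos_a))
    return out


def jitter_color(pixel, max_delta, rng):
    """Perturb an (r, g, b) pixel by a random per-channel offset, clamped to 0..255."""
    return tuple(
        min(255, max(0, channel + rng.randint(-max_delta, max_delta)))
        for channel in pixel
    )


rng = random.Random(0)  # fixed seed so the sketch is reproducible
rotated = scale_and_rotate([(2.0, 1.0)], scale=2.0, angle_deg=90.0, center=(1.0, 1.0))
jittered = jitter_color((250, 5, 128), max_delta=20, rng=rng)
```

In practice the scale, angle and color offsets would be drawn at random for each sample; fixed values are used here only to keep the example easy to follow.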

Through the above scheme, the central key point of the target object is used, and the spatial coordinates of the other key points are determined in a radiating manner. Because this process takes the depth factor of the target object into account, the relative position data of the respective target objects is preserved, in particular when there are multiple target objects. Determining the spatial coordinates of the key points sequentially in a divergent manner and then correcting them with plane coordinates helps ensure the accuracy of the spatial coordinates of the key points.

As shown in fig. 6, in one embodiment, step S101 may further include the sub-steps of:

S601: determining a detection frame of the target object and space coordinates to be revised of the central key point;

S602: determining key points appearing inside the detection frame as key points of the target object;

S603: revising the space coordinates to be revised of the central key point by utilizing the depth of the key point of the target object so as to obtain the space coordinates of the central key point of the target object.

By using the first model, the detection frame of each target object in the image and the space coordinates of the central key point of the target object can be obtained. In the present embodiment, the spatial coordinates of the central key point of the target object obtained by the first model may be regarded as the spatial coordinates to be revised.

The depth value of each key point of the target object may be determined by using the fourth model. For example, as shown in fig. 7, the fourth model receives the image and may output depth values of the key points of the target object in the image.

For the same target object, the detection frame of the target object can be utilized to restrict each key point, so that each key point belonging to the same target object can be obtained relatively accurately.

A statistic such as the average of the depth values of the key points of the same target object may be calculated, and the calculation result is used as a revision parameter to revise the spatial coordinates to be revised. For example, the spatial coordinates to be revised may be expressed as (x, y, z), and the average of the depth values of the key points of the same target object may be denoted as z'. The revised spatial coordinates may then be expressed as (x, y, z').

The above scheme introduces the key point depth detection result. The spatial coordinates of the central key point of the target object are corrected using the depth values of all the key points of the target object, so that the corrected coordinates of the central key point are more accurate.
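Sub-steps S601 to S603 can be sketched as follows. This is a minimal illustration that assumes the detection frame is given as (left, top, width, height) in pixels and that each key point carries a plane position and a depth value; the function names and the sample numbers are hypothetical.

```python
from statistics import mean


def keypoints_in_box(keypoints, box):
    """Keep the key points whose plane position falls inside the detection frame.

    keypoints: list of (u, v, depth) tuples; box: (left, top, width, height).
    """
    left, top, width, height = box
    return [
        (u, v, d)
        for (u, v, d) in keypoints
        if left <= u <= left + width and top <= v <= top + height
    ]


def revise_center(center_xyz, keypoints, box):
    """Replace z in the to-be-revised center coordinate (x, y, z) by the mean depth z'."""
    x, y, z = center_xyz
    inside = keypoints_in_box(keypoints, box)
    if not inside:  # no key point in the frame: keep the original estimate
        return (x, y, z)
    z_revised = mean(d for (_, _, d) in inside)
    return (x, y, z_revised)


center = (100.0, 120.0, 3.4)               # (x, y, z) from the first model
box = (50.0, 40.0, 100.0, 200.0)           # detection frame of the target object
kps = [(90.0, 60.0, 3.0), (110.0, 200.0, 3.2), (400.0, 300.0, 9.0)]
revised = revise_center(center, kps, box)  # the third key point lies outside the frame
```

The average is only one possible statistic; the text leaves room for other ways of combining the depth values.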

As shown in fig. 8, in one embodiment, step S102 may specifically include the following sub-steps:

S801: selecting at least one first candidate key point from other key points according to a preset rule, and forming a first key point set by the first candidate key point and the central key point;

S802: determining the position relation between the central key point and each first candidate key point in the first key point set by using a space vector model;

S803: determining the space coordinates of the corresponding first candidate key points by utilizing the space coordinates of the central key point and the position relation between the central key point and any first candidate key point in the first key point set;

and taking the spatial coordinates of the first candidate key points as the spatial coordinates of other key points.

Under the predetermined rule, the first candidate key points may be selected according to their distance from the central key point, according to the linkage relationship between the joints or parts represented by the key points, or at random.

Taking the linkage relation between the joints represented by the key points as an example, the key point corresponding to another joint or part having a linkage relation with the joint or part represented by the central key point may be selected as the first candidate key point by taking the central key point as the origin. The first candidate key point may be one or more.

The first candidate key points and the central key point are combined into a set, which may be referred to as a link structure composed of key points.

The spatial coordinates of each first candidate key point in the set are obtained using the spatial coordinates of the central key point and the positional relationships between key points output by the second model.

With the above scheme, the spatial coordinates of the surrounding key points are determined by radiating outward from the central key point as the origin, which helps ensure the accuracy of the spatial coordinates of all the key points.
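As a minimal sketch of step S803, the positional relationship can be assumed to take the form of a 3D offset vector from the central key point to each first candidate key point. The disclosure only speaks of a positional relationship feature vector, so this offset representation, the function name and the key point names below are assumptions made for illustration.

```python
def radiate_from_center(center_xyz, offsets):
    """Compute first candidate key point coordinates as center + offset.

    offsets: dict mapping a key point name to its assumed (dx, dy, dz)
    offset relative to the central key point.
    """
    cx, cy, cz = center_xyz
    return {
        name: (cx + dx, cy + dy, cz + dz)
        for name, (dx, dy, dz) in offsets.items()
    }


center = (0.0, 1.0, 3.0)
first_candidates = radiate_from_center(
    center,
    {"neck": (0.0, 0.5, 0.0), "left_hip": (-0.25, 0.0, 0.125)},
)
```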

In one embodiment, the number of first keypoint sets is a plurality, and the first candidate keypoints in each first keypoint set are different.

As shown in fig. 9, a plurality of predetermined rules may be used to select first candidate key points associated with the central key point, so as to compose different sets. That is, different sets correspond to different link structures. The first key point sets may differ in the number of first candidate key points they contain, or in which key points they contain. Taking a person as the target object, the person may include 17 key points, and the 16 key points other than the central key point may all serve as first candidate key points in a first key point set. Alternatively, the N key points closest to the central key point may be selected as the first candidate key points in a first key point set.
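Two of the selection rules mentioned above can be sketched as follows. The sketch assumes that the linkage relationship is available as an adjacency map and that distances are measured on plane coordinates, since the spatial coordinates of the other key points are not yet known at this stage; both assumptions, and all names and values, are hypothetical.

```python
import math


def set_by_linkage(center, skeleton):
    """First key point set: the center plus the key points directly linked to it."""
    return [center] + sorted(skeleton.get(center, []))


def set_by_distance(center, plane_xy, n):
    """First key point set: the center plus the n key points closest to it."""
    cx, cy = plane_xy[center]
    others = [name for name in plane_xy if name != center]
    others.sort(
        key=lambda name: math.hypot(plane_xy[name][0] - cx, plane_xy[name][1] - cy)
    )
    return [center] + others[:n]


skeleton = {"pelvis": ["left_hip", "right_hip", "neck"]}
plane_xy = {
    "pelvis": (50.0, 50.0),
    "left_hip": (45.0, 52.0),
    "right_hip": (56.0, 52.0),
    "neck": (50.0, 20.0),
    "left_knee": (44.0, 70.0),
}
linkage_set = set_by_linkage("pelvis", skeleton)
nearest_set = set_by_distance("pelvis", plane_xy, n=3)
```

The two rules yield different sets here, which corresponds to two different link structures for the same target object.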

By the above scheme, each first candidate key point may obtain a plurality of different spatial coordinates. In the subsequent processing process, a plurality of different space coordinates can be utilized to obtain a final calculation result of the space coordinates of each first candidate key point. The difference between the final calculation result of the space coordinate and the true value is small, and the precision requirement is met.

The specific process of obtaining the final calculation result of the spatial coordinates of each first candidate key point by using multiple sets of different spatial coordinates will be described in detail later.

As shown in fig. 10, in one embodiment, the method further includes:

S1001: selecting, according to a preset rule and the association relation with a first candidate key point whose space coordinates have been determined, at least one second candidate key point from the other key points whose space coordinates have not been determined, and forming a second key point set from that first candidate key point and the second candidate key point;

S1002: determining, by using a space vector model, the position relation between the first candidate key point whose space coordinates have been determined and each second candidate key point in the second key point set;

S1003: determining the space coordinates of the corresponding second candidate key point by using the space coordinates of the first candidate key point whose space coordinates have been determined and the position relation between that first candidate key point and any second candidate key point in the second key point set;

and taking the spatial coordinates of the second candidate key points as the spatial coordinates of other key points.

In the case that the first candidate keypoints cannot cover all the keypoints of the target object, the keypoints of which the spatial coordinates are not determined may be regarded as the second candidate keypoints.

In determining the spatial coordinates of the second candidate keypoint, the first candidate keypoint having determined the spatial coordinates may be used. For example, for a first candidate keypoint for which spatial coordinates have been determined, a second candidate keypoint associated with the first candidate keypoint may be selected by the aforementioned predetermined rule, constituting a second set of keypoints.

According to the output of the second model, the positional relationship between the first candidate key point in the second key point set and each second candidate key point can be obtained. This positional relationship can then be used to obtain the spatial coordinates of each second candidate key point.

Further, when there are a plurality of first key point sets as in the foregoing embodiment, a plurality of second key point sets may be obtained correspondingly. That is, a plurality of spatial coordinates may be obtained for each second candidate key point. Similarly, the final calculation result of the spatial coordinates of each second candidate key point can be obtained from these different spatial coordinates.

Since the spatial coordinates of the second candidate key points are obtained from the spatial coordinates of the first candidate key points, the coordinate accuracy of the second candidate key points can also meet the requirement.
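The continued divergence from first candidate key points to second candidate key points can be sketched as a breadth-first traversal of one link structure. As a simplifying assumption, each positional relationship is again taken to be a 3D offset from an already-resolved key point to the next one; the function name, the key point names and the offsets are hypothetical.

```python
from collections import deque


def propagate(center_name, center_xyz, links, offsets):
    """Resolve spatial coordinates outward from the central key point.

    links: dict mapping a key point to the key points selected from it.
    offsets: dict mapping (parent, child) to an assumed (dx, dy, dz) offset.
    """
    coords = {center_name: center_xyz}
    queue = deque([center_name])
    while queue:
        parent = queue.popleft()
        px, py, pz = coords[parent]
        for child in links.get(parent, []):
            if child in coords:  # coordinates already determined
                continue
            dx, dy, dz = offsets[(parent, child)]
            coords[child] = (px + dx, py + dy, pz + dz)
            queue.append(child)
    return coords


links = {"pelvis": ["left_hip"], "left_hip": ["left_knee"], "left_knee": ["left_ankle"]}
offsets = {
    ("pelvis", "left_hip"): (-0.25, 0.0, 0.0),
    ("left_hip", "left_knee"): (0.0, -0.5, 0.25),
    ("left_knee", "left_ankle"): (0.0, -0.5, 0.0),
}
coords = propagate("pelvis", (0.0, 1.0, 3.0), links, offsets)
```

Here "left_hip" plays the role of a first candidate key point and "left_knee" that of a second candidate key point. Running the function once per link structure would give several spatial coordinates for the same key point.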

As shown in fig. 11, in one embodiment, step S103 may further include the sub-steps of:

S1101: for any other key point, determining the plane coordinates of each other key point by using a plane coordinate determination model;

S1102: correcting the space coordinates of other key points by utilizing the difference between the plane coordinates of other key points and the x-axis coordinates and the y-axis coordinates in the space coordinates.

In the current embodiment, the plane coordinate determination model may be a Gaussian heat map model.

For example, suppose the image resolution is 640×640 and the target object in the image is a person with 17 key points in total. Each (circular) key point occupies a small group of pixels in the image in the form of a Gaussian heat map. The coordinates at the center of that region may be taken as the plane coordinates of the key point.
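A minimal sketch of reading plane coordinates from a Gaussian heat map follows. It assumes one heat map per key point and takes the position of the maximum response as the center; the 64×64 size (smaller than the 640×640 example, to keep the sketch light), the sigma value and the function names are hypothetical.

```python
import math


def gaussian_heatmap(width, height, center, sigma=2.0):
    """Build a heat map whose response peaks at `center` and decays as a Gaussian."""
    cx, cy = center
    return [
        [
            math.exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2 * sigma ** 2))
            for x in range(width)
        ]
        for y in range(height)
    ]


def plane_coordinates(heatmap):
    """Return the (x, y) position of the maximum response in the heat map."""
    best_value, best_xy = -1.0, (0, 0)
    for y, row in enumerate(heatmap):
        for x, value in enumerate(row):
            if value > best_value:
                best_value, best_xy = value, (x, y)
    return best_xy


heatmap = gaussian_heatmap(64, 64, center=(20, 41))
peak = plane_coordinates(heatmap)
```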

The way to correct the spatial coordinates of the keypoints may include the following:

In the case that the difference between the plane coordinates of the key points and the x-axis coordinates and the y-axis coordinates in the space coordinates is not greater than the corresponding threshold value, the x-axis coordinates and the y-axis coordinates in the space coordinates can be replaced by the plane coordinates.

In the case that the difference between the plane coordinates of the key points and the x-axis coordinates and the y-axis coordinates in the space coordinates is greater than the corresponding threshold value, the x-axis coordinates and the y-axis coordinates in the space coordinates can be reserved.

Alternatively, different weights may be set correspondingly according to the magnitude of the difference. For example, a weight q₁ may be set for the plane coordinates, and, according to the size of the difference, a weight q₂ may be set for the x-axis coordinate and the y-axis coordinate in the space coordinates. The x-axis coordinate and the y-axis coordinate in the space coordinates can then be corrected by calculating the weighted sum.

By means of the scheme, the space coordinates can be corrected by means of the plane coordinates of the key points with relatively high accuracy. Since the depth value (z-axis coordinate) in the spatial coordinates is corrected according to the output result of the second model, in the present embodiment, the x-axis coordinate and the y-axis coordinate in the spatial coordinates are further corrected, so that the accuracy of the spatial coordinates is higher.
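The two correction modes described above can be sketched as follows. The sketch assumes that the plane coordinates and the x-axis and y-axis values of the spatial coordinates are expressed in the same frame, that one threshold applies to both axes, and that the weights q₁ and q₂ are normalized by their sum; these choices and the function names are assumptions, not details stated in the disclosure.

```python
def correct_by_threshold(spatial_xyz, plane_xy, threshold):
    """Replace x and y by the plane coordinates when the difference is within the threshold."""
    x, y, z = spatial_xyz
    u, v = plane_xy
    if abs(u - x) <= threshold and abs(v - y) <= threshold:
        return (u, v, z)
    return (x, y, z)  # difference too large: keep the spatial x and y


def correct_by_weights(spatial_xyz, plane_xy, q1, q2):
    """Blend plane coordinates (weight q1) with spatial x and y (weight q2)."""
    x, y, z = spatial_xyz
    u, v = plane_xy
    total = q1 + q2
    return ((q1 * u + q2 * x) / total, (q1 * v + q2 * y) / total, z)


replaced = correct_by_threshold((10.0, 20.0, 3.0), (11.0, 19.0), threshold=2.0)
kept = correct_by_threshold((10.0, 20.0, 3.0), (30.0, 19.0), threshold=2.0)
blended = correct_by_weights((10.0, 20.0, 3.0), (12.0, 22.0), q1=3.0, q2=1.0)
```

In both modes the z-axis coordinate is left untouched, since only the x-axis and y-axis coordinates are corrected at this step.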

As shown in fig. 12, in an embodiment, in a case where the same other key point includes a plurality of spatial coordinates, step S103 may further include the steps of:

S1201: assigning a weight to each space coordinate according to the difference between the plane coordinates of other key points and the x-axis coordinates and the y-axis coordinates in each space coordinate;

S1202: calculating the weighted sum of the space coordinates to obtain a calculation result;

S1203: taking the calculation result as a correction result of the space coordinates of other key points.

The other keypoints may correspond to the first candidate keypoint and the second candidate keypoint described previously. For any other key point, the plane coordinates of the key point may be taken as a reference value. And scoring each space coordinate according to the plane coordinate of the key point.

Scoring may be based on the difference between the x-axis and y-axis coordinates of each spatial coordinate and the plane coordinates. The smaller the difference, the higher the score.

For spatial coordinates with high scores, high weights may be assigned. Correspondingly, for coordinates with low scores, a low weight may be assigned.

Further, a corresponding threshold may also be set. In the case where the score is below the corresponding threshold, the spatial coordinates corresponding to the low score may be discarded.

For the remaining spatial coordinates, a weighted sum may be calculated and taken as the final result of the coordinates.

According to the scheme, multiple space coordinates of the same key point can be obtained according to different link structures, comprehensive calculation is performed by utilizing the multiple space coordinates, and a final result with high accuracy can be obtained.
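Steps S1201 to S1203, together with the optional discarding of low-scoring coordinates, can be sketched as follows. The scoring function 1 / (1 + distance), the use of the score itself as the weight, and the threshold value are assumptions chosen for illustration; the disclosure only requires that a smaller difference give a higher score and a higher weight.

```python
import math


def fuse_hypotheses(hypotheses, plane_xy, min_score=0.2):
    """Fuse several spatial coordinates of one key point into a single result.

    hypotheses: list of (x, y, z) obtained from different link structures.
    plane_xy: (u, v) plane coordinates of the same key point, used as reference.
    """
    u, v = plane_xy
    scored = []
    for (x, y, z) in hypotheses:
        distance = math.hypot(x - u, y - v)
        score = 1.0 / (1.0 + distance)  # smaller difference gives a higher score
        if score >= min_score:          # scores below the threshold are discarded
            scored.append((score, (x, y, z)))
    if not scored:
        return None
    total = sum(score for score, _ in scored)
    return tuple(
        sum(score * xyz[i] for score, xyz in scored) / total for i in range(3)
    )


hypotheses = [(10.0, 10.0, 3.0), (10.0, 10.0, 5.0), (100.0, 100.0, 9.0)]
fused = fuse_hypotheses(hypotheses, plane_xy=(10.0, 10.0))
```

The third coordinate lies far from the plane coordinates, so its score falls below the threshold and it does not contribute to the fused result.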

As shown in fig. 13, the present disclosure relates to an apparatus for keypoint detection, which may include the following components:

a spatial coordinate determining module 1301 for determining spatial coordinates of a central key point of the target object in the image;

the spatial coordinates determining module 1302 of other key points is configured to determine the spatial coordinates of other key points by using the spatial coordinates of the central key point and the positional relationship between the central key point and other key points of the target object;

the spatial coordinate correction module 1303 is configured to determine the plane coordinates of other key points, and correct the spatial coordinates of other key points by using the plane coordinates of other key points to obtain a detection result.

In one embodiment, the spatial coordinate determination module 1301 of the center key point may further include:

the initial information determining submodule is used for determining a detection frame of the target object and space coordinates to be revised of the central key point;

a key point determining submodule of the target object, which is used for determining the key points appearing in the detection frame as the key points of the target object;

and the space coordinate revising sub-module is used for revising the space coordinate to be revised of the central key point by utilizing the depth of the key point of the target object so as to obtain the space coordinate of the central key point of the target object.

In one embodiment, the spatial coordinate determination module 1302 of other keypoints may further comprise:

the first key point set determining submodule is used for selecting at least one first candidate key point from other key points according to a preset rule, and forming a first key point set by the first candidate key point and the central key point;

the position relation determining sub-module is used for respectively determining the position relation between the central key point and each first candidate key point in the first key point set by using the space vector model;

the space coordinate determination submodule is used for determining the space coordinate of the corresponding first candidate key point by utilizing the space coordinate of the central key point and the position relation between the central key point and any first candidate key point in the first key point set;

the second key point set determining sub-module is used for selecting at least one second candidate key point from other key points with the space coordinates which are not determined according to a preset rule and the association relation with the first candidate key points with the determined space coordinates, and forming a second key point set from the first candidate key points with the determined space coordinates and the second candidate key points;

the second candidate key point position relation determining submodule is used for respectively determining, by utilizing the space vector model, the position relation between the first candidate key point whose space coordinates have been determined and each second candidate key point in the second key point set;

a spatial coordinate determining sub-module of the second candidate key point, configured to determine the spatial coordinates of the corresponding second candidate key point by using the spatial coordinates of the first candidate key point whose spatial coordinates have been determined and the positional relationship between that first candidate key point and any second candidate key point in the second key point set;

In one embodiment, the spatial coordinate correction module 1303 may further include:

the plane coordinate determination submodule is used for determining the plane coordinate of each other key point by using a plane coordinate determination model for any other key point;

and the space coordinate correction execution sub-module is used for correcting the space coordinates of other key points by utilizing the difference between the plane coordinates of the other key points and the x-axis coordinates and the y-axis coordinates in the space coordinates.

In one embodiment, in the case that the same other key point includes a plurality of spatial coordinates, the spatial coordinate correction execution sub-module includes:

the weight distribution unit is used for assigning a weight to each space coordinate according to the difference between the plane coordinates of the other key point and the x-axis and y-axis coordinates in that space coordinate;

the weighted sum calculating unit is used for calculating the weighted sum of the space coordinates to obtain a calculation result, and taking the calculation result as the correction result of the space coordinates of the other key point.
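As a non-limiting illustration, the weighting of a plurality of space coordinates of the same key point may be sketched as follows. The present disclosure only states that the weights depend on the difference between the plane coordinates and the x-axis and y-axis coordinates; the inverse-distance weighting, the constant `eps`, and the function name `fuse_candidates` below are assumptions chosen for the sketch.

```python
import math
from typing import List, Tuple

Vec3 = Tuple[float, float, float]


def fuse_candidates(
    candidates: List[Vec3],
    plane: Tuple[float, float],
    eps: float = 1e-6,
) -> Vec3:
    """Fuse several space coordinates of one key point into a weighted sum.

    Each candidate receives a weight inversely related to the distance
    between its x/y components and the plane coordinates (assumed scheme).
    """
    u, v = plane
    # A smaller x/y difference yields a larger raw weight; eps avoids division by zero.
    raw = [1.0 / (math.hypot(x - u, y - v) + eps) for x, y, _ in candidates]
    total = sum(raw)
    weights = [w / total for w in raw]  # normalized so that the weights sum to 1
    fused_x = sum(w * c[0] for w, c in zip(weights, candidates))
    fused_y = sum(w * c[1] for w, c in zip(weights, candidates))
    fused_z = sum(w * c[2] for w, c in zip(weights, candidates))
    return (fused_x, fused_y, fused_z)
```

With this assumed scheme, a space coordinate whose x-axis and y-axis components agree closely with the plane coordinates dominates the result, and candidates that differ equally contribute equally.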

According to embodiments of the present disclosure, the present disclosure also provides an electronic device, a readable storage medium and a computer program product.

Fig. 14 shows a schematic block diagram of an electronic device 1400 that may be used to implement embodiments of the present disclosure. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital assistants, cellular telephones, smartphones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions are meant to be exemplary only, and are not meant to limit implementations of the disclosure described and/or claimed herein.

As shown in Fig. 14, the electronic device 1400 includes a computing unit 1410 that can perform various suitable actions and processes according to a computer program stored in a read-only memory (ROM) 1420 or a computer program loaded from a storage unit 1480 into a random access memory (RAM) 1430. The RAM 1430 may also store various programs and data needed for the operation of the device 1400. The computing unit 1410, the ROM 1420, and the RAM 1430 are connected to each other through a bus 1440. An input/output (I/O) interface 1450 is also connected to the bus 1440.

A number of components in the electronic device 1400 are connected to the I/O interface 1450, including: an input unit 1460, such as a keyboard or a mouse; an output unit 1470, such as various types of displays and speakers; a storage unit 1480, such as a magnetic disk or an optical disk; and a communication unit 1490, such as a network card, a modem, or a wireless communication transceiver. The communication unit 1490 allows the electronic device 1400 to exchange information/data with other devices through a computer network, such as the internet, and/or various telecommunications networks.

The computing unit 1410 may be any of a variety of general-purpose and/or special-purpose processing components having processing and computing capabilities. Some examples of the computing unit 1410 include, but are not limited to, a central processing unit (CPU), a graphics processing unit (GPU), various dedicated artificial intelligence (AI) computing chips, various computing units running machine learning model algorithms, a digital signal processor (DSP), and any suitable processor, controller, microcontroller, etc. The computing unit 1410 performs the various methods and processes described above, such as the method of keypoint detection. For example, in some embodiments, the method of keypoint detection may be implemented as a computer software program tangibly embodied on a machine-readable medium, such as the storage unit 1480. In some embodiments, part or all of the computer program may be loaded and/or installed onto the electronic device 1400 via the ROM 1420 and/or the communication unit 1490. When the computer program is loaded into the RAM 1430 and executed by the computing unit 1410, one or more steps of the method of keypoint detection described above may be performed. Alternatively, in other embodiments, the computing unit 1410 may be configured to perform the method of keypoint detection by any other suitable means (e.g., by means of firmware).

Various implementations of the systems and techniques described above may be realized in digital electronic circuitry, integrated circuit systems, field programmable gate arrays (FPGAs), application specific integrated circuits (ASICs), application specific standard products (ASSPs), systems on chip (SOCs), complex programmable logic devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various implementations may include implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be a special-purpose or general-purpose programmable processor, and which may receive data and instructions from, and transmit data and instructions to, a storage system, at least one input device, and at least one output device.

Program code for carrying out the methods of the present disclosure may be written in any combination of one or more programming languages. The program code may be provided to a processor or controller of a general-purpose computer, special-purpose computer, or other programmable data processing apparatus, such that the program code, when executed by the processor or controller, causes the functions/operations specified in the flowcharts and/or block diagrams to be implemented. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package partly on the machine and partly on a remote machine, or entirely on the remote machine or server.

In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. The machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.

To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and pointing device (e.g., a mouse or trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user may be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic input, speech input, or tactile input.

The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include local area networks (LANs), wide area networks (WANs), and the internet.

The computer system may include a client and a server. The client and server are typically remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other.

It should be appreciated that steps may be reordered, added, or deleted using the various forms of the flows shown above. For example, the steps recited in the present disclosure may be performed in parallel, sequentially, or in a different order, provided that the desired results of the technical solutions disclosed herein are achieved; no limitation is imposed herein.

The above detailed description should not be taken as limiting the scope of the present disclosure. It will be apparent to those skilled in the art that various modifications, combinations, sub-combinations and alternatives are possible, depending on design requirements and other factors. Any modifications, equivalent substitutions and improvements made within the spirit and principles of the present disclosure are intended to be included within the scope of the present disclosure.

Claims

1. A method of keypoint detection, comprising:

determining the space coordinates of a central key point of a target object in an image;

determining the space coordinates of other key points by utilizing the space coordinates of the central key point and the position relation between the central key point and the other key points of the target object;

determining plane coordinates of the other key points, and correcting the space coordinates of the other key points by using the plane coordinates of the other key points to obtain a detection result;

the determining the spatial coordinates of other key points by using the spatial coordinates of the central key point and the position relation between the central key point and the other key points of the target object includes:

selecting at least one first candidate key point from the other key points according to a preset rule, and forming a first key point set by the first candidate key point and the central key point;

determining the position relation between the central key point and each first candidate key point in the first key point set by using a space vector model;

determining the spatial coordinates of the corresponding first candidate key points by utilizing the spatial coordinates of the central key points and the position relation between the central key points and any one of the first candidate key points in the first key point set;

and taking the space coordinates of the first candidate key points as the space coordinates of the other key points.

2. The method of claim 1, wherein determining spatial coordinates of a central keypoint of the target object in the image comprises:

determining a detection frame of the target object and space coordinates to be revised of the central key point;

determining key points appearing inside the detection frame as key points of the target object;

and revising the space coordinates to be revised of the central key point by utilizing the depth of the key point of the target object so as to obtain the space coordinates of the central key point of the target object.

3. The method of claim 1, wherein there are a plurality of first key point sets, and the first candidate key points in the respective first key point sets are different.

4. A method according to claim 1 or 3, further comprising:

selecting at least one second candidate key point from other key points with undetermined space coordinates according to the preset rule and the association relation with the first candidate key point with the determined space coordinates, and forming a second key point set by the first candidate key point with the determined space coordinates and the second candidate key point;

determining the position relation between the first candidate key point with determined space coordinates and each second candidate key point in the second key point set by using the space vector model;

determining the spatial coordinates of the corresponding second candidate key point by utilizing the spatial coordinates of the first candidate key point with determined space coordinates and the position relation between that first candidate key point and any one of the second candidate key points in the second key point set;

and taking the space coordinates of the second candidate key points as the space coordinates of the other key points.

5. The method of claim 1, wherein the determining the plane coordinates of the other key points, and correcting the spatial coordinates of the other key points by using the plane coordinates of the other key points to obtain the detection result, includes:

for each of the other key points, determining the plane coordinates of that key point by using a plane coordinate determining model;

and correcting the space coordinates of the other key points by utilizing the difference between the plane coordinates of the other key points and the x-axis coordinates and the y-axis coordinates in the space coordinates.

6. The method according to claim 5, wherein, in the case that the same other key point includes a plurality of spatial coordinates, the correcting the spatial coordinates of the other key point by using differences between the planar coordinates of the other key point and x-axis coordinates and y-axis coordinates of the spatial coordinates includes:

assigning a weight to each space coordinate according to the difference between the plane coordinates of the other key points and the x-axis coordinates and the y-axis coordinates in that space coordinate;

calculating the weighted sum of the space coordinates to obtain a calculation result;

and taking the calculation result as a correction result of the space coordinates of the other key points.

7. An apparatus for keypoint detection, comprising:

the space coordinate determining module of the central key point is used for determining the space coordinates of the central key point of a target object in an image;

the space coordinate determining module of other key points is used for determining the space coordinates of other key points by utilizing the space coordinates of the central key point and the position relation between the central key point and the other key points of the target object;

the space coordinate correction module is used for determining the plane coordinates of the other key points, and correcting the space coordinates of the other key points by utilizing the plane coordinates of the other key points to obtain a detection result;

the space coordinate determining module of the other key points comprises:

a first key point set determining sub-module, configured to select at least one first candidate key point from the other key points according to a preset rule, and form a first key point set from the first candidate key point and the central key point;

the position relation determining sub-module is used for determining the position relation between the central key point and each first candidate key point in the first key point set by utilizing a space vector model;

the space coordinate determining sub-module is used for determining the space coordinates of the corresponding first candidate key point by utilizing the space coordinates of the central key point and the position relation between the central key point and any one of the first candidate key points in the first key point set.

8. The apparatus of claim 7, wherein the spatial coordinate determination module of the center keypoint comprises:

9. The apparatus of claim 7, wherein there are a plurality of first key point sets, and the first candidate key points in the respective first key point sets are different.

10. The apparatus of claim 7 or 8, wherein the spatial coordinate determination module of the other key points further comprises:

the second key point set determining sub-module is used for selecting at least one second candidate key point from other key points with undetermined space coordinates according to the preset rule and the association relation with the first candidate key points with the determined space coordinates, and forming a second key point set by the first candidate key points with the determined space coordinates and the second candidate key points;

a second candidate key point position relation determining sub-module, configured to determine, using the space vector model, the position relation between the first candidate key point with determined space coordinates and each of the second candidate key points in the second key point set;

a spatial coordinate determining sub-module of the second candidate key point, configured to determine the spatial coordinates of the corresponding second candidate key point by using the spatial coordinates of the first candidate key point with determined space coordinates and the position relation between that first candidate key point and any one of the second candidate key points in the second key point set.

11. The apparatus of claim 7, wherein the spatial coordinate correction module comprises:

the plane coordinate determining sub-module is used for determining, for each of the other key points, the plane coordinates of that key point by using a plane coordinate determining model;

and the space coordinate correction execution sub-module is used for correcting the space coordinates of the other key points by utilizing the difference between the plane coordinates of the other key points and the x-axis coordinates and the y-axis coordinates in the space coordinates.

12. The apparatus of claim 11, wherein, in a case where the same other keypoint includes a plurality of spatial coordinates, the spatial coordinate correction performing sub-module comprises:

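the weight distribution unit is used for assigning a weight to each space coordinate according to the difference between the plane coordinates of the other key points and the x-axis coordinates and the y-axis coordinates in that space coordinate;

the weighted sum calculating unit is used for calculating the weighted sum of the space coordinates to obtain a calculation result, and taking the calculation result as a correction result of the space coordinates of the other key points.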

13. An electronic device, comprising:

at least one processor; and

a memory communicatively coupled to the at least one processor; wherein,

the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1 to 6.

14. A non-transitory computer readable storage medium storing computer instructions for causing a computer to perform the method of any one of claims 1 to 6.