CN109284681A - Position and posture detection method and device, electronic equipment and storage medium - Google Patents


Info

Publication number
CN109284681A
CN109284681A (application CN201810950565.4A, granted as CN109284681B)
Authority
CN
China
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201810950565.4A
Other languages
Chinese (zh)
Other versions
CN109284681B (en)
Inventor
汪旻
刘文韬
钱晨
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Sensetime Technology Development Co Ltd
Original Assignee
Beijing Sensetime Technology Development Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Sensetime Technology Development Co Ltd filed Critical Beijing Sensetime Technology Development Co Ltd
Priority to CN201810950565.4A priority Critical patent/CN109284681B/en
Publication of CN109284681A publication Critical patent/CN109284681A/en
Priority to KR1020207030384A priority patent/KR102324001B1/en
Priority to US17/049,674 priority patent/US11107239B2/en
Priority to EP19853007.3A priority patent/EP3770803A4/en
Priority to MYPI2020005562A priority patent/MY188075A/en
Priority to JP2020558949A priority patent/JP7074888B2/en
Priority to SG11202010514SA priority patent/SG11202010514SA/en
Priority to PCT/CN2019/093697 priority patent/WO2020038111A1/en
Application granted granted Critical
Publication of CN109284681B publication Critical patent/CN109284681B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/20Movements or behaviour, e.g. gesture recognition

Abstract

The present disclosure relates to a pose detection method and apparatus, an electronic device, and a storage medium. The method includes: determining first position information of each first feature portion of a target object in a target image, where the target image is captured by an imaging device; determining three-dimensional position information of second feature portions of the target object based on the first position information and device parameters of the imaging device, where the second feature portions include at least each first feature portion; and determining the spatial pose of the target object based on the corresponding first position information and three-dimensional position information. Embodiments of the present disclosure can improve the accuracy of pose detection.

Description

Position and posture detection method and device, electronic equipment and storage medium
Technical field
The present disclosure relates to the technical field of image processing, and in particular to a pose detection method and apparatus, an electronic device, and a storage medium.
Background technique
In computer vision, human pose estimation is an important human–computer interaction interface; in particular, estimating the posture and location of a human body in three-dimensional space is the most fundamental and critical task in the interaction process.
In conventional techniques, relatively expensive depth camera devices are used to build a comparatively coarse three-dimensional model of the human body in order to predict the pose of the human body relative to the camera. However, most imaging devices today provide only RGB three-channel data, so the conventional approach cannot be widely adopted.
With RGB cameras, existing techniques are limited to two-dimensional human keypoint detection and rough estimation of three-dimensional human pose; they cannot locate the size and position of the human body within the full three-dimensional space, which means it cannot be predicted what movement the person is performing.
In recent years, advances in deep learning algorithms have made direct prediction of three-dimensional human pose from images the preferred approach for most digital applications, but directly predicting pose with a deep neural network still fails to achieve satisfactory accuracy.
Summary of the invention
Embodiments of the present disclosure provide a pose detection method and apparatus, an electronic device, and a storage medium that perform pose detection in combination with the device parameters of the imaging device so as to improve detection accuracy.
According to one aspect of the present disclosure, a pose detection method is provided, including:
determining first position information of each first feature portion of a target object in a target image, where the target image is captured by an imaging device;
determining three-dimensional position information of second feature portions of the target object based on the first position information and device parameters of the imaging device, where the second feature portions include at least each first feature portion; and
determining the spatial pose of the target object based on the corresponding first position information and three-dimensional position information.
In embodiments of the present disclosure, determining the first position information of each first feature portion of the target object in the target image includes:
obtaining information of the first feature portions to be identified;
identifying each first feature portion in the target object based on the obtained information of the first feature portions; and
determining the first position information of each first feature portion based on an established two-dimensional coordinate system.
In embodiments of the present disclosure, determining the three-dimensional position information of the second feature portions of the target object based on the first position information and the device parameters of the imaging device includes:
performing normalization on each piece of first position information based on the device parameters of the imaging device to obtain second position information; and
determining the three-dimensional position information of the second feature portions using each piece of second position information.
In embodiments of the present disclosure, performing normalization on each piece of first position information based on the device parameters of the imaging device to obtain the second position information includes:
performing a first normalization on the first position information using the device parameters to obtain third position information of each first feature portion;
determining the mean and variance of the third position information of each first feature portion; and
performing a second normalization on each piece of third position information based on the mean and variance to obtain the second position information.
In embodiments of the present disclosure, performing the first normalization on the first position information using the device parameters to obtain the third position information of each first feature portion includes:
performing undistortion on the first position information using the device parameters; and
performing the first normalization on the undistorted first position information to obtain the third position information of each first feature portion.
In embodiments of the present disclosure, performing undistortion on the first position information using the device parameters includes:
performing undistortion on the first position information using a first formula, where the first formula includes:
x' = (x - cx)/fx
y' = (y - cy)/fy
r² = x'² + y'²
t = (1 + k4·r² + k5·r⁴ + k6·r⁶)/(1 + k1·r² + k2·r⁴ + k3·r⁶)
Δx = 2·p1·x'·y' + p2·(r² + 2x'²)
Δy = p1·(r² + 2y'²) + 2·p2·x'·y'
u' = (x' - Δx)·t
v' = (y' - Δy)·t
u = u'·fx + cx
v = v'·fy + cy
where fx is the focal length of the imaging device along the x axis; fy is its focal length along the y axis; cx and cy are respectively the abscissa and ordinate of the optical-center coordinate position of the imaging device; k1–k6 are the radial distortion parameters of the imaging device; p1 and p2 are its tangential distortion parameters; x and y are respectively the abscissa and ordinate of the first position information; and u and v are respectively the abscissa and ordinate after undistortion.
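As a concrete illustration, the first formula can be sketched as follows. This is a non-authoritative sketch: the disclosure lists the radial parameters k1–k6 but does not spell out the factor t, so the rational six-coefficient radial model is assumed here.

```python
import numpy as np

def undistort_points(pts, fx, fy, cx, cy, k, p):
    """Apply the first formula: map first position information (distorted
    pixel coordinates) to undistorted pixel coordinates (u, v).

    pts: (N, 2) array of pixel coordinates; k: radial distortion
    parameters k1..k6; p: tangential distortion parameters (p1, p2).
    """
    x = (pts[:, 0] - cx) / fx                      # x'
    y = (pts[:, 1] - cy) / fy                      # y'
    r2 = x * x + y * y                             # r^2
    # radial correction factor t (assumed rational model; equals 1 when
    # all k are zero)
    t = (1 + k[3]*r2 + k[4]*r2**2 + k[5]*r2**3) / \
        (1 + k[0]*r2 + k[1]*r2**2 + k[2]*r2**3)
    dx = 2*p[0]*x*y + p[1]*(r2 + 2*x*x)            # tangential offsets
    dy = p[0]*(r2 + 2*y*y) + 2*p[1]*x*y
    u = (x - dx) * t                               # u'
    v = (y - dy) * t                               # v'
    return np.stack([u*fx + cx, v*fy + cy], axis=1)
```

A quick sanity check: with all distortion parameters zero, the mapping is the identity.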
In embodiments of the present disclosure, performing the second normalization on each piece of third position information based on the mean and variance to obtain the second position information includes:
performing the second normalization on the third position information based on the mean and variance using a second formula, where the second formula includes:
s = (xi - mean(x))/std(x)
t = (yi - mean(y))/std(y)
where s and t respectively denote the abscissa and ordinate of the second position information; xi and yi respectively denote the abscissa and ordinate of the third position information of the i-th first feature portion; mean denotes the mean function; std denotes the standard-deviation function; and i is a positive integer.
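The second formula is a per-axis zero-mean, unit-deviation normalization over all first feature portions; a minimal sketch under that reading:

```python
import numpy as np

def second_normalization(third_pos):
    """Second formula: s_i = (x_i - mean(x)) / std(x) and
    t_i = (y_i - mean(y)) / std(y), computed over all feature portions.

    third_pos: (N, 2) array of third position information.
    Returns the second position information plus the (mean, std) that the
    later inverse normalization needs.
    """
    mean = third_pos.mean(axis=0)
    std = third_pos.std(axis=0)
    return (third_pos - mean) / std, mean, std
```

Keeping the returned statistics around is what makes the inverse normalization of the third formula possible later in the pipeline.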
In embodiments of the present disclosure, determining the three-dimensional position information of the second feature portions using each piece of second position information includes:
obtaining the three-dimensional position information of the second feature portions of the target object from the second position information of each first feature portion using a preset model,
where the preset model includes a deep learning model.
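The disclosure only states that the preset model is a deep learning model; its architecture is not specified. The hypothetical two-layer MLP below merely illustrates the input/output contract — 2-D normalized joints in, 3-D joints out — and every name and dimension in it is an assumption.

```python
import numpy as np

def lift_2d_to_3d(joints_2d, W1, b1, W2, b2):
    """Hypothetical preset model: a two-layer MLP mapping J normalized
    2-D joints (flattened to 2*J inputs) to J 3-D joints (3*J outputs).
    """
    h = np.maximum(joints_2d.reshape(-1) @ W1 + b1, 0.0)  # ReLU hidden layer
    return (h @ W2 + b2).reshape(-1, 3)

# toy weights for 14 joints (head, neck, shoulders, elbows, wrists,
# hips, knees, ankles) and an assumed 64-unit hidden layer
rng = np.random.default_rng(0)
J, H = 14, 64
W1, b1 = rng.normal(size=(2*J, H)) * 0.1, np.zeros(H)
W2, b2 = rng.normal(size=(H, 3*J)) * 0.1, np.zeros(3*J)
```

In practice the weights would be learned; the sketch only shows how normalized second position information flows through to fourth position information in three-dimensional form.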
In embodiments of the present disclosure, determining the three-dimensional position information of the second feature portions of the target object based on the first position information and the device parameters of the imaging device includes:
performing normalization on each piece of first position information based on the device parameters of the imaging device to obtain second position information;
determining fourth position information, in three-dimensional form, of the second feature portions using each piece of second position information; and
performing inverse normalization on each piece of fourth position information to obtain the three-dimensional position information of each second feature portion.
In embodiments of the present disclosure, performing inverse normalization on each piece of fourth position information to obtain the three-dimensional position information of each second feature portion includes:
performing inverse normalization on each piece of fourth position information using a third formula to obtain the three-dimensional position information, where the third formula includes:
X' = X·std(X) + mean(X)
Y' = Y·std(Y) + mean(Y)
Z' = Z·std(Z) + mean(Z)
where X', Y', and Z' respectively denote the three coordinate values of the three-dimensional position information; X, Y, and Z respectively denote the three coordinate values of the fourth position information; std denotes the standard-deviation function; and mean denotes the mean function.
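The third formula simply undoes the earlier per-axis normalization; a minimal sketch, assuming the per-axis mean and std were recorded during the forward pass:

```python
import numpy as np

def inverse_normalization(fourth_pos, mean, std):
    """Third formula, per axis: X' = X * std(X) + mean(X).

    fourth_pos: (N, 3) fourth position information; mean, std: per-axis
    statistics from the forward normalization.
    """
    return fourth_pos * std + mean
```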
In embodiments of the present disclosure, determining the spatial pose of the target object based on the corresponding first position information and three-dimensional position information includes:
determining correction parameters based on the first position information and the three-dimensional position information of the same feature portion;
correcting the three-dimensional position information based on the correction parameters; and
determining the spatial pose of the target object based on the corrected three-dimensional position information.
In embodiments of the present disclosure, determining the correction parameters based on the first position information and the three-dimensional position information of the same feature portion includes:
converting the three-dimensional position information into fifth position information in two-dimensional form using a rotation matrix and a translation matrix;
adjusting the rotation matrix and the translation matrix by feedback based on the difference between the fifth position information and the second position information until the difference meets a preset requirement; and
determining the correction parameters based on the rotation matrix and the translation matrix obtained when the difference meets the preset requirement.
In embodiments of the present disclosure, converting the three-dimensional position information into the fifth position information in two-dimensional form using the rotation matrix and the translation matrix includes:
converting the three-dimensional position information into the fifth position information in two-dimensional form using the rotation matrix and the translation matrix through a fourth formula, where the fourth formula includes:
S5 = K[R | T]S3
where K is the intrinsic-parameter matrix of the imaging device, constructed from fx (the focal length along the x axis), fy (the focal length along the y axis), and cx and cy (respectively the abscissa and ordinate of the optical-center coordinate position); R is the rotation matrix; T is the translation matrix; S5 is the fifth position information; and S3 is the three-dimensional position information.
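The fourth formula is a standard pinhole projection; a minimal sketch under that reading, with illustrative intrinsic values:

```python
import numpy as np

def project(S3, K, R, T):
    """Fourth formula S5 = K [R | T] S3: project 3-D position information
    into two-dimensional fifth position information.

    S3: (N, 3) points; K: 3x3 intrinsic matrix; R: 3x3 rotation matrix;
    T: (3,) translation vector.
    """
    cam = S3 @ R.T + T                 # rigid transform [R | T]
    hom = cam @ K.T                    # apply intrinsics K
    return hom[:, :2] / hom[:, 2:3]    # perspective division

# example intrinsics built from fx, fy, cx, cy (values are assumptions)
K = np.array([[500.0,   0.0, 320.0],
              [  0.0, 500.0, 240.0],
              [  0.0,   0.0,   1.0]])
```

A point on the optical axis projects to the optical center (cx, cy), which gives a quick correctness check.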
In embodiments of the present disclosure, adjusting the rotation matrix and the translation matrix by feedback based on the difference between the fifth position information and the second position information until the difference meets the preset requirement includes:
performing the feedback adjustment of the rotation matrix and the translation matrix using an optimization model, where the expression of the optimization model includes:
R, T = arg min‖K[R | T]S3 - S2‖
where arg min denotes minimization of the difference and S2 denotes the second position information.
In embodiments of the present disclosure, correcting the three-dimensional position information based on the correction parameters includes:
correcting the three-dimensional position information using a fifth formula, where the fifth formula includes:
P' = P·R + T
where P is the three-dimensional position information before correction, P' is the three-dimensional position information after correction, R is the rotation matrix, and T is the translation matrix.
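The fifth formula applies the optimized rotation and translation directly to the 3-D points. The sketch below shows that application together with a deliberately simplified feedback loop: as an assumption, R is held fixed and only T is refined by numeric gradient descent on the pixel difference, whereas the disclosure's optimization adjusts both R and T.

```python
import numpy as np

def correct_positions(P, R, T):
    """Fifth formula P' = P * R + T (row-vector convention)."""
    return P @ R + T

def refine_translation(S3, S2_px, K, R, T, steps=200, lr=1e-6):
    """Toy feedback adjustment: shrink the S5-vs-S2 pixel difference by
    numeric gradient descent on T only (R assumed known)."""
    def err(T):
        cam = S3 @ R.T + T
        hom = cam @ K.T
        S5 = hom[:, :2] / hom[:, 2:3]
        return ((S5 - S2_px) ** 2).sum()
    for _ in range(steps):
        g = np.zeros(3)
        for i in range(3):              # central-difference gradient
            d = np.zeros(3); d[i] = 1e-5
            g[i] = (err(T + d) - err(T - d)) / 2e-5
        T = T - lr * g
    return T
```

A production implementation would optimize R and T jointly (for example with a PnP-style solver) rather than this illustrative loop.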
In embodiments of the present disclosure, determining the spatial pose of the target object based on the corresponding first position information and three-dimensional position information further includes:
determining the same feature portion in the first feature portions and the second feature portions based on first identifiers of the first feature portions and second identifiers of the second feature portions.
In embodiments of the present disclosure, the method further includes:
obtaining the target image; and
identifying the target object in the target image.
In embodiments of the present disclosure, the first feature portions include at least one of: head, neck, shoulder, elbow, wrist, hip, knee, and ankle.
According to a second aspect of the embodiments of the present disclosure, a pose detection apparatus is provided, including:
a first determining module configured to determine first position information of each first feature portion of a target object in a target image, where the target image is captured by an imaging device;
a second determining module configured to determine three-dimensional position information of second feature portions of the target object based on the first position information and device parameters of the imaging device, where the second feature portions include at least each first feature portion; and
a third determining module configured to determine the spatial pose of the target object based on the corresponding first position information and three-dimensional position information.
In embodiments of the present disclosure, the first determining module includes:
an information obtaining unit configured to obtain information of the first feature portions to be identified;
a feature identification unit configured to identify each first feature portion in the target object based on the obtained information of the first feature portions; and
a two-dimensional position determining unit configured to determine the first position information of each first feature portion based on an established two-dimensional coordinate system.
In embodiments of the present disclosure, the second determining module includes:
a normalization unit configured to perform normalization on each piece of first position information based on the device parameters of the imaging device to obtain second position information; and
a three-dimensional position determining unit configured to determine the three-dimensional position information of the second feature portions using each piece of second position information.
In embodiments of the present disclosure, the normalization unit is further configured to perform a first normalization on the first position information using the device parameters to obtain third position information of each first feature portion, determine the mean and variance of the third position information of each first feature portion, and perform a second normalization on each piece of third position information based on the mean and variance to obtain the second position information.
In embodiments of the present disclosure, the normalization unit is further configured to perform undistortion on the first position information using the device parameters, and perform the first normalization on the undistorted first position information to obtain the third position information of each first feature portion.
In embodiments of the present disclosure, the normalization unit is further configured to perform undistortion on the first position information using the first formula, where the first formula includes:
x' = (x - cx)/fx
y' = (y - cy)/fy
r² = x'² + y'²
t = (1 + k4·r² + k5·r⁴ + k6·r⁶)/(1 + k1·r² + k2·r⁴ + k3·r⁶)
Δx = 2·p1·x'·y' + p2·(r² + 2x'²)
Δy = p1·(r² + 2y'²) + 2·p2·x'·y'
u' = (x' - Δx)·t
v' = (y' - Δy)·t
u = u'·fx + cx
v = v'·fy + cy
where fx is the focal length of the imaging device along the x axis; fy is its focal length along the y axis; cx and cy are respectively the abscissa and ordinate of the optical-center coordinate position of the imaging device; k1–k6 are the radial distortion parameters of the imaging device; p1 and p2 are its tangential distortion parameters; x and y are respectively the abscissa and ordinate of the first position information; and u and v are respectively the abscissa and ordinate after undistortion.
In embodiments of the present disclosure, the normalization unit is further configured to perform the second normalization on the third position information based on the mean and variance using the second formula, where the second formula includes:
s = (xi - mean(x))/std(x)
t = (yi - mean(y))/std(y)
where s and t respectively denote the abscissa and ordinate of the second position information; xi and yi respectively denote the abscissa and ordinate of the third position information of the i-th first feature portion; mean denotes the mean function; std denotes the standard-deviation function; and i is a positive integer.
In embodiments of the present disclosure, the three-dimensional position determining unit is further configured to obtain the three-dimensional position information of the second feature portions of the target object from the second position information of each first feature portion using a preset model, where the preset model includes a deep learning model.
In embodiments of the present disclosure, the second determining module includes:
a normalization unit configured to perform normalization on each piece of first position information based on the device parameters of the imaging device to obtain second position information;
a three-dimensional position determining unit configured to determine fourth position information, in three-dimensional form, of the second feature portions using each piece of second position information; and
an inverse normalization unit configured to perform inverse normalization on each piece of fourth position information to obtain the three-dimensional position information of each second feature portion.
In embodiments of the present disclosure, the inverse normalization unit is further configured to perform inverse normalization on each piece of fourth position information using the third formula to obtain the three-dimensional position information, where the third formula includes:
X' = X·std(X) + mean(X)
Y' = Y·std(Y) + mean(Y)
Z' = Z·std(Z) + mean(Z)
where X', Y', and Z' respectively denote the three coordinate values of the three-dimensional position information; X, Y, and Z respectively denote the three coordinate values of the fourth position information; std denotes the standard-deviation function; and mean denotes the mean function.
In embodiments of the present disclosure, the third determining module includes:
a correction parameter determining unit configured to determine correction parameters based on the first position information and the three-dimensional position information of the same feature portion;
a correction unit configured to correct the three-dimensional position information based on the correction parameters; and
a pose determining unit configured to determine the spatial pose of the target object based on the corrected three-dimensional position information.
In embodiments of the present disclosure, the correction parameter determining unit is further configured to convert the three-dimensional position information into fifth position information in two-dimensional form using a rotation matrix and a translation matrix, adjust the rotation matrix and the translation matrix by feedback based on the difference between the fifth position information and the second position information until the difference meets a preset requirement, and determine the correction parameters based on the rotation matrix and the translation matrix obtained when the difference meets the preset requirement.
In embodiments of the present disclosure, the correction parameter determining unit is further configured to convert the three-dimensional position information into the fifth position information in two-dimensional form using the rotation matrix and the translation matrix through the fourth formula, where the fourth formula includes:
S5 = K[R | T]S3
where K is the intrinsic-parameter matrix of the imaging device, constructed from fx (the focal length along the x axis), fy (the focal length along the y axis), and cx and cy (respectively the abscissa and ordinate of the optical-center coordinate position); S5 is the fifth position information; and S3 is the three-dimensional position information.
In embodiments of the present disclosure, the correction parameter determining unit is further configured to perform the feedback adjustment of the rotation matrix and the translation matrix using an optimization model, where the expression of the optimization model includes:
R, T = arg min‖K[R | T]S3 - S2‖
where arg min denotes minimization of the difference and S2 denotes the second position information.
In embodiments of the present disclosure, the correction unit is further configured to correct the three-dimensional position information using the fifth formula, where the fifth formula includes:
P' = P·R + T
where P is the three-dimensional position information before correction, P' is the three-dimensional position information after correction, R is the rotation matrix, and T is the translation matrix.
In embodiments of the present disclosure, the third determining module further includes:
a matching unit configured to determine the same feature portion in the first feature portions and the second feature portions based on first identifiers of the first feature portions and second identifiers of the second feature portions.
In embodiments of the present disclosure, the apparatus further includes:
an image obtaining module configured to obtain the target image; and
an object identification module configured to identify the target object in the target image.
In embodiments of the present disclosure, the first feature portions include at least one of: head, neck, shoulder, elbow, wrist, hip, knee, and ankle.
According to a third aspect of the embodiments of the present disclosure, an electronic device is provided, including:
a processor; and
a memory for storing processor-executable instructions,
where the processor is configured to execute the method of any one of claims 1 to 18.
According to a fourth aspect of the embodiments of the present disclosure, a computer-readable storage medium is provided, on which computer program instructions are stored, where the computer program instructions, when executed by a processor, implement the method of any one of claims 1 to 18.
Embodiments of the present disclosure perform pose detection of an object in an image in combination with the device parameters, which eliminates the influence of differing device parameters on pose detection and improves its accuracy. In addition, embodiments of the present disclosure can use the difference between the first position information and the three-dimensional position information to determine the correction parameters for adjusting the three-dimensional position information, thereby further improving the accuracy of pose detection.
It should be understood that the above general description and the following detailed description are merely exemplary and explanatory, and do not limit the present disclosure.
Other features and aspects of the present disclosure will become apparent from the following detailed description of exemplary embodiments with reference to the accompanying drawings.
Brief description of the drawings
The accompanying drawings, which are incorporated into and form part of this specification, show embodiments consistent with the present disclosure and, together with the specification, serve to explain the technical solutions of the present disclosure.
Fig. 1 shows a flowchart of a pose detection method according to embodiments of the present disclosure;
Fig. 2 shows a flowchart of step S100 in the pose detection method according to embodiments of the present disclosure;
Fig. 3 shows the correspondence between a target image and the identified first feature portions according to embodiments of the present disclosure;
Fig. 4 shows a flowchart of step S200 in the pose detection method according to embodiments of the present disclosure;
Fig. 5 shows a schematic structural diagram of a second preset model for determining the three-dimensional position information of the second feature portions using each piece of second position information according to embodiments of the present disclosure;
Fig. 6 shows a flowchart of step S201 in the pose detection method according to embodiments of the present disclosure;
Fig. 7 shows a flowchart of step S2011 in the pose detection method according to embodiments of the present disclosure;
Fig. 8 shows a flowchart of step S200 in the pose detection method according to embodiments of the present disclosure;
Fig. 9 shows a flowchart of step S300 in the pose detection method according to embodiments of the present disclosure;
Fig. 10 shows a flowchart of step S301 in the pose detection method according to embodiments of the present disclosure;
Fig. 11 shows a block diagram of a pose detection apparatus according to embodiments of the present disclosure;
Fig. 12 shows a block diagram of an electronic device according to embodiments of the present disclosure;
Fig. 13 shows a block diagram of an electronic device according to embodiments of the present disclosure.
Specific embodiment
Various exemplary embodiments, features, and aspects of the present disclosure are described in detail below with reference to the accompanying drawings. The same reference numerals in the drawings denote elements with identical or similar functions. Although various aspects of the embodiments are shown in the drawings, the drawings are not necessarily drawn to scale unless otherwise indicated.
The word "exemplary" herein means "serving as an example, embodiment, or illustration." Any embodiment described herein as "exemplary" should not be construed as preferred or advantageous over other embodiments.
The term "and/or" herein merely describes an association relationship between associated objects and indicates that three relationships may exist; for example, A and/or B may indicate three cases: A exists alone, both A and B exist, and B exists alone. In addition, the term "at least one" herein indicates any one of multiple items or any combination of at least two of them; for example, "at least one of A, B, and C" may indicate any one or more elements selected from the set consisting of A, B, and C.
In addition, numerous specific details are given in the following detailed description to better explain the present disclosure. Those skilled in the art will understand that the present disclosure can be implemented without certain specific details. In some instances, methods, means, elements, and circuits well known to those skilled in the art are not described in detail in order to highlight the gist of the present disclosure.
It will be appreciated that the method embodiments mentioned in the present disclosure may be combined with each other to form combined embodiments without departing from their principles and logic; due to space limitations, such combinations are not repeated in the present disclosure.
In addition, the present disclosure further provides an image processing apparatus, an electronic device, a computer-readable storage medium, and a program, all of which can be used to implement any of the image processing methods provided by the present disclosure; the corresponding technical solutions and descriptions can be found in the corresponding parts of the method description and are not repeated.
Fig. 1 shows a flowchart of a pose detection method according to embodiments of the present disclosure. As shown in Fig. 1, the pose detection method may include:
S100: determining first position information of each first feature portion of a target object in a target image, where the target image is captured by an imaging device;
S200: determining three-dimensional position information of second feature portions of the target object based on the first position information and device parameters of the imaging device, where the second feature portions include at least each first feature portion; and
S300: determining the spatial pose of the target object based on the corresponding first position information and three-dimensional position information.
The pose detection method provided by the embodiments of the present disclosure can be used for pose detection of a human object in an image, where the pose may include the position information and the spatial posture of each feature part of the human object. The posture may include states such as running, sitting, standing, walking, jumping, crawling, lying down, and flying, and may also be other states; any state that can serve as a spatial state of the target object can be a posture type recognized by the embodiments of the present disclosure. In addition, the embodiments of the present disclosure can perform position optimization of each feature part of the human object in combination with the parameters of the image capture device that captured the image, and can further optimize each piece of position information using determined correction parameters, thereby improving the accuracy of pose detection.
The target image in this embodiment refers to the image on which pose detection is performed, and the object in the target image on which pose detection is performed is referred to as the target object; the target object may include a person, an animal, and the like. In the embodiments of the present disclosure, the target image may first be obtained; for example, the target image may be selected from stored image data, may be received from another device, or may be captured directly by the image capture device. The above are merely exemplary illustrations of obtaining the target image, and the present disclosure is not limited thereto.
After the target image is obtained, the target object in the target image can be recognized. The target object may be recognized by an image recognition algorithm, or by a trained machine learning network model, where the machine learning network model may include a neural network model, a deep learning neural network model, or the like; the present disclosure is not limited in this respect either. The embodiments of the present disclosure are described taking a person as the target object; in other embodiments the target object may also be an animal, a cartoon character, or the like.
In step S100, after the target object is recognized, the first position information of each first feature part of the target object can be determined. The first feature parts of the target object are key feature positions on the target object, and may include, for example, at least one of: the head, neck, shoulders, elbows, wrists, hips, knees, and ankles. The shoulders can be divided into a left shoulder and a right shoulder, the elbows into a left elbow and a right elbow, the wrists into a left wrist and a right wrist, the hips into a left hip and a right hip, the knees into a left knee and a right knee, and the ankles into a left ankle and a right ankle. The recognition of the above first feature parts may be performed by a preset feature recognition algorithm, or by a trained machine learning network model. After each first feature part is recognized, its position information can be determined. For the recognition and determination of the first feature parts, the target image may be input directly into a first preset model trained in advance, which can directly recognize each first feature part of the target object in the target image; alternatively, each first feature part in the target object may be recognized directly using a first preset algorithm. The embodiments of the present disclosure may use existing means to perform the training and establishment of the first preset model, and are not limited in this respect; the first preset algorithm may include any feature recognition algorithm.
In addition, the embodiments of the present disclosure may also first obtain information on the first feature parts to be recognized, and then recognize the positions of the corresponding first feature parts. Fig. 2 shows a flowchart of step S100 in the pose detection method according to an embodiment of the present disclosure, where step S100 may include:
S101: obtaining information on the first feature parts to be recognized;
S102: recognizing each first feature part in the target object based on the obtained information on the first feature parts;
S103: determining the first position information of each first feature part based on an established two-dimensional coordinate system.
First, the information on the first feature parts to be recognized can be obtained. As described above, the first feature parts may include at least one of the head, neck, shoulders, elbows, wrists, hips, knees, and ankles; the information on the first feature parts obtained at this point includes identifiers of the first feature parts to be recognized, where an identifier may be the name of the corresponding feature part, or may be a preset number that uniquely corresponds to the first feature part.
After the information on the first feature parts is obtained, the recognition operation for the first feature parts can be performed, where the recognition may be performed according to a first preset algorithm or a first preset model. The first preset algorithm may include at least one of a local feature point detection algorithm, a blob detection algorithm, and a corner detection algorithm, or may include other algorithms capable of detecting and recognizing the first feature parts. The first preset model may be the above-mentioned trained network model, and may include, for example, a machine learning network model such as a neural network model or a deep learning neural network model. Fig. 3 shows the correspondence between a target image according to an embodiment of the present disclosure and the recognized first feature parts, where each recognized first feature part is represented by a dot.
After each first feature part is recognized, step S103 can be executed to determine the first position information of each first feature part based on an established rectangular coordinate system; each piece of first position information is expressed in the form of two-dimensional coordinates, such as (x, y).
Here, the determination of the first position information may also be realized by the above-mentioned first preset model, which can realize both the recognition of the first feature parts and the determination of the corresponding position information. Alternatively, each piece of first position information may be determined from the relative positions between the recognized first feature parts; for example, by taking the position of one first feature part as the coordinate origin, the position coordinates of each remaining first feature part can be determined using the relative positional relationships between the first feature parts. The above are merely exemplary illustrations of determining the first position information, and the embodiments of the present disclosure are not limited thereto.
After the first position information of each first feature part of the target object has been determined, the three-dimensional position information of the second feature parts of the target object can be determined according to the device parameters of the image capture device that captured the target image. The three-dimensional position information refers to position information in a three-dimensional coordinate system, whereas the first position information is position information in a two-dimensional coordinate system; through the three-dimensional position information, the posture of the target object can be detected more accurately.
The embodiments of the present disclosure may first normalize the first position information of each first feature part using the device parameters of the image capture device, so as to remove the differential influence of different device parameters on the position information, and then perform the conversion from the two-dimensional coordinates of the first feature parts to the three-dimensional coordinates of the second feature parts based on the normalized first position information, obtaining the three-dimensional position information. Fig. 4 shows a flowchart of step S200 in the pose detection method according to an embodiment of the present disclosure, where step S200 may include:
S201: performing normalization on each piece of first position information based on the device parameters of the image capture device, obtaining second position information;
S202: determining the three-dimensional position information of the second feature parts using each piece of second position information.
The embodiments of the present disclosure may use a second preset model to determine the three-dimensional position information of the second feature parts based on the second position information of each first feature part. The first feature parts in the embodiments of the present disclosure may be included among the second feature parts; that is, the second feature parts may be identical to the first feature parts, or may include feature parts beyond the first feature parts. For example, compared with the first feature parts, the second feature parts of the embodiments of the present disclosure may further include at least one of: a crotch midpoint, a lumbar midpoint, the nose, and a spine midpoint. The crotch midpoint is determined by the positions of the left hip and the right hip; the spine midpoint can be determined according to the cervical vertebra and the crotch midpoint, as can the lumbar midpoint; and the nose can be determined based on head feature points.
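For illustration, deriving an additional second feature part such as the crotch midpoint can be sketched as below; taking the arithmetic mean of the left-hip and right-hip coordinates is an assumed rule (the disclosure states only that the midpoint is determined by the two hip positions):

```python
def midpoint(p, q):
    # Arithmetic mean of two 2D keypoints (an assumed derivation rule,
    # used here to illustrate how a second feature part can be derived
    # from two first feature parts).
    return ((p[0] + q[0]) / 2.0, (p[1] + q[1]) / 2.0)

first_features = {"left_hip": (100.0, 200.0), "right_hip": (140.0, 204.0)}
crotch_mid = midpoint(first_features["left_hip"], first_features["right_hip"])
# crotch_mid == (120.0, 202.0)
```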
Fig. 5 shows a schematic structural diagram of the second preset model used to determine the three-dimensional position information of the second feature parts from each piece of second position information according to an embodiment of the present disclosure. The second position information of each first feature part can be input into the second preset model, and through the learning operations of the second preset model the three-dimensional position information of the corresponding second feature parts can be obtained. The second preset model may include a deep learning model, and may contain a fully connected layer A, a batch normalization and ReLU function layer B, and a dropout function layer C. The generation process of the second preset model is not described in detail here; the generated machine learning model can be optimized by training on a large amount of information about the first feature parts in two-dimensional coordinate form. For example, about 300,000 groups of data may be prepared, each group being a pair of two-dimensional human skeleton coordinates and corresponding three-dimensional human skeleton coordinates, expressed mathematically as (x1, y1, x2, y2, ..., x14, y14) and (X1, Y1, Z1, ..., X17, Y17, Z17), where x1...x14 are the abscissa values of the second position information of the 14 first feature parts, y1...y14 are the corresponding ordinate values, and each (X, Y, Z) is a coordinate value of the generated three-dimensional position information of a second feature part. The numbers of first feature parts and second feature parts can be set according to demand, and the second preset model can complete the determination of the second feature parts according to the corresponding configuration information.
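The dimensionality of the mapping described above (14 two-dimensional keypoints in, 17 three-dimensional keypoints out) can be illustrated with a single fully connected layer plus ReLU in plain Python; the weights are random placeholders, and the batch normalization, dropout, and training on the ~300,000 paired samples are omitted:

```python
import random

random.seed(0)

IN_DIM = 14 * 2    # 14 first feature parts, (x, y) each
OUT_DIM = 17 * 3   # 17 second feature parts, (X, Y, Z) each

# Placeholder weights; in the disclosed method these would be learned
# from paired 2D/3D skeleton coordinates.
W = [[random.uniform(-0.1, 0.1) for _ in range(IN_DIM)] for _ in range(OUT_DIM)]
b = [0.0] * OUT_DIM

def relu(v):
    return [x if x > 0.0 else 0.0 for x in v]

def forward(x2d):
    # One fully connected layer followed by ReLU; the real model stacks
    # several such blocks plus batch normalization and dropout.
    out = [sum(w_ij * x_j for w_ij, x_j in zip(row, x2d)) + b_i
           for row, b_i in zip(W, b)]
    return relu(out)

coords_2d = [0.5] * IN_DIM       # flattened (x1, y1, ..., x14, y14)
coords_3d = forward(coords_2d)   # flattened (X1, Y1, Z1, ..., X17, Y17, Z17)
```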
Further, the three-dimensional position information can be corrected using the corresponding first position information and three-dimensional position information of matched first feature parts and second feature parts, and the spatial pose of the target object can be determined according to the corrected three-dimensional position information.
Based on the above configuration, the embodiments of the present disclosure can eliminate, according to the device parameters of the image capture device, the differential influence that different device parameters exert on the position information of the feature parts, and at the same time correct the three-dimensional positions according to the two-dimensional position information and three-dimensional position information of the corresponding feature parts, improving the accuracy of pose detection.
The embodiments of the present disclosure are described in detail below. As described in the above embodiments, the embodiments of the present disclosure can normalize the first position information of each first feature part obtained in step S100 to obtain the corresponding second position information, so as to eliminate the influence of differences in device parameters on position detection. Fig. 6 shows a flowchart of step S201 in the pose detection method according to an embodiment of the present disclosure, where step S201 may include:
S2011: performing first normalization on the first position information using the device parameters, obtaining third position information of each first feature part;
S2012: determining the mean and variance of the third position information of each first feature part;
S2013: performing second normalization on each piece of third position information based on the mean and variance, obtaining the second position information.
When performing the normalization of the first position information, the embodiments of the present disclosure may first perform first normalization on the first position information of each first feature part using the device parameters, obtaining the third position information of each first feature part. Through this first normalization, the distortion errors introduced into each piece of first position information by the parameters of the image capture device can be removed; normalization is then performed on the undistorted first position information to obtain the third position information, further normalizing away the differential influence caused by different device parameters.
Fig. 7 shows a flowchart of step S2011 in the pose detection method according to an embodiment of the present disclosure, where step S2011 may include:
S20111: performing undistortion processing on the first position information using the device parameters;
S20112: performing first normalization on the undistorted first position information, obtaining the third position information of each first feature part. The undistortion processing may be performed on the first position information using a first formula, which may include:
x' = (x - cx) / fx
y' = (y - cy) / fy
r^2 = x'^2 + y'^2
Δx = 2·p1·x'·y' + p2·(r^2 + 2x'^2)
Δy = p1·(r^2 + 2y'^2) + 2·p2·x'·y'
u' = (x' - Δx)·t
v' = (y' - Δy)·t
u = u'·fx + cx
v = v'·fy + cy
where fx is the focal length of the image capture device on the x-axis, fy is its focal length on the y-axis, cx and cy are respectively the abscissa and ordinate values of the optical center coordinate position of the image capture device, k1, k2, k3, k4, k5, and k6 are the radial distortion parameters of the image capture device, from which the radial factor t is determined, p1 and p2 are the tangential distortion parameters of the image capture device, x and y are respectively the abscissa and ordinate values of the first position information, and u and v are respectively the abscissa and ordinate values after the undistortion processing. Radial distortion refers to the change dr of a vector endpoint along its length direction, that is, the change of the radius vector; tangential distortion refers to the change dt of a vector endpoint along the tangential direction, that is, the change of angle.
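As an illustrative sketch, the first formula can be implemented in plain Python as below. The exact form of the radial factor t is left implicit above; the rational polynomial in k1..k6 used here, as in common pinhole camera models, is an assumption and not the disclosure's definition:

```python
def undistort_point(x, y, fx, fy, cx, cy, k, p):
    # Normalize pixel coordinates by the camera intrinsics.
    xp = (x - cx) / fx
    yp = (y - cy) / fy
    r2 = xp * xp + yp * yp
    # Tangential distortion offsets, as in the first formula.
    dx = 2.0 * p[0] * xp * yp + p[1] * (r2 + 2.0 * xp * xp)
    dy = p[0] * (r2 + 2.0 * yp * yp) + 2.0 * p[1] * xp * yp
    # Radial factor t: assumed rational polynomial in k1..k6 (the
    # disclosure only states that t is determined by k1..k6).
    t = (1.0 + k[3] * r2 + k[4] * r2**2 + k[5] * r2**3) / \
        (1.0 + k[0] * r2 + k[1] * r2**2 + k[2] * r2**3)
    up = (xp - dx) * t
    vp = (yp - dy) * t
    # Map back to pixel coordinates (using fy on the v axis).
    return up * fx + cx, vp * fy + cy

# With all distortion parameters zero, the mapping is the identity.
u, v = undistort_point(200.0, 150.0, 500.0, 500.0, 160.0, 120.0,
                       k=(0.0, 0.0, 0.0, 0.0, 0.0, 0.0), p=(0.0, 0.0))
```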
The undistorted first position information (u, v) can be obtained through the above first formula, after which the first normalization can be performed to obtain the third position information (xn, yn), where xn and yn are respectively the abscissa and ordinate values of the third position information after the first normalization. Based on the above configuration, normalization can be performed after the undistortion processing, which can further improve the accuracy of the position information.
After the third position information of each first feature part is obtained, the mean and variance of the third position information can be determined, and second normalization can be further performed according to the variance and mean, to obtain the second position information of the first feature parts. In the embodiments of the present disclosure, determining the mean and variance of the third position information of each first feature part may include: determining the mean and variance of the abscissa of the third position information based on the abscissa values of the third position information of each first feature part; and determining the mean and variance of the ordinate of the third position information based on the ordinate values of the third position information of each first feature part. The variance and mean of the abscissa values, and the variance and mean of the ordinate values, of the third position information of each first feature part can be determined respectively using the mean formula and the variance formula.
Alternatively, in the embodiments of the present disclosure, an abscissa mean and abscissa variance corresponding to the abscissa of each piece of third position information, and an ordinate mean and ordinate variance corresponding to the ordinate of each piece of third position information, may be generated based on the third position information of each first feature part. That is, the abscissa of each piece of third position information has its own corresponding mean and variance, as does the ordinate. For example, the normalization of the first position information may be performed by a third preset model, which may include a neural network model trained on a large amount of data; for example, 300,000 groups of training data may be input, where each group may include the input third position information of each feature part and the corresponding normalized second position information. The mean and variance of the abscissa of the third position information of each same feature part in the training data are determined as the abscissa mean and variance corresponding to that feature part, and the mean and variance of the ordinate of the third position information of each same feature part in the training data are determined as the ordinate mean and variance corresponding to that feature part.
Therefore, when the third position information of each first feature part is input into the third preset model, the abscissa mean and variance and the ordinate mean and variance of the corresponding feature part can be obtained correspondingly. Second normalization is then performed on the third position information according to the mean and variance of each feature part; a second formula may be used to perform the second normalization on the third position information based on the mean and variance, where the second formula may include:
s = (xi - mean(xi)) / std(xi)
t = (yi - mean(yi)) / std(yi)
where s and t respectively denote the abscissa and ordinate of the second position information of the i-th first feature part, xi and yi respectively denote the abscissa and ordinate values of the third position information of the i-th first feature part, the mean function gives the mean of the abscissa or ordinate of the corresponding first feature part, the std function gives the standard deviation of the abscissa or ordinate of the corresponding first feature part, and i is a positive integer.
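The second normalization described above amounts to a per-feature z-score; a minimal sketch under that reading is given below (the mean and standard deviation would in practice come from the training statistics of the third preset model, and the values here are hypothetical):

```python
def zscore(values, mean, std):
    # Second normalization: subtract the per-feature mean and divide by
    # the per-feature standard deviation (both gathered from training data).
    return [(v - mean) / std for v in values]

# Hypothetical statistics for one feature part's abscissa.
xs = [10.0, 12.0, 14.0]
normalized = zscore(xs, mean=12.0, std=2.0)
# normalized == [-1.0, 0.0, 1.0]
```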
When the second position information of each first feature part has been determined, the determination of the three-dimensional position information of the second feature parts can be performed using the second position information. In the embodiments of the present disclosure, inverse normalization may also be applied to the three-dimensional position information of each second feature part; the position information after inverse normalization can serve as the actual position coordinates of the target object relative to the image capture device, for determining the posture of the target object more accurately. Fig. 8 shows a flowchart of step S200 in the pose detection method according to an embodiment of the present disclosure, where step S200 may include:
S201: performing normalization on each piece of first position information based on the device parameters of the image capture device, obtaining second position information;
S202: obtaining, using a preset model and according to the second position information of each first feature part, fourth position information in three-dimensional form for the second feature parts of the target object;
S203: performing inverse normalization on each piece of fourth position information, obtaining the three-dimensional position information of each second feature part.
Here, S201 and S202 are processed in the same manner as steps S201 and S202 in Fig. 4, and the fourth position information in this embodiment of the present disclosure corresponds to the three-dimensional position information in the embodiment of Fig. 4. The embodiments of the present disclosure can perform inverse normalization on the fourth position information, so as to reduce the influence of factors such as training parameters on the position information.
In the embodiments of the present disclosure, inverse normalization may be performed on the three-dimensional position information of each second feature part, or on the corrected three-dimensional position information; the position information obtained after inverse normalization can serve as the actual position coordinates of the target object, for determining its posture more accurately. The embodiments of the present disclosure are illustrated with inverse normalization of the corrected three-dimensional position information; performing the process directly on uncorrected three-dimensional position information is similar and is not repeated here. The inverse normalization process may include: performing inverse normalization using a third formula, where the third formula includes:
X' = X·std(X) + mean(X)
Y' = Y·std(Y) + mean(Y)
Z' = Z·std(Z) + mean(Z)
where X', Y', and Z' respectively denote the three coordinate values of the three-dimensional position information, X, Y, and Z respectively denote the three coordinate values of the fourth position information, std denotes the standard deviation function, and mean denotes the mean function.
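A minimal sketch of the third formula applied per coordinate axis, with hypothetical per-axis statistics:

```python
def denormalize(coord, mean, std):
    # Third formula: X' = X * std(X) + mean(X), applied per axis.
    return tuple(c * s + m for c, s, m in zip(coord, std, mean))

# Hypothetical per-axis statistics for one second feature part.
point = denormalize((-1.0, 0.0, 2.0),
                    mean=(1.0, 2.0, 3.0),
                    std=(0.5, 0.5, 1.0))
# point == (0.5, 2.0, 5.0)
```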
Correspondingly, in the embodiments of the present disclosure, the X-coordinate mean of the fourth position information of each second feature part can be obtained by applying the mean function to the X-coordinate values of the pieces of fourth position information, and likewise for the Y-coordinate and Z-coordinate means. The variance of the X-coordinate of the fourth position information can be obtained by applying the variance function to the X-coordinate values and the X-coordinate mean, and likewise for the variances of the Y-coordinate and Z-coordinate.
Alternatively, the mean function in the embodiments of the present disclosure may give, for each corresponding feature part separately, the mean of X, Y, or Z of the three-dimensional position information, and the std function may give, for each corresponding feature part separately, the standard deviation of X, Y, or Z. That is, the fourth position information of each second feature part may have its own corresponding mean and variance of X, mean and variance of Y, and mean and variance of Z. When the third preset model is trained with training data, the mean and variance of X, of Y, and of Z of the fourth position information of each second feature part used in practice can be determined respectively according to the fourth position information obtained during training. For example, the mean and variance of X, Y, and Z of the fourth position information generated in the training data for the head feature part can be obtained, and correspondingly for the other feature parts, thereby obtaining per-feature-part means and variances of X, Y, and Z of the fourth position information.
Therefore, the mean and variance of the fourth position information of the corresponding feature part can be used to perform the inverse normalization for each second feature part, reducing the influence introduced by the training data and obtaining accurate three-dimensional position information of each second feature part. Finally, the pose of the target object is obtained according to the three-dimensional position information of the second feature parts.
In the embodiments of the present disclosure, after the three-dimensional position information has been determined, it can also be corrected, to determine the corresponding spatial pose. Fig. 9 shows a flowchart of step S300 in the pose detection method according to an embodiment of the present disclosure, where step S300 may include:
S301: determining correction parameters based on the first position information and three-dimensional position information of the same feature parts;
S302: correcting the three-dimensional position information based on the correction parameters;
S303: determining the spatial pose of the target object based on the corrected three-dimensional position information.
The first feature parts and second feature parts in the embodiments of the present disclosure may contain different feature parts; therefore, when performing the correction of the three-dimensional position information, it is first necessary to match the same feature parts shared by the first feature parts and the second feature parts. The first feature parts and second feature parts in the embodiments of the present disclosure may be associated with identification information; for example, a first feature part may correspond to a first identifier and a second feature part to a second identifier, so the same feature parts can be determined by matching the identification information of the corresponding first feature parts and second feature parts. For example, a first feature part and a second feature part having the same identification information are used for determining the correction parameters; such a first feature part and second feature part are in fact the same feature part. Alternatively, there may be a corresponding mapping relationship between the first identifier and the second identifier of the same feature part, in which case the first feature part and second feature part corresponding to an associated first identifier and second identifier are taken as the same feature part. Here, the identification information may be a unique identifier assigned to a feature part, such as a number or a name. In addition, the three-dimensional position information in step S301 may include the three-dimensional position information in the embodiment of Fig. 4 or that in the embodiment of Fig. 8.
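The identifier-based matching can be sketched as follows; the identifier values and the mapping table between first and second identifiers are hypothetical examples:

```python
# Hypothetical identifier tables for the two feature sets.
first_parts = {1: ("head", (120.0, 40.0)), 2: ("neck", (120.0, 80.0))}
second_parts = {101: ("head", (0.1, 0.2, 1.0)),
                102: ("neck", (0.1, 0.3, 1.0)),
                103: ("nose", (0.1, 0.15, 1.0))}
id_map = {1: 101, 2: 102}  # first identifier -> second identifier (assumed)

def matched_pairs():
    # Pair each first feature part with the second feature part that,
    # via the mapping, shares its identity; extra second feature parts
    # such as the nose have no 2D counterpart and are skipped.
    return {first_parts[f][0]: (first_parts[f][1], second_parts[s][1])
            for f, s in id_map.items()}

pairs = matched_pairs()
```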
The embodiments of the present disclosure can perform the determination of the correction parameters based on the determined same feature parts. Fig. 11 shows a flowchart of step S301 in the pose detection method according to an embodiment of the present disclosure, where step S301 may include:
S3011: converting the three-dimensional position information into fifth position information in two-dimensional form using a rotation matrix and a translation matrix;
S3012: adjusting the rotation matrix and translation matrix by feedback based on the difference between the fifth position information and the second position information, until the difference meets a preset requirement;
S3013: determining the correction parameters based on the rotation matrix and translation matrix obtained when the difference meets the preset requirement.
In the embodiments of the present disclosure, since the second position information of the first feature parts is in two-dimensional coordinate form while the three-dimensional position information is in three-dimensional coordinate form, the two coordinate forms need to be unified in order to determine the correction coefficients. In the embodiments of the present disclosure, the three-dimensional position information can be converted into the fifth position information in two-dimensional form using a rotation matrix and a translation matrix, where the rotation matrix is the rotation vector of the target object relative to the image capture device, and the translation matrix is the translation vector of the target object relative to the image capture device. The embodiments of the present disclosure can perform the correction process of the three-dimensional position information with the above-mentioned third preset model, which can perform the conversion of the three-dimensional position information into two-dimensional form according to a preconfigured rotation matrix and translation matrix. For example, the three-dimensional position information can be converted into the fifth position information in two-dimensional form by a fourth formula using the rotation matrix and translation matrix, where the fourth formula may include:
S5 = K·[R | T]·S3
where K is the intrinsic parameter matrix of the image capture device, fx is the focal length of the image capture device on the x-axis, fy is its focal length on the y-axis, cx and cy are respectively the abscissa and ordinate values of its optical center coordinate position, [R | T] is the matrix formed by the rotation matrix R and the translation matrix T, S5 is the fifth position information, and S3 is the three-dimensional position information.
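A minimal sketch of the fourth formula for a single point, assuming the usual pinhole convention in which the homogeneous result is divided by depth to obtain pixel coordinates (that division is an assumption; the disclosure writes the formula in matrix form only):

```python
def project(point3d, R, T, fx, fy, cx, cy):
    # Apply the extrinsics [R|T], then the intrinsics K, and divide by
    # depth to obtain pixel coordinates (assumed pinhole convention).
    X = sum(R[0][j] * point3d[j] for j in range(3)) + T[0]
    Y = sum(R[1][j] * point3d[j] for j in range(3)) + T[1]
    Z = sum(R[2][j] * point3d[j] for j in range(3)) + T[2]
    return fx * X / Z + cx, fy * Y / Z + cy

I3 = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]  # identity rotation
u, v = project((0.2, -0.1, 2.0), I3, [0.0, 0.0, 0.0],
               fx=500.0, fy=500.0, cx=160.0, cy=120.0)
# u == 210.0, v == 95.0
```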
In addition, the third preset model can adjust the rotation matrix and translation matrix by feedback according to the difference between the second position information and the fifth position information of the corresponding feature parts, until the difference between the second position information and the fifth position information of every feature part meets the preset requirement. Meeting the preset requirement may include the distance between the two pieces of position information being less than a preset distance threshold, where the preset distance threshold can be a value preconfigured according to demand and can be set to different values in different embodiments. The rotation matrix and translation matrix obtained when the difference meets the preset requirement can then be taken as the correction parameters.
Alternatively, also can use Optimized model in other embodiments of the disclosure and execute the spin matrix peace The feedback regulation of matrix is moved, the expression formula of the Optimized model includes:
arg min_{R,T} ‖K[R|T]·S3 − S2‖
where the arg min function represents minimization of the difference, and S2 represents the second location information. That is, in the embodiment of the present disclosure the rotation matrix R and the translation matrix T may be adjusted through the optimization model so that the difference between the fifth location information and the second location information is minimized. The optimization model here may include a neural network model or another machine learning model.
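The arg min adjustment of R and T is, in effect, a reprojection-error minimization (a PnP-style fit; in practice OpenCV's solvePnP serves this role). A sketch under the assumption that R is parameterized as an axis-angle vector; the skeleton points, intrinsics and initial guess below are made up for illustration:

```python
import numpy as np
from scipy.optimize import least_squares

def rodrigues(rvec):
    """Rotation matrix from an axis-angle vector (Rodrigues' formula)."""
    theta = np.linalg.norm(rvec)
    if theta < 1e-12:
        return np.eye(3)
    k = rvec / theta
    Kx = np.array([[0, -k[2], k[1]], [k[2], 0, -k[0]], [-k[1], k[0], 0]])
    return np.eye(3) + np.sin(theta) * Kx + (1 - np.cos(theta)) * (Kx @ Kx)

def project(pts3d, K, R, T):
    cam = pts3d @ R.T + T
    uv = cam[:, :2] / cam[:, 2:3]
    return uv * [K[0, 0], K[1, 1]] + [K[0, 2], K[1, 2]]

def fit_pose(pts3d, pts2d, K, x0):
    """arg min over (R, T) of ||K[R|T].S3 - S2|| via nonlinear least squares."""
    def residual(x):
        return (project(pts3d, K, rodrigues(x[:3]), x[3:]) - pts2d).ravel()
    sol = least_squares(residual, x0)
    return rodrigues(sol.x[:3]), sol.x[3:]

# Synthetic check: recover a known pose from its own noise-free projections
K = np.array([[500.0, 0.0, 320.0], [0.0, 500.0, 240.0], [0.0, 0.0, 1.0]])
s3 = np.array([[0, 0, 0], [0.3, 0, 0], [0, 0.5, 0],
               [0, 0, 0.2], [0.2, 0.3, 0.1], [-0.2, 0.1, 0.3]])
R_true = rodrigues(np.array([0.1, -0.05, 0.02]))
T_true = np.array([0.05, -0.1, 3.0])
s2 = project(s3, K, R_true, T_true)
R_est, T_est = fit_pose(s3, s2, K, x0=np.array([0, 0, 0, 0, 0, 3.0]))
```

The initial guess places the skeleton roughly in front of the camera; with noise-free correspondences the fit recovers the true translation essentially exactly.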
In addition, after the rotation matrix R and translation matrix T that meet the preset condition are obtained, the three-dimensional location information may be corrected based on these correction parameters, which may include correcting the three-dimensional location information using a fifth formula, where the fifth formula includes:
P'=P*R+T
where P is the three-dimensional location information before correction, P′ is the three-dimensional location information after correction, R is the rotation matrix, and T is the translation matrix.
That is, the three-dimensional location information may be corrected using the rotation matrix and translation matrix corresponding to the difference that meets the preset requirement, and the corrected three-dimensional location information is then used to determine the spatial pose of the target object. The embodiment of the present disclosure may directly use the corrected three-dimensional location information of each second feature part to determine the spatial pose, thereby improving the precision of pose detection.
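The fifth formula P′ = P·R + T applied to a whole set of keypoints can be sketched as follows (row-vector convention, matching the formula; the rotation and translation values are illustrative):

```python
import numpy as np

def correct_points(P, R, T):
    """Fifth formula: P' = P.R + T, applied row-wise to N x 3 keypoints."""
    return np.asarray(P, dtype=float) @ R + T

# Illustrative correction: 90-degree rotation about z plus a unit lift along z
R = np.array([[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
T = np.array([0.0, 0.0, 1.0])
corrected = correct_points([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]], R, T)
```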
In order to illustrate the embodiment of the present disclosure more clearly, the flow of the pose detection algorithm of the embodiment of the present disclosure is exemplified below, and may include:
obtaining image data, where the image data may include video or pictures;
using a two-dimensional human-body key-point detection tool, obtaining the key-point positions of 14 points in the image, i.e. the first location information of the 14 first feature parts;
using the two-dimensional key-point location information, obtaining a three-dimensional human skeleton of 17 key points (with the mid-pelvis key-point position fixed as the origin); this model is the three-dimensional location information of the three-dimensional key points;
performing an alignment operation on the two human-body key-point models obtained in the above steps (determining the same feature parts), so that each key point is consistent in the physical sense;
with the intrinsic parameter K of the current device known, calculating the extrinsic rotation matrix R and translation matrix T of the target human body in the camera coordinate system, where
fx, fy, cx and cy can be calibrated from the current device by Zhang's calibration method. Let the aligned two-dimensional human skeleton be S2 and the three-dimensional human skeleton be S3; the following equation can then be optimized:
arg min_{R,T} ‖K[R|T]·S3 − S2‖
After the optimized R and T have been determined, the three-dimensional location information can be corrected, for example obtaining P′ by P′ = P·R + T, and the pose can then be determined.
Since the embodiment of the present disclosure can use video data as the image data, when performing the optimization of R and T, the R and T of the previous frame can be used as the initial values for the next frame, further improving optimization precision.
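The frame-by-frame flow above, including using the previous frame's R and T as the next frame's initial values, can be sketched with stand-in components (the detector, lifter and solver callables below are hypothetical placeholders, not the patent's models):

```python
import numpy as np

def track_pose(frames, detect_2d, lift_3d, solve_rt, K):
    """Per-frame pose fitting; the previous frame's (R, T) seeds the next solve."""
    poses, init = [], None
    for frame in frames:
        s2 = detect_2d(frame)             # 14 two-dimensional keypoints
        s3 = lift_3d(s2)                  # 17 three-dimensional keypoints, pelvis at origin
        R, T = solve_rt(s3, s2, K, init)  # extrinsics under intrinsics K; init warm-starts
        poses.append(s3 @ R + T)          # corrected skeleton: P' = P.R + T
        init = (R, T)                     # warm start for the next frame
    return poses

# Stand-in components so the control flow can be exercised end to end
s2_fake = np.zeros((14, 2))
s3_fake = np.random.default_rng(0).normal(size=(17, 3))
poses = track_pose(
    frames=[None, None, None],
    detect_2d=lambda f: s2_fake,
    lift_3d=lambda s2: s3_fake,
    solve_rt=lambda s3, s2, K, init: (np.eye(3), np.zeros(3)),
    K=np.eye(3),
)
```

Threading `init` through the loop is the design choice the paragraph describes: on video, consecutive frames have similar extrinsics, so the previous solution is a good starting point for the optimizer.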
In conclusion the embodiment of the present disclosure executes the pose detection of objects in images by bonding apparatus parameter, wherein Distinct device parameter can be eliminated on influence brought by attitude detection, can be improved the precision of pose detection, while the disclosure Embodiment can use the difference between first location information and three dimensional local information to determine the school for adjusting three dimensional local information Positive parameter, to further increase the detection accuracy of pose.
It will be understood by those skilled in the art that each step writes sequence simultaneously in the above method of specific embodiment It does not mean that stringent execution sequence and any restriction is constituted to implementation process, the specific execution sequence of each step should be with its function It can be determined with possible internal logic.
Fig. 2 shows a block diagram of a pose detection apparatus according to an embodiment of the present disclosure, where the apparatus may include:
a first determining module 10, configured to determine the first location information of each first feature part of a target object in a target image, wherein the target image is captured by an imaging device;
a second determining module 20, configured to determine, based on the first location information and the device parameters of the imaging device, the three-dimensional location information of the second feature parts of the target object, the second feature parts including at least each first feature part;
a third determining module 30, configured to determine the spatial pose of the target object based on the corresponding first location information and three-dimensional location information.
In the embodiment of the present disclosure, the first determining module includes:
an information obtaining unit, configured to obtain information on the first feature parts to be identified;
a feature identification unit, configured to identify each first feature part in the target object based on the obtained information on the first feature parts;
a two-dimensional position determining unit, configured to determine the first location information of each first feature part based on an established two-dimensional coordinate system.
In the embodiment of the present disclosure, the second determining module includes:
a normalization unit, configured to perform normalization on each piece of first location information based on the device parameters of the imaging device to obtain second location information;
a three-dimensional position determining unit, configured to determine the three-dimensional location information of the second feature parts using each piece of second location information.
In the embodiment of the present disclosure, the normalization unit is further configured to perform first normalization on the first location information using the device parameters to obtain third location information of each first feature part, determine the mean and variance of the third location information of each first feature part, and perform second normalization on each piece of third location information based on the mean and variance to obtain the second location information.
In the embodiment of the present disclosure, the normalization unit is further configured to perform de-distortion processing on the first location information using the device parameters, and perform the first normalization on the de-distorted first location information to obtain the third location information of each first feature part.
In the embodiment of the present disclosure, the normalization unit is further configured to perform the de-distortion processing on the first location information using a first formula, where the first formula includes:
x' = (x − cx)/fx
y' = (y − cy)/fy
r² = x'² + y'²
Δx = 2p1x'y' + p2(r² + 2x'²)
Δy = p1(r² + 2y'²) + 2p2x'y'
u' = (x' − Δx)·t
v' = (y' − Δy)·t
u = u'·fx + cx
v = v'·fy + cy
where fx is the focal length of the imaging device along the x-axis, fy is the focal length of the imaging device along the y-axis, cx and cy are respectively the abscissa and ordinate of the optical-center coordinate position of the imaging device, k1, k2, k3, k4, k5 and k6 are the radial distortion parameters of the imaging device, p1 and p2 are the tangential distortion parameters of the imaging device, t is the radial correction factor formed from the radial distortion parameters k1 to k6, x and y are respectively the abscissa and ordinate of the first location information, and u and v are respectively the abscissa and ordinate after the de-distortion processing.
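The first formula maps a distorted pixel back to an undistorted one. A NumPy sketch under the assumption that t is the rational radial factor t = (1 + k4·r² + k5·r⁴ + k6·r⁶)/(1 + k1·r² + k2·r⁴ + k3·r⁶) (the standard six-parameter radial model; the patent text names k1–k6 but t's expression is not spelled out, so this is an assumed form):

```python
import numpy as np

def undistort_point(x, y, fx, fy, cx, cy, k=(0.0,) * 6, p=(0.0, 0.0)):
    """One-step de-distortion following the first formula.

    k = (k1..k6) radial, p = (p1, p2) tangential; t is assumed to be the
    rational radial factor built from k1..k6.
    """
    xn, yn = (x - cx) / fx, (y - cy) / fy                  # normalized coords
    r2 = xn * xn + yn * yn
    dx = 2 * p[0] * xn * yn + p[1] * (r2 + 2 * xn * xn)    # tangential delta-x
    dy = p[0] * (r2 + 2 * yn * yn) + 2 * p[1] * xn * yn    # tangential delta-y
    t = (1 + k[3] * r2 + k[4] * r2**2 + k[5] * r2**3) / \
        (1 + k[0] * r2 + k[1] * r2**2 + k[2] * r2**3)      # assumed radial factor
    u_n, v_n = (xn - dx) * t, (yn - dy) * t                # corrected normalized
    return u_n * fx + cx, v_n * fy + cy                    # back to pixels

# With all distortion parameters zero the mapping is the identity
u, v = undistort_point(400.0, 300.0, fx=500.0, fy=500.0, cx=320.0, cy=240.0)
```

With zero distortion parameters the point is returned unchanged, and a positive k1 pulls points toward the optical center, which matches the usual barrel-distortion correction.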
In the embodiment of the present disclosure, the normalization unit is further configured to perform the second normalization on the third location information based on the mean and variance using a second formula, where the second formula includes:
s = (xi − mean(x))/std(x)
t = (yi − mean(y))/std(y)
where s and t respectively represent the abscissa and ordinate of the second location information, xi and yi respectively represent the abscissa and ordinate of the third location information of the i-th first feature part, mean is the mean function, std is the standard-deviation function, and i is a positive integer.
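The second normalization can be sketched per coordinate axis (a minimal NumPy sketch; population standard deviation is assumed, and the sample points are illustrative):

```python
import numpy as np

def normalize_keypoints(pts):
    """Second formula per axis: s = (x_i - mean(x)) / std(x), t likewise for y."""
    mean, std = pts.mean(axis=0), pts.std(axis=0)
    return (pts - mean) / std, mean, std

pts = np.array([[0.0, 0.0], [2.0, 4.0]])
normed, mean, std = normalize_keypoints(pts)
```

Returning the mean and standard deviation alongside the normalized points lets the later inverse-normalization step undo the transform exactly.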
In the embodiment of the present disclosure, the three-dimensional position determining unit is further configured to obtain, using a preset model, the three-dimensional location information of the second feature parts of the target object according to the second location information of each first feature part;
wherein the preset model includes a deep learning model.
In the embodiment of the present disclosure, the second determining module includes:
a normalization unit, configured to perform normalization on each piece of first location information based on the device parameters of the imaging device to obtain second location information;
a three-dimensional position determining unit, configured to determine fourth location information in three-dimensional form of the second feature parts using each piece of second location information;
an inverse normalization unit, configured to perform inverse normalization on each piece of fourth location information to obtain the three-dimensional location information of each second feature part.
In the embodiment of the present disclosure, the inverse normalization unit is further configured to perform inverse normalization on each piece of fourth location information using a third formula to obtain the three-dimensional location information, where the third formula includes:
X '=X*std (X)+mean (X)
Y '=Y*std (Y)+mean (Y)
Z'=Z*std (Z)+mean (Z)
where X′, Y′ and Z′ respectively represent the three coordinate values of the three-dimensional location information, X, Y and Z respectively represent the three coordinate values of the fourth location information, std represents the standard-deviation function, and mean represents the mean function.
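The third formula is the exact inverse of the mean/standard-deviation normalization; a minimal sketch (shown with two axes for brevity, while the patent applies it to the X, Y and Z axes of the fourth location information; the mean/std values are illustrative):

```python
import numpy as np

def denormalize(pred, mean, std):
    """Third formula per axis: X' = X.std(X) + mean(X)."""
    return np.asarray(pred) * std + mean

restored = denormalize([[-1.0, -1.0], [1.0, 1.0]],
                       mean=np.array([1.0, 2.0]),
                       std=np.array([1.0, 2.0]))
```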
In the embodiment of the present disclosure, the third determining module includes:
a correction parameter determining unit, configured to determine correction parameters based on the first location information and three-dimensional location information of the same feature part;
a correction unit, configured to correct the three-dimensional location information based on the correction parameters;
a pose determining unit, configured to determine the spatial pose of the target object based on the corrected three-dimensional location information.
In the embodiment of the present disclosure, the correction parameter determining unit is further configured to convert the three-dimensional location information into fifth location information in two-dimensional form using a rotation matrix and a translation matrix;
adjust the rotation matrix and the translation matrix by feedback based on the difference between the fifth location information and the second location information, until the difference meets a preset requirement;
and determine the correction parameters based on the rotation matrix and translation matrix at the time the difference meets the preset requirement.
In the embodiment of the present disclosure, the correction parameter determining unit is further configured to convert the three-dimensional location information into the fifth location information in two-dimensional form using the rotation matrix and the translation matrix by a fourth formula, where the fourth formula includes:
S5=K [R | T] S3
where K is the intrinsic matrix of the imaging device formed by fx (the focal length along the x-axis), fy (the focal length along the y-axis) and cx, cy (the abscissa and ordinate of the optical-center coordinate position), S5 is the fifth location information, and S3 is the three-dimensional location information.
In the embodiment of the present disclosure, the correction parameter determining unit is further configured to perform the feedback adjustment of the rotation matrix and the translation matrix using an optimization model, where the expression of the optimization model includes:
arg min_{R,T} ‖K[R|T]·S3 − S2‖
where the arg min function represents minimization of the difference and S2 represents the second location information.
In the embodiment of the present disclosure, the correction unit is further configured to correct the three-dimensional location information using the fifth formula, where the fifth formula includes:
P'=P*R+T
where P is the three-dimensional location information before correction, P′ is the three-dimensional location information after correction, R is the rotation matrix, and T is the translation matrix.
In the embodiment of the present disclosure, the third determining module further includes:
a matching unit, configured to determine the same feature parts among the first feature parts and the second feature parts based on the first identifiers of the first feature parts and the second identifiers of the second feature parts.
In the embodiment of the present disclosure, the apparatus further includes:
an image obtaining module, configured to obtain the target image;
an object identification module, configured to identify the target object in the target image.
In the embodiment of the present disclosure, the first feature parts include at least one of: head, neck, shoulders, elbows, wrists, hips, knees and ankles.
In some embodiments, the functions or modules of the apparatus provided by the embodiment of the present disclosure can be used to perform the methods described in the method embodiments above; for their specific implementation, reference may be made to the description of the method embodiments above, which, for brevity, is not repeated here.
The embodiment of the present disclosure also proposes a computer-readable storage medium having computer program instructions stored thereon, where the computer program instructions implement the above method when executed by a processor. The computer-readable storage medium may be a non-volatile computer-readable storage medium.
The embodiment of the present disclosure also proposes an electronic device, including: a processor; and a memory for storing processor-executable instructions; wherein the processor is configured to perform the above method.
The electronic device may be provided as a terminal, a server or a device of another form.
Figure 12 is a block diagram of an electronic device 800 shown according to an exemplary embodiment. For example, the electronic device 800 may be a terminal such as a mobile phone, a computer, a digital broadcast terminal, a messaging device, a game console, a tablet device, a medical device, fitness equipment or a personal digital assistant.
Referring to Fig. 12, the electronic device 800 may include one or more of the following components: a processing component 802, a memory 804, a power supply component 806, a multimedia component 808, an audio component 810, an input/output (I/O) interface 812, a sensor component 814 and a communication component 816.
The processing component 802 typically controls the overall operation of the electronic device 800, such as operations associated with display, telephone calls, data communication, camera operation and recording operations. The processing component 802 may include one or more processors 820 to execute instructions so as to perform all or part of the steps of the methods described above. In addition, the processing component 802 may include one or more modules to facilitate the interaction between the processing component 802 and other components. For example, the processing component 802 may include a multimedia module to facilitate the interaction between the multimedia component 808 and the processing component 802.
The memory 804 is configured to store various types of data to support the operation of the electronic device 800. Examples of such data include instructions for any application or method operated on the electronic device 800, contact data, phone-book data, messages, pictures, video and the like. The memory 804 may be implemented by any type of volatile or non-volatile storage device or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic disk or optical disk.
The power supply component 806 provides power for the various components of the electronic device 800. The power supply component 806 may include a power management system, one or more power supplies, and other components associated with generating, managing and distributing power for the electronic device 800.
The multimedia component 808 includes a screen providing an output interface between the electronic device 800 and the user. In some embodiments, the screen may include a liquid crystal display (LCD) and a touch panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive input signals from the user. The touch panel includes one or more touch sensors to sense touches, slides and gestures on the touch panel. The touch sensor may not only sense the boundary of a touch or slide action, but also detect the duration and pressure associated with the touch or slide operation. In some embodiments, the multimedia component 808 includes a front camera and/or a rear camera. When the electronic device 800 is in an operation mode, such as a shooting mode or a video mode, the front camera and/or the rear camera can receive external multimedia data. Each front camera and rear camera may be a fixed optical lens system or have focusing and optical zoom capability.
The audio component 810 is configured to output and/or input audio signals. For example, the audio component 810 includes a microphone (MIC), which is configured to receive external audio signals when the electronic device 800 is in an operation mode, such as a call mode, a recording mode or a voice recognition mode. The received audio signal may be further stored in the memory 804 or sent via the communication component 816. In some embodiments, the audio component 810 further includes a speaker for outputting audio signals.
The I/O interface 812 provides an interface between the processing component 802 and peripheral interface modules, which may be a keyboard, a click wheel, buttons and the like. These buttons may include but are not limited to: a home button, volume buttons, a start button and a lock button.
The sensor component 814 includes one or more sensors for providing state assessments of various aspects of the electronic device 800. For example, the sensor component 814 can detect the open/closed state of the electronic device 800 and the relative positioning of components, such as the display and keypad of the electronic device 800; the sensor component 814 can also detect a change in position of the electronic device 800 or a component of the electronic device 800, the presence or absence of user contact with the electronic device 800, the orientation or acceleration/deceleration of the electronic device 800, and a change in temperature of the electronic device 800. The sensor component 814 may include a proximity sensor configured to detect the presence of nearby objects without any physical contact. The sensor component 814 may also include a light sensor, such as a CMOS or CCD image sensor, for use in imaging applications. In some embodiments, the sensor component 814 may also include an acceleration sensor, a gyroscope sensor, a magnetic sensor, a pressure sensor or a temperature sensor.
The communication component 816 is configured to facilitate wired or wireless communication between the electronic device 800 and other devices. The electronic device 800 can access a wireless network based on a communication standard, such as WiFi, 2G or 3G, or a combination thereof. In one exemplary embodiment, the communication component 816 receives a broadcast signal or broadcast-related information from an external broadcast management system via a broadcast channel. In one exemplary embodiment, the communication component 816 further includes a near field communication (NFC) module to facilitate short-range communication. For example, the NFC module may be implemented based on radio frequency identification (RFID) technology, Infrared Data Association (IrDA) technology, ultra-wideband (UWB) technology, Bluetooth (BT) technology and other technologies.
In an exemplary embodiment, the electronic device 800 may be implemented by one or more application-specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field programmable gate arrays (FPGAs), controllers, microcontrollers, microprocessors or other electronic components, for performing the above method.
In an exemplary embodiment, a non-volatile computer-readable storage medium is also provided, for example the memory 804 including computer program instructions, where the above computer program instructions can be executed by the processor 820 of the electronic device 800 to complete the above method.
Figure 13 is a block diagram of an electronic device 1900 shown according to an exemplary embodiment. For example, the electronic device 1900 may be provided as a server. Referring to Fig. 13, the electronic device 1900 includes a processing component 1922, which further includes one or more processors, and memory resources represented by a memory 1932 for storing instructions executable by the processing component 1922, such as an application program. The application program stored in the memory 1932 may include one or more modules, each of which corresponds to a set of instructions. In addition, the processing component 1922 is configured to execute the instructions so as to perform the above method.
The electronic device 1900 may also include a power supply component 1926 configured to perform power management of the electronic device 1900, a wired or wireless network interface 1950 configured to connect the electronic device 1900 to a network, and an input/output (I/O) interface 1958. The electronic device 1900 can operate based on an operating system stored in the memory 1932, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™ or the like.
In an exemplary embodiment, a non-volatile computer-readable storage medium is also provided, for example the memory 1932 including computer program instructions, where the above computer program instructions can be executed by the processing component 1922 of the electronic device 1900 to complete the above method.
The present disclosure may be a system, a method and/or a computer program product. The computer program product may include a computer-readable storage medium carrying computer-readable program instructions for causing a processor to implement aspects of the present disclosure.
The computer-readable storage medium may be a tangible device that can hold and store the instructions used by an instruction execution device. The computer-readable storage medium may be, for example, but is not limited to, an electric storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the above. More specific examples (a non-exhaustive list) of the computer-readable storage medium include: a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), a static random access memory (SRAM), a portable compact disc read-only memory (CD-ROM), a digital versatile disc (DVD), a memory stick, a floppy disk, a mechanical encoding device such as a punch card or an in-groove raised structure on which instructions are stored, and any suitable combination of the above. The computer-readable storage medium used herein is not to be interpreted as a transient signal per se, such as a radio wave or another freely propagating electromagnetic wave, an electromagnetic wave propagating through a waveguide or another transmission medium (for example, a light pulse passing through a fiber-optic cable), or an electrical signal transmitted through an electric wire.
The computer-readable program instructions described herein may be downloaded from a computer-readable storage medium to each computing/processing device, or downloaded to an external computer or external storage device via a network, such as the Internet, a local area network, a wide area network and/or a wireless network. The network may include copper transmission cables, optical fiber transmission, wireless transmission, routers, firewalls, switches, gateway computers and/or edge servers. A network adapter or network interface in each computing/processing device receives the computer-readable program instructions from the network and forwards the computer-readable program instructions for storage in a computer-readable storage medium in each computing/processing device.
Computer program instructions for carrying out operations of the present disclosure may be assembly instructions, instruction set architecture (ISA) instructions, machine instructions, machine-dependent instructions, microcode, firmware instructions, state-setting data, or source code or object code written in any combination of one or more programming languages, including object-oriented programming languages such as Smalltalk and C++, and conventional procedural programming languages such as the "C" language or similar programming languages. The computer-readable program instructions may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. In scenarios involving a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider). In some embodiments, an electronic circuit, such as a programmable logic circuit, a field programmable gate array (FPGA) or a programmable logic array (PLA), is personalized and customized by utilizing state information of the computer-readable program instructions, and the electronic circuit can execute the computer-readable program instructions so as to implement aspects of the present disclosure.
Aspects of the present disclosure are described herein with reference to flowcharts and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the disclosure. It should be understood that each block of the flowcharts and/or block diagrams, and combinations of blocks in the flowcharts and/or block diagrams, can be implemented by computer-readable program instructions.
These computer-readable program instructions may be provided to a processor of a general-purpose computer, a special-purpose computer or another programmable data processing apparatus to produce a machine, such that the instructions, when executed by the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/actions specified in one or more blocks of the flowcharts and/or block diagrams. These computer-readable program instructions may also be stored in a computer-readable storage medium; these instructions cause a computer, a programmable data processing apparatus and/or other devices to work in a specific manner, so that the computer-readable medium storing the instructions includes an article of manufacture including instructions which implement aspects of the functions/actions specified in one or more blocks of the flowcharts and/or block diagrams.
The computer-readable program instructions may also be loaded onto a computer, another programmable data processing apparatus or another device, so that a series of operational steps are executed on the computer, other programmable data processing apparatus or other device to produce a computer-implemented process, such that the instructions executed on the computer, other programmable data processing apparatus or other device implement the functions/actions specified in one or more blocks of the flowcharts and/or block diagrams.
The flowcharts and block diagrams in the drawings show the possible architectures, functions and operations of systems, methods and computer program products according to multiple embodiments of the present disclosure. In this regard, each block in a flowchart or block diagram may represent a module, a program segment or a part of an instruction, and the module, program segment or part of an instruction contains one or more executable instructions for implementing the specified logical function. In some alternative implementations, the functions marked in the blocks may occur in an order different from that marked in the drawings. For example, two consecutive blocks may in fact be executed substantially in parallel, and they may sometimes be executed in the reverse order, depending on the functions involved. It should also be noted that each block of the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, can be implemented by a dedicated hardware-based system that performs the specified functions or actions, or can be implemented by a combination of dedicated hardware and computer instructions.
The embodiments of the present disclosure have been described above. The above description is exemplary rather than exhaustive, and is not limited to the disclosed embodiments. Many modifications and changes will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the illustrated embodiments. The terms used herein are chosen to best explain the principles of the embodiments, the practical application or the technological improvement over technologies in the market, or to enable others of ordinary skill in the art to understand the embodiments disclosed herein.

Claims (10)

1. A pose detection method, characterized by comprising:
determining first location information of each first feature part of a target object in a target image, wherein the target image is captured by an imaging device;
determining, based on the first location information and device parameters of the imaging device, three-dimensional location information of second feature parts of the target object, the second feature parts including at least each first feature part;
determining a spatial pose of the target object based on the corresponding first location information and three-dimensional location information.
2. The method according to claim 1, wherein the determining first position information of each first feature portion of the target object in the target image comprises:
obtaining information on the first feature portions to be recognized;
recognizing each first feature portion in the target object based on the obtained information on the first feature portions;
determining the first position information of each first feature portion based on an established two-dimensional coordinate system.
3. The method according to claim 1 or 2, wherein the determining, based on the first position information and the device parameter of the image capture device, the three-dimensional position information of the second feature portions of the target object comprises:
performing normalization processing on each piece of first position information based on the device parameter of the image capture device to obtain second position information;
determining the three-dimensional position information of the second feature portions by using each piece of second position information.
4. The method according to claim 3, wherein the performing normalization processing on each piece of first position information based on the device parameter of the image capture device to obtain the second position information comprises:
performing first normalization processing on the first position information by using the device parameter to obtain third position information of each first feature portion;
determining a mean and a variance of the third position information of each first feature portion;
performing second normalization processing on each piece of third position information based on the mean and the variance to obtain the second position information.
5. The method according to claim 4, wherein the performing first normalization processing on the first position information by using the device parameter to obtain the third position information of each first feature portion comprises:
performing distortion removal processing on the first position information by using the device parameter;
performing the first normalization processing on the distortion-removed first position information to obtain the third position information of each first feature portion.
6. The method according to claim 5, wherein the performing distortion removal processing on the first position information by using the device parameter comprises:
performing the distortion removal processing on the first position information by using a first formula, wherein the first formula includes:
x' = (x - cx)/fx
y' = (y - cy)/fy
r² = x'² + y'²
t = (1 + k4·r² + k5·r⁴ + k6·r⁶)/(1 + k1·r² + k2·r⁴ + k3·r⁶)
Δx = 2p1·x'y' + p2(r² + 2x'²)
Δy = p1(r² + 2y'²) + 2p2·x'y'
u' = (x' - Δx)·t
v' = (y' - Δy)·t
u = u'·fx + cx
v = v'·fy + cy
wherein fx is the focal length of the image capture device along the x-axis, fy is the focal length of the image capture device along the y-axis, cx and cy are respectively the abscissa and the ordinate of the optical center of the image capture device, k1, k2, k3, k4, k5 and k6 are respectively radial distortion parameters of the image capture device, p1 and p2 are tangential distortion parameters of the image capture device, x and y are respectively the abscissa and the ordinate of the first position information, and u and v are respectively the abscissa and the ordinate after the distortion removal processing.
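The distortion removal step of claim 6 can be sketched in Python. This is a hedged illustration rather than the patent's actual code: the radial factor t is assumed here to follow the rational model (1 + k4·r² + k5·r⁴ + k6·r⁶)/(1 + k1·r² + k2·r⁴ + k3·r⁶), an assumption motivated by the six radial parameters k1 to k6 named in the claim, and the function name and sample points are likewise illustrative.

```python
import numpy as np

def remove_distortion(pts, fx, fy, cx, cy, k, p):
    """Distortion removal for (N, 2) pixel coordinates.

    k = (k1..k6) radial parameters, p = (p1, p2) tangential parameters.
    """
    x = (pts[:, 0] - cx) / fx          # x' = (x - cx)/fx
    y = (pts[:, 1] - cy) / fy          # y' = (y - cy)/fy
    r2 = x * x + y * y                 # r^2 = x'^2 + y'^2
    t = ((1 + k[3] * r2 + k[4] * r2**2 + k[5] * r2**3)
         / (1 + k[0] * r2 + k[1] * r2**2 + k[2] * r2**3))
    dx = 2 * p[0] * x * y + p[1] * (r2 + 2 * x * x)   # tangential correction Δx
    dy = p[0] * (r2 + 2 * y * y) + 2 * p[1] * x * y   # tangential correction Δy
    u = (x - dx) * t                   # u' = (x' - Δx)·t
    v = (y - dy) * t                   # v' = (y' - Δy)·t
    return np.stack([u * fx + cx, v * fy + cy], axis=1)

pts = np.array([[400.0, 300.0], [100.0, 50.0]])
out = remove_distortion(pts, 500.0, 500.0, 320.0, 240.0,
                        k=np.zeros(6), p=np.zeros(2))
```

With all distortion parameters set to zero the mapping reduces to the identity, which is a quick sanity check of the formula chain.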
7. The method according to any one of claims 4 to 6, wherein the performing second normalization processing on each piece of third position information based on the mean and the variance to obtain the second position information comprises:
performing the second normalization processing on the third position information based on the mean and the variance by using a second formula, wherein the second formula includes:
s = (xi - mean(x))/std(x)
t = (yi - mean(y))/std(y)
wherein s and t respectively denote the abscissa and the ordinate of the second position information, xi and yi respectively denote the abscissa and the ordinate of the third position information of the i-th first feature portion, mean is the mean function, std is the variance function, and i is a positive integer.
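The second normalization of claim 7 amounts to per-axis standardization of the feature coordinates. A minimal sketch, assuming the third position information is stored as an (N, 2) array and that std denotes the standard deviation; the sample coordinates are illustrative only:

```python
import numpy as np

def second_normalization(third_pts):
    """Per-axis standardization of (N, 2) third position information:
    s_i = (x_i - mean(x)) / std(x), t_i = (y_i - mean(y)) / std(y)."""
    mean = third_pts.mean(axis=0)
    std = third_pts.std(axis=0)
    return (third_pts - mean) / std

third_pts = np.array([[0.1, 0.2], [0.3, 0.6], [0.5, 0.4]])
second_pts = second_normalization(third_pts)
```

After this step each coordinate axis has zero mean and unit spread, which makes the subsequent 3D lifting insensitive to where the subject appears in the frame.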
8. A position and posture detection apparatus, characterized by comprising:
a first determining module configured to determine first position information of each first feature portion of a target object in a target image, wherein the target image is captured by an image capture device;
a second determining module configured to determine, based on the first position information and a device parameter of the image capture device, three-dimensional position information of second feature portions of the target object, the second feature portions including at least each first feature portion;
a third determining module configured to determine a spatial position and posture of the target object based on the corresponding first position information and three-dimensional position information.
9. An electronic device, characterized by comprising:
a processor;
a memory for storing processor-executable instructions;
wherein the processor is configured to perform the method according to any one of claims 1 to 7.
10. A computer-readable storage medium having computer program instructions stored thereon, wherein the computer program instructions, when executed by a processor, implement the method according to any one of claims 1 to 7.
CN201810950565.4A 2018-08-20 2018-08-20 Pose detection method and device, electronic equipment and storage medium Active CN109284681B (en)

Priority Applications (8)

Application Number Priority Date Filing Date Title
CN201810950565.4A CN109284681B (en) 2018-08-20 2018-08-20 Pose detection method and device, electronic equipment and storage medium
MYPI2020005562A MY188075A (en) 2018-08-20 2019-06-28 Pose detection method and device, electronic device and storage medium
US17/049,674 US11107239B2 (en) 2018-08-20 2019-06-28 Pose detection method and device, electronic device and storage medium
EP19853007.3A EP3770803A4 (en) 2018-08-20 2019-06-28 Orientation detection method and device, electronic device and storage medium
KR1020207030384A KR102324001B1 (en) 2018-08-20 2019-06-28 Position and posture detection method and device, electronic device and storage medium
JP2020558949A JP7074888B2 (en) 2018-08-20 2019-06-28 Position / orientation detection method and devices, electronic devices and storage media
SG11202010514SA SG11202010514SA (en) 2018-08-20 2019-06-28 Pose detection method and device, electronic device and storage medium
PCT/CN2019/093697 WO2020038111A1 (en) 2018-08-20 2019-06-28 Orientation detection method and device, electronic device and storage medium


Publications (2)

Publication Number Publication Date
CN109284681A true CN109284681A (en) 2019-01-29
CN109284681B CN109284681B (en) 2020-11-27

Family

ID=65183286

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810950565.4A Active CN109284681B (en) 2018-08-20 2018-08-20 Pose detection method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN109284681B (en)


Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080130961A1 (en) * 2004-11-12 2008-06-05 Koichi Kinoshita Face Feature Point Detection Apparatus and Feature Point Detection Apparatus
CN101339607A (en) * 2008-08-15 2009-01-07 北京中星微电子有限公司 Human face recognition method and system, human face recognition model training method and system
US20160132755A1 (en) * 2013-06-28 2016-05-12 Nec Corporation Training data generating device, method, and program, and crowd state recognition device, method, and program
CN108177143A (en) * 2017-12-05 2018-06-19 上海工程技术大学 A kind of robot localization grasping means and system based on laser vision guiding


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
达达MFZ: "Camera Calibration: Distortion Correction and Inverse Distortion Calculation" (相机标定之畸变矫正与反畸变计算), 《HTTPS://WWW.CNBLOGS.COM/MAFUQIANG/P/8134617.HTML》 *

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2020038111A1 (en) * 2018-08-20 2020-02-27 北京市商汤科技开发有限公司 Orientation detection method and device, electronic device and storage medium
US11107239B2 (en) 2018-08-20 2021-08-31 Beijing Sensetime Technology Development Co., Ltd. Pose detection method and device, electronic device and storage medium
CN109829947A (en) * 2019-02-25 2019-05-31 北京旷视科技有限公司 Pose determines method, tray loading method, apparatus, medium and electronic equipment
WO2020207190A1 (en) * 2019-04-12 2020-10-15 Oppo广东移动通信有限公司 Three-dimensional information determination method, three-dimensional information determination device, and terminal apparatus
CN110503689A (en) * 2019-08-30 2019-11-26 清华大学 Attitude prediction method, model training method and device
WO2021035833A1 (en) * 2019-08-30 2021-03-04 清华大学 Posture prediction method, model training method and device
US11461925B2 (en) 2019-08-30 2022-10-04 Tsinghua University Pose prediction method and apparatus, and model training method and apparatus
CN111369571A (en) * 2020-02-27 2020-07-03 北京百度网讯科技有限公司 Three-dimensional object pose accuracy judgment method and device and electronic equipment
CN111666917A (en) * 2020-06-19 2020-09-15 北京市商汤科技开发有限公司 Attitude detection and video processing method and device, electronic equipment and storage medium
CN112802097A (en) * 2020-12-30 2021-05-14 深圳市慧鲤科技有限公司 Positioning method, positioning device, electronic equipment and storage medium
WO2022205618A1 (en) * 2021-03-31 2022-10-06 上海商汤临港智能科技有限公司 Positioning method, driving control method, apparatus, computer device, and storage medium

Also Published As

Publication number Publication date
CN109284681B (en) 2020-11-27


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant