KR101717377B1 - Device and method for head pose estimation - Google Patents

Device and method for head pose estimation

Info

Publication number
KR101717377B1
KR101717377B1 (application KR1020150169265A)
Authority
KR
South Korea
Prior art keywords
face posture
codewords
code
image
feature point
Prior art date
Application number
KR1020150169265A
Other languages
Korean (ko)
Inventor
김현덕
손명규
이상헌
Original Assignee
재단법인대구경북과학기술원
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 재단법인대구경북과학기술원 filed Critical 재단법인대구경북과학기술원
Priority to KR1020150169265A priority Critical patent/KR101717377B1/en
Application granted granted Critical
Publication of KR101717377B1 publication Critical patent/KR101717377B1/en

Classifications

    • G06K9/00268
    • G06K9/00275
    • G06K9/00335
    • G06K9/38

Landscapes

  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

The present invention relates to a face posture estimation apparatus and a face posture estimation method. The face posture estimation apparatus according to the present invention includes an extraction unit for extracting a feature point from an image, and a processing unit for identifying, from a codebook, m codewords (m being a natural number of 3 or more) close to the coordinates of the feature point and encoding the feature point, using the m codewords, into an encoding code related to face posture recognition in the image.

Description

TECHNICAL FIELD

The present invention relates to a face posture estimation apparatus and a face posture estimation method.

In the prior art, techniques for detecting the shape of a person through feature points have been used. Vector quantization (VQ) and sparse coding (SC) have been used as methods of encoding the feature points. However, these prior-art encoding methods have the following disadvantages.

Fig. 1 is a diagram for explaining a method of coding feature points in the prior art.

The vector quantization method shown in FIG. 1(a) is the simplest encoding method in the related art. Vector quantization encodes a feature descriptor to the single locally closest codeword using a least-squares method. A code produced by vector quantization has the disadvantage that each feature value corresponds one-to-one to a single codeword of the codebook, which creates a high possibility of quantization error.

The sparse coding method shown in FIG. 1(b) is a coding method widely used in the field of computer vision. Sparse coding encodes a feature against a small number of codewords in order to reduce the error of vector quantization. However, sparse coding has the disadvantage that its performance may be limited, because it emphasizes only sparsity without considering the structure of the data in the feature space.

Therefore, there is a need for an encoding method that can reduce the possibility of quantization error and consider the structure of the data by constraining locality.

SUMMARY OF THE INVENTION The present invention has been made to solve the above-mentioned problems, and an object of the present invention is to quickly and accurately recognize the face posture by performing limited local sparse encoding with a small amount of computation and improved accuracy with respect to the feature points.

A face posture estimation apparatus for achieving the above object comprises an extraction unit for extracting a feature point from an image, and a processing unit for identifying, from a codebook, m codewords (m being a natural number of 3 or more) close to the coordinates of the feature point and encoding the feature point, using the m codewords, into an encoding code for recognizing a face posture in the image.

A face posture estimation method for achieving the above object includes extracting a feature point from an image; identifying, from a codebook, m codewords (m being a natural number of 3 or more) close to the coordinates of the feature point; and encoding the feature point, using the m codewords, into an encoding code for recognizing the face posture in the image.

According to an embodiment of the present invention, the face posture can be quickly and accurately recognized by performing limited local sparse encoding with a small amount of computation and improved accuracy with respect to the feature points.

In addition, according to an embodiment of the present invention, fast and accurate face posture recognition can improve interaction with advertisements and contents that require fast recognition, and can improve interaction with computer apparatuses that require accuracy, without the need for a separate input device (e.g., keyboard, mouse).

Fig. 1 is a diagram for explaining a method of coding feature points in the prior art.
2 is a block diagram showing a face posture estimating apparatus according to an embodiment of the present invention.
Figure 3 is a diagrammatic representation of limited local sparse encoding in accordance with one embodiment of the present invention.
4 is a flowchart illustrating a method of estimating a face posture according to an exemplary embodiment of the present invention.

Hereinafter, embodiments according to the present invention will be described in detail with reference to the accompanying drawings. However, the present invention is not limited to or limited by the embodiments. Like reference symbols in the drawings denote like elements.

The face posture estimating apparatus and the face posture estimating method described in this specification can encode codes considering the structure of data by limiting the locality.

2 is a block diagram showing a face posture estimating apparatus according to an embodiment of the present invention.

The face posture estimation apparatus 200 of the present invention may include an extraction unit 210 and a processing unit 220.

The extraction unit 210 extracts feature points from the image. That is, the extraction unit 210 may extract feature points using a Histogram of Oriented Gradients (HOG) feature descriptor. At this time, the extraction unit 210 can extract a set X = {x1, x2, …, xn} of n feature points from the image. The extraction unit 210 can extract feature points efficiently by using the characteristic of the HOG feature descriptor that is invariant to changes in rotation.
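As a rough illustration of the kind of descriptor involved, the sketch below builds a single gradient-orientation histogram in NumPy. It is a toy stand-in, not the patent's implementation: real HOG divides the window into cells and applies block normalization, and the function name and parameters here are hypothetical.

```python
import numpy as np

def hog_like_descriptor(patch, n_bins=9):
    """Toy gradient-orientation histogram for one grayscale patch.

    A minimal sketch of the idea behind a single HOG cell: accumulate
    gradient magnitudes into orientation bins, then L2-normalize.
    """
    gy, gx = np.gradient(patch.astype(float))
    mag = np.hypot(gx, gy)
    # Orientation folded into [0, pi): unsigned gradients, as in standard HOG
    ang = np.mod(np.arctan2(gy, gx), np.pi)
    bins = np.minimum((ang / np.pi * n_bins).astype(int), n_bins - 1)
    hist = np.zeros(n_bins)
    np.add.at(hist, bins.ravel(), mag.ravel())  # weighted vote per pixel
    norm = np.linalg.norm(hist)
    return hist / norm if norm > 0 else hist

patch = np.outer(np.arange(8), np.ones(8))  # vertical intensity ramp
h = hog_like_descriptor(patch)
```

For the vertical ramp, all gradient energy falls into the bin containing orientation pi/2, so the histogram concentrates in a single bin.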

The processing unit 220 identifies, from a codebook, m codewords (m being a natural number of 3 or more) close to the coordinates of the feature point, and encodes the feature point, using the m codewords, into an encoding code related to face posture recognition in the image. That is, the processing unit 220 can identify at least three codewords adjacent to the feature point. At this time, the processing unit 220 can use a codebook B = {b1, b2, …, bm} having m codewords. In addition, the processing unit 220 may encode the feature points through limited local sparse encoding to recognize the face posture distinct from the background in the image.

In addition, the processing unit 220 may generate the encoding code by inversely calculating, with the feature point, a function composed of the identified m codewords. That is, the processing unit 220 can encode a feature point x into an m-dimensional code c (the encoding code).

In addition, after identifying the m codewords, if the number of codewords having a value of 0 is relatively small, the processing unit 220 may repeat the identification of codewords to limit the sparsity of the identified m codewords. That is, the processing unit 220 may limit the sparsity of the m codewords so that the codes c having a value of 0 outnumber the codes c having a value other than 0.

In addition, when a plurality of feature points including a first feature point and a second feature point are extracted, the processing unit 220 may identify the m codewords close to the coordinates of the second feature point, excluding the codewords identified based on the first feature point. That is, when a plurality of feature points are extracted, the processing unit 220 may identify, for the second feature point, only codewords close to the second feature point that were not identified for the first feature point. For a more detailed description of limited local sparse encoding, reference is made to FIG. 3 below.
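The exclusion rule described above can be sketched as follows. This is a minimal NumPy illustration under the assumption that "close" means smallest Euclidean distance; the function name and the one-dimensional codebook are hypothetical.

```python
import numpy as np

def nearest_codewords(x, codebook, m=3, exclude=()):
    """Return indices of the m codewords closest to feature point x,
    skipping any indices already identified for a previous feature point.

    `codebook` is (num_codewords, dim); `exclude` mirrors the described
    exclusion of the first feature point's codewords when encoding the
    second feature point.
    """
    dists = np.linalg.norm(codebook - x, axis=1)
    dists[list(exclude)] = np.inf          # never re-select excluded codewords
    return np.argsort(dists)[:m]

codebook = np.array([[0.0], [1.0], [2.0], [3.0], [4.0], [5.0]])
first = nearest_codewords(np.array([0.2]), codebook)
second = nearest_codewords(np.array([1.0]), codebook, exclude=first)
```

Here the second feature point would ordinarily claim codeword index 1, but because indices 0, 1, and 2 were identified for the first feature point, it is encoded against the next-nearest remaining codewords instead.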

Figure 3 is a diagrammatic representation of limited local sparse encoding in accordance with one embodiment of the present invention.

In FIG. 3, it is assumed by way of example that the face posture estimation apparatus 200 extracts two feature points, but the present invention is not limited thereto. That is, the face posture estimation apparatus 200 can extract n feature points.

First, the face posture estimation apparatus 200 can limit the sparsity of the code c so that most elements of the code c have a value of 0 and only a few have a value other than 0. For example, the face posture estimation apparatus 200 identifies three codewords 321, 322, and 323, and if the number of codewords having a value of 0 is relatively small, repeats the identification so as to limit the sparsity of the three identified codewords 321, 322, and 323.

Next, the face posture estimation apparatus 200 can extract the first and second feature points 310 and 320. At this time, the face posture estimation apparatus 200 can identify at least three codewords 321, 322, and 323 for the second feature point 320. When identifying the codewords 321, 322, and 323 for the second feature point 320, the face posture estimation apparatus 200 excludes the codewords 311, 312, and 313 identified for the first feature point 310 and identifies codewords 321, 322, and 323 that are close to the second feature point 320. As described above, the face posture estimation apparatus 200 can constrain locality by selecting only nearby codewords for the code c.

Referring again to FIG. 2, the processing unit 220 can perform limited local sparse encoding through the objective function of Equation (1).

[Equation (1) — image not reproduced]

Here, ⊙ denotes the element-wise product, and d may be the distance between the feature point X and the codebook B. Further, the processing unit 220 may transform Equation (1) into Equation (2) using a slack variable.

[Equation (2) — image not reproduced]

The processing unit 220 can derive Equation (3) from Equation (2) using the inexact ALM (Augmented Lagrange Multiplier) method.

[Equation (3) — image not reproduced]

The processing unit 220 may generate the encoding code by repeatedly updating the variables of Equation (3).

At this time, the processing unit 220 can select the codewords closest to the coordinates through the K-nearest neighbor algorithm. That is, the processing unit 220 may select the codewords close to the feature point X to form a smaller codebook, and obtain the code c using Equation (4), which is a linear system of smaller size.

[Equation (4) — image not reproduced]
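Equation (4) itself is rendered only as an image, so the sketch below assumes the widely used locality-constrained form of this step: a small least-squares system over only the m selected codewords, solved analytically with a sum-to-one constraint. Whether the patent's Equation (4) matches this exactly is an assumption, and the function name is hypothetical.

```python
import numpy as np

def local_code(x, codebook, idx, eps=1e-6):
    """Solve a small constrained least-squares system over the m selected
    codewords only, instead of optimizing over the full codebook.

    Assumed form: minimize ||x - B_sel^T c||^2 subject to sum(c) = 1,
    via the m x m Gram system (C + eps*tr(C)*I) c = 1, then normalization.
    """
    B_sel = codebook[idx]                      # (m, dim) selected codewords
    diff = B_sel - x                           # center on the feature point
    C = diff @ diff.T                          # m x m local covariance
    C += eps * np.trace(C) * np.eye(len(idx))  # regularize the singular Gram
    c = np.linalg.solve(C, np.ones(len(idx)))  # small m x m linear system
    return c / c.sum()                         # enforce sum-to-one constraint

codebook = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [5.0, 5.0]])
idx = np.array([0, 1, 2])                      # m = 3 nearest codewords
c = local_code(np.array([0.25, 0.25]), codebook, idx)
```

Because only an m x m system is solved per feature point, the cost is independent of the codebook size, which matches the document's emphasis on a small amount of computation.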

The overall algorithm performed by the processing unit 220 may be as shown in Equation (5).

[Equation (5) — image not reproduced]

In addition, the processing unit 220 may apply the encoding code to a linear SVM (Support Vector Machine) to recognize the face posture from the image. That is, the processing unit 220 can quickly recognize the face posture by training the linear SVM with the encoding code. At this time, the processing unit 220 may use various detection models other than the linear SVM; for example, a kernel-based SVM, a Bayes classifier, and the like.
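A minimal stand-in for the linear SVM stage might look like the following: a hinge-loss linear classifier trained by plain subgradient descent on toy "encoding codes". The data are synthetic and the training loop is a simplification; a real system would use a dedicated solver such as LIBLINEAR.

```python
import numpy as np

def train_linear_svm(X, y, lam=0.01, epochs=200, lr=0.1):
    """Train a linear SVM (hinge loss + L2 penalty) by subgradient descent.

    Labels y are in {-1, +1}; returns weight vector w and bias b.
    """
    n, d = X.shape
    w, b = np.zeros(d), 0.0
    for _ in range(epochs):
        margin = y * (X @ w + b)
        active = margin < 1                     # samples violating the margin
        gw = lam * w - (y[active, None] * X[active]).sum(axis=0) / n
        gb = -y[active].sum() / n
        w -= lr * gw
        b -= lr * gb
    return w, b

# Toy stand-in for encoding codes: two linearly separable clusters
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-2, 0.3, (20, 2)), rng.normal(2, 0.3, (20, 2))])
y = np.array([-1] * 20 + [1] * 20)
w, b = train_linear_svm(X, y)
pred = np.sign(X @ w + b)
```

On this well-separated toy data the learned hyperplane classifies the training set perfectly; real encoding codes would of course be higher-dimensional and noisier.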

Further, the processing unit 220 can separate the face posture, distinguished by the encoding code, from the background in the image. That is, by learning encoding codes separated into face posture and background, the processing unit 220 can recognize the face posture from encoding codes generated from subsequent images.

According to the face posture estimation apparatus 200 of the present invention, the face posture can be quickly and accurately recognized by performing limited local sparse encoding with a small amount of computation and improved accuracy with respect to the feature points.

In addition, according to the face posture estimation apparatus 200 of the present invention, fast and accurate face posture recognition can improve interaction with advertisements and contents that require fast recognition, and can improve interaction with computer apparatuses that require accuracy, without the need for a separate input device (e.g., keyboard, mouse).

4 is a flowchart illustrating a method of estimating a face posture according to an exemplary embodiment of the present invention.

First, the face posture estimation method according to the present embodiment can be performed by the above-described face posture estimation apparatus 200.

First, the face posture estimation apparatus 200 extracts feature points from an image (410). That is, step 410 may be a process of extracting feature points using a Histogram of Oriented Gradients (HOG) feature descriptor. At this time, the face posture estimation apparatus 200 can extract a set X = {x1, x2, …, xn} of n feature points from the image. The face posture estimation apparatus 200 can effectively extract the feature points by using the characteristic of the HOG feature descriptor that is invariant to changes in rotation.

Next, the face posture estimation apparatus 200 identifies, from a codebook, m codewords (m being a natural number of 3 or more) close to the coordinates of the feature point (420). That is, step 420 may be a process of identifying at least three codewords adjacent to the feature point. At this time, the face posture estimation apparatus 200 can use a codebook B = {b1, b2, …, bm} having m codewords. In addition, the face posture estimation apparatus 200 can recognize the face posture distinguished from the background in the image by encoding the feature points through limited local sparse encoding.

Next, the face posture estimation apparatus 200 encodes the feature point, using the m codewords, into an encoding code related to face posture recognition in the image (430). That is, step 430 may encode the feature points through limited local sparse encoding to recognize the face posture distinct from the background in the image.

In addition, step 430 may generate the encoding code by inversely calculating, with the feature point, a function composed of the identified m codewords. That is, the face posture estimation apparatus 200 can encode a feature point x into an m-dimensional code c (the encoding code).

In this case, when a plurality of feature points including a first feature point and a second feature point are extracted, step 420 may identify the m codewords close to the coordinates of the second feature point, excluding the codewords identified based on the first feature point. That is, when a plurality of feature points are extracted, the face posture estimation apparatus 200 can identify, for the second feature point, only codewords close to the second feature point that were not identified for the first feature point.

In addition, in step 420, after the m codewords are identified, if the number of codewords having a value of 0 is relatively small, the identification of codewords may be repeated to limit the sparsity of the identified m codewords. That is, the face posture estimation apparatus 200 can limit the sparsity of the m codewords so that the codes c having a value of 0 outnumber the codes c having a value other than 0.
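The sparsity limiting described above can be approximated by hard thresholding: keep the k largest-magnitude entries of the code and zero the rest, so that zero-valued codes outnumber nonzero ones. This is a deliberate simplification of the optimization-based procedure in the text; the function name and the choice of k are hypothetical.

```python
import numpy as np

def limit_sparsity(code, k):
    """Keep only the k largest-magnitude coefficients of a code, zeroing the
    rest, so that entries with value 0 outnumber the nonzero entries.

    A simple hard-thresholding stand-in for the sparsity limiting step.
    """
    code = np.asarray(code, dtype=float)
    out = np.zeros_like(code)
    keep = np.argsort(np.abs(code))[-k:]   # indices of the k largest entries
    out[keep] = code[keep]
    return out

c = limit_sparsity([0.1, -0.7, 0.05, 0.6, 0.02], k=2)
```

After thresholding, only the two dominant coefficients survive, giving the mostly-zero code the document calls for.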

The face posture estimation apparatus 200 can perform limited local sparse encoding through the objective function of Equation (6).

[Equation (6) — image not reproduced]

Here, ⊙ denotes the element-wise product, and d may be the distance between the feature point X and the codebook B. In addition, the face posture estimation apparatus 200 can transform Equation (6) into Equation (7) using a slack variable.

[Equation (7) — image not reproduced]

The face posture estimation apparatus 200 can derive Equation (8) from Equation (7) using the inexact ALM (Augmented Lagrange Multiplier) method.

[Equation (8) — image not reproduced]

The face posture estimation apparatus 200 can generate the encoded code by updating the variables repeatedly with respect to Equation (8).

In addition, step 420 may be a process of selecting the codewords closest to the coordinates through the K-nearest neighbor algorithm.

That is, the face posture estimation apparatus 200 may select the codewords close to the feature point X to form a smaller codebook, and obtain the code c using Equation (9), which is a linear system of smaller size.

[Equation (9) — image not reproduced]

The overall algorithm performed by the face posture estimation apparatus 200 may be as shown in Equation (10).

[Equation (10) — image not reproduced]

According to an embodiment, the face posture estimation apparatus 200 can apply the encoding code to a linear SVM (Support Vector Machine) to recognize the face posture from the image. That is, the face posture estimation apparatus 200 can quickly recognize the face posture by training the linear SVM with the encoding code. At this time, the face posture estimation apparatus 200 can use various detection models other than the linear SVM; for example, a kernel-based SVM, a Bayes classifier, and the like.

According to an embodiment, the face posture estimation apparatus 200 can separate the face posture, distinguished by the encoding code, from the background in the image. That is, by learning encoding codes separated into face posture and background, the face posture estimation apparatus 200 can recognize the face posture from encoding codes generated from subsequent images.

According to the face posture estimation method of the present invention, the face posture can be recognized quickly and accurately by performing limited local sparse encoding with a small amount of computation and improved accuracy with respect to the feature points.

In addition, according to the face posture estimation method of the present invention, fast and accurate face posture recognition can improve interaction with advertisements and contents that require fast recognition, and can improve interaction with computer apparatuses that require accuracy, without the need for a separate input device (e.g., keyboard, mouse).

The method according to an embodiment of the present invention may be implemented in the form of program instructions that can be executed through various computer means and recorded on a computer-readable medium. The computer-readable medium may include program instructions, data files, data structures, and the like, alone or in combination. The program instructions recorded on the medium may be those specially designed and configured for the embodiments, or may be those known and available to those skilled in computer software. Examples of computer-readable media include magnetic media such as hard disks, floppy disks, and magnetic tape; optical media such as CD-ROMs and DVDs; magneto-optical media; and hardware devices specially configured to store and execute program instructions, such as ROM, RAM, and flash memory. Examples of program instructions include machine code such as that produced by a compiler, as well as high-level language code that can be executed by a computer using an interpreter or the like. The hardware devices described above may be configured to operate as one or more software modules to perform the operations of the embodiments, and vice versa.

While the present invention has been particularly shown and described with reference to exemplary embodiments thereof, it is to be understood that the invention is not limited to the disclosed embodiments. For example, appropriate results may be achieved even if the described techniques are performed in an order different from the described method, and/or components of the described systems, structures, devices, circuits, and the like are combined or coupled in a form different from the described method, or are replaced or substituted by other components or equivalents.

Therefore, other implementations, other embodiments, and equivalents to the claims are also within the scope of the following claims.

200: face posture estimation device
210: extraction unit
220: processing unit

Claims (12)

1. A face posture estimation apparatus comprising:
an extraction unit for extracting a feature point from an image; and
a processing unit for identifying, from a codebook, m codewords (m being a natural number of 3 or more) close to the coordinates of the feature point, and encoding the feature point, using the m codewords, into an encoding code related to face posture recognition in the image,
wherein, when a plurality of feature points including a first feature point and a second feature point are extracted from the image, the processing unit
identifies the m codewords close to the coordinates of the second feature point, excluding codewords identified based on the first feature point, and,
when the number of codewords having a value of 0 is less than a predetermined number, repeats the identification of codewords to limit the sparsity of the identified m codewords.
2. (deleted)

3. The face posture estimation apparatus according to claim 1, wherein the processing unit generates the encoding code by inversely calculating, with the feature point, a function composed of the identified m codewords.
4. The face posture estimation apparatus according to claim 1, wherein the processing unit applies the encoding code to a linear SVM (Support Vector Machine) to recognize the face posture from the image.
5. (deleted)

6. The face posture estimation apparatus according to claim 1, wherein the processing unit selects the codewords in order of closest distance from the coordinates through the K-nearest neighbor algorithm.
7. The face posture estimation apparatus according to claim 1, wherein the processing unit separates the face posture, distinguished by the encoding code, from the background in the image.
8. A face posture estimation method comprising:
extracting a feature point from an image;
identifying, from a codebook, m codewords (m being a natural number of 3 or more) close to the coordinates of the feature point; and
encoding the feature point, using the m codewords, into an encoding code for recognizing a face posture in the image,
wherein, when a plurality of feature points including a first feature point and a second feature point are extracted from the image, the identifying of the codewords from the codebook comprises:
identifying the m codewords close to the coordinates of the second feature point, excluding codewords identified based on the first feature point; and
when the number of codewords having a value of 0 is less than a predetermined number, repeating the identification of codewords to limit the sparsity of the identified m codewords.
9. (deleted)

10. The method of claim 8, wherein the encoding comprises generating the encoding code by inversely calculating, with the feature point, a function composed of the identified m codewords.
11. The method of claim 8, further comprising applying the encoding code to a linear SVM (Support Vector Machine) to recognize the face posture from the image.
12. (deleted)
KR1020150169265A 2015-11-30 2015-11-30 Device and method for head pose estimation KR101717377B1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
KR1020150169265A KR101717377B1 (en) 2015-11-30 2015-11-30 Device and method for head pose estimation

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
KR1020150169265A KR101717377B1 (en) 2015-11-30 2015-11-30 Device and method for head pose estimation

Publications (1)

Publication Number Publication Date
KR101717377B1 true KR101717377B1 (en) 2017-03-17

Family

ID=58501940

Family Applications (1)

Application Number Title Priority Date Filing Date
KR1020150169265A KR101717377B1 (en) 2015-11-30 2015-11-30 Device and method for head pose estimation

Country Status (1)

Country Link
KR (1) KR101717377B1 (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107463903A (en) * 2017-08-08 2017-12-12 北京小米移动软件有限公司 Face key point positioning method and device
CN111310512A (en) * 2018-12-11 2020-06-19 杭州海康威视数字技术股份有限公司 User identity authentication method and device
KR20210037925A (en) * 2019-09-30 2021-04-07 주식회사 씨엘 Passenger counting apparatus using computer vision and passenger monitoring system thereof
KR20230058863A (en) * 2021-10-25 2023-05-03 에스케이텔레콤 주식회사 Method for Code-Level Super Resolution And Method for Training Super Resolution Model Therefor

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20110052427A (en) * 2009-11-12 2011-05-18 한국전자통신연구원 Method and apparatus for video encoding/decoding using adaptive codeword assignment to syntax element
KR20120079083A (en) * 2009-09-02 2012-07-11 Rockstar Bidco, LP Systems and methods of encoding using a reduced codebook with adaptive resetting

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20120079083A (en) * 2009-09-02 2012-07-11 Rockstar Bidco, LP Systems and methods of encoding using a reduced codebook with adaptive resetting
KR20110052427A (en) * 2009-11-12 2011-05-18 한국전자통신연구원 Method and apparatus for video encoding/decoding using adaptive codeword assignment to syntax element

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107463903A (en) * 2017-08-08 2017-12-12 北京小米移动软件有限公司 Face key point positioning method and device
CN107463903B (en) * 2017-08-08 2020-09-04 北京小米移动软件有限公司 Face key point positioning method and device
CN111310512A (en) * 2018-12-11 2020-06-19 杭州海康威视数字技术股份有限公司 User identity authentication method and device
CN111310512B (en) * 2018-12-11 2023-08-22 杭州海康威视数字技术股份有限公司 User identity authentication method and device
KR20210037925A (en) * 2019-09-30 2021-04-07 주식회사 씨엘 Passenger counting apparatus using computer vision and passenger monitoring system thereof
KR102287683B1 (en) * 2019-09-30 2021-08-10 주식회사 씨엘 Passenger counting apparatus using computer vision and passenger monitoring system thereof
KR20230058863A (en) * 2021-10-25 2023-05-03 에스케이텔레콤 주식회사 Method for Code-Level Super Resolution And Method for Training Super Resolution Model Therefor
KR102566798B1 (en) * 2021-10-25 2023-08-11 에스케이텔레콤 주식회사 Method for Code-Level Super Resolution And Method for Training Super Resolution Model Therefor

Similar Documents

Publication Publication Date Title
KR102221118B1 (en) Method for extracting feature of image to recognize object
KR102010378B1 (en) Device and method to extract feature of image including object
KR102077260B1 (en) Method and apparatus of face recognition using confidence based on probabilistic model
KR101717377B1 (en) Device and method for head pose estimation
CN111079683A (en) Remote sensing image cloud and snow detection method based on convolutional neural network
KR102043960B1 (en) Method and systems of face expression features classification robust to variety of face image appearance
KR102399025B1 (en) Improved data comparison method
KR101451854B1 (en) Apparatus for recongnizing face expression and method thereof
KR20220076398A (en) Object recognition processing apparatus and method for ar device
JP2017515222A (en) Line segmentation method
US20140233848A1 (en) Apparatus and method for recognizing object using depth image
KR20170024303A (en) System and method for detecting feature points of face
KR102434574B1 (en) Method and apparatus for recognizing a subject existed in an image based on temporal movement or spatial movement of a feature point of the image
US20160078314A1 (en) Image Retrieval Apparatus, Image Retrieval Method, and Recording Medium
CN110826554B (en) Infrared target detection method
Vafadar et al. A vision based system for communicating in virtual reality environments by recognizing human hand gestures
CN114140831A (en) Human body posture estimation method and device, electronic equipment and storage medium
CN107533671B (en) Pattern recognition device, pattern recognition method, and recording medium
Mantecón et al. Enhanced gesture-based human-computer interaction through a Compressive Sensing reduction scheme of very large and efficient depth feature descriptors
EP3192010A1 (en) Image recognition using descriptor pruning
JP7031686B2 (en) Image recognition systems, methods and programs, as well as parameter learning systems, methods and programs
JP6393495B2 (en) Image processing apparatus and object recognition method
KR102399673B1 (en) Method and apparatus for recognizing object based on vocabulary tree
KR101514551B1 (en) Multimodal user recognition robust to environment variation
KR102014093B1 (en) System and method for detecting feature points of face

Legal Events

Date Code Title Description
E701 Decision to grant or registration of patent right
GRNT Written decision to grant
FPAY Annual fee payment

Payment date: 20191203

Year of fee payment: 4