WO2019085519A1 - Facial tracking method and device - Google Patents

Facial tracking method and device

Info

Publication number
WO2019085519A1
WO2019085519A1 PCT/CN2018/092634 CN2018092634W
Authority
WO
WIPO (PCT)
Prior art keywords
facial feature
target
coordinate
coordinates
distance value
Prior art date
Application number
PCT/CN2018/092634
Other languages
English (en)
Chinese (zh)
Inventor
陆小松
张涛
蒲天发
Original Assignee
宁波视睿迪光电有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 宁波视睿迪光电有限公司 filed Critical 宁波视睿迪光电有限公司
Publication of WO2019085519A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T15/00 3D [Three Dimensional] image rendering
    • G06T15/005 General purpose rendering architectures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/161 Detection; Localisation; Normalisation
    • G06V40/166 Detection; Localisation; Normalisation using acquisition arrangements
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/168 Feature extraction; Face representation
    • G06V40/171 Local features and components; Facial parts; Occluding parts, e.g. glasses; Geometrical relationships
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/18 Eye characteristics, e.g. of the iris
    • G06V40/193 Preprocessing; Feature extraction

Definitions

  • the present invention relates to the field of image processing technologies, and in particular, to a face tracking method and apparatus.
  • the face tracking technology is applied to the naked-eye 3D stereoscopic display technology.
  • Tracking of the user's face position is used to ensure that the synthesized naked-eye 3D picture shown on the display is adapted to the position of the user's face, thereby ensuring that the user sees a better stereoscopic effect.
  • However, when the environment in which the user is located is complicated, for example when the background pattern is cluttered or crowded, it is difficult to accurately track the user who needs to be tracked. The displayed naked-eye 3D picture then becomes unsuitable and may even cause the user discomfort such as dazzling or dizziness.
  • an object of the present invention is to provide a face tracking method and apparatus to effectively improve the above drawbacks.
  • an embodiment of the present invention provides a face tracking method, which is applied to a display terminal.
  • The method includes: determining at least one facial feature coordinate from the acquired current image; determining, from the at least one facial feature coordinate, a target facial feature coordinate that satisfies a preset criterion, wherein the target facial feature coordinate is the coordinate at which the facial features of the target user to be tracked by the display terminal are distributed in the current image; and synthesizing, according to the target facial feature coordinate, the current naked-eye 3D image for display, so that the displayed naked-eye 3D image is adapted to the current viewing position of the target user.
  • an embodiment of the present invention provides a face tracking device applied to a display terminal.
  • the apparatus includes an acquisition module for determining at least one facial feature coordinate from the acquired current image.
  • a determining module configured to determine a target facial feature coordinate that satisfies a preset criterion from the at least one facial feature coordinate, wherein the target facial feature coordinate is a facial feature distribution of a target user that needs to be tracked by the display terminal The coordinates in the current image.
  • a generating module configured to synthesize a current naked-eye 3D image for display according to the target facial feature coordinates, so that the displayed current naked-eye 3D image is adapted to a current viewing position of the target user.
  • By processing each acquired current frame image, the target facial feature coordinate satisfying the preset criterion is determined from the at least one facial feature coordinate in the current image. Because the target facial feature coordinate corresponds to the target user that the display terminal needs to track, and the current naked-eye 3D image for display is then synthesized according to the target facial feature coordinate, the displayed current naked-eye 3D image can be adapted to the current viewing position of the target user. Therefore, no matter how the background environment changes, the display terminal can effectively track the face of the target user, so that the displayed naked-eye 3D picture adapts to the user in real time for a better 3D effect.
  • FIG. 1 is a structural block diagram of a display terminal according to an embodiment of the present invention.
  • FIG. 2 is a flowchart of a face tracking method according to a first embodiment of the present invention
  • FIG. 3 is a flowchart of a first method of step S200 in a face tracking method according to a first embodiment of the present invention
  • FIG. 4 is a flowchart showing a second method of step S200 in a face tracking method according to a first embodiment of the present invention
  • FIG. 5 is a flowchart of a third method of step S200 in a face tracking method according to a first embodiment of the present invention
  • FIG. 6 is a flowchart showing a fourth method of step S200 in a face tracking method according to a first embodiment of the present invention
  • FIG. 7 is a structural block diagram of a face tracking device according to a second embodiment of the present invention.
  • FIG. 8 is a structural block diagram of an acquisition module in a face tracking device according to a second embodiment of the present invention.
  • FIG. 9 is a block diagram showing a first structure of a determining module in a face tracking device according to a second embodiment of the present invention.
  • FIG. 10 is a block diagram showing a second structure of a determining module in a face tracking device according to a second embodiment of the present invention.
  • FIG. 1 is a block diagram of the display terminal 10.
  • the display terminal 10 includes a face tracking device, a memory 101, a memory controller 102, a processor 103, a peripheral interface 104, an input and output unit 105, a display unit 106, an image acquisition unit 107, and a distance acquisition unit 108.
  • the memory 101, the memory controller 102, the processor 103, the peripheral interface 104, the input and output unit 105, the display unit 106, the image acquisition unit 107, and the distance collection unit 108 are electrically connected directly or indirectly to each other.
  • the components can be electrically connected to one another via one or more communication buses or signal lines.
  • The face tracking device includes at least one software function module that can be stored in the memory 101, or embedded in the operating system (OS) of the display terminal 10, in the form of software or firmware.
  • the processor 103 is configured to execute an executable module stored in the memory 101, such as a software function module or a computer program included in the face tracking device.
  • The memory 101 can be, but is not limited to, a random access memory (RAM), a read-only memory (ROM), a programmable read-only memory (PROM), an erasable programmable read-only memory (EPROM), an electrically erasable programmable read-only memory (EEPROM), and the like.
  • The memory 101 is used to store a program, and the processor 103 executes the program after receiving an execution instruction. The method performed by the display terminal 10, as defined by the processes disclosed in any embodiment of the present invention, may be applied to the processor 103 or implemented by the processor 103.
  • the processor 103 can be an integrated circuit chip with signal processing capabilities.
  • The above processor may be a general-purpose processor, including a central processing unit (CPU), a network processor (NP), and the like; it may also be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA), another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component, capable of implementing or executing the methods, steps, and logic blocks disclosed in the embodiments of the present invention.
  • The general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like.
  • the peripheral interface 104 couples various input and output units 105 to the processor 103 and the memory 101.
  • the peripheral interface, processor, and memory controller can be implemented in a single chip. In other instances, they can be implemented by separate chips.
  • The input and output unit 105 is configured to provide the user with input data so as to implement interaction between the user and the display terminal.
  • the input and output unit may be, but not limited to, a mouse, a keyboard, and the like.
  • The display unit 106 provides an interactive interface (for example, a user operation interface) between the display terminal and the user, or is used to display image data for the user's reference.
  • the display unit may be a liquid crystal display or a touch display.
  • If the display unit is a touch display, it can be a capacitive touch screen or a resistive touch screen supporting single-point and multi-touch operations. Supporting single-point and multi-touch operations means that the touch display can sense touch operations applied simultaneously at one or more positions on the touch display, and the sensed touch operations are handed to the processor for calculation and processing.
  • the display unit 106 can also be used to display a naked-eye 3D picture.
  • the image acquisition unit 107 may be mounted above the display unit 106, and the image acquisition unit 107 may be a conventional model of a visible light camera or an infrared camera or the like.
  • the image acquisition unit 107 can be configured to output the acquired image of each frame within a certain field of view angle in front of the display unit 106 to the processor 103 for processing.
  • the distance acquisition unit 108 can also be mounted above the display unit 106, which can be a conventional model distance sensor.
  • the distance collecting unit 108 can be configured to output the collected distance of the target within a certain field of view angle in front of the display unit 106 to the processor 103 for processing.
  • a first embodiment of the present invention provides a face tracking method, which is applied to a display terminal, and the face tracking method includes: step S100, step S200, and step S300.
  • Step S100 Determine at least one facial feature coordinate from the collected current image.
  • the image acquisition unit provided on the display terminal is a conventional camera, so that the display terminal can continuously acquire multiple frames of images at a certain sampling rate, for example, 60 frames per second.
  • the image of each frame acquired may be an image within a certain field of view angle in front of the display unit of the display terminal.
  • The display terminal may process each acquired current frame image to extract at least one face in the current image, and obtain the facial feature coordinates of each face in the current image, thereby obtaining at least one facial feature coordinate.
  • The display terminal can process the current image using a conventional image recognition and extraction algorithm to determine how many faces are present in the current image. For example, if two users are facing the display terminal and viewing the naked-eye 3D picture it displays at the moment a certain current image is acquired, the display terminal determines that there are two faces in the current image. Thereafter, the display terminal may extract the corresponding facial feature coordinates from each face with a conventional image recognition and extraction algorithm; the facial feature coordinates may be, for example, the coordinates of the midpoint between the eyes of the face. Further, the display terminal can obtain at least one facial feature coordinate in the current image.
  • the display terminal may extract coordinates of each facial feature in the face according to the image recognition extraction algorithm. For example, the display terminal can obtain: coordinates of the upper lip, coordinates of the tip of the nose, coordinates of the left ear, coordinates of the right ear, coordinates of the left pupil, coordinates of the right pupil, and the like. Further, the display terminal can obtain the facial feature coordinates in the face according to the coordinates of the facial features.
  • For example, the display terminal may use the coordinates of the upper lip or the coordinates of the tip of the nose as the facial feature coordinates, or may shift the coordinates of the upper lip or of the tip of the nose to obtain the facial feature coordinates; alternatively, the display terminal may determine the midpoint between the coordinates of the left ear and the coordinates of the right ear as the facial feature coordinates, or determine the midpoint between the coordinates of the left pupil and the coordinates of the right pupil as the facial feature coordinates, and so on.
  • the specific selection of the facial feature coordinates may be adjusted according to the specific implementation, which is not specifically limited in this embodiment.
  • In this embodiment, the display terminal may determine the facial feature coordinates according to the coordinates of the left pupil and the coordinates of the right pupil. Specifically, from the current image, the display terminal obtains the double-pupil distance value between the coordinates of the left pupil and the coordinates of the right pupil of each face image, thereby obtaining at least one double-pupil distance value corresponding to the at least one face. Further, according to the coordinates of the left pupil and the coordinates of the right pupil of each face image, the display terminal obtains the facial feature coordinate corresponding to each double-pupil distance value. Each facial feature coordinate is the coordinate of the midpoint between the corresponding left pupil coordinate and right pupil coordinate. Therefore, the display terminal can obtain at least one facial feature coordinate corresponding to the at least one face.
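  • As a rough illustration of this step (the patent does not prescribe any particular implementation, and the pupil coordinates below are assumed to come from an arbitrary conventional facial-landmark detector), the following Python sketch derives, for each face, the double-pupil distance value and the facial feature coordinate as the midpoint between the two pupils:

        import math

        def facial_feature_from_pupils(left_pupil, right_pupil):
            """Return (double_pupil_distance, facial_feature_coordinate) for one face.

            left_pupil / right_pupil are (x, y) pixel coordinates of the detected
            pupils (hypothetical inputs from any conventional landmark detector).
            """
            lx, ly = left_pupil
            rx, ry = right_pupil
            distance = math.hypot(rx - lx, ry - ly)          # double-pupil distance value
            midpoint = ((lx + rx) / 2.0, (ly + ry) / 2.0)    # facial feature coordinate
            return distance, midpoint

        # Two faces detected in the current image (made-up coordinates).
        pupil_pairs = [((420, 310), (480, 312)), ((900, 500), (940, 502))]
        faces = [facial_feature_from_pupils(l, r) for l, r in pupil_pairs]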
  • The at least one double-pupil distance value obtained here facilitates the execution of certain implementations in the subsequent process. If a different implementation is actually chosen, the at least one double-pupil distance value may be obtained selectively, that is, it need not be obtained.
  • Step S200: Determine, from the at least one facial feature coordinate, a target facial feature coordinate that satisfies a preset criterion, wherein the target facial feature coordinate is the coordinate at which the facial features of the target user to be tracked by the display terminal are distributed in the current image.
  • the display terminal may further process the at least one facial feature coordinate.
  • A preset criterion that needs to be met is preset in the display terminal. The preset criterion may be, for example, that the maximum value or the minimum value among a plurality of data is taken, or that the data lies within a certain range of values.
  • By comparing the at least one facial feature coordinate, or values derived from it, with one another or with a certain value, the display terminal can obtain, from the at least one facial feature coordinate, a target facial feature coordinate that satisfies the preset criterion.
  • The user corresponding to the target facial feature coordinate is the target user that the display terminal needs to track in real time, and the target facial feature coordinate is the coordinate at which the facial features of that target user are distributed in the current image.
  • Step S300 synthesize a current naked eye 3D image for display according to the target facial feature coordinates, so that the displayed current naked eye 3D image is adapted to the current viewing position of the target user.
  • After the display terminal determines the target facial feature coordinates in the current image, it further processes the target facial feature coordinates.
  • A naked-eye 3D image synthesis algorithm is preset in the display terminal. Based on the target facial feature coordinates, this algorithm can calculate the offset of the video picture to be viewed by the left eye and the offset of the video picture to be viewed by the right eye. Further, the display terminal obtains the offset left-eye video picture and the offset right-eye video picture according to the naked-eye 3D image synthesis algorithm, and synthesizes the offset left-eye video picture and the offset right-eye video picture into the current naked-eye 3D image for display.
  • The offset left-eye video picture can be adapted to the target user's left-eye disparity when the target user is at the current viewing position, and the offset right-eye video picture can likewise be adapted to the target user's right-eye disparity at the current viewing position.
  • Because the current naked-eye 3D image for display is synthesized according to the target facial feature coordinates, even if the target user's environment is complex or the target user keeps moving, the target user can still see the good naked-eye 3D effect presented by the display terminal, without discomfort such as dazzling or dizziness.
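  • The patent does not disclose the internals of the naked-eye 3D synthesis algorithm, so the following Python sketch is only a heavily simplified assumption of what such a step could look like: the two eye views are shifted horizontally in proportion to how far the target facial feature coordinate lies from the image center, and are then column-interleaved, one common arrangement for lenticular panels. The gain factor and the interleaving pattern are illustrative assumptions, not the patented method:

        import numpy as np

        def synthesize_naked_eye_3d(left_view, right_view, target_x, image_width, gain=0.05):
            """left_view / right_view: H x W x 3 uint8 arrays (the two eye views).

            target_x: x coordinate of the target facial feature coordinate (pixels).
            gain: assumed scale factor mapping viewer offset to pixel shift.
            """
            # Horizontal offset of the tracked face from the camera image center.
            offset = int(gain * (target_x - image_width / 2.0))
            # Shift the two views in opposite directions (assumed offset rule).
            left_shifted = np.roll(left_view, -offset, axis=1)
            right_shifted = np.roll(right_view, offset, axis=1)
            # Column-interleave: even columns from the left view, odd columns from the right.
            frame = left_shifted.copy()
            frame[:, 1::2, :] = right_shifted[:, 1::2, :]
            return frame

        # Usage with dummy views (1080p panel, 1280-pixel-wide tracking camera).
        left = np.zeros((1080, 1920, 3), dtype=np.uint8)
        right = np.full((1080, 1920, 3), 255, dtype=np.uint8)
        frame = synthesize_naked_eye_3d(left, right, target_x=700, image_width=1280)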
  • a first embodiment of the method sub-process of step S200 includes: step S210 and step S220.
  • Step S210: Obtain a distance value between each of the at least one facial feature coordinates and a center line in the current image, to obtain at least one distance value, wherein each facial feature coordinate is the coordinate of the midpoint between the double pupils.
  • the display terminal can determine the target facial feature coordinates through the centerline rule.
  • A center line value is preset in the display terminal, and the center line applies to each current frame image; that is, the center line divides the current image into a left half and a right half of equal area.
  • the display terminal may calculate a distance value between each of the at least one facial feature coordinate and a center line in the current image.
  • Each distance value may be understood as the length of the line segment that connects the facial feature coordinate point to the center line and is perpendicular to the center line.
  • the display terminal is capable of obtaining at least one distance value corresponding to at least one facial feature coordinate.
  • Step S220 determining a distance value with the smallest value from the at least one distance value as the target distance value satisfying the preset criterion, and using the facial feature coordinate corresponding to the target distance value as the target facial feature coordinate.
  • After the display terminal obtains the at least one distance value, it operates on the at least one distance value, that is, each distance value is compared with every other distance value among the at least one distance value, so that the display terminal can obtain, by mutual comparison, the distance value with the smallest value among the at least one distance value.
  • Here, the preset criterion to be satisfied is that the distance value is the smallest among the at least one distance value.
  • The display terminal takes the distance value with the smallest value as the target distance value, and takes the facial feature coordinate to which the line segment corresponding to the target distance value is connected as the target facial feature coordinate. Therefore, by comparing the distance values, the display terminal determines, from the at least one facial feature coordinate, the target facial feature coordinate satisfying the preset criterion.
  • Determining the target distance value by mutual comparison is only one embodiment and should not be construed as limiting this embodiment; the target distance value can also be obtained, for example, by directly selecting the minimum value from the at least one distance value.
  • the target user corresponding to the target feature coordinate determined by the center line rule should be the user closer to the middle portion of the display unit, and the environmental object or user located at the edge portion of the display unit is excluded.
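  • A minimal Python sketch of the centerline rule described above (the function and variable names are illustrative only):

        def centerline_rule(feature_coords, image_width):
            """Return the facial feature coordinate closest to the vertical center line.

            feature_coords: list of (x, y) midpoints between the pupils, one per face.
            """
            center_x = image_width / 2.0
            return min(feature_coords, key=lambda c: abs(c[0] - center_x))

        coords = [(300, 240), (660, 250), (1100, 260)]
        print(centerline_rule(coords, image_width=1280))   # -> (660, 250)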
  • a second embodiment of the method sub-flow of step S200 includes: step S230 and step S240.
  • Step S230: Obtain a coordinate difference value between each of the at least one facial feature coordinates and the corresponding previous facial feature coordinate determined in the previous frame image, to obtain at least one coordinate difference value, wherein each facial feature coordinate is the coordinate of the midpoint between the double pupils.
  • the display terminal can determine the target facial feature coordinates by the minimum variation rule.
  • The previous facial feature coordinate determined in the previous frame image is stored in advance in the display terminal.
  • For each facial feature coordinate, the display terminal may compute the coordinate difference value between that facial feature coordinate and the previous facial feature coordinate determined in the previous frame image. Thereby, the display terminal can obtain at least one coordinate difference value corresponding in number to the at least one facial feature coordinate.
  • Step S240: Determine, from the at least one coordinate difference value, the coordinate difference value with the smallest value as the target coordinate difference value that satisfies the preset criterion, and use the facial feature coordinate corresponding to the target coordinate difference value as the target facial feature coordinate.
  • After the display terminal obtains the at least one coordinate difference value, it operates on the at least one coordinate difference value, that is, each coordinate difference value is compared with every other coordinate difference value among the at least one coordinate difference value, so that the display terminal can obtain, by mutual comparison, the coordinate difference value with the smallest value among the at least one coordinate difference value.
  • the preset criterion that is satisfied is that the value of the coordinate difference value is the coordinate difference value with the smallest value among the at least one coordinate difference value.
  • the display terminal also uses the coordinate difference value with the smallest value as the target coordinate difference value, and also uses a facial feature coordinate corresponding to the target coordinate difference value as the target facial feature coordinate. Therefore, the display terminal determines the target facial feature coordinates satisfying the preset criteria from the at least one facial feature coordinate by comparing the coordinate difference values.
  • determining the target coordinate difference value by means of mutual comparison is only one embodiment, and should not be construed as limiting the embodiment. It can also be selected, for example, by directly selecting a minimum value from at least one of the coordinate differences.
  • the target user corresponding to the target feature coordinate determined by the minimum variation rule should maintain a relatively stable posture during the viewing process, and the fast moving environment object or the user who suddenly moves in and out in the background is excluded.
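  • A minimal Python sketch of the minimum variation rule (here the coordinate difference value is taken as the Euclidean distance between coordinates, which is an assumption; the patent only speaks of a coordinate difference value):

        import math

        def minimum_variation_rule(feature_coords, previous_target):
            """Return the coordinate whose change relative to the previous frame's
            target facial feature coordinate is smallest."""
            px, py = previous_target
            return min(feature_coords, key=lambda c: math.hypot(c[0] - px, c[1] - py))

        coords = [(300, 240), (660, 250), (1100, 260)]
        print(minimum_variation_rule(coords, previous_target=(650, 248)))   # -> (660, 250)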
  • a third embodiment of the method sub-process of step S200 includes: step S250 and step S260.
  • Step S250: Determine the largest double-pupil distance value from the at least one double-pupil distance value as the target double-pupil distance value that satisfies the preset criterion.
  • the display terminal can determine the target facial feature coordinates by the first nearest rule.
  • In this embodiment, the display terminal may obtain at least one double-pupil distance value.
  • The target user can be regarded as the user closest to the display terminal when viewing the naked-eye 3D picture displayed by the display terminal.
  • The display terminal operates on the at least one double-pupil distance value, that is, each double-pupil distance value is compared with every other double-pupil distance value among the at least one double-pupil distance value, so that the display terminal can obtain, by mutual comparison, the double-pupil distance value with the largest value among the at least one double-pupil distance value.
  • Here, the preset criterion to be satisfied is that the double-pupil distance value is the largest among the at least one double-pupil distance value.
  • The display terminal then takes the largest double-pupil distance value as the target double-pupil distance value.
  • Determining the target double-pupil distance value by mutual comparison is only one embodiment and should not be construed as limiting this embodiment; it can also be obtained, for example, by directly selecting the maximum value from the at least one double-pupil distance value.
  • Step S260: Use the facial feature coordinate corresponding to the target double-pupil distance value as the target facial feature coordinate, so as to determine the target facial feature coordinate from the at least one facial feature coordinate.
  • After determining the target double-pupil distance value with the maximum value, the display terminal obtains, according to the target double-pupil distance value, the facial feature coordinate corresponding to it, and takes that facial feature coordinate as the target facial feature coordinate. Thereby, the target facial feature coordinate is determined from the at least one facial feature coordinate.
  • the target user corresponding to the target feature coordinate determined by the first nearest rule should be the user closest to the display terminal during the viewing process, and the background environment object or the onlooker user is excluded.
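  • A minimal Python sketch of the first nearest rule (names are illustrative; the inputs are the per-face pairs produced in step S100):

        def first_nearest_rule(faces):
            """faces: list of (double_pupil_distance, feature_coord) pairs.
            The face with the largest double-pupil distance is taken to be the
            user nearest to the display terminal."""
            _, target_coord = max(faces, key=lambda f: f[0])
            return target_coord

        faces = [(42.0, (300, 240)), (71.5, (660, 250)), (55.3, (1100, 260))]
        print(first_nearest_rule(faces))   # -> (660, 250)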
  • a fourth embodiment of the method sub-flow of step S200 includes: step S270, step S280, and step S290.
  • Step S270 Obtain current distance data between the face of the target user and the display terminal, and obtain a corresponding threshold range according to the current distance data.
  • the display terminal is provided with a distance collecting unit.
  • the distance collecting unit also obtains the current distance data at the time of acquiring each frame image.
  • the display terminal can also obtain the current distance data corresponding to the frame image when processing each frame image.
  • the current distance data is a distance between a face of the target user located in front of the display terminal and the display terminal taken by the distance collecting unit.
  • According to a preset algorithm, the display terminal obtains the threshold range corresponding to the current distance data.
  • Step S280: Determine, from the at least one double-pupil distance value, the double-pupil distance value whose value lies within the threshold range as the target double-pupil distance value that satisfies the preset criterion.
  • The display terminal may match each of the at least one double-pupil distance value against the threshold range. Since the threshold range is obtained from the distance measured for the corresponding target user, this matching determines which of the at least one double-pupil distance value lies within the threshold range. Further, through the matching, the display terminal takes the double-pupil distance value within the threshold range as the target double-pupil distance value.
  • Step S290: Use the facial feature coordinate corresponding to the target double-pupil distance value as the target facial feature coordinate, so as to determine the target facial feature coordinate from the at least one facial feature coordinate.
  • After determining the matching target double-pupil distance value, the display terminal obtains, according to the target double-pupil distance value, the facial feature coordinate corresponding to it, and takes that facial feature coordinate as the target facial feature coordinate. Thereby, the target facial feature coordinate is determined from the at least one facial feature coordinate.
  • The target user corresponding to the target facial feature coordinate determined by the second nearest rule should be the user closest to the display terminal at the current viewing time; large portrait posters appearing in the background and onlooking users are excluded.
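  • A minimal Python sketch of the second nearest rule. The patent only states that a threshold range is derived from the current distance data by a preset algorithm; the pinhole relation and the 20 % tolerance used below are assumptions made for illustration:

        def second_nearest_rule(faces, measured_distance_mm,
                                focal_length_px=1000.0, eye_separation_mm=63.0,
                                tolerance=0.2):
            """faces: list of (double_pupil_distance_px, feature_coord) pairs.

            measured_distance_mm: current distance data from the distance
            acquisition unit. The expected on-image pupil distance is estimated
            with the pinhole model expected = f * eye_separation / distance.
            """
            expected = focal_length_px * eye_separation_mm / measured_distance_mm
            low, high = expected * (1 - tolerance), expected * (1 + tolerance)
            for pupil_distance, coord in faces:
                if low <= pupil_distance <= high:
                    return coord          # target facial feature coordinate
            return None                   # no face matches the threshold range

        faces = [(42.0, (300, 240)), (92.0, (660, 250)), (55.3, (1100, 260))]
        print(second_nearest_rule(faces, measured_distance_mm=700.0))   # -> (660, 250)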
  • the target facial feature coordinates may also be determined by combining at least any two of the above-described centerline rule, minimum variation method, first nearest rule, and second nearest rule.
  • For example, the centerline rule, the minimum variation rule, and the first nearest rule may be combined, and the target facial feature coordinate is then determined by linear weighting, a voting mechanism, a Boolean operation, or time-shared triggering.
  • For linear weighting, the reciprocal of each distance value is obtained by the centerline rule, the reciprocal of each coordinate difference value is obtained by the minimum variation rule, and each double-pupil distance value is obtained by the first nearest rule. The reciprocal of the coordinate difference value, the reciprocal of the distance value, and the double-pupil distance value that belong to the same face are then grouped together, thereby obtaining at least one group. For each group, a result is computed as: (reciprocal of the coordinate difference value) × coefficient 1 + (reciprocal of the distance value) × coefficient 2 + (double-pupil distance value) × coefficient 3, where coefficient 1 + coefficient 2 + coefficient 3 = 1. A corresponding result is thus obtained for each group.
  • The result with the largest value is then determined from the at least one result as the target result, and the facial feature coordinate corresponding to that group's coordinate difference value, distance value, and double-pupil distance value is taken as the target facial feature coordinate. It can be understood that the accuracy of determining the target facial feature coordinate can be effectively improved by this hybrid approach.
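  • A minimal Python sketch of the linear weighting described above; the weight values and the small epsilon used to avoid division by zero are assumptions (in practice the three terms would also typically be normalized to comparable ranges):

        def linear_weighting(groups, c1=0.4, c2=0.3, c3=0.3):
            """groups: list of (coordinate_difference, centerline_distance,
            double_pupil_distance, feature_coord) tuples, one per face.

            Result per group = c1 * (1 / coordinate_difference)
                             + c2 * (1 / centerline_distance)
                             + c3 * double_pupil_distance,   with c1 + c2 + c3 = 1.
            The feature coordinate of the group with the largest result wins.
            """
            eps = 1e-6
            def result(g):
                diff, dist, pupil, _ = g
                return c1 / (diff + eps) + c2 / (dist + eps) + c3 * pupil
            return max(groups, key=result)[3]

        groups = [(10.0, 20.0, 42.0, (300, 240)), (5.0, 15.0, 71.5, (660, 250))]
        print(linear_weighting(groups))   # -> (660, 250)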
  • For example, the target facial feature coordinate determined by the centerline rule is A, the target facial feature coordinate determined by the minimum variation rule is A, and the target facial feature coordinate determined by the first nearest rule is B. By the voting mechanism, A wins two votes to one and is determined as the final target facial feature coordinate. It can be understood that the accuracy of determining the target facial feature coordinate can also be effectively improved by this hybrid approach.
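  • A minimal Python sketch of the voting mechanism (the coordinates are illustrative):

        from collections import Counter

        def vote(rule_outputs):
            """rule_outputs: target coordinates returned by the individual rules,
            e.g. [centerline_result, minimum_variation_result, first_nearest_result].
            The coordinate chosen by the most rules wins."""
            return Counter(rule_outputs).most_common(1)[0][0]

        A, B = (512, 300), (800, 420)
        print(vote([A, A, B]))   # 2-to-1 vote, as in the example above -> A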
  • The user can also manually control the display terminal according to his or her own needs, so as to manually switch the rule used by the display terminal to determine the target facial feature coordinate. For example, if the user is dissatisfied with the centerline rule, he or she operates the display terminal to switch from the centerline rule to any one of the minimum variation rule, the first nearest rule, and the second nearest rule.
  • The display terminal can also automatically switch the adopted rule according to the scene. For example, when the detected distribution of faces in the scene indicates that the target user is often among a crowd of several people, the display terminal automatically switches to the first nearest rule or the second nearest rule; when the target user mostly views from directly in front and others only glance occasionally, the display terminal automatically switches to the centerline rule.
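  • As a hedged sketch of such automatic switching (the trigger conditions below are assumptions; the patent only gives qualitative examples), a small dispatcher could select a rule from simple scene cues:

        def choose_rule(num_faces, crowded, mostly_frontal):
            """Pick a target-selection rule name from scene cues.

            num_faces: number of faces detected in the current image.
            crowded / mostly_frontal: booleans derived from the recent detection
            history (how they are derived is an assumption here).
            """
            if num_faces > 1 and crowded:
                return "first_nearest"   # or "second_nearest" when distance data is available
            if mostly_frontal:
                return "centerline"
            return "minimum_variation"

        print(choose_rule(num_faces=3, crowded=True, mostly_frontal=False))   # -> first_nearest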
  • a second embodiment of the present invention provides a face tracking device 100.
  • the face tracking device 100 is applied to a display terminal.
  • the face tracking device 100 includes:
  • the acquiring module 110 is configured to determine at least one facial feature coordinate from the acquired current image.
  • a determining module 120 configured to determine a target facial feature coordinate that satisfies a preset criterion from the at least one facial feature coordinate, wherein the target facial feature coordinate is a facial feature distribution of a target user that needs to be tracked by the display terminal The coordinates in the current image.
  • the generating module 130 is configured to synthesize a current naked eye 3D image for display according to the target facial feature coordinates, so that the displayed current naked eye 3D image is adapted to a current viewing position of the target user.
  • the acquisition module 110 includes:
  • The pupil distance acquiring unit 111 is configured to obtain, from the current image, the double-pupil distance value of each face image, to obtain at least one double-pupil distance value.
  • The coordinate acquiring unit 112 is configured to obtain the facial feature coordinate corresponding to each of the at least one double-pupil distance value, to obtain at least one facial feature coordinate, wherein each facial feature coordinate is the coordinate of the midpoint between the double pupils.
  • the determining module 120 includes:
  • A first obtaining unit 121, configured to obtain a distance value between each of the at least one facial feature coordinates and a center line in the current image, to obtain at least one distance value, wherein each facial feature coordinate is the coordinate of the midpoint between the double pupils.
  • a first determining unit 122 configured to determine, from the at least one distance value, a distance value with a minimum value as the target distance value that satisfies a preset criterion, and use the facial feature coordinate corresponding to the target distance value as the Target facial feature coordinates.
  • the determining module 120 further includes:
  • A second obtaining unit 123, configured to obtain a coordinate difference value between each of the at least one facial feature coordinates and the corresponding previous facial feature coordinate in the previous frame image, to obtain at least one coordinate difference value, wherein each facial feature coordinate is the coordinate of the midpoint between the double pupils.
  • A second determining unit 124, configured to determine, from the at least one coordinate difference value, the coordinate difference value with the smallest value as the target coordinate difference value that satisfies the preset criterion, and to use the facial feature coordinate corresponding to the target coordinate difference value as the target facial feature coordinate.
  • An embodiment of the present invention further provides a computer program product of non-volatile program code executable by a processor, comprising a computer-readable storage medium storing program code, where the program code includes instructions that can be used to perform the methods described in the foregoing embodiments.
  • the units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, may be located in one place, or may be distributed to multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of the embodiment.
  • each functional unit in each embodiment of the present invention may be integrated into one processing unit, or each unit may exist physically separately, or two or more units may be integrated into one unit.
  • the functions may be stored in a computer readable storage medium if implemented in the form of a software functional unit and sold or used as a standalone product.
  • The technical solution of the present invention, in essence or in the part contributing to the prior art, or a part of the technical solution, may be embodied in the form of a software product. The software product is stored in a storage medium and includes several instructions used to cause a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or part of the steps of the methods described in the various embodiments of the present invention.
  • The foregoing storage medium includes: a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, and the like.
  • The embodiments of the present invention provide a face tracking method and device, the face tracking method being applied to a display terminal.
  • The method includes: determining at least one facial feature coordinate from the acquired current image; determining, from the at least one facial feature coordinate, a target facial feature coordinate that satisfies a preset criterion, wherein the target facial feature coordinate is the coordinate at which the facial features of the target user to be tracked by the display terminal are distributed in the current image; and synthesizing the current naked-eye 3D image for display according to the target facial feature coordinate, so that the displayed current naked-eye 3D image is adapted to the current viewing position of the target user.
  • By processing each acquired current frame image, the target facial feature coordinate satisfying the preset criterion is determined from the at least one facial feature coordinate in the current image. Because the target facial feature coordinate corresponds to the target user that the display terminal needs to track, and the current naked-eye 3D image for display is then synthesized according to the target facial feature coordinate, the displayed current naked-eye 3D image can be adapted to the current viewing position of the target user. Therefore, no matter how the background environment changes, the display terminal can effectively track the face of the target user, so that the displayed naked-eye 3D picture adapts to the user in real time for a better 3D effect.

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Theoretical Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Ophthalmology & Optometry (AREA)
  • Computer Graphics (AREA)
  • Image Processing (AREA)

Abstract

According to certain embodiments, the present invention relates to a facial tracking method and device, and to the technical field of image processing. The method comprises: determining at least one set of facial feature coordinates from an acquired current image; determining, from said facial feature coordinates, a set of target facial feature coordinates satisfying a preset criterion, the target facial feature coordinates being the coordinates at which the facial features of a target user to be tracked by a display terminal are distributed in the current image; and synthesizing, according to the target facial feature coordinates, a current naked-eye 3D image for display, such that the displayed current naked-eye 3D image is adapted to the current viewing position of the target user. Consequently, regardless of how much the background environment changes, a display terminal can effectively track the face of a target user, so that the displayed naked-eye 3D image is adjusted in real time to give users an improved 3D effect.
PCT/CN2018/092634 2017-11-01 2018-06-25 Facial tracking method and device WO2019085519A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201711063566.9 2017-11-01
CN201711063566.9A CN107833263A (zh) 2017-11-01 2017-11-01 Facial tracking method and device

Publications (1)

Publication Number Publication Date
WO2019085519A1 true WO2019085519A1 (fr) 2019-05-09

Family

ID=61651574

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/092634 WO2019085519A1 (fr) 2017-11-01 2018-06-25 Facial tracking method and device

Country Status (2)

Country Link
CN (1) CN107833263A (fr)
WO (1) WO2019085519A1 (fr)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114356088A (zh) * 2021-12-30 2022-04-15 纵深视觉科技(南京)有限责任公司 Viewer tracking method and apparatus, electronic device, and storage medium

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107833263A (zh) * 2017-11-01 2018-03-23 宁波视睿迪光电有限公司 Facial tracking method and device
CN111246196B (zh) * 2020-01-19 2021-05-07 北京字节跳动网络技术有限公司 Video processing method and apparatus, electronic device, and computer-readable storage medium
CN114173109A (zh) * 2022-01-12 2022-03-11 纵深视觉科技(南京)有限责任公司 Viewing user tracking method and apparatus, electronic device, and storage medium

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102830793A (zh) * 2011-06-16 2012-12-19 北京三星通信技术研究有限公司 Gaze tracking method and device
CN104216510A (zh) * 2013-06-03 2014-12-17 由田新技股份有限公司 Method for moving a cursor on a screen to a clickable object, and computer system therefor
CN105955465A (zh) * 2016-04-25 2016-09-21 华南师范大学 Desktop portable gaze tracking method and device
CN106547341A (zh) * 2015-09-21 2017-03-29 现代自动车株式会社 Gaze tracker and method of tracking gaze thereof
CN106843821A (zh) * 2015-12-07 2017-06-13 百度在线网络技术(北京)有限公司 Method and device for automatically adjusting a screen
CN107833263A (zh) * 2017-11-01 2018-03-23 宁波视睿迪光电有限公司 Facial tracking method and device

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
FR2950984B1 (fr) * 2009-10-05 2012-02-03 Interactif Visuel Systeme Ivs Method and measuring equipment for customizing and fitting corrective ophthalmic lenses
CN103402106B (zh) * 2013-07-25 2016-01-06 青岛海信电器股份有限公司 Three-dimensional image display method and device
CN104683786B (zh) * 2015-02-28 2017-06-16 上海玮舟微电子科技有限公司 Human-eye tracking method and device for a naked-eye 3D device
CN105072431A (zh) * 2015-07-28 2015-11-18 上海玮舟微电子科技有限公司 Naked-eye 3D playback method and system based on human-eye tracking
CN106218409A (zh) * 2016-07-20 2016-12-14 长安大学 Naked-eye 3D automobile instrument display method and device with human-eye tracking
CN106709303B (zh) * 2016-11-18 2020-02-07 深圳超多维科技有限公司 Display method and device, and smart terminal

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102830793A (zh) * 2011-06-16 2012-12-19 北京三星通信技术研究有限公司 Gaze tracking method and device
CN104216510A (zh) * 2013-06-03 2014-12-17 由田新技股份有限公司 Method for moving a cursor on a screen to a clickable object, and computer system therefor
CN106547341A (zh) * 2015-09-21 2017-03-29 现代自动车株式会社 Gaze tracker and method of tracking gaze thereof
CN106843821A (zh) * 2015-12-07 2017-06-13 百度在线网络技术(北京)有限公司 Method and device for automatically adjusting a screen
CN105955465A (zh) * 2016-04-25 2016-09-21 华南师范大学 Desktop portable gaze tracking method and device
CN107833263A (zh) * 2017-11-01 2018-03-23 宁波视睿迪光电有限公司 Facial tracking method and device

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114356088A (zh) * 2021-12-30 2022-04-15 纵深视觉科技(南京)有限责任公司 Viewer tracking method and apparatus, electronic device, and storage medium
CN114356088B (zh) * 2021-12-30 2024-03-01 纵深视觉科技(南京)有限责任公司 Viewer tracking method and apparatus, electronic device, and storage medium

Also Published As

Publication number Publication date
CN107833263A (zh) 2018-03-23

Similar Documents

Publication Publication Date Title
US11947729B2 (en) Gesture recognition method and device, gesture control method and device and virtual reality apparatus
CN106873778B (zh) Application running control method and apparatus, and virtual reality device
US9575559B2 (en) Gaze-assisted touchscreen inputs
US10095030B2 (en) Shape recognition device, shape recognition program, and shape recognition method
CN110363867B (zh) Virtual dress-up system, method, device, and medium
CN106705837B (zh) Gesture-based object measurement method and device
CN114303120A (zh) Virtual keyboard
EP3113114A1 (fr) Image processing method and device
WO2019085519A1 (fr) Facial tracking method and device
KR20170031733A (ko) Techniques for adjusting the perspective of a captured image for display
CN110968187B (zh) Remote touch detection enabled by a peripheral device
KR20200079170A (ko) Gaze estimation method and gaze estimation apparatus
KR20120068253A (ko) Method and apparatus for providing a response of a user interface
KR101892735B1 (ko) Intuitive interaction apparatus and method
WO2017084319A1 (fr) Gesture recognition method and virtual reality display output device
KR20160094190A (ko) Gaze tracking apparatus and method
TW202025719A (zh) Image processing method and device, electronic equipment, and storage medium
US10607069B2 (en) Determining a pointing vector for gestures performed before a depth camera
WO2016169409A1 (fr) Method and apparatus for displaying a virtual object in three-dimensional (3D) space
CN111860252A (zh) Image processing method, device, and storage medium
WO2019142560A1 (fr) Information processing device for gaze guidance
JP2004265222A (ja) Interface method, device, and program
US20200326783A1 (en) Head mounted display device and operating method thereof
US20160110909A1 (en) Method and apparatus for creating texture map and method of creating database
EP3582068A1 (fr) Information processing device, information processing method, and program

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18874333

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 18874333

Country of ref document: EP

Kind code of ref document: A1