CN110032921B - Adjusting device and method of face recognition equipment

Info

Publication number: CN110032921B
Application number: CN201811465177.3A
Authority: CN
Inventors: 姜海洋
Original assignee: Advanced New Technologies Co Ltd
Current assignee: Advanced New Technologies Co Ltd; Advantageous New Technologies Co Ltd
Priority date: 2018-12-03
Filing date: 2018-12-03
Publication date: 2023-03-24
Anticipated expiration: 2038-12-03
Also published as: CN110032921A

Abstract

The present disclosure provides an adjusting device and method for face recognition equipment. After a user approaches the face recognition equipment, a camera is started to capture a facial biometric image of the user, and it is determined whether a recognizable image has been captured. If not, the user is informed, based on a plurality of preset definition data items that each define a different dynamic adjustment action of the camera's mounting base, that the mounting base can be dynamically adjusted by providing one or more instructions corresponding to the definition data items. After an instruction provided by the user is received, the mounting base is dynamically adjusted accordingly, so that the viewing angle range of the camera mounted on the mounting base is adjusted to capture a recognizable facial biometric image of the user. Face recognition is thus achieved quickly and conveniently without the user having to move forward, backward, left, or right and/or change the height of the face, which improves the user experience and the recognition efficiency.

Description

Adjusting device and method of face recognition equipment

Technical Field

The invention belongs to the technical field of face recognition, and particularly relates to an adjusting device and method of face recognition equipment.

Background Art

Face recognition ("face brushing") is a biometric identification technology in wide use today. Compared with other biometric identification technologies (fingerprint, palm shape, iris, and voice recognition), it has the following advantages. The other methods require the active cooperation of the user, whereas face recognition does not, so it can be used unobtrusively, for example in surveillance by public security departments. When a biometric record of a user attempting to log in is kept, a face is more intuitive to inspect, which makes checking the user's identity more convenient. Compared with traditional biometric identification technologies, face recognition is also simple, convenient, accurate, economical, and readily extensible, so it can be applied widely in security verification, monitoring, control, and similar areas. In particular, face recognition technology can be applied in various scenarios, such as face-brushing payment, face-brushing access control, face-brushing login, face-brushing sign-in, and face-brushing ticket checking.

As shown in FIGS. 1A-1C, in these application scenarios face recognition is mostly implemented with a fixedly installed face recognition device (i.e., a face-brushing machine) on which a camera with a fixed viewing angle range is disposed. In use, the user must stand at a certain distance from the device so that the user's face enters the viewing angle range of the camera (as shown in FIG. 1A). If the user is too short (as shown in FIG. 1B) or too tall (as shown in FIG. 1C), the face cannot completely enter the viewing angle range of the camera at that distance, and the user has to move back and forth until the face is within the viewing angle range. Using a fixedly installed face recognition device therefore has the following drawback: the viewing angle range of the camera is fixed and the camera cannot be dynamically adjusted in three-dimensional coordinate space, so in an ordinary supermarket or retail store, where space is limited or the user cannot conveniently move back and forth, the fixed device cannot conveniently recognize the user's face.

At present, in many usage scenarios, especially at ordinary merchants and retail stores, portable face recognition devices are also widely used for face-brushing payment. Such a device likewise carries a camera with a fixed viewing angle range that cannot be dynamically adjusted in three-dimensional coordinate space, and it is usually fixed on a checkout counter. Because these scenarios usually have limited space and checkout counters differ in height, when a user whose height does not match the counter height stands in front of the counter, the user's face may not enter the viewing angle range of the camera on the portable device. The user then has to move forward, backward, left, or right to adjust the distance to the counter, or squat or stand on tiptoe to adjust the height of the face, which is very inconvenient and makes for a poor face-brushing payment experience.

Disclosure of Invention

In view of the problem that the conventional face recognition device cannot dynamically adjust the camera in the three-dimensional coordinate space, the present invention aims to provide an adjusting device and method for the face recognition device, so that the camera of the face recognition device can adjust the view angle range to align with the face of the user under the condition that the user does not need to move around and/or change the face height, thereby quickly and conveniently realizing face recognition, improving the user experience, and improving the recognition efficiency.

According to a first aspect of the present invention, there is provided an adjustment method for a face recognition apparatus, the face recognition apparatus comprising: an input device for inputting required data; a dynamically adjustable mounting base; a camera mounted on the mounting base for capturing a facial biometric image of a user; a display for displaying images captured by the camera and/or a speaker for outputting sound; a processor for processing data from the camera and the input device and displaying the processed content through the display and/or outputting sound through the speaker; and a data storage for storing processing programs and data used by the processor, the adjustment method comprising the following steps:

an image capturing determination step: after the user approaches the face recognition equipment, starting the camera to capture a facial biometric image of the user, determining whether the camera has captured a recognizable facial biometric image of the user, and proceeding to the subsequent prompting and explaining step if the camera has captured no facial biometric image of the user or only part of one;

a prompting and explaining step: based on a plurality of preset definition data items that are stored in the data storage and that each define a different dynamic adjustment action of the mounting base, the processor prompts the user through the display and/or the speaker, informing the user that the mounting base can be dynamically adjusted by providing one or more instructions corresponding to the definition data items; and

a dynamic adjustment step: after receiving one or more instructions provided by the user, the processor causes the mounting base to perform the dynamic adjustment corresponding to the one or more definition data items, among the plurality of definition data items, that correspond to the provided instructions, so as to adjust the viewing angle range of the camera mounted on the mounting base to capture a recognizable facial biometric image of the user.
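The three steps above can be read as a simple control loop. The following is a minimal Python sketch under stated assumptions, not the implementation described in this disclosure: the callables `face_recognizable`, `prompt_user`, and `apply_adjustment`, the `SimulatedBase` class, and its visibility threshold are hypothetical stand-ins for the camera, the display and/or speaker, and the mounting base.

```python
from typing import Callable, Iterable, List, Tuple


def run_adjustment(
    face_recognizable: Callable[[], bool],
    prompt_user: Callable[[], None],
    instructions: Iterable[str],
    apply_adjustment: Callable[[str], None],
    stop_instruction: str = "end",
) -> bool:
    """Run the three steps once; return True if a recognizable face is captured."""
    # Image capturing determination step: nothing to adjust if the face is visible.
    if face_recognizable():
        return True
    # Prompting and explaining step: tell the user which instructions exist.
    prompt_user()
    # Dynamic adjustment step: apply instructions until the face is captured
    # or the user asks to stop.
    for instruction in instructions:
        if instruction == stop_instruction:
            break
        apply_adjustment(instruction)
        if face_recognizable():
            return True
    return False


class SimulatedBase:
    """Hypothetical stand-in: the face becomes visible after 3 upward steps."""

    def __init__(self) -> None:
        self.height = 0
        self.prompts = 0

    def face_recognizable(self) -> bool:
        return self.height >= 3

    def prompt_user(self) -> None:
        self.prompts += 1

    def apply_adjustment(self, instruction: str) -> None:
        if instruction == "up":
            self.height += 1
        elif instruction == "down":
            self.height -= 1


def simulate(instructions: List[str]) -> Tuple[bool, int, int]:
    """Return (face captured, final height, number of prompts shown)."""
    base = SimulatedBase()
    ok = run_adjustment(base.face_recognizable, base.prompt_user,
                        instructions, base.apply_adjustment)
    return ok, base.height, base.prompts
```

The sketch moves the base in discrete steps and returns as soon as the face is visible; a real device would presumably move continuously and wait for the user's stop instruction.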

Optionally, in an example of the above aspect, a preset storing step is further included before the image capturing determination step: inputting the preset definition data items through the input device, and storing the definition data items in the data storage.

Optionally, in an example of the above aspect, an adjustment ending step is further included after the dynamic adjustment step: if the camera captures a recognizable facial biometric image of the user, receiving another instruction provided by the user to indicate that the dynamic adjustment is finished, so that the mounting base stops the dynamic adjustment.

Optionally, in an example of the above aspect, the plurality of definition data items includes: a first definition data item representing a dynamic adjustment in which the mounting base is raised, corresponding to a first user instruction; a second definition data item representing a dynamic adjustment in which the mounting base is lowered, corresponding to a second user instruction; a third definition data item representing a dynamic adjustment in which the mounting base turns left, corresponding to a third user instruction; a fourth definition data item representing a dynamic adjustment in which the mounting base turns right, corresponding to a fourth user instruction; a fifth definition data item representing a dynamic adjustment in which the pitch angle of the mounting base becomes larger, corresponding to a fifth user instruction; a sixth definition data item representing a dynamic adjustment in which the pitch angle of the mounting base becomes smaller, corresponding to a sixth user instruction; and a seventh definition data item representing that the mounting base stops the dynamic adjustment, corresponding to a seventh user instruction.
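As an illustration only, the seven definition data items listed above could be stored as small records keyed by their ordinal. The `Axis` enum, the `DefinitionDataItem` fields, and the sign convention (positive for up, right, and larger pitch) are assumptions made for this sketch and are not specified in the text.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, Optional


class Axis(Enum):
    HEIGHT = "height"
    ROTATION = "rotation"
    PITCH = "pitch"


@dataclass(frozen=True)
class DefinitionDataItem:
    number: int            # 1..7, matching "first" .. "seventh" in the text
    axis: Optional[Axis]   # None for the stop item
    direction: int         # +1, -1, or 0 (sign convention is an assumption)


DEFINITION_DATA_ITEMS: Dict[int, DefinitionDataItem] = {
    item.number: item
    for item in (
        DefinitionDataItem(1, Axis.HEIGHT, +1),    # raise the mounting base
        DefinitionDataItem(2, Axis.HEIGHT, -1),    # lower the mounting base
        DefinitionDataItem(3, Axis.ROTATION, -1),  # turn left
        DefinitionDataItem(4, Axis.ROTATION, +1),  # turn right
        DefinitionDataItem(5, Axis.PITCH, +1),     # pitch angle becomes larger
        DefinitionDataItem(6, Axis.PITCH, -1),     # pitch angle becomes smaller
        DefinitionDataItem(7, None, 0),            # stop the dynamic adjustment
    )
}


def is_stop(item: DefinitionDataItem) -> bool:
    """The seventh item is the only one without an axis."""
    return item.axis is None
```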

Optionally, in an example of the above aspect, the user instructions corresponding to the definition data items include: different gestures provided by the user corresponding to each defined data item.

Optionally, in an example of the above aspect, the face recognition device further includes a sound pickup connected to the processor, and the user instructions corresponding to the definition data items include: the speech of different contents provided by the user corresponding to each definition data item.

Optionally, in an example of the above aspect, the different gestures provided by the user corresponding to the respective definition data items include: the gesture corresponding to the first definition data item is a fist with the thumb pointing up, indicating that the mounting base is required to be raised; the gesture corresponding to the second definition data item is a fist with the thumb pointing down, indicating that the mounting base is required to be lowered; the gesture corresponding to the third definition data item is a fist with the thumb pointing left, indicating that the mounting base is required to turn left; the gesture corresponding to the fourth definition data item is a fist with the thumb pointing right, indicating that the mounting base is required to turn right; the gesture corresponding to the fifth definition data item is five fingers held together pointing up, indicating that the mounting base is required to increase its pitch angle; the gesture corresponding to the sixth definition data item is five fingers held together pointing down, indicating that the mounting base is required to decrease its pitch angle; and the gesture corresponding to the seventh definition data item is a clenched fist, indicating that the mounting base is required to stop the dynamic adjustment.
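The gesture vocabulary above has a regular structure: a thumb gesture selects height or rotation, a flat hand selects pitch, and a plain fist stops. A hedged sketch of a lookup keyed on hand shape and pointing direction follows; the labels `"thumb"`, `"flat_hand"`, and `"fist"` are hypothetical outputs of some gesture classifier, which the text does not specify.

```python
from typing import Dict, Optional, Tuple

# (hand shape, pointing direction) -> ordinal of the definition data item.
# "thumb" means a fist with the thumb extended; "flat_hand" means five
# fingers held together. Both labels are assumptions for this sketch.
GESTURE_TO_ITEM: Dict[Tuple[str, Optional[str]], int] = {
    ("thumb", "up"): 1,
    ("thumb", "down"): 2,
    ("thumb", "left"): 3,
    ("thumb", "right"): 4,
    ("flat_hand", "up"): 5,
    ("flat_hand", "down"): 6,
    ("fist", None): 7,
}


def item_for_gesture(shape: str, direction: Optional[str] = None) -> Optional[int]:
    """Return the matching item number, or None if the gesture is undefined."""
    if shape == "fist":
        direction = None  # a clenched fist carries no direction
    return GESTURE_TO_ITEM.get((shape, direction))
```

Returning `None` for an undefined combination (for example a flat hand pointing left) lets the caller ignore the gesture rather than move the base.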

Optionally, in an example of the above aspect, the voices of different content provided by the user corresponding to the respective definition data items include: the voice corresponding to the first definition data item is "up", indicating that the mounting base is required to be raised; the voice corresponding to the second definition data item is "down", indicating that the mounting base is required to be lowered; the voice corresponding to the third definition data item is "left", indicating that the mounting base is required to turn left; the voice corresponding to the fourth definition data item is "right", indicating that the mounting base is required to turn right; the voice corresponding to the fifth definition data item is "pitch large", indicating that the mounting base is required to increase its pitch angle; the voice corresponding to the sixth definition data item is "pitch small", indicating that the mounting base is required to decrease its pitch angle; and the voice corresponding to the seventh definition data item is "end", indicating that the mounting base is required to stop the dynamic adjustment.
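A minimal sketch of matching a recognized transcript against the spoken commands quoted above. It assumes that some speech recognizer (not shown, and not specified in the text) has already produced a text transcript, and it uses the English renderings of the commands as they appear in this translation; the `normalize` helper is a hypothetical addition.

```python
import re
from typing import Dict, Optional

# Spoken command -> ordinal of the definition data item.
PHRASE_TO_ITEM: Dict[str, int] = {
    "up": 1,
    "down": 2,
    "left": 3,
    "right": 4,
    "pitch large": 5,
    "pitch small": 6,
    "end": 7,
}


def normalize(transcript: str) -> str:
    """Lowercase, replace anything but letters and spaces, collapse whitespace."""
    letters_only = re.sub(r"[^a-z ]", " ", transcript.lower())
    return " ".join(letters_only.split())


def item_for_phrase(transcript: str) -> Optional[int]:
    """Return the matching item number, or None for an unknown command."""
    return PHRASE_TO_ITEM.get(normalize(transcript))
```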

Optionally, in an example of the above aspect, the image capturing determination step further includes a reset step: before the camera is started to capture the facial biometric image of the user, the mounting base is reset to an initial height and an initial pitch angle.

Optionally, in an example of the above aspect, in the prompting and explaining step, the user is prompted through the display and/or the speaker that the mounting base can be dynamically adjusted by providing gestures corresponding to the definition data items.

Optionally, in an example of the above aspect, in the prompting and explaining step, the user is prompted through the display and/or the speaker that the mounting base can be dynamically adjusted by providing voice whose content corresponds to the respective definition data items.

Optionally, in an example of the above aspect, in the dynamic adjustment step, the camera captures one or more gestures provided by the user, and the processor performs gesture recognition processing on the captured gesture image and compares the captured gesture image with one or more predefined data items stored in the data storage, so as to issue a corresponding instruction, so that the mounting base performs dynamic adjustment corresponding to the corresponding one or more defined data items.

Optionally, in an example of the above aspect, in the dynamic adjustment step, after the sound pickup picks up the voice of the one or more items of content provided by the user, the processor performs voice recognition processing on the picked-up voice data and compares the voice data with the one or more preset definition data items stored in the data storage, so as to issue a corresponding instruction, so that the mounting base performs dynamic adjustment corresponding to the corresponding one or more definition data items.

Optionally, in an example of the above aspect, in the dynamic adjustment step, the dynamic adjustment performed by the mounting base includes vertical height adjustment and/or horizontal rotation adjustment and/or pitch angle size adjustment performed by the mounting base.

Optionally, in an example of the above aspect, in the dynamic adjustment step, the dynamic adjustment performed by the mounting base is performed at a predetermined constant speed.
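To make the three adjustment axes and the constant-speed motion concrete, the sketch below advances a pose by direction × rate × elapsed time and clamps it to mechanical limits. The `Pose` fields, the rates in `RATES`, and the travel limits in `LIMITS` are all hypothetical values chosen for illustration; the text gives no numbers.

```python
from dataclasses import dataclass, replace
from typing import Dict, Tuple


@dataclass(frozen=True)
class Pose:
    height_mm: float
    rotation_deg: float
    pitch_deg: float


# Assumed constant rates (units per second) and travel limits per axis.
RATES: Dict[str, float] = {"height_mm": 20.0, "rotation_deg": 10.0, "pitch_deg": 5.0}
LIMITS: Dict[str, Tuple[float, float]] = {
    "height_mm": (0.0, 300.0),
    "rotation_deg": (-45.0, 45.0),
    "pitch_deg": (-30.0, 30.0),
}


def step(pose: Pose, axis: str, direction: int, seconds: float) -> Pose:
    """Move one axis at its constant rate for `seconds`, clamped to its limits."""
    low, high = LIMITS[axis]
    value = getattr(pose, axis) + direction * RATES[axis] * seconds
    return replace(pose, **{axis: min(high, max(low, value))})
```

For example, `step(Pose(100.0, 0.0, 0.0), "height_mm", +1, 0.5)` raises the base by 10 mm under the assumed rate, while a request that would exceed a limit stops at the limit.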

Optionally, in an example of the above aspect, the plurality of definition data items further includes: an eighth definition data item representing an adjustment speed of the mounting base, corresponding to an eighth user instruction.

Optionally, in an example of the above aspect, the eighth user instruction corresponding to the eighth definition data item includes: gestures performed at different motion speeds.

Optionally, in an example of the above aspect, the face recognition device further includes a sound pickup connected to the processor, and the eighth user instruction corresponding to the eighth definition data item includes: speech whose content indicates different movement speeds.

Optionally, in an example of the above aspect, in the dynamic adjustment step, after receiving gesture images of different motion speeds provided by a user and captured by the camera, the processor performs gesture recognition processing on the captured gesture images of different motion speeds, and compares the gesture images with the eighth defined data item stored in the data storage, so as to issue a corresponding instruction, so that the mounting base performs dynamic adjustment at a corresponding speed.

Optionally, in an example of the above aspect, in the dynamic adjustment step, after receiving voice data which is provided by a user and contains contents representing different moving speeds and picked up by the sound pickup device, the processor performs voice recognition processing on the picked-up voice data containing contents representing different moving speeds and compares the voice data with the eighth defined data item stored in the data storage, so as to issue a corresponding instruction, so that the mounting base is dynamically adjusted at a corresponding speed.
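One possible reading of the eighth definition data item is a small set of speed levels that either a gesture's motion speed or a spoken word can select. The sketch below is an assumption-laden illustration: the level names, the multipliers, the pixel-per-second thresholds, and the base rate of 20.0 are all hypothetical, and the hand positions are assumed to come from some tracker that the text does not describe.

```python
import math
from typing import Dict, Optional, Tuple

Point = Tuple[float, float]

# Assumed speed levels, expressed as multipliers of the predetermined speed.
SPEED_LEVELS: Dict[str, float] = {"slow": 0.5, "normal": 1.0, "fast": 2.0}


def hand_speed(start: Point, end: Point, seconds: float) -> float:
    """Average hand speed in pixels per second between two tracked positions."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return math.hypot(end[0] - start[0], end[1] - start[1]) / seconds


def level_from_gesture(px_per_s: float, slow_below: float = 150.0,
                       fast_above: float = 400.0) -> str:
    """Bucket a measured hand speed into a level (thresholds are assumptions)."""
    if px_per_s < slow_below:
        return "slow"
    if px_per_s > fast_above:
        return "fast"
    return "normal"


def level_from_speech(transcript: str) -> Optional[str]:
    """Map a recognized word to a level, or None if it names no level."""
    word = transcript.strip().lower()
    return word if word in SPEED_LEVELS else None


def base_speed(level: str, predetermined: float = 20.0) -> float:
    """Scale the predetermined adjustment speed by the selected level."""
    return predetermined * SPEED_LEVELS[level]
```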

According to a second aspect of the present invention, there is provided an adjusting device for a face recognition apparatus, the face recognition apparatus comprising: an input device for inputting required data; a dynamically adjustable mounting base; a camera mounted on the mounting base for capturing a facial biometric image of a user; a display for displaying images captured by the camera and/or a speaker for outputting sound; a processor for processing data from the camera and the input device and displaying the processed content through the display and/or outputting sound through the speaker; and a data storage for storing processing programs and data used by the processor, the adjusting device comprising:

an image capturing judgment module, configured to start the camera to capture a facial biometric image of the user after the user approaches the face recognition equipment, to determine whether the camera has captured a recognizable facial biometric image of the user, and, if the camera has captured no facial biometric image of the user or only part of one, to pass processing on to the prompt instruction module;

a prompt instruction module, configured to prompt the user, based on a plurality of preset definition data items that are stored in the data storage and that each define a different dynamic adjustment action of the mounting base, that the mounting base can be dynamically adjusted by providing one or more instructions corresponding to the definition data items; and

a dynamic adjustment module, configured to, after receiving one or more instructions provided by the user, cause the mounting base to perform the dynamic adjustment corresponding to the one or more definition data items, among the plurality of definition data items, that correspond to the provided instructions, so that the viewing angle range of the camera mounted on the mounting base is adjusted to capture a recognizable facial biometric image of the user.
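The first module above distinguishes three outcomes: no face captured, only part of a face captured, and a complete face captured. The text does not say how this is decided. One simple possibility, sketched here purely as an assumption, is to compare a detected face bounding box with the frame boundary; the `Box` layout, the `margin` default, and the idea that an upstream detector returns `None` when it finds no face are all hypothetical.

```python
from enum import Enum
from typing import Optional, Tuple

Box = Tuple[int, int, int, int]  # left, top, right, bottom, in pixels


class Capture(Enum):
    NONE = "none"        # no face found in the frame
    PARTIAL = "partial"  # face touches or crosses the frame edge
    FULL = "full"        # face lies completely inside the frame


def classify_capture(face: Optional[Box], frame_w: int, frame_h: int,
                     margin: int = 8) -> Capture:
    """Classify a detection; `margin` treats boxes hugging the edge as cut off."""
    if face is None:
        return Capture.NONE
    left, top, right, bottom = face
    inside = (left >= margin and top >= margin
              and right <= frame_w - margin and bottom <= frame_h - margin)
    return Capture.FULL if inside else Capture.PARTIAL


def needs_adjustment(result: Capture) -> bool:
    """Only a complete face lets the method skip the prompt instruction module."""
    return result is not Capture.FULL
```

A complete face inside the frame is a necessary condition in this sketch, not a guarantee that recognition succeeds; a real device would presumably also check the recognizer's own result.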

Optionally, in an example of the foregoing aspect, the adjusting apparatus further includes an input setting module, configured to input the predetermined plurality of definition data items; and a definition data item storage module for storing the plurality of definition data items.

Optionally, in an example of the above aspect, the adjusting apparatus further includes an adjustment ending module, configured to receive another instruction provided by the user to indicate that the dynamic adjustment ends if the camera captures a recognizable facial biometric image of the user, so that the mounting base stops the dynamic adjustment.

Optionally, in an example of the above aspect, the plurality of definition data items includes: a first definition data item representing a dynamic adjustment in which the mounting base is raised, corresponding to a first user instruction; a second definition data item representing a dynamic adjustment in which the mounting base is lowered, corresponding to a second user instruction; a third definition data item representing a dynamic adjustment in which the mounting base turns left, corresponding to a third user instruction; a fourth definition data item representing a dynamic adjustment in which the mounting base turns right, corresponding to a fourth user instruction; a fifth definition data item representing a dynamic adjustment in which the pitch angle of the mounting base becomes larger, corresponding to a fifth user instruction; a sixth definition data item representing a dynamic adjustment in which the pitch angle of the mounting base becomes smaller, corresponding to a sixth user instruction; and a seventh definition data item representing that the mounting base stops the dynamic adjustment, corresponding to a seventh user instruction.

Optionally, in an example of the above aspect, the voices of different content provided by the user corresponding to the respective definition data items include: the voice corresponding to the first definition data item is "up", indicating that the mounting base is required to be raised; the voice corresponding to the second definition data item is "down", indicating that the mounting base is required to be lowered; the voice corresponding to the third definition data item is "to the left", indicating that the mounting base is required to turn left; the voice corresponding to the fourth definition data item is "right", indicating that the mounting base is required to turn right; the voice corresponding to the fifth definition data item is "pitch large", indicating that the mounting base is required to increase its pitch angle; the voice corresponding to the sixth definition data item is "pitch small", indicating that the mounting base is required to decrease its pitch angle; and the voice corresponding to the seventh definition data item is "end", indicating that the mounting base is required to stop the dynamic adjustment.

Optionally, in an example of the above aspect, before the camera is started to capture the facial biometric image of the user, the mounting base resets the camera to an initial height and an initial pitch angle.

Optionally, in an example of the foregoing aspect, the prompt instruction module prompts the user that the mounting base can be dynamically adjusted by providing gestures corresponding to the definition data items.

Optionally, in an example of the above aspect, the prompt instruction module prompts the user that the mounting base can be dynamically adjusted by providing voice with one or more items of content corresponding to the definition data items.

Optionally, in an example of the above aspect, after the camera captures one or more gestures provided by the user, the dynamic adjustment module performs gesture recognition processing on the captured gesture image, and compares the captured gesture image with one or more preset definition data items stored in the definition data item storage module, so as to issue a corresponding instruction, so that the mounting base performs corresponding dynamic adjustment.

Optionally, in an example of the above aspect, after receiving the voice of one or more items of content provided by the user and picked up by the sound pickup, the dynamic adjustment module performs voice recognition processing on the picked-up voice data, and compares the voice data with one or more preset definition data items stored in the definition data item storage module, so as to issue a corresponding instruction, so that the mounting base performs corresponding dynamic adjustment.

Optionally, in an example of the above aspect, the dynamic adjustment by the mounting base includes up-down height adjustment and/or left-right rotation adjustment and/or pitch angle size adjustment by the mounting base.

Optionally, in an example of the above aspect, the dynamic adjustment by the mounting base is performed at a predetermined constant speed.

Optionally, in an example of the above aspect, after receiving gesture images with different motion speeds provided by a user and captured by the camera, the dynamic adjustment module performs gesture recognition processing on the captured gesture images with different motion speeds, and performs comparison recognition processing on the captured gesture images with the eighth definition data item stored in the definition data item storage module, so as to issue a corresponding instruction, so that the mounting base performs dynamic adjustment at a corresponding speed.

Optionally, in an example of the above aspect, after receiving the voice data containing contents indicating different moving speeds, which is provided by the user and picked up by the sound pickup device, the dynamic adjustment module performs a voice recognition process on the picked-up voice data containing contents indicating different moving speeds, and performs a comparison recognition process on the voice data and the eighth definition data item stored in the definition data item storage module, so as to issue a corresponding instruction, so that the mounting base performs dynamic adjustment at a corresponding speed.

Optionally, in an example of the above aspect, the input setting module in the adjusting device is implemented by the input device in the face recognition apparatus.

Optionally, in an example of the above aspect, the defining data item storing module in the adjusting device is implemented by the data storage in the face recognition apparatus.

Optionally, in an example of the above aspect, the image capturing judgment module in the adjusting device is implemented by the camera and the processor in the face recognition apparatus.

Optionally, in an example of the above aspect, the prompt instruction module in the adjusting device is implemented by the processor and the display and/or the speaker in the face recognition apparatus.

Optionally, in an example of the above aspect, the dynamic adjustment module and the adjustment ending module in the adjusting device are implemented by the processor in the face recognition apparatus.

According to a third aspect of the present invention, there is provided a face recognition apparatus comprising: an input device for inputting required data; a dynamically adjustable mounting base; a camera mounted on the mounting base for capturing a facial biometric image of a user; a display for displaying images captured by the camera and/or a speaker for outputting sound; a processor for processing data from the camera and the input device and displaying the processed content through the display and/or outputting sound through the speaker; and a data storage for storing processing programs and data used by the processor, wherein the face recognition apparatus further comprises the above-mentioned adjusting device.

Optionally, in an example of the above aspect, the input setting module in the adjusting device is implemented by the input device in the face recognition apparatus.

Optionally, in an example of the above aspect, the definition data item storage module in the adjusting device is implemented by the data storage in the face recognition apparatus.

Optionally, in an example of the above aspect, the image capturing judgment module in the adjusting device is implemented by the camera and the processor in the face recognition apparatus.

Optionally, in an example of the above aspect, the dynamic adjustment module and the adjustment ending module in the adjusting device are implemented by the processor in the face recognition apparatus.

Optionally, in an example of the above aspect, the face recognition device further includes a sound pickup connected to the processor, and the prompt instruction module prompts the user that the mounting base can be dynamically adjusted by providing voice with one or more items of content corresponding to the definition data items. If the sound pickup picks up voice with one or more items of content provided by the user, the dynamic adjustment module causes the mounting base to perform the dynamic adjustment corresponding to the one or more definition data items, among the plurality of definition data items, that correspond to the content of the voice provided by the user.

According to the adjusting device and method for face recognition equipment described above, and the face recognition equipment including the adjusting device, the mounting base carrying the camera of the face recognition equipment is dynamically adjusted in response to different biometric inputs provided by the user (gestures or voice), which changes the viewing angle range of the camera. The camera can thus capture a recognizable facial biometric image of the user without the user having to move and/or change the height of the face, so that face recognition is achieved quickly and conveniently, the user's face-brushing experience becomes more comfortable, and overall recognition efficiency is improved.

Drawings

A further understanding of the nature and advantages of the present disclosure may be realized by reference to the following drawings. In the drawings, similar components or features may have the same reference numerals.

FIGS. 1A-1C are schematic diagrams illustrating application scene states of a face recognition device in the prior art;

fig. 2 is a block diagram illustrating an adjusting apparatus of a face recognition device based on gesture recognition according to an embodiment of the present disclosure;

fig. 3 is a block diagram illustrating a structure of an adjusting apparatus of a face recognition device based on speech recognition according to another embodiment of the present disclosure;

FIG. 4 illustrates a flow chart of an adjustment method of a face recognition device based on gesture recognition according to an embodiment of the present disclosure;

5A-5C are schematic diagrams illustrating application scene states adjusted by the adjustment method of the face recognition device based on gesture recognition shown in FIG. 4;

fig. 6 is a flowchart illustrating a gesture recognition process in the adjustment method of the face recognition device based on gesture recognition shown in fig. 4;

fig. 7 shows a flow chart of an adjustment method of a face recognition device based on speech recognition according to another embodiment of the present disclosure;

FIGS. 8A-8C are schematic diagrams illustrating application scene states adjusted by the adjustment method of the face recognition device based on voice recognition shown in FIG. 7;

fig. 9 shows a flowchart of a voice recognition process in an adjustment method of a face recognition apparatus based on voice recognition according to another embodiment of the present disclosure;

FIG. 10 is a block diagram of a face recognition device including an adjustment mechanism based on gesture recognition according to an embodiment of the present disclosure;

fig. 11 is a block diagram illustrating a configuration of a face recognition apparatus including an adjustment device based on speech recognition according to another embodiment of the present disclosure.

Detailed Description

The subject matter described herein will now be discussed with reference to specific embodiments. It should be understood that these embodiments are discussed only to enable those skilled in the art to better understand and thereby implement the subject matter described herein, and are not intended to limit the scope, applicability, or examples set forth in the claims. Changes may be made in the function and arrangement of elements discussed without departing from the scope of the disclosure. Various examples may omit, substitute, or add various procedures or components as needed. In addition, features described with respect to some examples may also be combined in other examples.

As used herein, the term "include" and its variants are open-ended terms meaning "including, but not limited to". The term "based on" means "based at least in part on". The terms "one embodiment" and "an embodiment" mean "at least one embodiment". The term "another embodiment" means "at least one other embodiment". The terms "first," "second," and the like may refer to different or the same objects. Other definitions, whether explicit or implicit, may be included below. The definition of a term is consistent throughout the specification unless the context clearly dictates otherwise.

Fig. 2 shows a block diagram illustrating the structure of the adjustment apparatus 100 of the face recognition device 500/600 based on gesture recognition according to an embodiment of the present disclosure.

As shown in fig. 2, there is provided an adjusting apparatus 100 for a face recognition device based on gesture recognition. The face recognition device 500 (see fig. 10) comprises: an input device 510 for inputting required data; a mounting base 520 capable of dynamic adjustment, including vertical height adjustment and/or horizontal rotation adjustment and/or pitch angle adjustment; a camera 530 disposed on the mounting base 520 for capturing a facial biometric image of a user; a display 540 for displaying images captured by the camera 530; a processor 550 for performing required processing on image data from the camera 530 and data from the input device 510; and a data storage 560 for storing processing programs and data used by the processor 550. The adjusting apparatus 100 includes an input setting module 110, a definition data item storage module 120, a camera determination module 130, a prompt instruction module 140, a dynamic adjustment module 150, and an adjustment ending module 160.

The input setting module 110 is implemented by the input device 510 in the face recognition device 500, and is configured to input a plurality of preset definition data items corresponding to different gestures provided by a user, each definition data item defining a corresponding dynamic adjustment action of the mounting base 520. The definition data items include: a first definition data item indicating that the mounting base 520 is adjusted upward in height in response to a user gesture of a fist with the thumb pointing up; a second definition data item indicating that the mounting base 520 is adjusted downward in height in response to a user gesture of a fist with the thumb pointing down; a third definition data item indicating that the mounting base 520 is turned to the left in response to a user gesture of a fist with the thumb pointing left; a fourth definition data item indicating that the mounting base 520 is turned to the right in response to a user gesture of a fist with the thumb pointing right; a fifth definition data item indicating that the pitch angle of the mounting base 520 is increased in response to a user gesture of five fingers held together pointing up; a sixth definition data item indicating that the pitch angle of the mounting base 520 is decreased in response to a user gesture of five fingers held together pointing down; and a seventh definition data item indicating that the mounting base 520 stops the dynamic adjustment in response to a user gesture of a closed fist.
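By way of a non-limiting illustration, the seven definition data items described above may be sketched as a simple lookup table. The gesture labels (such as `fist_thumb_up`) and the names `Action`, `DEFINITION_DATA_ITEMS` and `lookup_action` are hypothetical and are not part of the disclosure; a real gesture recognizer would supply its own labels.

```python
from enum import Enum


class Action(Enum):
    """Dynamic adjustment actions of the mounting base."""
    HEIGHT_UP = "height up"
    HEIGHT_DOWN = "height down"
    TURN_LEFT = "turn left"
    TURN_RIGHT = "turn right"
    PITCH_INCREASE = "pitch increase"
    PITCH_DECREASE = "pitch decrease"
    STOP = "stop"


# Hypothetical gesture labels mapped to the first through seventh definition data items.
DEFINITION_DATA_ITEMS = {
    "fist_thumb_up": Action.HEIGHT_UP,         # first definition data item
    "fist_thumb_down": Action.HEIGHT_DOWN,     # second
    "fist_thumb_left": Action.TURN_LEFT,       # third
    "fist_thumb_right": Action.TURN_RIGHT,     # fourth
    "five_fingers_up": Action.PITCH_INCREASE,  # fifth
    "five_fingers_down": Action.PITCH_DECREASE,  # sixth
    "closed_fist": Action.STOP,                # seventh
}


def lookup_action(gesture_label):
    """Return the action defined for a recognized gesture, or None if no item matches."""
    return DEFINITION_DATA_ITEMS.get(gesture_label)
```

In this sketch an unrecognized gesture simply yields `None`, leaving the caller free to ignore it; the disclosure does not specify how unmatched gestures are handled.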

The definition data item storage module 120 is implemented by the data storage 560 in the face recognition apparatus 500 for storing the plurality of definition data items.

The camera determination module 130 is implemented by the camera 530 and the processor 550 in the face recognition device 500. It is configured to cause the camera 530 to start capturing a facial biometric image of the user when the user is positioned in front of the face recognition device 500, and to determine, by the processor 550, whether the camera 530 has captured a recognizable facial biometric image of the user. If so, the processor 550 performs subsequent recognition processing on the captured image; if the camera 530 has captured no facial biometric image of the user, or only part of one (for example, because the user is tall, as shown in fig. 5A), processing passes to the prompt instruction module 140.
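The branch taken by the camera determination module can be illustrated with a minimal sketch. The disclosure does not state how "recognizable" is decided; the parameter `face_fraction` (the share of the face inside the frame, or `None` when no face is detected) and the `threshold` value are assumptions made only for this example.

```python
def camera_determination(face_fraction, threshold=0.9):
    """Decide the next processing stage from a hypothetical face-visibility score.

    face_fraction: assumed value in [0, 1], or None if no face was detected.
    Returns "recognize" when the image is treated as recognizable,
    otherwise "prompt_user" (hand over to the prompt instruction module).
    """
    if face_fraction is not None and face_fraction >= threshold:
        return "recognize"
    return "prompt_user"
```

For example, a partially captured face (say `face_fraction=0.4`) leads to the prompt stage, as does the absence of any face.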

The prompt instruction module 140 is implemented by the display 540 and the processor 550 in the face recognition device 500, and is configured to display prompt content on the display 540, informing the user that the mounting base 520 can be dynamically adjusted by making, toward the camera 530, the gestures corresponding to the definition data items. The prompt content includes: 1. To raise the camera, make this gesture to the camera: a fist with the thumb pointing up; 2. To lower the camera, make this gesture to the camera: a fist with the thumb pointing down; 3. To turn the camera left, make this gesture to the camera: a fist with the thumb pointing left; 4. To turn the camera right, make this gesture to the camera: a fist with the thumb pointing right; 5. To increase the pitch angle of the camera, make this gesture to the camera: five fingers held together pointing up; 6. To decrease the pitch angle of the camera, make this gesture to the camera: five fingers held together pointing down; 7. To end the adjustment, make this gesture to the camera: a closed fist.

The dynamic adjustment module 150 is implemented by the processor 550 in the face recognition device 500. After the display 540 shows that the camera 530 has captured a gesture provided by the user (for example, when the user is tall as shown in fig. 5A, the gesture is a fist with the thumb pointing up), the module performs gesture recognition processing on the captured gesture image and, based on the definition data item among the plurality of definition data items that corresponds to the provided gesture (for example, the mounting base 520 is adjusted upward in height), issues an instruction causing the mounting base 520 to perform the corresponding dynamic adjustment at a constant speed (for example, moving upward in height at a constant speed). Subsequently, the user may provide another gesture (for example, five fingers held together pointing up). After the display 540 shows that the camera 530 has captured this gesture, the dynamic adjustment module 150 performs gesture recognition processing on the captured gesture image, compares the result with the corresponding definition data item stored in the definition data item storage module 120 (for example, the pitch angle of the mounting base 520 is increased), and issues a corresponding instruction so that the mounting base 520 changes from the previous constant-speed adjustment (for example, upward in height) to another constant-speed adjustment (for example, increasing the pitch angle). Constant-speed dynamic adjustment means, for example, moving up or down at 1 cm/second, and/or rotating left or right at 1 degree/second, and/or changing the pitch angle at 1 degree/second; different constant speeds may of course be set for different application scenarios.
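The constant-speed adjustment described above can be sketched as a per-interval state update, using the example rates of 1 cm/second and 1 degree/second. The names `BaseState`, `RATES` and `step`, the starting height of 150 cm, and the sign convention for rotation (left negative, right positive) are assumptions made only for illustration.

```python
from dataclasses import dataclass

# Example rates from the text: 1 cm/s for height, 1 degree/s for rotation and pitch.
# Each action maps to (state attribute, signed rate per second).
RATES = {
    "height_up": ("height_cm", +1.0),
    "height_down": ("height_cm", -1.0),
    "turn_left": ("rotation_deg", -1.0),   # sign convention is an assumption
    "turn_right": ("rotation_deg", +1.0),
    "pitch_increase": ("pitch_deg", +1.0),
    "pitch_decrease": ("pitch_deg", -1.0),
}


@dataclass
class BaseState:
    """Hypothetical pose of the mounting base."""
    height_cm: float = 150.0
    rotation_deg: float = 0.0
    pitch_deg: float = 0.0


def step(state, action, dt_s):
    """Advance the base by dt_s seconds of constant-speed motion for one action."""
    attribute, rate = RATES[action]
    setattr(state, attribute, getattr(state, attribute) + rate * dt_s)
    return state
```

Switching from one gesture to another then amounts to calling `step` with a different action; only one axis moves at a time in this sketch.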

The adjustment ending module 160 is implemented by the processor 550 in the face recognition device 500. While the mounting base 520 performs the constant-speed dynamic adjustment (for example, moving upward in height and/or increasing the pitch angle), the camera 530 continues to capture the facial biometric image of the user. When the display 540 shows that the camera 530 has captured a recognizable facial biometric image, the user provides the gesture indicating the end of the dynamic adjustment (a closed fist). The processor 550 performs gesture recognition processing on the captured gesture image, compares the result with the corresponding definition data item stored in the definition data item storage module 120 (for example, the mounting base 520 ends the dynamic adjustment), and issues an end-of-adjustment instruction so that the mounting base 520 stops the dynamic adjustment; the captured facial biometric image of the user is then sent to the processor 550 for subsequent recognition processing.
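The interplay between the dynamic adjustment and the end-of-adjustment gesture can be sketched as a loop over recognized gestures, one per captured frame. The function name `run_until_stop` and the gesture labels are hypothetical; frames without a recognized gesture are represented by `None`.

```python
def run_until_stop(gestures):
    """Return the sequence of adjustment commands issued for a stream of gestures.

    gestures: iterable of hypothetical gesture labels, or None for frames
    in which no gesture is recognized. Processing ends at the closed fist.
    """
    current = None
    issued = []
    for gesture in gestures:
        if gesture == "closed_fist":
            issued.append("stop")  # seventh definition data item: end adjustment
            return issued
        if gesture is not None and gesture != current:
            current = gesture      # a new gesture replaces the ongoing adjustment
            issued.append(gesture)
    return issued
```

A repeated gesture does not reissue a command in this sketch, since the base is already moving in that direction; gestures arriving after the closed fist are ignored.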

In a variant of this embodiment (not shown), said input setting module 110 and said defining data item storage module 120 are not included; the instruction module 140 provides a prompt to the user based on a plurality of defined data items stored in the data storage and respectively defining the installation base 520 to perform different dynamic adjustment actions, and informs the user that the installation base 520 can perform corresponding dynamic adjustment by providing one or more instructions corresponding to the defined data items.

In another variation of this embodiment (not shown), the end of adjustment module 160 is not included; the plurality of definition data items does not include the seventh definition data item; the prompt content displayed by the display 540 does not include: 7. if the adjustment is finished, please make a gesture to the camera: a clenched fist; after the camera 530 captures the recognizable facial biometric image of the user, the dynamic adjustment module 150 can automatically recognize and issue an instruction to end the dynamic adjustment of the mounting base 520.

In another variant of this embodiment (not shown), the plurality of definition data items further comprises: an eighth defined data item representing different adjustment speeds of the mounting base 520 corresponding to gestures with different movement speeds provided by the user (e.g., a fist-raising thumb-up gesture reciprocating at different movement speeds), for example, setting the movement speeds of the gestures provided by the user to a range a (reciprocating 1 time/second), a range B (reciprocating 2 times/second), and a range C (reciprocating 3 times and more/second), the dynamic adjustment speed of the mounting base 520 corresponds to a speed a (high-low adjustment of 1 cm/second; or left-right or pitch angle adjustment of 1 degree/second), a speed B (high-low adjustment of 2 cm/second; or left-right or pitch angle adjustment of 2 degrees/second), and a speed C (high-low adjustment of 3 cm/second; or left-right or pitch angle adjustment of 3 degrees/second).

In this variation, after the display 540 shows that the camera 530 has captured a gesture with a movement speed provided by the user (for example, a fist with the thumb pointing up, reciprocating 2 times/second), the dynamic adjustment module 150 performs gesture recognition processing on the captured gesture images and compares the result with the corresponding definition data items stored in the definition data item storage module 120, including the eighth definition data item (for example, the mounting base 520 is adjusted upward in height at speed B: 2 cm/second), and issues a corresponding instruction so that the mounting base 520 performs the corresponding dynamic adjustment (for example, upward in height) at the corresponding speed (for example, speed B: 2 cm/second). Subsequently, the user provides a gesture with another movement speed (for example, five fingers held together pointing up, reciprocating 1 time/second). After the display 540 shows that the camera 530 has captured this gesture, the dynamic adjustment module 150 performs gesture recognition processing on the captured gesture images and compares the result with the corresponding definition data items stored in the definition data item storage module 120, including the eighth definition data item (for example, the pitch angle of the mounting base 520 is increased at speed A: 1 degree/second), and issues a corresponding instruction so that the mounting base 520 performs the corresponding dynamic adjustment (for example, increasing the pitch angle) at the corresponding speed (for example, speed A: 1 degree/second). This continues until the display 540 shows the recognizable facial biometric image of the user captured by the camera 530, whereupon the user provides the gesture indicating the end of the dynamic adjustment (a closed fist); the processor 550 performs gesture recognition processing on the captured gesture image, compares the result with the corresponding definition data item stored in the definition data item storage module 120 (for example, the mounting base 520 ends the dynamic adjustment), and issues an end-of-adjustment instruction so that the mounting base 520 stops the dynamic adjustment.
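The eighth definition data item can be illustrated by a sketch that maps the reciprocation frequency of a gesture to speed A, B or C. The text only names the nominal frequencies (1, 2, and 3 or more times/second); treating them as lower thresholds, and returning `None` below 1 time/second, are assumptions of this example, as are the names `speed_level` and `adjustment_rate`.

```python
# Speeds from the text: A = 1, B = 2, C = 3 (cm/s for height, degree/s otherwise).
SPEED_VALUES = {"A": 1.0, "B": 2.0, "C": 3.0}


def speed_level(reciprocations_per_second):
    """Bucket a gesture's reciprocation frequency into speed A, B or C."""
    if reciprocations_per_second >= 3:
        return "C"
    if reciprocations_per_second >= 2:
        return "B"
    if reciprocations_per_second >= 1:
        return "A"
    return None  # assumption: slower movement carries no speed information


def adjustment_rate(action, reciprocations_per_second):
    """Return (unit, rate) for an action, or None if no speed level applies."""
    level = speed_level(reciprocations_per_second)
    if level is None:
        return None
    unit = "cm/s" if action in ("height_up", "height_down") else "degree/s"
    return unit, SPEED_VALUES[level]
```

For instance, a thumb-up fist reciprocating 2 times/second corresponds to a height adjustment at speed B in this sketch.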

In another variation (not shown) of this embodiment, the face recognition device 500 further includes a speaker 570, the instruction module 140 is implemented by the processor 550 and the display 540 and/or the speaker 570 in the face recognition device, and the instruction module 140 displays the instruction content through the display 540 and/or broadcasts the instruction content through the speaker 570.

In another variation of this embodiment (not shown), before the camera 530 starts capturing the facial biometric image of the user, the camera determination module 130 resets, by means of the processor 550, the mounting base 520 to an initial height and an initial pitch angle calculated from the statistical average height of people in the region.
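The disclosure does not specify how the initial height and initial pitch angle are derived from the average height. One possible sketch, under stated assumptions, places the camera at the estimated face height where the base's travel allows, and otherwise tilts it toward the face. The face offset of 12 cm below the top of the head, the standing distance of 60 cm, and the function name `initial_pose` are all hypothetical values chosen for illustration.

```python
import math
from statistics import mean


def initial_pose(heights_cm, min_cm, max_cm, face_offset_cm=12.0, distance_cm=60.0):
    """Estimate an initial base height (cm) and pitch angle (degrees).

    heights_cm: sampled body heights of people in the region.
    min_cm, max_cm: assumed travel limits of the mounting base.
    face_offset_cm, distance_cm: illustrative assumptions, not from the disclosure.
    """
    face_cm = mean(heights_cm) - face_offset_cm
    base_cm = min(max(face_cm, min_cm), max_cm)  # clamp to the base's travel
    # Tilt only as far as needed to aim at the estimated face height.
    pitch_deg = math.degrees(math.atan2(face_cm - base_cm, distance_cm))
    return base_cm, pitch_deg
```

When the estimated face height lies within the travel limits, the pitch angle comes out as zero and the height alone carries the adjustment.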

Fig. 3 is a block diagram illustrating an adjusting apparatus 200 of a face recognition device based on speech recognition according to another embodiment of the present disclosure.

As shown in fig. 3, there is provided an adjusting apparatus 200 for a face recognition device based on voice recognition. The face recognition device 600 (see fig. 11) comprises: an input device 610 for inputting required data; a mounting base 620 capable of dynamic adjustment, including vertical height adjustment and/or horizontal rotation adjustment and/or pitch angle adjustment; a camera 630 disposed on the mounting base 620 for capturing a facial biometric image of a user; a display 640 for displaying images captured by the camera 630; a sound pickup 670 for picking up voice provided by the user; a processor 650 for performing required processing on image data from the camera 630 and data from the input device 610; and a data storage 660 for storing processing programs and data used by the processor 650. The adjusting apparatus 200 includes an input setting module 210, a definition data item storage module 220, a camera determination module 230, a prompt instruction module 240, a dynamic adjustment module 250, and an adjustment ending module 260.

The input setting module 210 is implemented by the input device 610 in the face recognition device 600, and is configured to input a plurality of preset definition data items corresponding to voices with different contents provided by a user, each definition data item defining a corresponding dynamic adjustment action of the mounting base 620. The definition data items include: a first definition data item indicating that the mounting base 620 is adjusted upward in height in response to the user voice content "up"; a second definition data item indicating that the mounting base 620 is adjusted downward in height in response to the user voice content "down"; a third definition data item indicating that the mounting base 620 is turned to the left in response to the user voice content "left"; a fourth definition data item indicating that the mounting base 620 is turned to the right in response to the user voice content "right"; a fifth definition data item indicating that the pitch angle of the mounting base 620 is increased in response to the user voice content "pitch large"; a sixth definition data item indicating that the pitch angle of the mounting base 620 is decreased in response to the user voice content "pitch small"; and a seventh definition data item indicating that the mounting base 620 stops the dynamic adjustment in response to the user voice content "end".

The definition data item storage module 220 is implemented by the data storage 660 in the face recognition apparatus 600 for storing the plurality of definition data items.

The camera determination module 230 is implemented by the camera 630 and the processor 650 in the face recognition device 600. It is configured to cause the camera 630 to start capturing a facial biometric image of the user when the user is positioned in front of the face recognition device 600, and to determine, by the processor 650, whether the camera 630 has captured a recognizable facial biometric image of the user. If so, the processor 650 performs subsequent recognition processing on the captured image; if it is determined that the camera 630 has captured no facial biometric image of the user, or only part of one (for example, because the user is short, as shown in fig. 8A), processing passes to the prompt instruction module 240.

The prompt instruction module 240 is implemented by the display 640 and the processor 650 in the face recognition device 600, and is configured to display prompt content on the display 640, informing the user that the mounting base 620 can be dynamically adjusted by speaking voice commands with the contents corresponding to the definition data items. The prompt content includes: 1. To raise the camera, say "up"; 2. To lower the camera, say "down"; 3. To turn the camera left, say "left"; 4. To turn the camera right, say "right"; 5. To tilt the camera upward, say "pitch large"; 6. To tilt the camera downward, say "pitch small"; 7. To end the adjustment, say "end".

The dynamic adjustment module 250 is implemented by the processor 650 in the face recognition device 600. After the sound pickup 670 picks up a voice provided by the user (for example, when the user is short as shown in fig. 8A, the user says "down"), the module performs voice recognition processing on the picked-up voice data and, based on the definition data item among the plurality of definition data items that corresponds to the content of the voice (for example, the mounting base 620 is adjusted downward in height), issues a corresponding instruction causing the mounting base 620 to perform the corresponding dynamic adjustment at a constant speed (for example, moving downward in height at a constant speed). Subsequently, the user may provide a voice with other content (for example, "pitch small"). After the sound pickup 670 picks up this voice, the dynamic adjustment module 250 performs voice recognition processing on the picked-up voice data, compares the result with the corresponding definition data item stored in the definition data item storage module 220 (for example, the pitch angle of the mounting base 620 is decreased), and issues a corresponding instruction so that the mounting base 620 changes from the previous constant-speed adjustment (for example, downward in height) to another constant-speed adjustment (for example, decreasing the pitch angle). Constant-speed dynamic adjustment means, for example, moving up or down at 1 cm/second, and/or rotating left or right at 1 degree/second, and/or changing the pitch angle at 1 degree/second; different constant speeds may of course be set for different application scenarios.

The adjustment ending module 260 is implemented by the processor 650 in the face recognition device 600. While the mounting base 620 performs the constant-speed dynamic adjustment (for example, moving downward in height and/or decreasing the pitch angle), the camera 630 continues to capture the facial biometric image of the user. When the display 640 shows that the camera 630 has captured a recognizable facial biometric image, the user provides the voice indicating the end of the dynamic adjustment ("end"). The processor 650 performs voice recognition processing on the picked-up voice data, compares the result with the corresponding definition data item stored in the definition data item storage module 220 (for example, the mounting base 620 ends the dynamic adjustment), and issues an end-of-adjustment instruction so that the mounting base 620 stops the dynamic adjustment; the captured facial biometric image of the user is then sent to the processor 650 for subsequent recognition processing.

In a variant of this embodiment (not shown), said input settings module 210 and said definition data item storage module 220 are not included; the instruction prompting module 240 provides a prompt to a user based on a plurality of defined data items stored in the data storage, wherein the defined data items are preset and respectively define different dynamic adjustment actions corresponding to the installation base 620, and informs the user that the installation base 620 can be dynamically adjusted correspondingly by providing one or more instructions corresponding to the defined data items.

In another variation of this embodiment (not shown), the end of adjustment module 260 is not included; the plurality of definition data items does not include the seventh definition data item; the prompt contents displayed by the display 640 do not include: 7. if the adjustment is finished, please say "finished"; after the camera 630 captures an image of a human face biometric feature recognizable by the user, the dynamic adjustment module 250 may automatically recognize and issue an instruction, so that the dynamic adjustment is finished by the mounting base 620.

In another variant of this embodiment (not shown), the plurality of definition data items further comprises: an eighth definition data item indicating different adjustment speeds of the mounting base 620 corresponding to different movement speeds expressed in the voice content provided by the user (e.g., fast/medium/slow). For example, with the movement speed expressed in the voice content provided by the user set to "slow" (reciprocating 1 time/second), "medium" (reciprocating 2 times/second) and "fast" (reciprocating 3 times and more/second), the dynamic adjustment speed of the mounting base 620 corresponds to speed A (height adjustment of 1 cm/second; or left-right or pitch angle adjustment of 1 degree/second), speed B (height adjustment of 2 cm/second; or left-right or pitch angle adjustment of 2 degrees/second) and speed C (height adjustment of 3 cm/second; or left-right or pitch angle adjustment of 3 degrees/second), respectively.

In this variation, after the sound pickup 670 picks up a voice with a given content provided by the user (for example, the user says "slow down"), the dynamic adjustment module 250 performs voice recognition processing on the picked-up voice data and compares the result with the corresponding definition data items stored in the definition data item storage module 220, including the eighth definition data item (for example, the mounting base 620 is adjusted downward in height at speed A: 1 cm/second), and issues a corresponding instruction so that the mounting base 620 performs the corresponding dynamic adjustment at the corresponding speed (for example, downward in height at speed A: 1 cm/second). Subsequently, when the user provides a voice with other content (for example, the user says "medium pitch small"), after the sound pickup 670 picks up this voice, the dynamic adjustment module 250 performs voice recognition processing on the picked-up voice data and compares the result with the corresponding definition data items stored in the definition data item storage module 220, including the eighth definition data item (for example, the pitch angle of the mounting base 620 is decreased at speed B: 2 degrees/second), and issues a corresponding instruction so that the mounting base 620 changes from the previous dynamic adjustment (for example, downward in height at speed A: 1 cm/second) to another dynamic adjustment at the corresponding speed (for example, decreasing the pitch angle at speed B: 2 degrees/second).
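Interpreting a recognized voice command that combines a speed word with a direction can be sketched as follows. The transcript is assumed to be already produced by a speech recognizer; the English command phrases, the word order (speed word first), the default rate of 1 when no speed word is spoken, and the names `SPEED_WORDS`, `COMMANDS` and `parse_command` are assumptions of this example rather than part of the disclosure.

```python
# Speed words mapped to speeds A, B, C (cm/s for height, degree/s otherwise).
SPEED_WORDS = {"slow": 1.0, "medium": 2.0, "fast": 3.0}

# Illustrative English renderings of the voice commands: phrase -> (axis, direction).
COMMANDS = {
    "up": ("height", +1),
    "down": ("height", -1),
    "left": ("rotation", -1),   # sign convention is an assumption
    "right": ("rotation", +1),
    "pitch large": ("pitch", +1),
    "pitch small": ("pitch", -1),
}


def parse_command(transcript, default_rate=1.0):
    """Turn a transcript into (axis, signed_rate), ("stop", 0.0), or None."""
    words = transcript.lower().split()
    rate = default_rate
    if words and words[0] in SPEED_WORDS:
        rate = SPEED_WORDS[words[0]]  # optional leading speed word
        words = words[1:]
    phrase = " ".join(words)
    if phrase == "end":
        return ("stop", 0.0)
    if phrase not in COMMANDS:
        return None  # unrecognized content is ignored in this sketch
    axis, direction = COMMANDS[phrase]
    return (axis, direction * rate)
```

Under these assumptions, "slow down" yields a downward height adjustment at speed A, and a bare "down" falls back to the default constant speed.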

In this variation, the face recognition device 600 further includes a speaker 680, the instruction module 240 is implemented by the processor 650 and the display 640 and/or the speaker 680 in the face recognition device 600, and the instruction module 240 displays the instruction content through the display 640 and/or broadcasts the instruction content through the speaker 680.

In another variation of this embodiment (not shown), before the camera 630 starts capturing the facial biometric image of the user, the camera determination module 230 resets the mounting base 620 to an initial height and an initial pitch angle calculated from the statistical average height of people in the region.

Fig. 4 shows a flowchart of an adjustment method 300 of the face recognition apparatus implemented by using the adjustment device shown in fig. 2 according to an embodiment of the present disclosure.

As shown in fig. 4, there is provided an adjustment method 300 for a face recognition device based on gesture recognition. The face recognition device 500 (see fig. 10) comprises: an input device 510 for inputting required data; a mounting base 520 capable of dynamic adjustment, including vertical height adjustment and/or horizontal rotation adjustment and/or pitch angle adjustment; a camera 530 disposed on the mounting base 520 for capturing a facial biometric image of a user; a display 540 for displaying the image captured by the camera 530; a processor 550 for performing required processing on image data from the camera 530 and data from the input device 510; and a data storage 560 for storing processing programs and data used by the processor 550. The adjustment method 300 comprises an input setting step 310, a definition data item storage step 320, a photographing judgment step 330, a prompt instruction step 340, a dynamic adjustment step 350, and an adjustment ending step 360.

The input setting step 310 is implemented through the input device 510 of the face recognition device 500 to input a plurality of preset definition data items corresponding to different gestures provided by a user, each definition data item defining a corresponding dynamic adjustment action of the mounting base 520. The plurality of definition data items include: a first definition data item indicating that the mounting base 520 is adjusted upward in height in response to a user gesture of a fist with the thumb pointing up; a second definition data item indicating that the mounting base 520 is adjusted downward in height in response to a user gesture of a fist with the thumb pointing down; a third definition data item indicating that the mounting base 520 is turned to the left in response to a user gesture of a fist with the thumb pointing left; a fourth definition data item indicating that the mounting base 520 is turned to the right in response to a user gesture of a fist with the thumb pointing right; a fifth definition data item indicating that the pitch angle of the mounting base 520 is increased in response to a user gesture of five fingers held together pointing up; a sixth definition data item indicating that the pitch angle of the mounting base 520 is decreased in response to a user gesture of five fingers held together pointing down; and a seventh definition data item indicating that the mounting base 520 stops the dynamic adjustment in response to a user gesture of a closed fist.

The defining data item storing step 320 is implemented by the data storage 560 in the face recognition apparatus 500 to store the plurality of defining data items.

The photographing judgment step 330 is implemented by the camera 530 and the processor 550 in the face recognition device 500. When the user is positioned in front of the face recognition device 500, the camera 530 starts capturing a facial biometric image of the user, and the processor 550 determines whether the camera 530 has captured a recognizable facial biometric image of the user. If so, the processor 550 performs subsequent recognition processing on the captured image; if the camera 530 has captured no facial biometric image of the user, or only part of one (for example, because the user is tall, as shown in fig. 5A), the process proceeds to the prompt instruction step 340.

The prompt instruction step 340 is implemented by the display 540 and the processor 550 in the face recognition device 500 to display prompt content through the display 540, prompting the user that the mounting base 520 can be dynamically adjusted by presenting to the camera 530 the different gestures corresponding to the respective definition data items. The prompt content includes: 1. If the camera needs to rise, please make a gesture to the camera: a fist with the thumb pointing up; 2. If the camera needs to descend, please make a gesture to the camera: a fist with the thumb pointing down; 3. If the camera needs to turn left, please make a gesture to the camera: a fist with the thumb pointing left; 4. If the camera needs to turn right, please make a gesture to the camera: a fist with the thumb pointing right; 5. If the pitch angle of the camera needs to be increased, please make a gesture to the camera: five fingers held together pointing upward; 6. If the pitch angle of the camera needs to be decreased, please make a gesture to the camera: five fingers held together pointing downward; 7. If the adjustment is finished, please make a gesture to the camera: a clenched fist.

The dynamic adjustment step 350 is implemented by the processor 550 in the face recognition device 500. After the display 540 shows that the camera 530 has captured the gesture provided by the user (for example, when the user is tall as shown in fig. 5A, the user makes a fist with the thumb pointing up), the processor 550 performs gesture recognition processing on the captured gesture image and, based on the definition data item corresponding to the provided gesture among the plurality of definition data items (for example, the mounting base 520 performs dynamic adjustment of the height upward), issues an instruction that causes the mounting base 520 to perform the corresponding dynamic adjustment at a constant speed (for example, constant-speed dynamic adjustment of the height upward). Subsequently, the user may also provide another gesture (for example, five fingers held together pointing upward); after the display 540 shows that the camera 530 has captured this gesture, the processor 550 performs gesture recognition processing on the captured gesture image and compares the result with the corresponding definition data item stored in the definition data item storage module 120 (for example, the mounting base 520 performs dynamic adjustment that increases the pitch angle), so as to issue a corresponding instruction that causes the mounting base 520 to change from the previous constant-speed dynamic adjustment (for example, height upward) to another constant-speed dynamic adjustment (for example, increasing the pitch angle). Constant-speed dynamic adjustment means, for example, moving up or down at 1 cm/second, and/or rotating left or right at 1 degree/second, and/or changing the pitch angle at 1 degree/second; different constant speeds may of course be set for different application scenarios.
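The constant-speed adjustment described above amounts to advancing one pose coordinate linearly in time. The sketch below uses the example rates given in the text (1 cm/second and 1 degree/second); the class, the action labels, and the sign convention for left and right rotation are assumptions made for illustration.

```python
from dataclasses import dataclass

HEIGHT_SPEED_CM_S = 1.0   # example rate from the text
ANGLE_SPEED_DEG_S = 1.0   # example rate from the text


@dataclass
class MountingBasePose:
    height_cm: float = 0.0
    yaw_deg: float = 0.0    # positive = turned left (assumed sign convention)
    pitch_deg: float = 0.0


# action label -> (pose attribute, signed rate per second)
RATES = {
    "height_up": ("height_cm", +HEIGHT_SPEED_CM_S),
    "height_down": ("height_cm", -HEIGHT_SPEED_CM_S),
    "turn_left": ("yaw_deg", +ANGLE_SPEED_DEG_S),
    "turn_right": ("yaw_deg", -ANGLE_SPEED_DEG_S),
    "pitch_increase": ("pitch_deg", +ANGLE_SPEED_DEG_S),
    "pitch_decrease": ("pitch_deg", -ANGLE_SPEED_DEG_S),
}


def advance(pose: MountingBasePose, action: str, dt_s: float) -> MountingBasePose:
    """Move the pose for dt_s seconds at constant speed; "stop" leaves it unchanged."""
    if action == "stop":
        return pose
    attribute, rate = RATES[action]
    setattr(pose, attribute, getattr(pose, attribute) + rate * dt_s)
    return pose
```

For example, five seconds of "height_up" starting from 120 cm yields 125 cm, and a later "pitch_increase" changes only the pitch angle, which mirrors the change from one constant-speed adjustment to another.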

The adjustment ending step 360 is implemented by the processor 550 in the face recognition device 500. While the mounting base 520 performs the constant-speed dynamic adjustment (for example, constant-speed dynamic adjustment in which the height increases and/or the pitch angle increases), the camera 530 continues to capture the face biometric image of the user. When the display 540 shows the user that the camera 530 has captured a recognizable face biometric image, the user provides a gesture indicating the end of the dynamic adjustment (a clenched fist). The processor 550 performs gesture recognition processing on the captured gesture image and compares the result with the corresponding definition data item stored in the definition data item storage module 120 (for example, the mounting base 520 ends the dynamic adjustment), so as to issue an instruction that the adjustment is finished. The mounting base 520 then stops the dynamic adjustment, and the captured face biometric image of the user is sent to the processor 550 for subsequent recognition processing.
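The interplay of the dynamic adjustment step and the adjustment ending step can be sketched as a simple loop, restricted here to the height axis: the mounting base keeps moving in the last commanded direction until the stop gesture arrives. The gesture labels, the one-second tick, and the function name are hypothetical.

```python
from typing import Iterable, Optional


def run_height_adjustment(
    gestures: Iterable[Optional[str]],
    start_height_cm: float,
    speed_cm_s: float = 1.0,
    tick_s: float = 1.0,
) -> float:
    """Consume one recognized gesture (or None) per tick and return the final height."""
    height = start_height_cm
    direction = 0  # +1 up, -1 down, 0 idle
    for gesture in gestures:
        if gesture == "fist":  # seventh definition data item: stop
            return height
        if gesture == "fist_thumb_up":
            direction = +1
        elif gesture == "fist_thumb_down":
            direction = -1
        # with no new gesture the previous constant-speed motion continues
        height += direction * speed_cm_s * tick_s
    return height
```

With the gesture sequence thumb up, nothing, nothing, clenched fist, a base starting at 120 cm rises for three ticks and stops at 123 cm.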

In a variant of this embodiment (not shown), the input setting step 310 and the definition data item storage step 320 are not included; in the prompt instruction step 340, a prompt is provided to the user based on a plurality of definition data items that are already stored in the data storage 560 and that respectively define different dynamic adjustment operations of the mounting base 520, informing the user that the mounting base 520 can be dynamically adjusted by providing one or more instructions corresponding to the respective definition data items.

In another variant of this embodiment (not shown), the adjustment ending step 360 is not included; the plurality of definition data items does not include the seventh definition data item; the prompt content displayed by the display 540 does not include: 7. If the adjustment is finished, please make a gesture to the camera: a clenched fist; and, in the dynamic adjustment step 350, once the camera 530 has captured a recognizable face biometric image of the user, the processor 550 automatically recognizes this and issues an instruction to end the dynamic adjustment of the mounting base 520.

In this variation, in the dynamic adjustment step 350, after the display 540 shows that the camera 530 has captured a gesture with a movement speed provided by the user (for example, a fist with the thumb pointing up, reciprocating upward 2 times/second), the processor 550 performs gesture recognition processing on the captured gesture image and compares the result with the corresponding definition data item, among the definition data items stored in the definition data item storage module 120 and including the eighth definition data item (for example, the mounting base 520 performs dynamic adjustment of the height upward at speed B: 2 cm/second), so as to issue a corresponding instruction that causes the mounting base 520 to perform the corresponding dynamic adjustment (for example, height upward) at the corresponding speed (for example, speed B: 2 cm/second). Subsequently, the user provides a gesture with another movement speed (for example, five fingers held together pointing upward, reciprocating 1 time/second); after the display 540 shows that the camera 530 has captured this gesture, the processor 550 performs gesture recognition processing on the captured gesture image and compares the result with the corresponding definition data item, among the definition data items stored in the definition data item storage module 120 and including the eighth definition data item (for example, the mounting base 520 performs dynamic adjustment that increases the pitch angle at speed A: 1 degree/second), so as to issue a corresponding instruction that causes the mounting base 520 to perform the corresponding dynamic adjustment (for example, increasing the pitch angle) at the corresponding speed (for example, speed A: 1 degree/second). This continues until the display 540 shows that the camera 530 has captured a recognizable face biometric image of the user, whereupon the user provides a gesture indicating the end of the dynamic adjustment (a clenched fist), causing the mounting base 520 to stop the dynamic adjustment.

In this variation, the face recognition device 500 further includes a speaker 570, and in the prompt instruction step 340, the prompt content is displayed on the display 540 and/or broadcast through the speaker 570.

In another variation of this embodiment (not shown), in the image capture judgment step 330, before the camera 530 starts capturing the face biometric image of the user, the processor 550 resets the mounting base 520 to an initial height and an initial pitch angle calculated from the statistically determined average height of the people in the region.
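The text does not state how the initial height and pitch angle are derived from the average height. One possible calculation is sketched below under stated assumptions: the camera is aimed at the face centre, taken to lie a fixed offset below the top of the head; the camera height is clamped to the travel range of the mounting base; and any remaining height difference is absorbed by the pitch angle at an assumed standing distance. The function name, parameters, and default values are all hypothetical.

```python
from math import atan2, degrees
from statistics import mean
from typing import Sequence, Tuple


def initial_pose(
    heights_cm: Sequence[float],
    min_height_cm: float,
    max_height_cm: float,
    face_offset_cm: float = 12.0,   # assumed head-top to face-centre distance
    distance_cm: float = 50.0,      # assumed user-to-camera distance
) -> Tuple[float, float]:
    """Return (initial camera height in cm, initial pitch angle in degrees)."""
    face_cm = mean(heights_cm) - face_offset_cm
    # keep the camera inside the travel range of the mounting base
    camera_cm = min(max(face_cm, min_height_cm), max_height_cm)
    # tilt up (positive) or down (negative) to cover any remaining difference
    pitch_deg = degrees(atan2(face_cm - camera_cm, distance_cm))
    return camera_cm, pitch_deg
```

When the face height lies within the travel range, the pitch angle is zero; otherwise the base stops at its limit and the camera tilts toward the face.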

Figs. 5A-5C are schematic diagrams illustrating the states of an application scenario in which adjustment is performed using the adjustment method 300 of the gesture-recognition-based face recognition device 500 shown in fig. 4.

As shown in fig. 5A, a tall user stands in front of the face recognition device 500 (see fig. 10). The display 540 in the face recognition device 500 shows that the camera 530 cannot capture the face biometric features of the user, or can capture only part of them, and the following prompts are displayed through the display 540: 1. If the camera needs to rise, please make a gesture to the camera: a fist with the thumb pointing up; 2. If the camera needs to descend, please make a gesture to the camera: a fist with the thumb pointing down; 3. If the camera needs to turn left, please make a gesture to the camera: a fist with the thumb pointing left; 4. If the camera needs to turn right, please make a gesture to the camera: a fist with the thumb pointing right; 5. If the pitch angle of the camera needs to be increased, please make a gesture to the camera: five fingers held together pointing upward; 6. If the pitch angle of the camera needs to be decreased, please make a gesture to the camera: five fingers held together pointing downward; 7. If the adjustment is finished, please make a gesture to the camera: a clenched fist.

As shown in fig. 5B, based on the prompt content shown on the display 540, the user presents to the camera 530 a fist with the thumb pointing up. The camera 530 captures the gesture, the processor 550 performs gesture recognition processing on the captured gesture image and, based on the definition data item corresponding to the provided gesture among the plurality of definition data items (the mounting base performs dynamic adjustment of the height upward), issues an instruction that causes the mounting base 520 to perform the corresponding upward height adjustment at a constant speed (for example, 1 cm/second). Subsequently, the user may also provide another gesture (for example, five fingers held together pointing upward); after the display 540 shows that the camera 530 has captured this gesture, the processor 550 performs gesture recognition processing on the captured gesture image and compares the result with the corresponding definition data item stored in the definition data item storage module 120 (for example, the mounting base 520 performs dynamic adjustment that increases the pitch angle), so as to issue a corresponding instruction that causes the mounting base 520 to change from the previous constant-speed dynamic adjustment (for example, height upward) to another constant-speed dynamic adjustment (for example, increasing the pitch angle at 1 degree/second). Different constant speeds may of course be set for different application scenarios.

As shown in fig. 5C, while the mounting base 520 performs the constant-speed dynamic adjustment (for example, constant-speed dynamic adjustment in which the height increases and/or the pitch angle increases), the camera 530 continues to capture the face biometric image of the user. After the display 540 shows that the camera 530 has captured a recognizable face biometric image of the user, the user provides a gesture indicating the end of the dynamic adjustment (a clenched fist). The processor 550 performs gesture recognition processing on the captured gesture image and compares the result with the corresponding definition data item stored in the definition data item storage module 120 (for example, the mounting base 520 ends the dynamic adjustment), so as to issue an instruction that the adjustment is finished. The mounting base 520 then stops the dynamic adjustment, and the captured face biometric image of the user is sent to the processor 550 for subsequent recognition processing.

In a variation of this application scenario (not shown), the prompt displayed by the display 540 does not include: 7. if the adjustment is finished, please make a gesture to the camera: a clenched fist; after the display 540 displays that the camera 530 captures the recognizable facial biometric image of the user, the processor 550 can automatically recognize and send an instruction to end the dynamic adjustment of the mounting base 520.

In another variant of this application scenario (not shown), the plurality of definition data items further includes an eighth definition data item indicating that the mounting base 520 uses different adjustment speeds in response to gestures with different movement speeds provided by the user (for example, a fist with the thumb pointing up that reciprocates at different speeds). For example, the movement speed of the gesture provided by the user is divided into range A (reciprocating 1 time/second), range B (reciprocating 2 times/second), and range C (reciprocating 3 or more times/second), and the dynamic adjustment speed of the mounting base 520 correspondingly is speed A (height adjustment of 1 cm/second, or left-right or pitch angle adjustment of 1 degree/second), speed B (height adjustment of 2 cm/second, or left-right or pitch angle adjustment of 2 degrees/second), or speed C (height adjustment of 3 cm/second, or left-right or pitch angle adjustment of 3 degrees/second).
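The correspondence between gesture reciprocation rate and adjustment speed can be sketched as follows. The text gives only the nominal rates (1, 2, and 3 or more times/second), so the boundaries at 1.5 and 2.5 times/second used to classify a measured rate are assumptions, as are the function names and the way the rate is estimated from stroke timestamps.

```python
from typing import Sequence

# speed magnitude per range: cm/second for height, degrees/second for
# left-right rotation or pitch angle, as listed in the text
SPEEDS = {"A": 1.0, "B": 2.0, "C": 3.0}


def reciprocation_rate(stroke_times_s: Sequence[float]) -> float:
    """Estimate reciprocations per second from timestamps of successive strokes."""
    if len(stroke_times_s) < 2:
        raise ValueError("need at least two strokes to estimate a rate")
    elapsed = stroke_times_s[-1] - stroke_times_s[0]
    return (len(stroke_times_s) - 1) / elapsed


def speed_range(rate_per_s: float) -> str:
    """Classify a measured rate into range A, B, or C (boundaries are assumed)."""
    if rate_per_s < 1.5:
        return "A"
    if rate_per_s < 2.5:
        return "B"
    return "C"


def adjustment_speed(rate_per_s: float) -> float:
    """Return the adjustment speed magnitude for a measured reciprocation rate."""
    return SPEEDS[speed_range(rate_per_s)]
```

Strokes observed at 0.0, 0.5, 1.0, and 1.5 seconds give a rate of 2 times/second, which falls in range B and selects speed B.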

In this variation, during the dynamic adjustment of the mounting base 520, after the display 540 shows that the camera 530 has captured a gesture with a movement speed provided by the user (for example, a fist with the thumb pointing up, reciprocating upward 2 times/second), the processor 550 performs gesture recognition processing on the captured gesture image and compares the result with the corresponding definition data item, among the definition data items stored in the definition data item storage module 120 and including the eighth definition data item (for example, the mounting base 520 performs dynamic adjustment of the height upward at speed B: 2 cm/second), so as to issue a corresponding instruction that causes the mounting base 520 to perform the corresponding dynamic adjustment (for example, height upward) at the corresponding speed (for example, speed B: 2 cm/second). Subsequently, the user provides a gesture with another movement speed (for example, five fingers held together pointing upward, reciprocating 1 time/second); after the display 540 shows that the camera 530 has captured this gesture, the processor 550 performs gesture recognition processing on the captured gesture image and compares the result with the corresponding definition data item, among the definition data items stored in the definition data item storage module 120 and including the eighth definition data item (for example, the mounting base 520 performs dynamic adjustment that increases the pitch angle at speed A: 1 degree/second), so as to issue a corresponding instruction that causes the mounting base 520 to perform the corresponding dynamic adjustment (for example, increasing the pitch angle) at the corresponding speed (for example, speed A: 1 degree/second). This continues until the display 540 shows that the camera 530 has captured a recognizable face biometric image of the user, whereupon the user provides a gesture indicating the end of the dynamic adjustment (a clenched fist), causing the mounting base 520 to stop the dynamic adjustment.

In this variation, the face recognition device 500 further includes a speaker 570, and the prompt content is displayed on the display 540 and/or broadcast through the speaker 570.

In another variation of this embodiment (not shown), before the camera 530 starts capturing the face biometric image of the user, the mounting base 520 is reset to an initial height and an initial pitch angle calculated from the statistically determined average height of the people in the region.

Fig. 6 shows a flowchart of the dynamic adjustment step 350 and the gesture recognition processing part in the adjustment ending step 360 of the adjustment method 300 of the face recognition device 500 based on gesture recognition shown in fig. 4.

First, the gesture recognition processing part of the dynamic adjustment step 350 and the adjustment ending step 360 of the adjustment method 300 may adopt any one of the following gesture recognition techniques in the prior art: artificial neural network-based gesture recognition, hidden Markov model-based gesture recognition, geometric feature-based gesture recognition, or other gesture recognition techniques.

As shown in fig. 6, the gesture recognition processing part of the dynamic adjustment step 350 and the adjustment ending step 360 of the adjustment method 300 includes the following steps: a step 3501 of capturing the gesture provided by the user; a step 3502 of preprocessing the captured gesture image; a step 3503 of extracting features from the preprocessed gesture image data; and a step 3504 of recognizing the extracted features.

In the step 3501, the gesture provided by the user is captured by the camera 530.

In the step 3502, the processor 550 preprocesses the captured gesture image provided by the user, the preprocessing including, for example, gesture segmentation, gesture tracking, error compensation, and filtering.

In the step 3503, feature extraction is performed on the preprocessed result data by the processor 550 to obtain extracted features.

At the step 3504, the extracted features are classified, modeled, trained and matched by the processor 550 to obtain a gesture recognition result, so that, in the dynamic adjustment step, the processor 550 makes a corresponding dynamic adjustment to the mounting base 520 based on a definition data item corresponding to the provided gesture among the plurality of definition data items according to the obtained gesture recognition result.
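As a toy illustration of steps 3502 to 3504, the sketch below segments the hand by thresholding, extracts one simple geometric feature, and matches it against stored templates. It is not the recognition method of the disclosure: the feature (offset of the hand centroid from the centre of its bounding box), the template values, the gesture labels, and the synthetic 6x6 test image are all assumptions chosen so that the example is self-contained.

```python
from math import hypot
from typing import List, Tuple

Image = List[List[int]]  # grayscale rows, values 0..255


def preprocess(image: Image, threshold: int = 128) -> Image:
    """Step 3502 (sketch): segment the hand by simple thresholding."""
    return [[1 if px >= threshold else 0 for px in row] for row in image]


def extract_features(mask: Image) -> Tuple[float, float]:
    """Step 3503 (sketch): centroid offset from the bounding-box centre (rows, cols)."""
    points = [(r, c) for r, row in enumerate(mask) for c, v in enumerate(row) if v]
    if not points:
        raise ValueError("no hand pixels found")
    rows = [r for r, _ in points]
    cols = [c for _, c in points]
    centroid_r = sum(rows) / len(points)
    centroid_c = sum(cols) / len(points)
    box_r = (min(rows) + max(rows)) / 2
    box_c = (min(cols) + max(cols)) / 2
    return centroid_r - box_r, centroid_c - box_c


# assumed template features: a protruding thumb shifts the centroid away from it
TEMPLATES = {
    "fist_thumb_up": (0.75, 0.0),
    "fist_thumb_down": (-0.75, 0.0),
    "fist_thumb_left": (0.0, 0.75),
    "fist_thumb_right": (0.0, -0.75),
    "fist": (0.0, 0.0),
}


def recognize(features: Tuple[float, float]) -> str:
    """Step 3504 (sketch): nearest-template matching in feature space."""
    return min(
        TEMPLATES,
        key=lambda name: hypot(
            features[0] - TEMPLATES[name][0], features[1] - TEMPLATES[name][1]
        ),
    )


def recognize_gesture(image: Image) -> str:
    return recognize(extract_features(preprocess(image)))


def make_thumb_up() -> Image:
    """Synthetic image: a 3x3 fist with a 3-pixel thumb above its middle column."""
    img = [[0] * 6 for _ in range(6)]
    for r in range(3):
        img[r][2] = 255
    for r in range(3, 6):
        for c in range(1, 4):
            img[r][c] = 255
    return img
```

Flipping the synthetic image vertically yields the thumb-down template, and transposing it yields the thumb-left template, which is a convenient check that the feature responds to the thumb direction. The recognized label would then select the matching definition data item in the dynamic adjustment step.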

Fig. 7 shows a flowchart of an adjusting method 400 of the face recognition apparatus 600 based on speech recognition implemented by using the adjusting device shown in fig. 3 according to another embodiment of the present disclosure.

As shown in fig. 7, there is provided an adjustment method 400 for a face recognition apparatus based on speech recognition, the face recognition apparatus 600 (see fig. 11) includes an input device 610 for inputting required data; a mounting base 620 capable of dynamic adjustment including vertical height adjustment and/or lateral rotation adjustment and/or pitch angle adjustment; the camera 630 is arranged on the mounting base 620 and used for shooting a human face biological characteristic image of the user; a display 640 for displaying the image captured by the camera 630; a sound pickup 670 for picking up a voice provided by a user; a processor 650 for performing required processing on image data from the camera 630 and data from the input device 610; and a data storage device 660 for storing processing programs and data used by the processor 650, wherein the adjustment method 400 comprises an input setting step 410, a definition data item storage step 420, an image pickup judgment step 430, a presentation instruction step 440, a dynamic adjustment step 450, and an adjustment ending step 460.

The input setting step 410 is implemented by the input device 610 of the face recognition device 600 to input a plurality of definition data items preset to correspond to voices with different contents provided by a user, each definition data item defining a corresponding dynamic adjustment operation of the mounting base 620, wherein the plurality of definition data items include: a first definition data item indicating that the mounting base 620 performs dynamic adjustment of the height upward in response to the user voice content "up"; a second definition data item indicating that the mounting base 620 performs dynamic adjustment of the height downward in response to the user voice content "down"; a third definition data item indicating that the mounting base 620 performs dynamic adjustment of turning left in response to the user voice content "left"; a fourth definition data item indicating that the mounting base 620 performs dynamic adjustment of turning right in response to the user voice content "right"; a fifth definition data item indicating that the mounting base 620 performs dynamic adjustment that increases the pitch angle in response to the user voice content "pitch large"; a sixth definition data item indicating that the mounting base 620 performs dynamic adjustment that decreases the pitch angle in response to the user voice content "pitch small"; and a seventh definition data item indicating that the mounting base 620 stops the dynamic adjustment in response to the user voice content "stop".
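For the voice-based embodiment, the definition data items could likewise be held as a phrase table. The sketch below assumes that a speech recognizer (not specified in the disclosure) returns a text transcript; the action labels and the helper name are hypothetical, the phrases are the English renderings used in this description, and both "stop" and "end" are accepted for the seventh item because the description uses both words.

```python
from typing import Optional

# recognized phrase -> adjustment of the mounting base (labels are illustrative)
VOICE_DEFINITION_ITEMS = {
    "up": "height_up",                 # first definition data item
    "down": "height_down",             # second
    "left": "turn_left",               # third
    "right": "turn_right",             # fourth
    "pitch large": "pitch_increase",   # fifth
    "pitch small": "pitch_decrease",   # sixth
    "stop": "stop",                    # seventh
    "end": "stop",                     # wording used in the displayed prompt
}


def action_for(transcript: str) -> Optional[str]:
    """Normalize case and whitespace, then look up the adjustment; None if unknown."""
    key = " ".join(transcript.lower().split())
    return VOICE_DEFINITION_ITEMS.get(key)
```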

The definition data item storage step 420 is implemented by the data storage 660 in the face recognition apparatus 600 to store the plurality of definition data items.

The image capture judgment step 430 is implemented by the camera 630 and the processor 650 in the face recognition device 600. After the user is positioned in front of the face recognition device 600, the camera 630 starts to capture a face biometric image of the user, and the processor 650 determines whether the camera 630 has captured a recognizable face biometric image of the user. If the camera 630 has captured a recognizable face biometric image of the user, the processor 650 performs subsequent recognition processing on it; if the camera 630 has not captured a face biometric image of the user, or has captured only part of one (for example, because the user is short, as shown in fig. 8A), the process proceeds to the prompt instruction step 440.

The prompt instruction step 440 is implemented by the display 640 and the processor 650 in the face recognition device 600 to display prompt content through the display 640, prompting the user that the mounting base 620 can be dynamically adjusted by speaking, to the sound pickup 670, voices with the different contents corresponding to the respective definition data items. The prompt content includes: 1. If the camera needs to rise, please say "up"; 2. If the camera needs to descend, please say "down"; 3. If the camera needs to turn left, please say "left"; 4. If the camera needs to turn right, please say "right"; 5. If the camera needs to tilt upward, please say "pitch large"; 6. If the camera needs to tilt downward, please say "pitch small"; 7. If the adjustment is finished, please say "end".

The dynamic adjustment step 450 is implemented by the processor 650 in the face recognition device 600. After the sound pickup 670 picks up the voice provided by the user (for example, when the user is short, the user says "down" and/or "pitch small"), the processor 650 performs voice recognition processing on the picked-up voice and compares the result with the definition data item corresponding to the provided voice content among the plurality of definition data items (for example, the mounting base 620 performs dynamic adjustment of the height downward and/or decreases the pitch angle), so as to issue an adjustment instruction that causes the mounting base 620 to perform the corresponding dynamic adjustment at a constant speed (for example, constant-speed dynamic adjustment in which the height decreases and/or the pitch angle decreases), for example, moving up or down at 1 cm/second, and/or rotating left or right at 1 degree/second, and/or changing the pitch angle at 1 degree/second. Different constant speeds may of course be set for different application scenarios.

The adjustment ending step 460 is implemented by the processor 650 in the face recognition device 600. While the mounting base 620 performs the constant-speed dynamic adjustment (for example, constant-speed dynamic adjustment in which the height decreases and/or the pitch angle decreases), the camera 630 continues to capture the face biometric image of the user. When the display 640 shows the user that the camera 630 has captured a recognizable face biometric image, the user provides a voice indicating the end of the dynamic adjustment ("end"). The processor 650 performs voice recognition processing on the picked-up voice data and compares the result with the corresponding definition data item stored in the definition data item storage module 220 (for example, the mounting base 620 ends the dynamic adjustment), so as to issue an instruction that the adjustment is finished. The mounting base 620 then ends the dynamic adjustment, and the captured face biometric image of the user is sent to the processor 650 for subsequent recognition processing.

In a variant of this embodiment (not shown), the input setting step 410 and the definition data item storage step 420 are not included; in the prompt instruction step 440, a prompt is provided to the user based on a plurality of definition data items that are already stored in the data storage 660 and that respectively define different dynamic adjustment operations of the mounting base 620, informing the user that the mounting base 620 can be dynamically adjusted by providing one or more instructions corresponding to the respective definition data items.

In another variant of this embodiment (not shown), the adjustment ending step 460 is not included; the plurality of definition data items does not include the seventh definition data item; the prompt content displayed by the display 640 does not include: 7. If the adjustment is finished, please say "end"; and, in the dynamic adjustment step 450, once the camera 630 has captured a recognizable face biometric image of the user, the processor 650 automatically recognizes this and issues an instruction to end the dynamic adjustment of the mounting base 620.

In this variation, in the dynamic adjustment step 450, after the sound pickup 670 picks up a voice with speed-related content provided by the user (for example, the user says "slow, down"), the processor 650 performs voice recognition processing on the picked-up voice and compares the result with the corresponding definition data item, among the definition data items stored in the definition data item storage module 220 and including the eighth definition data item (for example, the mounting base 620 performs dynamic adjustment of the height downward at speed A: 1 cm/second), so as to issue a corresponding instruction that causes the mounting base 620 to perform the corresponding dynamic adjustment at the corresponding speed (for example, height downward at speed A: 1 cm/second). Subsequently, when the user provides a voice with other content (for example, the user says "medium, pitch small"), the sound pickup 670 picks up the voice, and the processor 650 performs voice recognition processing on the picked-up voice data and compares the result with the corresponding definition data item, among the definition data items stored in the definition data item storage module 220 and including the eighth definition data item (for example, the mounting base 620 performs dynamic adjustment that decreases the pitch angle at speed B: 2 degrees/second), so as to issue a corresponding instruction that causes the mounting base 620 to change from the previous dynamic adjustment (for example, height downward at speed A: 1 cm/second) to another dynamic adjustment at the corresponding speed (for example, decreasing the pitch angle at speed B: 2 degrees/second).

In this variation, the face recognition device 600 further includes a speaker 680; the prompt instruction module 240 is implemented by the processor 650 together with the display 640 and/or the speaker 680 in the face recognition device 600, and in the prompt instruction step 440 the prompt content is displayed through the display 640 and/or broadcast through the speaker 680.

In another variation of this embodiment (not shown), in the image capture judgment step 430, before the camera 630 starts capturing the face biometric image of the user, the processor 650 resets the mounting base 620 to an initial height and an initial pitch angle calculated from the statistically determined average height of the people in the region.

Figs. 8A-8C are schematic diagrams illustrating the states of an application scenario in which adjustment is performed using the adjustment method 400 of the speech-recognition-based face recognition device 600 shown in fig. 7.

As shown in fig. 8A, a short user stands in front of the face recognition device 600 (see fig. 11). The display 640 in the face recognition device 600 shows that the camera 630 cannot capture the face biometric features of the user, or can capture only part of them, and the following prompts are displayed through the display 640: 1. If the camera needs to rise, please say "up"; 2. If the camera needs to descend, please say "down"; 3. If the camera needs to turn left, please say "left"; 4. If the camera needs to turn right, please say "right"; 5. If the camera needs to tilt upward, please say "pitch large"; 6. If the camera needs to tilt downward, please say "pitch small"; 7. If the adjustment is finished, please say "end".

As shown in fig. 8B, based on the prompt content shown on the display 640, the user says "down". After the sound pickup 670 picks up the voice provided by the user, the processor 650 performs voice recognition processing on the picked-up voice data and compares the result with the definition data item corresponding to the provided voice content among the plurality of definition data items stored in the definition data item storage module 220 (for example, the mounting base 620 performs dynamic adjustment of the height downward), so as to issue a corresponding adjustment instruction that causes the mounting base 620 to perform the corresponding dynamic adjustment at a constant speed (for example, constant-speed dynamic adjustment of the height downward). Constant-speed dynamic adjustment means, for example, moving up or down at 1 cm/second, and/or rotating left or right at 1 degree/second, and/or changing the pitch angle at 1 degree/second. Different constant speeds may of course be set for different application scenarios.

As shown in fig. 8C, while the mounting base 620 performs the constant-speed dynamic adjustment (for example, constant-speed dynamic adjustment of the height downward), the camera 630 continues to capture the face biometric image of the user. After the display 640 shows that the camera 630 has captured a recognizable face biometric image of the user, the user provides a voice indicating the end of the dynamic adjustment ("end"). The processor 650 performs voice recognition processing on the picked-up voice data and compares the result with the corresponding definition data item stored in the definition data item storage module 220 (for example, the mounting base 620 ends the dynamic adjustment), so as to issue an instruction that the adjustment is finished, and the captured face biometric image of the user is sent to the processor 650 for subsequent recognition processing.

In a variation of this application scenario (not shown), the prompt displayed by the display 640 does not include item 7 (if the adjustment is finished, please say "finished"). Instead, after the display 640 shows that the camera 630 has captured a recognizable facial biometric image of the user, the processor 650 may automatically recognize this and issue an instruction to end the dynamic adjustment of the mounting base 620.
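The automatic ending described in this variation amounts to a loop that keeps adjusting at a constant speed and stops as soon as the capture check succeeds. The sketch below simulates this for the height axis only. The function name `adjust_until_captured`, the callback `is_recognizable`, the tick length, and the timeout are all assumptions introduced for illustration; the disclosure does not specify them.

```python
def adjust_until_captured(start_height_cm, direction, is_recognizable,
                          rate_cm_per_s=1.0, tick_s=0.5, max_seconds=60.0):
    """Simulate a constant-speed height adjustment with automatic ending.

    direction: +1 for upward, -1 for downward.
    is_recognizable: hypothetical callback standing in for the check that the
        camera has captured a recognizable facial biometric image.
    Returns (final_height_cm, captured).
    """
    steps = int(max_seconds / tick_s)
    height = start_height_cm
    for step in range(steps + 1):
        # Height after `step` ticks of constant-speed movement.
        height = start_height_cm + direction * rate_cm_per_s * tick_s * step
        if is_recognizable(height):
            return height, True   # end the dynamic adjustment automatically
    return height, False          # assumed safety timeout, not from the source


# Usage: the face becomes recognizable once the camera is at or below 160 cm.
final_height, captured = adjust_until_captured(170.0, -1, lambda h: h <= 160.0)
```

In the non-automatic case described for fig. 8C, the same loop would instead be ended by the recognized "finished" voice.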

In another variation of this application scenario (not shown), the plurality of definition data items further includes an eighth definition data item representing different adjustment speeds of the mounting base 620 corresponding to voice content for different movement speeds (e.g., fast/medium/slow) provided by the user. For example, if the movement speeds expressed in the voice content provided by the user are set to "slow" (reciprocating 1 time/second), "medium" (reciprocating 2 times/second), and "fast" (reciprocating 3 or more times/second), the dynamic adjustment speeds of the mounting base 620 correspond to A speed (height or side-to-side adjustment of 1 cm/second, or pitch angle adjustment of 1 degree/second), B speed (height or side-to-side adjustment of 2 cm/second, or pitch angle adjustment of 2 degrees/second), and C speed (height or side-to-side adjustment of 3 cm/second, or pitch angle adjustment of 3 degrees/second), respectively.

In this variation, after the sound pickup 670 picks up a voice with given content provided by the user (e.g., the user utters "slow down"), the processor 650 performs voice recognition processing on the picked-up voice data and compares the result with the corresponding definition data items, including the eighth definition data item, stored in the definition data item storage module 220 (e.g., the mounting base 620 performs a downward dynamic adjustment of its height at A speed: 1 cm/second), thereby issuing a corresponding instruction that causes the mounting base 620 to perform the corresponding dynamic adjustment at the corresponding speed (e.g., a downward height adjustment at A speed: 1 cm/second). Subsequently, the user may provide a voice with other content (for example, the user utters "pitch is large"). After the sound pickup 670 picks up this voice, the processor 650 performs voice recognition processing on the picked-up voice data and compares the result with the corresponding definition data items, including the eighth definition data item, stored in the definition data item storage module 220 (for example, the mounting base 620 performs a dynamic adjustment that increases the pitch angle at B speed: 2 degrees/second), so as to issue a corresponding instruction that changes the mounting base 620 from the previous dynamic adjustment (for example, the downward height adjustment at A speed: 1 cm/second) to the other dynamic adjustment at its corresponding speed (for example, increasing the pitch angle at B speed: 2 degrees/second).
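One way to picture how a speed word combines with a direction word is sketched below. It is a sketch under stated assumptions, not the method of the disclosure: it assumes the recognized text consists of an optional speed word followed by an action phrase, and the names `SPEED_TIERS`, `ACTIONS`, and `parse_voice` are hypothetical. The tier values follow the A/B/C speeds given above.

```python
# A, B and C speeds from the text, in cm/second or degrees/second.
SPEED_TIERS = {"slow": 1.0, "medium": 2.0, "fast": 3.0}

# Hypothetical action table: phrase -> (axis, direction sign).
ACTIONS = {
    "up": ("height", +1),
    "down": ("height", -1),
    "left": ("pan", -1),     # sign convention is an assumption
    "right": ("pan", +1),    # sign convention is an assumption
    "pitch large": ("pitch", +1),
    "pitch small": ("pitch", -1),
}


def parse_voice(text, default_speed=1.0):
    """Split recognized text into an optional speed word and an action phrase.

    Returns (axis, signed_rate), or None if no action phrase matches.
    """
    words = text.strip().lower().split()
    speed = default_speed
    if words and words[0] in SPEED_TIERS:
        speed = SPEED_TIERS[words[0]]
        words = words[1:]
    action = ACTIONS.get(" ".join(words))
    if action is None:
        return None
    axis, sign = action
    return axis, sign * speed
```

With this table, `parse_voice("slow down")` yields a downward height adjustment at A speed, and a later `parse_voice("medium pitch large")` would replace it with a pitch increase at B speed.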

In this variation, the face recognition device 600 further includes a speaker 680, and the prompt content is displayed by the display 640 and/or broadcast by the speaker 680.

In another variation of this application scenario (not shown), before the camera 630 starts capturing the facial biometric image of the user, the mounting base 620 is reset to an initial height and an initial pitch angle calculated from the statistical average height of the people in the area.

Fig. 9 shows a flowchart of the voice recognition processing part in the dynamic adjustment step 450 and the adjustment ending step 460 of the adjustment method 400 of the face recognition device 600 based on voice recognition shown in fig. 7.

First, the voice recognition processing part in the dynamic adjustment step 450 and the adjustment ending step 460 of the adjustment method 400 can use voice recognition technology from the prior art, such as voice recognition based on a GMM-HMM (Gaussian mixture model and hidden Markov model) acoustic model, in which the GMM models the distribution of the acoustic features of the voice (the acoustic model) and the HMM models the time sequence of the voice signal (referred to here as the language model).

As shown in fig. 9, the voice recognition processing part in the dynamic adjustment step 450 and the adjustment ending step 460 of the adjustment method 400 comprises the following steps: step 4501 of picking up a voice provided by the user; step 4502 of performing signal processing and feature extraction on the picked-up voice; step 4503 of calculating an acoustic model score of the extracted features; step 4504 of calculating a language model score of the extracted features; and step 4505 of performing a decoding search based on the acoustic model score and the language model score to obtain a voice recognition result.

In step 4501, the voice provided by the user is picked up by the sound pickup 670.

In step 4502, the processor 650 performs signal processing and feature extraction on the picked-up voice.

In step 4503, the processor 650 calculates an acoustic model score for the extracted features based on the constructed acoustic model of the distribution of the acoustic features of the voice.

In step 4504, the processor 650 calculates a language model score for the extracted features based on the constructed language model of the time sequence of the voice signal.

In step 4505, the processor 650 performs a decoding search based on the acoustic model score and the language model score to obtain a voice recognition result. In the dynamic adjustment step, the processor 650 then uses the obtained voice recognition result to dynamically adjust the mounting base 620 according to the definition data item, among the plurality of definition data items, that corresponds to the provided voice.
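The way steps 4503 to 4505 combine two scores can be illustrated with a toy decoder over a three-word command vocabulary. This is a drastically simplified sketch, not a GMM-HMM recognizer: each command is modeled by a single one-dimensional Gaussian instead of a mixture, the language model score is replaced by a fixed prior per command, and the search is an exhaustive comparison rather than a search over state sequences. All names and numbers (`ACOUSTIC`, `LANGUAGE`, `decode`, the means, variances, and priors) are hypothetical.

```python
import math


def gaussian_log_likelihood(x, mean, var):
    """Log density of a 1-D Gaussian with the given mean and variance."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)


# Hypothetical per-command acoustic models over a 1-D feature: (mean, variance).
ACOUSTIC = {"up": (1.0, 0.5), "down": (-1.0, 0.5), "end": (0.0, 0.5)}

# Hypothetical prior log-probabilities standing in for the language model score.
LANGUAGE = {"up": math.log(0.4), "down": math.log(0.4), "end": math.log(0.2)}


def decode(features, lm_weight=1.0):
    """Return the command with the best combined acoustic and language score."""
    best_word, best_score = None, -math.inf
    for word, (mean, var) in ACOUSTIC.items():
        # Step 4503 analogue: acoustic score of the extracted features.
        acoustic = sum(gaussian_log_likelihood(x, mean, var) for x in features)
        # Step 4504 analogue: language score of the candidate.
        language = LANGUAGE[word]
        # Step 4505 analogue: keep the best-scoring candidate.
        score = acoustic + lm_weight * language
        if score > best_score:
            best_word, best_score = word, score
    return best_word
```

The returned word would then be matched against the definition data items, as described for the dynamic adjustment step.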

Fig. 10 is a block diagram illustrating a structure of a face recognition apparatus 500 including the gesture recognition based adjustment device 100 shown in fig. 2 according to an embodiment of the present disclosure.

As shown in fig. 10, there is provided a face recognition device 500 including: an input device 510 for inputting required data; a dynamically adjustable mounting base 520; a camera 530 disposed on the mounting base 520 for capturing a facial biometric image of a user; a display 540 for displaying the image captured by the camera 530; a processor 550 for performing required processing on data from the camera 530 and the input device 510; and a data storage 560 for storing processing programs and data used by the processor 550, wherein the face recognition device 500 further comprises the above-mentioned adjusting device 100.

Here, the input setting module 110 in the adjusting device 100 is implemented by the input device 510 in the face recognition device 500; the definition data item storage module 120 in the adjusting device 100 is implemented by the data storage 560 in the face recognition device 500; the camera shooting judging module 130 in the adjusting device 100 is implemented by the camera 530 and the processor 550 in the face recognition device 500; the prompt description module 140 in the adjusting device 100 is implemented by the display 540 and the processor 550 in the face recognition device 500; and the dynamic adjustment module 150 and the adjustment ending module 160 in the adjusting device 100 are implemented by the processor 550 in the face recognition device 500.

In a variation (not shown) of this embodiment, the input setting module 110 and the definition data item storage module 120 are not included. The prompt description module 140 provides a prompt to the user based on a plurality of preset definition data items that are stored in the data storage 560 and that respectively define different dynamic adjustment actions to be performed by the mounting base 520, informing the user that the mounting base 520 can be made to perform a corresponding dynamic adjustment by providing one or more instructions corresponding to the respective definition data items.

In another variation of this embodiment (not shown), the end of adjustment module 160 is not included; after the camera 530 captures the recognizable facial biometric image of the user, the dynamic adjustment module 150 can automatically recognize and issue an instruction to end the dynamic adjustment of the mounting base 520.

In another variation (not shown) of this embodiment, the face recognition device 500 further comprises a speaker 570 connected to the processor 550, wherein the prompt description module 140 in the adjusting device 100 is implemented by the processor 550 together with the display 540 and/or the speaker 570 in the face recognition device 500.

Fig. 11 is a block diagram illustrating a structure of a face recognition apparatus 600 including the speech recognition-based adjustment device 200 shown in fig. 3 according to another embodiment of the present disclosure.

As shown in fig. 11, there is provided a face recognition device 600 including: an input device 610 for inputting required data; a dynamically adjustable mounting base 620; a camera 630 disposed on the mounting base 620 for capturing a facial biometric image of a user; a display 640 for displaying the image captured by the camera 630; a sound pickup 670 for picking up a voice provided by the user; a processor 650 for performing required processing on data from the camera 630, the input device 610, and the sound pickup 670; and a data storage 660 for storing processing programs and data used by the processor 650, wherein the face recognition device 600 further comprises the above-mentioned adjusting device 200.

Here, the input setting module 210 in the adjusting device 200 is implemented by the input device 610 in the face recognition device 600; the definition data item storage module 220 in the adjusting device 200 is implemented by the data storage 660 in the face recognition device 600; the camera shooting judging module 230 in the adjusting device 200 is implemented by the camera 630 and the processor 650 in the face recognition device 600; the prompt description module 240 in the adjusting device 200 is implemented by the processor 650 and the display 640 in the face recognition device 600; and the dynamic adjustment module 250 and the adjustment ending module 260 in the adjusting device 200 are implemented by the processor 650 in the face recognition device 600.

In a variation (not shown) of this embodiment, the input setting module 210 and the definition data item storage module 220 are not included. The prompt description module 240 provides a prompt to the user based on a plurality of preset definition data items that are stored in the data storage 660 and that respectively define different dynamic adjustment actions to be performed by the mounting base 620, informing the user that the mounting base 620 can be made to perform a corresponding dynamic adjustment by providing one or more instructions corresponding to the respective definition data items.

In another variation of this embodiment (not shown), the end of adjustment module 260 is not included; after the camera 630 captures the recognizable facial biometric image of the user, the dynamic adjustment module 250 can automatically recognize and issue an instruction, so that the dynamic adjustment of the mounting base 620 is finished.

In another variation (not shown) of this embodiment, the face recognition device 600 further comprises a speaker 680 connected to the processor 650, wherein the prompt description module 240 in the adjusting device 200 is implemented by the processor 650 together with the display 640 and/or the speaker 680 in the face recognition device 600.

It should be noted that, the number of the plurality of definition data items in the foregoing embodiments may be set to different numbers according to different application requirements; the content defined by each definition data item can also be set into different content according to different application requirements; the gestures and/or voices corresponding to the definition data items can be set to different contents according to different application requirements.

It should be noted that not all steps and units in the above flows and device structure block diagrams are necessary, and some steps or units may be omitted according to actual needs. The execution order of the steps is not fixed, and can be determined as required. The apparatus structures described in the above embodiments may be physical structures or logical structures, that is, some units may be implemented by the same physical entity, or some units may be implemented by a plurality of physical entities, or some units may be implemented by some components in a plurality of independent devices.

In the above embodiments, the hardware units or modules may be implemented mechanically or electrically. For example, a hardware unit, module, or processor may comprise permanently dedicated circuitry or logic (such as a dedicated processor, FPGA, or ASIC) to perform the corresponding operations. The hardware units or processors may also include programmable logic or circuitry (e.g., a general-purpose processor or other programmable processor) that may be temporarily configured by software to perform the corresponding operations. The embodiments of the apparatus (including a processor) may be implemented by software, by hardware, or by a combination of hardware and software. Taking a software implementation as an example, the apparatus is formed by the processor of the device in which it is located reading the corresponding computer program instructions from non-volatile storage into memory and running them.

The detailed description set forth above in connection with the appended drawings describes exemplary embodiments but does not represent all embodiments that may be practiced or fall within the scope of the claims. The term "exemplary" used throughout this specification means "serving as an example, instance, or illustration," and does not mean "preferred" or "advantageous" over other embodiments. The detailed description includes specific details for the purpose of providing an understanding of the described technology. However, the techniques may be practiced without these specific details. In some instances, well-known structures and devices are shown in block diagram form in order to avoid obscuring the concepts of the described embodiments.

The previous description of the disclosure is provided to enable any person skilled in the art to make or use the disclosure. Various modifications to the disclosure will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other variations without departing from the scope of the disclosure. Thus, the disclosure is not intended to be limited to the examples and designs described herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims

1. An adjustment method for a face recognition device, the face recognition device comprising: an input device for inputting required data; a dynamically adjustable mounting base; a camera arranged on the mounting base for capturing a facial biometric image of a user; a display for displaying images captured by the camera and/or a speaker for outputting sound; a processor; and a data storage, the adjustment method comprising the following steps:

a shooting judgment step: after the user approaches the face recognition device, starting the camera to capture a facial biometric image of the user, judging whether the camera has captured a recognizable facial biometric image of the user, and proceeding to the subsequent prompting and explaining step if the camera has not captured a facial biometric image of the user or has captured only part of one;

a prompting and explaining step: based on a plurality of preset definition data items that are stored in the data storage and that respectively define different dynamic adjustment actions to be performed by the mounting base, the processor provides a prompt to the user through the display and/or the speaker, informing the user that the mounting base can be made to perform a corresponding dynamic adjustment by providing one or more instructions corresponding to the respective definition data items; and

a dynamic adjustment step: after receiving one or more instructions provided by the user, the processor issues, based on the one or more definition data items among the plurality of definition data items that correspond to the provided instructions, an instruction causing the mounting base to perform the dynamic adjustment corresponding to those definition data items, so that the viewing angle range of the camera mounted on the mounting base is adjusted to capture a recognizable facial biometric image of the user.

2. The adjustment method according to claim 1, further comprising a preset storing step before the shooting judgment step: inputting the preset plurality of definition data items through the input device, and storing the definition data items in the data storage.

3. The adjustment method according to claim 1, further comprising an adjustment ending step after the dynamic adjustment step: if the camera has captured a recognizable facial biometric image of the user, receiving another instruction provided by the user indicating that the dynamic adjustment is finished, so that the mounting base stops the dynamic adjustment.

4. The adjustment method according to any one of claims 1-3, wherein the plurality of definition data items comprises: a first definition data item indicating that the mounting base performs an upward dynamic adjustment of its height in response to a first user instruction; a second definition data item indicating that the mounting base performs a downward dynamic adjustment of its height in response to a second user instruction; a third definition data item indicating that the mounting base performs a dynamic adjustment of turning left in response to a third user instruction; a fourth definition data item indicating that the mounting base performs a dynamic adjustment of turning right in response to a fourth user instruction; a fifth definition data item indicating that the mounting base performs a dynamic adjustment in which the pitch angle becomes larger in response to a fifth user instruction; a sixth definition data item indicating that the mounting base performs a dynamic adjustment in which the pitch angle becomes smaller in response to a sixth user instruction; and a seventh definition data item indicating that the mounting base stops the dynamic adjustment in response to a seventh user instruction.

5. The adjustment method according to claim 4, wherein the respective user instructions corresponding to the respective definition data items include: different gestures provided by the user corresponding to each defined data item.

6. The adjustment method according to claim 4, wherein the face recognition device further comprises a sound pickup connected to the processor, and wherein the user instructions corresponding to the defined data items comprise: the speech of different contents provided by the user corresponding to each definition data item.

7. The adjustment method according to claim 5, wherein the different gestures provided by the user corresponding to the respective definition data items comprise: the gesture corresponding to the first definition data item is a fist with the thumb pointing up, indicating that the mounting base is required to perform an upward dynamic adjustment of its height; the gesture corresponding to the second definition data item is a fist with the thumb pointing down, indicating that the mounting base is required to perform a downward dynamic adjustment of its height; the gesture corresponding to the third definition data item is a fist with the thumb pointing left, indicating that the mounting base is required to perform a dynamic adjustment of turning left; the gesture corresponding to the fourth definition data item is a fist with the thumb pointing right, indicating that the mounting base is required to perform a dynamic adjustment of turning right; the gesture corresponding to the fifth definition data item is five fingers held together pointing up, indicating that the mounting base is required to perform a dynamic adjustment in which the pitch angle becomes larger; the gesture corresponding to the sixth definition data item is five fingers held together pointing down, indicating that the mounting base is required to perform a dynamic adjustment in which the pitch angle becomes smaller; and the gesture corresponding to the seventh definition data item is a clenched fist, indicating that the mounting base is required to stop the dynamic adjustment.

8. The adjustment method according to claim 6, wherein the voices of different contents provided by the user corresponding to the respective definition data items comprise: the voice corresponding to the first definition data item is "up", indicating that the mounting base is required to perform an upward dynamic adjustment of its height; the voice corresponding to the second definition data item is "down", indicating that the mounting base is required to perform a downward dynamic adjustment of its height; the voice corresponding to the third definition data item is "left", indicating that the mounting base is required to perform a dynamic adjustment of turning left; the voice corresponding to the fourth definition data item is "right", indicating that the mounting base is required to perform a dynamic adjustment of turning right; the voice corresponding to the fifth definition data item is "pitch large", indicating that the mounting base is required to perform a dynamic adjustment in which the pitch angle becomes larger; the voice corresponding to the sixth definition data item is "pitch small", indicating that the mounting base is required to perform a dynamic adjustment in which the pitch angle becomes smaller; and the voice corresponding to the seventh definition data item is "end", indicating that the mounting base is required to stop the dynamic adjustment.

9. The adjustment method according to claim 1, wherein the shooting judgment step further comprises a reset step: before the camera is started to capture the facial biometric image of the user, the mounting base is reset to an initial height and an initial pitch angle.

10. The adjustment method according to claim 5, wherein in the prompting and explaining step, the user is prompted through the display and/or the speaker to cause the mounting base to perform a corresponding dynamic adjustment by providing the gestures corresponding to the respective definition data items.

11. The adjustment method according to claim 6, wherein in the prompting and explaining step, the user is prompted through the display and/or the speaker to cause the mounting base to perform a corresponding dynamic adjustment by providing a voice whose content corresponds to the respective definition data items.

12. The adjustment method according to claim 5, wherein in the dynamic adjustment step, after the camera captures one or more gestures provided by the user, the processor performs gesture recognition processing on the captured gesture images and compares the result with the one or more definition data items stored in the data storage, so as to issue a corresponding instruction that causes the mounting base to perform a corresponding dynamic adjustment.

13. The adjusting method according to claim 6, wherein in the dynamic adjusting step, after the sound pickup picks up the voice of the one or more contents provided by the user, the processor performs voice recognition processing on the picked-up voice data and compares the voice data with the one or more defined data items stored in the data storage, so as to issue a corresponding instruction, so that the mounting base performs a corresponding dynamic adjustment.

14. The adjustment method according to claim 1, wherein in the dynamic adjustment step, the dynamic adjustment by the mounting base includes up-down height adjustment and/or pan adjustment and/or pitch angle size adjustment by the mounting base.

15. The adjustment method according to claim 14, wherein in the dynamic adjustment step, the dynamic adjustment of the mounting base is performed at a predetermined constant speed.

16. The adjustment method according to claim 4, wherein the plurality of definition data items further comprises: an eighth definition data item indicating an adjustment speed of the mounting base corresponding to an eighth user instruction.

17. The adjustment method according to claim 16, wherein an eighth user instruction corresponding to the eighth defining data item comprises: gestures of different motion speeds.

18. The adjustment method according to claim 16, wherein the face recognition device further comprises a sound pickup connected to the processor, and wherein the eighth user instruction corresponding to the eighth definition data item comprises: voice containing content that represents different movement speeds.

19. The adjusting method according to claim 17, wherein in the dynamic adjusting step, after receiving gesture images with different movement speeds provided by a user and captured by the camera, the processor performs gesture recognition processing on the captured gesture images with different movement speeds, and compares the gesture images with the eighth defined data item stored in the data storage, so as to issue corresponding instructions, so that the mounting base performs dynamic adjustment at a corresponding speed.

20. The adjusting method according to claim 18, wherein in the dynamic adjusting step, after receiving the voice data containing contents representing different moving speeds provided by the user and picked up by the sound pickup, the processor performs voice recognition processing on the picked-up voice data containing contents representing different moving speeds and compares it with the eighth defined data item stored in the data storage, thereby issuing a corresponding instruction to make the mounting base dynamically adjusted at a corresponding speed.

21. An adjusting device for a face recognition device, the face recognition device comprising: an input device for inputting required data; a dynamically adjustable mounting base; a camera arranged on the mounting base for capturing a facial biometric image of a user; a display for displaying images captured by the camera and/or a speaker for outputting sound; a processor; and a data storage, the adjusting device comprising:

a camera shooting judging module, configured to start the camera to capture a facial biometric image of the user after the user approaches the face recognition device, to judge whether the camera has captured a recognizable facial biometric image of the user, and to pass processing on to the prompt description module if the camera has not captured a facial biometric image of the user or has captured only part of one;

a prompt description module, configured to provide a prompt to the user based on a plurality of preset definition data items that are stored in the data storage and that respectively define different dynamic adjustment actions to be performed by the mounting base, informing the user that the mounting base can be made to perform a corresponding dynamic adjustment by providing one or more instructions corresponding to the respective definition data items; and

a dynamic adjustment module, configured to issue, after receiving one or more instructions provided by the user and based on the one or more definition data items among the plurality of definition data items that correspond to the provided instructions, an instruction causing the mounting base to perform the dynamic adjustment corresponding to those definition data items, so that the viewing angle range of the camera mounted on the mounting base is adjusted to capture a recognizable facial biometric image of the user.

22. The adjustment device of claim 21, further comprising:

an input setting module for inputting the preset plurality of definition data items; and

a definition data item storage module for storing the plurality of definition data items.

23. The adjustment device of claim 22, further comprising:

an adjustment ending module, configured to receive, if the camera has captured a recognizable facial biometric image of the user, another instruction provided by the user indicating that the dynamic adjustment is finished, so that the mounting base stops the dynamic adjustment.

24. The adjustment device according to any one of claims 21-23, wherein the plurality of definition data items comprises: a first definition data item indicating that the mounting base performs an upward dynamic adjustment of its height in response to a first user instruction; a second definition data item indicating that the mounting base performs a downward dynamic adjustment of its height in response to a second user instruction; a third definition data item indicating that the mounting base performs a dynamic adjustment of turning left in response to a third user instruction; a fourth definition data item indicating that the mounting base performs a dynamic adjustment of turning right in response to a fourth user instruction; a fifth definition data item indicating that the mounting base performs a dynamic adjustment in which the pitch angle becomes larger in response to a fifth user instruction; a sixth definition data item indicating that the mounting base performs a dynamic adjustment in which the pitch angle becomes smaller in response to a sixth user instruction; and a seventh definition data item indicating that the mounting base stops the dynamic adjustment in response to a seventh user instruction.

25. The adjustment device of claim 24, wherein the respective user instructions corresponding to the respective definition data items comprise: different gestures provided by the user corresponding to each defined data item.

26. The adjustment device according to claim 24, wherein the face recognition device further comprises a sound pickup connected to the processor, and wherein the user instructions corresponding to the respective definition data items comprise: voices of different contents provided by the user corresponding to each definition data item.

27. The adjustment device of claim 25, wherein the different gestures provided by the user corresponding to the respective definition data items comprise: the gesture corresponding to the first definition data item is a fist with the thumb pointing up, indicating that the mounting base is required to perform an upward dynamic adjustment of its height; the gesture corresponding to the second definition data item is a fist with the thumb pointing down, indicating that the mounting base is required to perform a downward dynamic adjustment of its height; the gesture corresponding to the third definition data item is a fist with the thumb pointing left, indicating that the mounting base is required to perform a dynamic adjustment of turning left; the gesture corresponding to the fourth definition data item is a fist with the thumb pointing right, indicating that the mounting base is required to perform a dynamic adjustment of turning right; the gesture corresponding to the fifth definition data item is five fingers held together pointing up, indicating that the mounting base is required to perform a dynamic adjustment in which the pitch angle becomes larger; the gesture corresponding to the sixth definition data item is five fingers held together pointing down, indicating that the mounting base is required to perform a dynamic adjustment in which the pitch angle becomes smaller; and the gesture corresponding to the seventh definition data item is a clenched fist, indicating that the mounting base is required to stop the dynamic adjustment.

28. The adjustment device of claim 26, wherein the speech of different content provided by the user corresponding to the respective definition data items comprises: the speech corresponding to the first definition data item is "up", indicating that the mounting base is required to make a dynamic height adjustment upward; the speech corresponding to the second definition data item is "down", indicating that the mounting base is required to make a dynamic height adjustment downward; the speech corresponding to the third definition data item is "left", indicating that the mounting base is required to make a dynamic adjustment by turning left; the speech corresponding to the fourth definition data item is "right", indicating that the mounting base is required to make a dynamic adjustment by turning right; the speech corresponding to the fifth definition data item is "pitch large", indicating that the mounting base is required to make a dynamic adjustment that increases the pitch angle; the speech corresponding to the sixth definition data item is "pitch small", indicating that the mounting base is required to make a dynamic adjustment that decreases the pitch angle; and the speech corresponding to the seventh definition data item is "end", indicating that the mounting base is required to stop the dynamic adjustment.
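
For illustration only, the seven definition data items recited in claims 27 and 28 can be pictured as two lookup tables that share one set of adjustment actions. The following is a minimal sketch and not part of the claims; the names `Action`, `GESTURE_ITEMS`, `VOICE_ITEMS`, and `lookup_action`, as well as the gesture label strings, are hypothetical and assume a recognizer that already outputs a text label.

```python
from enum import Enum
from typing import Optional


class Action(Enum):
    """Adjustment actions of the mounting base (definition data items 1 to 7)."""
    HEIGHT_UP = 1
    HEIGHT_DOWN = 2
    TURN_LEFT = 3
    TURN_RIGHT = 4
    PITCH_INCREASE = 5
    PITCH_DECREASE = 6
    STOP = 7


# Hypothetical labels a gesture recognizer might emit for the gestures of claim 27.
GESTURE_ITEMS = {
    "fist_thumb_up": Action.HEIGHT_UP,
    "fist_thumb_down": Action.HEIGHT_DOWN,
    "fist_thumb_left": Action.TURN_LEFT,
    "fist_thumb_right": Action.TURN_RIGHT,
    "five_fingers_up": Action.PITCH_INCREASE,
    "five_fingers_down": Action.PITCH_DECREASE,
    "fist": Action.STOP,
}

# Spoken commands as worded in claim 28.
VOICE_ITEMS = {
    "up": Action.HEIGHT_UP,
    "down": Action.HEIGHT_DOWN,
    "left": Action.TURN_LEFT,
    "right": Action.TURN_RIGHT,
    "pitch large": Action.PITCH_INCREASE,
    "pitch small": Action.PITCH_DECREASE,
    "end": Action.STOP,
}


def lookup_action(kind: str, value: str) -> Optional[Action]:
    """Return the action for a recognized gesture label or spoken phrase.

    Returns None when no definition data item matches.
    """
    if kind == "gesture":
        table = GESTURE_ITEMS
    elif kind == "voice":
        table = VOICE_ITEMS
    else:
        raise ValueError(f"unknown instruction kind: {kind!r}")
    return table.get(value.strip().lower())
```

Keeping gestures and speech in separate tables that map to the same `Action` values mirrors the way claims 27 and 28 assign the same seven adjustments to two input modes.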

29. The adjustment device of claim 21, wherein, before the camera is started to capture a biometric image of the user's face, the mounting base resets the camera to an initial height and an initial tilt angle.

30. The adjustment device of claim 25, wherein the instruction module prompts a user to make a corresponding dynamic adjustment of the mounting base by providing gestures corresponding to the defined data items.

31. The adjustment device of claim 26, wherein the instruction module prompts a user to cause the mounting base to make a corresponding dynamic adjustment by providing speech with one or more items of content corresponding to the respective definition data items.

32. The adjusting device according to claim 25, wherein after the camera captures one or more gestures provided by the user, the dynamic adjusting module performs gesture recognition on the captured gesture image and compares the captured gesture image with the one or more preset definition data items stored in the definition data item storage module, so as to issue a corresponding instruction, so that the mounting base performs corresponding dynamic adjustment.

33. The adjusting apparatus according to claim 26, wherein the dynamic adjusting module receives the voice of one or more items of content provided by the user picked up by the sound pickup device, performs voice recognition processing on the picked-up voice data, and compares the voice data with the one or more preset definition data items stored in the definition data item storage module, thereby issuing a corresponding instruction to make the corresponding dynamic adjustment on the mounting base.
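
The recognize, compare, and issue-instruction flow of claims 32 and 33 can be sketched as follows. This is an illustrative assumption, not the claimed implementation: `MountingBase`, `STORED_ITEMS`, `handle_recognized`, the action strings, the unit step size, and the sign convention for turning are all hypothetical, and the recognizer itself is assumed to have already produced a text result.

```python
from dataclasses import dataclass
from typing import Dict

# Hypothetical contents of the definition data item storage module:
# recognized input (here, spoken phrases) mapped to an adjustment action.
STORED_ITEMS: Dict[str, str] = {
    "up": "height_up",
    "down": "height_down",
    "left": "turn_left",
    "right": "turn_right",
    "pitch large": "pitch_increase",
    "pitch small": "pitch_decrease",
    "end": "stop",
}


@dataclass
class MountingBase:
    """Toy model of the base state; units and step size are assumptions."""
    height: float = 0.0
    pan: float = 0.0
    pitch: float = 0.0
    moving: bool = False

    def apply(self, action: str, step: float = 1.0) -> None:
        """Carry out one adjustment step, or stop on the 'stop' action."""
        if action == "stop":
            self.moving = False
            return
        moves = {
            "height_up": ("height", step),
            "height_down": ("height", -step),
            "turn_left": ("pan", -step),   # sign convention is an assumption
            "turn_right": ("pan", step),
            "pitch_increase": ("pitch", step),
            "pitch_decrease": ("pitch", -step),
        }
        attr, delta = moves[action]
        setattr(self, attr, getattr(self, attr) + delta)
        self.moving = True


def handle_recognized(base: MountingBase, recognized: str) -> bool:
    """Compare a recognition result with the stored items and drive the base.

    Returns False, and issues no instruction, when nothing matches.
    """
    action = STORED_ITEMS.get(recognized.strip().lower())
    if action is None:
        return False
    base.apply(action)
    return True
```

In this sketch an unrecognized input leaves the base untouched, which is one reasonable reading of "compares ... thereby issuing a corresponding instruction"; the claims do not specify what happens when no item matches.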

34. The adjustment device of claim 21, wherein the dynamic adjustment of the mounting base comprises an up-down height adjustment and/or a side-to-side rotation adjustment and/or a pitch angle size adjustment of the mounting base.

35. The adjustment device of claim 34, wherein the dynamic adjustment of the mounting base is performed at a constant, predetermined speed.

36. The adjustment device of claim 24, wherein the plurality of definition data items further comprises: an eighth definition data item indicating an adjustment speed of the mounting base corresponding to an eighth user instruction.

37. The adjustment device of claim 36, wherein the eighth user instruction corresponding to the eighth definition data item comprises: gestures made at different movement speeds.

38. The adjustment device of claim 36, wherein the face recognition device further comprises a sound pickup coupled to the processor, and wherein the eighth user instruction corresponding to the eighth definition data item comprises: speech whose content represents different movement speeds.

39. The adjusting device according to claim 37, wherein after receiving the gesture images with different movement speeds provided by the user and captured by the camera, the dynamic adjusting module performs gesture recognition processing on the captured gesture images with different movement speeds, and compares the gesture images with the eighth definition data item stored in the definition data item storage module, so as to issue a corresponding instruction, so that the mounting base performs dynamic adjustment at a corresponding speed.

40. The adjustment apparatus according to claim 38, wherein the dynamic adjustment module receives the voice data containing the contents representing different moving speeds, which is provided by the user and picked up by the sound pickup device, performs voice recognition processing on the picked-up voice data containing the contents representing different moving speeds, and compares it with the eighth definition data item stored in the definition data item storage module, thereby issuing a corresponding instruction to dynamically adjust the mounting base at a corresponding speed.
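
As a rough illustration of the eighth definition data item in claims 36 to 40, the sketch below maps a speed instruction to an adjustment speed for the mounting base. The claims do not specify concrete words, thresholds, or speed values, so everything here is an assumption made for the example: `DEFAULT_SPEED`, the words in `SPEED_WORDS`, the threshold parameters, and the function names are all hypothetical.

```python
# Assumed predetermined constant speed (arbitrary units), used when no
# speed instruction is recognized.
DEFAULT_SPEED = 1.0

# Hypothetical eighth definition data item for speech: word -> speed factor.
SPEED_WORDS = {"slow": 0.5, "normal": 1.0, "fast": 2.0}


def speed_from_voice(text: str) -> float:
    """Return the adjustment speed for a recognized speed word.

    Unrecognized words fall back to the default speed.
    """
    factor = SPEED_WORDS.get(text.strip().lower(), 1.0)
    return DEFAULT_SPEED * factor


def speed_from_gesture(hand_speed: float,
                       slow_below: float = 0.2,
                       fast_above: float = 0.8) -> float:
    """Return the adjustment speed for a gesture of a given movement speed.

    `hand_speed` is assumed to be a normalized estimate from the gesture
    recognizer; both thresholds are illustrative values.
    """
    if hand_speed < slow_below:
        return DEFAULT_SPEED * 0.5
    if hand_speed > fast_above:
        return DEFAULT_SPEED * 2.0
    return DEFAULT_SPEED
```

Falling back to `DEFAULT_SPEED` keeps the behavior consistent with claim 35, where the base moves at a constant predetermined speed when no speed instruction is given.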

41. The adjustment device of claim 23, wherein the input setting module is implemented by the input device; the defined data item storage module is implemented by the data store; the camera shooting judging module is realized by the camera and the processor; the cue specification module is implemented by the processor and the display and/or the speaker; and the dynamic adjustment module and the end of adjustment module are implemented by the processor.

42. A face recognition device, comprising: an input device for inputting necessary data; a dynamically adjustable mounting base; a camera mounted on the mounting base and used for capturing a face biometric image of a user; a display for displaying images captured by the camera and/or a speaker for outputting sound to the user; a processor; and a data storage, characterized in that the face recognition device further comprises the adjustment device according to any one of claims 21 to 41.