CN115577337A - Intelligent security equipment control method and system based on face recognition


Info

Publication number
CN115577337A
Authority
CN
China
Prior art keywords
image
target
face
face image
target face
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211049524.0A
Other languages
Chinese (zh)
Inventor
唐皓
刘楠城
陈彬
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Yunding Network Technology Beijing Co Ltd
Original Assignee
Yunding Network Technology Beijing Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Yunding Network Technology Beijing Co Ltd filed Critical Yunding Network Technology Beijing Co Ltd
Priority to CN202211049524.0A priority Critical patent/CN115577337A/en
Publication of CN115577337A publication Critical patent/CN115577337A/en
Pending legal-status Critical Current

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00: Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/30: Authentication, i.e. establishing the identity or authorisation of security principals
    • G06F21/31: User authentication
    • G06F21/32: User authentication using biometric data, e.g. fingerprints, iris scans or voiceprints
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00: Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10: Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16: Human faces, e.g. facial parts, sketches or expressions

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Security & Cryptography (AREA)
  • Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Hardware Design (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Human Computer Interaction (AREA)
  • Multimedia (AREA)
  • Collating Specific Patterns (AREA)
  • Alarm Systems (AREA)

Abstract

The embodiments of this specification disclose an intelligent security device control method and system based on face recognition, relating to the field of data processing. The method includes: acquiring environmental information; acquiring a first image based on the environmental information, wherein the first image includes a first target face image; performing face recognition on the first target face image, and controlling a device to execute a first response operation based on the recognition result of the first target face image; if the first response operation is not a target operation, repeatedly executing the following until the similarity satisfies a preset condition: acquiring a second image, wherein the second image includes a second target face image, and determining the similarity between the first target face image and the second target face image; and performing face recognition on the second target face image whose similarity satisfies the preset condition, and controlling the device to execute a second response operation based on the recognition result of the second target face image.

Description

Intelligent security equipment control method and system based on face recognition
Description of the Division
This application is a divisional application of Chinese patent application No. 202111493108.5, entitled "Intelligent security equipment control method and system based on face recognition", filed on December 8, 2021.
Technical Field
The specification relates to the field of data processing, in particular to an intelligent security equipment control method and system based on face recognition.
Background
Currently, a user needs to carry a key or a door card to open a lock, but keys and door cards are inconvenient to carry and easy to damage or lose. As consumers embrace smart locks, non-contact biometric means in particular are increasingly accepted, and face recognition is gradually becoming the main control mode of high-end smart locks. However, the face recognition scenes of smart locks are complex, and both the recognition efficiency and the recognition experience still need to be improved.
Therefore, it is desirable to provide a method and a system for controlling an intelligent security device based on face recognition, so as to improve the efficiency of face recognition, thereby improving the use experience of the intelligent security device based on face recognition.
Disclosure of Invention
One embodiment of the present specification provides a method for controlling an intelligent security device based on face recognition, where the method includes: acquiring environmental information; acquiring a first image based on the environmental information, wherein the first image includes a first target face image; controlling the device to execute a first response operation based on the first target face image; and, if the first response operation is not a target operation, repeatedly executing the following until the similarity satisfies a preset condition: acquiring a second image, wherein the second image includes a second target face image, and determining the similarity between the first target face image and the second target face image; and, when the similarity satisfies the preset condition, controlling the device to execute a second response operation based on the second target face image.
One of the embodiments of the present specification provides an intelligent security device control system based on face recognition, including: an information acquisition module, configured to acquire environmental information; an image acquisition module, configured to acquire a first image based on the environmental information, wherein the first image includes a first target face image; and an intelligent security device control module, configured to control a device to execute a first response operation based on the first target face image; further configured to, if the first response operation is not a target operation, repeatedly control the image acquisition module to acquire a second image, wherein the second image includes a second target face image, and determine the similarity between the first target face image and the second target face image, until the similarity satisfies a preset condition; and further configured to, when the similarity satisfies the preset condition, control the device to execute a second response operation based on the second target face image.
One embodiment of the present specification provides a computer device, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor implements the above-mentioned intelligent security device control method based on face recognition when executing the computer program.
One of the embodiments of the present specification provides a computer-readable storage medium storing computer instructions, where after a computer reads the computer instructions in the storage medium, the computer executes the above-mentioned method for controlling an intelligent security device based on face recognition.
Drawings
The present description will be further explained by way of exemplary embodiments, which will be described in detail by way of the accompanying drawings. These embodiments are not intended to be limiting, and in these embodiments like numerals are used to indicate like structures, wherein:
FIG. 1 is a schematic diagram of an application scenario of an intelligent security device control system based on face recognition according to some embodiments of the present disclosure;
FIG. 2 is an exemplary block diagram of a face recognition based intelligent security device control system according to some embodiments of the present description;
FIG. 3 is an exemplary flow chart of a method for controlling an intelligent security device based on face recognition according to some embodiments of the present disclosure;
FIG. 4 is an exemplary flow diagram illustrating the acquisition of a first image based on a target capture area according to some embodiments of the present description;
FIG. 5 is a schematic diagram of a third machine learning model, shown in accordance with some embodiments of the present description.
Detailed Description
In order to more clearly illustrate the technical solutions of the embodiments of the present disclosure, the drawings used in the description of the embodiments will be briefly described below. It is obvious that the drawings in the following description are only examples or embodiments of the present description, and that for a person skilled in the art, the present description can also be applied to other similar scenarios on the basis of these drawings without inventive effort. Unless otherwise apparent from the context, or otherwise indicated, like reference numbers in the figures refer to the same structure or operation.
It should be understood that "system", "apparatus", "unit" and/or "module" as used herein is a method for distinguishing different components, elements, parts, portions or assemblies at different levels. However, other words may be substituted by other expressions if they accomplish the same purpose.
As used in this specification and the appended claims, the singular forms "a", "an", and "the" are intended to include the plural forms as well, unless the context clearly dictates otherwise. In general, the terms "comprise" and "include" merely prompt the inclusion of the explicitly identified steps or elements, which do not constitute an exclusive list; the method or apparatus may also include other steps or elements.
Flowcharts are used in this specification to illustrate the operations performed by the system according to embodiments of the present specification. It should be understood that the preceding or following operations are not necessarily performed exactly in order. Rather, the various steps may be processed in reverse order or simultaneously. Meanwhile, other operations may be added to these processes, or a certain step or several steps may be removed from them.
Fig. 1 is a schematic view of an application scenario of a smart security device control system 100 based on face recognition according to some embodiments of the present disclosure.
In some embodiments, the intelligent security device control system 100 based on face recognition may implement the intelligent security device control based on face recognition by implementing the methods and/or processes disclosed in this specification.
As shown in fig. 1, the intelligent security device control system 100 based on face recognition may include a server 110, a network 120, a user terminal 130, a storage device 140, an information acquisition device 150, a camera capture device 160, and an intelligent lock 170.
The server 110 may be used to process data and/or information from at least one component of the smart security device control system 100 based on face recognition or from an external data source (e.g., a cloud data center). For example, the server 110 may be used to acquire environmental information of the detection area from the information acquisition device 150, or to acquire a first image from the camera acquisition device 160. The server 110 may further be configured to control the device to perform a first response operation based on the first target face image. The server 110 may also be configured to, when it determines not to unlock, repeatedly control the camera acquisition device 160 to obtain a second image, where the second image includes a second target face image, and determine the similarity between the first target face image and the second target face image, until the similarity satisfies a preset condition; when the similarity satisfies the preset condition, the server 110 controls the device to execute a second response operation based on the second image. In some embodiments, during processing, the server 110 may obtain data (e.g., instructions) from the storage device 140 or save data (e.g., the result of determining whether to control the smart lock 170) to the storage device 140, or may read data (e.g., environmental information) from other sources such as the user terminal 130 or output data (e.g., a face recognition result) to the user terminal 130 through the network 120.
In some embodiments, the server 110 may include a Central Processing Unit (CPU), a Digital Signal Processor (DSP), the like, and/or any combination thereof. In some embodiments, the server 110 may be local, remote, or implemented on a cloud platform.
The network 120 may provide a conduit for the exchange of information. In some embodiments, information may be exchanged between the server 110, the user terminal 130, the storage device 140, the information acquisition apparatus 150, the camera capture apparatus 160, and the smart lock 170 via the network 120. For example, the server 110 may receive the environmental information acquired by the information acquisition device 150 through the network 120. As another example, the server 110 may read data stored by the storage device 140 over the network 120.
User terminal 130 refers to one or more terminal devices or software used by a user. In some embodiments, the user terminal 130 may be one or any combination of a mobile device, a tablet computer, a laptop computer, a desktop computer, or another device having input and/or output capabilities. In some embodiments, the user terminal 130 may serve as a display terminal for the user, obtaining and displaying the face recognition result of the server 110 via the network 120. The above examples are intended only to illustrate the breadth of devices that may serve as the user terminal 130, not to limit its scope.
Storage device 140 may be used to store data and/or instructions. In some embodiments, the storage device 140 may obtain data and/or instructions from, for example, the user terminal 130, the information acquisition apparatus 150, the camera acquisition apparatus 160, and/or the like. In some embodiments, storage device 140 may store data and/or instructions used by server 110 to perform or use to perform the exemplary methods described in this specification.
The information acquiring means 150 may be used to acquire environmental information, wherein the environmental information may be information related to determining whether an obstacle exists in the detection area. In some embodiments, the information acquisition device 150 may send the environmental information to the server 110 through the network 120. For more description of the information acquisition device 150, reference may be made to fig. 3 and its associated description.
The camera acquisition device 160 may be configured to acquire a first image, where the first image may contain image information of the detection area. In some embodiments, the camera acquisition device 160 may include a camera device and a light source, wherein the light source may be used to project light onto the detection area, and the camera device may be used to capture the light signals from the detection area.
In some embodiments, the camera acquisition device 160 may include a plurality of camera devices, one of which is a main camera device and the rest of which are secondary camera devices; the shooting area of the main camera device may partially overlap with the shooting areas of the secondary camera devices, so as to reduce the monitoring dead angles of the camera acquisition device 160. For example, a secondary camera may be installed below the main camera; when a short user (e.g., a child) stands below the shooting area of the main camera and cannot be captured by it, the secondary camera may be used to acquire an image of the user.
In some embodiments, the camera device may include a depth camera and/or a flat-panel camera. The depth camera collects depth information of the detection area, and the flat-panel camera collects two-dimensional images of the detection area. In some embodiments, the depth camera may include a sensor that scans the three-dimensional spatial information of the detection area to obtain its depth information (e.g., point cloud data), such as a sensor that scans the detection area by white light interference, a sensor that scans the detection area by white light confocal measurement, or a structured light camera that uses a projector to project specific light patterns (e.g., criss-cross laser lines, black-and-white squares, circles, etc.) onto the detection area. In some embodiments, the depth camera may also include a binocular camera, a TOF (Time of Flight) camera, and the like. In some embodiments, the flat-panel camera may comprise a black-and-white camera, a color camera, a scanner, or the like, or any combination thereof.
In some embodiments, the light source may project light onto the detection area to enable the depth camera and/or the flat-panel camera to acquire information about it. In some embodiments, the light source may include a visible light source, which projects light visible to the human eye; a visible light source may be a monochromatic light source projecting light of a single frequency (or wavelength) (e.g., red, orange, yellow, green, blue, violet, etc.) or a composite light source projecting a mixture of monochromatic light of different frequencies (or wavelengths), such as an incandescent or fluorescent light source. In some embodiments, the light source may also include a non-visible light source, which projects light invisible to the human eye (e.g., radio waves, microwaves, infrared light, ultraviolet light, x-rays, gamma rays, far infrared rays, etc.). In some embodiments, there may be one or more light sources, which may project monochromatic light, composite light, or both, and the colors of the plurality of light sources may be the same or different.
In some embodiments, the camera acquisition device 160 may send the first image to the server 110 via the network 120. For more description of the camera acquisition device 160, reference may be made to fig. 3 and 4 and their associated description.
The smart lock 170 may be used to perform operations, and may be installed on a door. In some embodiments, the smart lock 170 may include an electronic control assembly and a lock component, and the electronic control assembly may drive the lock component to perform certain operations; for example, it may drive the bolt of the lock component to extend or retract. When the bolt is retracted, the smart lock 170 is in an open state; when the bolt is extended, the smart lock 170 is in a locked state. In some embodiments, the electronic control assembly may be a relay, a solenoid, or the like. In some embodiments, the smart security device for performing the first response operation or the second response operation may include the smart lock 170, and may further include other components, such as a fingerprint recognition component and a prompt component (e.g., a speaker or an LED lamp) for emitting prompt messages.
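As a minimal illustrative sketch of driving a lock bolt through a relay as described above, the following assumes a Raspberry-Pi-style GPIO pin wired to the relay coil; the pin number, active level, and function names are hypothetical, not taken from the patent.

```python
# Hypothetical relay control for the electronic control assembly.
import RPi.GPIO as GPIO

RELAY_PIN = 17  # hypothetical GPIO pin driving the relay coil

GPIO.setmode(GPIO.BCM)
GPIO.setup(RELAY_PIN, GPIO.OUT)

def set_bolt(extended: bool) -> None:
    """Energize the relay to extend the bolt (locked state) or
    de-energize it to retract the bolt (open state)."""
    GPIO.output(RELAY_PIN, GPIO.HIGH if extended else GPIO.LOW)

set_bolt(False)  # retract the bolt: smart lock 170 in the open state
```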
Fig. 2 is an exemplary block diagram of a system 200 for controlling smart security devices based on face recognition according to some embodiments of the present disclosure.
As shown in fig. 2, the intelligent security device control system 200 based on face recognition may include an information acquisition module 210, an image acquisition module 220, and an intelligent security device control module 230 based on face recognition.
The information obtaining module 210 may be used to obtain the environment information. For more description of the information obtaining module 210, reference may be made to fig. 3 and 4 and their related description.
The image acquisition module 220 may be configured to acquire a first image based on the environment information, the first image including a first target face image. In some embodiments, the image obtaining module 220 may determine whether a human body exists in the detection area based on the environment information, and obtain the first image if the human body exists. For more description of the detection area and the first image, refer to fig. 3 and its related description, which are not repeated herein.
In some embodiments, before acquiring the first image, the image acquisition module 220 may further acquire a pre-image based on the environment information, determine a target photographing region based on the pre-image, and acquire the first image based on the target photographing region. In some embodiments, the image acquisition module 220 may determine human body feature information based on the pre-image and determine the target photographing region based on the human body feature information. In some embodiments, the image obtaining module 220 may obtain a current position of the human body based on the pre-image, and determine whether the human body is located in the target shooting area based on the current position and the target shooting area, and if so, obtain the first image. For more description of the pre-image, the target shooting area and the human body feature information, reference may be made to fig. 3 and 4 and their related description.
The smart security device control module 230 may be configured to control the smart security device to perform a first response operation based on the first target face image. In some embodiments, the smart security device control module 230 may be further configured to, when the first response operation is not the target operation, control the image obtaining module to repeatedly perform obtaining of a second image, where the second image includes a second target face image, and determine a similarity between the first target face image and the second target face image until the similarity satisfies a preset condition, and further configured to determine whether to control the device to perform the operation based on the second image when the similarity satisfies the preset condition. For further description of the first response operation, the target operation, the second image, the second target face image, the similarity, the preset condition, and the second response operation executed by the control device based on the second image, refer to fig. 3 and related description thereof, which are not repeated herein.
In some embodiments, the smart security device control module 230 may obtain a face image of at least one face in the first image; determining face state information of at least one face based on a face image of the at least one face; and determining a first target face image from the face images of the at least one face based on the face state information of the at least one face. In some embodiments, the smart security device control module 230 may determine the validity of the first target face image based on a legal face set, where the legal face set includes at least one legal face image, and control the device to perform the first response operation based on the validity. For further description of the face state information, the first target face image, and the validity of the first target face image, reference may be made to fig. 3 and the related description thereof, which are not described herein again.
Fig. 3 is an exemplary flowchart of a method 300 for controlling a smart security device based on face recognition according to some embodiments of the present disclosure. As shown in fig. 3, the method 300 for controlling the smart security device based on face recognition includes the following steps. In some embodiments, the intelligent security device control method 300 based on face recognition may be executed by the server 110.
At step 310, environmental information is obtained. In some embodiments, this step 310 may be performed by the information acquisition module 210.
In some embodiments, the environmental information may be information related to determining whether an obstacle is present within the detection area, wherein the detection area may be an area in front of the door. In some embodiments, the information acquisition module 210 may acquire the environmental information through an information acquisition device (e.g., the information acquisition device 150). In some embodiments, the information acquisition device 150 may be integrated with at least one sensor for acquiring information about obstacles within the detection area. In some embodiments, the sensor may comprise an infrared sensor, an ultrasonic sensor, a laser sensor, or the like.
In some embodiments, to reduce erroneous determination, the environmental information may also be information related to determination of whether or not a living body exists in the detection region. In some embodiments, the sensor may also be used to acquire living body information of the detection area. In some embodiments, the living body information may include body temperature, blood oxygen, heart rate, finger veins, and the like. In some embodiments, the sensor may further comprise an infrared pyroelectric sensor.
At step 320, a first image is obtained based on the environment information. In some embodiments, this step 320 may be performed by the image acquisition module 220.
In some embodiments, the first image may be an image containing image information of the detection area. The format of the first image may include Joint Photographic Experts Group (JPEG), Tagged Image File Format (TIFF), Graphics Interchange Format (GIF), Kodak FlashPiX (FPX), Digital Imaging and Communications in Medicine (DICOM), and the like. The first image may be a two-dimensional (2D) image or a three-dimensional (3D) image.
In some embodiments, the image acquisition module 220 may acquire the first image via a camera acquisition device (e.g., camera acquisition device 160). In some embodiments, when the information obtaining module 210 determines that an obstacle exists in the detection area based on the environment information obtained by the information obtaining device 150, the camera capturing device 160 may obtain an image or a video, and the image obtaining module 220 may take the image obtained by the camera capturing device 160 as the first image or extract at least one frame of image from the video obtained by the camera capturing device 160 as the first image.
In some embodiments, in order to avoid triggering the acquisition of the first image by a non-living body, the camera acquisition device 160 may acquire the first image when the information acquisition module 210 determines that a living body exists in the detection area based on the environmental information acquired by the information acquisition device 150; when the information acquisition module 210 determines that there is no living body in the detection area, the camera acquisition device 160 may not acquire the first image.
In some embodiments, in order to avoid triggering the acquisition of the first image by a living body (e.g., a pet) other than a human body, the camera acquisition device 160 may acquire the first image when the information acquisition module 210 determines that a human body exists in the detection area based on the environmental information acquired by the information acquisition device 150; when the information acquisition module 210 determines that there is no human body in the detection area, the camera acquisition device 160 may not acquire the first image. For example, when the sensor of the information acquisition device 150 acquires at least one of blood oxygen, heart rate, and finger vein, the camera acquisition device 160 may acquire a first image; otherwise, it may not. In some embodiments, the camera device may include a binocular camera; when the first image needs to be obtained, the binocular camera may obtain at least two original images of the detection area, using one as the main image and the other as the sub image. The camera acquisition device 160 may compute the disparity between corresponding pixel points of the main image and the sub image using a matching algorithm (e.g., Semi-Global Block Matching (SGBM) or Block Matching), and, from the geometry of parallel binocular vision, convert disparity to depth with the formula depth = (f × baseline) / disp, where depth is the depth value, f is the normalized focal length, baseline is the distance between the optical centers of the two cameras, and disp is the disparity value. The depth values of the corresponding pixel points, together with the main image and the sub image, finally yield a first image containing the depth information of the detection area.
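A minimal sketch of this parallel-binocular conversion, depth = (f × baseline) / disp, using OpenCV's SGBM matcher; the file names, focal length, and baseline are placeholders standing in for an assumed calibration.

```python
import cv2
import numpy as np

left = cv2.imread("main.png", cv2.IMREAD_GRAYSCALE)   # main image
right = cv2.imread("sub.png", cv2.IMREAD_GRAYSCALE)   # sub image

# SGBM returns fixed-point disparities scaled by 16.
stereo = cv2.StereoSGBM_create(minDisparity=0, numDisparities=128, blockSize=5)
disp = stereo.compute(left, right).astype(np.float32) / 16.0

f = 700.0        # normalized focal length in pixels (assumed calibration)
baseline = 0.06  # distance between the two optical centers, in meters (assumed)

depth = np.zeros_like(disp)
valid = disp > 0                              # ignore unmatched pixels
depth[valid] = f * baseline / disp[valid]     # depth = (f * baseline) / disp
```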
In some embodiments, the image capture device 160 may further be a structured light camera, where the structured light camera includes a projector and a camera, and when a first image needs to be obtained, the projector may project a pattern with a special structure (e.g., discrete light spots, stripe light, coded structured light, etc.) onto the detection area, and the camera is configured to obtain the first image of the detection area onto which the pattern is projected, where the first image is a two-dimensional color image.
In some embodiments, when the first image needs to be obtained, the light source of the camera acquisition device 160 may emit visible light (e.g., monochromatic light, composite light, etc.) to illuminate the detection area, and the flat-panel camera is configured to obtain the first image under this illumination; in this case, the first image may be a two-dimensional black-and-white image or a two-dimensional color image.
In some embodiments, the image acquisition module 220 may further perform preprocessing on the first image obtained by the camera acquisition device 160, where the preprocessing may include image denoising, image enhancement, and the like.
The image denoising refers to removing interference information in the first image. The interference information in the first image may degrade the quality of the first image. In some embodiments, the image acquisition module 220 may implement image denoising via a median filter, a machine learning model, or the like.
Image enhancement refers to adding missing information in the first image. The missing information in the first image may cause image blurring. In some embodiments, the image acquisition module 220 may implement image enhancement through a smoothing filter, a median filter, or the like.
In some embodiments, the pre-processing may also include other operations (e.g., image segmentation, etc.).
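A minimal sketch of the denoising and enhancement steps described above, assuming OpenCV; the kernel size and CLAHE settings are illustrative choices, not values from the patent.

```python
import cv2

def preprocess(first_image):
    # Image denoising: a median filter removes salt-and-pepper interference.
    denoised = cv2.medianBlur(first_image, 5)
    # Image enhancement: CLAHE boosts local contrast on the luminance channel.
    lab = cv2.cvtColor(denoised, cv2.COLOR_BGR2LAB)
    l, a, b = cv2.split(lab)
    clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
    enhanced = cv2.merge((clahe.apply(l), a, b))
    return cv2.cvtColor(enhanced, cv2.COLOR_LAB2BGR)
```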
In some embodiments, after acquiring a first image, the image acquiring module 220 may further determine whether the acquired first image meets the quality requirement, and if the acquired first image does not meet the quality requirement, the image acquiring module 220 may control the camera capturing device 160 to acquire the first image again until the first image meets the quality requirement.
In some embodiments, the quality of the image may be characterized by a quality characteristic of the image. In some embodiments, the quality features of the image may include a human face, noise features, gray scale distribution, global gray scale, resolution, contrast, and the like.
The face refers to a face image contained in the first image, and if the first image does not contain at least one face image, subsequent face recognition based on the first target face image cannot be performed.
The noise is interference information in the image, and the noise in the first image not only reduces the quality of the first image and affects the visual effect of the image, but also affects the efficiency of subsequent processing such as identification of the first target face image. The noise feature is used for describing noise information of the image, and is a numerical representation of information related to noise in the image. In some embodiments, the noise characteristics may include noise distribution, noise global strength, noise level, noise rate, and the like.
The gray distribution characteristics reflect the distribution of the gray values of the pixels in the image. The gray distribution characteristics can be obtained by processing the image. For example, the mean value or the standard deviation of the gradation values in the image may be used as the gradation distribution characteristic.
Global gray refers to the average gray value or weighted average gray value of all pixels in an image. The larger the global gray value, the brighter the image; the smaller the global gray value, the darker the image.
Resolution refers to the amount of information stored in an image. In some embodiments, the resolution may be characterized by the number of pixels contained in a unit area of the image. It will be appreciated that the higher the resolution, the sharper the image.
Contrast refers to the measurement of different brightness levels in an image, representing the magnitude of the image's gray scale contrast. In some embodiments, the contrast ratio may be obtained by using the formulas of weber contrast ratio, root mean square contrast ratio, michelson contrast ratio, and the like.
In some embodiments, the image acquisition module 220 may analyze the quality characteristics of the first image to determine whether its quality meets the requirements. For example, if no face is included in the first image, the first image does not meet the quality requirement. As another example, if the resolution of the first image is less than 1024×768, the first image does not meet the quality requirement.
In some embodiments, before the image obtaining module 220 identifies the first target face of the first image, it determines whether the first image meets the quality requirement, so as to avoid identifying the invalid first image, and improve the accuracy and efficiency of face identification.
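A hedged sketch of such a quality gate: check that a face is present and that resolution and contrast clear minimum values. Only the 1024×768 floor comes from the text; the contrast threshold and detector choice are assumptions.

```python
import cv2

face_detector = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

def meets_quality(first_image) -> bool:
    h, w = first_image.shape[:2]
    if w < 1024 or h < 768:                    # resolution requirement
        return False
    gray = cv2.cvtColor(first_image, cv2.COLOR_BGR2GRAY)
    if gray.std() < 20:                        # RMS contrast floor (assumed)
        return False
    faces = face_detector.detectMultiScale(gray, 1.1, 5)
    return len(faces) > 0                      # at least one face required
```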
In some embodiments, the first image may comprise a first target face image.
The first target face image may be a face image in the first image for subsequent face recognition. In some embodiments, the first image may include a face image of at least one face, and the image obtaining module 220 may determine the first target face image from the face image of the at least one face.
In some embodiments, the image acquisition module 220 may acquire a face image of at least one face from the first image. In some embodiments, the image acquisition module 220 may first extract a plurality of overlapping image blocks from the first image, for example by means of a multi-scale sliding window, Selective Search, a neural network, or the like. Further, the image acquisition module 220 may extract features of the image blocks and determine whether each block contains a face, thereby obtaining at least one image block containing a face and using these blocks as the face images of the at least one face.
In some embodiments, the image acquisition module 220 may determine the first target face image from the face images of the at least one face based on features of those face images. For example, the image acquisition module 220 may determine the ratio of the face area in each face image to the total area of that image, and use the face image with the largest ratio as the first target face image. As another example, the image acquisition module 220 may determine whether a face image contains complete facial features, and use a face image containing them as the first target face image.
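A sketch of the area-ratio rule above: detect candidate faces, then keep the one occupying the largest fraction of the frame. The detector and its parameters are assumptions, not the patent's method.

```python
import cv2

detector = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

def first_target_face(first_image):
    gray = cv2.cvtColor(first_image, cv2.COLOR_BGR2GRAY)
    boxes = detector.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    if len(boxes) == 0:
        return None
    frame_area = first_image.shape[0] * first_image.shape[1]
    # Keep the face whose area ratio (face area / frame area) is largest.
    x, y, w, h = max(boxes, key=lambda b: (b[2] * b[3]) / frame_area)
    return first_image[y:y + h, x:x + w]
```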
In some embodiments, the image acquisition module 220 may also acquire the first target face image from the first image based on a first machine learning model. In some embodiments, the first machine learning model may include, but is not limited to, a Visual Geometry Group network (VGG) model, an Inception Net model, a Fully Convolutional Network (FCN) model, a Segmentation Network (SegNet) model, a Mask Region-based Convolutional Neural Network (Mask R-CNN) model, and the like.
In some embodiments, when the image obtaining module 220 trains the first machine learning model, a plurality of sample images with labels may be used as training data, and parameters of the model may be learned in a common manner (for example, gradient descent, etc.), where the sample images may include face images of at least one face, and the labels may be target face images of the sample images. In some embodiments, the first machine learning model may be trained in another device or module.
In some embodiments, the features of the face image may further include face state information of the face, where the face state information may be used to characterize a state of the face when the first image acquisition is performed.
In some embodiments, the face state information may include face area, facial expression, face angle, and the like, where the facial expression may be determined based on the facial features/key features of the face (e.g., the features of the five sense organs) in the face image, and the face angle may be the angle of the face with respect to the camera acquisition device 160.
In some embodiments, the image acquisition module 220 may determine the face state information based on a feature extraction algorithm. Feature extraction algorithms for extracting the face state information of a face image include, but are not limited to, Histogram of Oriented Gradients (HOG), the Local Binary Pattern (LBP) algorithm, the Scale-Invariant Feature Transform (SIFT) algorithm, the Haar-like algorithm, the Gray-Level Co-occurrence Matrix (GLCM) method, the Hough transform, the Fourier transform, the Fourier shape descriptor method, the shape parameter method, the Finite Element Method (FEM), the turning function, the wavelet descriptor, and the like.
In some embodiments, the image acquisition module 220 may determine the first target face image from the face images of at least one face based on the face state information of the at least one face. In some embodiments, the image acquisition module 220 may determine a willingness of each face image to be recognized based on the face state information, where the willingness to be recognized represents the degree to which the person corresponding to the face image wishes to be the object of face recognition. In some embodiments, the image acquisition module 220 may determine the willingness to be recognized based on at least one of the face area, the facial expression, and the face angle. For example, the larger the face area, the greater the willingness to be recognized. As another example, the calmer the facial expression, the greater the willingness to be recognized. As another example, the smaller the face angle, the greater the willingness to be recognized.
In some embodiments, the image acquisition module 220 may further normalize the face area, the facial expression, and the face angle, and determine the willingness to be recognized from a weighted combination of the normalized values. For example, the image acquisition module 220 may determine the willingness to be recognized based on the following formula:
Q = aX + bY + cZ;
where Q is the willingness to be recognized, X is the normalized face area, Y is the normalized facial expression, Z is the normalized face angle, and a, b, and c are the respective weights of the normalized face area, facial expression, and face angle.
In some embodiments, the image acquisition module 220 may determine the first target face image based on the willingness to be recognized. For example, the image acquisition module 220 may take the face image with the greatest willingness to be recognized as the first target face image. As another example, the image acquisition module 220 may take a face image whose willingness to be recognized is greater than a preset willingness threshold as the first target face image.
In some embodiments, the image acquisition module 220 first determines the willingness of each face image to be recognized based on the face state information, and then determines the first target face image from the face images of the at least one face in the first image based on that willingness, so that the person who needs face recognition can be determined more accurately.
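A minimal sketch of the willingness score Q = aX + bY + cZ defined above. The normalization ranges and the weights a, b, c are hypothetical; the patent does not fix their values.

```python
def willingness(face_area, expression_score, face_angle,
                a=0.5, b=0.2, c=0.3,
                max_area=200_000.0, max_angle=90.0):
    X = min(face_area / max_area, 1.0)   # normalized face area (larger is better)
    Y = expression_score                 # normalized expression in [0, 1]
    # Smaller angle means greater willingness, so invert the normalized angle.
    Z = 1.0 - min(abs(face_angle) / max_angle, 1.0)
    return a * X + b * Y + c * Z

candidates = [
    {"id": 0, "area": 150_000, "expr": 0.9, "angle": 5.0},
    {"id": 1, "area": 60_000, "expr": 0.4, "angle": 40.0},
]
# The face image with the greatest willingness becomes the first target face image.
target = max(candidates, key=lambda f: willingness(f["area"], f["expr"], f["angle"]))
```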
In step 330, the device is controlled to execute a first response operation based on the first target face image. In some embodiments, this step 330 may be performed by the smart security device control module 230.
In some embodiments, the smart security device control module 230 may determine whether the first target facial image is a legal facial image, and control the device to perform a first response operation based on the determination result, where the first response operation may be an operation performed by the smart security device. For example, the smart lock 170 is unlocked, the smart lock 170 is further locked (e.g., the first bolt is controlled to extend when the second bolt is extended), the fingerprint recognition component is unlocked, the prompt component sends a prompt message, and the like. Illustratively, when the intelligent security device control module 230 determines that the first target face is a legal target face, the intelligent lock (e.g., the intelligent lock 170) may be controlled to unlock and/or unlock a fingerprint identification component; when the intelligent security device control module 230 determines that the first target face is not a legal target face, the intelligent lock (e.g., the intelligent lock 170) may be controlled to perform further locking and/or prompt the component to send out prompt information (e.g., voice information, light information, etc.). The legal face image can be a face image stored in advance.
In some embodiments, the smart security device control module 230 may determine whether the first target face image is legal based on a legal face set, wherein the legal face set may include at least one legal face image.
In some embodiments, the smart security device control module 230 may obtain the set of legitimate faces from one or more components of the smart security device control system 100 (e.g., the user terminal 130, the storage device 140, etc.) or from an external source (e.g., a database) via the network 120.
In some embodiments, the intelligent security device control module 230 may determine whether the first target face image is legal by determining whether a legal face image similar to the first target face image exists in the legal face set. For example, when a legal face image similar to the first target face image exists in the legal face set, the intelligent security device control module 230 may determine that the first target face image is legal; when a legal face image similar to the first target face image does not exist in the legal face set, the intelligent security device control module 230 may determine that the first target face image does not have validity.
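A sketch of this legality check: compare a feature embedding of the first target face image against embeddings of the legal face set. The embeddings are a stand-in for any face-feature extractor, and the 0.6 threshold is an assumed value, not from the patent.

```python
import numpy as np

def cosine(u, v):
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def is_legal(target_embedding, legal_embeddings, threshold=0.6):
    # Legal if any legal face image is similar enough to the target face.
    return any(cosine(target_embedding, e) >= threshold
               for e in legal_embeddings)
```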
In some embodiments, the smart security device control module 230 may determine whether the first target face image is legal based on the first image feature of the first target face image and the second image feature of the at least one legal face image.
In some embodiments, the smart security device control module 230 may determine whether the first target face image is legal based on the first image feature of the first target face image and the second image feature of the at least one legal face image through a second machine learning model. In some embodiments, the input of the second machine learning model is the first image feature of the first target face image and the second image feature of at least one legal face image, and its output is whether the first target face image is legal. In some embodiments, the second machine learning model may include, but is not limited to, a Visual Geometry Group network (VGG) model, an Inception Net model, a Fully Convolutional Network (FCN) model, a Segmentation Network (SegNet) model, a Mask Region-based Convolutional Neural Network (Mask R-CNN) model, and the like.
In some embodiments, when the image acquisition module 220 trains the second machine learning model, a plurality of labeled sample images may be used as training data, and the parameters of the model may be learned in a common manner (e.g., gradient descent), where a sample image may include a face image of at least one face, and the label may be whether the sample image is legal. In some embodiments, the second machine learning model may be trained in another device or module.
In step 340, if the first response operation is not the target operation, the following is repeatedly executed: obtaining a second image, wherein the second image includes a second target face image, and determining the similarity between the first target face image and the second target face image, until the similarity satisfies the preset condition; when the similarity satisfies the preset condition, the intelligent security device is controlled to execute a second response operation based on the second target face image. In some embodiments, this step 340 may be performed by the smart security device control module 230.
In some embodiments, the target operation may be an operation performed by the smart security device after determining that the first target face image is a legal face image (e.g., the smart lock 170 unlocks and/or unlocks a fingerprint recognition component, etc.).
In some embodiments, if the first target face image cannot be recognized as a legal face image because the face is occluded (e.g., the user is wearing a mask), the user would otherwise need to leave the detection area and re-enter it to trigger the next face recognition even after removing the occlusion, which is a poor user experience. To improve the user experience, after the first target face image is recognized as not being a legal face image, a second target face image may be obtained, so that after the user removes the facial occlusion, the smart security device control module 230 may directly recognize the second target face image without the user leaving the detection area.
The second image may be an image containing image information of the detection area, acquired again by the camera acquisition device (e.g., the camera acquisition device 160) after it acquires the first image.
In some embodiments, if the first response operation is not the target operation, the smart security device control module 230 may control a camera acquisition device (e.g., the camera acquisition device 160) to acquire the second image. The manner in which the camera capture device (e.g., camera capture device 160) acquires the second image is similar to the manner in which the camera capture device (e.g., camera capture device 160) acquires the first image, and further description regarding acquiring the second image can be found in relation to acquiring the first image.
The second image may include a second target face image, and the second target face image may be one face image in the second image. In some embodiments, the second image may include a face image of at least one face, and the image obtaining module 220 may determine the second target face image from the face image of the at least one face. The manner in which the second target face is determined from the second image is similar to the manner in which the first target face is determined from the first image, and further description of obtaining the second target face can be found in relation to obtaining the first target face.
In some embodiments, the smart security device control module 230 may also determine a similarity between the first target face image and the second target face image. In some embodiments, the smart security device control module 230 may determine the similarity between the first target face image and the second target face image based on the face image feature of the first target face image (i.e., the third image feature) and the face image feature of the second target face image (i.e., the fourth image feature). In some embodiments, the third image feature may include facial features of the facial image/key features of the face (e.g., facial features), face location, and the like.
In some embodiments, the smart security device control module 230 may determine the similarity between the first target face image and the second target face image based on at least one of key features of the facial features/faces, and face positions. For example, the more similar the key features of the facial features/faces of the first target face image and the second target face image, the higher the similarity between the first target face image and the second target face image. For example, the closer the face position of the first target face image is to the face position of the second target face image, the higher the similarity between the first target face image and the second target face image.
In some embodiments, the preset condition may be a condition for determining whether the second image needs to be acquired again. In some embodiments, the preset condition may be that the similarity between the first target face image and the second target face image is less than a preset similarity threshold (e.g., 50%). For example, when the similarity between the first target face image and the second target face image is smaller than the preset similarity threshold (e.g., 50%), the smart security device control module 230 may perform face recognition again based on the second target face image of the second image; when the similarity is greater than the preset similarity threshold, the smart security device control module 230 may control the camera acquisition device (e.g., the camera acquisition device 160) to obtain a new second image and determine the similarity between the second target face image in that image and the first target face image, until the similarity between the first target face image and the second target face image in a newly obtained second image is less than the preset similarity threshold.
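A control-flow sketch of this repeat-until step: re-acquire second images until the similarity to the first target face drops below the preset threshold, i.e., the face has changed enough to be worth recognizing again. The helper functions capture_second_target() and similarity() are hypothetical stand-ins.

```python
PRESET_THRESHOLD = 0.5  # e.g., 50%

def wait_for_changed_face(first_target, capture_second_target, similarity):
    while True:
        second_target = capture_second_target()   # second target face image
        if similarity(first_target, second_target) < PRESET_THRESHOLD:
            return second_target  # preset condition met: recognize this face
```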
In some embodiments, the smart security device control module 230 may determine the similarity between the first target face image and the second target face image based on the third machine learning model and determine whether the second image satisfies a preset condition.
As shown in fig. 5, the third machine learning model may include a feature extraction layer 503, a similarity calculation layer 506, and a discrimination layer 508.
In some embodiments, the third machine learning model may analyze the first target face image and the second target face image separately, and obtain a similarity between a third image feature of the first target face image and a fourth image feature of the second target face image in the pair of images.
The feature extraction layer 503 may be configured to process the first target face image 501 and the second target face image 502 to obtain a third image feature and a fourth image feature, respectively.
The third image feature is a facial feature of the face/key feature of the face (e.g., a feature of five sense organs) in the first target face image 501. The fourth image feature is a facial feature of the face/key feature of the face (e.g., a feature of five sense organs) in the second target face image 502.
In some embodiments, the feature extraction layer 503 may include a convolutional neural network (CNN) model such as ResNet, ResNeXt, SE-Net, DenseNet, MobileNet, ShuffleNet, RegNet, EfficientNet, or Inception, or a recurrent neural network model.
The inputs to the feature extraction layer 503 may be a first target face image 501 and a second target face image 502. For example, the first target face image 501 and the second target face image 502 may be spliced and input to the feature extraction layer 503. The output of the feature extraction layer 503 may be a third image feature 504 of the first target face image and a fourth image feature 505 of the second target face image.
The similarity calculation layer 506 may be configured to calculate a similarity between the third image feature 504 of the first target face image and the fourth image feature 505 of the second target face image. For example, the intelligent security device control module 230 may input the third image feature 504 of the first target face image and the fourth image feature 505 of the second target face image into the similarity calculation layer 506, and the similarity calculation layer 506 outputs the similarity 507 between the first target face image 501 and the second target face image 502.
The determination layer 508 may determine whether the similarity 507 satisfies a predetermined condition. Specifically, the discrimination layer 508 may compare the similarity 507 with a similarity threshold, and if the similarity 507 is greater than the similarity threshold, the similarity 507 does not satisfy a preset condition.
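A hedged PyTorch sketch of the three-layer structure in FIG. 5: a shared feature extraction layer, a similarity calculation layer, and a discrimination layer. The class name ThirdModel, the small CNN backbone, and the sizes are illustrative choices, not the patent's architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ThirdModel(nn.Module):
    def __init__(self, threshold=0.5):
        super().__init__()
        self.threshold = threshold
        # Feature extraction layer 503 (a small CNN stand-in for ResNet etc.)
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(32, 128),
        )

    def forward(self, first_face, second_face):
        feat3 = self.backbone(first_face)    # third image feature 504
        feat4 = self.backbone(second_face)   # fourth image feature 505
        # Similarity calculation layer 506: cosine similarity in [-1, 1].
        sim = F.cosine_similarity(feat3, feat4, dim=1)   # similarity 507
        # Discrimination layer 508: preset condition met if sim < threshold.
        meets_condition = sim < self.threshold
        return sim, meets_condition
```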
In some embodiments, the third machine learning model may be a machine learning model with preset parameters. The preset parameters refer to the model parameters learned during the training of the machine learning model. Taking a neural network as an example, the model parameters include the weights and biases. The preset parameters of the third machine learning model are generated through a training process; for example, the smart security device control module 230 may train an initial third machine learning model on a plurality of labeled training samples to obtain the third machine learning model.
The training samples include one or more sample image pairs with labels. Each sample image pair includes a first sample image and a second sample image. Wherein the first sample image may comprise a first face image. The second sample image contains a second face image. The label of the training sample may indicate whether the similarity between the second sample image and the first sample image satisfies a preset condition.
In some embodiments, the intelligent security device control module 230 may input the training samples into the initial third machine learning model and update the parameters of the initial feature extraction layer, the initial similarity calculation layer, and the initial discrimination layer through training until the updated third machine learning model satisfies a training termination condition. The updated third machine learning model may then be designated as the trained third machine learning model, where the termination condition may be that the loss function of the updated model falls below a threshold, the loss converges, or the number of training iterations reaches a threshold.
In some embodiments, the intelligent security device control module 230 may train the initial feature extraction layer, the initial similarity calculation layer, and the initial discrimination layer of the third machine learning model in an end-to-end manner. End-to-end training inputs a training sample into the initial model, determines a loss value based on the model's output, and updates the initial model based on the loss value. The initial model may contain multiple sub-models or modules that perform different data processing operations; during training they are treated as a whole and updated simultaneously. For example, in the initial third machine learning model, the first sample image and at least one second sample image may be input into the initial feature extraction layer, a loss function may be established based on the output of the initial discrimination layer and the label, and the parameters of all initial layers of the third machine learning model may be updated simultaneously based on the loss function.
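A hedged sketch of such end-to-end training, under the assumptions of the previous block: a single loss computed from the discrimination-side output updates all layers at once. The binary-cross-entropy loss on a rescaled similarity and the Adam optimizer are choices made for this example; the patent does not specify them.

```python
# One loss, computed from the discrimination-side output and the label,
# updates the feature extraction, similarity, and discrimination
# parameters simultaneously (end-to-end).
import torch
import torch.nn.functional as F

model = ThirdModel()  # initial third machine learning model (sketch above)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

def train_step(first_batch, second_batch, labels):
    """labels: 1.0 if the sample pair satisfies the preset condition
    (similarity should be low), else 0.0."""
    sim, _ = model(first_batch, second_batch)
    # Map cosine similarity in [-1, 1] to a "changed" probability in [0, 1].
    changed_prob = ((1.0 - sim) / 2.0).clamp(0.0, 1.0)
    loss = F.binary_cross_entropy(changed_prob, labels)
    optimizer.zero_grad()
    loss.backward()   # gradients flow through all initial layers at once
    optimizer.step()
    return loss.item()
```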
In some embodiments, the third machine learning model may be pre-trained by the processing device or a third party and stored in the storage device, and the processing device may invoke the third machine learning model directly from the storage device.
In some embodiments, when the similarity between the first target face image and the second target face image is smaller than a preset similarity threshold (for example, 50%), it may be determined that the face presented for recognition has changed significantly (for example, the face corresponding to the first target face image has removed a facial occlusion, or a different face requiring recognition has appeared), and the intelligent security device control module 230 may control the intelligent security device to perform the second response operation based on the second target face image. The second response operation is an operation executed by the intelligent security device based on the result of face recognition on the second target face image, for example, unlocking, further locking (e.g., controlling the second bolt to extend when the first bolt is in an extended state), turning on fingerprint recognition, sending a prompt message, and the like.
In some embodiments, the intelligent security device control module 230 may determine whether the second target face image is legal by determining whether the legal face set contains a legal face image similar to the second target face image, and thereby determine whether to control the device to execute the second response operation. For example, when the legal face set contains a legal face image similar to the second target face image, the intelligent security device control module 230 may determine that the second target face image is legal and control the intelligent lock 170 to unlock and/or enable a fingerprint identification component; when the legal face set contains no legal face image similar to the second target face image, the intelligent security device control module 230 may determine that the second target face image is not legal and control the intelligent lock 170 to perform further locking and/or control the prompting component to send out prompt information (e.g., voice information, light information, etc.).
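The legality check can be illustrated as a nearest-match test against embeddings of the legal face set. A minimal sketch, assuming cosine similarity over embeddings from the same feature extraction layer; the 0.7 match threshold and the returned action names are illustrative assumptions, not the module's actual interface.

```python
# Minimal sketch of the legality check against the legal face set.
import torch
import torch.nn.functional as F

def check_legality(second_face_embedding: torch.Tensor,
                   legal_face_embeddings: torch.Tensor,
                   match_threshold: float = 0.7) -> str:
    """second_face_embedding: (D,); legal_face_embeddings: (N, D)."""
    sims = F.cosine_similarity(legal_face_embeddings,
                               second_face_embedding.unsqueeze(0), dim=1)
    if bool((sims > match_threshold).any()):
        return "unlock"       # a similar legal face image exists
    return "lock_and_prompt"  # no similar legal face image exists
```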
In some embodiments, the intelligent security device control module 230 may further determine whether the second target face image is legal based on the legal face set through the second machine learning model.
In some embodiments, the intelligent security device control module 230 may control the camera acquisition device to repeatedly obtain the second image until the similarity between the first target face image and the second target face image satisfies the preset condition. In this way, after the user completes one face recognition, recognition is not repeated while the user's position or face has not changed significantly, which effectively reduces the number of face recognitions and improves unlocking efficiency.
In some embodiments, after the intelligent security device control module 230 determines through face recognition that the first target face image is not a legal face image, it may repeatedly obtain new images for face recognition until the recognition passes (i.e., the face is legal). In some embodiments, the number of face recognitions performed by the intelligent security device control module 230 may be bounded by a preset recognition count threshold: if the number of completed face recognitions is close to the threshold (for example, the difference is less than 2), the intelligent security device control module 230 may stop obtaining images and stop face recognition. In some embodiments, after stopping face recognition, the intelligent security device control module 230 may further send prompt information (e.g., voice information, light information, etc.) to inform the user that face recognition has stopped.
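The bounded retry behaviour might look like the following sketch. The helper functions and the threshold value of 5 are hypothetical; stopping when the remaining margin drops below 2 follows the example in the text.

```python
# Sketch of bounded retries of face recognition.
RECOGNITION_COUNT_THRESHOLD = 5  # assumed value

def recognize_with_retries(capture_image, recognize_face, send_prompt) -> bool:
    completed = 0
    # Continue only while the completed count is not "close" to the
    # threshold (difference of at least 2, per the example above).
    while RECOGNITION_COUNT_THRESHOLD - completed >= 2:
        image = capture_image()
        completed += 1
        if recognize_face(image):  # the face is legal
            return True
    send_prompt("face recognition stopped")  # inform the user
    return False
```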
In some embodiments, the user may determine whether face recognition passed based on the response operation performed by the device. In some embodiments, the user may also determine the number of face recognitions based on the number of response operations performed by the device. For example, if the intelligent security device sends prompt information twice and then unlocks once, the user can infer that face recognition was performed three times: the first two recognitions failed and the third passed.
In some embodiments, to improve the integrity and clarity of the first target face image and/or the second target face image obtained by the camera acquisition device (e.g., the camera acquisition device 160) and to facilitate subsequent face recognition based on those images, the image acquisition module 220 may determine a target shooting area before the camera acquisition device obtains the first image; the camera acquisition device may then obtain the first target face image and/or the second target face image when a human body stands at or near the target shooting area.
FIG. 4 is an exemplary flow diagram illustrating acquisition of a first image based on a target shooting area according to some embodiments of the present description. As shown in fig. 4, the process 400 includes the following steps. In some embodiments, flow 400 may be performed by server 110.
At step 410, a pre-image is acquired based on the environmental information. In some embodiments, this step 410 may be performed by the image acquisition module 220.
In some embodiments, before acquiring the first image, the image acquisition module 220 may acquire a pre-image based on the environmental information; the pre-image is an image, acquired by a camera acquisition device (e.g., the camera acquisition device 160) before the first image, that contains a human body located within the detection area.
In some embodiments, when the information acquisition module 210 determines that an obstacle exists in the detection area based on the environmental information acquired by the information acquisition device 150, the image acquisition module 220 may control the camera acquisition device 160 to acquire a pre-image. In some embodiments, when the image acquired by the primary camera of the camera acquisition device 160 does not include a human body, the camera acquisition device 160 may acquire the pre-image through the secondary camera, so that the image acquisition module 220 can subsequently determine the current distance between the human body and the camera acquisition device based on the pre-image while ensuring that the acquired pre-image includes a human body.
In some embodiments, to prevent an obstacle other than a living body from triggering acquisition of the pre-image, the image acquisition module 220 may control the camera acquisition device 160 to acquire the pre-image only when the information acquisition module 210 determines that a living body exists in the detection area based on the environmental information acquired by the information acquisition device 150.
In some embodiments, to prevent a living body other than a human body (e.g., a pet) from triggering acquisition of the pre-image, the image acquisition module 220 may control the camera acquisition device 160 to acquire the pre-image when the information acquisition module 210 determines that a human body exists in the detection area based on the environmental information acquired by the information acquisition device 150; when the information acquisition module 210 determines that no human body exists in the detection area, the camera acquisition device 160 may not acquire the pre-image. For example, when the sensor of the information acquisition device 150 acquires at least one of blood oxygen, heart rate, and finger vein, the image acquisition module 220 may control the camera acquisition device 160 to acquire the pre-image.
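The trigger logic of this and the preceding paragraphs can be condensed into a small predicate. A sketch under assumed field names; a real implementation would query the information acquisition device 150 directly.

```python
# Capture the pre-image only when the environment indicates a human
# body, not merely an obstacle or a non-human living body. The field
# names below are hypothetical.
def should_capture_pre_image(env: dict) -> bool:
    if not env.get("obstacle_in_detection_area", False):
        return False  # nothing in the detection area at all
    # Vital-sign readings distinguish a human body from other obstacles.
    human_signals = ("blood_oxygen", "heart_rate", "finger_vein")
    return any(env.get(signal) is not None for signal in human_signals)
```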
In step 420, a target shooting area is determined based on the pre-image. In some embodiments, this step 420 may be performed by the image acquisition module 220.
In some embodiments, the pre-image may include an image of at least one human body. In some embodiments, the image acquisition module 220 may determine the image of the target human body from the image of the at least one human body. In some embodiments, the image acquisition module 220 may determine the target human body from the at least one human body based on features of the at least one human body's face image. For example, the image acquisition module 220 may determine the ratio of each face's area to the total area of the pre-image and take the human body corresponding to the face image with the largest ratio as the target human body. For another example, the image acquisition module 220 may determine whether the image of each human body includes a complete facial-feature image and take the human body whose image includes complete facial features as the target human body.
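A minimal sketch of this target-human selection, under a simplified face record: human bodies with complete facial features are preferred, and among them the face occupying the largest share of the pre-image wins.

```python
# Sketch of target-human selection; the FaceCandidate record is a
# simplified stand-in for the module's data.
from dataclasses import dataclass

@dataclass
class FaceCandidate:
    face_area: float             # face area in the pre-image, in pixels
    has_complete_features: bool  # complete facial-feature image present

def pick_target_human(faces: list[FaceCandidate],
                      pre_image_area: float) -> FaceCandidate:
    complete = [f for f in faces if f.has_complete_features]
    candidates = complete or faces  # fall back if none is complete
    # Largest ratio of face area to total pre-image area wins.
    return max(candidates, key=lambda f: f.face_area / pre_image_area)
```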
In some embodiments, the image acquisition module 220 may determine the recognition intention of each human body from the image of the at least one human body and determine the target human body from the at least one human body based on the recognition intention. For more details regarding determining the recognition intention, reference may be made to FIG. 3 and its associated description.
In some embodiments, the target shooting area is an area within the detection area. In some embodiments, when the target human body is located in the target shooting area, the first image and/or the second image acquired by the camera acquisition device 160 are more complete and clear, which facilitates subsequent face recognition by the intelligent security device control module 230 based on the first target face image and/or the second target face image.
In some embodiments, the image acquisition module 220 may determine the target shooting area based on the pre-image.
In some embodiments, the image acquisition module 220 may determine human body feature information based on the pre-image, where the human body feature information is information related to the body type of the target human body, such as height.
In some embodiments, the pre-image may be a depth map containing depth information. In some embodiments, the image acquisition module 220 may determine the human body feature information based on the depth information of the depth map. For example, the image acquisition module 220 may take the camera acquisition device as the coordinate origin, convert the depth map into point cloud data via coordinate conversion, and extract the point cloud data of the target human body; it may then obtain the three-dimensional coordinates of the highest and lowest points of the human body in that point cloud and calculate the height difference between them (i.e., the height of the human body).
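The height computation can be sketched as follows, assuming a pinhole camera model with known intrinsics; the intrinsic values below are placeholders, and a real device would use its own calibration.

```python
# Convert the depth map into a point cloud with the camera acquisition
# device at the origin, then take the vertical span of the target
# human body's points.
import numpy as np

def estimate_height(depth: np.ndarray, human_mask: np.ndarray,
                    fy: float = 500.0, cy: float = 240.0) -> float:
    """depth: HxW map in metres; human_mask: HxW bool for the target body."""
    v, u = np.nonzero(human_mask)
    z = depth[v, u]
    y = (v - cy) * z / fy  # vertical coordinate (image y grows downward)
    # Height = difference between the highest and lowest body points.
    return float(y.max() - y.min())
```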
In some embodiments, the image acquisition module 220 may determine the target shooting area based on a target angle of the target human body relative to the camera acquisition device 160 and a target distance between the target human body and the camera acquisition device 160. In some embodiments, the image acquisition module 220 may pre-store the target angle (e.g., -5°) of the target human body with respect to the camera acquisition device 160. In some embodiments, the image acquisition module 220 may determine the target distance between the target human body and the camera acquisition device 160 based on the human body feature information. For example, the image acquisition module 220 may determine the target distance based on height: for a height of 1 meter, the target distance may be 1.2 meters; for a height of 1.8 meters, 1.5 meters.
In some embodiments, the image acquisition module 220 may determine the target distance between the target person and the camera acquisition device 160 based on the following formula:
L=aH+N;
where L is the target distance, a is a preset positive coefficient, H is the height, and N is a preset fixed distance (e.g., 1 meter).
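As a worked example of this formula: the constants below (a = 0.375, N = 0.825 m) are fitted to the two illustrative height/distance pairs above and are assumptions, not patented values; note they differ from the text's example N of 1 meter.

```python
# Worked example of L = aH + N, with constants fitted to the two
# illustrative pairs above (1.0 m -> 1.2 m, 1.8 m -> 1.5 m).
def target_distance(height_m: float, a: float = 0.375,
                    n: float = 0.825) -> float:
    return a * height_m + n

assert abs(target_distance(1.0) - 1.2) < 1e-9
assert abs(target_distance(1.8) - 1.5) < 1e-9
```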
In some embodiments, because the image acquisition module 220 determines the target shooting area based on the human body feature information, the target shooting area can be adjusted for different people; the images obtained for people with different body features are therefore more complete and clear, which facilitates subsequent face recognition based on the first target face image and/or the second target face image.
Step 430, acquiring a first image based on the target shooting area. In some embodiments, this step 430 may be performed by the image acquisition module 220.
In some embodiments, the image acquisition module 220 may acquire the first image when the target human body is close to the target shooting area.
In some embodiments, the image acquisition module 220 may acquire the current position of the target human body. In some embodiments, the image acquisition module 220 may determine the current position of the target human body based on the pre-image, where the current position is information related to the position of the target human body relative to the camera acquisition device, such as the current distance and the current angle between the target human body and the camera acquisition device.
In some embodiments, the pre-image may be a depth map containing depth information. In some embodiments, the image acquisition module 220 may determine the current distance between the target human body and the camera acquisition device based on the depth values (pixel values) of the target human body image in the pre-image.
In some embodiments, the image acquisition module 220 may determine the current angle of the target human body relative to the camera acquisition device based on the depth information of the depth map. For example, the image acquisition module 220 may take the camera acquisition device as the coordinate origin, convert the depth map into point cloud data via coordinate conversion, extract the point cloud data of the target human body, and determine the angle from the three-dimensional coordinates of that point cloud. Specifically, the image acquisition module 220 may randomly select several points from the target human body's point cloud as sampling points, take the mean of their three-dimensional coordinates as a representative coordinate of the target human body, and calculate the angle between that representative coordinate and the origin (i.e., the current angle between the target human body and the camera acquisition device).
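A sketch of this current-angle estimation, assuming the camera at the origin with z along the optical axis; the sample size and the planar angle convention are assumptions.

```python
# Average a random sample of the target human body's point cloud and
# measure the horizontal angle of that representative point from the
# camera's optical axis.
import numpy as np

def current_angle_deg(points: np.ndarray, n_samples: int = 50) -> float:
    """points: (N, 3) point cloud of the target human body, camera at
    the origin, z along the optical axis, x pointing right."""
    rng = np.random.default_rng()
    idx = rng.choice(len(points), size=min(n_samples, len(points)),
                     replace=False)
    centroid = points[idx].mean(axis=0)  # one 3-D point for the human body
    return float(np.degrees(np.arctan2(centroid[0], centroid[2])))
```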
In some embodiments, the image acquisition module 220 may also obtain the current distance between the target human body and the camera acquisition device in other ways, for example, by laser ranging, structured light, signal interference, and the like.
In some embodiments, the image acquisition module 220 may determine whether the target human body is close to the target shooting area based on the current distance and the current angle between the target human body and the camera acquisition device. For example, when the absolute value of the difference between the current angle and the target angle is greater than an angle difference threshold (e.g., 3°), the image acquisition module 220 may determine that the target human body is not close to the target shooting area. Likewise, when the absolute value of the difference between the current distance and the target distance is greater than a distance difference threshold (e.g., 20 cm), the image acquisition module 220 may determine that the target human body is not close to the target shooting area.
In some embodiments, when the image acquisition module 220 determines that the target human body is not located in the target shooting area, it may generate prompt information for guiding the user toward the target shooting area based on the current position of the target human body and the target shooting area. In some embodiments, the image acquisition module 220 may determine the prompt information based on the difference between the current angle and the target angle and the difference between the current distance and the target distance, so as to guide the user to adjust quickly from the current position to the target shooting area.
In some embodiments, the prompt message may be a voice message, a text message, an image message, and the like. For example, the prompt message may be a voice message "please walk 10cm straight ahead". For another example, the prompt message may be a text message "please walk 5cm to the left" displayed in the screen display area of the client. For another example, the prompt message may be a light projected to the target shooting area.
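A sketch combining the proximity test of the preceding paragraphs with prompt generation follows; the 3° and 20 cm thresholds match the examples above, while the message wording and direction heuristics are illustrative assumptions.

```python
# Proximity test plus guidance-prompt generation.
def guide_to_target(cur_angle: float, cur_dist: float,
                    tgt_angle: float, tgt_dist: float,
                    angle_thr: float = 3.0, dist_thr: float = 0.20):
    angle_off = cur_angle - tgt_angle
    dist_off = cur_dist - tgt_dist
    if abs(angle_off) <= angle_thr and abs(dist_off) <= dist_thr:
        return None  # target human body is close to the target shooting area
    if abs(dist_off) > dist_thr:
        direction = "straight ahead" if dist_off > 0 else "backward"
        return f"please walk {abs(dist_off) * 100:.0f}cm {direction}"
    side = "left" if angle_off > 0 else "right"  # sign convention assumed
    return f"please step to the {side}"

# e.g. guide_to_target(-5.0, 1.5, -5.0, 1.2) -> "please walk 30cm straight ahead"
```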
In some embodiments, the image acquisition module 220 may further display the real-time image of the user collected by the camera acquisition device 160 on a display screen installed outside the door and mark the target shooting area on that image, so that the user can watch the display and quickly adjust from the current position to the target shooting area.
Having thus described the basic concept, it will be apparent to those skilled in the art that the foregoing detailed disclosure is to be regarded as illustrative only and not as limiting the present specification. Various modifications, improvements and adaptations to the present description may occur to those skilled in the art, though not explicitly described herein. Such modifications, improvements and adaptations are proposed in the present specification and thus fall within the spirit and scope of the exemplary embodiments of the present specification.
Also, the description uses specific words to describe embodiments of the specification. Reference to "one embodiment," "an embodiment," and/or "some embodiments" means a feature, structure, or characteristic described in connection with at least one embodiment of the specification. Therefore, it is emphasized and should be appreciated that two or more references to "an embodiment" or "one embodiment" or "an alternative embodiment" in various places throughout this specification are not necessarily all referring to the same embodiment. Furthermore, some features, structures, or characteristics of one or more embodiments of the specification may be combined as appropriate.
Additionally, the order in which elements and sequences are described in this specification, the use of numbers or letters, and other designations are not intended to limit the order of the processes and methods described herein, unless explicitly stated in the claims. While certain presently contemplated useful embodiments of the invention have been discussed in the foregoing disclosure by way of various examples, it is to be understood that such detail is solely for that purpose and that the appended claims are not limited to the disclosed embodiments but, on the contrary, are intended to cover all modifications and equivalent arrangements within the spirit and scope of the embodiments described herein. For example, although the system components described above may be implemented by hardware devices, they may also be implemented by software-only solutions, such as installing the described system on an existing server or mobile device.
Similarly, it should be noted that in the preceding description of embodiments of the present specification, various features are sometimes grouped together in a single embodiment, figure, or description thereof for the purpose of streamlining the disclosure and aiding in the understanding of one or more of the embodiments. This method of disclosure, however, is not to be interpreted as reflecting an intention that the claimed subject matter requires more features than are expressly recited in each claim. Indeed, claimed embodiments may have fewer than all of the features of a single embodiment disclosed above.
Numerals describing quantities of components and attributes are used in some embodiments; it should be understood that such numerals are, in some instances, modified by the qualifier "about," "approximately," or "substantially." Unless otherwise indicated, "about," "approximately," or "substantially" indicates that the number allows a variation of ±20%. Accordingly, in some embodiments, the numerical parameters set forth in the specification and claims are approximations that may vary depending upon the desired properties sought by a particular embodiment. In some embodiments, the numerical parameters should take into account the specified significant digits and employ a general digit-preserving approach. Notwithstanding that the numerical ranges and parameters used to confirm the breadth of some embodiments of the specification are approximations, in specific embodiments such numerical values are set forth as precisely as practicable.
For each patent, patent application publication, and other material, such as articles, books, specifications, publications, documents, etc., cited in this specification, the entire contents of each are hereby incorporated by reference into this specification. Application history documents that are inconsistent with or conflict with the contents of this specification are excluded, as are documents that would limit the broadest scope of the claims now or later appended to this specification. If the descriptions, definitions, and/or uses of terms in the accompanying materials of this specification are inconsistent or contrary to those stated in this specification, the descriptions, definitions, and/or uses of terms in this specification shall control.
Finally, it should be understood that the embodiments described herein are merely illustrative of the principles of the embodiments of the present disclosure. Other variations are also possible within the scope of the present description. Thus, by way of example, and not limitation, alternative configurations of the embodiments of the present specification can be seen as consistent with the teachings of the present specification. Accordingly, the embodiments of the present description are not limited to only those embodiments explicitly described and depicted herein.

Claims (13)

1. A control method of intelligent security equipment based on face recognition is characterized by comprising the following steps:
acquiring environmental information;
acquiring a first image based on the environment information, wherein the first image comprises a first target face image;
performing face recognition on the first target face image, and controlling equipment to execute a first response operation based on a recognition result of the first target face image;
if the first response operation is not the target operation, repeatedly executing to obtain a second image, wherein the second image comprises a second target face image, and determining the similarity between the first target face image and the second target face image until the similarity meets a preset condition;
and carrying out face recognition on the second target face image with the similarity meeting the preset condition, and controlling the equipment to execute a second response operation based on the recognition result of the second target face image.
2. The method of claim 1, wherein said obtaining a first image based on said environmental information comprises:
judging whether living bodies exist in the detection area or not based on the environmental information;
and if the living body exists in the detection area, acquiring the first image.
3. The method according to claim 1, wherein the preset condition comprises that the similarity between the first target face image and the second target face image is less than a preset similarity threshold.
4. The method according to claim 1, wherein the environmental information includes at least information related to determining whether an obstacle exists in the detection area and/or information related to determining whether a living body exists in the detection area.
5. The method of claim 1, wherein the obtaining a first image based on the environmental information comprises:
judging whether a human body exists in the detection area or not based on the environment information;
and if the human body exists, acquiring the first image.
6. The method of claim 5, wherein the acquiring the first image comprises:
acquiring a pre-image based on the environment information;
determining a target shooting area based on the pre-image;
and acquiring the first image based on the target shooting area.
7. The method of claim 6, wherein the determining a target capture area based on the pre-image comprises:
determining human body feature information based on the pre-image;
and determining a target shooting area based on the human body characteristic information.
8. The method of claim 6, wherein the acquiring the first image further comprises:
acquiring the current position of the human body based on the pre-image;
and acquiring the first image based on the current position and the target shooting area.
9. The method according to claim 1, wherein the controlling the device to perform a first response operation based on the first target facial image comprises:
determining the legality of the first target face image based on a legal face set, wherein the legal face set comprises at least one legal face image;
and controlling the equipment to execute the first response operation based on the validity judgment.
10. The method of claim 1, wherein the first image comprises at least one human face;
acquiring the first target face image from the first image, wherein the acquiring comprises the following steps:
acquiring a face image of at least one face in the first image;
determining face state information of the at least one face based on a face image of the at least one face;
and determining the first target face image from the face images of the at least one face based on the face state information of the at least one face.
11. An intelligent security device control system based on face recognition, characterized by comprising:
the information acquisition module is used for acquiring environmental information;
the image acquisition module is used for acquiring a first image based on the environment information, wherein the first image comprises a first target face image;
the intelligent security equipment control module is used for controlling equipment to execute a first response operation based on the first target face image, and is also used for repeatedly executing control to the image acquisition module to acquire a second image if the first response operation is not the target operation, wherein the second image comprises a second target face image, and the similarity between the first target face image and the second target face image is determined until the similarity meets a preset condition, and is also used for controlling the equipment to execute a second response operation based on the second target face image when the similarity meets the preset condition.
12. A computer device comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor implements the method according to any one of claims 1-10 when executing the computer program.
13. A computer-readable storage medium storing computer instructions, wherein when the computer instructions in the storage medium are read by a computer, the computer performs the method of any one of claims 1-10.
CN202211049524.0A 2021-12-08 2021-12-08 Intelligent security equipment control method and system based on face recognition Pending CN115577337A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211049524.0A CN115577337A (en) 2021-12-08 2021-12-08 Intelligent security equipment control method and system based on face recognition

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202211049524.0A CN115577337A (en) 2021-12-08 2021-12-08 Intelligent security equipment control method and system based on face recognition
CN202111493108.5A CN113901423B (en) 2021-12-08 2021-12-08 Intelligent security equipment control method and system based on face recognition

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
CN202111493108.5A Division CN113901423B (en) 2021-12-08 2021-12-08 Intelligent security equipment control method and system based on face recognition

Publications (1)

Publication Number Publication Date
CN115577337A true CN115577337A (en) 2023-01-06

Family

ID=79025793

Family Applications (2)

Application Number Title Priority Date Filing Date
CN202111493108.5A Active CN113901423B (en) 2021-12-08 2021-12-08 Intelligent security equipment control method and system based on face recognition
CN202211049524.0A Pending CN115577337A (en) 2021-12-08 2021-12-08 Intelligent security equipment control method and system based on face recognition

Family Applications Before (1)

Application Number Title Priority Date Filing Date
CN202111493108.5A Active CN113901423B (en) 2021-12-08 2021-12-08 Intelligent security equipment control method and system based on face recognition

Country Status (1)

Country Link
CN (2) CN113901423B (en)

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102542299B (en) * 2011-12-07 2015-03-25 惠州Tcl移动通信有限公司 Face recognition method, device and mobile terminal capable of recognizing face
JP6472184B2 (en) * 2014-07-29 2019-02-20 キヤノン株式会社 Object identification device, object identification method, and program
CN110458062A (en) * 2019-07-30 2019-11-15 深圳市商汤科技有限公司 Face identification method and device, electronic equipment and storage medium
CN113656761B (en) * 2021-08-10 2024-04-05 深圳壹账通智能科技有限公司 Business processing method and device based on biological recognition technology and computer equipment

Also Published As

Publication number Publication date
CN113901423A (en) 2022-01-07
CN113901423B (en) 2022-09-13


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination