CN106874830B - Assistance method for visually impaired people based on RGB-D camera and face recognition - Google Patents
Assistance method for visually impaired people based on RGB-D camera and face recognition
- Publication number
- CN106874830B (application CN201611140457.8A)
- Authority
- CN
- China
- Prior art keywords
- face
- image
- pixel
- depth
- facial image
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/161—Detection; Localisation; Normalisation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/50—Information retrieval; Database structures therefor; File system structures therefor of still image data
- G06F16/58—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
- G06F16/583—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/172—Classification, e.g. identification
Abstract
The invention discloses an assistance method for visually impaired people based on an RGB-D camera and face recognition. The method comprises: tracking faces using the color and depth images collected by the RGB-D camera, and automatically assigning labels to these faces; the labels are entered by the user through a microphone and include, but are not limited to, the person's name, personal information, and telephone number; facial images are corrected by frontalization so that faces in different poses can be recognized; the corrected facial images are used to train a face recognition model in a neural network; a facial image to be recognized is input into the trained face recognition model, and the recognition result output by the model is conveyed to the user through 3D stereo sound; the information conveyed by the 3D stereo sound includes the direction of the face and the distance of the face from the user, obtained from the depth image.
Description
Technical field
The present invention relates to the technical fields of pattern classification, machine learning, face recognition, and assistance for people with visual impairment, and in particular to an assistance method for visually impaired people based on an RGB-D camera and face recognition.
Background art
According to data from the World Health Organization (WHO), there are 285 million visually impaired people worldwide, of whom 39 million are blind. In the daily life of visually impaired people, identifying the people around them is a prominent need. Without assistive devices, a visually impaired person can only recognize others by their voice, which is largely limited by the person's familiarity with those around them, the distance, and factors such as ambient noise. Traditional face recognition methods generally capture facial images with a color camera and require a frontal face under uniform illumination; this means that during face sample acquisition the face must be as close to the camera as possible and facing it directly. Therefore, a face recognition system designed specifically for visually impaired people, using a natural mode of interaction, would provide them with great convenience.
Summary of the invention
The purpose of the present invention is to use an RGB-D camera and face recognition technology to address the difficulty visually impaired people have in recognizing and identifying others, and to provide them with an assistance method that is easy to use and offers humanized interaction.
The present invention is achieved through the following technical solution: an assistance method for visually impaired people based on an RGB-D camera and face recognition, comprising the following steps: (1) face entry and establishment of a face database; (2) correction of facial images; (3) neural network training; (4) face recognition; (5) interaction conveying the recognition result through 3D stereo sound.
Step (1) specifically: for each person to be recognized, acquire multiple consecutive frames of color images and depth images, then detect the facial image through the color image channel of the RGB-D camera, and use the facial image detected in the first frame as the initialization starting point for face tracking. If a face is missed or misdetected in the n-th frame, the face tracking mode is started to locate the face region. The facial image data of all persons to be recognized, together with the corresponding names, are entered to establish the face database.
The face tracking mode comprises the following steps:
First, from the face detected in the (n-1)-th frame, compute separately the histograms of the face region in the color image and the depth image. The abscissa of the color histogram is the color value and the ordinate is the number of pixels with each color value; the abscissa of the depth histogram is the depth value and the ordinate is the number of pixels with each depth value.
Second, in the n-th frame, compute the back-projection images of the color image and the depth image. The back-projection of the color image is obtained by replacing the color value of each pixel of the color image with the corresponding ordinate in the color histogram; the back-projection of the depth image is obtained by replacing the depth value of each pixel of the depth image with the corresponding ordinate in the depth histogram. After the two back-projection images are fused, a face region prediction that better matches the actual situation is obtained.
Third, apply the mean-shift algorithm (MeanShift) to the fused back-projection image to compute the face region in the n-th frame.
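For illustration, the histogram and back-projection computations of the first and second steps can be sketched in a few lines of NumPy; the equal 50/50 fusion weight is an assumption, since the description does not prescribe how the two back-projections are combined:

```python
import numpy as np

def histogram(region, bins=256):
    # Step one: count how often each (color or depth) value occurs
    # in the face region of the previous frame.
    return np.bincount(region.ravel(), minlength=bins)

def back_project(image, hist):
    # Step two: replace every pixel value in the new frame with the
    # count that value had in the face-region histogram.
    return hist[image]

def fuse(bp_color, bp_depth, w_color=0.5):
    # Normalize each back-projection to [0, 1] and blend them; the
    # 50/50 weighting is an assumption, not specified by the method.
    bc = bp_color / max(bp_color.max(), 1)
    bd = bp_depth / max(bp_depth.max(), 1)
    return w_color * bc + (1.0 - w_color) * bd
```

High values in the fused map mark pixels whose color and depth both resemble the previous frame's face region.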
Step (2) specifically:
First, resize the facial image to a uniform size of 100 × 100 pixels.
Second, detect the feature points of the face region; the feature points include the cheek contour, eyes, eyebrows, nose, and mouth. The detection of the feature points is based on the color image.
Third, take a three-dimensional face model with the above feature points as the reference coordinate system, and calibrate the RGB-D camera coordinates according to the feature point positions in the color image to obtain the camera coordinate system.
Fourth, project all points of the three-dimensional model into the camera coordinate system.
Fifth, assign each point of the three-dimensional model, projected into the camera coordinate system, the RGB information from the color image.
Sixth, perform a frontal projection of the assigned three-dimensional model to obtain the corrected facial image.
Seventh, convert the color facial image to grayscale and apply histogram equalization.
Step (3) specifically: the corrected facial images, with a uniform size of 100 × 100 pixels, can be regarded as 10000-dimensional vectors. Dimensionality reduction is then performed by principal component analysis (PCA).
Each face corresponds to a data label composed of 0s and 1s; the data label of the m-th face is [a1, a2, …, am, …, ak], where am = 1, the rest are 0, and k is the total number of faces. Using the reduced data as input and the data labels as output, the neural network model is trained with the back-propagation (BP) algorithm.
Further, recognition is performed by the following method: a facial image to be recognized is acquired, corrected, and dimension-reduced, and then input into the trained neural network. Among the elements of the output vector, if exactly one is greater than the threshold 0.5, the class of the input data is determined to be the class corresponding to that element; if more than one element exceeds the threshold, or all elements are below it, the input data is judged not to belong to the training dataset, i.e., the face is a stranger.
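The decision rule on the output vector can be stated compactly; the `names` list and the example scores below are hypothetical:

```python
def classify(output, names, threshold=0.5):
    """Decision rule on the network's output vector: exactly one
    element above the threshold names the person; zero or several
    elements above it means the face is not in the training set."""
    above = [i for i, v in enumerate(output) if v > threshold]
    if len(above) == 1:
        return names[above[0]]
    return "stranger"
```

For example, `classify([0.1, 0.9, 0.2], ["Ann", "Bob", "Cal"])` picks the single confident class, while an ambiguous or uniformly low output is rejected.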
Further, interaction is performed by the following method: for the face recognized in step (4), its name is obtained, and its direction and distance are further obtained from the depth map; the name is played to the user as 3D sound, where the angle of the 3D sound indicates the direction of the face and the volume of the 3D sound indicates the distance of the face.
The beneficial effects of the present invention are:
1. The present invention provides a method for visually impaired people to identify the people around them.
2. The face tracking method proposed by the present invention improves the face recall rate and can automatically label the images.
3. The facial image correction method proposed by the present invention removes the influence of head pose variation and non-uniform illumination on face recognition.
4. The neural-network-based training and face recognition system proposed by the present invention achieves real-time face recognition.
5. The 3D stereo sound interaction for recognition results proposed by the present invention effectively improves the user experience of the face recognition system.
Description of the drawings
Fig. 1 is a schematic diagram of the system structure;
Fig. 2 shows a face detection result;
Fig. 3 shows a color histogram or depth histogram after gray-level processing;
Fig. 4 shows a fused back-projection image;
Fig. 5 compares facial images before and after correction.
Specific embodiment
An assistance method for visually impaired people based on an RGB-D camera and face recognition, with the following specific steps:
(1) Face entry and establishment of the face database
For each person to be recognized, acquire multiple consecutive frames of color images and depth images, then detect the facial image through the color image channel of the RGB-D camera, and use the facial image detected in the first frame as the initialization starting point for face tracking. If a face is missed or misdetected in the n-th frame, the face tracking mode is started to locate the face region. The facial image data of all persons to be recognized, together with the corresponding names, are entered to establish the face database.
The face tracking mode comprises the following steps:
First, from the face detected in the (n-1)-th frame (the face region is outlined as shown in Fig. 2), compute separately the histograms of the face region in the color image and the depth image, as shown in Fig. 3. The abscissa of the color histogram is the color value and the ordinate is the number of pixels with each color value; the abscissa of the depth histogram is the depth value and the ordinate is the number of pixels with each depth value.
Second, in the n-th frame, compute the back-projection images of the color image and the depth image, as shown in Fig. 4. The back-projection of the color image is obtained by replacing the color value of each pixel of the color image with the corresponding ordinate in the color histogram; the back-projection of the depth image is obtained analogously from the depth histogram. A back-projection image is a grayscale image: in the back-projections of both the color image and the depth image, regions with larger gray values are more likely to be the face region. After the two back-projection images are fused, a face region prediction that better matches the actual situation is obtained.
Third, apply the mean-shift algorithm (MeanShift) to the fused back-projection image to compute the face region in the n-th frame.
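The third step can be illustrated with a minimal mean-shift iteration on the fused back-projection. OpenCV's `cv2.meanShift` provides a production implementation; the plain-NumPy sketch below only shows the core idea of repeatedly moving the search window to the local weighted centroid:

```python
import numpy as np

def mean_shift(bp, window, n_iter=20):
    """Shift an (x, y, w, h) window toward the weighted centroid of
    the back-projection values inside it until it settles. bp is a
    2-D array of non-negative weights (the fused back-projection)."""
    x, y, w, h = window
    for _ in range(n_iter):
        patch = bp[y:y + h, x:x + w]
        total = patch.sum()
        if total == 0:
            break  # no evidence under the window; give up
        ys, xs = np.mgrid[0:patch.shape[0], 0:patch.shape[1]]
        cx = (xs * patch).sum() / total   # centroid inside the patch
        cy = (ys * patch).sum() / total
        nx = int(round(x + cx - w / 2))   # recenter window on centroid
        ny = int(round(y + cy - h / 2))
        nx = min(max(nx, 0), bp.shape[1] - w)
        ny = min(max(ny, 0), bp.shape[0] - h)
        if (nx, ny) == (x, y):
            break
        x, y = nx, ny
    return x, y, w, h
```

Starting from the previous frame's face window, the window drifts onto the bright blob of the fused back-projection, which is the predicted face region in the n-th frame.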
(2) Correction of the facial images
The correction of faces removes the influence of head pose variation and non-uniform illumination on face recognition. Face recognition is equivalent to a classification problem: during classifier training, the between-class differences of the samples should be large and the within-class differences small. Head pose variation and non-uniform illumination increase the within-class differences, sometimes to a degree comparable to the between-class differences; on such samples the classifier has difficulty finding the differences between classes during training, and as a result it cannot classify correctly. Likewise, uncorrected facial images are more error-prone during recognition.
The correction of the facial images is divided into the following steps:
First, resize the facial image to a uniform size of 100 × 100 pixels.
Second, detect the feature points of the face region; the feature points include the cheek contour, eyes, eyebrows, nose, and mouth. The detection of the feature points is based on the color image.
Third, find the three-dimensional coordinates of the corresponding feature points in a generic three-dimensional face model; these coordinates are in the world coordinate system. From the two-dimensional coordinates of the feature points in the color image, the camera parameters, and the three-dimensional coordinates in the model, compute the transformation between the world coordinate system and the camera coordinate system.
Fourth, project all points of the three-dimensional model into the camera coordinate system according to this coordinate transformation, obtaining the RGB information of each point.
Fifth, project the three-dimensional face model with the assigned RGB information onto the frontal direction to obtain the corrected facial image.
Sixth, convert the color facial image to grayscale and apply histogram equalization.
Fig. 5 compares facial images before and after correction, where a, b, and c are the images before correction, and d, e, and f are the corresponding corrected images.
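The coordinate transformation of the third step can be sketched as follows. The method above computes the world-to-camera transform from 2-D feature points and the camera parameters (a perspective-n-point problem, e.g. OpenCV's `solvePnP`); the sketch below instead assumes the depth channel is used to lift the feature points to 3-D, which reduces the problem to a rigid 3-D alignment (Kabsch algorithm). That assumption keeps the example self-contained:

```python
import numpy as np

def rigid_align(model_pts, camera_pts):
    """Least-squares rotation R and translation t mapping the generic
    3-D face model's landmarks onto the landmarks measured by the
    RGB-D camera (Kabsch algorithm)."""
    mc = model_pts.mean(axis=0)
    cc = camera_pts.mean(axis=0)
    H = (model_pts - mc).T @ (camera_pts - cc)   # covariance of centered sets
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))       # guard against reflection
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = cc - R @ mc
    return R, t
```

Once R and t are known, every point of the face model can be mapped into the camera frame, colored from the RGB image, and re-rendered from the frontal direction, which is the frontalization described above.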
(3) Neural network training
The corrected facial images, with a uniform size of 100 × 100 pixels, can be regarded as 10000-dimensional vectors. Such a dimension is too large for a neural network that must run in real time. Principal component analysis (PCA) is therefore applied to preprocess the data; this preprocessing is dimensionality reduction.
Each face corresponds to a data label composed of 0s and 1s; the data label of the m-th face is [a1, a2, …, am, …, ak], where am = 1, the rest are 0, and k is the total number of faces. Using the reduced data as input and the data labels as output, the neural network model is trained with the back-propagation (BP) algorithm.
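A toy version of the PCA-plus-BP pipeline can be written in plain NumPy. The layer size, learning rate, and epoch count are illustrative assumptions, and low-dimensional synthetic data stands in for the 10000-dimensional face vectors:

```python
import numpy as np

rng = np.random.default_rng(0)

def pca_fit(X, n_components):
    # PCA via SVD of the centered data: returns the mean and the
    # projection basis used to reduce the face vectors.
    mean = X.mean(axis=0)
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:n_components].T

def pca_transform(X, mean, basis):
    return (X - mean) @ basis

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_bp(X, Y, hidden=16, lr=1.0, epochs=5000):
    # One-hidden-layer network trained with plain back-propagation
    # (batch gradient descent on squared error); sizes and learning
    # rate are illustrative assumptions.
    W1 = rng.normal(0, 0.5, (X.shape[1], hidden)); b1 = np.zeros(hidden)
    W2 = rng.normal(0, 0.5, (hidden, Y.shape[1])); b2 = np.zeros(Y.shape[1])
    for _ in range(epochs):
        H = sigmoid(X @ W1 + b1)          # forward pass
        O = sigmoid(H @ W2 + b2)
        dO = (O - Y) * O * (1 - O)        # backward pass (MSE + sigmoid)
        dH = (dO @ W2.T) * H * (1 - H)
        W2 -= lr * H.T @ dO / len(X); b2 -= lr * dO.mean(axis=0)
        W1 -= lr * X.T @ dH / len(X); b1 -= lr * dH.mean(axis=0)
    return W1, b1, W2, b2

def predict(X, W1, b1, W2, b2):
    return sigmoid(sigmoid(X @ W1 + b1) @ W2 + b2)
```

Each row of Y is a one-hot label as described above; at recognition time the output vector is compared against the 0.5 threshold.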
(4) Face recognition
A facial image to be recognized is acquired, corrected, and dimension-reduced, and then input into the trained neural network. Among the elements of the output vector, if exactly one is greater than the threshold 0.5, the class of the input data is determined to be the class corresponding to that element; if more than one element exceeds the threshold, or all elements are below it, the input data is judged not to belong to the training dataset, i.e., the face is a stranger.
(5) Interaction conveying recognition results through 3D stereo sound
For the face recognized in step (4), its name is obtained, and its direction and distance are further obtained from the depth map; the name is played to the user as 3D sound, where the angle of the 3D sound indicates the direction of the face and the volume of the 3D sound indicates the distance of the face.
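One possible realization of "angle indicates direction, volume indicates distance" is a constant-power stereo pan driven by the face's pixel column and depth. The field-of-view value and the 1/distance roll-off below are assumptions, since the exact audio rendering is not specified:

```python
import math

def face_azimuth(col, image_width, hfov_deg=58.0):
    # Horizontal angle of the face relative to the camera axis,
    # estimated from its pixel column; 58 degrees is an assumed
    # field of view, typical for consumer RGB-D cameras.
    return (col / image_width - 0.5) * hfov_deg

def stereo_gains(azimuth_deg, distance_m, hfov_deg=58.0):
    # Constant-power pan from the azimuth plus a 1/distance volume
    # roll-off; both mappings are illustrative assumptions.
    pan = (azimuth_deg / (hfov_deg / 2) + 1) / 2      # 0 = left, 1 = right
    pan = min(max(pan, 0.0), 1.0)
    volume = min(1.0, 1.0 / max(distance_m, 1.0))
    left = volume * math.cos(pan * math.pi / 2)
    right = volume * math.sin(pan * math.pi / 2)
    return left, right
```

A face straight ahead yields equal left/right gains; a face at the right edge of the view is rendered almost entirely in the right channel, and a farther face is quieter overall.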
Claims (5)
1. An assistance method for visually impaired people based on an RGB-D camera and face recognition, characterized in that the specific steps are as follows:
(1) face entry and establishment of a face database;
for each person to be recognized, acquiring multiple consecutive frames of color images and depth images, and detecting the facial image through the color image channel of the RGB-D camera, the facial image detected in the first frame serving as the initialization starting point of face tracking; if a face is missed or misdetected in the n-th frame, starting a face tracking mode to locate the face region; entering the facial image data of all persons to be recognized together with the corresponding names to establish the face database; the face tracking mode comprising the following steps:
first, from the face detected in the (n-1)-th frame, computing separately the histograms of the face region in the color image and the depth image; the abscissa of the color histogram being the color value and the ordinate the number of pixels with each color value; the abscissa of the depth histogram being the depth value and the ordinate the number of pixels with each depth value;
second, in the n-th frame, computing the back-projection images of the color image and the depth image; the back-projection of the color image being obtained by replacing the color value of each pixel of the color image with the corresponding ordinate in the color histogram; the back-projection of the depth image being obtained by replacing the depth value of each pixel of the depth image with the corresponding ordinate in the depth histogram; after the two back-projection images are fused, a face region prediction that better matches the actual situation being obtained;
third, applying the mean-shift algorithm (MeanShift) to the fused back-projection image to compute the face region in the n-th frame;
(2) correction of the facial images;
(3) neural network training;
(4) face recognition;
(5) interaction conveying the recognition result through 3D stereo sound.
2. The method according to claim 1, characterized in that step (2) specifically comprises:
first, resizing the facial image to a uniform size of 100 × 100 pixels;
second, detecting the feature points of the face region, the feature points including the cheek contour, eyes, eyebrows, nose, and mouth, the detection of the feature points being based on the color image;
third, taking a three-dimensional face model with the above feature points as the reference coordinate system, and calibrating the RGB-D camera coordinates according to the feature point positions in the color image to obtain the camera coordinate system;
fourth, projecting all points of the three-dimensional model into the camera coordinate system;
fifth, assigning each point of the three-dimensional model projected into the camera coordinate system the RGB information from the color image;
sixth, performing a frontal projection of the assigned three-dimensional model to obtain the corrected facial image;
seventh, converting the color facial image to grayscale and applying histogram equalization.
3. The method according to claim 1, characterized in that step (3) specifically comprises: the corrected facial images, with a uniform size of 100 × 100 pixels, are regarded as 10000-dimensional vectors and then reduced in dimension by principal component analysis (PCA);
each face corresponds to a data label composed of 0s and 1s, the data label of the m-th face being [a1, a2, …, am, …, ak], where am = 1, the rest are 0, and k is the total number of faces; using the reduced data as input and the data labels as output, the neural network model is trained with the back-propagation (BP) algorithm.
4. The method according to claim 1, characterized in that recognition is performed by the following method: a facial image to be recognized is acquired, corrected, and dimension-reduced, and then input into the trained neural network; among the elements of the output vector, if exactly one is greater than the threshold 0.5, the class of the input data is determined to be the class corresponding to that element; if more than one element exceeds the threshold, or all elements are below it, the input data is judged not to belong to the training dataset, i.e., the face is a stranger.
5. The method according to claim 1, characterized in that interaction is performed by the following method: for the face recognized in step (4), its name is obtained, and its direction and distance are further obtained from the depth map; the name is played to the user as 3D sound, the angle of the 3D sound indicating the direction of the face and the volume of the 3D sound indicating the distance of the face.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201611140457.8A CN106874830B (en) | 2016-12-12 | 2016-12-12 | Assistance method for visually impaired people based on RGB-D camera and face recognition |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201611140457.8A CN106874830B (en) | 2016-12-12 | 2016-12-12 | Assistance method for visually impaired people based on RGB-D camera and face recognition |
Publications (2)
Publication Number | Publication Date |
---|---|
CN106874830A CN106874830A (en) | 2017-06-20 |
CN106874830B true CN106874830B (en) | 2019-09-24 |
Family
ID=59164100
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201611140457.8A Active CN106874830B (en) | 2016-12-12 | 2016-12-12 | A kind of visually impaired people's householder method based on RGB-D camera and recognition of face |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106874830B (en) |
Families Citing this family (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109299639B (en) * | 2017-07-25 | 2021-03-16 | ArcSoft Corporation Limited | Method and device for facial expression recognition |
CN107977650B (en) * | 2017-12-21 | 2019-08-23 | Beijing HJIMI Technology Co., Ltd. | Face detection method and device |
CN108197587B (en) * | 2018-01-18 | 2021-08-03 | SeetaTech (Beijing) Technology Co., Ltd. | Method for performing multi-modal face recognition through face depth prediction |
CN108537191B (en) * | 2018-04-17 | 2020-11-20 | CloudWalk Technology Group Co., Ltd. | Three-dimensional face recognition method based on structured light camera |
CN109993086B (en) * | 2019-03-21 | 2021-07-27 | Beijing HJIMI Technology Co., Ltd. | Face detection method, device and system and terminal equipment |
CN110059678A (en) * | 2019-04-17 | 2019-07-26 | NextVPU (Shanghai) Co., Ltd. | Detection method, device and computer-readable storage medium |
CN110472610B (en) * | 2019-08-22 | 2023-08-01 | Wang Xumin | Face recognition device and method with self-depth optimization function |
CN114419697B (en) * | 2021-12-23 | 2022-12-02 | Beijing Shenrui Bolian Technology Co., Ltd. | Method and device for prompting visually impaired people based on mechanical vibration |
CN114612959A (en) * | 2022-01-28 | 2022-06-10 | Beijing Shenrui Bolian Technology Co., Ltd. | Face recognition system and method for assisting blind people in interpersonal communication |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101339607B (en) * | 2008-08-15 | 2012-08-01 | Beijing Vimicro Corporation | Face recognition method and system, and face recognition model training method and system |
CN204542562U (en) * | 2015-04-02 | 2015-08-12 | Chongqing University | Intelligent glasses for blind people |
CN104899869A (en) * | 2015-05-14 | 2015-09-09 | Zhejiang University | Plane and obstacle detection method based on RGB-D camera and attitude sensor |
CN105267013A (en) * | 2015-09-16 | 2016-01-27 | University of Electronic Science and Technology of China | Head-mounted intelligent assistance system for the visually impaired |
Non-Patent Citations (2)
Title |
---|
Binocular CCD camera ranging based on a vehicle-mounted system; Zhang Yingjiang et al.; Information Security and Technology; 2016-01-31; full text *
CamShift target tracking algorithm based on color histogram and depth information; Gu Chao; Journal of Jiamusi University (Natural Science Edition); 2015-07-31; Vol. 33, No. 4; full text *
Also Published As
Publication number | Publication date |
---|---|
CN106874830A (en) | 2017-06-20 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN106874830B (en) | Assistance method for visually impaired people based on RGB-D camera and face recognition | |
CN107862299B (en) | Living body face detection method based on near-infrared and visible light binocular cameras | |
WO2019127262A1 (en) | Cloud end-based human face in vivo detection method, electronic device and program product | |
CN109376582A (en) | An interactive face cartoon method based on a generative adversarial network | |
CN106600640B (en) | Face recognition auxiliary glasses based on RGB-D camera | |
CN112418095A (en) | Facial expression recognition method and system combined with attention mechanism | |
CN106570447B (en) | Based on the matched human face photo sunglasses automatic removal method of grey level histogram | |
CN104008364B (en) | Face identification method | |
CN102184016B (en) | Noncontact type mouse control method based on video sequence recognition | |
CN109101949A (en) | A kind of human face in-vivo detection method based on colour-video signal frequency-domain analysis | |
WenJuan et al. | A real-time lip localization and tacking for lip reading | |
CN107862298B (en) | Winking living body detection method based on infrared camera device | |
KR20200012355A (en) | Online lecture monitoring method using constrained local model and Gabor wavelets-based face verification process | |
CN111860394A (en) | Gesture estimation and gesture detection-based action living body recognition method | |
CN109725721A (en) | Human-eye positioning method and system for naked eye 3D display system | |
Özbudak et al. | Effects of the facial and racial features on gender classification | |
CN109101925A (en) | Biopsy method | |
CN104573628A (en) | Three-dimensional face recognition method | |
CN113627256A (en) | Method and system for detecting counterfeit video based on blink synchronization and binocular movement detection | |
Lai et al. | Skin colour-based face detection in colour images | |
CN109159129A (en) | An intelligent companion robot based on facial expression recognition | |
Montazeri et al. | Automatic extraction of eye field from a gray intensity image using intensity filtering and hybrid projection function | |
CN102968636A (en) | Human face contour extracting method | |
Riaz et al. | A model based approach for expressions invariant face recognition | |
Ouellet et al. | Multimodal biometric identification system for mobile robots combining human metrology to face recognition and speaker identification |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | | |
CB02 | Change of applicant information | Address after: 9, 181, 310000, Wuchang Road, Wuchang Street, Yuhang District, Zhejiang, Hangzhou, 202-7; Applicant after: Hangzhou vision krypton Technology Co., Ltd. Address before: Room 589, C building, No. 525 Xixi Road, Xihu District, Zhejiang, Hangzhou 310007, China; Applicant before: Hangzhou vision krypton Technology Co., Ltd. | |
SE01 | Entry into force of request for substantive examination | | |
GR01 | Patent grant | | |