CN112700576A - Multi-modal recognition algorithm based on images and characters - Google Patents
Multi-modal recognition algorithm based on images and characters
- Publication number
- CN112700576A (application CN202011587116.1A)
- Authority
- CN
- China
- Prior art keywords
- image information
- mode
- information
- dimensional
- static
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Landscapes
- Image Analysis (AREA)
Abstract
The invention provides a multi-modal recognition algorithm based on images and text. The algorithm rests on matching and fusing two-dimensional and three-dimensional information: text image information obtained from an on-site signature is combined with three-dimensional image information obtained by face recognition, which reduces the dependence on biometric features alone and prevents counterfeiters from defeating face recognition with three-dimensional printing technology.
Description
Technical Field
The invention relates to a multi-modal recognition algorithm based on images and text.
Background
An access control system controls the entry and exit of personnel and records their access information. With a properly installed measurement and control device, door opening and closing can be managed effectively, ensuring free passage for authorized personnel and protecting people and property. With China's economic and social development, access control security management systems have penetrated deeply into daily life and provide an important guarantee for personal, property and information security. Such a system is a modern security management system drawing on many technologies, including electronics, mechanics, optics, computing, communications and biometrics; it is an effective measure for securing the entrances and exits of important facilities and suits many settings, such as banks, hotels, parking management, machine rooms, armories, key storage rooms, offices, smart residential communities and factories. However, even a traditional access security management system that combines multiple detection means may fail to match various biometric features in the face of modern high-tech counterfeiting technology and may be entered by a counterfeiter or an impersonated person, creating a potential access security risk.
Disclosure of Invention
In order to solve the access control security problem in the prior art in which biometric features may be imitated by high-tech means, for example a 3D-printed mask, the invention provides a multi-modal recognition algorithm based on images and text, which comprises the following steps:
(1) acquiring name text image information and face image information at the same moment, wherein the name text image information and the face image information are paired according to the acquisition direction of the sensor that captures each piece of information, the text image information is two-dimensional image information obtained from the on-site signature of the person to be verified, and the face image information is three-dimensional image information;
(2) inputting the name text image information and preprocessing it, wherein the name text image information comprises a plurality of groups of static image information in a first modality, which represents shape, and a second modality, which corresponds to the first modality, shares the same region of interest and represents color; the first modality and the second modality are located on different layers of the data structure of the static image information;
(3) inputting face image information paired with the name text image information, wherein the face image information comprises a plurality of groups of dynamic image information having at least a shape modality with the same region of interest, together with a color modality, a speed modality and a distance modality corresponding to the shape modality; the shape modality, the color modality, the speed modality and the distance modality are located on different layers of the data structure of the dynamic image information;
(4) judging whether the image information of the layers in different modalities within each group of static image information matches; if the layers in different modalities within each group of static image information match one another, dividing the name text image information of each corresponding layer into a plurality of two-dimensional image blocks; if the layers do not completely match, performing three-dimensional reconstruction and registration on the first-modality static image information in each group of data and then segmenting it to obtain a first set containing m layers of first-modality static image information, where m is a natural number greater than 5; cleaning the first set by morphological hole filling; fusing each sliced layer of the first-modality static image information with the corresponding second-modality static image information in the same group by a frequency-domain fusion method based on the discrete cosine transform; performing three-dimensional reconstruction and registration to obtain three-dimensional fusion information, in which the first dimension is the information fused from the first-modality and second-modality static image information, the second dimension is the second-modality static image information representing color, and the third dimension represents distance and is set to 0; fusing the reconstructed three-dimensional fusion information with the dynamic image information; and labeling the fused information, according to the acquisition direction, as a directed image sub-block to be recognized;
(5) training a neural network model with pre-collected name text image information;
(6) setting the third dimension of the directed image sub-blocks to be recognized in each direction to 0, thereby flattening them into two-dimensional image sub-blocks; inputting the two-dimensional image sub-blocks into the neural network model; and comparing the obtained recognition result with one piece of two-dimensional face image information for similarity: if the similarity is smaller than a preset threshold, the comparison continues with the remaining two-dimensional face image information; otherwise, the iterative similarity comparison stops and the model is saved.
Further, the preprocessing comprises thresholding to suppress noise that may be present in the text image information, and/or interpolating the face image information to unify the resolution of its different planes.
Further, the direction comprises three angles: 75°, 90° and 105°.
Further, each group of first-modality and second-modality static image information of the name text image information comes from the same person to be verified.
Further, each group of first-modality and second-modality static image information of the name text image information comes from different persons to be verified and serves as confusion data when training the model.
Further, image information of the same modality is acquired by the same device.
Further, the device is a three-dimensional camera.
The beneficial effects of the invention are as follows: combining the text image information obtained from an on-site signature with the three-dimensional image information obtained by face recognition reduces the dependence on biometric features alone and prevents counterfeiters from defeating face recognition with three-dimensional printing technology.
Drawings
Fig. 1 shows a flow diagram of the present algorithm.
Detailed Description
A multi-modal recognition algorithm based on images and text, comprising the following steps:
(1) acquiring name text image information and face image information at the same moment, wherein the name text image information and the face image information are paired according to the acquisition direction of the sensor that captures each piece of information, the text image information is two-dimensional image information obtained from the on-site signature of the person to be verified, and the face image information is three-dimensional image information;
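Purely as an illustrative sketch (the patent does not prescribe an implementation), the pairing of signature images and face scans by sensor acquisition direction described in step (1) could be modeled as follows; the names `CapturePair` and `pair_by_direction` are hypothetical:

```python
import numpy as np
from dataclasses import dataclass

@dataclass
class CapturePair:
    direction: float          # sensor acquisition angle in degrees
    signature_2d: np.ndarray  # on-site signature image (two-dimensional)
    face_3d: np.ndarray       # face scan (three-dimensional)

def pair_by_direction(signatures, faces):
    """Pair signature and face captures recorded at the same moment,
    keyed by the acquisition direction of the capturing sensor."""
    shared = sorted(signatures.keys() & faces.keys())
    return [CapturePair(d, signatures[d], faces[d]) for d in shared]
```

Only directions present in both capture streams yield a pair, mirroring the requirement that the two kinds of information be associated through a common acquisition direction.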
(2) inputting the name text image information and preprocessing it, wherein the name text image information comprises a plurality of groups of static image information in a first modality, which represents shape, and a second modality, which corresponds to the first modality, shares the same region of interest and represents color; the first modality and the second modality are located on different layers of the data structure of the static image information;
(3) inputting face image information paired with the name text image information, wherein the face image information comprises a plurality of groups of dynamic image information having at least a shape modality with the same region of interest, together with a color modality, a speed modality and a distance modality corresponding to the shape modality; the shape modality, the color modality, the speed modality and the distance modality are located on different layers of the data structure of the dynamic image information;
(4) judging whether the image information of the layers in different modalities within each group of static image information matches; if the layers in different modalities within each group of static image information match one another, dividing the name text image information of each corresponding layer into a plurality of two-dimensional image blocks; if the layers do not completely match, performing three-dimensional reconstruction and registration on the first-modality static image information in each group of data and then segmenting it to obtain a first set containing m layers of first-modality static image information, where m is a natural number greater than 5; cleaning the first set by morphological hole filling; fusing each sliced layer of the first-modality static image information with the corresponding second-modality static image information in the same group by a frequency-domain fusion method based on the discrete cosine transform; performing three-dimensional reconstruction and registration to obtain three-dimensional fusion information, in which the first dimension is the information fused from the first-modality and second-modality static image information, the second dimension is the second-modality static image information representing color, and the third dimension represents distance and is set to 0; fusing the reconstructed three-dimensional fusion information with the dynamic image information; and labeling the fused information, according to the acquisition direction, as a directed image sub-block to be recognized;
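The morphological hole filling and DCT-based frequency-domain fusion in step (4) might be sketched as below, assuming NumPy/SciPy. The coefficient-averaging fusion rule is an assumption of this sketch, since the patent does not specify how the DCT coefficients of the two modalities are combined:

```python
import numpy as np
from scipy.fft import dct, idct
from scipy.ndimage import binary_fill_holes

def dct2(a):
    # Separable two-dimensional type-II DCT with orthonormal scaling
    return dct(dct(a, axis=0, norm="ortho"), axis=1, norm="ortho")

def idct2(a):
    # Inverse of dct2 (type-II inverse DCT, orthonormal scaling)
    return idct(idct(a, axis=0, norm="ortho"), axis=1, norm="ortho")

def clean_mask(mask):
    """Morphological hole filling on one binary slice of the first set."""
    return binary_fill_holes(mask)

def fuse_slices(shape_slice, color_slice):
    """Fuse a first-modality (shape) slice with its second-modality (color)
    counterpart in the frequency domain by averaging DCT coefficients."""
    fused = 0.5 * (dct2(shape_slice.astype(float)) +
                   dct2(color_slice.astype(float)))
    return idct2(fused)
```

Because the orthonormal DCT is invertible, fusing a slice with itself returns the slice unchanged, which is a convenient sanity check on the transform pair.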
(5) training a neural network model with pre-collected name text image information;
(6) setting the third dimension of the directed image sub-blocks to be recognized in each direction to 0, thereby flattening them into two-dimensional image sub-blocks; inputting the two-dimensional image sub-blocks into the neural network model; and comparing the obtained recognition result with one piece of two-dimensional face image information for similarity: if the similarity is smaller than a preset threshold, the comparison continues with the remaining two-dimensional face image information; otherwise, the iterative similarity comparison stops and the model is saved.
Preferably, the preprocessing comprises thresholding to suppress noise that may be present in the text image information, and/or interpolating the face image information to unify the resolution of its different planes.
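The two preprocessing operations named here, thresholding the signature image and interpolating the face planes to a common resolution, might look like this minimal NumPy/SciPy sketch; the threshold value and the spline order are assumptions, not values from the patent:

```python
import numpy as np
from scipy.ndimage import zoom

def threshold_signature(img, thresh=128):
    """Binarize a signature image to suppress background noise."""
    return (img > thresh).astype(np.uint8)

def unify_resolution(plane, target_shape):
    """Resample one plane of the face image stack to a common resolution
    using (bi)linear spline interpolation."""
    factors = [t / s for t, s in zip(target_shape, plane.shape)]
    return zoom(plane, factors, order=1)
```

Applying `unify_resolution` to every plane with the same `target_shape` yields a stack whose planes share one resolution, as the preprocessing clause requires.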
Preferably, the direction comprises three angles: 75°, 90° and 105°.
Preferably, each group of first-modality and second-modality static image information of the name text image information comes from the same person to be verified.
Preferably, each group of first-modality and second-modality static image information of the name text image information comes from different persons to be verified and serves as confusion data when training the model.
Preferably, image information of the same modality is acquired by the same device.
Preferably, the device is a three-dimensional camera.
The above embodiments express only several implementations of the present invention, and although their description is specific and detailed, it is not to be construed as limiting the scope of the invention. It should be noted that a person skilled in the art can make several variations and improvements without departing from the inventive concept, and these fall within the protection scope of the present invention. Therefore, the protection scope of this patent shall be defined by the appended claims.
Claims (7)
1. A multi-modal recognition algorithm based on images and characters, characterized by comprising the following steps:
(1) acquiring name text image information and face image information at the same moment, wherein the name text image information and the face image information are paired according to the acquisition direction of the sensor that captures each piece of information, the text image information is two-dimensional image information, and the face image information is three-dimensional image information;
(2) inputting the name text image information and preprocessing it, wherein the name text image information comprises a plurality of groups of static image information in a first modality, which represents shape, and a second modality, which corresponds to the first modality, shares the same region of interest and represents color; the first modality and the second modality are located on different layers of the data structure of the static image information;
(3) inputting face image information paired with the name text image information, wherein the face image information comprises a plurality of groups of dynamic image information having at least a shape modality with the same region of interest, together with a color modality, a speed modality and a distance modality corresponding to the shape modality; the shape modality, the color modality, the speed modality and the distance modality are located on different layers of the data structure of the dynamic image information;
(4) judging whether the image information of the layers in different modalities within each group of static image information matches; if the layers in different modalities within each group of static image information match one another, dividing the name text image information of each corresponding layer into a plurality of two-dimensional image blocks; if the layers do not completely match, performing three-dimensional reconstruction and registration on the first-modality static image information in each group of data and then segmenting it to obtain a first set containing m layers of first-modality static image information, where m is a natural number greater than 5; cleaning the first set by morphological hole filling; fusing each sliced layer of the first-modality static image information with the corresponding second-modality static image information in the same group by a frequency-domain fusion method based on the discrete cosine transform; performing three-dimensional reconstruction and registration to obtain three-dimensional fusion information, in which the first dimension is the information fused from the first-modality and second-modality static image information, the second dimension is the second-modality static image information representing color, and the third dimension represents distance and is set to 0; fusing the reconstructed three-dimensional fusion information with the dynamic image information; and labeling the fused information, according to the acquisition direction, as a directed image sub-block to be recognized;
(5) training a neural network model with pre-collected name text image information;
(6) setting the third dimension of the directed image sub-blocks to be recognized in each direction to 0, thereby flattening them into two-dimensional image sub-blocks; inputting the two-dimensional image sub-blocks into the neural network model; and comparing the obtained recognition result with one piece of two-dimensional face image information for similarity: if the similarity is smaller than a preset threshold, the comparison continues with the remaining two-dimensional face image information; otherwise, the iterative similarity comparison stops and the model is saved.
2. The multi-modal recognition algorithm based on images and characters of claim 1, wherein the preprocessing comprises thresholding to suppress noise that may be present in the text image information, and/or interpolating the face image information to unify the resolution of its different planes.
3. The multi-modal recognition algorithm based on images and characters of claim 1, wherein the direction comprises three angles: 75°, 90° and 105°.
4. The multi-modal recognition algorithm based on images and characters of claim 1, wherein each group of first-modality and second-modality static image information of the name text image information comes from the same person to be verified.
5. The multi-modal recognition algorithm based on images and characters of claim 1, wherein each group of first-modality and second-modality static image information of the name text image information comes from different persons to be verified and serves as confusion data when training the model.
6. The multi-modal recognition algorithm based on images and characters of claim 1, wherein image information of the same modality is acquired by the same device.
7. The multi-modal recognition algorithm based on images and characters of claim 6, wherein the device is a three-dimensional camera.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011587116.1A CN112700576B (en) | 2020-12-29 | 2020-12-29 | Multi-modal recognition algorithm based on images and characters |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112700576A (en) | 2021-04-23
CN112700576B (en) | 2021-08-03
Family
ID=75511427
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202011587116.1A Active CN112700576B (en) | 2020-12-29 | 2020-12-29 | Multi-modal recognition algorithm based on images and characters |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112700576B (en) |
Citations (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
DE3271045D1 (en) * | 1981-11-20 | 1986-06-12 | Siemens Ag | Method of identifying a person by speech and face recognition, and device for carrying out the method |
CN103903319A (en) * | 2014-02-10 | 2014-07-02 | 袁磊 | Electronic lock system based on internet dynamic authorization |
CN104573634A (en) * | 2014-12-16 | 2015-04-29 | 苏州福丰科技有限公司 | Three-dimensional face recognition method |
CN204331744U (en) * | 2014-11-27 | 2015-05-13 | 天津和财世纪信息技术有限公司 | 3 D stereo intelligent face recognition system |
CN107724900A (en) * | 2017-09-28 | 2018-02-23 | 深圳市晟达机械设计有限公司 | A kind of family security door based on personal recognition |
CN108596171A (en) * | 2018-03-29 | 2018-09-28 | 青岛海尔智能技术研发有限公司 | Enabling control method and system |
CN109785483A (en) * | 2018-12-28 | 2019-05-21 | 杭州文创企业管理有限公司 | A kind of wisdom garden access control system |
CN111311786A (en) * | 2018-11-23 | 2020-06-19 | 杭州眼云智家科技有限公司 | Intelligent door lock system and intelligent door lock control method thereof |
CN111401160A (en) * | 2020-03-03 | 2020-07-10 | 北京三快在线科技有限公司 | Hotel authentication management method, system and platform and hotel PMS system |
CN111597928A (en) * | 2020-04-29 | 2020-08-28 | 深圳市商汤智能传感科技有限公司 | Three-dimensional model processing method and device, electronic device and storage medium |
CN111625100A (en) * | 2020-06-03 | 2020-09-04 | 浙江商汤科技开发有限公司 | Method and device for presenting picture content, computer equipment and storage medium |
CN111862413A (en) * | 2020-07-28 | 2020-10-30 | 公安部第三研究所 | Method and system for realizing epidemic situation resistant non-contact multidimensional identity rapid identification |
- 2020-12-29: application CN202011587116.1A granted as CN112700576B (status: Active)
Also Published As
Publication number | Publication date |
---|---|
CN112700576B (en) | 2021-08-03 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Zhang et al. | Siamese neural network based gait recognition for human identification | |
CN101763671B (en) | System for monitoring persons by using cameras | |
CN106919921B (en) | Gait recognition method and system combining subspace learning and tensor neural network | |
CN112069891B (en) | Deep fake face identification method based on illumination characteristics | |
CN104766063A (en) | Living body human face identifying method | |
CN107230267A (en) | Intelligence In Baogang Kindergarten based on face recognition algorithms is registered method | |
CN102629320A (en) | Ordinal measurement statistical description face recognition method based on feature level | |
CN104700094A (en) | Face recognition method and system for intelligent robot | |
CN111507320A (en) | Detection method, device, equipment and storage medium for kitchen violation behaviors | |
Sabourin et al. | Shape matrices as a mixed shape factor for off-line signature verification | |
CN114758440B (en) | Access control system based on image and text mixed recognition | |
CN114758439B (en) | Multi-mode access control system based on artificial intelligence | |
CN112700576B (en) | Multi-modal recognition algorithm based on images and characters | |
Tewari et al. | Fingerprint recognition and feature extraction using transform domain techniques | |
CN106845500A (en) | A kind of human face light invariant feature extraction method based on Sobel operators | |
Daramola et al. | Algorithm for fingerprint verification system | |
CN102436591B (en) | Discrimination method of forged iris image | |
Kalangi et al. | Deployment of Haar Cascade algorithm to detect real-time faces | |
Chua et al. | Fingerprint Singular Point Detection via Quantization and Fingerprint Classification. | |
JP2008129679A (en) | Fingerprint discrimination model construction method, fingerprint discrimination method, identification method, fingerprint discrimination device and identification device | |
Kundu et al. | An efficient chain code based face identification system for biometrics | |
Chen et al. | Broad learning with uniform local binary pattern for fingerprint liveness detection | |
Jyothsna et al. | Facemask detection using Deep Learning | |
Mhatre et al. | Offline signature verification based on statistical features | |
Vinoth et al. | Region based Minutiae Mass Measure for Efficient Finger Print Forgery Detection in Health Care System |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||