CN112651445A - Biological information identification method and device based on deep network multi-modal information fusion - Google Patents

Biological information identification method and device based on deep network multi-modal information fusion

Info

Publication number
CN112651445A
Authority
CN
China
Prior art keywords
fusion
biological information
images
image
recognized
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011588398.7A
Other languages
Chinese (zh)
Inventor
周武
李尚轩
张林子
张洪来
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangzhou University Of Chinese Medicine Guangzhou Institute Of Chinese Medicine
Guangzhou University of Chinese Medicine
Original Assignee
Guangzhou University Of Chinese Medicine Guangzhou Institute Of Chinese Medicine
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangzhou University Of Chinese Medicine Guangzhou Institute Of Chinese Medicine filed Critical Guangzhou University Of Chinese Medicine Guangzhou Institute Of Chinese Medicine
Priority to CN202011588398.7A priority Critical patent/CN112651445A/en
Publication of CN112651445A publication Critical patent/CN112651445A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/25Fusion techniques
    • G06F18/254Fusion techniques of classification results, e.g. of results related to same input data
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/217Validation; Performance evaluation; Active pattern learning techniques

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The embodiments of the present application provide a biological information identification method and device based on deep network multi-modal information fusion, relating to the technical field of image identification. The biological information identification method based on deep network multi-modal information fusion comprises the following steps: first, images to be recognized in a plurality of modalities are obtained, wherein the images to be recognized in the plurality of modalities belong to the same biological information; second, the images to be recognized in the plurality of modalities are fused through a preset recognition model to obtain fusion features; then, classification is performed according to the fusion features to obtain the category of the biological information, wherein the preset recognition model is obtained by training on biological information images of a plurality of modalities. By this method, classification and identification are performed according to the fused features of the images in the plurality of modalities, which solves the problem of low reliability of image identification caused by prior-art neural network models generally identifying the images of the plurality of modalities separately.

Description

Biological information identification method and device based on deep network multi-modal information fusion
Technical Field
The application relates to the technical field of image recognition, in particular to a biological information recognition method and device based on deep network multi-modal information fusion.
Background
Obtaining the category to which an image belongs by manual recognition suffers from low efficiency. In the field of biological information recognition, images can be recognized through a neural network model to improve the efficiency of biological information image recognition, but challenges remain in multi-modal image recognition. Images exhibit large intra-class shape differences across multiple modalities, which makes classification more difficult, and prior-art neural network models generally identify the images of each modality separately, so the reliability of image identification is low.
Disclosure of Invention
In view of the above, an object of the present application is to provide a biological information recognition method and apparatus based on deep network multimodal information fusion, so as to solve the problems in the prior art.
In order to achieve the above purpose, the embodiment of the present application adopts the following technical solutions:
in a first aspect, the invention provides a biological information identification method based on deep network multimodal information fusion, which comprises the following steps:
acquiring images to be recognized of a plurality of modalities, wherein the images to be recognized of the plurality of modalities belong to the same biological information;
fusing the images to be recognized in the plurality of modes through a preset recognition model to obtain fusion characteristics;
and carrying out classification processing according to the fusion characteristics to obtain the category of the biological information, wherein the preset identification model is obtained according to the training of biological information images of a plurality of modalities.
In an optional embodiment, the step of obtaining a fusion feature by fusing the images to be recognized in the multiple modalities through a preset recognition model includes:
performing feature processing on the image to be recognized in each mode to obtain a learning feature of each image to be recognized;
and splicing the learning features of the images to be processed to obtain fusion features.
In an optional embodiment, the preset identification model includes a projection matrix corresponding to the image to be identified in each modality, and the step of performing feature processing on the image to be identified in each modality to obtain a learning feature of each image to be identified includes:
aiming at the image to be recognized of each mode, carrying out depth recognition processing on the image to be recognized to obtain the depth characteristics of the image to be processed;
and calculating to obtain the learning characteristic of the image to be processed according to the depth characteristic of the image to be processed and the corresponding projection matrix.
In an optional embodiment, the preset recognition model includes a conca layer, and the step of performing stitching processing on the learning features of the images to be processed to obtain a fusion feature includes:
and splicing the learning features of the images to be processed through the conca layer to obtain fusion features.
In an optional embodiment, the biological information recognition method based on deep network multimodal information fusion further includes a step of training a preset recognition model, where the step includes:
calculating to obtain a total loss function according to the optimization loss function and the classification loss function of the initial model;
and training the initial model according to the total loss function to obtain a preset recognition model.
In an optional embodiment, the biological information identification method based on deep network multimodal information fusion further includes:
acquiring feedback data of a user;
and updating the preset identification model according to the feedback data.
In a second aspect, the present invention provides a biological information recognition apparatus based on deep network multimodal information fusion, including:
the image acquisition module is used for acquiring images to be recognized in a plurality of modalities, wherein the images to be recognized in the plurality of modalities belong to the same biological information;
the image processing module is used for carrying out fusion processing on the images to be identified in the plurality of modes through a preset identification model to obtain fusion characteristics;
and the classification module is used for performing classification processing according to the fusion characteristics to obtain the category of the biological information, wherein the preset identification model is obtained according to the training of biological information images of multiple modalities.
In an alternative embodiment, the image processing module is specifically configured to:
performing feature processing on the image to be recognized in each mode to obtain a learning feature of each image to be recognized;
and splicing the learning features of the images to be processed to obtain fusion features.
In a third aspect, the present invention provides an electronic device, which includes a memory and a processor, where the processor is configured to execute an executable computer program stored in the memory to implement the biological information recognition method based on deep network multimodal information fusion according to any one of the foregoing embodiments.
In a fourth aspect, the present invention provides a storage medium, on which a computer program is stored, which when executed, implements the steps of the biological information recognition method based on deep network multimodal information fusion of any one of the foregoing embodiments.
According to the biological information identification method and device based on deep network multi-modal information fusion, the images to be recognized in multiple modalities are fused through the preset recognition model to obtain fusion features, and classification is performed according to the fusion features. Classification and identification are thus performed according to the fused features of the images in the multiple modalities, which solves the problem of low image-identification reliability caused by prior-art neural network models generally identifying the images of each modality separately.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are required to be used in the embodiments will be briefly described below, it should be understood that the following drawings only illustrate some embodiments of the present application and therefore should not be considered as limiting the scope, and for those skilled in the art, other related drawings can be obtained from the drawings without inventive effort.
Fig. 1 is a block diagram of an electronic device according to an embodiment of the present disclosure.
Fig. 2 is a schematic flowchart of a biological information identification method based on deep network multimodal information fusion according to an embodiment of the present application.
Fig. 3 is another schematic flow chart of a biological information identification method based on deep network multimodal information fusion according to an embodiment of the present application.
Fig. 4 is a block diagram of a preset recognition model according to an embodiment of the present disclosure.
Fig. 5 is another schematic flowchart of a biological information identification method based on deep network multimodal information fusion according to an embodiment of the present application.
Fig. 6 is another schematic flowchart of a biological information identification method based on deep network multimodal information fusion according to an embodiment of the present application.
Fig. 7 is another schematic flowchart of a biological information identification method based on deep network multimodal information fusion according to an embodiment of the present application.
Fig. 8 is a result chart of experimental verification provided in the embodiment of the present application.
Fig. 9 is another result chart of experimental verification provided in the embodiments of the present application.
Fig. 10 is a block diagram illustrating a structure of a biological information recognition apparatus based on deep network multimodal information fusion according to an embodiment of the present application.
Reference numerals: 100-electronic device; 110-network port; 120-first processor; 130-communication bus; 140-first storage medium; 150-interface; 1000-biological information recognition apparatus; 1010-image acquisition module; 1020-image processing module; 1030-classification module.
Detailed Description
The defects of the above solutions were identified by the inventors through practice and careful study; therefore, both the discovery of the above problems and the solutions proposed by the present application should be regarded as contributions made by the inventors in the course of this application.
For purposes of making the objects, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions of the embodiments of the present application will be described in detail below with reference to the drawings in the embodiments of the present application, and it should be understood that the drawings in the present application are for illustrative and descriptive purposes only and are not used to limit the scope of the present application. Additionally, it should be understood that the schematic drawings are not necessarily drawn to scale. The flowcharts used in this application illustrate operations implemented according to some embodiments of the present application. It should be understood that the operations of the flow diagrams may be performed out of order, and steps without logical context may be performed in reverse order or simultaneously. One skilled in the art, under the guidance of this application, may add one or more other operations to, or remove one or more operations from, the flowchart.
In addition, the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. The components of the embodiments of the present application, generally described and illustrated in the figures herein, can be arranged and designed in a wide variety of different configurations. Thus, the following detailed description of the embodiments of the present application, presented in the accompanying drawings, is not intended to limit the scope of the claimed application, but is merely representative of selected embodiments of the application. All other embodiments, which can be derived by a person skilled in the art from the embodiments of the present application without making any creative effort, shall fall within the protection scope of the present application.
In order to enable a person skilled in the art to make use of the present disclosure, the following embodiments are given. It will be apparent to those skilled in the art that the general principles defined herein may be applied to other embodiments and applications without departing from the spirit and scope of the application. Applications of the system or method of the present application may include web pages, plug-ins for browsers, client terminals, customization systems, internal analysis systems, or artificial intelligence robots, among others, or any combination thereof.
It should be noted that in the embodiments of the present application, the term "comprising" is used to indicate the presence of the features stated hereinafter, but does not exclude the addition of further features.
It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined and explained in subsequent figures.
Fig. 1 illustrates a schematic diagram of exemplary hardware and software components of an electronic device 100 that may implement the concepts of the present application, according to some embodiments of the present application. The electronic device 100 may include a network port 110 connected to a network, one or more first processors 120 for executing program instructions, a communication bus 130, and a first storage medium 140 of a different form, such as a disk, ROM, or RAM, or any combination thereof. Illustratively, the electronic device 100 may also include program instructions stored in ROM, RAM, or other types of non-transitory storage media, or any combination thereof, according to which the methods of the present application may be implemented. The electronic device 100 may also include an Input/Output (I/O) interface 150 with other Input/Output devices (e.g., keyboard, display screen).
In some embodiments, the first processor 120 may process information and/or data related to embodiments of the present application to perform one or more of the functions described herein. In some embodiments, the first processor 120 may include one or more processing cores (e.g., a single-core processor (S) or a multi-core processor (S)). Merely by way of example, the first Processor 120 may include a Central Processing Unit (CPU), an Application Specific Integrated Circuit (ASIC), an Application Specific Instruction Set Processor (ASIP), a Graphics Processing Unit (GPU), a Physical Processing Unit (PPU), a Digital Signal Processor (DSP), a Field Programmable Gate Array (FPGA), a Programmable Logic Device (PLD), a controller, a microcontroller Unit, a Reduced Instruction Set computer (Reduced Instruction Set Computing, RISC), a microprocessor, or the like, or any combination thereof.
The first processor 120 in the electronic device 100 may be a general-purpose computer or a special-purpose computer, either of which may be used to implement the methods of the present application. Although only a single computer is shown for convenience, the functions described herein may be implemented in a distributed fashion across multiple similar platforms to balance processing loads.
For ease of illustration, only one processor is depicted in electronic device 100. However, it should be noted that the electronic device 100 in the present application may also comprise a plurality of processors, and thus the steps performed by one processor described in the present application may also be performed by a plurality of processors in combination or individually. For example, if the processor of the electronic device 100 executes steps a and B, it should be understood that steps a and B may also be executed by two different processors together or separately in one processor. For example, a first processor performs step A and a second processor performs step B, or both a first processor and a second processor perform steps A and B.
The network may be used for the exchange of information and/or data. In some embodiments, one or more components in electronic device 100 may send information and/or data to other components. For example, the electronic device 100 may acquire the signal via a network. Merely by way of example, the network may include a wireless network, a telecommunications network, an intranet, the Internet, a Local Area Network (LAN), a Wide Area Network (WAN), a Wireless Local Area Network (WLAN), a Metropolitan Area Network (MAN), a Bluetooth network, a ZigBee network, or a Near Field Communication (NFC) network, among others, or any combination thereof.
In some embodiments, the network may include one or more network access points. For example, the network may include wired or wireless network access points, such as base stations and/or network switching nodes, through which one or more components of electronic device 100 may connect to the network to exchange data and/or information.
Fig. 2 shows a flowchart of a biological information recognition method based on deep network multimodal information fusion according to an embodiment of the present application; the method is applicable to the electronic device 100 shown in fig. 1 and is executed by the electronic device 100 of fig. 1. It should be understood that, in other embodiments, the order of some steps in the biological information identification method based on deep network multimodal information fusion of this embodiment may be interchanged according to actual needs, or some steps may be omitted. The flow of the biological information recognition method based on deep network multimodal information fusion shown in fig. 2 is described in detail below.
Step S210, images to be identified of a plurality of modalities are acquired.
The images to be recognized of the plurality of modalities belong to the same biological information.
And step S220, carrying out fusion processing on the images to be identified in the plurality of modes through a preset identification model to obtain fusion characteristics.
And step S230, carrying out classification processing according to the fusion characteristics to obtain the category of the biological information.
The preset recognition model is obtained by training according to biological information images of a plurality of modals.
According to the method, the images to be recognized in the plurality of modalities are fused through the preset recognition model to obtain fusion features, and classification is performed according to the fusion features. Classification and identification are thus performed according to the features of the fused images of the plurality of modalities, which solves the problem of low image-identification reliability caused by prior-art neural network models generally identifying the images of each modality separately.
For step S210, it should be noted that the specific manner of acquiring the image to be recognized is not limited, and may be set according to the actual application requirement. For example, in an alternative example, when the electronic device 100 has a module of an image capturing function, the images to be recognized of the plurality of modalities may be directly captured by the electronic device 100. For another example, in another alternative example, when the electronic device 100 does not have a module of an image capturing function, images to be identified of a plurality of modalities may be captured by an image capturing device connected to the electronic device 100.
The biological information may relate to a cancer lesion, and specifically to hepatocellular carcinoma (HCC). The obtained image to be recognized can be a 3D image or a 2D image; the shape of the 3D image is 16×16 and the shape of the 2D image is 28×28. After the image to be recognized is obtained, it can be preprocessed, for example by adjusting its shape and dimensions, to fit the unified input of the network. A minimal preprocessing sketch is given below.
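The preprocessing itself is not specified further by this embodiment; as a minimal sketch, assuming TensorFlow (the framework named later in this description) and a hypothetical unified 2D input size, it could look like the following. The function name, target size and normalization are illustrative assumptions.

```python
import numpy as np
import tensorflow as tf

def preprocess_modality(image: np.ndarray, target_hw=(28, 28)) -> tf.Tensor:
    """Resize a single-modality image to a unified 2D input shape and scale it.

    target_hw is a hypothetical unified size (the 2D inputs mentioned above are
    28x28); the exact preprocessing is left to the implementer by the patent.
    """
    img = tf.convert_to_tensor(image, dtype=tf.float32)
    if img.shape.rank == 2:                       # add a channel axis for grayscale input
        img = img[..., tf.newaxis]
    img = tf.image.resize(img, target_hw)         # unify the spatial dimensions
    img = img / (tf.reduce_max(img) + 1e-8)       # simple intensity normalization (assumption)
    return img
```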
It should be noted that, in order to obtain the preset recognition model, before step S210, the biological information recognition method based on deep-network multimodal information fusion provided in the embodiment of the present application may further include a step of training the preset recognition model. Therefore, on the basis of fig. 2, fig. 3 is a schematic flowchart of another biometric information identification method provided in the embodiment of the present application, and referring to fig. 3, the biometric information identification method may further include:
and step S240, calculating to obtain a total loss function according to the optimization loss function and the classification loss function of the initial model.
And step S250, training the initial model according to the total loss function to obtain a preset recognition model.
With respect to step S240, it should be noted that, as shown in fig. 4, the preset recognition model (a deep supervision network, DSN) provided in the embodiment of the present application includes three convolutional layers (conv1, conv2, and conv3), two fully connected layers, projection matrices (Wpp, Wap, and Wpvp) corresponding to the images to be recognized of the respective modalities, a conca (concatenation) layer, and a softmax layer. Each modality has its own CNN for depth feature learning, that is, a single transverse branch before the conca layer in fig. 4. A sketch of this structure is given below.
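For illustration, a minimal sketch of this structure in TensorFlow/Keras (the framework mentioned later in this description) could look as follows. Kernel sizes, channel counts and feature dimensions are assumptions, not values prescribed by the patent; the per-modality softmax heads correspond to the deep supervision described below, and the projected features are also returned so that a discriminant loss can be computed on them.

```python
from tensorflow.keras import layers, Model

def modality_branch(x_in, feat_dim=128, proj_dim=64, name="pp"):
    """One per-modality branch: conv1-conv3, two fully connected layers, then a
    learnable projection (the matrix W of this modality). Layer sizes are assumptions."""
    x = layers.Conv2D(32, 3, activation="relu", name=f"{name}_conv1")(x_in)
    x = layers.Conv2D(64, 3, activation="relu", name=f"{name}_conv2")(x)
    x = layers.Conv2D(64, 3, activation="relu", name=f"{name}_conv3")(x)
    x = layers.Flatten()(x)
    x = layers.Dense(256, activation="relu", name=f"{name}_fc1")(x)
    f = layers.Dense(feat_dim, activation="relu", name=f"{name}_fc2")(x)   # depth feature F
    z = layers.Dense(proj_dim, use_bias=False, name=f"{name}_proj")(f)     # learned feature W*F
    return f, z

def build_dsn(input_shape=(28, 28, 1), num_classes=2):
    """Three modality branches (pp, ap, pvp) fused by a concat ("conca") layer and a
    softmax layer, with one auxiliary softmax head per modality for deep supervision."""
    names = ("pp", "ap", "pvp")
    inputs = [layers.Input(shape=input_shape, name=f"{n}_input") for n in names]
    projected = [modality_branch(x, name=n)[1] for x, n in zip(inputs, names)]
    fused = layers.Concatenate(name="conca")(projected)
    fused_out = layers.Dense(num_classes, activation="softmax", name="fused_softmax")(fused)
    aux_outs = [layers.Dense(num_classes, activation="softmax", name=f"{n}_softmax")(z)
                for n, z in zip(names, projected)]
    return Model(inputs, [fused_out] + aux_outs + projected)
```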
The embodiment of the application aims to provide a biological information identification method based on deep network multi-modal information fusion, which includes: introducing a discriminative intra-class loss function to reduce the distance between features of the same class and enlarge the distance between features of tumors of different classes; designing an adaptive weighting strategy based on the loss values of the different modalities in each optimization iteration, to increase the contribution of a modality with a relatively low loss value and reduce the influence of a modality with a large loss value on the final loss function; and proposing an overall loss function for the deep learning framework, which combines the two proposed parts to better characterize the lesion.
In detail, the design of the discriminative intra-class loss function proceeds as follows:
Image blocks of the images to be recognized in the three modalities (the pre-contrast phase, the arterial phase, and the portal venous phase) are each fed into a CNN, and the depth features of each phase (Fpp, Fap and Fpvp) are produced by the second fully connected layer. An intra-modal discrimination module is introduced to reduce the distance between features of the same category and increase the distance between features of different categories. Then, three projection matrices (Wpp, Wap, and Wpvp) are learned, and the depth features are projected into a new feature representation, maximizing the correlation of same-class samples by reducing the distance between their features.
Moreover, an adaptive deep supervision network module suitable for multi-modal fusion can be designed: the learned features of each modality are each sent into a softmax layer for classification, so that a softmax layer is connected to each set of learned features. In the deep supervision network fusion and adaptive weighting adjustment steps, a good adjustment strategy is provided for improving classification performance through direct supervision of the multi-modal feature extraction.
In detail, the deep supervision network fusion with adaptive weighting adjustment proceeds as follows:
the classification loss function that classifies the original DSN may be as follows:
Lclassify=Lcon+λ1·Lpp+λ2·Lap+λ3·Lpvp;
wherein λ 1, λ 2 and λ 3 are weights, Lcon represents a loss value after the three modalities are fused, and Lpp, Lap and Lpvp represent a loss value of a single modality. The weights of the different modalities can be adaptively adjusted:
(The adaptive weight formula is shown as an image in the original document; it adjusts the weight of each modality from the classification-loss gap Δ and the focusing parameter β defined below.)
Here β is a positive focusing parameter, a hyper-parameter that can be customized; when it is set to 2, the weight is proportional to the relative classification performance of networks m and n, and the classification losses of m and n focus the transfer of positive feature knowledge. Let Lm and Ln be the classification losses of networks m and n, corresponding to the m-th and n-th modalities, respectively, and let Δ = Lm − Ln; a positive Δ indicates that network n performs better than network m. A minimal sketch of this classification loss is given below.
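As an illustration, and reusing the hypothetical model sketched above, the classification loss could be assembled as follows. Because the adaptive weight formula itself is shown only as an image in the original document, the weights λ1, λ2 and λ3 are passed in as plain arguments here rather than computed from Δ and β.

```python
import tensorflow as tf

cce = tf.keras.losses.SparseCategoricalCrossentropy()

def classification_loss(y_true, fused_pred, pp_pred, ap_pred, pvp_pred,
                        lambdas=(1.0, 1.0, 1.0)):
    """L_classify = L_con + lam1*L_pp + lam2*L_ap + lam3*L_pvp.

    In the original method the lambdas would be adjusted adaptively from the
    per-modality losses; fixed values are used here as a stand-in."""
    l_con = cce(y_true, fused_pred)
    l_pp = cce(y_true, pp_pred)
    l_ap = cce(y_true, ap_pred)
    l_pvp = cce(y_true, pvp_pred)
    return l_con + lambdas[0] * l_pp + lambdas[1] * l_ap + lambdas[2] * l_pvp
```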
Further, the total loss function Ltotal of the end-to-end multi-modal fusion framework proposed in the embodiments of the present application combines the two proposed parts to better characterize a lesion, as follows:
Ltotal = LD + Lclassify, where LD is an optimization loss that maximizes the correlation between same-class samples across modalities and minimizes the correlation between samples of different classes, and Lclassify is the adaptive-weight deep supervision loss function based on the cross-entropy loss, used for classification.
The specific steps for generating the LD may be as follows:
the intra-modal discrimination module may design a computational matrix W that projects the depth features of each phase F into space, decreasing the xi and xj distances of the samples if they are from the same class, and increasing the distances otherwise. The constraints are defined as follows: if two tumors are from the same category (yij-1), their relative depth characteristic distance should be less than a predefined threshold μ - τ, otherwise yij-1, the deep layer characteristic distance of the two tumors should be greater than μ + τ. xi, xj represents two image samples of the same data batch in the same modality, yij represents the labels of the two samples, and if the two labels are the same, yij is equal to 1, otherwise yij is equal to-1. μ ═ 10, τ ═ 1, μ, τ are two empirical parameters.
yij·(μ − dW(xi, xj)) > τ (1);
The distance dW(xi, xj) between a pair of depth features xi and xj is calculated as:
dW(xi, xj) = ||W·xi − W·xj||2 (2);
where W·xi denotes the projection of sample xi and W·xj denotes the projection of sample xj.
The intra-modal discriminant function is defined as follows:
D = Σ(i,j) h(τ − yij·(μ − dW(xi, xj))) (3);
where h () has the formula h (x) ═ max (0, x), the three stages of discriminators, Dpp, Dap and Dpp, can be constructed using the corresponding three projection matrices Wpp, Wap and Wpvp, respectively. To learn features from these three phases, we construct an objective function for the intra-modal discriminant module as follows:
(The objective function of the intra-modal discriminant module is shown as an image in the original document; it combines the three discriminators Dpp, Dap and Dpvp through coordination parameters α1, α2 and α3 raised to the power k.)
Here α denotes a coordination parameter among the modalities, and the hyper-parameter k makes α1, α2 and α3 nonlinear, thereby avoiding suboptimal solutions. In the present work k = 2; several values of k were tested, and k = 2 gave the higher AUC value. If k > 3, network training becomes suboptimal and may be difficult to converge. A sketch of one per-modality discriminant term follows.
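The following sketch implements one per-modality discriminant term as a hinge penalty on violations of constraint (1), following the reconstruction in formula (3) above; the batch-wise pairing and the averaging over pairs are implementation assumptions.

```python
import tensorflow as tf

def intra_modal_discriminant(projected, labels, mu=10.0, tau=1.0):
    """Pairwise discriminant loss on the projected features of one modality.

    projected: [batch, proj_dim] tensor of W*x features.
    labels:    [batch] integer class labels.
    y_ij = +1 for same-class pairs and -1 otherwise; pairs violating
    y_ij * (mu - d_W(x_i, x_j)) > tau are penalized with h(x) = max(0, x).
    """
    d = tf.norm(projected[:, None, :] - projected[None, :, :], axis=-1)  # pairwise d_W
    same = tf.cast(labels[:, None] == labels[None, :], tf.float32)
    y = 2.0 * same - 1.0                                                 # +1 / -1 pair labels
    violation = tf.nn.relu(tau - y * (mu - d))                           # h(tau - y*(mu - d))
    mask = 1.0 - tf.eye(tf.shape(projected)[0])                          # ignore i == j pairs
    return tf.reduce_sum(violation * mask) / tf.reduce_sum(mask)
```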
For step S250, it should be noted that, after the total loss function is obtained, the TensorFlow framework may be used to implement the multi-modal feature fusion, and the SGD algorithm may be used to minimize the objective function. TensorFlow is only the framework used for model implementation; SGD (stochastic gradient descent) minimizes the objective function and is an important step in training the model. A sketch of one such training step is given below.
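Combining the sketches above, one SGD training step on the total loss could look roughly like the following. The model, loss helpers and weightings come from the hypothetical sketches earlier in this description, and the optimizer settings are assumptions rather than values fixed by the patent.

```python
import tensorflow as tf

optimizer = tf.keras.optimizers.SGD(learning_rate=1e-3, momentum=0.9)  # SGD as stated; settings assumed

@tf.function
def train_step(model, pp, ap, pvp, y_true):
    """One SGD step on L_total = L_D + L_classify, using the helpers sketched above."""
    with tf.GradientTape() as tape:
        (fused_pred, pp_pred, ap_pred, pvp_pred,
         pp_z, ap_z, pvp_z) = model([pp, ap, pvp], training=True)
        l_classify = classification_loss(y_true, fused_pred, pp_pred, ap_pred, pvp_pred)
        # Equal weighting of the three discriminant terms stands in for the
        # alpha^k coordination weights, whose exact formula is not reproduced here.
        l_d = (intra_modal_discriminant(pp_z, y_true)
               + intra_modal_discriminant(ap_z, y_true)
               + intra_modal_discriminant(pvp_z, y_true)) / 3.0
        l_total = l_d + l_classify
    grads = tape.gradient(l_total, model.trainable_variables)
    optimizer.apply_gradients(zip(grads, model.trainable_variables))
    return l_total
```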
For step S220, it should be noted that the specific manner of performing the fusion process is not limited, and may be set according to the actual application requirement. For example, in an alternative example, step S220 may include a step of obtaining a fusion feature according to the learned feature, and therefore, on the basis of fig. 2, fig. 5 is a schematic flowchart of another biometric information identification method provided in the embodiment of the present application, and referring to fig. 5, step S220 may include:
step S221, performing feature processing on the image to be recognized in each modality to obtain a learning feature of each image to be recognized.
And step S222, splicing the learning features of the images to be processed to obtain fusion features.
For step S221, it should be noted that the specific manner of performing the feature processing is not limited, and the feature processing may be set according to the actual application requirement. For example, in an alternative example, the preset recognition model includes projection matrices corresponding to images to be recognized of respective modalities, and step S221 may include a step of calculating a learning feature from the projection matrices. Therefore, on the basis of fig. 5, fig. 6 is a schematic flowchart of another biometric information identification method provided in the embodiment of the present application, and referring to fig. 6, step S221 may include:
step S2211, for the image to be recognized in each modality, performing depth recognition processing on the image to be recognized, so as to obtain a depth feature of the image to be recognized.
And step S2212, calculating the learning feature of the image to be recognized according to the depth feature of the image to be recognized and the corresponding projection matrix.
The projection matrix W is learnable; the depth feature F obtained by the CNN is multiplied by the learnable W to obtain the learning feature.
For step S222, it should be noted that the specific manner of performing the splicing process is not limited, and the splicing process may be set according to the actual application requirement. For example, in an alternative example, the preset recognition model includes a conca layer, and step S222 may include the following sub-steps:
and splicing the learning features of the images to be processed through the conca layer to obtain fusion features.
That is, after the conca layer concatenates the feature matrices of the three learning features, the result is sent to the softmax layer for classification.
After step S230, the biometric information identification method provided in the embodiment of the present application may further include a step of updating the preset identification model. Therefore, on the basis of fig. 2, fig. 7 is a schematic flowchart of another biometric information identification method provided in the embodiment of the present application, and referring to fig. 7, the biometric information identification method may further include:
step S260, obtaining feedback data of the user.
And step S270, updating the preset identification model according to the feedback data.
The images are labeled as high-grade or low-grade according to the histological grading of HCC: grades I and II correspond to low-grade, and grades III and IV correspond to high-grade. During testing, the labels predicted by the model are compared with the original labels of the images, from which the accuracy, sensitivity, specificity, AUC value and the like of the high/low-grade tumor classification can be obtained. By fusing discriminative learning and adaptive weighting under the deep learning framework, the problems existing in the two kinds of multi-modal fusion are solved and the performance indices of the model are improved. A sketch of how these metrics can be computed is given below.
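For completeness, the metrics mentioned above can be computed from the predicted and original labels roughly as follows (a plain NumPy/scikit-learn sketch; the patent does not prescribe a particular implementation, and the function name and threshold are assumptions).

```python
import numpy as np
from sklearn.metrics import confusion_matrix, roc_auc_score

def grading_metrics(y_true, y_prob, threshold=0.5):
    """Accuracy, sensitivity, specificity and AUC for the high/low-grade classification.

    y_true: 0 = low-grade, 1 = high-grade; y_prob: predicted probability of high-grade."""
    y_pred = (np.asarray(y_prob) >= threshold).astype(int)
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred, labels=[0, 1]).ravel()
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
        "auc": roc_auc_score(y_true, y_prob),
    }
```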
That is, the superiority of the method provided by the embodiments of the present application can be verified by comparison with the current multimodal fusion method and a large number of experiments, and the flow steps of the experimental verification are as follows.
1) Two fusion experiments were performed: 3D CNN-based multi-phase fusion and 2D CNN-based multi-view fusion;
2) comparing typical fusion methods;
3) the two modules are combined with a typical fusion method, and the validity of the two modules is verified.
In detail, 117 histologically confirmed liver cancer patients were included; the data set was randomly divided into a training set (77 HCCs) and a test set (40 HCCs), and training and testing were repeated five times to calculate the accuracy, sensitivity, specificity, and area under the curve (AUC).
Fig. 9 shows the results of 3D CNN-based multi-phase fusion, comparing the performance (%) of different three-dimensional multi-phase fusion methods on the 117 HCCs. Fig. 8 shows the results of 2D CNN multi-view fusion, comparing the effect of different methods on two-dimensional multi-view fusion of the arterial phase of the 117 HCCs. From these experimental results each module can be studied in detail: the results show that the method provided by the embodiment of the present application achieves better performance indices, which further proves the effectiveness of the proposed discriminative intra-modal term and adaptive weighting adjustment strategy in HCC classification, and verifies the effectiveness of the proposed adaptive weighting strategy in 3D CNN-based multi-phase fusion.
It should be noted that, when the performance index of the method provided by the embodiment of the present application fails to meet the requirement, feedback data of the user may be obtained, and the preset identification model is updated according to the feedback data until the performance index meets the requirement.
With reference to fig. 10, an embodiment of the present application further provides a biological information recognition apparatus 1000 based on deep network multimodal information fusion, where functions implemented by the biological information recognition apparatus 1000 correspond to steps executed by the foregoing method. The biometric information recognition apparatus 1000 may be understood as a processor of the electronic device 100, or may be understood as a component that is independent of the electronic device 100 or a processor and that implements the functions of the present application under the control of the electronic device 100. The biological information recognition apparatus 1000 may include an image acquisition module 1010, an image processing module 1020, and a classification module 1030, among others.
The image obtaining module 1010 is configured to obtain images to be recognized in multiple modalities, where the images to be recognized in the multiple modalities belong to the same biological information. In the embodiment of the present application, the image obtaining module 1010 may be configured to perform step S210 shown in fig. 2, and reference may be made to the foregoing description of step S210 regarding the relevant content of the image obtaining module 1010.
The image processing module 1020 is configured to perform fusion processing on the images to be identified in the multiple modalities through a preset identification model to obtain fusion characteristics. In the embodiment of the present application, the image processing module 1020 may be configured to perform step S220 shown in fig. 2, and reference may be made to the foregoing description of step S220 regarding the relevant content of the image processing module 1020.
And a classification module 1030, configured to perform classification processing according to the fusion features to obtain a category of the biological information, where the preset identification model is obtained by training a plurality of modal biological information images. In the embodiment of the present application, the classifying module 1030 may be configured to perform step S230 shown in fig. 2, and reference may be made to the foregoing description of step S230 for relevant contents of the classifying module 1030.
Further, the image processing module 1020 may be further specifically configured to:
performing feature processing on the images to be recognized in each mode to obtain learning features of the images to be recognized;
and splicing the learning features of the images to be processed to obtain fusion features.
In addition, an embodiment of the present application further provides a computer-readable storage medium, in which a computer program is stored, and the computer program is executed by a processor to perform the steps of the above-mentioned biological information identification method.
The computer program product of the biological information identification method provided in the embodiment of the present application includes a computer readable storage medium storing a program code, and instructions included in the program code may be used to execute the steps of the biological information identification method in the above method embodiment, which may be specifically referred to in the above method embodiment and are not described herein again.
In summary, according to the biological information identification method and device based on deep network multi-modal information fusion provided by the embodiments of the present application, the images to be recognized in multiple modalities are fused through the preset recognition model to obtain fusion features, and classification is performed according to the fusion features. Classification and identification are thus performed according to the fused features of the images in the multiple modalities, which solves the problem of low image-identification reliability caused by prior-art neural network models generally identifying the images of each modality separately.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. The apparatus embodiments described above are merely illustrative, and for example, the flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of apparatus, methods and computer program products according to embodiments of the present application. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
In addition, functional modules in the embodiments of the present application may be integrated together to form an independent part, or each module may exist separately, or two or more modules may be integrated to form an independent part.
The functions, if implemented in the form of software functional modules and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present application or portions thereof that substantially contribute to the prior art may be embodied in the form of a software product stored in a storage medium and including instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present application. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.
The above description is only a preferred embodiment of the present application and is not intended to limit the present application, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, improvement and the like made within the spirit and principle of the present application shall be included in the protection scope of the present application.

Claims (10)

1. A biological information identification method based on deep network multi-modal information fusion is characterized by comprising the following steps:
acquiring images to be recognized of a plurality of modalities, wherein the images to be recognized of the plurality of modalities belong to the same biological information;
fusing the images to be recognized in the plurality of modes through a preset recognition model to obtain fusion characteristics;
and carrying out classification processing according to the fusion characteristics to obtain the category of the biological information, wherein the preset identification model is obtained according to the training of biological information images of a plurality of modalities.
2. The biological information recognition method based on deep network multi-modal information fusion as claimed in claim 1, wherein the step of performing fusion processing on the images to be recognized of the plurality of modalities through a preset recognition model to obtain fusion features comprises:
performing feature processing on the image to be recognized in each mode to obtain a learning feature of each image to be recognized;
and splicing the learning features of the images to be processed to obtain fusion features.
3. The biological information recognition method based on deep network multi-modal information fusion as claimed in claim 2, wherein the preset recognition model includes a projection matrix corresponding to the image to be recognized of each modality, and the step of performing feature processing on the image to be recognized of each modality to obtain the learning feature of each image to be recognized includes:
aiming at the image to be recognized of each mode, carrying out depth recognition processing on the image to be recognized to obtain the depth characteristics of the image to be processed;
and calculating to obtain the learning characteristic of the image to be processed according to the depth characteristic of the image to be processed and the corresponding projection matrix.
4. The biological information recognition method based on deep network multimodal information fusion as claimed in claim 2, wherein the preset recognition model includes a conca layer, and the step of obtaining the fusion feature by performing the stitching processing on the learning feature of each image to be processed includes:
and splicing the learning features of the images to be processed through the conca layer to obtain fusion features.
5. The biological information recognition method based on deep network multimodal information fusion as claimed in claim 1, wherein the biological information recognition method based on deep network multimodal information fusion further comprises a step of training a preset recognition model, the step comprising:
calculating to obtain a total loss function according to the optimization loss function and the classification loss function of the initial model;
and training the initial model according to the total loss function to obtain a preset recognition model.
6. The biological information recognition method based on deep network multimodal information fusion as claimed in claim 1, wherein the biological information recognition method based on deep network multimodal information fusion further comprises:
acquiring feedback data of a user;
and updating the preset identification model according to the feedback data.
7. A biological information recognition device based on deep network multi-modal information fusion is characterized by comprising:
the image acquisition module is used for acquiring images to be recognized in a plurality of modalities, wherein the images to be recognized in the plurality of modalities belong to the same biological information;
the image processing module is used for carrying out fusion processing on the images to be identified in the plurality of modes through a preset identification model to obtain fusion characteristics;
and the classification module is used for performing classification processing according to the fusion characteristics to obtain the category of the biological information, wherein the preset identification model is obtained according to the training of biological information images of multiple modalities.
8. The biological information recognition apparatus based on deep network multimodal information fusion as claimed in claim 7, wherein the image processing module is specifically configured to:
performing feature processing on the image to be recognized in each mode to obtain a learning feature of each image to be recognized;
and splicing the learning features of the images to be processed to obtain fusion features.
9. An electronic device, comprising a memory and a processor, wherein the processor is configured to execute an executable computer program stored in the memory to implement the biological information recognition method based on deep network multi-modal information fusion according to any one of claims 1 to 6.
10. A storage medium, characterized in that a computer program is stored thereon, which when executed, implements the steps of the biological information recognition method based on deep network multimodal information fusion of any one of claims 1-6.
CN202011588398.7A 2020-12-29 2020-12-29 Biological information identification method and device based on deep network multi-modal information fusion Pending CN112651445A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011588398.7A CN112651445A (en) 2020-12-29 2020-12-29 Biological information identification method and device based on deep network multi-modal information fusion

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011588398.7A CN112651445A (en) 2020-12-29 2020-12-29 Biological information identification method and device based on deep network multi-modal information fusion

Publications (1)

Publication Number Publication Date
CN112651445A true CN112651445A (en) 2021-04-13

Family

ID=75363666

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011588398.7A Pending CN112651445A (en) 2020-12-29 2020-12-29 Biological information identification method and device based on deep network multi-modal information fusion

Country Status (1)

Country Link
CN (1) CN112651445A (en)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109558882A (en) * 2018-11-30 2019-04-02 苏州大学 Image classification method and device based on robust part low-rank sparse CNN feature
CN109902678A (en) * 2019-02-12 2019-06-18 北京奇艺世纪科技有限公司 Model training method, character recognition method, device, electronic equipment and computer-readable medium
CN111325221A (en) * 2020-02-25 2020-06-23 青岛海洋科学与技术国家实验室发展中心 Image feature extraction method based on image depth information
CN111368893A (en) * 2020-02-27 2020-07-03 Oppo广东移动通信有限公司 Image recognition method and device, electronic equipment and storage medium
CN111639607A (en) * 2020-06-01 2020-09-08 广州虎牙科技有限公司 Model training method, image recognition method, model training device, image recognition device, electronic equipment and storage medium

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113317763A (en) * 2021-06-30 2021-08-31 平安科技(深圳)有限公司 Multi-modal Parkinson's disease detection device and computer-readable storage medium
CN113317763B (en) * 2021-06-30 2024-03-19 平安科技(深圳)有限公司 Multi-mode-based parkinsonism detection device and computer-readable storage medium
CN113591629A (en) * 2021-07-16 2021-11-02 深圳职业技术学院 Finger three-mode fusion recognition method, system, device and storage medium
CN113591629B (en) * 2021-07-16 2023-06-27 深圳职业技术学院 Finger tri-modal fusion recognition method, system, device and storage medium
CN114693977A (en) * 2022-04-06 2022-07-01 北京百度网讯科技有限公司 Image processing method, model training method, device, equipment and medium
CN117746997A (en) * 2023-12-28 2024-03-22 徐州医科大学 Cis-regulation die body identification method based on multi-mode priori information
CN117746997B (en) * 2023-12-28 2024-06-11 徐州医科大学 Cis-regulation die body identification method based on multi-mode priori information

Similar Documents

Publication Publication Date Title
CN112651445A (en) Biological information identification method and device based on deep network multi-modal information fusion
US11508146B2 (en) Convolutional neural network processing method and apparatus
CN113449857B (en) Data processing method and data processing equipment
Patravali et al. 2D-3D fully convolutional neural networks for cardiac MR segmentation
CN108182441B (en) Parallel multichannel convolutional neural network, construction method and image feature extraction method
CN110782015B (en) Training method, device and storage medium for network structure optimizer of neural network
US10534999B2 (en) Apparatus for classifying data using boost pooling neural network, and neural network training method therefor
US10275688B2 (en) Object detection with neural network
CN113570029A (en) Method for obtaining neural network model, image processing method and device
US20160350649A1 (en) Method and apparatus of learning neural network via hierarchical ensemble learning
US20180114110A1 (en) Method and apparatus to reduce neural network
Hernández et al. Using deep learning for segmentation and counting within microscopy data
WO2020260862A1 (en) Facial behaviour analysis
CN111160225B (en) Human body analysis method and device based on deep learning
WO2023280113A1 (en) Data processing method, training method for neural network model, and apparatus
CN109284782A (en) Method and apparatus for detecting feature
CN114612774A (en) Target detection and model construction method thereof, electronic device and storage medium
Liu et al. Style curriculum learning for robust medical image segmentation
Khan et al. Classification of colon cancer using deep learning techniques on histopathological images
Lanchantin et al. Graph convolutional networks for epigenetic state prediction using both sequence and 3D genome data
Wu et al. A cascaded CNN-based method for monocular vision robotic grasping
Arun et al. Facial micro emotion detection and classification using swarm intelligence based modified convolutional network
CN110210523B (en) Method and device for generating image of clothes worn by model based on shape graph constraint
Garcia-Garcia et al. CGP-NAS: Real-based solutions encoding for multi-objective evolutionary neural architecture search
CN115170568A (en) Automatic segmentation method and system for rectal cancer image and chemoradiotherapy response prediction system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination