CN112651445A - Biological information identification method and device based on deep network multi-modal information fusion - Google Patents

Biological information identification method and device based on deep network multi-modal information fusion

Info

Publication number
CN112651445A
Authority
CN
China
Prior art keywords
fusion
biological information
images
image
recognized
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011588398.7A
Other languages
Chinese (zh)
Inventor
周武
李尚轩
张林子
张洪来
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangzhou University Of Chinese Medicine Guangzhou Institute Of Chinese Medicine
Guangzhou University of Chinese Medicine
Original Assignee
Guangzhou University Of Chinese Medicine Guangzhou Institute Of Chinese Medicine
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangzhou University Of Chinese Medicine Guangzhou Institute Of Chinese Medicine filed Critical Guangzhou University Of Chinese Medicine Guangzhou Institute Of Chinese Medicine
Priority to CN202011588398.7A priority Critical patent/CN112651445A/en
Publication of CN112651445A publication Critical patent/CN112651445A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/25Fusion techniques
    • G06F18/254Fusion techniques of classification results, e.g. of results related to same input data
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/217Validation; Performance evaluation; Active pattern learning techniques

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The embodiments of the present application provide a biological information identification method and device based on deep network multi-modal information fusion, relating to the technical field of image identification. The biological information identification method based on deep network multi-modal information fusion comprises the following steps: first, images to be recognized in a plurality of modalities are obtained, wherein the images to be recognized in the plurality of modalities belong to the same biological information; second, the images to be recognized in the plurality of modalities are fused through a preset recognition model to obtain fusion features; then, classification is performed according to the fusion features to obtain the category of the biological information, wherein the preset recognition model is obtained by training on biological information images of a plurality of modalities. By this method, classification and identification are performed according to the fused features of the images in the plurality of modalities, which solves the problem of low reliability of image identification caused by prior-art neural network models generally identifying the images of the plurality of modalities separately.

Description

Biological information identification method and device based on deep network multi-modal information fusion
Technical Field
The application relates to the technical field of image recognition, in particular to a biological information recognition method and device based on deep network multi-modal information fusion.
Background
Obtaining the category to which an image belongs by manual recognition suffers from low efficiency. In the field of biological information recognition, images can be recognized through a neural network model to improve the efficiency of biological information image recognition, but challenges remain in multi-modal image recognition. Images exhibit large intra-class shape differences across multiple modalities, which makes classification more difficult, and prior-art neural network models generally identify the images of each modality separately, so the reliability of image identification is low.
Disclosure of Invention
In view of the above, an object of the present application is to provide a biological information recognition method and apparatus based on deep network multimodal information fusion, so as to solve the problems in the prior art.
In order to achieve the above purpose, the embodiment of the present application adopts the following technical solutions:
in a first aspect, the invention provides a biological information identification method based on deep network multimodal information fusion, which comprises the following steps:
acquiring images to be recognized of a plurality of modalities, wherein the images to be recognized of the plurality of modalities belong to the same biological information;
fusing the images to be recognized in the plurality of modes through a preset recognition model to obtain fusion characteristics;
and carrying out classification processing according to the fusion characteristics to obtain the category of the biological information, wherein the preset identification model is obtained according to the training of biological information images of a plurality of modalities.
In an optional embodiment, the step of obtaining a fusion feature by fusing the images to be recognized in the multiple modalities through a preset recognition model includes:
performing feature processing on the image to be recognized in each mode to obtain a learning feature of each image to be recognized;
and splicing the learning features of the images to be processed to obtain fusion features.
In an optional embodiment, the preset identification model includes a projection matrix corresponding to the image to be identified in each modality, and the step of performing feature processing on the image to be identified in each modality to obtain a learning feature of each image to be identified includes:
aiming at the image to be recognized of each mode, carrying out depth recognition processing on the image to be recognized to obtain the depth characteristics of the image to be processed;
and calculating to obtain the learning characteristic of the image to be processed according to the depth characteristic of the image to be processed and the corresponding projection matrix.
In an optional embodiment, the preset recognition model includes a conca layer, and the step of performing stitching processing on the learning features of the images to be processed to obtain a fusion feature includes:
and splicing the learning features of the images to be processed through the conca layer to obtain fusion features.
In an optional embodiment, the biological information recognition method based on deep network multimodal information fusion further includes a step of training a preset recognition model, where the step includes:
calculating to obtain a total loss function according to the optimization loss function and the classification loss function of the initial model;
and training the initial model according to the total loss function to obtain a preset recognition model.
In an optional embodiment, the biological information identification method based on deep network multimodal information fusion further includes:
acquiring feedback data of a user;
and updating the preset identification model according to the feedback data.
In a second aspect, the present invention provides a biological information recognition apparatus based on deep network multimodal information fusion, including:
the image acquisition module is used for acquiring images to be recognized in a plurality of modalities, wherein the images to be recognized in the plurality of modalities belong to the same biological information;
the image processing module is used for carrying out fusion processing on the images to be identified in the plurality of modes through a preset identification model to obtain fusion characteristics;
and the classification module is used for performing classification processing according to the fusion characteristics to obtain the category of the biological information, wherein the preset identification model is obtained according to the training of biological information images of multiple modalities.
In an alternative embodiment, the image processing module is specifically configured to:
performing feature processing on the image to be recognized in each mode to obtain a learning feature of each image to be recognized;
and splicing the learning features of the images to be processed to obtain fusion features.
In a third aspect, the present invention provides an electronic device, which includes a memory and a processor, where the processor is configured to execute an executable computer program stored in the memory to implement the biological information recognition method based on deep network multimodal information fusion according to any one of the foregoing embodiments.
In a fourth aspect, the present invention provides a storage medium, on which a computer program is stored, which when executed, implements the steps of the biological information recognition method based on deep network multimodal information fusion of any one of the foregoing embodiments.
According to the biological information identification method and device based on deep network multi-modal information fusion, the images to be recognized in multiple modalities are fused through the preset recognition model to obtain fusion features, and classification is performed according to the fusion features. Classification and identification are thus performed according to the fused features of the images in the multiple modalities, which solves the problem of low image-identification reliability caused by prior-art neural network models generally identifying the images of each modality separately.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are required to be used in the embodiments will be briefly described below, it should be understood that the following drawings only illustrate some embodiments of the present application and therefore should not be considered as limiting the scope, and for those skilled in the art, other related drawings can be obtained from the drawings without inventive effort.
Fig. 1 is a block diagram of an electronic device according to an embodiment of the present disclosure.
Fig. 2 is a schematic flowchart of a biological information identification method based on deep network multimodal information fusion according to an embodiment of the present application.
Fig. 3 is another schematic flow chart of a biological information identification method based on deep network multimodal information fusion according to an embodiment of the present application.
Fig. 4 is a block diagram of a preset recognition model according to an embodiment of the present disclosure.
Fig. 5 is another schematic flowchart of a biological information identification method based on deep network multimodal information fusion according to an embodiment of the present application.
Fig. 6 is another schematic flowchart of a biological information identification method based on deep network multimodal information fusion according to an embodiment of the present application.
Fig. 7 is another schematic flowchart of a biological information identification method based on deep network multimodal information fusion according to an embodiment of the present application.
Fig. 8 is a result chart of experimental verification provided in the embodiment of the present application.
Fig. 9 is another result chart of experimental verification provided in the embodiments of the present application.
Fig. 10 is a block diagram illustrating a structure of a biological information recognition apparatus based on deep network multimodal information fusion according to an embodiment of the present application.
Reference numerals: 100-electronic device; 110-network port; 120-first processor; 130-communication bus; 140-first storage medium; 150-interface; 1000-biological information recognition apparatus; 1010-image acquisition module; 1020-image processing module; 1030-classification module.
Detailed Description
The defects of the above solutions were identified by the inventors through practice and careful study; therefore, both the discovery of the above problems and the solutions proposed by the present application should be regarded as contributions made by the inventors in the course of this application.
For purposes of making the objects, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions of the embodiments of the present application will be described in detail below with reference to the drawings in the embodiments of the present application, and it should be understood that the drawings in the present application are for illustrative and descriptive purposes only and are not used to limit the scope of the present application. Additionally, it should be understood that the schematic drawings are not necessarily drawn to scale. The flowcharts used in this application illustrate operations implemented according to some embodiments of the present application. It should be understood that the operations of the flow diagrams may be performed out of order, and steps without logical context may be performed in reverse order or simultaneously. One skilled in the art, under the guidance of this application, may add one or more other operations to, or remove one or more operations from, the flowchart.
In addition, the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. The components of the embodiments of the present application, generally described and illustrated in the figures herein, can be arranged and designed in a wide variety of different configurations. Thus, the following detailed description of the embodiments of the present application, presented in the accompanying drawings, is not intended to limit the scope of the claimed application, but is merely representative of selected embodiments of the application. All other embodiments, which can be derived by a person skilled in the art from the embodiments of the present application without making any creative effort, shall fall within the protection scope of the present application.
In order to enable a person skilled in the art to make use of the present disclosure, the following embodiments are given. It will be apparent to those skilled in the art that the general principles defined herein may be applied to other embodiments and applications without departing from the spirit and scope of the application. Applications of the system or method of the present application may include web pages, plug-ins for browsers, client terminals, customization systems, internal analysis systems, or artificial intelligence robots, among others, or any combination thereof.
It should be noted that in the embodiments of the present application, the term "comprising" is used to indicate the presence of the features stated hereinafter, but does not exclude the addition of further features.
It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined and explained in subsequent figures.
Fig. 1 illustrates a schematic diagram of exemplary hardware and software components of an electronic device 100 that may implement the concepts of the present application, according to some embodiments of the present application. The electronic device 100 may include a network port 110 connected to a network, one or more first processors 120 for executing program instructions, a communication bus 130, and a first storage medium 140 of a different form, such as a disk, ROM, or RAM, or any combination thereof. Illustratively, the electronic device 100 may also include program instructions stored in ROM, RAM, or other types of non-transitory storage media, or any combination thereof, according to which the methods of the present application may be implemented. The electronic device 100 may also include an Input/Output (I/O) interface 150 with other Input/Output devices (e.g., keyboard, display screen).
In some embodiments, the first processor 120 may process information and/or data related to embodiments of the present application to perform one or more of the functions described herein. In some embodiments, the first processor 120 may include one or more processing cores (e.g., a single-core processor (S) or a multi-core processor (S)). Merely by way of example, the first Processor 120 may include a Central Processing Unit (CPU), an Application Specific Integrated Circuit (ASIC), an Application Specific Instruction Set Processor (ASIP), a Graphics Processing Unit (GPU), a Physical Processing Unit (PPU), a Digital Signal Processor (DSP), a Field Programmable Gate Array (FPGA), a Programmable Logic Device (PLD), a controller, a microcontroller Unit, a Reduced Instruction Set computer (Reduced Instruction Set Computing, RISC), a microprocessor, or the like, or any combination thereof.
The first processor 120 in the electronic device 100 may be a general-purpose computer or a special-purpose computer, either of which may be used to implement the methods of the present application. Although only a single computer is shown for convenience, the functions described herein may be implemented in a distributed fashion across multiple similar platforms to balance processing loads.
For ease of illustration, only one processor is depicted in electronic device 100. However, it should be noted that the electronic device 100 in the present application may also comprise a plurality of processors, and thus the steps performed by one processor described in the present application may also be performed by a plurality of processors in combination or individually. For example, if the processor of the electronic device 100 executes steps a and B, it should be understood that steps a and B may also be executed by two different processors together or separately in one processor. For example, a first processor performs step A and a second processor performs step B, or both a first processor and a second processor perform steps A and B.
The network may be used for the exchange of information and/or data. In some embodiments, one or more components in electronic device 100 may send information and/or data to other components. For example, the electronic device 100 may acquire the signal via a network. Merely by way of example, the network may include a wireless network, a telecommunications network, an intranet, the Internet, a Local Area Network (LAN), a Wide Area Network (WAN), a Wireless Local Area Network (WLAN), a Metropolitan Area Network (MAN), a Bluetooth network, a ZigBee network, or a Near Field Communication (NFC) network, among others, or any combination thereof.
In some embodiments, the network may include one or more network access points. For example, the network may include wired or wireless network access points, such as base stations and/or network switching nodes, through which one or more components of electronic device 100 may connect to the network to exchange data and/or information.
Fig. 2 shows a flowchart of a biological information recognition method based on deep network multimodal information fusion according to an embodiment of the present application; the method is applicable to the electronic device 100 shown in fig. 1 and is executed by the electronic device 100 of fig. 1. It should be understood that, in other embodiments, the order of some steps in the biological information identification method based on deep network multimodal information fusion of this embodiment may be interchanged according to actual needs, or some steps may be omitted. The flow of the biological information recognition method based on deep network multimodal information fusion shown in fig. 2 is described in detail below.
Step S210, images to be identified of a plurality of modalities are acquired.
The images to be recognized of the plurality of modalities belong to the same biological information.
And step S220, carrying out fusion processing on the images to be identified in the plurality of modes through a preset identification model to obtain fusion characteristics.
And step S230, carrying out classification processing according to the fusion characteristics to obtain the category of the biological information.
The preset recognition model is obtained by training according to biological information images of a plurality of modals.
According to the method, the images to be recognized in the plurality of modalities are fused through the preset recognition model to obtain fusion features, and classification is performed according to the fusion features. Classification and identification are thus performed according to the features of the fused images of the plurality of modalities, which solves the problem of low image-identification reliability caused by prior-art neural network models generally identifying the images of each modality separately.
For step S210, it should be noted that the specific manner of acquiring the image to be recognized is not limited, and may be set according to the actual application requirement. For example, in an alternative example, when the electronic device 100 has a module of an image capturing function, the images to be recognized of the plurality of modalities may be directly captured by the electronic device 100. For another example, in another alternative example, when the electronic device 100 does not have a module of an image capturing function, images to be identified of a plurality of modalities may be captured by an image capturing device connected to the electronic device 100.
The biological information may relate to a cancer lesion, and specifically to hepatocellular carcinoma (HCC). The obtained image to be recognized can be a 3D image or a 2D image; the shape of the 3D image is 16×16 and the shape of the 2D image is 28×28. After the image to be recognized is obtained, it can be preprocessed, for example by adjusting its shape and dimensions, to fit the unified input of the network. A minimal preprocessing sketch is given below.
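The preprocessing itself is not specified further by this embodiment; as a minimal sketch, assuming TensorFlow (the framework named later in this description) and a hypothetical unified 2D input size, it could look like the following. The function name, target size and normalization are illustrative assumptions.

```python
import numpy as np
import tensorflow as tf

def preprocess_modality(image: np.ndarray, target_hw=(28, 28)) -> tf.Tensor:
    """Resize a single-modality image to a unified 2D input shape and scale it.

    target_hw is a hypothetical unified size (the 2D inputs mentioned above are
    28x28); the exact preprocessing is left to the implementer by the patent.
    """
    img = tf.convert_to_tensor(image, dtype=tf.float32)
    if img.shape.rank == 2:                       # add a channel axis for grayscale input
        img = img[..., tf.newaxis]
    img = tf.image.resize(img, target_hw)         # unify the spatial dimensions
    img = img / (tf.reduce_max(img) + 1e-8)       # simple intensity normalization (assumption)
    return img
```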
It should be noted that, in order to obtain the preset recognition model, before step S210, the biological information recognition method based on deep-network multimodal information fusion provided in the embodiment of the present application may further include a step of training the preset recognition model. Therefore, on the basis of fig. 2, fig. 3 is a schematic flowchart of another biometric information identification method provided in the embodiment of the present application, and referring to fig. 3, the biometric information identification method may further include:
and step S240, calculating to obtain a total loss function according to the optimization loss function and the classification loss function of the initial model.
And step S250, training the initial model according to the total loss function to obtain a preset recognition model.
With respect to step S240, it should be noted that, as shown in fig. 4, the preset recognition model (a deep supervision network, DSN) provided in the embodiment of the present application includes three convolutional layers (conv1, conv2, and conv3), two fully connected layers, projection matrices (Wpp, Wap, and Wpvp) corresponding to the images to be recognized of the respective modalities, a conca (concatenation) layer, and a softmax layer. Each modality has its own CNN for depth feature learning, that is, a single transverse branch before the conca layer in fig. 4. A sketch of this structure is given below.
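For illustration, a minimal sketch of this structure in TensorFlow/Keras (the framework mentioned later in this description) could look as follows. Kernel sizes, channel counts and feature dimensions are assumptions, not values prescribed by the patent; the per-modality softmax heads correspond to the deep supervision described below, and the projected features are also returned so that a discriminant loss can be computed on them.

```python
from tensorflow.keras import layers, Model

def modality_branch(x_in, feat_dim=128, proj_dim=64, name="pp"):
    """One per-modality branch: conv1-conv3, two fully connected layers, then a
    learnable projection (the matrix W of this modality). Layer sizes are assumptions."""
    x = layers.Conv2D(32, 3, activation="relu", name=f"{name}_conv1")(x_in)
    x = layers.Conv2D(64, 3, activation="relu", name=f"{name}_conv2")(x)
    x = layers.Conv2D(64, 3, activation="relu", name=f"{name}_conv3")(x)
    x = layers.Flatten()(x)
    x = layers.Dense(256, activation="relu", name=f"{name}_fc1")(x)
    f = layers.Dense(feat_dim, activation="relu", name=f"{name}_fc2")(x)   # depth feature F
    z = layers.Dense(proj_dim, use_bias=False, name=f"{name}_proj")(f)     # learned feature W*F
    return f, z

def build_dsn(input_shape=(28, 28, 1), num_classes=2):
    """Three modality branches (pp, ap, pvp) fused by a concat ("conca") layer and a
    softmax layer, with one auxiliary softmax head per modality for deep supervision."""
    names = ("pp", "ap", "pvp")
    inputs = [layers.Input(shape=input_shape, name=f"{n}_input") for n in names]
    projected = [modality_branch(x, name=n)[1] for x, n in zip(inputs, names)]
    fused = layers.Concatenate(name="conca")(projected)
    fused_out = layers.Dense(num_classes, activation="softmax", name="fused_softmax")(fused)
    aux_outs = [layers.Dense(num_classes, activation="softmax", name=f"{n}_softmax")(z)
                for n, z in zip(names, projected)]
    return Model(inputs, [fused_out] + aux_outs + projected)
```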
The embodiment of the application aims to provide a biological information identification method based on deep network multi-modal information fusion, which includes: introducing a discriminative intra-class loss function to reduce the distance between features of the same class and enlarge the distance between features of tumors of different classes; designing an adaptive weighting strategy based on the loss values of the different modalities in each optimization iteration, to increase the contribution of a modality with a relatively low loss value and reduce the influence of a modality with a large loss value on the final loss function; and proposing an overall loss function for the deep learning framework, which combines the two proposed parts to better characterize the lesion.
In detail, the design of the discriminative intra-class loss function proceeds as follows:
Image blocks of the images to be recognized in the three modalities (the pre-contrast phase, the arterial phase, and the portal venous phase) are each fed into a CNN, and the depth features of each phase (Fpp, Fap and Fpvp) are produced by the second fully connected layer. An intra-modal discrimination module is introduced to reduce the distance between features of the same category and increase the distance between features of different categories. Then, three projection matrices (Wpp, Wap, and Wpvp) are learned, and the depth features are projected into a new feature representation, maximizing the correlation of same-class samples by reducing the distance between their features.
Moreover, an adaptive deep supervision network module suitable for multi-modal fusion can be designed: the learned features of each modality are each sent into a softmax layer for classification, so that a softmax layer is connected to each set of learned features. In the deep supervision network fusion and adaptive weighting adjustment steps, a good adjustment strategy is provided for improving classification performance through direct supervision of the multi-modal feature extraction.
In detail, the deep supervision network fusion with adaptive weighting adjustment proceeds as follows:
the classification loss function that classifies the original DSN may be as follows:
Lclassify=Lcon+λ1·Lpp+λ2·Lap+λ3·Lpvp;
wherein λ 1, λ 2 and λ 3 are weights, Lcon represents a loss value after the three modalities are fused, and Lpp, Lap and Lpvp represent a loss value of a single modality. The weights of the different modalities can be adaptively adjusted:
(The adaptive weight formula is shown as an image in the original document; it adjusts the weight of each modality from the classification-loss gap Δ and the focusing parameter β defined below.)
Here β is a positive focusing parameter, a hyper-parameter that can be customized; when it is set to 2, the weight is proportional to the relative classification performance of networks m and n, and the classification losses of m and n focus the transfer of positive feature knowledge. Let Lm and Ln be the classification losses of networks m and n, corresponding to the m-th and n-th modalities, respectively, and let Δ = Lm − Ln; a positive Δ indicates that network n performs better than network m. A minimal sketch of this classification loss is given below.
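As an illustration, and reusing the hypothetical model sketched above, the classification loss could be assembled as follows. Because the adaptive weight formula itself is shown only as an image in the original document, the weights λ1, λ2 and λ3 are passed in as plain arguments here rather than computed from Δ and β.

```python
import tensorflow as tf

cce = tf.keras.losses.SparseCategoricalCrossentropy()

def classification_loss(y_true, fused_pred, pp_pred, ap_pred, pvp_pred,
                        lambdas=(1.0, 1.0, 1.0)):
    """L_classify = L_con + lam1*L_pp + lam2*L_ap + lam3*L_pvp.

    In the original method the lambdas would be adjusted adaptively from the
    per-modality losses; fixed values are used here as a stand-in."""
    l_con = cce(y_true, fused_pred)
    l_pp = cce(y_true, pp_pred)
    l_ap = cce(y_true, ap_pred)
    l_pvp = cce(y_true, pvp_pred)
    return l_con + lambdas[0] * l_pp + lambdas[1] * l_ap + lambdas[2] * l_pvp
```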
Further, the total loss function Ltotal of the end-to-end multi-modal fusion framework proposed in the embodiments of the present application combines the two proposed parts to better characterize a lesion, as follows:
Ltotal = LD + Lclassify, where LD is an optimization loss that maximizes the correlation between same-class samples across modalities and minimizes the correlation between samples of different classes, and Lclassify is the adaptive-weight deep supervision loss function based on the cross-entropy loss, used for classification.
The specific steps for generating the LD may be as follows:
the intra-modal discrimination module may design a computational matrix W that projects the depth features of each phase F into space, decreasing the xi and xj distances of the samples if they are from the same class, and increasing the distances otherwise. The constraints are defined as follows: if two tumors are from the same category (yij-1), their relative depth characteristic distance should be less than a predefined threshold μ - τ, otherwise yij-1, the deep layer characteristic distance of the two tumors should be greater than μ + τ. xi, xj represents two image samples of the same data batch in the same modality, yij represents the labels of the two samples, and if the two labels are the same, yij is equal to 1, otherwise yij is equal to-1. μ ═ 10, τ ═ 1, μ, τ are two empirical parameters.
yij·(μ − dW(xi, xj)) > τ (1);
The distance dW(xi, xj) between a pair of depth features xi and xj is calculated as:
dW(xi, xj) = ||W·xi − W·xj||2 (2);
where W·xi denotes the projection of sample xi and W·xj denotes the projection of sample xj.
The intra-modal discriminant function is defined as follows:
D = Σ(i,j) h(τ − yij·(μ − dW(xi, xj))) (3);
where h () has the formula h (x) ═ max (0, x), the three stages of discriminators, Dpp, Dap and Dpp, can be constructed using the corresponding three projection matrices Wpp, Wap and Wpvp, respectively. To learn features from these three phases, we construct an objective function for the intra-modal discriminant module as follows:
(The objective function of the intra-modal discriminant module is shown as an image in the original document; it combines the three discriminators Dpp, Dap and Dpvp through coordination parameters α1, α2 and α3 raised to the power k.)
Here α denotes a coordination parameter among the modalities, and the hyper-parameter k makes α1, α2 and α3 nonlinear, thereby avoiding suboptimal solutions. In the present work k = 2; several values of k were tested, and k = 2 gave the higher AUC value. If k > 3, network training becomes suboptimal and may be difficult to converge. A sketch of one per-modality discriminant term follows.
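The following sketch implements one per-modality discriminant term as a hinge penalty on violations of constraint (1), following the reconstruction in formula (3) above; the batch-wise pairing and the averaging over pairs are implementation assumptions.

```python
import tensorflow as tf

def intra_modal_discriminant(projected, labels, mu=10.0, tau=1.0):
    """Pairwise discriminant loss on the projected features of one modality.

    projected: [batch, proj_dim] tensor of W*x features.
    labels:    [batch] integer class labels.
    y_ij = +1 for same-class pairs and -1 otherwise; pairs violating
    y_ij * (mu - d_W(x_i, x_j)) > tau are penalized with h(x) = max(0, x).
    """
    d = tf.norm(projected[:, None, :] - projected[None, :, :], axis=-1)  # pairwise d_W
    same = tf.cast(labels[:, None] == labels[None, :], tf.float32)
    y = 2.0 * same - 1.0                                                 # +1 / -1 pair labels
    violation = tf.nn.relu(tau - y * (mu - d))                           # h(tau - y*(mu - d))
    mask = 1.0 - tf.eye(tf.shape(projected)[0])                          # ignore i == j pairs
    return tf.reduce_sum(violation * mask) / tf.reduce_sum(mask)
```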
For step S250, it should be noted that, after the total loss function is obtained, the TensorFlow framework may be used to implement the multi-modal feature fusion, and the SGD algorithm may be used to minimize the objective function. TensorFlow is only the framework used for model implementation; SGD (stochastic gradient descent) minimizes the objective function and is an important step in training the model. A sketch of one such training step is given below.
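Combining the sketches above, one SGD training step on the total loss could look roughly like the following. The model, loss helpers and weightings come from the hypothetical sketches earlier in this description, and the optimizer settings are assumptions rather than values fixed by the patent.

```python
import tensorflow as tf

optimizer = tf.keras.optimizers.SGD(learning_rate=1e-3, momentum=0.9)  # SGD as stated; settings assumed

@tf.function
def train_step(model, pp, ap, pvp, y_true):
    """One SGD step on L_total = L_D + L_classify, using the helpers sketched above."""
    with tf.GradientTape() as tape:
        (fused_pred, pp_pred, ap_pred, pvp_pred,
         pp_z, ap_z, pvp_z) = model([pp, ap, pvp], training=True)
        l_classify = classification_loss(y_true, fused_pred, pp_pred, ap_pred, pvp_pred)
        # Equal weighting of the three discriminant terms stands in for the
        # alpha^k coordination weights, whose exact formula is not reproduced here.
        l_d = (intra_modal_discriminant(pp_z, y_true)
               + intra_modal_discriminant(ap_z, y_true)
               + intra_modal_discriminant(pvp_z, y_true)) / 3.0
        l_total = l_d + l_classify
    grads = tape.gradient(l_total, model.trainable_variables)
    optimizer.apply_gradients(zip(grads, model.trainable_variables))
    return l_total
```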
For step S220, it should be noted that the specific manner of performing the fusion process is not limited, and may be set according to the actual application requirement. For example, in an alternative example, step S220 may include a step of obtaining a fusion feature according to the learned feature, and therefore, on the basis of fig. 2, fig. 5 is a schematic flowchart of another biometric information identification method provided in the embodiment of the present application, and referring to fig. 5, step S220 may include:
step S221, performing feature processing on the image to be recognized in each modality to obtain a learning feature of each image to be recognized.
And step S222, splicing the learning features of the images to be processed to obtain fusion features.
For step S221, it should be noted that the specific manner of performing the feature processing is not limited, and the feature processing may be set according to the actual application requirement. For example, in an alternative example, the preset recognition model includes projection matrices corresponding to images to be recognized of respective modalities, and step S221 may include a step of calculating a learning feature from the projection matrices. Therefore, on the basis of fig. 5, fig. 6 is a schematic flowchart of another biometric information identification method provided in the embodiment of the present application, and referring to fig. 6, step S221 may include:
step S2211, for the image to be recognized in each modality, performing depth recognition processing on the image to be recognized, so as to obtain a depth feature of the image to be recognized.
And step S2212, calculating the learning feature of the image to be recognized according to the depth feature of the image to be recognized and the corresponding projection matrix.
The projection matrix W is learnable; the depth feature F obtained by the CNN is multiplied by the learnable W to obtain the learning feature.
For step S222, it should be noted that the specific manner of performing the splicing process is not limited, and the splicing process may be set according to the actual application requirement. For example, in an alternative example, the preset recognition model includes a conca layer, and step S222 may include the following sub-steps:
and splicing the learning features of the images to be processed through the conca layer to obtain fusion features.
That is, after the conca layer concatenates the feature matrices of the three learning features, the result is sent to the softmax layer for classification.
After step S230, the biometric information identification method provided in the embodiment of the present application may further include a step of updating the preset identification model. Therefore, on the basis of fig. 2, fig. 7 is a schematic flowchart of another biometric information identification method provided in the embodiment of the present application, and referring to fig. 7, the biometric information identification method may further include:
step S260, obtaining feedback data of the user.
And step S270, updating the preset identification model according to the feedback data.
The images are labeled as high-grade or low-grade according to the histological grading of HCC: grades I and II correspond to low-grade, and grades III and IV correspond to high-grade. During testing, the labels predicted by the model are compared with the original labels of the images, from which the accuracy, sensitivity, specificity, AUC value and the like of the high/low-grade tumor classification can be obtained. By fusing discriminative learning and adaptive weighting under the deep learning framework, the problems existing in the two kinds of multi-modal fusion are solved and the performance indices of the model are improved. A sketch of how these metrics can be computed is given below.
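For completeness, the metrics mentioned above can be computed from the predicted and original labels roughly as follows (a plain NumPy/scikit-learn sketch; the patent does not prescribe a particular implementation, and the function name and threshold are assumptions).

```python
import numpy as np
from sklearn.metrics import confusion_matrix, roc_auc_score

def grading_metrics(y_true, y_prob, threshold=0.5):
    """Accuracy, sensitivity, specificity and AUC for the high/low-grade classification.

    y_true: 0 = low-grade, 1 = high-grade; y_prob: predicted probability of high-grade."""
    y_pred = (np.asarray(y_prob) >= threshold).astype(int)
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred, labels=[0, 1]).ravel()
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
        "auc": roc_auc_score(y_true, y_prob),
    }
```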
That is, the superiority of the method provided by the embodiments of the present application can be verified by comparison with the current multimodal fusion method and a large number of experiments, and the flow steps of the experimental verification are as follows.
1) Two fusion experiments were performed: 3D CNN-based multi-phase fusion and 2D CNN-based multi-view fusion;
2) comparing typical fusion methods;
3) the two modules are combined with a typical fusion method, and the validity of the two modules is verified.
In detail, 117 histologically confirmed liver cancer patients were included; the data set was randomly divided into a training set (77 HCCs) and a test set (40 HCCs), and training and testing were repeated five times to calculate the accuracy, sensitivity, specificity, and area under the curve (AUC).
Fig. 9 shows the results of 3D CNN-based multi-phase fusion, comparing the performance (%) of different three-dimensional multi-phase fusion methods on the 117 HCCs. Fig. 8 shows the results of 2D CNN multi-view fusion, comparing the effect of different methods on two-dimensional multi-view fusion of the arterial phase of the 117 HCCs. From these experimental results each module can be studied in detail: the results show that the method provided by the embodiment of the present application achieves better performance indices, which further proves the effectiveness of the proposed discriminative intra-modal term and adaptive weighting adjustment strategy in HCC classification, and verifies the effectiveness of the proposed adaptive weighting strategy in 3D CNN-based multi-phase fusion.
It should be noted that, when the performance index of the method provided by the embodiment of the present application fails to meet the requirement, feedback data of the user may be obtained, and the preset identification model is updated according to the feedback data until the performance index meets the requirement.
With reference to fig. 10, an embodiment of the present application further provides a biological information recognition apparatus 1000 based on deep network multimodal information fusion, where functions implemented by the biological information recognition apparatus 1000 correspond to steps executed by the foregoing method. The biometric information recognition apparatus 1000 may be understood as a processor of the electronic device 100, or may be understood as a component that is independent of the electronic device 100 or a processor and that implements the functions of the present application under the control of the electronic device 100. The biological information recognition apparatus 1000 may include an image acquisition module 1010, an image processing module 1020, and a classification module 1030, among others.
The image obtaining module 1010 is configured to obtain images to be recognized in multiple modalities, where the images to be recognized in the multiple modalities belong to the same biological information. In the embodiment of the present application, the image obtaining module 1010 may be configured to perform step S210 shown in fig. 2, and reference may be made to the foregoing description of step S210 regarding the relevant content of the image obtaining module 1010.
The image processing module 1020 is configured to perform fusion processing on the images to be identified in the multiple modalities through a preset identification model to obtain fusion characteristics. In the embodiment of the present application, the image processing module 1020 may be configured to perform step S220 shown in fig. 2, and reference may be made to the foregoing description of step S220 regarding the relevant content of the image processing module 1020.
And a classification module 1030, configured to perform classification processing according to the fusion features to obtain a category of the biological information, where the preset identification model is obtained by training a plurality of modal biological information images. In the embodiment of the present application, the classifying module 1030 may be configured to perform step S230 shown in fig. 2, and reference may be made to the foregoing description of step S230 for relevant contents of the classifying module 1030.
Further, the image processing module 1020 may be further specifically configured to:
performing feature processing on the images to be recognized in each mode to obtain learning features of the images to be recognized;
and splicing the learning features of the images to be processed to obtain fusion features.
In addition, an embodiment of the present application further provides a computer-readable storage medium, in which a computer program is stored, and the computer program is executed by a processor to perform the steps of the above-mentioned biological information identification method.
The computer program product of the biological information identification method provided in the embodiment of the present application includes a computer readable storage medium storing a program code, and instructions included in the program code may be used to execute the steps of the biological information identification method in the above method embodiment, which may be specifically referred to in the above method embodiment and are not described herein again.
In summary, according to the biological information identification method and device based on deep network multi-modal information fusion provided by the embodiments of the present application, the images to be recognized in multiple modalities are fused through the preset recognition model to obtain fusion features, and classification is performed according to the fusion features. Classification and identification are thus performed according to the fused features of the images in the multiple modalities, which solves the problem of low image-identification reliability caused by prior-art neural network models generally identifying the images of each modality separately.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. The apparatus embodiments described above are merely illustrative, and for example, the flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of apparatus, methods and computer program products according to embodiments of the present application. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
In addition, functional modules in the embodiments of the present application may be integrated together to form an independent part, or each module may exist separately, or two or more modules may be integrated to form an independent part.
The functions, if implemented in the form of software functional modules and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present application or portions thereof that substantially contribute to the prior art may be embodied in the form of a software product stored in a storage medium and including instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present application. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.
The above description is only a preferred embodiment of the present application and is not intended to limit the present application, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, improvement and the like made within the spirit and principle of the present application shall be included in the protection scope of the present application.

Claims (10)

1. A biological information identification method based on deep network multi-modal information fusion is characterized by comprising the following steps:
acquiring images to be recognized of a plurality of modalities, wherein the images to be recognized of the plurality of modalities belong to the same biological information;
fusing the images to be recognized in the plurality of modes through a preset recognition model to obtain fusion characteristics;
and carrying out classification processing according to the fusion characteristics to obtain the category of the biological information, wherein the preset identification model is obtained according to the training of biological information images of a plurality of modalities.
2. The biological information recognition method based on deep network multi-modal information fusion as claimed in claim 1, wherein the step of performing fusion processing on the images to be recognized of the plurality of modalities through a preset recognition model to obtain fusion features comprises:
performing feature processing on the image to be recognized in each mode to obtain a learning feature of each image to be recognized;
and splicing the learning features of the images to be processed to obtain fusion features.
3. The biological information recognition method based on deep network multi-modal information fusion as claimed in claim 2, wherein the preset recognition model includes a projection matrix corresponding to the image to be recognized of each modality, and the step of performing feature processing on the image to be recognized of each modality to obtain the learning feature of each image to be recognized includes:
aiming at the image to be recognized of each mode, carrying out depth recognition processing on the image to be recognized to obtain the depth characteristics of the image to be processed;
and calculating to obtain the learning characteristic of the image to be processed according to the depth characteristic of the image to be processed and the corresponding projection matrix.
4. The biological information recognition method based on deep network multimodal information fusion as claimed in claim 2, wherein the preset recognition model includes a conca layer, and the step of obtaining the fusion feature by performing the stitching processing on the learning feature of each image to be processed includes:
and splicing the learning features of the images to be processed through the conca layer to obtain fusion features.
5. The biological information recognition method based on deep network multimodal information fusion as claimed in claim 1, wherein the biological information recognition method based on deep network multimodal information fusion further comprises a step of training a preset recognition model, the step comprising:
calculating to obtain a total loss function according to the optimization loss function and the classification loss function of the initial model;
and training the initial model according to the total loss function to obtain a preset recognition model.
6. The biological information recognition method based on deep network multimodal information fusion as claimed in claim 1, wherein the biological information recognition method based on deep network multimodal information fusion further comprises:
acquiring feedback data of a user;
and updating the preset identification model according to the feedback data.
7. A biological information recognition device based on deep network multi-modal information fusion is characterized by comprising:
the image acquisition module is used for acquiring images to be recognized in a plurality of modalities, wherein the images to be recognized in the plurality of modalities belong to the same biological information;
the image processing module is used for carrying out fusion processing on the images to be identified in the plurality of modes through a preset identification model to obtain fusion characteristics;
and the classification module is used for performing classification processing according to the fusion characteristics to obtain the category of the biological information, wherein the preset identification model is obtained according to the training of biological information images of multiple modalities.
8. The biological information recognition apparatus based on deep network multimodal information fusion as claimed in claim 7, wherein the image processing module is specifically configured to:
performing feature processing on the image to be recognized in each mode to obtain a learning feature of each image to be recognized;
and splicing the learning features of the images to be processed to obtain fusion features.
9. An electronic device, comprising a memory and a processor, wherein the processor is configured to execute an executable computer program stored in the memory to implement the biological information recognition method based on deep network multi-modal information fusion according to any one of claims 1 to 6.
10. A storage medium, characterized in that a computer program is stored thereon, which when executed, implements the steps of the biological information recognition method based on deep network multimodal information fusion of any one of claims 1-6.
CN202011588398.7A 2020-12-29 2020-12-29 Biological information identification method and device based on deep network multi-modal information fusion Pending CN112651445A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011588398.7A CN112651445A (en) 2020-12-29 2020-12-29 Biological information identification method and device based on deep network multi-modal information fusion

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011588398.7A CN112651445A (en) 2020-12-29 2020-12-29 Biological information identification method and device based on deep network multi-modal information fusion

Publications (1)

Publication Number Publication Date
CN112651445A true CN112651445A (en) 2021-04-13

Family

ID=75363666

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011588398.7A Pending CN112651445A (en) 2020-12-29 2020-12-29 Biological information identification method and device based on deep network multi-modal information fusion

Country Status (1)

Country Link
CN (1) CN112651445A (en)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109558882A (en) * 2018-11-30 2019-04-02 苏州大学 Image classification method and device based on robust part low-rank sparse CNN feature
CN109902678A (en) * 2019-02-12 2019-06-18 北京奇艺世纪科技有限公司 Model training method, character recognition method, device, electronic equipment and computer-readable medium
CN111325221A (en) * 2020-02-25 2020-06-23 青岛海洋科学与技术国家实验室发展中心 Image feature extraction method based on image depth information
CN111368893A (en) * 2020-02-27 2020-07-03 Oppo广东移动通信有限公司 Image recognition method and device, electronic equipment and storage medium
CN111639607A (en) * 2020-06-01 2020-09-08 广州虎牙科技有限公司 Model training method, image recognition method, model training device, image recognition device, electronic equipment and storage medium

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113317763A (en) * 2021-06-30 2021-08-31 平安科技(深圳)有限公司 Multi-modal Parkinson's disease detection device and computer-readable storage medium
CN113317763B (en) * 2021-06-30 2024-03-19 平安科技(深圳)有限公司 Multi-mode-based parkinsonism detection device and computer-readable storage medium
CN113591629A (en) * 2021-07-16 2021-11-02 深圳职业技术学院 Finger three-mode fusion recognition method, system, device and storage medium
CN113591629B (en) * 2021-07-16 2023-06-27 深圳职业技术学院 Finger tri-modal fusion recognition method, system, device and storage medium
CN114693977A (en) * 2022-04-06 2022-07-01 北京百度网讯科技有限公司 Image processing method, model training method, device, equipment and medium
CN117746997A (en) * 2023-12-28 2024-03-22 徐州医科大学 Cis-regulation die body identification method based on multi-mode priori information
CN117746997B (en) * 2023-12-28 2024-06-11 徐州医科大学 Cis-regulation die body identification method based on multi-mode priori information

Similar Documents

Publication Publication Date Title
CN112651445A (en) Biological information identification method and device based on deep network multi-modal information fusion
US11508146B2 (en) Convolutional neural network processing method and apparatus
CN113449857B (en) Data processing method and data processing equipment
Patravali et al. 2D-3D fully convolutional neural networks for cardiac MR segmentation
CN108182441B (en) Parallel multichannel convolutional neural network, construction method and image feature extraction method
CN110782015B (en) Training method, device and storage medium for network structure optimizer of neural network
US10534999B2 (en) Apparatus for classifying data using boost pooling neural network, and neural network training method therefor
US10275688B2 (en) Object detection with neural network
CN113570029A (en) Method for obtaining neural network model, image processing method and device
US20160350649A1 (en) Method and apparatus of learning neural network via hierarchical ensemble learning
US20180114110A1 (en) Method and apparatus to reduce neural network
Hernández et al. Using deep learning for segmentation and counting within microscopy data
WO2020260862A1 (en) Facial behaviour analysis
CN111160225B (en) Human body analysis method and device based on deep learning
WO2023280113A1 (en) Data processing method, training method for neural network model, and apparatus
CN109284782A (en) Method and apparatus for detecting feature
CN114612774A (en) Target detection and model construction method thereof, electronic device and storage medium
Liu et al. Style curriculum learning for robust medical image segmentation
Khan et al. Classification of colon cancer using deep learning techniques on histopathological images
Lanchantin et al. Graph convolutional networks for epigenetic state prediction using both sequence and 3D genome data
Wu et al. A cascaded CNN-based method for monocular vision robotic grasping
Arun et al. Facial micro emotion detection and classification using swarm intelligence based modified convolutional network
CN110210523B (en) Method and device for generating image of clothes worn by model based on shape graph constraint
Garcia-Garcia et al. CGP-NAS: Real-based solutions encoding for multi-objective evolutionary neural architecture search
CN115170568A (en) Automatic segmentation method and system for rectal cancer image and chemoradiotherapy response prediction system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination