TW202329153A - Data fusion system and method thereof

Info

Publication number: TW202329153A
Application number: TW111100631A
Authority: TW
Inventors: 潘善斌; 高于晴
Original assignee: 沐恩生醫光電股份有限公司
Priority date: 2022-01-06
Filing date: 2022-01-06
Publication date: 2023-07-16
Also published as: TWI829065B

Abstract

A data fusion system forms a composite feature matrix that includes data in different formats. The system includes a data receiving component that collects a plurality of data having different data formats. A feature extraction component applies, according to the data formats, corresponding neural network architectures to the data to extract a plurality of feature matrices. A feature integration component receives the feature matrices and combines them into the composite feature matrix.

Description

Data fusion system and its operation method

The present invention relates to a system for forming data and a method of operating it, and in particular to a data fusion system and its operating method for forming a composite feature matrix that includes data in different formats.

As the population grows and ages, more and more people need medical help. In recent years, big data analysis has been applied to medical assistance at an increasing rate. Many physicians and engineers have collaborated to develop models that improve the efficiency and fault tolerance of the medical system, which to some extent reduces wasted medical resources and strengthens precision medicine.

Traditionally, physiological data in different formats are trained separately with different algorithms to build corresponding judgment models. For example, numerical physiological data in text format, image physiological data in optical format, and sound physiological data in acoustic format are each trained with a different algorithm to build a model. However, this approach limits the features that can be learned from the data. For example, a physician cannot make a complete diagnosis from a judgment result produced solely from a patient's image physiological data, such as an X-ray film. The physician usually has to combine it with judgment results produced from the patient's numerical physiological data or image physiological data and make a comprehensive assessment before an accurate diagnosis is possible.

Therefore, the conventional approach of training a model on features from a single type of physiological data leaves room for improvement.

One aspect of the present disclosure provides a data fusion system for forming a composite feature matrix that includes data in different formats. The system includes: a data receiving component for collecting a plurality of data, wherein the data include a plurality of data formats; a feature extraction component that, according to the data formats, applies corresponding neural network architectures to extract feature matrices from the data, generating a plurality of feature matrices; and a feature integration component for receiving the feature matrices and combining them into the composite feature matrix.

In some embodiments, the data include numerical physiological data in text format, image physiological data in optical format, and sound physiological data in acoustic format.

In some embodiments, the feature extraction component uses a feedforward neural network architecture to extract a first feature matrix having a first dimension from the numerical physiological data in text format, and uses a convolutional neural network to extract a second feature matrix having a second dimension and a third feature matrix having a third dimension from the image physiological data in optical format and the sound physiological data in acoustic format, respectively.

In some embodiments, the feature integration component directly concatenates the first feature matrix, the second feature matrix, and the third feature matrix to form the composite feature matrix.

In some embodiments, the composite feature matrix has a fourth dimension, and the fourth dimension is equal to the sum of the first dimension, the second dimension, and the third dimension.

In some embodiments, the feature integration component performs a parameter-average calculation on the first feature matrix, the second feature matrix, and the third feature matrix to form the composite feature matrix.

In some embodiments, before the feature integration component performs the parameter-average calculation, dimensionality reduction by a matrix-vector product is performed through a fully connected layer, so that the first feature matrix, the second feature matrix, and the third feature matrix have the same dimension.

In some embodiments, the parameter-average calculation performed by the feature integration component to form the composite feature matrix further includes: composite feature matrix = a*(first feature matrix) + b*(second feature matrix) + c*(third feature matrix), where a, b, and c are the weights corresponding to the first feature matrix, the second feature matrix, and the third feature matrix, respectively, and a+b+c=1.

In some embodiments, the data fusion system further includes a data cleaning component, coupled to the data receiving component, for organizing the data.

Another aspect of the present disclosure provides a data fusion method for forming a composite feature matrix that includes data in different formats. The method includes: collecting a plurality of data, wherein the data include a plurality of data formats; applying, according to the data formats, corresponding neural network architectures to extract feature matrices from the data, generating a plurality of feature matrices; and directly concatenating the feature matrices or performing a parameter-average calculation on them to form the composite feature matrix.

The present invention integrates different data formats into composite data that is provided to a model for learning. The model can thereby learn jointly from the features of the differently formatted data in the composite data. The patient's numerical, image, and sound physiological data can therefore be combined to provide a comprehensive judgment, greatly improving the accuracy of the judgment.

The above description is elaborated in the embodiments below, which further explain the technical solution of the present invention.

The spirit of the present disclosure is illustrated below with drawings and a detailed description. After understanding the embodiments, a person having ordinary skill in the art may make changes and modifications based on the techniques taught herein without departing from the spirit and scope of the present disclosure.

The terminology used herein is for describing particular embodiments only and is not intended to limit the present disclosure. Singular forms such as "a", "an", "this", and "the", as used herein, also include the plural forms.

As used herein, "coupled" or "connected" may mean that two or more elements or devices are in direct physical contact with each other, or in indirect physical contact with each other, and may also mean that two or more elements or devices operate or act in conjunction with each other.

The terms "comprise", "include", "have", "contain", and the like, as used herein, are open-ended terms meaning including but not limited to.

"And/or", as used herein, includes any or all combinations of the listed items.

Unless otherwise noted, the terms used herein generally have the ordinary meaning of each term as used in this field, in the content of this disclosure, and in special contexts. Certain terms used to describe this disclosure are discussed below or elsewhere in this specification to give those skilled in the art additional guidance regarding its description.

Traditionally, learning has relied on features from a single type of physiological data, so the resulting model can provide only a one-sided judgment and cannot support the all-round evaluation needed for precise diagnosis. To address this, the present disclosure integrates different data formats, combining numerical physiological data in text format, image physiological data in optical format, and sound physiological data in acoustic format into composite data that is provided to a model for learning. The model can thereby learn jointly from the features of the differently formatted data in the composite data. The patient's numerical, image, and sound physiological data can therefore be combined to provide a comprehensive judgment, greatly improving the accuracy of the judgment.

FIG. 1 is a schematic diagram of a data fusion system that integrates feature matrices of different data formats according to an embodiment of the present disclosure. The data fusion system 100 can form a composite feature matrix 110 that includes data in different formats. The data fusion system 100 includes a data receiving component 101, a data cleaning component 102, a feature extraction component 104, and a feature integration component 108. In one embodiment, the data cleaning component 102, the feature extraction component 104, and the feature integration component 108 may be integrated into a single component or be separate components, and may be implemented by a central processing unit, a microprocessor, a microcontroller, a digital signal processor, or an application-specific integrated circuit.

The data receiving component 101 collects data. In one embodiment, the collected data include different data formats, such as numerical physiological data in text format, image physiological data in optical format, and sound physiological data in acoustic format. In one embodiment, the data receiving component 101 collects at least two of the numerical physiological data in text format, the image physiological data in optical format, and the sound physiological data in acoustic format.

The data cleaning component 102 is coupled to the data receiving component 101 and organizes the data collected by the data receiving component 101 to improve data quality and to put the data into a format suitable for computer processing. In one embodiment, the data cleaning component 102 handles missing data, duplicate data, inconsistencies, and mistyped numerical data. In one embodiment, the data cleaning component 102 applies processing that corresponds to each data format. In one embodiment, for numerical physiological data in text format, the data cleaning component 102 normalizes the data and classifies it by data attribute before providing it to the feature extraction component 104 for feature matrix extraction. For image physiological data in optical format, the data cleaning component 102 may perform image conversion, enhance the image to increase grayscale or color contrast, remove noise caused by poor transmission, interference, poor image capture, or quantization, and extract features such as points, lines, edges, corners, and regions, before providing the result to the feature extraction component 104 for feature matrix extraction. For sound physiological data in acoustic format, the data are processed using MFCC (Mel-Frequency Cepstral Coefficients) to form corresponding MFCC features, which are provided to the feature extraction component 104 for feature matrix extraction.
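The handling of duplicate records, missing values, and normalization for numerical data can be sketched roughly as follows. This is only an illustration under stated assumptions: the function name `clean_records`, mean imputation, and min-max scaling are hypothetical choices, since the specification does not state which cleaning methods are used.

```python
from statistics import mean


def clean_records(records):
    """Drop duplicate rows, fill missing values (None) with the column
    mean, then min-max normalize every column to the range [0, 1].

    Assumes every column has at least one observed (non-None) value.
    """
    # 1. Remove exact duplicate rows while preserving their order.
    seen, unique = set(), []
    for row in records:
        key = tuple(row)
        if key not in seen:
            seen.add(key)
            unique.append(list(row))
    if not unique:
        return []

    for col in range(len(unique[0])):
        # 2. Impute missing values with the mean of the observed values.
        observed = [row[col] for row in unique if row[col] is not None]
        col_mean = mean(observed)
        for row in unique:
            if row[col] is None:
                row[col] = col_mean

        # 3. Min-max normalization; a constant column maps to 0.0.
        lo = min(row[col] for row in unique)
        hi = max(row[col] for row in unique)
        span = hi - lo
        for row in unique:
            row[col] = (row[col] - lo) / span if span else 0.0
    return unique


# Hypothetical records: [systolic pressure, diastolic pressure]
raw = [
    [120.0, 80.0],
    [120.0, 80.0],  # duplicate record
    [140.0, None],  # missing value
    [100.0, 60.0],
]
cleaned = clean_records(raw)
```

Image enhancement and MFCC computation would normally rely on dedicated signal-processing libraries and are deliberately left out of this sketch.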

The feature extraction component 104 is coupled to the data cleaning component 102 and extracts feature matrices from the data processed by the data cleaning component 102. In one embodiment, the feature extraction component 104 performs the feature matrix extraction that corresponds to each data format. In one embodiment, the feature extraction component 104 uses a feedforward neural network (FNN) to extract feature matrix 105 from the processed numerical physiological data in text format. In one embodiment, the extracted feature matrix 105 has 128 dimensions. In one embodiment, the feature extraction component 104 uses a convolutional neural network (CNN) to extract feature matrix 106 from the processed image physiological data in optical format. In one embodiment, the extracted feature matrix 106 has 256 dimensions. In one embodiment, the feature extraction component 104 uses a convolutional neural network (CNN) to extract feature matrix 107 from the processed sound physiological data in acoustic format. In one embodiment, the extracted feature matrix 107 has 256 dimensions. Note that these dimensions describe only one embodiment; in other embodiments, feature matrices of other dimensions may be generated as needed. In addition, the neural network architectures used in the present disclosure are not limited to those described above. Other architectures, such as an artificial neural network (ANN) architecture or a recurrent neural network (RNN) architecture, may also be used in the present invention to extract the feature matrices.
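To make the idea of a feedforward extractor concrete, the following minimal sketch maps a short numerical record to a 128-dimension feature vector. Only the 128-dimension output comes from the text; the layer sizes (10 to 64 to 128), the ReLU activation, the random untrained weights, and all function names are assumptions for illustration. A real extractor would be trained.

```python
import random


def make_layer(n_in, n_out, rng):
    """Create a dense layer with small random weights and zero biases."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)]
               for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def dense_relu(x, layer):
    """Apply one dense layer followed by a ReLU activation."""
    weights, biases = layer
    return [max(0.0, sum(w * xi for w, xi in zip(row, x)) + b)
            for row, b in zip(weights, biases)]


def extract_features(x, layers):
    """Feed the input forward through every layer in order."""
    for layer in layers:
        x = dense_relu(x, layer)
    return x


rng = random.Random(0)  # fixed seed so the sketch is reproducible
layers = [make_layer(10, 64, rng), make_layer(64, 128, rng)]

numeric_record = [0.5] * 10  # hypothetical cleaned, normalized record
feature_105 = extract_features(numeric_record, layers)
print(len(feature_105))  # 128
```

The convolutional extractors for image and sound data follow the same pattern of mapping an input to a fixed-length vector, but are usually built with a deep learning framework rather than written by hand.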

The feature integration component 108 is coupled to the feature extraction component 104 and receives feature matrix 105, feature matrix 106, and feature matrix 107 generated by the feature extraction component 104, combining them into a composite feature matrix 110 that is provided to a model for learning. In one embodiment, the feature integration component 108 concatenates feature matrix 105, feature matrix 106, and feature matrix 107 to form the composite feature matrix 110; that is, the dimensions of the three feature matrices are joined end to end. FIG. 2 is a schematic diagram of the composite feature matrix 110 according to an embodiment of the present disclosure. In one embodiment, feature matrix 105 has 128 dimensions, feature matrix 106 has 256 dimensions, and feature matrix 107 has 256 dimensions. The resulting composite feature matrix 110 therefore has 128+256+256=640 dimensions.
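The concatenation embodiment can be checked with a few lines of Python. The feature values below are placeholders and the helper name `concatenate` is hypothetical; only the dimensions (128, 256, 256) are taken from the text.

```python
def concatenate(*features):
    """Join feature vectors end to end into one composite vector."""
    fused = []
    for feature in features:
        fused.extend(feature)
    return fused


# Placeholder feature vectors with the dimensions given in the text.
feature_105 = [0.1] * 128  # from numerical data (FNN)
feature_106 = [0.2] * 256  # from image data (CNN)
feature_107 = [0.3] * 256  # from sound data (CNN)

composite_110 = concatenate(feature_105, feature_106, feature_107)
print(len(composite_110))  # 640
```

Concatenation keeps every extracted value, so no information is discarded, at the cost of a larger input to the downstream model.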

In another embodiment, the feature integration component 108 performs a parameter-average (weighted-average) calculation on feature matrix 105, feature matrix 106, and feature matrix 107 to form the composite feature matrix 110. In other words, different weights can be assigned to the feature matrices according to the importance of the features. For example, when classifying a tumor, image physiological data in optical format, such as features of the patient's X-ray film, are generally more important than numerical physiological data in text format, such as the patient's blood lipid or blood pressure values, and more important than sound physiological data in acoustic format, such as the patient's heart murmur features. Accordingly, when training for tumor classification, feature matrix 106, extracted from the image physiological data in optical format, can be assigned a higher weight.

To weight the feature matrices by importance when forming the composite feature matrix 110, feature matrix 105, feature matrix 106, and feature matrix 107 are first brought to the same dimension. FIG. 3 is a schematic diagram of a composite feature matrix formed with different weights according to an embodiment of the present disclosure. In one embodiment, feature matrix 105 has 128 dimensions, while feature matrix 106 and feature matrix 107 have 256 dimensions. Feature matrix 106 and feature matrix 107 are therefore first reduced to 128 dimensions so that they can be combined with feature matrix 105. In one embodiment, feature matrix 106 and feature matrix 107 are input to a fully connected (FC) layer 111 for dimensionality reduction. The fully connected layer 111 performs a matrix-vector product, using a transformation matrix to reduce feature matrix 106 and feature matrix 107 from 256 dimensions to 128 dimensions, yielding feature matrix 112 and feature matrix 113, respectively. In one embodiment, once feature matrix 105, feature matrix 112, and feature matrix 113 all have 128 dimensions, weights can be assigned to them according to the importance of the features, and the feature integration component 108 performs the parameter-average (weighted-average) calculation to form the composite feature matrix 110, where composite feature matrix 110 = a*(feature matrix 105) + b*(feature matrix 112) + c*(feature matrix 113). Here a, b, and c are the weights corresponding to feature matrix 105, feature matrix 112, and feature matrix 113, respectively, and a+b+c=1.

The resulting composite feature matrix 110 therefore has 128 dimensions.
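The dimensionality reduction followed by the weighted average can be sketched as below. The dimensions and the constraint a+b+c=1 come from the text. The random projection matrices stand in for the learned transformation of fully connected layer 111, and the use of one separate matrix per input, the weight values (0.2, 0.6, 0.2), and the function names are assumptions made for this example.

```python
import random


def project(feature, matrix):
    """Matrix-vector product: reduce len(feature) inputs to len(matrix) outputs."""
    return [sum(w * x for w, x in zip(row, feature)) for row in matrix]


def weighted_average(features, weights):
    """Element-wise weighted sum of equal-length feature vectors."""
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    dim = len(features[0])
    if any(len(f) != dim for f in features):
        raise ValueError("all features must have the same dimension")
    return [sum(w * f[i] for w, f in zip(weights, features))
            for i in range(dim)]


rng = random.Random(0)  # fixed seed so the sketch is reproducible


def random_matrix(rows, cols):
    """Stand-in for a learned transformation matrix (untrained)."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)]
            for _ in range(rows)]


feature_105 = [0.1] * 128
feature_106 = [0.2] * 256
feature_107 = [0.3] * 256

# Reduce the 256-dimension features to 128 dimensions.
feature_112 = project(feature_106, random_matrix(128, 256))
feature_113 = project(feature_107, random_matrix(128, 256))

# Hypothetical weights that favor the image features; a + b + c = 1.
a, b, c = 0.2, 0.6, 0.2
composite_110 = weighted_average(
    [feature_105, feature_112, feature_113], [a, b, c])
print(len(composite_110))  # 128
```

Compared with concatenation, this variant yields a smaller composite vector and lets the relative influence of each modality be set explicitly.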

FIG. 4 is a flowchart of forming a composite feature matrix that includes data in different formats according to an embodiment of the present disclosure. Refer to FIGS. 1 through 4 together. First, in step 401, data are collected. In one embodiment, the data receiving component 101 collects the data, which include different data formats, such as numerical physiological data in text format, image physiological data in optical format, and sound physiological data in acoustic format.

In step 402, the collected data are organized. In one embodiment, the data cleaning component 102 organizes the data collected by the data receiving component 101 to improve data quality and to put the data into a format suitable for computer processing.

In step 403, feature matrices are extracted from the organized data. In one embodiment, the feature extraction component 104 performs the feature matrix extraction that corresponds to each data format. In one embodiment, the feature extraction component 104 uses a feedforward neural network (FNN) to extract feature matrix 105 from the processed numerical physiological data in text format, and uses a convolutional neural network (CNN) to extract feature matrix 106 and feature matrix 107 from the image physiological data in optical format and the sound physiological data in acoustic format, respectively.

In step 404, feature matrix 105, feature matrix 106, and feature matrix 107 are combined into a composite feature matrix 110. In one embodiment, the feature integration component 108 directly concatenates feature matrix 105 having a first dimension, feature matrix 106 having a second dimension, and feature matrix 107 having a third dimension to form the composite feature matrix 110 having a fourth dimension, where the size of the fourth dimension equals the sum of the sizes of the first, second, and third dimensions.

In another embodiment, the feature integration component 108 performs a parameter-average (weighted-average) calculation on feature matrix 105, feature matrix 106, and feature matrix 107 according to the importance of the features to form the composite feature matrix 110. In one embodiment, dimensionality reduction by a matrix-vector product through a fully connected (FC) layer 111 gives feature matrix 106 and feature matrix 107 the same dimension as feature matrix 105, yielding feature matrix 112 and feature matrix 113. Weights are then assigned to feature matrix 105, feature matrix 112, and feature matrix 113 according to the importance of the features, and the feature integration component 108 performs the parameter-average (weighted-average) calculation to form the composite feature matrix 110, where composite feature matrix 110 = a*(feature matrix 105) + b*(feature matrix 112) + c*(feature matrix 113). Here a, b, and c are the weights corresponding to feature matrix 105, feature matrix 112, and feature matrix 113, respectively.
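Steps 403 and 404 can be tied together in a short sketch that selects an extractor per data format and then fuses the results in either of the two described ways. Everything here is hypothetical scaffolding: the function `fuse_features`, the format names, and especially the placeholder extractors, which return constant 4-dimension vectors instead of running neural networks.

```python
def fuse_features(samples, extractors, mode="concat", weights=None):
    """Extract one feature vector per data format, then fuse them.

    samples    -- dict mapping a format name to its cleaned data
    extractors -- dict mapping the same format names to callables
    mode       -- "concat" (direct concatenation) or "weighted"
    """
    # Step 403: apply the extractor that corresponds to each format.
    features = [extractors[fmt](data) for fmt, data in samples.items()]

    # Step 404, first variant: join the vectors end to end.
    if mode == "concat":
        return [value for feature in features for value in feature]

    # Step 404, second variant: weighted average of equal-length vectors.
    if mode == "weighted":
        dim = len(features[0])
        if any(len(f) != dim for f in features):
            raise ValueError("reduce all features to one dimension first")
        if (weights is None or len(weights) != len(features)
                or abs(sum(weights) - 1.0) > 1e-9):
            raise ValueError("need one weight per feature, summing to 1")
        return [sum(w * f[i] for w, f in zip(weights, features))
                for i in range(dim)]

    raise ValueError(f"unknown fusion mode: {mode}")


# Placeholder extractors (NOT neural networks): each returns a constant
# 4-dimension vector so the fusion arithmetic is easy to follow.
extractors = {
    "text": lambda data: [1.0] * 4,
    "image": lambda data: [2.0] * 4,
    "sound": lambda data: [3.0] * 4,
}
samples = {
    "text": [0.5, 0.2],
    "image": [[0, 1], [1, 0]],
    "sound": [0.1, 0.3, 0.2],
}

concat_110 = fuse_features(samples, extractors)
weighted_110 = fuse_features(
    samples, extractors, mode="weighted", weights=[0.25, 0.5, 0.25])
```

With these placeholders, `concat_110` has 12 entries and every entry of `weighted_110` is 0.25*1 + 0.5*2 + 0.25*3 = 2.0.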

In summary, the present disclosure integrates different data formats, combining numerical physiological data in text format, image physiological data in optical format, and sound physiological data in acoustic format into composite data that is provided to a model for learning. The model can thereby learn jointly from the features of the differently formatted data in the composite data. The patient's numerical, image, and sound physiological data can therefore be combined to provide a comprehensive judgment, greatly improving the accuracy of the judgment.

Although the present disclosure has been described by way of the embodiments above, they are not intended to limit it. Anyone skilled in the art may make various changes and refinements without departing from the spirit and scope of the present disclosure. The scope of protection shall therefore be defined by the appended claims.

100: Data fusion system
101: Data receiving component
102: Data cleaning component
104: Feature extraction component
105: Feature matrix
106: Feature matrix
107: Feature matrix
108: Feature integration component
110: Composite feature matrix
111: Fully connected layer
112: Feature matrix
113: Feature matrix
401-404: Steps

The accompanying drawings are incorporated in and constitute a part of this specification. They illustrate embodiments consistent with the present invention and, together with the description, serve to explain the technical solutions of the embodiments. FIG. 1 is a schematic diagram of a data fusion system according to an embodiment of the present disclosure. FIG. 2 is a schematic diagram of forming a composite feature matrix according to an embodiment of the present disclosure. FIG. 3 is a schematic diagram of forming a composite feature matrix according to another embodiment of the present disclosure. FIG. 4 is a flowchart of a data fusion method according to an embodiment of the present disclosure.


Claims

1. A data fusion system for forming a composite feature matrix that includes data in different formats, comprising: a data receiving component for collecting a plurality of data, wherein the data include a plurality of data formats; a feature extraction component that, according to the data formats, applies corresponding neural network architectures to extract feature matrices from the data to generate a plurality of feature matrices; and a feature integration component for receiving the feature matrices and combining the feature matrices to form the composite feature matrix.

2. The data fusion system of claim 1, wherein the data include numerical physiological data in text format, image physiological data in optical format, and sound physiological data in acoustic format.

3. The data fusion system of claim 2, wherein the feature extraction component uses a feedforward neural network architecture to extract a first feature matrix having a first dimension from the numerical physiological data in text format, and uses a convolutional neural network to extract a second feature matrix having a second dimension and a third feature matrix having a third dimension from the image physiological data in optical format and the sound physiological data in acoustic format, respectively.

4. The data fusion system of claim 3, wherein the feature integration component directly concatenates the first feature matrix, the second feature matrix, and the third feature matrix to form the composite feature matrix.

5. The data fusion system of claim 4, wherein the composite feature matrix has a fourth dimension, and the fourth dimension is equal to the sum of the first dimension, the second dimension, and the third dimension.

6. The data fusion system of claim 3, wherein the feature integration component performs a parameter-average calculation on the first feature matrix, the second feature matrix, and the third feature matrix to form the composite feature matrix.

7. The data fusion system of claim 6, wherein, before the feature integration component performs the parameter-average calculation, dimensionality reduction by a matrix-vector product is further performed through a fully connected layer, so that the first feature matrix, the second feature matrix, and the third feature matrix have the same dimension.

8. The data fusion system of claim 7, wherein the parameter-average calculation performed by the feature integration component to form the composite feature matrix further includes: composite feature matrix = a*(first feature matrix) + b*(second feature matrix) + c*(third feature matrix), where a, b, and c are the weights corresponding to the first feature matrix, the second feature matrix, and the third feature matrix, respectively, and a+b+c=1.

9. The data fusion system of claim 1, further comprising a data cleaning component, coupled to the data receiving component, for organizing the data.

10. A data fusion method for forming a composite feature matrix that includes data in different formats, comprising: collecting a plurality of data, wherein the data include a plurality of data formats; applying, according to the data formats, corresponding neural network architectures to extract feature matrices from the data to generate a plurality of feature matrices; and directly concatenating the feature matrices or performing a parameter-average calculation on the feature matrices to form the composite feature matrix.