CN116451095A - Multi-view clustering method, device, medium and equipment for multi-source heterogeneous medical data - Google Patents


Publication number
CN116451095A
Authority
CN
China
Prior art keywords
view
hidden
clustering
learning
representation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310313005.9A
Other languages
Chinese (zh)
Inventor
蔡宏民
杨思蕤
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
South China University of Technology SCUT
Original Assignee
South China University of Technology SCUT
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by South China University of Technology SCUT
Priority to CN202310313005.9A
Publication of CN116451095A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/25 Fusion techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/0895 Weakly supervised learning, e.g. semi-supervised or self-supervised learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/094 Adversarial learning
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H10/00 ICT specially adapted for the handling or processing of patient-related medical or healthcare data
    • G16H10/60 ICT specially adapted for the handling or processing of patient-related medical or healthcare data for patient-specific data, e.g. for electronic patient records

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • Computing Systems (AREA)
  • Molecular Biology (AREA)
  • Software Systems (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Public Health (AREA)
  • Primary Health Care (AREA)
  • Medical Informatics (AREA)
  • Epidemiology (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a multi-view clustering method, device, medium and equipment for multi-source heterogeneous medical data. The method comprises the following steps: acquiring original multi-source heterogeneous medical data; extracting features from the data of each view to obtain hidden representations in one-to-one correspondence with the views; performing alternating adversarial learning on the hidden representations to align their distributions, thereby determining a plurality of new hidden representations; performing consistency hidden representation learning on the new hidden representations to obtain a consistency hidden representation; and inputting the consistency hidden representation into a clustering layer to obtain the clustering result output by the clustering layer, the clustering layer being trained with a clustering self-supervised loss function. The embodiment of the invention resolves the fusion difficulty caused by misaligned feature spaces across views, achieves more effective feature fusion, yields a cluster structure that is more compact within clusters and sparser between clusters, and ultimately improves the clustering of multi-source heterogeneous medical data.

Description

Multi-view clustering method, device, medium and equipment for multi-source heterogeneous medical data
Technical Field
The invention relates to the technical field of multi-view clustering, in particular to a multi-view clustering method, device, medium and equipment for multi-source heterogeneous medical data.
Background
The medical field holds many kinds of data, which often originate from different acquisition devices and come in different forms and structures: for example, CT images may show lesions, while blood analysis data reflect the condition of the blood. Such data are multi-view data. They describe the same sample from different angles; each view contains different information, yet all views share the essential characteristics of the sample. Analyzing medical data through multi-view clustering yields the cluster distribution of the samples, which can support auxiliary disease diagnosis, management of patients of the same type, prognosis analysis and the like.
In the prior art, conventional multi-view clustering methods generally cluster multi-view data using techniques such as subspace clustering, matrix factorization, canonical correlation analysis and direct fusion operations.
However, because the views come from different domains, their feature spaces are often not aligned, and direct fusion mixes up feature scales. In addition, the views are often not equally important, so directly averaging all views lets important information be ignored while unimportant information interferes with learning, which degrades the clustering effect.
Disclosure of Invention
In order to solve the above technical problems, the embodiment of the invention provides a multi-view clustering method, device, medium and equipment for multi-source heterogeneous medical data, which align the distributions of the views of the multi-source heterogeneous data and guide feature learning with a self-supervised clustering loss, so that the views can be fused effectively and a better clustering effect is obtained.
In order to achieve the above object, an embodiment of the present invention provides a multi-view clustering method for multi-source heterogeneous medical data, including:
acquiring original multi-source heterogeneous medical data; wherein the raw multi-source heterogeneous medical data comprises a number of views;
extracting features of the data of each view to obtain hidden representations corresponding to each view one by one;
performing alternating adversarial learning on the hidden representations to align their distributions, thereby determining a plurality of new hidden representations;
carrying out consistency hidden representation learning according to the new hidden representation to obtain a consistency hidden representation;
inputting the consistency hidden representation to a clustering layer to obtain a clustering result output by the clustering layer; the clustering layer is obtained by training a clustering self-supervision loss function.
Further, extracting features from the data of each view to obtain hidden representations in one-to-one correspondence with the views specifically includes: inputting each view into the autoencoder network corresponding to that view, to obtain the hidden representation output by each autoencoder network.
Further, the raw multi-source heterogeneous medical data includes M views, where M ≥ 2;
performing alternating adversarial learning on the hidden representations to align their distributions, thereby determining a plurality of new hidden representations, specifically includes: selecting the v-th view as the reference view, where v = 1, 2, ..., M; inputting each hidden representation into the discriminator corresponding to the reference view to obtain the output results of the discriminator, in one-to-one correspondence with the hidden representations; performing one calculation from the output results and a preset alternating adversarial loss function to obtain an alternating adversarial loss result, and correcting the hidden representations based on the calculated loss result; judging whether the alternating adversarial loss result satisfies a preset alternating adversarial learning condition, the condition being that the alternating adversarial loss results of the N most recent calculations all lie within a preset first stable interval, where N is a preset positive integer; if the alternating adversarial learning condition is satisfied, taking the corrected hidden representations as the new hidden representations;
if the alternating adversarial learning condition is not satisfied: if (v+1) ≤ M, taking (v+1) as the new v to select a new reference view; if (v+1) > M, taking (v+1-M) as the new v to select the new reference view; then recalculating a new alternating adversarial loss result with the new reference view and correcting the hidden representations, until the alternating adversarial learning condition is satisfied, so that consistency representation learning is carried out only after the distributions of all views have reached an aligned state.
Further, the alternating adversarial loss function is given by formula (1):
wherein G is the autoencoder network, D is the discriminator, M is the number of views, r is the reference view, i is a view other than the reference view, z^(v) is the hidden representation corresponding to the v-th view, and E is the mathematical expectation.
Further, performing consistency hidden representation learning on the new hidden representations to obtain a consistency hidden representation specifically includes: concatenating the hidden representations of the views along the feature dimension and inputting the result into a preset weight learning network to obtain the weight vector W output by the weight learning network, where W = (W_1, W_2, ..., W_M) and W_t is the weight of the t-th view; multiplying W_t by the hidden representation of the t-th view to obtain z_t; and calculating the consistency hidden representation by accumulating z_t over all M views.
Further, the method further comprises: iteratively training the consistency hidden representation learning with a consistency hidden representation learning loss function until a preset consistency hidden representation learning condition is satisfied, to obtain the trained consistency hidden representation learning; the condition being that the number of training iterations reaches a preset second count threshold, or that the value of the consistency hidden representation learning loss function computed in an iteration lies within a preset second stable interval;
wherein the consistency hidden representation learning loss function is given by formula (2):
where K() is an n-dimensional matrix, n being the number of samples in each training iteration, Z_u is the consistency hidden representation, W_v is the weight of the v-th view, Z^(v) is the hidden representation corresponding to the v-th view, and ||·||_F denotes the Frobenius norm of a matrix;
wherein the element k_{i,j} in the i-th row and j-th column of the n-dimensional matrix K() is calculated by formula (3):
k_{i,j} = exp(-||K()_i - K()_j||^2)/2σ^2; (3)
wherein K()_i is the i-th row of the matrix K(), K()_j is the j-th row of the matrix K(), and σ is a preset constant.
Further, the clustering self-supervised loss function is given by formula (4):
min D_CS + L_ortho; (4)
wherein D_CS is a clustering constraint based on the CS divergence, and L_ortho is a cluster orthogonality constraint;
L_ortho = trisum(AA^T);
wherein C is the preset number of clusters, A is the clustering result, α_i is the i-th row of A, trisum() is the sum of the elements of the upper triangular matrix, K() is an n-dimensional matrix, and Z_u is the consistency hidden representation.
The embodiment of the invention also provides a multi-view clustering device for multi-source heterogeneous medical data, comprising:
the original data acquisition module is used for acquiring original multi-source heterogeneous medical data; wherein the raw multi-source heterogeneous medical data comprises a number of views;
the feature extraction module is used for extracting features of the data of each view to obtain hidden representations corresponding to each view one by one;
the alternating adversarial learning module is used for performing alternating adversarial learning on the hidden representations to align their distributions, thereby determining a plurality of new hidden representations;
the consistency hidden representation learning module is used for carrying out consistency hidden representation learning according to the new hidden representation to obtain a consistency hidden representation;
the clustering module is used for inputting the consistency hidden representation into a clustering layer to obtain a clustering result output by the clustering layer; the clustering layer is obtained by training a clustering self-supervision loss function.
The embodiment of the invention also provides a computer readable storage medium, on which a computer program is stored, which when executed by a processor, implements the steps of the multi-view clustering method of multi-source heterogeneous medical data as described in any one of the above.
The embodiment of the invention also provides computer equipment, which comprises a processor, a memory and a computer program stored in the memory and configured to be executed by the processor, wherein the processor realizes the steps of the multi-source heterogeneous medical data multi-view clustering method according to any one of the above steps when executing the computer program.
In summary, the invention has the following beneficial effects:
By adopting the embodiment of the invention, alternating adversarial learning aligns the distributions of the views of the multi-source heterogeneous medical data, and a weight learning network assigns weights to the distribution-aligned features. This resolves the fusion difficulty caused by misaligned feature spaces across views, achieves more effective feature fusion, and yields the consistency hidden representation. A self-supervised loss combining a CS-divergence-based clustering constraint with a cluster orthogonality constraint is introduced to guide feature learning, giving a cluster structure that is more compact within clusters and sparser between clusters, and ultimately improving the clustering of multi-source heterogeneous medical data.
Drawings
FIG. 1 is a flow diagram of one embodiment of a multi-view clustering method for multi-source heterogeneous medical data provided by the present invention;
fig. 2 is a schematic structural diagram of an embodiment of a multi-view clustering device for multi-source heterogeneous medical data provided by the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
Referring to fig. 1, a flow chart of an embodiment of a multi-view clustering method for multi-source heterogeneous medical data provided by the present invention is shown, and the method includes steps S1 to S5, specifically as follows:
S1, acquiring original multi-source heterogeneous medical data; wherein the raw multi-source heterogeneous medical data comprises a number of views;
S2, extracting features from the data of each view to obtain hidden representations in one-to-one correspondence with the views;
preferably, extracting features from the data of each view to obtain hidden representations in one-to-one correspondence with the views specifically includes: inputting each view into the autoencoder network corresponding to that view, to obtain the hidden representation output by each autoencoder network.
Illustratively, features are extracted from the original data X through autoencoder networks (Encoder), one independent network per view (View), to obtain the hidden representations of the views, Z = {Z^(1), ..., Z^(M)}, where Z^(v) is the hidden representation of the v-th view, v = 1, 2, ..., M.
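As a rough illustration of step S2, the following dependency-free sketch builds one independent encoder per view and maps views of different widths to hidden representations of a common width. The one-layer `tanh` encoder, the shared `HIDDEN_DIM`, the random weights and the sample values are all assumptions made for the example; the text does not specify the network architecture, and the decoder and reconstruction training of an autoencoder are omitted.

```python
import math
import random

def make_encoder(in_dim, hidden_dim, seed):
    """Return a toy one-layer encoder x -> tanh(W x), with W of shape hidden_dim x in_dim."""
    rng = random.Random(seed)
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(in_dim)] for _ in range(hidden_dim)]

    def encode(batch):
        # batch: list of samples, each a list of in_dim floats
        return [[math.tanh(sum(w * x for w, x in zip(row, sample))) for row in weights]
                for sample in batch]

    return encode

# Two hypothetical views of the same 3 samples: 4 features in view 1, 2 features in view 2.
views = [
    [[0.2, 0.1, 0.4, 0.3], [0.9, 0.8, 0.7, 0.6], [0.5, 0.5, 0.5, 0.5]],
    [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]],
]
HIDDEN_DIM = 3  # assumed common hidden width, so that the views can later be fused
encoders = [make_encoder(len(view[0]), HIDDEN_DIM, seed=v) for v, view in enumerate(views)]
Z = [encode(view) for encode, view in zip(encoders, views)]  # Z[v]: hidden representation of view v
```

Each entry of `Z` has one row per sample and `HIDDEN_DIM` columns, regardless of how many features the original view had.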
S3, performing alternating adversarial learning on the hidden representations to align their distributions, thereby determining a plurality of new hidden representations;
preferably, the raw multi-source heterogeneous medical data comprises M views, where M ≥ 2;
performing alternating adversarial learning on the hidden representations to align their distributions, thereby determining a plurality of new hidden representations, specifically includes: selecting the v-th view as the reference view, where v = 1, 2, ..., M; inputting each hidden representation into the discriminator corresponding to the reference view to obtain the output results of the discriminator, in one-to-one correspondence with the hidden representations; performing one calculation from the output results and a preset alternating adversarial loss function to obtain an alternating adversarial loss result, and correcting the hidden representations based on the calculated loss result; judging whether the alternating adversarial loss result satisfies a preset alternating adversarial learning condition, the condition being that the alternating adversarial loss results of the N most recent calculations all lie within a preset first stable interval, where N is a preset positive integer; if the alternating adversarial learning condition is satisfied, taking the corrected hidden representations as the new hidden representations;
if the alternating adversarial learning condition is not satisfied: if (v+1) ≤ M, taking (v+1) as the new v to select a new reference view; if (v+1) > M, taking (v+1-M) as the new v to select the new reference view; then recalculating a new alternating adversarial loss result with the new reference view and correcting the hidden representations, until the alternating adversarial learning condition is satisfied, so that consistency representation learning is carried out only after the distributions of all views have reached an aligned state.
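The control flow of this step (rotate the reference view, stop once the last N loss values sit inside the stable interval) can be sketched as follows. The function names and the replayed list of loss values are hypothetical; in a real run each loss value would come from evaluating formula (1) and updating the networks, which this sketch does not do.

```python
def next_reference_view(v, num_views):
    """Rotate the 1-based reference view index: v+1 if it does not exceed M, else v+1-M."""
    nxt = v + 1
    return nxt if nxt <= num_views else nxt - num_views

def alignment_converged(loss_history, n_recent, low, high):
    """True when the last n_recent loss values all lie inside [low, high]."""
    if len(loss_history) < n_recent:
        return False
    return all(low <= loss <= high for loss in loss_history[-n_recent:])

def rotation_schedule(num_views, losses, n_recent, low, high):
    """Replay precomputed loss values and return the sequence of reference views visited."""
    v, history, visited = 1, [], []
    for loss in losses:
        visited.append(v)
        history.append(loss)
        if alignment_converged(history, n_recent, low, high):
            break  # distributions are considered aligned; stop rotating
        v = next_reference_view(v, num_views)
    return visited

# With M = 3 views, N = 3 and a stable interval of [0.65, 0.75]:
schedule = rotation_schedule(3, [1.2, 0.9, 0.7, 0.69, 0.7, 0.5], 3, 0.65, 0.75)
```

Here the rotation visits views 1, 2, 3, 1, 2 and stops at the fifth calculation, the first point at which three consecutive loss values fall inside the interval.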
As an improvement of the above scheme, the alternating adversarial loss function is given by formula (1):
wherein G is the autoencoder network, D is the discriminator, M is the number of views, r is the reference view, i is a view other than the reference view, z^(v) is the hidden representation corresponding to the v-th view, and E is the mathematical expectation.
Illustratively, alternating adversarial learning (Alternate adversarial learning) is performed on the hidden representations Z of the views, and the learning result is fed back to the process that generates the hidden representations. The learning flow is as follows: (1) Select view v as the reference view; the other views are referenced views (fake views). Initially v is the first view. (2) Take the hidden representations of all views as input to the v-th discriminator (Discriminator) and obtain a numeric output in the interval [0, 1]. An output near 0 indicates that the discriminator considers the input a referenced view, whereas an output near 1 indicates that it considers the input the reference view. (3) Switch the reference view to the next view v+1; if v+1 is greater than the total number of views M, wrap around by taking v+1-M. Repeat steps (1) to (3) until the alternating adversarial learning process is completed.
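To make the labelling convention in step (2) concrete, the sketch below assigns target 1 to the reference view and 0 to every referenced view, then scores discriminator outputs with a binary cross-entropy. The cross-entropy is a stand-in chosen for illustration, since formula (1) is not reproduced in this text, and summarising each view by a single discriminator output is likewise an assumption.

```python
import math

def discriminator_targets(num_views, reference):
    """Target per view (1-based reference index): 1.0 for the reference view, 0.0 otherwise."""
    return [1.0 if v == reference else 0.0 for v in range(1, num_views + 1)]

def binary_cross_entropy(outputs, targets, eps=1e-12):
    """Mean binary cross-entropy between outputs in [0, 1] and 0/1 targets."""
    total = 0.0
    for p, t in zip(outputs, targets):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(t * math.log(p) + (1.0 - t) * math.log(1.0 - p))
    return total / len(outputs)

targets = discriminator_targets(num_views=3, reference=2)
separated = binary_cross_entropy([0.1, 0.9, 0.1], targets)  # discriminator tells the views apart
confused = binary_cross_entropy([0.5, 0.5, 0.5], targets)   # it can no longer tell them apart
```

Under this stand-in loss, a discriminator that outputs 0.5 for every view scores worse than one that separates the views, which is the situation the encoders are trained to bring about when the distributions become aligned.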
It should be noted that the present application can run on current mainstream deep learning frameworks, which automatically compute gradients for the corresponding modules (such as the generator and the discriminator) from the loss function and update the parameters of those modules by gradient descent. When the discriminator outputs a value, the alternating adversarial loss function computes the corresponding loss value, so that the parameters of the corresponding modules are updated, which in turn affects the generation of the hidden representations.
S4, performing consistency hidden representation learning on the new hidden representations to obtain a consistency hidden representation;
preferably, performing consistency hidden representation learning on the new hidden representations to obtain a consistency hidden representation specifically includes: concatenating the hidden representations of the views along the feature dimension and inputting the result into a preset weight learning network to obtain the weight vector W output by the weight learning network, where W = (W_1, W_2, ..., W_M) and W_t is the weight of the t-th view; multiplying W_t by the hidden representation of the t-th view to obtain z_t; and calculating the consistency hidden representation by accumulating z_t over all M views.
As an improvement of the above solution, the method further includes: iteratively training the consistency hidden representation learning with a consistency hidden representation learning loss function until a preset consistency hidden representation learning condition is satisfied, to obtain the trained consistency hidden representation learning; the condition being that the number of training iterations reaches a preset second count threshold, or that the value of the consistency hidden representation learning loss function computed in an iteration lies within a preset second stable interval; wherein the consistency hidden representation learning loss function is given by formula (2):
where K() is an n-dimensional matrix, n being the number of samples in each training iteration, Z_u is the consistency hidden representation, W_v is the weight of the v-th view, Z^(v) is the hidden representation corresponding to the v-th view, and ||·||_F denotes the Frobenius norm of a matrix;
wherein the element k_{i,j} in the i-th row and j-th column of the n-dimensional matrix K() is calculated by formula (3):
k_{i,j} = exp(-||K()_i - K()_j||^2)/2σ^2; (3)
wherein K()_i is the i-th row of the matrix K(), K()_j is the j-th row of the matrix K(), and σ is a preset constant.
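The kernel matrix of formula (3) can be sketched in a few lines. The sketch assumes the conventional Gaussian kernel reading, with the factor 2σ^2 inside the exponent; the flattened formula text does not settle where that factor belongs. It also treats the rows being compared as the rows of the representation matrix passed to K(). The function name and sample values are illustrative.

```python
import math

def gaussian_kernel_matrix(rows, sigma):
    """n x n matrix with k[i][j] = exp(-||rows[i] - rows[j]||^2 / (2 * sigma^2))."""
    n = len(rows)
    kernel = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            sq_dist = sum((a - b) ** 2 for a, b in zip(rows[i], rows[j]))
            kernel[i][j] = math.exp(-sq_dist / (2.0 * sigma ** 2))
    return kernel

# Three samples of a hypothetical 2-dimensional representation.
Z_u = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]]
K = gaussian_kernel_matrix(Z_u, sigma=1.0)
```

The result is symmetric with ones on the diagonal, and entries shrink toward 0 as two samples move apart, so σ controls how quickly similarity decays with distance.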
Illustratively, consistency hidden representation learning is performed on the hidden representations Z of the views. The learning flow is as follows: (1) The hidden representations of the views are concatenated along the feature dimension, and the result, Z_con, is used as the input of the weight learning network (Attention layer). (2) The weight learning network outputs an M-dimensional vector W; each component W_v lies in the interval [0, 1] and represents the importance of the v-th view as measured by the network. (3) For each view, W_v is multiplied by the corresponding Z^(v), and the products are accumulated to obtain the final consistency hidden representation Z_u.
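The three steps above can be sketched without any framework. The weight network here is a hypothetical stand-in (one linear unit per view, a sigmoid, then an average over the batch to get a single weight per view); the text only states that the network outputs M values in [0, 1], so its internals and the `params` values are assumptions.

```python
import math

def concat_views(hidden):
    """Concatenate per-view hidden representations along the feature dimension (Z_con)."""
    n = len(hidden[0])
    return [[x for view in hidden for x in view[i]] for i in range(n)]

def toy_weight_network(z_con, params):
    """Hypothetical weight network: one linear unit per view, sigmoid, averaged over the batch."""
    weights = []
    for unit in params:  # one parameter row per view
        scores = [sum(p * x for p, x in zip(unit, sample)) for sample in z_con]
        weights.append(sum(1.0 / (1.0 + math.exp(-s)) for s in scores) / len(scores))
    return weights  # M values, each strictly between 0 and 1

def fuse(hidden, weights):
    """Z_u = sum over views v of W_v * Z^(v), computed element by element."""
    n, d = len(hidden[0]), len(hidden[0][0])
    return [[sum(w * view[i][k] for w, view in zip(weights, hidden)) for k in range(d)]
            for i in range(n)]

hidden = [
    [[1.0, 0.0], [0.0, 1.0]],  # Z^(1): 2 samples, 2 features
    [[0.5, 0.5], [1.0, 1.0]],  # Z^(2)
]
z_con = concat_views(hidden)  # 2 samples, 4 features
params = [[0.1, 0.2, 0.3, 0.4], [-0.4, -0.3, -0.2, -0.1]]
W = toy_weight_network(z_con, params)
Z_u = fuse(hidden, W)
```

Because the fusion is a weighted sum rather than a plain average, a view that the network scores near 0 contributes almost nothing to `Z_u`, which is the stated motivation for learning the weights.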
S5, inputting the consistency hidden representation to a clustering layer to obtain a clustering result output by the clustering layer; the clustering layer is obtained by training a clustering self-supervision loss function.
Preferably, the clustering self-supervised loss function is given by formula (4):
min D_CS + L_ortho; (4)
wherein D_CS is a clustering constraint based on the CS divergence, and L_ortho is a cluster orthogonality constraint;
L_ortho = trisum(AA^T);
wherein C is the preset number of clusters, A is the clustering result, α_i is the i-th row of A, trisum() is the sum of the elements of the upper triangular matrix, K() is an n-dimensional matrix, and Z_u is the consistency hidden representation.
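The orthogonality term L_ortho = trisum(AA^T) is simple enough to sketch directly. Two assumptions are made here: A is taken to be an n x C matrix with one assignment row per sample, and trisum() is taken over the strictly upper triangle (diagonal excluded), which the text does not specify. The D_CS term is not sketched because its formula is not reproduced in this text.

```python
def trisum_strict(matrix):
    """Sum of the entries strictly above the main diagonal (assumed reading of trisum)."""
    n = len(matrix)
    return sum(matrix[i][j] for i in range(n) for j in range(i + 1, n))

def orthogonality_loss(A):
    """L_ortho = trisum(A A^T) for an assignment matrix A with one row per sample."""
    n = len(A)
    gram = [[sum(a * b for a, b in zip(A[i], A[j])) for j in range(n)] for i in range(n)]
    return trisum_strict(gram)

orthogonal_rows = [[1.0, 0.0], [0.0, 1.0]]   # two samples with orthogonal assignments
overlapping_rows = [[0.5, 0.5], [0.5, 0.5]]  # two samples with identical soft assignments
loss_orthogonal = orthogonality_loss(orthogonal_rows)
loss_overlapping = orthogonality_loss(overlapping_rows)
```

Under these assumptions the term is 0 when every pair of assignment rows is orthogonal and grows as rows overlap.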
Correspondingly, the embodiment of the invention also provides a multi-view clustering device for the multi-source heterogeneous medical data, which can realize all the flows of the multi-view clustering method for the multi-source heterogeneous medical data provided by the embodiment.
Referring to fig. 2, a schematic structural diagram of an embodiment of a multi-view clustering device for multi-source heterogeneous medical data is provided.
The original data acquisition module 101 is used for acquiring original multi-source heterogeneous medical data; wherein the raw multi-source heterogeneous medical data comprises a number of views;
the feature extraction module 102 is configured to perform feature extraction on the data of each view to obtain a hidden representation corresponding to each view one by one;
the alternating adversarial learning module 103 is configured to perform alternating adversarial learning on the hidden representations to align their distributions, thereby determining a plurality of new hidden representations;
the consistency hidden representation learning module 104 is configured to perform consistency hidden representation learning according to the new hidden representation to obtain a consistency hidden representation;
the clustering module 105 is configured to input the consistency hidden representation to a clustering layer, and obtain a clustering result output by the clustering layer; the clustering layer is obtained by training a clustering self-supervision loss function.
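One way to picture how modules 101 to 105 chain together is a small pipeline object with five pluggable stages. The class name, the stage signatures and the placeholder lambdas below are hypothetical and only show the data flow; none of the placeholders implements the training described in the method embodiment.

```python
class MultiViewClusteringPipeline:
    """Chains five pluggable stages that mirror modules 101 to 105."""

    def __init__(self, acquire, extract, align, fuse, cluster):
        self.acquire = acquire  # 101: returns a list of views
        self.extract = extract  # 102: views -> per-view hidden representations
        self.align = align      # 103: hidden representations -> aligned hidden representations
        self.fuse = fuse        # 104: aligned representations -> consistency hidden representation
        self.cluster = cluster  # 105: consistency hidden representation -> cluster labels

    def run(self):
        views = self.acquire()
        hidden = self.extract(views)
        aligned = self.align(hidden)
        consistent = self.fuse(aligned)
        return self.cluster(consistent)

# Placeholder stages: two views of two samples, identity extraction and alignment,
# plain averaging as fusion, and a threshold as the clustering layer.
pipeline = MultiViewClusteringPipeline(
    acquire=lambda: [[[0.1], [0.9]], [[0.2], [0.8]]],
    extract=lambda views: views,
    align=lambda hidden: hidden,
    fuse=lambda aligned: [[sum(view[i][0] for view in aligned) / len(aligned)]
                          for i in range(len(aligned[0]))],
    cluster=lambda z: [0 if row[0] < 0.5 else 1 for row in z],
)
labels = pipeline.run()
```

Keeping each stage behind a plain callable mirrors the module split of the device embodiment, so any one stage can be swapped without touching the others.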
The embodiment of the invention also provides a computer readable storage medium, on which a computer program is stored, which when executed by a processor, implements the steps of the multi-view clustering method of multi-source heterogeneous medical data as described in any one of the above.
The embodiment of the invention also provides computer equipment, which comprises a processor, a memory and a computer program stored in the memory and configured to be executed by the processor, wherein the processor realizes the steps of the multi-source heterogeneous medical data multi-view clustering method according to any one of the above steps when executing the computer program.
The computer device of this embodiment includes: a processor, a memory, and a computer program stored in the memory and executable on the processor, such as a multi-view clustering program of multi-source heterogeneous medical data. The processor, when executing the computer program, implements the steps in the above-described embodiments of the multi-view clustering method for multi-source heterogeneous medical data, such as steps S1 to S5 shown in fig. 1. Alternatively, the processor, when executing the computer program, performs the functions of the modules/units of the apparatus embodiments described above, e.g. 101 to 105.
The computer program may be divided into one or more modules/units, which are stored in the memory and executed by the processor to accomplish the present invention, for example. The one or more modules/units may be a series of computer program instruction segments capable of performing the specified functions, which instruction segments describe the execution of the computer program in the computer device.
The computer equipment can be a desktop computer, a notebook computer, a palm computer, a cloud server and other computing equipment. The computer device may include, but is not limited to, a processor, a memory.
The processor may be a central processing unit (Central Processing Unit, CPU), or another general-purpose processor, a digital signal processor (Digital Signal Processor, DSP), an application-specific integrated circuit (Application Specific Integrated Circuit, ASIC), a field-programmable gate array (Field-Programmable Gate Array, FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. The general-purpose processor may be a microprocessor or any conventional processor. The processor is the control center of the computer device and connects the various parts of the overall computer device using various interfaces and lines.
The memory may be used to store the computer program and/or modules, and the processor implements the various functions of the computer device by running or executing the computer program and/or modules stored in the memory and invoking data stored in the memory. The memory may mainly include a program storage area and a data storage area, wherein the program storage area may store an operating system, an application program required for at least one function (such as a sound playing function, an image playing function, etc.), and the like; the data storage area may store data (such as audio data, phonebook, etc.) created according to the use of the handset, etc. In addition, the memory may include high-speed random access memory, and may also include non-volatile memory, such as a hard disk, memory, plug-in hard disk, Smart Media Card (SMC), Secure Digital (SD) card, flash card (Flash Card), at least one disk storage device, flash memory device, or other volatile solid-state storage device.
Wherein the computer device integrated modules/units, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored in a computer readable storage medium. Based on such understanding, the present invention may implement all or part of the flow of the method of the above embodiment, or may be implemented by a computer program to instruct related hardware, where the computer program may be stored in a computer readable storage medium, and when the computer program is executed by a processor, the computer program may implement the steps of each of the method embodiments described above. Wherein the computer program comprises computer program code which may be in source code form, object code form, executable file or some intermediate form etc. The computer readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a U disk, a removable hard disk, a magnetic disk, an optical disk, a computer Memory, a Read-Only Memory (ROM), a random access Memory (RAM, random Access Memory), an electrical carrier signal, a telecommunications signal, a software distribution medium, and so forth.
In summary, the invention has the following beneficial effects:
By adopting the embodiments of the invention, the views of the multi-source heterogeneous medical data can be distribution-aligned through rotation adversarial learning, and a weight learning network assigns weights to the distribution-aligned features. This solves the fusion difficulty caused by misaligned feature spaces across views, achieves more effective feature fusion, and yields a consistency hidden representation. A self-supervision loss, comprising a cluster constraint based on CS divergence and a cluster orthogonal constraint, is then introduced to guide feature learning, giving a cluster structure that is compact within clusters and sparse between clusters, and ultimately improving the clustering of multi-source heterogeneous medical data.
From the above description of the embodiments, it will be clear to those skilled in the art that the present invention may be implemented by means of software plus necessary hardware platforms, but may of course also be implemented entirely in hardware. With such understanding, all or part of the technical solution of the present invention contributing to the background art may be embodied in the form of a software product, which may be stored in a storage medium, such as ROM/RAM, a magnetic disk, an optical disk, etc., including several instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to perform the method described in the embodiments or some parts of the embodiments of the present invention.
While the foregoing is directed to the preferred embodiments of the present invention, it will be appreciated by those skilled in the art that changes and modifications may be made without departing from the principles of the invention, and such changes and modifications are also intended to fall within the scope of the invention.

Claims (10)

1. A multi-view clustering method for multi-source heterogeneous medical data, comprising:
acquiring original multi-source heterogeneous medical data; wherein the original multi-source heterogeneous medical data comprises a number of views;
extracting features of the data of each view to obtain hidden representations corresponding to each view one by one;
performing rotation adversarial learning on the hidden representations to distribution-align the hidden representations, thereby re-determining a plurality of new hidden representations;
carrying out consistency hidden representation learning according to the new hidden representation to obtain a consistency hidden representation;
inputting the consistency hidden representation to a clustering layer to obtain a clustering result output by the clustering layer; the clustering layer is obtained by training a clustering self-supervision loss function.
2. The multi-view clustering method of multi-source heterogeneous medical data according to claim 1, wherein the feature extraction is performed on the data of each view to obtain hidden representations corresponding to each view one by one, and the method specifically comprises:
inputting each view into the autoencoder network corresponding one-to-one to that view, to obtain the hidden representation output by each autoencoder network.
3. The multi-view clustering method of multi-source heterogeneous medical data of claim 1, wherein the original multi-source heterogeneous medical data comprises M views; wherein M is more than or equal to 2;
the performing rotation adversarial learning on the hidden representations to distribution-align the hidden representations, thereby re-determining a plurality of new hidden representations, specifically comprises:
selecting a v-th view as a reference view; wherein v=1, 2, … …, M;
inputting each hidden representation to a discriminator corresponding to the reference view to obtain an output result which is output by the discriminator and corresponds to each hidden representation one by one;
performing one calculation according to the output results and a preset rotation adversarial loss function to obtain a rotation adversarial loss result, and correcting the hidden representations based on the calculated rotation adversarial loss result;
judging whether the rotation adversarial loss result meets a preset rotation adversarial learning condition; the rotation adversarial learning condition comprises that the rotation adversarial loss results obtained by the previous N calculations all lie within a preset first stable interval, N being a preset positive integer;
if the rotation adversarial learning condition is met, re-determining the corrected hidden representations as the new hidden representations;
if the rotation adversarial learning condition is not met, then:
if (v+1) ≤ M, taking (v+1) as the new v to select a new reference view;
if (v+1) > M, taking (v+1−M) as the new v to select a new reference view;
and recalculating a new rotation adversarial loss result according to the new reference view and correcting the hidden representations, until the rotation adversarial learning condition is met, so that consistency hidden representation learning is carried out after the distributions of all views have reached an aligned state.
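The reference-view rotation and stopping rule described in claim 3 can be sketched as follows. This is only an illustrative sketch under stated assumptions: `loss_fn` is a hypothetical callback standing in for one adversarial update with view v as the reference, `max_steps` is a safety cap not mentioned in the claims, and the wrap-around reads the final branch as mapping v+1 back into the range 1..M.

```python
def next_reference_view(v: int, num_views: int) -> int:
    """Return the next 1-based reference view index, wrapping past M back to 1."""
    return v + 1 if v + 1 <= num_views else v + 1 - num_views


def is_stable(history, n_last: int, low: float, high: float) -> bool:
    """True when the last n_last loss values all lie inside [low, high]."""
    if len(history) < n_last:
        return False
    return all(low <= x <= high for x in history[-n_last:])


def rotate_until_stable(loss_fn, num_views, n_last, low, high, max_steps=1000):
    """Cycle the reference view until the stability condition holds.

    loss_fn(v) is a hypothetical callback: it performs one adversarial
    update with view v as the reference and returns the resulting loss.
    max_steps is an assumed safety cap, not part of the claimed method.
    Returns (reference view at stop, number of calculations performed).
    """
    v, history = 1, []
    for step in range(1, max_steps + 1):
        history.append(loss_fn(v))
        if is_stable(history, n_last, low, high):
            return v, step
        v = next_reference_view(v, num_views)
    return v, max_steps
```

With M = 3, the reference view visits 1, 2, 3, 1, 2, ... until the last N loss values all fall inside the stable interval.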
4. The multi-source heterogeneous medical data multi-view clustering method of claim 3, wherein the rotation adversarial loss function is formula (1):
wherein $G$ is the autoencoder network, $D$ is the discriminator, $M$ is the number of views, $r$ is the reference view, $i$ is a view other than the reference view, $z^{(v)}$ is the hidden representation corresponding to the v-th view, and $E$ is the mathematical expectation.
5. The multi-view clustering method of multi-source heterogeneous medical data according to claim 1, wherein the carrying out consistency hidden representation learning according to the new hidden representations to obtain the consistency hidden representation specifically comprises:
concatenating the new hidden representations corresponding one-to-one to the views along the feature dimension and inputting the result into a preset weight learning network to obtain a weight vector $W$ output by the weight learning network; wherein $W = (W_1, W_2, \dots, W_M)$ and $W_t$ is the weight of the t-th view;
multiplying $W_t$ by the hidden representation of the t-th view to obtain $z_t$;
calculating the consistency hidden representation
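As a rough illustration of the weighting step in claim 5, the sketch below scales each view's hidden vector by its weight and then combines the scaled vectors. The combination formula is not reproduced in this text, so the element-wise sum used here is an assumption, as is the requirement that all views share one hidden dimension. The function name `fuse_views` is hypothetical, and the weights are taken as given rather than produced by a weight learning network.

```python
def fuse_views(hidden, weights):
    """Combine per-view hidden vectors for one sample into a single vector.

    hidden  : list of M vectors (one per view), all of the same length d
    weights : list of M scalars W_1..W_M, one per view
    Each view is scaled as z_t = W_t * hidden[t]; the scaled vectors are
    then summed element-wise (the sum is an assumption of this sketch).
    """
    if len(hidden) != len(weights):
        raise ValueError("one weight per view is required")
    dim = len(hidden[0])
    if any(len(h) != dim for h in hidden):
        raise ValueError("all views must share the same hidden dimension")
    scaled = [[w * x for x in h] for h, w in zip(hidden, weights)]
    return [sum(column) for column in zip(*scaled)]
```

For example, two views with hidden vectors [1.0, 2.0] and [3.0, 4.0] and weights 0.25 and 0.75 would fuse to [2.5, 3.5] under this assumed sum.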
6. The multi-source heterogeneous medical data multi-view clustering method of claim 5, wherein the method further comprises:
iteratively training the consistency hidden representation learning with a consistency hidden representation learning loss function until a preset consistency hidden representation learning condition is met, to obtain the trained consistency hidden representation learning; the consistency hidden representation learning condition comprises that the number of training iterations reaches a preset second number threshold, or that the value obtained by iteratively calculating the consistency hidden representation learning loss function lies within a preset second stable interval;
wherein the consistency hidden representation learning loss function is formula (2):
where $K(\cdot)$ is an $n$-dimensional matrix, $n$ being the number of samples in each training iteration, $Z_u$ is the consistency hidden representation, $W_v$ is the weight of the v-th view, $Z^{(v)}$ is the hidden representation corresponding to the v-th view, and $\|\cdot\|_F$ denotes the F-norm of a matrix;
wherein the element $k_{i,j}$ in the i-th row and j-th column of the $n$-dimensional matrix $K(\cdot)$ is calculated by the following formula (3):
$k_{i,j} = \exp\left(-\frac{\|K(\cdot)_i - K(\cdot)_j\|^2}{2\sigma^2}\right)$; (3)
wherein $K(\cdot)_i$ is the i-th row of the matrix $K(\cdot)$, $K(\cdot)_j$ is the j-th row of the matrix $K(\cdot)$, and $\sigma$ is a preset constant.
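The kernel matrix of formula (3) can be illustrated with a short sketch. It assumes the formula is the usual Gaussian kernel, with the squared Euclidean distance between two rows divided by 2σ² inside the exponent, and that the rows are the per-sample vectors of the representation being compared. The function name `gaussian_kernel_matrix` is hypothetical.

```python
import math


def gaussian_kernel_matrix(rows, sigma):
    """Build the n x n matrix with k[i][j] = exp(-||r_i - r_j||^2 / (2 * sigma^2)).

    rows  : list of n vectors of equal length (one per sample)
    sigma : preset positive constant controlling the kernel width
    """
    n = len(rows)
    kernel = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            # Squared Euclidean distance between sample i and sample j.
            sq_dist = sum((a - b) ** 2 for a, b in zip(rows[i], rows[j]))
            kernel[i][j] = math.exp(-sq_dist / (2.0 * sigma ** 2))
    return kernel
```

The resulting matrix is symmetric with ones on the diagonal, and entries shrink toward zero as samples move apart relative to σ.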
7. The multi-source heterogeneous medical data multi-view clustering method of claim 1, wherein the clustering self-supervision loss function is formula (4):
$\min D_{CS} + L_{ortho}$; (4)
wherein $D_{CS}$ is the cluster constraint based on CS divergence, and $L_{ortho}$ is the cluster orthogonal constraint;
$L_{ortho} = \mathrm{trisum}(AA^{T})$;
wherein $C$ is the preset number of clusters, $A$ is the clustering result, $\alpha_i$ is the i-th row of $A$, $\mathrm{trisum}(\cdot)$ is the sum of the elements of the upper triangular matrix, $K(\cdot)$ is the $n$-dimensional matrix, and $Z_u$ is the consistency hidden representation.
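The orthogonal constraint in claim 7, trisum applied to the product of A and its transpose, can be sketched as below. The text does not say whether the diagonal is included in the upper-triangular sum; this sketch assumes the strictly upper triangle, so that only products between different rows are penalised. The function names are hypothetical, and A is taken to be an n × C matrix of soft cluster assignments.

```python
def trisum(matrix):
    """Sum the entries strictly above the main diagonal (assumed convention)."""
    return sum(
        matrix[i][j]
        for i in range(len(matrix))
        for j in range(i + 1, len(matrix[i]))
    )


def orthogonality_penalty(assignments):
    """Compute trisum(A A^T) for an n x C assignment matrix A.

    Entry (i, j) of A A^T is the inner product of rows i and j of A,
    so the penalty is zero when all rows are mutually orthogonal.
    """
    gram = [
        [sum(x * y for x, y in zip(row_i, row_j)) for row_j in assignments]
        for row_i in assignments
    ]
    return trisum(gram)
```

Under this convention, mutually orthogonal one-hot rows give a penalty of zero, while any two rows that overlap contribute their inner product.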
8. A multi-view clustering device for multi-source heterogeneous medical data, comprising:
the original data acquisition module is used for acquiring original multi-source heterogeneous medical data; wherein the original multi-source heterogeneous medical data comprises a number of views;
the feature extraction module is used for extracting features of the data of each view to obtain hidden representations corresponding to each view one by one;
the rotation adversarial learning module is used for performing rotation adversarial learning on the hidden representations to distribution-align the hidden representations, thereby re-determining a plurality of new hidden representations;
the consistency hidden representation learning module is used for carrying out consistency hidden representation learning according to the new hidden representation to obtain a consistency hidden representation;
the clustering module is used for inputting the consistency hidden representation into a clustering layer to obtain a clustering result output by the clustering layer; the clustering layer is obtained by training a clustering self-supervision loss function.
9. A computer readable storage medium having stored thereon a computer program, which when executed by a processor implements the multi-view clustering method of multi-source heterogeneous medical data according to any one of claims 1 to 7.
10. A computer device comprising a processor, a memory, and a computer program stored in the memory and configured to be executed by the processor, the processor implementing the multi-source heterogeneous medical data multi-view clustering method of any one of claims 1 to 7 when the computer program is executed.
CN202310313005.9A 2023-03-27 2023-03-27 Multi-view clustering method, device, medium and equipment for multi-source heterogeneous medical data Pending CN116451095A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310313005.9A CN116451095A (en) 2023-03-27 2023-03-27 Multi-view clustering method, device, medium and equipment for multi-source heterogeneous medical data


Publications (1)

Publication Number Publication Date
CN116451095A true CN116451095A (en) 2023-07-18

Family

ID=87121262

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310313005.9A Pending CN116451095A (en) 2023-03-27 2023-03-27 Multi-view clustering method, device, medium and equipment for multi-source heterogeneous medical data

Country Status (1)

Country Link
CN (1) CN116451095A (en)


Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117523244A (en) * 2023-10-31 2024-02-06 哈尔滨工业大学(威海) Multi-view clustering method, system, electronic equipment and storage medium
CN117523244B (en) * 2023-10-31 2024-05-24 哈尔滨工业大学(威海) Multi-view clustering method, system, electronic equipment and storage medium


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination