CN111563549A - Medical image clustering method based on multitask evolutionary algorithm - Google Patents

Medical image clustering method based on multitask evolutionary algorithm

Info

Publication number
CN111563549A
Authority
CN
China
Prior art keywords
roi
medical image
clustering
cluster
image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010364563.4A
Other languages
Chinese (zh)
Other versions
CN111563549B (en)
Inventor
胡晓敏
颜志鹏
陈伟能
李敏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong University of Technology
Original Assignee
Guangdong University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdong University of Technology filed Critical Guangdong University of Technology
Priority to CN202010364563.4A priority Critical patent/CN111563549B/en
Publication of CN111563549A publication Critical patent/CN111563549A/en
Application granted granted Critical
Publication of CN111563549B publication Critical patent/CN111563549B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00 Road transport of goods or passengers
    • Y02T10/10 Internal combustion engine [ICE] based vehicles
    • Y02T10/40 Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a medical image clustering method based on a multitask evolutionary algorithm, comprising the following steps: S1, extracting ROI feature description data of a medical image; S2, reading the extracted ROI feature description data and, by applying the NMP clustering rule and optimizing several internal clustering indexes, obtaining several clustering results under a multitask framework; and S3, selecting the optimal result from these results using the expert knowledge of a doctor. The method can fully express the content of medical images, optimizes a single population to obtain several clustering results at once, converges to the global optimum more easily through cross-domain communication, and achieves a more pronounced clustering effect.

Description

Medical image clustering method based on multitask evolutionary algorithm
Technical Field
The invention relates to the fields of medical technology and intelligent computing, and in particular to a medical image clustering method based on a multitask evolutionary algorithm.
Background
Over the last 20 years, medical imaging has become one of the fastest-growing areas of medical technology, and medical images have become increasingly easy to acquire and store. Medical images account for a large proportion of all image data and play an important role in medical systems. By observing a medical image, a lesion can be judged more directly and clearly, leading to a more accurate diagnosis. Accurate clustering of medical images gives medical staff a better scientific reference for judging and diagnosing the cause of disease, which can greatly reduce the misdiagnosis rate caused by the limited resolution of human vision or by insufficient clinical experience, and further improves the utilization of medical images.
At present, traditional clustering algorithms have been applied to medical images both in China and abroad, but each has drawbacks: methods based on K-means and its variants are sensitive to the initial centers and to the value of K; density-based methods are sensitive to the neighborhood radius and the MinPts value and are strongly affected by noise; and grid-based methods lack precision.
Evolutionary algorithms, commonly a genetic algorithm or differential evolution, have also been used for clustering, but they are based on a single-task framework: one run can optimize only a single objective and yields a single optimization result.
Moreover, not every pixel in a medical image is worth observing: doctors pay more attention to distinctive pixel regions, called regions of interest (ROIs). In past image studies, researchers extracted ROIs based on traditional image features such as color, texture and shape, but this approach does not suit medical images, which have many characteristics of their own compared with general image data.
Disclosure of Invention
The invention aims to overcome the defects of the prior art and provides a medical image clustering method based on a multitask evolutionary algorithm that can fully express the content of medical images, optimizes a single population to obtain several clustering results at once, converges to the global optimum more easily through cross-domain communication, and achieves a more pronounced clustering effect.
To achieve this purpose, the technical scheme provided by the invention is as follows:
The medical image clustering method based on the multitask evolutionary algorithm comprises the following steps:
S1, extracting ROI feature description data of the medical image;
S2, reading the extracted ROI feature description data and, by applying the NMP clustering rule and optimizing several internal clustering indexes, obtaining several clustering results under a multitask framework;
S3, selecting the optimal result from these results using the expert knowledge of a doctor.
Further, the ROI feature description data of the medical image extracted in step S1 comprise relative gray s1, relative area s2, relative centroid coordinate s3, circularity s4, angle s5 and symmetry s6.
Further, the specific process of extracting the ROI feature description data of the medical image in step S1 is as follows:
S1-1, reading in a medical image;
S1-2, scanning the medical image to obtain the number of pixels, the length, the width and the maximum gray level of the image;
S1-3, detecting the symmetry s6 of the medical image;
S1-4, extracting the ROI (region of interest) of the medical image according to a gray-level range;
S1-5, extracting the mean gray level, the number of pixels, the longest axis, the shortest axis, the centroid coordinates and the angle from the ROI obtained in step S1-4;
S1-6, calculating the feature description data of the ROI.
Further, the symmetry s6 of the medical image is detected in step S1-3 as follows:
the medical image is folded onto itself and the difference is taken; the result is binarized with a gray threshold; if the number of remaining pixels is less than a set value, the medical image is judged to be symmetric, otherwise it is judged to be asymmetric.
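The fold-and-difference test can be sketched as below. This is a minimal illustration, not the patented implementation: it assumes the fold is about the vertical midline, and `gray_threshold` and `max_remaining` are hypothetical stand-ins for the "gray threshold" and "set value", whose actual values the text does not give.

```python
def is_symmetric(image, gray_threshold=30, max_remaining=50):
    """Return True if a grayscale image (list of equal-length rows) looks left-right symmetric.

    gray_threshold and max_remaining are hypothetical parameter values.
    """
    remaining = 0
    for row in image:
        width = len(row)
        # Fold about the vertical midline: pixel j overlays pixel width-1-j.
        for j in range(width // 2):
            diff = abs(row[j] - row[width - 1 - j])
            # Binarize the difference and count the pixels that survive.
            if diff > gray_threshold:
                remaining += 1
    return remaining < max_remaining
```

With an odd width the middle column has no partner and is ignored.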
The specific process of calculating the feature description data of the ROI in step S1-6 is as follows:
relative gray s1:
Figure BDA0002476283740000031
where ROI.gray is the average gray level of the ROI and IMAGE.gray is the average gray level of the whole image;
relative area s2: s2 = ROI.area / IMAGE.area, where ROI.area is the number of pixels in the ROI and IMAGE.area is the number of pixels in the entire image;
relative centroid coordinate s3: s3 = (ROI.x / IMAGE.length, ROI.y / IMAGE.height), where ROI.x is the abscissa of the ROI centroid, IMAGE.length is the length of the original image, ROI.y is the ordinate of the ROI centroid, and IMAGE.height is the height of the original image;
circularity s4: s4 = 4 × π × ROI.area / ROI.perimeter^2, where ROI.area is the number of pixels in the ROI and ROI.perimeter is the number of pixels on the boundary of the ROI;
angle s5: s5 = (Orientation + 90) / 180, where Orientation is the angle between the long axis of the ROI and the X axis.
Further, the specific process of step S2 is as follows:
S2-1, reading the extracted ROI feature description data of the medical image;
S2-2, initializing a population, setting the maximum number of iterations n, and setting the iteration counter k = 0;
S2-3, performing clustering with the NMP clustering rule and evaluating the clustering indexes;
S2-4, calculating the skill factor τ of each individual;
S2-5, generating offspring: randomly selecting two individuals a and b from the population as parents and generating a random number rand between 0 and 1; if rand is less than the algorithm parameter rmp, or the skill factors τ of the two individuals are equal, performing simulated binary crossover; otherwise, mutating each of the two individuals separately; repeating the crossover or mutation until the number of offspring equals the population size, and then proceeding to step S2-6;
S2-6, calculating the fitness value of each offspring under each clustering-index optimization task;
S2-7, merging the parents and the offspring into a new population, and recalculating and updating the skill factor τ and the scalar fitness φ of every individual according to the fitness values after clustering;
S2-8, sorting the individuals in the population by scalar fitness φ and selecting them in order from best to worst for the next generation, so that individuals with larger φ are selected first; setting k = k + 1;
S2-9, if k is less than n, returning to step S2-3; otherwise, stopping the iteration;
S2-10, finding the individuals whose skill factor τ equals 1 and recording them as the optimal individuals of the current tasks.
Further, in step S2-2, the population is initialized with the following encoding:
with the maximum number of cluster classes set to K_max, each individual of the population is encoded as a vector of dimension K_max + K_max × d:
Figure BDA0002476283740000041
where d is the dimensionality of the data, m_ij is the coordinate vector of a cluster center, and T_ij (j = 1, ..., K_max) is the activation threshold of the corresponding cluster centroid;
a centroid is activated according to the following rule:
if T_ij is greater than 0.5, the corresponding cluster centroid m_ij is activated; otherwise it is not activated. If T_ij is greater than 1 or negative, it is reset to 1 or 0 respectively. If the number of activated centroids is less than the set minimum number of classes, several activation thresholds are randomly selected and activated to meet the minimum.
Further, in step S2-3, the NMP clustering rule is as follows:
given a data set of N samples X = (x_1, ..., x_N) and K clusters C = {C_1, ..., C_K}, where m_h is the centroid of cluster C_h and D denotes distance, the NMP rule defines the distance between a sample point x_i (i = 1, ..., N) and a cluster C_h (h = 1, ..., K) as:
D(x_i, C_h) = min{ D(x_i, x_j), D(x_i, m_h) | x_j ∈ C_h }
that is, the distance from a sample to a cluster is the distance from the sample to the nearest point of that cluster, whether a member sample or the centroid;
each sample is first assigned provisionally to its nearest cluster, called the pending cluster of that sample, and all samples provisionally assigned to the same cluster form that cluster's candidate sample set; then, for each cluster, the nearest sample is selected from its candidate sample set, and the nearest samples of all clusters together form the nearest sample set; finally, the sample in the nearest sample set whose distance to its pending cluster is smallest is found and assigned to that pending cluster; these steps are repeated until all samples have been assigned.
Further, in step S2-3, the clustering indexes include the CH index, the Dunn index and the SIL index, each corresponding to one optimization task; the indexes are as follows:
CH index:
Figure BDA0002476283740000051
in the above formula,
Figure BDA0002476283740000052
denotes the trace of the between-class scatter matrix, and m denotes the mean vector of the whole data set;
Dunn index:
Figure BDA0002476283740000053
in the above formula, D(C_i, C_j) denotes the distance between two different classes, defined as the distance between their two closest data points:
Figure BDA0002476283740000054
and the diameter of class C_i is the distance between its two farthest points:
Figure BDA0002476283740000055
SIL index:
Figure BDA0002476283740000056
in the above formula, s_j = (b_j − a_j) / max(a_j, b_j) is the silhouette width of data point x_j; the average distance a_j from x_j to the other data points of its own class and the minimum distance b_j from x_j to the data points of other classes are calculated as follows:
Figure BDA0002476283740000057
Further, in step S2-5, simulated binary crossover operates as follows:
given two parents x_a = [x_a(1), ..., x_a(d)] and x_b = [x_b(1), ..., x_b(d)], where d is the dimensionality of the data, the spread factor c(j) is first calculated:
Figure BDA0002476283740000058
where β is a system parameter greater than 0, and r is a random number between 0 and 1 drawn for each dimension;
the resulting offspring are:
x_e(j) = [(1 + c(j)) x_a(j) + (1 − c(j)) x_b(j)] / 2
x_f(j) = [(1 + c(j)) x_b(j) + (1 − c(j)) x_a(j)] / 2.
Compared with the prior art, the principle and advantages of this scheme are as follows:
1. Using a multifactorial evolutionary algorithm, a single population is optimized simultaneously for several internal clustering indexes, such as the CH, Dunn and SIL indexes, yielding several clustering results, and cross-domain communication makes convergence to the global optimum easier.
2. Traditional image feature mining based on color, texture and shape usually ignores the information carried by the attributes unique to a medical image ROI. The ROI attributes used by this method are relative gray, relative area, relative centroid coordinates, circularity, angle and symmetry, so the characteristics specific to medical images are taken into account alongside traditional image features.
3. Clustering rule based on NMP (nearest multiple prototypes): when a traditional particle-swarm-style optimization is applied to clustering, it performs poorly on clusters that are not circular in shape. The NMP rule is a dynamic process, is more flexible, and generally gives a better clustering result.
Drawings
In order to illustrate the embodiments of the present invention or the technical solutions of the prior art more clearly, the drawings required for describing them are briefly introduced below. The drawings described below show only some embodiments of the present invention; those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is an overall flow chart of the medical image clustering method based on the multi-task evolutionary algorithm of the present invention;
FIG. 2 is a flow chart of medical image ROI data extraction;
FIG. 3 is a diagram of encoding of a multitask cluster evolution algorithm.
Detailed Description
The invention will be further illustrated with reference to specific examples:
As shown in FIG. 1, the medical image clustering method based on the multitask evolutionary algorithm of this embodiment comprises the following steps:
S1, extracting ROI feature description data of the medical image, comprising relative gray s1, relative area s2, relative centroid coordinate s3, circularity s4, angle s5 and symmetry s6.
As shown in FIG. 2, the specific extraction process is as follows:
S1-1, reading in a medical image;
S1-2, scanning the medical image to obtain the number of pixels, the length, the width and the maximum gray level of the image;
S1-3, detecting the symmetry s6 of the medical image:
the medical image is folded onto itself and the difference is taken; the result is binarized with a gray threshold; if the number of remaining pixels is less than a set value, the medical image is judged to be symmetric, otherwise it is judged to be asymmetric;
S1-4, extracting the ROI (region of interest) of the medical image according to a gray-level range;
S1-5, extracting the mean gray level, the number of pixels, the longest axis, the shortest axis, the centroid coordinates and the angle from the ROI obtained in step S1-4;
S1-6, calculating the feature description data of the ROI:
relative gray s1:
Figure BDA0002476283740000071
where ROI.gray is the average gray level of the ROI and IMAGE.gray is the average gray level of the whole image;
relative area s2: s2 = ROI.area / IMAGE.area, where ROI.area is the number of pixels in the ROI and IMAGE.area is the number of pixels in the entire image;
relative centroid coordinate s3: s3 = (ROI.x / IMAGE.length, ROI.y / IMAGE.height), where ROI.x is the abscissa of the ROI centroid, IMAGE.length is the length of the original image, ROI.y is the ordinate of the ROI centroid, and IMAGE.height is the height of the original image;
circularity s4: s4 = 4 × π × ROI.area / ROI.perimeter^2, where ROI.area is the number of pixels in the ROI and ROI.perimeter is the number of pixels on the boundary of the ROI;
angle s5: s5 = (Orientation + 90) / 180, where Orientation is the angle between the long axis of the ROI and the X axis.
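As an illustration of step S1-6, the sketch below turns raw ROI measurements into the six descriptors. Several points are assumptions rather than statements of the patent: the formula image for s1 is not reproduced in this text, so s1 is taken to be the ratio ROI.gray / IMAGE.gray; the perimeter is taken as the boundary pixel count; the orientation is assumed to be in degrees within [-90, 90]; symmetry is encoded as 1.0 or 0.0; and the names `RoiMeasurements` and `roi_features` are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class RoiMeasurements:
    gray: float         # mean gray level inside the ROI
    area: int           # number of pixels inside the ROI
    perimeter: int      # number of pixels on the ROI boundary
    x: float            # abscissa of the ROI centroid
    y: float            # ordinate of the ROI centroid
    orientation: float  # long-axis angle to the X axis, assumed degrees in [-90, 90]


def roi_features(roi, image_gray, image_length, image_height, symmetric):
    """Compute the descriptors s1..s6 for one ROI."""
    image_area = image_length * image_height  # number of pixels in the whole image
    return {
        "s1": roi.gray / image_gray,  # ASSUMED ratio form; the original formula is an image
        "s2": roi.area / image_area,
        "s3": (roi.x / image_length, roi.y / image_height),
        "s4": 4 * math.pi * roi.area / roi.perimeter ** 2,
        "s5": (roi.orientation + 90) / 180,
        "s6": 1.0 if symmetric else 0.0,  # ASSUMED numeric encoding of symmetry
    }
```

Every descriptor is normalized by a whole-image quantity or a fixed range, so ROIs from images of different size and brightness become comparable before clustering.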
S2, after the ROI feature description data of the medical image have been extracted, the data are read in, the NMP clustering rule is applied, several internal clustering indexes are optimized, and several clustering results are obtained under a multitask framework; the specific process is as follows:
S2-1, reading the extracted ROI feature description data of the medical image;
S2-2, initializing a population, setting the maximum number of iterations n, and setting the iteration counter k = 0; the initialization uses the following encoding:
with the maximum number of cluster classes set to K_max, each individual of the population is encoded as a vector of dimension K_max + K_max × d:
Figure BDA0002476283740000081
where d is the dimensionality of the data, m_ij is the coordinate vector of a cluster center, and T_ij (j = 1, ..., K_max) is the activation threshold of the corresponding cluster centroid;
a centroid is activated according to the following rule:
if T_ij is greater than 0.5, the corresponding cluster centroid m_ij is activated; otherwise it is not activated. If T_ij is greater than 1 or negative, it is reset to 1 or 0 respectively. If the number of activated centroids is less than the set minimum number of classes, several activation thresholds are randomly selected and activated to meet the minimum. FIG. 3 shows an individual with dimension d = 2 and maximum number of cluster classes K_max = 4. The leading positions hold the activation thresholds; the second threshold, 0.4, is less than 0.5, so the second centroid is inactive, while the thresholds at the other positions are greater than 0.5 and their centroids are active. The number of clusters for this individual is therefore 3. In practice, d is set to 6.
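Decoding such an individual can be sketched as follows. The layout (K_max thresholds first, then K_max centroids of d genes each) follows the description of FIG. 3; the function name `decode_individual` and the default minimum class count `k_min=2` are hypothetical, and the example threshold values other than 0.4 are made up for illustration.

```python
import random


def decode_individual(individual, k_max, d, k_min=2, rng=random):
    """Return the active cluster centroids encoded by one individual."""
    thresholds = individual[:k_max]
    # A centroid is active when its threshold exceeds 0.5.
    active = [j for j in range(k_max) if thresholds[j] > 0.5]
    if len(active) < k_min:
        # Too few active centroids: switch on randomly chosen inactive ones.
        inactive = [j for j in range(k_max) if j not in active]
        active += rng.sample(inactive, k_min - len(active))
        active.sort()
    # Centroid j occupies d consecutive genes after the k_max thresholds.
    return [individual[k_max + j * d: k_max + (j + 1) * d] for j in active]


# An individual shaped like the FIG. 3 example: d = 2, K_max = 4, second threshold 0.4.
example = [0.9, 0.4, 0.7, 0.8, 1, 1, 2, 2, 3, 3, 4, 4]
centroids = decode_individual(example, k_max=4, d=2)  # three active centroids
```

Because the number of active centroids varies between individuals, the same fixed-length vector lets the search explore different cluster counts.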
S2-3, performing clustering with the NMP clustering rule and evaluating the clustering indexes;
the NMP clustering rule in this step is as follows:
given a data set of N samples X = (x_1, ..., x_N) and K clusters C = {C_1, ..., C_K}, where m_h is the centroid of cluster C_h and D denotes distance, the NMP rule defines the distance between a sample point x_i (i = 1, ..., N) and a cluster C_h (h = 1, ..., K) as:
D(x_i, C_h) = min{ D(x_i, x_j), D(x_i, m_h) | x_j ∈ C_h }
that is, the distance from a sample to a cluster is the distance from the sample to the nearest point of that cluster, whether a member sample or the centroid;
each sample is first assigned provisionally to its nearest cluster, called the pending cluster of that sample, and all samples provisionally assigned to the same cluster form that cluster's candidate sample set; then, for each cluster, the nearest sample is selected from its candidate sample set, and the nearest samples of all clusters together form the nearest sample set; finally, the sample in the nearest sample set whose distance to its pending cluster is smallest is found and assigned to that pending cluster; these steps are repeated until all samples have been assigned.
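The assignment loop described above can be sketched in a few lines. This is a straightforward, unoptimized reading of the rule, assuming Euclidean distance for D (the text does not name the metric); `nmp_assign` is a hypothetical name.

```python
import math


def nmp_assign(samples, centroids):
    """Label every sample with a cluster index using the NMP rule."""
    members = [[] for _ in centroids]  # indices of samples already in each cluster
    labels = [None] * len(samples)
    unassigned = set(range(len(samples)))

    def cluster_distance(i, h):
        # D(x_i, C_h): distance to the nearest prototype, i.e. the centroid m_h
        # or any sample already assigned to C_h.
        prototypes = [centroids[h]] + [samples[j] for j in members[h]]
        return min(math.dist(samples[i], p) for p in prototypes)

    while unassigned:
        nearest = {}  # cluster index -> (distance, sample index) of its nearest candidate
        for i in unassigned:
            # The pending cluster of sample i is the cluster nearest to it.
            d, h = min((cluster_distance(i, c), c) for c in range(len(centroids)))
            if h not in nearest or (d, i) < nearest[h]:
                nearest[h] = (d, i)
        # Among the per-cluster nearest samples, assign the overall closest one.
        h, (_, i) = min(nearest.items(), key=lambda item: item[1])
        labels[i] = h
        members[h].append(i)
        unassigned.remove(i)
    return labels
```

Because every assigned sample becomes a new prototype, a cluster can grow along an elongated shape. For two parallel chains of points with centroids at opposite ends, the point (4, 0) is closer to the centroid (4, 3) than to (0, 0), yet it joins the chain that started at (0, 0) because its neighbor (3, 0) is already a member.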
The clustering indexes include the CH index, the Dunn index and the SIL index, each corresponding to one optimization task; the indexes are as follows:
CH index:
Figure BDA0002476283740000091
in the above formula,
Figure BDA0002476283740000092
denotes the trace of the between-class scatter matrix, and m denotes the mean vector of the whole data set;
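The formula images for the CH index are not reproduced in this text. As an illustration only, the sketch below computes the standard Calinski-Harabasz index, CH = [tr(B) / (K − 1)] / [tr(W) / (N − K)], where tr(B) is the trace of the between-class scatter matrix and tr(W) the trace of the within-class scatter matrix. That the patent uses exactly this form is an assumption; `ch_index` is a hypothetical name.

```python
def ch_index(samples, labels):
    """Standard Calinski-Harabasz index; larger values mean better-separated clusters."""
    n, dim = len(samples), len(samples[0])
    clusters = {}
    for x, h in zip(samples, labels):
        clusters.setdefault(h, []).append(x)
    k = len(clusters)

    def mean(points):
        return [sum(p[c] for p in points) / len(points) for c in range(dim)]

    def sq_dist(a, b):
        return sum((u - v) ** 2 for u, v in zip(a, b))

    m = mean(samples)  # mean vector of the whole data set
    # tr(B): size-weighted squared distance of each cluster mean from m.
    trace_b = sum(len(pts) * sq_dist(mean(pts), m) for pts in clusters.values())
    # tr(W): squared distance of each sample from its own cluster mean.
    trace_w = sum(sq_dist(x, mean(pts)) for pts in clusters.values() for x in pts)
    return (trace_b / (k - 1)) / (trace_w / (n - k))
```

The function requires at least two clusters, fewer clusters than samples, and nonzero within-class scatter; otherwise a division by zero occurs.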
Dunn index:
Figure BDA0002476283740000093
in the above formula, D(C_i, C_j) denotes the distance between two different classes, defined as the distance between their two closest data points:
Figure BDA0002476283740000094
and the diameter of class C_i is the distance between its two farthest points:
Figure BDA0002476283740000095
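Using the two definitions given in the text (between-class distance as the closest pair, class diameter as the farthest pair), the Dunn index can be sketched as the smallest between-class distance divided by the largest class diameter. The overall ratio is the standard Dunn form and is assumed here because the formula image is not reproduced; Euclidean distance and the name `dunn_index` are also assumptions.

```python
import math
from itertools import combinations


def dunn_index(samples, labels):
    """Smallest between-class distance divided by largest class diameter."""
    clusters = {}
    for x, h in zip(samples, labels):
        clusters.setdefault(h, []).append(x)
    groups = list(clusters.values())
    # D(C_i, C_j): distance between the two closest points of different classes.
    separation = min(math.dist(a, b)
                     for ci, cj in combinations(groups, 2) for a in ci for b in cj)
    # Diameter: distance between the two farthest points of the same class.
    diameter = max(math.dist(a, b)
                   for pts in groups for a, b in combinations(pts, 2))
    return separation / diameter
```

The sketch needs at least two classes and at least one class with two or more distinct points.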
SIL index:
Figure BDA0002476283740000096
in the above formula, s_j = (b_j − a_j) / max(a_j, b_j) is the silhouette width of data point x_j; the average distance a_j from x_j to the other data points of its own class and the minimum distance b_j from x_j to the data points of other classes are calculated as follows:
Figure BDA0002476283740000097
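A sketch of the SIL index built on the silhouette width s_j = (b_j − a_j) / max(a_j, b_j) follows. Two points are assumptions because the formula images are not reproduced: the index is taken as the mean of s_j over all data points, and b_j is computed in the standard silhouette way, as the smallest average distance from x_j to the points of any other class. Singleton classes are given s_j = 0 by convention; `silhouette_index` is a hypothetical name.

```python
import math


def silhouette_index(samples, labels):
    """Mean silhouette width over all samples (standard definition, assumed here)."""
    clusters = {}
    for i, h in enumerate(labels):
        clusters.setdefault(h, []).append(i)
    widths = []
    for j, x in enumerate(samples):
        own = [i for i in clusters[labels[j]] if i != j]
        if not own:
            widths.append(0.0)  # convention for a class with a single member
            continue
        # a_j: average distance to the other points of the same class.
        a = sum(math.dist(x, samples[i]) for i in own) / len(own)
        # b_j: smallest average distance to the points of another class.
        b = min(sum(math.dist(x, samples[i]) for i in idx) / len(idx)
                for h, idx in clusters.items() if h != labels[j])
        widths.append((b - a) / max(a, b))
    return sum(widths) / len(widths)
```

Values near 1 indicate compact, well-separated classes; negative values indicate samples that sit closer to another class than to their own.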
S2-4, calculating the skill factor τ of each individual, which records the optimization task on which individual i performs best among all tasks. If individual i achieves its best ranking on the j-th task, then τ_i = j. To calculate the skill factor, all individuals in the population are sorted by fitness value under a given task j; the position of individual i in that ordering is its factorial rank r_i^j on task j.
S2-5, generating offspring: randomly selecting two individuals a and b from the population as parents and generating a random number rand between 0 and 1; if rand is less than the algorithm parameter rmp (random mating probability), or the skill factors τ of the two individuals are equal, performing simulated binary crossover; otherwise, mutating each of the two individuals separately; repeating the crossover or mutation until the number of offspring equals the population size, and then proceeding to step S2-6;
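The mating rule of step S2-5 can be sketched with the crossover and mutation operators passed in as functions. The name `generate_offspring` and its signature are hypothetical; the text does not specify the mutation operator, so it is left to the caller.

```python
import random


def generate_offspring(population, skill, rmp, crossover, mutate, rng=random):
    """Produce as many offspring as there are individuals in the population.

    crossover(a, b) must return two children; mutate(a) must return one child.
    """
    offspring = []
    while len(offspring) < len(population):
        a, b = rng.sample(range(len(population)), 2)  # two distinct parents
        if rng.random() < rmp or skill[a] == skill[b]:
            # Same task, or cross-task mating allowed by rmp: recombine.
            children = crossover(population[a], population[b])
        else:
            # Different tasks and mating not allowed: mutate each parent separately.
            children = (mutate(population[a]), mutate(population[b]))
        offspring.extend(children)
    return offspring[:len(population)]
```

A larger rmp allows more crossover between parents specialized on different tasks, which is how information moves between the clustering-index tasks.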
simulated binary crossover operates as follows:
given two parents x_a = [x_a(1), ..., x_a(d)] and x_b = [x_b(1), ..., x_b(d)], where d is the dimensionality of the data, the spread factor c(j) is first calculated:
Figure BDA0002476283740000101
where β is a system parameter greater than 0, and r is a random number between 0 and 1 drawn for each dimension;
the resulting offspring are:
x_e(j) = [(1 + c(j)) x_a(j) + (1 − c(j)) x_b(j)] / 2
x_f(j) = [(1 + c(j)) x_b(j) + (1 − c(j)) x_a(j)] / 2.
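A sketch of simulated binary crossover follows. The two offspring formulas are taken from the text; the spread factor formula is an image that is not reproduced, so the usual SBX expression is assumed: c = (2r)^(1/(β+1)) when r ≤ 0.5, and c = (1 / (2(1 − r)))^(1/(β+1)) otherwise. The name `sbx_crossover` and the default `beta=2.0` are hypothetical.

```python
import random


def sbx_crossover(xa, xb, beta=2.0, rng=random):
    """Simulated binary crossover on two equal-length real-valued parents."""
    xe, xf = [], []
    for a, b in zip(xa, xb):
        r = rng.random()  # one random number per dimension, in [0, 1)
        # ASSUMED standard SBX spread factor (the original formula is an image).
        if r <= 0.5:
            c = (2 * r) ** (1 / (beta + 1))
        else:
            c = (1 / (2 * (1 - r))) ** (1 / (beta + 1))
        xe.append(((1 + c) * a + (1 - c) * b) / 2)
        xf.append(((1 + c) * b + (1 - c) * a) / 2)
    return xe, xf
```

In every dimension the two children sum to the same value as the two parents, so the children are placed symmetrically about the parents' midpoint.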
S2-6, calculating the fitness value of each offspring under each clustering-index optimization task;
S2-7, merging the parents and the offspring into a new population, and recalculating and updating the skill factor τ and the scalar fitness φ of every individual according to the fitness values after clustering; if the factorial ranks of the i-th individual on all K tasks are respectively
Figure BDA0002476283740000102
Then the scalar fitness value for that individual is
Figure BDA0002476283740000103
I.e., the individual scalar fitness value is determined by its factorial rank in the best performing task;
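The ranking quantities of steps S2-4 and S2-7 can be sketched as below. The scalar fitness formula is an image that is not reproduced; the sketch assumes the usual multifactorial form φ_i = 1 / min_j r_i^j, which agrees with the statement that φ is determined by the rank on the best-performing task and that larger φ is better. All three indexes are maximized, so rank 1 goes to the largest fitness value. Function names are hypothetical, and task indices here are 0-based.

```python
def factorial_ranks(fitness):
    """fitness[i][j] is the value of individual i on task j (larger is better).

    Returns ranks[i][j], where 1 is the best individual on task j.
    """
    n, k = len(fitness), len(fitness[0])
    ranks = [[0] * k for _ in range(n)]
    for j in range(k):
        order = sorted(range(n), key=lambda i: fitness[i][j], reverse=True)
        for position, i in enumerate(order, start=1):
            ranks[i][j] = position
    return ranks


def skill_and_scalar_fitness(ranks):
    """Skill factor = task with the best rank; scalar fitness = 1 / best rank (assumed)."""
    skill = [min(range(len(r)), key=lambda j: r[j]) for r in ranks]
    phi = [1.0 / min(r) for r in ranks]
    return skill, phi


# Three individuals evaluated on two tasks (made-up fitness values).
ranks = factorial_ranks([[10, 0.1], [5, 0.9], [1, 0.5]])
skill, phi = skill_and_scalar_fitness(ranks)
```

Ranking within each task makes indexes with very different numeric scales, such as CH and SIL, comparable inside a single population.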
S2-8, sorting the individuals in the population by scalar fitness φ and selecting them in order from best to worst for the next generation, so that individuals with larger φ are selected first; setting k = k + 1;
S2-9, if k is less than n, returning to step S2-3; otherwise, stopping the iteration;
S2-10, finding the individuals whose skill factor τ equals 1 and recording them as the optimal individuals of the current tasks.
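The selection in step S2-8 amounts to keeping the individuals with the largest scalar fitness from the merged population. A minimal sketch, with the hypothetical name `select_next_generation`:

```python
def select_next_generation(population, phi, size):
    """Keep the `size` individuals with the largest scalar fitness phi."""
    # sorted() is stable, so individuals with equal phi keep their original order.
    order = sorted(range(len(population)), key=lambda i: phi[i], reverse=True)
    return [population[i] for i in order[:size]]


survivors = select_next_generation(["a", "b", "c", "d"], [0.5, 1.0, 0.25, 1.0], 2)
```

Because parents and offspring are merged before this step, the best individual found so far for each task survives into the next generation.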
S3, after step S2 is finished, three clustering results are obtained, and the optimal result is selected from them using the expert knowledge of doctors.
Through multitask clustering, this embodiment treats the optimization of different indexes as different tasks, finds the optimal individual under each task, and uses expert knowledge to find the clustering index best suited to the medical images. The clustering indexes can be chosen to suit the actual situation to achieve a better result. The method exploits the transfer process of multitask learning: communication among different learning tasks helps to escape local optima and approach the global optimum, and optimization is more efficient than running the different optimization objectives separately several times in succession.
The above-mentioned embodiments are merely preferred embodiments of the present invention, and the scope of the present invention is not limited thereto, so that variations based on the shape and principle of the present invention should be covered within the scope of the present invention.

Claims (9)

1. A medical image clustering method based on a multitask evolutionary algorithm, characterized by comprising the following steps:
S1, extracting ROI feature description data of the medical image;
S2, reading the extracted ROI feature description data and, by applying the NMP clustering rule and optimizing several internal clustering indexes, obtaining several clustering results under a multitask framework;
S3, selecting the optimal result from these results using the expert knowledge of a doctor.
2. The medical image clustering method based on the multitask evolutionary algorithm according to claim 1, wherein the ROI feature description data of the medical image extracted in step S1 comprise relative gray s1, relative area s2, relative centroid coordinate s3, circularity s4, angle s5 and symmetry s6.
3. The medical image clustering method based on the multitask evolutionary algorithm according to claim 2, wherein the specific process of extracting the ROI feature description data of the medical image in the step S1 is as follows:
S1-1, reading in a medical image;
S1-2, scanning the medical image to obtain the number of pixels, the length, the width and the maximum gray level of the image;
S1-3, detecting the symmetry s6 of the medical image;
S1-4, extracting the ROI (region of interest) of the medical image according to a gray-level range;
S1-5, extracting the mean gray level, the number of pixels, the longest axis, the shortest axis, the centroid coordinates and the angle from the ROI obtained in step S1-4;
S1-6, calculating the feature description data of the ROI.
4. The medical image clustering method based on the multitask evolutionary algorithm according to claim 3, wherein the symmetry s6 of the medical image is detected in step S1-3 as follows:
the medical image is folded onto itself and the difference is taken; the result is binarized with a gray threshold; if the number of remaining pixels is less than a set value, the medical image is judged to be symmetric, otherwise it is judged to be asymmetric;
the specific process of calculating the feature description data of the ROI in step S1-6 is as follows:
relative gray s1:
Figure FDA0002476283730000011
where ROI.gray is the average gray level of the ROI and IMAGE.gray is the average gray level of the whole image;
relative area s2: s2 = ROI.area / IMAGE.area, where ROI.area is the number of pixels in the ROI and IMAGE.area is the number of pixels in the entire image;
relative centroid coordinate s3: s3 = (ROI.x / IMAGE.length, ROI.y / IMAGE.height), where ROI.x is the abscissa of the ROI centroid, IMAGE.length is the length of the original image, ROI.y is the ordinate of the ROI centroid, and IMAGE.height is the height of the original image;
circularity s4: s4 = 4 × π × ROI.area / ROI.perimeter^2, where ROI.area is the number of pixels in the ROI and ROI.perimeter is the number of pixels on the boundary of the ROI;
angle s5: s5 = (Orientation + 90) / 180, where Orientation is the angle between the long axis of the ROI and the X axis.
5. The medical image clustering method based on the multitask evolutionary algorithm according to claim 1, wherein the specific process of step S2 is as follows:
S2-1, reading the ROI feature description data extracted from the medical images;
S2-2, initializing the population, setting the maximum number of iterations n, and setting the iteration counter k = 0;
S2-3, performing clustering according to the NMP clustering rule and evaluating the clustering indexes;
S2-4, calculating the skill factor τ of each individual;
S2-5, generating offspring: randomly selecting two individuals a and b from the population as parents and generating a random number rand with 0 < rand < 1; if rand is less than the algorithm parameter rmp, or the skill factors τ of the two individuals are equal, performing simulated binary crossover; otherwise, performing mutation on each of the two individuals separately; repeating the crossover or mutation step until the number of offspring equals the number of individuals in the population, and then proceeding to step S2-6;
S2-6, calculating the fitness value of each generated offspring under each clustering-index optimization task;
S2-7, merging the parents and the offspring into a new population, and recalculating and updating the skill factor τ and the scalar fitness value φ of every individual according to the fitness values obtained after clustering;
S2-8, sorting the individuals in the population by scalar fitness value φ and selecting them in order from best to worst for the next-generation population, so that individuals with larger φ are selected first; setting k = k + 1;
S2-9, if k is less than n, returning to step S2-3; otherwise, stopping the iteration;
and S2-10, finding the individuals whose skill factor τ equals 1 and recording them as the optimal individuals for the current task.
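The bookkeeping in steps S2-4, S2-5, S2-7 and S2-8 can be sketched as follows. The claim does not spell out how τ and φ are computed, so this sketch assumes the usual multifactorial-evolution convention: individuals are ranked per task, φ is the reciprocal of an individual's best rank, and τ is the (1-based) task on which that best rank occurs. All function names are hypothetical, and every fitness value is assumed to be "larger is better".

```python
import random


def skill_and_scalar_fitness(fitness):
    """fitness[i][t] is the value of individual i on task t (larger is better).

    Returns (tau, phi): tau[i] is the 1-based task on which individual i
    ranks best, phi[i] is 1 / (best rank of individual i over all tasks).
    """
    n, n_tasks = len(fitness), len(fitness[0])
    ranks = [[0] * n_tasks for _ in range(n)]
    for t in range(n_tasks):
        # Rank 1 goes to the best individual on task t.
        order = sorted(range(n), key=lambda i: fitness[i][t], reverse=True)
        for rank, i in enumerate(order, start=1):
            ranks[i][t] = rank
    tau = [min(range(n_tasks), key=lambda t: ranks[i][t]) + 1 for i in range(n)]
    phi = [1.0 / min(ranks[i]) for i in range(n)]
    return tau, phi


def should_crossover(tau_a, tau_b, rmp, rng=random):
    """Step S2-5: crossover if rand < rmp or both parents share a skill factor."""
    return rng.random() < rmp or tau_a == tau_b


def select_next_generation(population, phi, size):
    """Step S2-8: keep the `size` individuals with the largest phi."""
    order = sorted(range(len(population)), key=lambda i: phi[i], reverse=True)
    return [population[i] for i in order[:size]]


# Three individuals evaluated on two tasks (illustrative numbers only).
fitness = [[0.9, 0.1], [0.5, 0.8], [0.2, 0.3]]
tau, phi = skill_and_scalar_fitness(fitness)
survivors = select_next_generation(["a", "b", "c"], phi, size=2)
```

In this example the first individual is best on task 1 and the second is best on task 2, so both receive φ = 1 and survive the selection.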
6. The medical image clustering method based on the multitask evolutionary algorithm according to claim 5, wherein in step S2-2 the population is initialized with the following encoding:
setting the maximum number of cluster categories K_max, each individual of the population being encoded as a vector of dimension K_max + K_max × d:
Figure FDA0002476283730000031
where d is the number of dimensions of the data, m_ij is the coordinate vector of a cluster centroid, and T_ij (j = 1, ..., K_max) is the activation threshold of the corresponding cluster centroid;
the activation of a cluster centroid is defined as follows:
if T_ij is greater than 0.5, the corresponding cluster centroid m_ij is activated; otherwise it is not activated; if T_ij is greater than 1 or negative, it is reset to 1 or 0, respectively; if the number of activated centroids is less than the set minimum number of categories, several activation thresholds are randomly selected and activated so as to meet the minimum number of categories.
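The decoding rule above might be sketched as follows. The layout of the vector (K_max thresholds first, followed by K_max centroids of d coordinates each) is an assumption, since the encoding figure is not reproduced here; the function name, the `k_min` parameter name and the value written into a force-activated threshold are likewise hypothetical.

```python
import random


def decode_individual(vec, k_max, d, k_min=2, rng=random):
    """Split an individual into repaired thresholds and active centroids.

    Assumed layout: vec[:k_max] holds the activation thresholds and the
    remaining k_max * d entries hold the centroids, one after another.
    """
    # Reset thresholds above 1 to 1 and negative thresholds to 0.
    thresholds = [min(1.0, max(0.0, t)) for t in vec[:k_max]]
    centroids = [vec[k_max + j * d: k_max + (j + 1) * d] for j in range(k_max)]
    active = [j for j, t in enumerate(thresholds) if t > 0.5]
    if len(active) < k_min:
        # Too few active centroids: activate randomly chosen inactive ones.
        inactive = [j for j in range(k_max) if j not in active]
        for j in rng.sample(inactive, k_min - len(active)):
            # Assumption: draw the new threshold from (0.5, 1].
            thresholds[j] = 0.5 + 0.5 * (1.0 - rng.random())
            active.append(j)
        active.sort()
    return thresholds, [centroids[j] for j in active]


# K_max = 3 centroids in d = 2 dimensions; only the first one starts active.
vec = [1.4, -0.2, 0.3, 0.0, 0.0, 1.0, 1.0, 2.0, 2.0]
thresholds, centroids = decode_individual(vec, k_max=3, d=2, k_min=2,
                                          rng=random.Random(0))
```

Here the out-of-range threshold 1.4 is reset to 1, and one of the two inactive centroids is activated at random to reach the minimum of two categories.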
7. The medical image clustering method based on the multitask evolutionary algorithm according to claim 5, wherein in step S2-3 the NMP clustering rule is as follows:
given a data set of N samples X = (x_1, ..., x_N), let the K clusters be C = {C_1, ..., C_K}, let m_h be the centroid of cluster C_h, and let D denote distance; in the NMP rule, the distance between a sample point x_i (i = 1, ..., N) and a cluster C_h (h = 1, ..., K) is defined as follows:
D(x_i, C_h) = min{ D(x_i, x_j), D(x_i, m_h) | x_j ∈ C_h }
that is, the distance from a sample to a cluster is the distance from the sample to the nearest point of that cluster, whether a member sample or the centroid;
each sample is first assigned to its nearest cluster, which is called the pending cluster of the sample, and all samples assigned to the same cluster constitute the candidate sample set of that cluster; then, for each cluster, the nearest sample is selected from its candidate sample set, and the nearest samples of all clusters together form the nearest sample set; finally, the sample in the nearest sample set whose distance to its pending cluster is the smallest is found and assigned to that pending cluster; the above steps are repeated until all samples are assigned.
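The assignment loop described above can be sketched as follows, assuming Euclidean distance for D and centroids that stay fixed during the assignment (for example, the activated centroids decoded from an individual). The function names are hypothetical, and the sketch recomputes all distances in every round for clarity rather than speed.

```python
import math


def euclid(p, q):
    """Euclidean distance between two coordinate sequences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def nmp_distance(x, members, centroid):
    """D(x, C_h): distance to the nearest member of the cluster or its centroid."""
    return min([euclid(x, centroid)] + [euclid(x, p) for p in members])


def nmp_assign(points, centroids):
    """Assign every point to a cluster, one point per round."""
    k = len(centroids)
    clusters = [[] for _ in range(k)]      # point indices per cluster
    labels = [None] * len(points)
    unassigned = set(range(len(points)))
    while unassigned:
        nearest = {}                       # cluster -> (distance, point index)
        for i in sorted(unassigned):
            dists = [nmp_distance(points[i],
                                  [points[j] for j in clusters[h]],
                                  centroids[h]) for h in range(k)]
            # Pending cluster of sample i: the cluster it is nearest to.
            h = min(range(k), key=dists.__getitem__)
            # Keep only the nearest candidate of each cluster.
            if h not in nearest or dists[h] < nearest[h][0]:
                nearest[h] = (dists[h], i)
        # Assign the single closest sample of the nearest sample set.
        h, (_, i) = min(nearest.items(), key=lambda kv: kv[1][0])
        clusters[h].append(i)
        labels[i] = h
        unassigned.remove(i)
    return labels


# A chain of points growing out from the first centroid.
points = [(1, 0), (2, 0), (3, 0), (4, 0), (5, 0), (6, 0), (9.5, 0)]
labels = nmp_assign(points, centroids=[(0, 0), (10, 0)])
```

In this example the point (6, 0) is closer to the second centroid than to the first, yet it joins the first cluster because the chain of already assigned neighbours reaches it first, which is the behaviour that lets this rule follow elongated clusters.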
8. The medical image clustering method based on the multitask evolutionary algorithm according to claim 5, wherein in step S2-3 the clustering indexes comprise the CH index, the Dunn index and the SIL index, each corresponding to one optimization task; the indexes are as follows:
CH index:
Figure FDA0002476283730000041
in the above formula,
Figure FDA0002476283730000042
represents the trace of the inter-class dispersion matrix, and m represents the mean vector of the whole data set;
Dunn index:
Figure FDA0002476283730000043
in the above formula, D(C_i, C_j) represents the distance between two different classes, taken as the distance between their two closest data points:
Figure FDA0002476283730000044
the diameter of class C_i is the distance between the two farthest points within that class:
Figure FDA0002476283730000045
SIL index:
Figure FDA0002476283730000046
in the above formula, s_j = (b_j − a_j)/max(a_j, b_j) represents the silhouette width of data point x_j; the average distance a_j from data point x_j to the other data points of its own class and the minimum distance b_j to the data points of other classes are calculated as follows:
Figure FDA0002476283730000047
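Because the formula images are not reproduced in this text, the following sketch uses the commonly published definitions of the three indexes, which may differ in detail from the claimed formulas: CH as the ratio of between-class to within-class dispersion traces scaled by (N − K)/(K − 1), Dunn as the smallest between-class distance divided by the largest class diameter, and SIL as the mean silhouette width with b_j taken as the smallest average distance to another class. Euclidean distance and all function names are assumptions; the input is a list of at least two clusters, each a list of points, with at least one cluster holding two or more distinct points.

```python
import math
from itertools import combinations


def euclid(p, q):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def mean_point(pts):
    return tuple(sum(col) / len(pts) for col in zip(*pts))


def ch_index(clusters):
    points = [p for c in clusters for p in c]
    n, k = len(points), len(clusters)
    m = mean_point(points)                      # mean of the whole data set
    between = sum(len(c) * euclid(mean_point(c), m) ** 2 for c in clusters)
    within = sum(euclid(p, mean_point(c)) ** 2 for c in clusters for p in c)
    return (between / (k - 1)) / (within / (n - k))


def dunn_index(clusters):
    # Closest pair of points taken from two different classes.
    min_between = min(euclid(p, q)
                      for a, b in combinations(clusters, 2)
                      for p in a for q in b)
    # Largest diameter: farthest pair of points inside one class.
    max_diameter = max(euclid(p, q)
                       for c in clusters for p, q in combinations(c, 2))
    return min_between / max_diameter


def silhouette_index(clusters):
    widths = []
    for h, own in enumerate(clusters):
        for p in own:
            if len(own) == 1:
                widths.append(0.0)              # convention for singletons
                continue
            a = sum(euclid(p, q) for q in own) / (len(own) - 1)
            b = min(sum(euclid(p, q) for q in other) / len(other)
                    for g, other in enumerate(clusters) if g != h)
            widths.append((b - a) / max(a, b))
    return sum(widths) / len(widths)


# Two compact, well separated clusters (illustrative data).
clusters = [[(0, 0), (0, 1)], [(4, 0), (4, 1)]]
ch = ch_index(clusters)
dunn = dunn_index(clusters)
sil = silhouette_index(clusters)
```

All three indexes grow as clusters become more compact and better separated, which is consistent with treating each of them as a separate maximization task.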
9. The medical image clustering method based on the multitask evolutionary algorithm according to claim 5, wherein in step S2-5 the simulated binary crossover operates as follows:
given two parents x_a = [x_a(1), ..., x_a(d)] and x_b = [x_b(1), ..., x_b(d)], where d is the dimension of the data, the distribution factor c(j) is first calculated:
Figure FDA0002476283730000048
where β is a system parameter greater than 0 and r is a random number greater than 0 and less than 1 drawn for each dimension;
the resulting offspring are:
x_e(j) = [(1 + c(j)) x_a(j) + (1 − c(j)) x_b(j)]/2
x_f(j) = [(1 + c(j)) x_b(j) + (1 − c(j)) x_a(j)]/2.
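A minimal sketch of this crossover follows. The offspring formulas are taken from the claim; the expression for c(j) is not reproduced in this text, so the sketch assumes the spread factor commonly used for simulated binary crossover, c = (2r)^(1/(β+1)) for r ≤ 0.5 and c = (1/(2(1 − r)))^(1/(β+1)) otherwise. The function name and the default value of β are hypothetical.

```python
import random


def sbx_crossover(xa, xb, beta=2.0, rng=random):
    """Return two offspring (xe, xf) of parents xa and xb."""
    xe, xf = [], []
    for a, b in zip(xa, xb):
        r = rng.random()                      # fresh random number per dimension
        # Assumed spread factor (standard simulated binary crossover form).
        if r <= 0.5:
            c = (2.0 * r) ** (1.0 / (beta + 1.0))
        else:
            c = (1.0 / (2.0 * (1.0 - r))) ** (1.0 / (beta + 1.0))
        # Offspring formulas as given in the claim.
        xe.append(((1.0 + c) * a + (1.0 - c) * b) / 2.0)
        xf.append(((1.0 + c) * b + (1.0 - c) * a) / 2.0)
    return xe, xf


xa = [0.0, 1.0, 2.0]
xb = [1.0, 3.0, -2.0]
xe, xf = sbx_crossover(xa, xb, beta=2.0, rng=random.Random(0))
```

Whatever value c(j) takes, the two offspring are symmetric about the parents' midpoint in every dimension, so x_e(j) + x_f(j) = x_a(j) + x_b(j).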
CN202010364563.4A 2020-04-30 2020-04-30 Medical image clustering method based on multitasking evolutionary algorithm Active CN111563549B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010364563.4A CN111563549B (en) 2020-04-30 2020-04-30 Medical image clustering method based on multitasking evolutionary algorithm

Publications (2)

Publication Number Publication Date
CN111563549A true CN111563549A (en) 2020-08-21
CN111563549B CN111563549B (en) 2023-07-28

Family

ID=72070695

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010364563.4A Active CN111563549B (en) 2020-04-30 2020-04-30 Medical image clustering method based on multitasking evolutionary algorithm

Country Status (1)

Country Link
CN (1) CN111563549B (en)

Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101499136A (en) * 2009-03-05 2009-08-05 西安电子科技大学 Image over-segmenting optimization method based on multi-target evolution clustering and spatial information
CN102567963A (en) * 2011-11-10 2012-07-11 西安电子科技大学 Quantum multi-target clustering-based remote sensing image segmentation method
CN104156945A (en) * 2014-07-16 2014-11-19 西安电子科技大学 Method for segmenting gray scale image based on multi-objective particle swarm optimization algorithm
CN106886467A (en) * 2017-02-24 2017-06-23 电子科技大学 Method for optimizing is tested in multitask based on the comprehensive multi-target evolution of packet
EP3273387A1 (en) * 2016-07-19 2018-01-24 Siemens Healthcare GmbH Medical image segmentation with a multi-task neural network system
WO2018086433A1 (en) * 2016-11-08 2018-05-17 江苏大学 Medical image segmenting method
US20190122071A1 (en) * 2017-10-24 2019-04-25 International Business Machines Corporation Emotion classification based on expression variations associated with same or similar emotions
CN110136828A (en) * 2019-05-16 2019-08-16 杭州健培科技有限公司 A method of medical image multitask auxiliary diagnosis is realized based on deep learning
US20190272333A1 (en) * 2018-03-01 2019-09-05 King Fahd University Of Petroleum And Minerals Heuristic for the data clustering problem
CN110458859A (en) * 2019-07-01 2019-11-15 南开大学 A kind of segmenting system of the myelomatosis multiplex stove based on multisequencing MRI
CN110991518A (en) * 2019-11-28 2020-04-10 山东大学 Two-stage feature selection method and system based on evolution multitask

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
GENG-BIN CHEN ETAL: "Automatic clustering approach based on particle swarm optimization for data with arbitrary shaped clusters", 《HTTPS://IEEEXPLORE.IEEE.ORG/DOCUMENT/7885913》 *
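程美英 et al.: "Multitask co-evolutionary particle swarm optimization algorithm", Pattern Recognition and Artificial Intelligence (《模式识别与人工智能》) *
郭亮: "Research on medical image cluster analysis methods based on graph theory and differential evolution", China Master's and Doctoral Theses Full-text Database (Master's), Information Science and Technology (《中国优秀博硕士学位论文全文数据库(硕士)信息科技辑》) *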

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115222007A (en) * 2022-05-31 2022-10-21 复旦大学 Improved particle swarm parameter optimization method for glioma multitask integrated network
CN115204323A (en) * 2022-09-16 2022-10-18 华智生物技术有限公司 Seed multi-feature based clustering and synthesis method, system, device and medium
CN115204323B (en) * 2022-09-16 2022-12-02 华智生物技术有限公司 Seed multi-feature based clustering and synthesis method, system, device and medium
CN115346665A (en) * 2022-10-19 2022-11-15 南昌大学第二附属医院 Method, system and equipment for constructing retinopathy incidence risk prediction model
CN115346665B (en) * 2022-10-19 2023-03-10 南昌大学第二附属医院 Method, system and equipment for constructing retinopathy incidence risk prediction model

Also Published As

Publication number Publication date
CN111563549B (en) 2023-07-28

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant