CN111403027B - Rare disease picture searching method based on rare class mining - Google Patents

Rare disease picture searching method based on rare class mining

Info

Publication number
CN111403027B
Authority
CN
China
Prior art keywords
picture
rare
pictures
rare class
user
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010185084.6A
Other languages
Chinese (zh)
Other versions
CN111403027A (en)
Inventor
刘振广
杨家旭
钱鹏
杨文武
纪首领
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang Gongshang University
Original Assignee
Zhejiang Gongshang University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2020-03-17
Filing date
2020-03-17
Publication date
2023-06-27
Application filed by Zhejiang Gongshang University
Priority to CN202010185084.6A
Publication of CN111403027A
Application granted
Publication of CN111403027B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/20 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for computer-aided diagnosis, e.g. based on medical expert systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50 Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/58 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/583 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/70 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for mining of medical data, e.g. analysing previous cases of other patients

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Public Health (AREA)
  • Data Mining & Analysis (AREA)
  • Medical Informatics (AREA)
  • Databases & Information Systems (AREA)
  • Biomedical Technology (AREA)
  • Library & Information Science (AREA)
  • Theoretical Computer Science (AREA)
  • Pathology (AREA)
  • Epidemiology (AREA)
  • General Health & Medical Sciences (AREA)
  • Primary Health Care (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Measuring And Recording Apparatus For Diagnosis (AREA)

Abstract

The invention discloses a rare disease picture searching method based on rare class mining. The method works in two stages, rare class detection and rare class development, to search rare disease pictures accurately and efficiently, and specifically comprises the following steps: offline preprocessing of a big data set; online query for rare class detection; interactive updating, by the user, of which data is of interest; and development of the rare class using positive and negative samples. To address the speed of rare class detection, the invention processes the global parameter settings offline, reducing the time complexity of a single query enough for real-time feedback. It uses man-machine interaction to ensure that the rare class development result matches reality and to avoid wasting human resources on large numbers of irrelevant pictures. The method fills a gap in current methods for this problem, can help people derive more economic benefit from medical pictures, and has reference value in related research fields.

Description

Rare disease picture searching method based on rare class mining
Technical Field
The invention belongs to the technical field of data mining, and particularly relates to a rare disease picture searching method based on rare class mining.
Background
In recent years, with advances in medicine, more and more instrument examination results can be presented and stored as pictures. Medical pictures range from microscopic structures of bacteria and viruses to macroscopic photographs of a patient's affected parts, and are characterized by huge data volume and high potential value. In general, analyzing medical pictures with data mining techniques yields, from the pictures alone, the classes corresponding to the examination results of common diseases and of healthy people; these can then be used for teaching and scientific research, promoting the progress of medical technology.
Among the pictures of the different classes, the most valuable for research are often those showing the lesions of rare diseases. Take polycystic kidney disease as an example: its prevalence is about 1 to 2 per thousand, its symptoms are not obvious, and on examination it is easily confused with simple renal cysts, which delays treatment. One distinct feature of the disease is enlarged kidney volume, so medical pictures obtained by ultrasound or CT differ to some extent from those of ordinary renal cysts and belong to a different class; compared with the pictures of healthy people and of common diseases, they belong to a rare class. Rare disease pictures within a rare class resemble one another as manifestations of the same disease, but because the incidence of rare diseases is low, they are extremely few in number and their examination results are rarely seen in picture form. Conventional data analysis and data mining techniques, dominated by the main classes, therefore have difficulty distinguishing the pictures that correspond to rare diseases.
Rare class mining can be used to detect rare disease pictures: it detects pictures belonging to a rare class in a large medical picture data set and, through analysis, finds the other pictures in the data set that correspond to the same disease. Detecting the rare disease pictures that belong to a rare class in a large medical picture data set is rare class detection. However, the initial detection pass of existing rare class detection methods usually has quadratic time complexity; if the query result contains no rare disease picture of interest to the researcher, adjusting the initial input parameters and retrieving results again takes a long time, so real-time feedback is not possible. In practice, when the analysis of a serious rare disease is urgent, long query times can hold up medical research and delay the treatment of patients.
On the basis of rare class detection, finding in the big data set all the pictures corresponding to the rare disease that researchers may be interested in is rare class development. In general, rare class development takes the small number of rare disease pictures found by rare class detection as a core and, exploiting the high similarity between data in the same rare class, gradually searches out all other similar pictures. Existing methods consider only this high within-class similarity and ignore the actual situation; influenced by the few core pictures, the result may contain many pictures that are morphologically similar but do not correspond to the rare disease, wasting manpower and material resources.
In view of the above, there is an urgent need for a rare disease picture searching method based on rare class mining that supports man-machine interaction. Conventional rare class mining methods have the following problems: (1) they consider only the statistical properties of rare class data and ignore the practical meaning of the data; (2) a single detection of rare class pictures by the user is computationally expensive and cannot meet the requirement of real-time interaction. Therefore, designing and implementing an interactive rare class mining method for rare disease detection that takes practical research significance into account will bring great economic and practical value.
Disclosure of Invention
To solve the problems that queries are slow and the actual situation cannot be taken into account when mining rare classes of rare disease pictures, the invention provides a rare disease picture searching method based on rare class mining. The method speeds up queries in the conventional rare class detection (RCD) process, reducing the query time complexity of rare class detection by combining offline preprocessing with online query. Starting from the few rare disease pictures detected, the invention takes the actual situation into account through man-machine interaction and provides a new rare class development (RCE) method, which interactively updates the range of the queried rare disease pictures, makes the best use of human resources, and finds all the rare disease pictures in a large data set.
A rare disease picture searching method based on rare class mining comprises a rare class detection part and a rare class development part; the rare class detection process comprises the following steps:
A1. acquiring search feature parameters input by a user;
A2. calculating a rare class index of each picture in the picture library;
A3. searching for pictures whose rare class index lies between the lower and upper limits and feeding them back to the user; if the feedback contains at least one rare disease picture of interest to the user, stopping detection; otherwise the user fine-tunes the search feature parameters and execution returns to step A1;
the development process of the rare class is as follows:
B1. from the pictures fed back to the user by rare class detection, forming the rare disease pictures of interest to the user into a positive sample set and the remaining pictures into a negative sample set;
B2. determining the adjacent sample set of each picture in the positive sample set and taking the union of these adjacent sample sets as $\Phi$;
B3. for any picture in the set $\Phi$, calculating its positive sample distance $r^{+}$ from the positive sample set and its negative sample distance $r^{-}$ from the negative sample set;
B4. extracting the picture with the largest value of $r$ from the set $\Phi$ and presenting it to the user, where $r = r^{-} - r^{+}$; if the user is interested, adding the picture to the positive sample set, and otherwise adding it to the negative sample set;
B5. repeating steps B2 to B4 until $\Phi - \cap^{+} - \cap^{-}$ is empty, and taking the positive sample set at that point as the final output, where $\cap^{+}$ is the intersection of the positive sample set with $\Phi$ and $\cap^{-}$ is the intersection of the negative sample set with $\Phi$.
Further, the search feature parameter in step A1 is a triplet $\langle k, s_{up}, s_{low} \rangle$, where $s_{up}$ and $s_{low}$ are respectively the upper and lower limits of the rare class index of rare disease pictures set by the user, and $k$ is a given natural number representing the size of a picture's adjacent sample set, with $k \in [k_{min}, k_{max}]$, where $k_{min}$ and $k_{max}$ are respectively the lower and upper limits of the given interval.
Further, in the step A2, the rare class index of each picture is calculated by the following formula;
Figure BDA0002413898880000031
wherein:
Figure BDA0002413898880000032
is the rare class index of the $i$-th picture in the picture library, $i$ being a natural number greater than 0; the adjacent sample set of the $i$-th picture is determined by the KNN (K-Nearest Neighbor) algorithm and contains $k$ pictures; $d_1$ to $d_k$ are the Euclidean distances between the $k$ pictures in the set and the $i$-th picture in the picture library, sorted in ascending order; and avg{ } is the averaging function.
Further, step A2 discretizes the interval $[k_{min}, k_{max}]$, traverses each value of $k$ in the interval, and calculates and stores the rare class indexes of all pictures in the picture library for that value, so that when the search feature parameters are adjusted later the rare class indexes can be looked up directly without recalculation, avoiding the time and waste of repeatedly computing the K-nearest-neighbor relation.
Further, a grading statistical method is applied to the picture rare class indexes under different $k$ values: through statistical processing, the indexes are divided into intervals according to their size, so that a single online rare class detection by the user can be completed more quickly.
Further, in step B3, the positive sample distance $r^{+}$ is calculated by the following formula:
Figure BDA0002413898880000041
where: $x$ is the feature vector of any picture in the set $\Phi$; $P$ is the feature matrix composed of the feature vectors of all pictures in the positive sample set, each feature vector consisting of all pixel values of the picture; $\omega_1$ is a weight vector of $n$ weights that sum to 1, $n$ being the number of pictures in the positive sample set;
Figure BDA0002413898880000042
is the weight coefficient; $\omega_1$ and
Figure BDA0002413898880000043
are determined by minimizing the positive sample distance $r^{+}$; and $\|\cdot\|_2$ is the two-norm.
Further, in step B3, the negative sample distance $r^{-}$ is calculated by the following formula:
Figure BDA0002413898880000044
where: $x$ is the feature vector of any picture in the set $\Phi$; $G$ is the feature matrix composed of the feature vectors of all pictures in the negative sample set, each feature vector consisting of all pixel values of the picture; $\omega_2$ is a weight vector of $m$ weights that sum to 1, $m$ being the number of pictures in the negative sample set;
Figure BDA0002413898880000045
is the weight coefficient; $\omega_2$ and
Figure BDA0002413898880000046
are determined by minimizing the negative sample distance $r^{-}$; and $\|\cdot\|_2$ is the two-norm.
By combining rare class detection (RCD) and rare class development (RCE), the invention provides a rare disease picture searching method based on rare class mining that greatly improves the detection speed of rare disease pictures. Compared with existing rare class mining methods based on conventional techniques, it is more targeted, fills the gap in real-time human-computer interaction for rare disease picture searching, and has high practical and reference value. The specific beneficial technical effects and innovations are mainly the following:
1. The invention uses the global setting of parameters to process the pictures in the big data set offline, thereby reducing the time complexity of an online query from quadratic to logarithmic.
2. By maintaining positive and negative samples, the invention asks the user about each candidate rare disease picture as it is developed and updates the positive and negative samples accordingly, ensuring that the result matches reality.
3. The invention realizes man-machine interaction in rare class mining, so that rare class mining on large data sets can be applied to a wider range of fields.
Drawings
FIG. 1 is a diagram showing the arrangement of rare class indexes in the k-interval range in the rare class detection according to the present invention.
FIG. 2 is a schematic diagram of positive and negative sample sets in rare class development for user interest.
Detailed Description
In order to more particularly describe the present invention, the following detailed description of the technical solution of the present invention is given with reference to the accompanying drawings and specific examples.
The rare disease picture searching method based on rare class mining is realized by performing rare class detection (RCD) and rare class development (RCE) in sequence. It solves the problems that queries are slow and cannot target the user's interests, and realizes man-machine interaction.
Rare class detection (RCD) preprocesses the pictures of the large data set in an offline module, thereby shortening the time the user needs to obtain the required rare disease pictures by adjusting parameters online. It specifically comprises the following steps:
(1) Perform offline preprocessing on all medical picture data in the large data set, calculating and storing the rare class index of each picture.
The rare class index $s$ is an index for judging whether a data item belongs to a rare class. Specifically, the rare class indexes of rare disease pictures are distributed in a specific interval, so the rare disease pictures in a rare class can be detected effectively through a threshold on the rare class index. The rare class index is calculated by the following formula:
Figure BDA0002413898880000051
where: $i$ denotes the picture $x_i$ numbered $i$ in the big data set; $d$ denotes the Euclidean distance, in the feature space obtained by unfolding each picture pixel by pixel into a vector, between picture $x_i$ and the picture indicated by the subscript of $d$; $k$ denotes the size of the K-nearest-neighbor relation of picture $x_i$ in that feature space; the subscript of $d$ ranks the pictures in the K-nearest-neighbor relation of $x_i$ by their Euclidean distance to $x_i$ in ascending order, so 1 denotes the nearest picture, 2 the next nearest, and so on; and avg denotes the average of the distances $d$ in the brackets.
The offline module's preprocessing pre-calculates and stores the rare class indexes of all pictures in the big data set; this is the global setting of parameters. The process relies on two facts: once the parameter $k$ in the rare class index is given, the rare class indexes of all picture data form a finite set, and $k$ is an integer with only a finite number of possible values. As shown in FIG. 1, the process calculates the rare class index $s$ of every picture in the big data set for every possible value of $k$ (upper limit $k_{max}$, lower limit $k_{min}$).
The K-nearest-neighbor relation and the grading curve are the keys to carrying out the global setting of parameters and the statistical planning efficiently. The K-nearest-neighbor relation is obtained by computing the Euclidean distances between pictures in the feature space after each picture in the big data set is unfolded into a vector, and sorting these distances; for each selected picture $x_i$, the $k$ pictures nearest to it in Euclidean distance are the K-nearest-neighbor pictures of $x_i$. Once the K-nearest-neighbor computation has produced the $k_{max}$ nearest pictures of every picture, the K-nearest-neighbor pictures of each picture for every possible value of $k$ can be read off directly when the rare class indexes are calculated in the global setting of parameters, avoiding the waste of repeatedly computing the K-nearest-neighbor relation.
The grading curve is a grading statistical method for the rare class indexes s of all picture data under different k values in the global setting of parameters, and the interval division is carried out according to the sizes of the rare class indexes through statistical processing, so that the single online rare class detection process of a user can be completed more quickly.
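The offline preprocessing of step (1) can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: pictures are assumed to be already flattened into feature vectors, the neighbor search is brute force, and the rare class index formula is passed in as a caller-supplied function `index_fn`, because the formula itself appears only as an image in this text. The names `sorted_neighbor_distances`, `index_table` and `index_fn` are hypothetical.

```python
import math


def sorted_neighbor_distances(features, k_max):
    """For each picture, return the Euclidean distances to its k_max nearest
    other pictures, sorted in ascending order (brute force, O(n^2) distances)."""
    rows = []
    for i, x in enumerate(features):
        # Distances from picture i to every other picture, nearest first.
        dists = sorted(math.dist(x, y) for j, y in enumerate(features) if j != i)
        rows.append(dists[:k_max])
    return rows


def index_table(neighbor_dist, k_min, k_max, index_fn):
    """Precompute one index per picture for every k in [k_min, k_max].

    index_fn is a hypothetical stand-in for the rare class index formula:
    it receives the k smallest distances d_1..d_k of one picture.
    """
    return {
        k: [index_fn(row[:k]) for row in neighbor_dist]  # reuse the same sorted row
        for k in range(k_min, k_max + 1)
    }
```

Because each row of distances is already sorted, the neighbors for any smaller `k` are simply the first `k` entries, so the expensive neighbor search runs once for `k_max` rather than once per value of `k`.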
(2) Receive the feature parameters input online by the user.
The feature parameters input online by the user for rare class detection form the feature triplet $\langle k, s_{low}, s_{up} \rangle$, where $k$ is the parameter used to calculate the rare class index and, in practice, corresponds to the user's estimate of the number of rare disease pictures that may exist in the large data set; $s_{low}$ and $s_{up}$ are the lower and upper limits of the rare class index $s$, used to screen, under the $k$ value input by the user, the pictures in the large data set whose rare class index lies within the threshold range. These pictures are regarded as belonging to the rare class and form the detection result.
(3) Search online, within the preprocessing results, for all pictures that fall in the range given by the user's input parameters.
The online search of rare class detection uses the $k$ in the user's input parameters to locate the corresponding rare class index sequence produced by offline processing, and uses binary search to locate the interval bounded by $s_{low}$ and $s_{up}$; the pictures whose rare class index satisfies
Figure BDA0002413898880000061
are output as pictures belonging to the rare class.
(4) Feed the result back to the user, who judges whether it contains a picture corresponding to the rare disease of interest; if not, the user adjusts the input parameters and steps (2), (3) and (4) are repeated; if so, rare class detection ends.
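The online lookup of step (3) can be illustrated with Python's standard `bisect` module. The sketch assumes that, for the chosen $k$, the offline stage has produced one rare class index per picture; `build_sorted_index` and `query_range` are hypothetical helper names, not part of the patent text.

```python
import bisect


def build_sorted_index(scores):
    """Offline: sort picture ids by rare class index (done once per k)."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    return [scores[i] for i in order], order


def query_range(sorted_scores, order, s_low, s_up):
    """Online: ids of pictures whose index lies in [s_low, s_up].

    Two binary searches locate the ends of the interval in the sorted list.
    """
    lo = bisect.bisect_left(sorted_scores, s_low)   # first score >= s_low
    hi = bisect.bisect_right(sorted_scores, s_up)   # one past last score <= s_up
    return order[lo:hi]
```

Since sorting happens offline, each online query costs two binary searches plus the size of the returned slice, which is what allows the user to adjust $s_{low}$ and $s_{up}$ and get feedback quickly.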
Rare class development (RCE) takes the pictures in the rare class detection result that correspond to the rare disease as centers and interactively completes the expansion of the rare class. It specifically comprises the following steps:
(1) Take the rare disease pictures obtained in rare class detection (RCD) as positive samples P and the non-rare-disease pictures as negative samples G, and find the picture set adjacent to the positive samples.
The picture set adjacent to the positive samples is obtained by computing, for each positive sample picture, its K-nearest-neighbor relation with a preset value $k_0$, and taking the union of all the resulting K-nearest-neighbor picture sets.
(2) As shown in FIG. 2, for each picture $x_i$ in the picture set that belongs to neither the positive nor the negative samples, calculate its positive sample distance to the positive samples
Figure BDA0002413898880000071
and its negative sample distance to the negative samples
Figure BDA0002413898880000072
The positive sample distance
Figure BDA0002413898880000073
is calculated by the following formula, an L2-norm regularized distance formula:
Figure BDA0002413898880000074
where: $P$ represents all pictures in the positive samples, i.e. each picture is unfolded into a vector and the vectors are placed side by side to form a matrix; $x_i$ represents the $i$-th picture that is neither a positive nor a negative sample in the picture set adjacent to the positive samples, its value being the vector obtained by unfolding the picture in the feature space; $\omega_1$ and
Figure BDA0002413898880000075
are the weights, where $\omega_1$ is a vector and
Figure BDA0002413898880000076
is a scalar, subject to the constraint
Figure BDA0002413898880000077
$\omega_1$ and
Figure BDA0002413898880000078
take the values that minimize the calculated positive sample distance
Figure BDA0002413898880000079
Similarly, the negative sample distance
Figure BDA00024138988800000710
is calculated by the following formula:
Figure BDA00024138988800000711
where: $G$ represents all pictures in the negative samples; $x_i$ is the same picture as in the positive sample distance calculation; $\omega_2$ and
Figure BDA00024138988800000712
are the weights that, under the corresponding constraint
Figure BDA00024138988800000713
minimize the negative sample distance
Figure BDA00024138988800000714
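Since the distance formulas appear only as images in this text, the following sketch should be read as one plausible instance of an L2-regularized distance with sum-to-one weights, not as the patent's exact formula. It assumes the distance of a picture vector $x$ to a sample set with vectors $a_1, \dots, a_n$ is the minimum of $\|x - \sum_j \omega_j a_j\|_2^2 + \lambda \|\omega\|_2^2$ over weights with $\sum_j \omega_j = 1$, and it treats $\lambda$ (`lam`) as a fixed hyperparameter; both choices, and the names `solve` and `reconstruction_distance`, are assumptions made for illustration. Under that assumption the minimizer has the closed form $\omega \propto C^{-1}\mathbf{1}$, where $C_{jl} = (x - a_j)^\top (x - a_l) + \lambda\,\delta_{jl}$.

```python
def solve(a, b):
    """Solve the linear system a·w = b by Gauss-Jordan elimination
    with partial pivoting (a is a small dense square matrix)."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [u - f * v for u, v in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def reconstruction_distance(x, samples, lam=1e-3):
    """Assumed distance: min over sum-to-one weights w of
    ||x - sum_j w_j * samples[j]||^2 + lam * ||w||^2. Returns (distance, w)."""
    z = [[xi - ai for xi, ai in zip(x, a)] for a in samples]  # row j is x - a_j
    n = len(z)
    # Regularized Gram matrix of the difference vectors.
    c = [[sum(p * q for p, q in zip(z[i], z[j])) + (lam if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    w = solve(c, [1.0] * n)
    total = sum(w)
    w = [v / total for v in w]  # rescale so the weights sum to 1
    r = sum(w[i] * c[i][j] * w[j] for i in range(n) for j in range(n))
    return r, w
```

With such a function, the ranking value of a candidate picture would be its distance to the negative samples minus its distance to the positive samples, so a picture that is well reconstructed by positive samples and poorly by negative ones ranks highest.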
(3) Extract from the set the picture with the largest distance
Figure BDA00024138988800000715
and present it to the user for judgment; if the user is interested, it is taken as a positive sample, and if not, as a negative sample.
(4) Update the pictures in the positive and negative sample sets, find the new picture set adjacent to the positive samples, and repeat steps (2), (3) and (4) until the set contains no picture that is neither a positive nor a negative sample.
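The interactive loop of steps (1) to (4) can be sketched as below. The function and parameter names are hypothetical: `neighbors[i]` is assumed to list picture ids nearest first, `distance(i, ids)` stands in for the positive or negative sample distance of picture `i` to a set of pictures, and `ask_user(i)` represents the user's judgment of one picture.

```python
def explore(positive, negative, neighbors, k0, distance, ask_user):
    """Grow the positive sample set interactively and return it.

    positive, negative: iterables of picture ids labelled so far
    neighbors[i]:       ids of the other pictures, nearest to picture i first
    k0:                 preset neighborhood size
    distance(i, ids):   distance from picture i to the set ids (caller-supplied;
                        it must also handle an empty negative set if one can occur)
    ask_user(i):        True if the user is interested in picture i
    """
    positive, negative = set(positive), set(negative)
    while True:
        # Union of the k0-neighborhoods of all positive samples.
        phi = set()
        for p in positive:
            phi.update(neighbors[p][:k0])
        candidates = phi - positive - negative  # not yet labelled
        if not candidates:
            return positive
        # Pick the candidate with the largest r = r_minus - r_plus.
        best = max(candidates,
                   key=lambda i: distance(i, negative) - distance(i, positive))
        (positive if ask_user(best) else negative).add(best)
```

Each pass labels exactly one previously unlabelled picture, so the loop ends after at most as many questions as there are pictures, and the user is only asked about pictures adjacent to the current positive samples.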
The foregoing description of the embodiments is provided to enable a person of ordinary skill in the art to understand and use the present invention. It will be apparent to those skilled in the art that various modifications can readily be made to the above embodiments and that the generic principles described herein can be applied to other embodiments without inventive effort. Therefore, the present invention is not limited to the above embodiments, and improvements and modifications made by those skilled in the art on the basis of the present disclosure should fall within the scope of the present invention.

Claims (5)

1. A rare disease picture searching method based on rare class mining comprises a rare class detection part and a rare class development part; the method is characterized in that the rare class detection process comprises the following steps:
A1. obtaining search feature parameters input by a user, wherein the search feature parameters are a triplet $\langle k, s_{up}, s_{low} \rangle$, where $s_{up}$ and $s_{low}$ are respectively the upper and lower limits of the rare class index of rare disease pictures set by the user, and $k$ is a given natural number representing the size of a picture's adjacent sample set, with $k \in [k_{min}, k_{max}]$, where $k_{min}$ and $k_{max}$ are respectively the lower and upper limits of the given interval;
A2. calculating a rare class index of each picture in the picture library through the following formula;
Figure FDA0004176575080000011
wherein:
Figure FDA0004176575080000012
is the rare class index of the $i$-th picture in the picture library, $i$ being a natural number greater than 0; the adjacent sample set of the $i$-th picture is determined by the KNN algorithm and contains $k$ pictures; $d_1$ to $d_k$ are the Euclidean distances between the $k$ pictures in the set and the $i$-th picture in the picture library, sorted in ascending order; and avg is the averaging function;
A3. searching for pictures whose rare class index lies in the interval $[s_{low}, s_{up}]$ between the lower and upper limits and feeding them back to the user; if the feedback contains at least one rare disease picture of interest to the user, stopping detection; otherwise the user fine-tunes the search feature parameters and execution returns to step A1;
the development process of the rare class is as follows:
B1. from the pictures fed back to the user by rare class detection, forming the rare disease pictures of interest to the user into a positive sample set and the remaining pictures into a negative sample set;
B2. determining the adjacent sample set of each picture in the positive sample set and taking the union of these adjacent sample sets as $\Phi$;
B3. for any picture in the set $\Phi$ that belongs to neither the positive nor the negative samples, calculating its positive sample distance $r^{+}$ from the positive sample set and its negative sample distance $r^{-}$ from the negative sample set;
B4. extracting the picture with the largest value of $r$ from the set $\Phi$ and presenting it to the user, where $r = r^{-} - r^{+}$; if the user is interested, adding the picture to the positive sample set, and otherwise adding it to the negative sample set;
B5. repeating steps B2 to B4 until $\Phi - \cap^{+} - \cap^{-}$ is empty, and taking the positive sample set at that point as the final output, where $\cap^{+}$ is the intersection of the positive sample set with $\Phi$ and $\cap^{-}$ is the intersection of the negative sample set with $\Phi$.
2. The rare disease picture searching method according to claim 1, wherein: in step A2, the interval $[k_{min}, k_{max}]$ is discretized, each value of $k$ in the interval is traversed, and the rare class indexes of all pictures in the picture library are calculated and stored for that value, so that the rare class indexes of the pictures can be retrieved directly without recalculation when the search feature parameters are adjusted later.
3. The rare disease picture searching method according to claim 2, wherein: a grading statistical method is applied to the picture rare class indexes under different $k$ values, and through statistical processing the indexes are divided into intervals according to their size, so that a single online rare class detection by the user can be completed more quickly.
4. The rare disease picture searching method according to claim 1, wherein: in step B3, the positive sample distance $r^{+}$ is calculated by the following formula:
Figure FDA0004176575080000021
where: $x$ is the feature vector of any picture in the set $\Phi$; $P$ is the feature matrix composed of the feature vectors of all pictures in the positive sample set, each feature vector consisting of all pixel values of the picture; $\omega_1$ is a weight vector of $n$ weights that sum to 1, $n$ being the number of pictures in the positive sample set;
Figure FDA0004176575080000022
is the weight coefficient; $\omega_1$ and
Figure FDA0004176575080000023
are determined by minimizing the positive sample distance $r^{+}$; and $\|\cdot\|_2$ is the two-norm.
5. The rare disease picture searching method according to claim 1, wherein: in step B3, the negative sample distance $r^{-}$ is calculated by the following formula:
[formula image FDA0004176575080000024]
wherein: x is the feature vector of any picture in the set Φ; G is the feature matrix composed of the feature vectors of all pictures in the negative sample set, each feature vector being composed of all pixel values of the picture; $\omega_2$ is a weight vector composed of m weight values whose sum is equal to 1; m is the number of pictures in the negative sample set; [symbol image FDA0004176575080000025] is the weight coefficient $\omega_2$, and [symbol image FDA0004176575080000026] is determined by minimizing the negative sample distance $r^{-}$; and $\lVert \cdot \rVert_2$ is the two-norm.
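The formulas for the two distances in claims 4 and 5 are given as figures and are not reproduced in the text, so the sketch below is only one reading that is consistent with the surrounding definitions: the distance is taken to be the smallest two-norm of x minus a weighted combination of the sample feature vectors, with weights that sum to 1. Under the additional assumption that the weights may take any sign, that minimum equals the distance from x to the affine hull of the samples, which is computed here by Gram-Schmidt orthogonalization. The names `affine_distance` and `score` are hypothetical.

```python
import math


def _dot(a, b):
    return sum(u * v for u, v in zip(a, b))


def affine_distance(x, samples):
    """Smallest ||x - sum_i w_i * s_i||_2 over weights w with sum(w) == 1.

    Because the weights sum to 1, the combination equals
    s_0 + sum_{i>=1} w_i * (s_i - s_0), so the problem reduces to projecting
    x - s_0 onto the span of the difference vectors.
    """
    base = samples[0]
    residual = [a - b for a, b in zip(x, base)]
    basis = []
    for s in samples[1:]:
        d = [a - b for a, b in zip(s, base)]
        for q in basis:  # remove components along directions already found
            c = _dot(d, q)
            d = [a - c * b for a, b in zip(d, q)]
        norm = math.sqrt(_dot(d, d))
        if norm > 1e-12:  # skip directions that add nothing new
            basis.append([a / norm for a in d])
    for q in basis:  # subtract the projection of x - s_0 onto the span
        c = _dot(residual, q)
        residual = [a - c * b for a, b in zip(residual, q)]
    return math.sqrt(_dot(residual, residual))


def score(x, positives, negatives):
    """r = r_minus - r_plus, the ranking value used in step B4."""
    return affine_distance(x, negatives) - affine_distance(x, positives)
```

For example, with positive samples `(0, 0)` and `(2, 0)` the affine hull is the horizontal axis, so a picture at `(1, 3)` has a positive sample distance of 3. If the weights were instead required to be non-negative, a constrained solver would be needed and the distances could be larger.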
CN202010185084.6A 2020-03-17 2020-03-17 Rare disease picture searching method based on rare class mining Active CN111403027B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010185084.6A CN111403027B (en) 2020-03-17 2020-03-17 Rare disease picture searching method based on rare class mining

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010185084.6A CN111403027B (en) 2020-03-17 2020-03-17 Rare disease picture searching method based on rare class mining

Publications (2)

Publication Number Publication Date
CN111403027A (en) 2020-07-10
CN111403027B (en) 2023-06-27

Family

ID=71428932

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010185084.6A Active CN111403027B (en) 2020-03-17 2020-03-17 Rare disease picture searching method based on rare class mining

Country Status (1)

Country Link
CN (1) CN111403027B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107066515A (en) * 2017-01-23 2017-08-18 Wuhan Wanban Shangpin Information Technology Co., Ltd. (武汉万般上品信息技术有限公司) A fast query technique for rare class data in big data
CN107145901A (en) * 2017-04-24 2017-09-08 Wuhan University (武汉大学) A fast query method for rare class data in big data
CN109948705A (en) * 2019-03-20 2019-06-28 Wuhan University (武汉大学) A rare class detection method and device based on a k-nearest-neighbor graph

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP2951722B1 (en) * 2013-01-31 2018-05-16 Universite De Montpellier Process for identifying rare events
US9195946B2 (en) * 2013-05-23 2015-11-24 Globalfoundries Inc Auto-maintained document classification

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
Hanfei Lin et al. RCLens: Interactive Rare Category Exploration and Identification. IEEE Transactions on Visualization and Computer Graphics. 2017, vol. 24, no. 7, pp. 2223-2237. *
Jingrui He et al. Nearest-neighbor-based active learning for rare category detection. Advances in Neural Information Processing Systems 20. 2007, pp. 633-640. *
Song Wang et al. Fast Rare Category Detection Using Nearest Centroid Neighborhood. Web Technologies and Applications. 2016, vol. 9931, pp. 383-394. *
Wang Song (王淞); Huang Hao (黄浩); Yu Guo (余果); Liang Nan (梁楠); Wang Liwei (王黎维); Sun Yueming (孙月明). A rare class detection algorithm based on the k-nearest-neighbor graph. Journal of Software (软件学报). 2016, (09), pp. 2320-2331. *
Huang Hao (黄浩) et al. A rare class detection algorithm based on weighted boundary degree. Journal of Software (软件学报). 2012, vol. 23, no. 5, pp. 1195-1206. *

Also Published As

Publication number Publication date
CN111403027A (en) 2020-07-10

Legal Events

Code Title
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant