WO2005050479A1 - Similar pattern search apparatus, similar pattern search method, similar pattern search program, and fraction separation apparatus - Google Patents

Info

Publication number
WO2005050479A1
Authority
WO
WIPO (PCT)
Prior art keywords
pattern
class
particle size
map
similar
Prior art date
Application number
PCT/JP2004/016841
Other languages
English (en)
Japanese (ja)
Inventor
Hiromi Kataoka
Original Assignee
National University Corporation Kochi University
A & T Corporation
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by National University Corporation Kochi University and A & T Corporation
Priority to JP2005515594A (JP4521490B2)
Priority to EP04818872A (EP1686494A4)
Priority to US10/580,252 (US7697764B2)
Publication of WO2005050479A1

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B40/00 ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/213 Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
    • G06F18/2137 Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods based on criteria of topology preservation, e.g. multidimensional scaling or self-organising maps
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G06F18/232 Non-hierarchical techniques
    • G06F18/2321 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/60 Type of objects
    • G06V20/69 Microscopic objects, e.g. biological cells or cellular parts
    • G06V20/698 Matching; Classification
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B40/00 ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
    • G16B40/20 Supervised data analysis
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N15/00 Investigating characteristics of particles; Investigating permeability, pore-volume or surface-area of porous materials
    • G01N15/01 Investigating characteristics of particles; Investigating permeability, pore-volume or surface-area of porous materials specially adapted for biological cells, e.g. blood cells
    • G01N2015/016 White blood cells
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N15/00 Investigating characteristics of particles; Investigating permeability, pore-volume or surface-area of porous materials
    • G01N15/10 Investigating individual particles
    • G01N15/14 Optical investigation techniques, e.g. flow cytometry
    • G01N2015/1477 Multiparameters
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N15/00 Investigating characteristics of particles; Investigating permeability, pore-volume or surface-area of porous materials
    • G01N15/10 Investigating individual particles
    • G01N15/14 Optical investigation techniques, e.g. flow cytometry
    • G01N2015/1486 Counting the particles
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N15/00 Investigating characteristics of particles; Investigating permeability, pore-volume or surface-area of porous materials
    • G01N15/10 Investigating individual particles
    • G01N15/14 Optical investigation techniques, e.g. flow cytometry
    • G01N2015/1488 Methods for deciding
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N15/00 Investigating characteristics of particles; Investigating permeability, pore-volume or surface-area of porous materials
    • G01N15/10 Investigating individual particles
    • G01N15/14 Optical investigation techniques, e.g. flow cytometry
    • G01N2015/1493 Particle size

Definitions

  • The present invention relates to a similar pattern search apparatus, a similar pattern search method, and a similar pattern search program for searching a population including a plurality of patterns for a pattern highly similar to the pattern of a test sample.
  • The present invention also relates to a fraction separation device.
  • flow cytometry is a test that can quickly classify leukocytes into neutrophils, lymphocytes, monocytes, eosinophils, and the like.
  • Leukocyte particle size data obtained by flow cytometry can be classified into various particle size patterns depending on cell maturity and disease (see Non-Patent Document 1).
  • The present inventors have developed a method of clustering leukocyte particle size data, obtained as two-dimensional histograms, by using a self-organizing map (SOM) (see Non-Patent Documents 2-4).
  • the leukocyte particle size data is recorded in a database, and a characteristic pattern is extracted by applying data mining, thereby enabling a powerful classification that cannot be determined only by information based on a two-dimensional histogram.
  • Inside the analyzer, the fractions are separated by a method that uses the valley between fractions as the boundary, and each fraction is then used for diagnosis as a single numerical value.
  • Non-Patent Document 1: Noriyuki Tatsumi, Izumi Tsuda, Nobuyuki Takubo, et al.: Reflection of automatic leukocyte classification results in on-site medical care, HORIBA Technical Reports, No. 20, pp. 23-26, 2000.
  • Non-Patent Document 2: Hiromi Kataoka, Hiromi Ioki, Osamu Konishi, et al.: Construction of a data mining support system for leukocyte particle size, Journal of the Japan Society of Clinical Laboratory Automation, Vol. 27, 4, pp. 583, 2002.
  • Non-Patent Document 4: Hiromi Ioki, Hiromi Kataoka, Yuka Kawasaki, et al.: Pattern Classification of Allergic Disease Area by Leukocyte Particle Size Data, Medical Informatics 22 (Suppl.), pp. 211-212, 2002.
  • The present invention has been made in view of the above, and its object is to search a population including a plurality of patterns, with high accuracy, for a pattern highly similar to the pattern of a test sample, and to make the result usable for diagnosis.
  • The invention according to claim 1 searches a population including a plurality of patterns for a pattern highly similar to the pattern of the test sample.
  • It includes a similar pattern search means for selecting, from the class map, a class similar to the component fraction contained in the pattern of the test sample.
  • Clustering is performed on the plurality of patterns using model parameters characterizing the plurality of component fractions respectively included in the plurality of patterns to create a class map, and a class similar to the component fraction contained in the pattern of the test sample is selected from the class map, so that a similarity search is performed with high accuracy.
  • the invention according to claim 2 is characterized in that the pattern is a one-dimensional or multidimensional pattern.
  • a one-dimensional or multidimensional pattern similarity search is performed with high accuracy.
  • the invention according to claim 3 is characterized in that the pattern is a leukocyte particle size pattern, a protein electrophoresis waveform, or a blood cell histogram.
  • the pattern is a leukocyte particle size pattern, a protein electrophoresis waveform, or a blood cell histogram.
  • The invention according to claim 4 is a similar pattern search method for searching a population including a plurality of patterns for a pattern highly similar to the pattern of a test sample, the method including: a class map creating step of performing clustering on the plurality of patterns using model parameters characterizing the plurality of component fractions contained in each of the plurality of patterns to create a class map; a storage step of storing the class map created in the class map creating step; and a similar pattern search step of selecting, from the class map, a class similar to the component fraction contained in the pattern of the test sample.
  • Clustering is performed on the plurality of patterns by using the model parameters characterizing the plurality of component fractions respectively included in the plurality of patterns to create a class map, and a class similar to the component fraction contained in the pattern of the test sample is selected from the class map, so that a similarity search is performed with high accuracy.
  • An invention according to claim 5 is a program for causing a computer to execute a similar pattern search method for searching a pattern having a high similarity to a pattern of a test sample from a population including a plurality of patterns.
  • The program causes the computer to execute a storage step of storing the class map created in the class map creating step and a similar pattern search step of selecting, from the class map, a class similar to the component fraction contained in the pattern of the test sample.
  • Using model parameters characterizing the plurality of component fractions respectively included in the plurality of patterns, clustering is performed on the plurality of patterns to create a class map, and a class similar to the component fraction contained in the pattern of the test sample is selected from the class map, so that a similarity search is performed with high accuracy.
  • the invention according to claim 6 is a similar pattern search device that searches a population containing a plurality of leukocyte particle size patterns for a leukocyte particle size pattern having a pattern highly similar to the leukocyte particle size pattern of the test sample.
  • The leukocyte particle size pattern includes a plurality of cell component fractions, and the device comprises: primary clustering means for performing clustering by applying a self-organizing map to the plurality of actually measured leukocyte particle size patterns to create a primary class map;
  • first parameter determining means for executing the EM algorithm, using predetermined initial values, for each pattern included in the primary class map, thereby determining the number of cell components included in each pattern and a first mixture distribution model parameter comprising the mean value, variance, and density of each cell component;
  • second parameter determining means for executing the EM algorithm for each of the actually measured leukocyte particle size patterns with the first mixture distribution model parameter as the initial value, thereby determining a second mixture distribution model parameter consisting of the mean value, variance, and density of each cell component;
  • secondary clustering means for applying a self-organizing map to the second mixture distribution model parameter of each leukocyte particle size pattern to create a secondary class map;
  • inter-class distance master creating means for calculating the similarity distances of all combinations of the classes included in the secondary class map and creating an inter-class distance master that associates each class combination with its similarity distance;
  • storage means for storing the secondary class map and the inter-class distance master;
  • class determination means for determining, from the secondary class map, the class to which each cell component fraction included in the leukocyte particle size pattern of the test sample belongs; and
  • similar pattern search means for detecting, from the inter-class distance master, a class whose similarity distance to the class determined by the class determination means is equal to or smaller than a predetermined threshold as a similar class, and determining the leukocyte particle size patterns included in the similar class to be patterns highly similar to the leukocyte particle size pattern of the test sample.
  • The components of the leukocyte particle size are separated by the EM algorithm using initial values determined by applying the self-organizing map, and clustering is performed again using the self-organizing map, thereby constructing a secondary class map and an inter-class distance master.
  • the invention according to claim 7 is a similar pattern search method for searching a leukocyte particle size pattern having a pattern highly similar to the leukocyte particle size pattern of a test sample from a population including a plurality of leukocyte particle size patterns.
  • the leukocyte particle size pattern includes a plurality of cell component fractions, and performs a clustering by applying a self-organizing map to the plurality of actually measured leukocyte particle size patterns to create a primary class map.
  • the primary clustering step and the EM algorithm for each pattern included in the primary class map are performed using a predetermined initial value, so that the number of components of the cell component included in each pattern and each cell
  • The components of the leukocyte particle size are separated by the EM algorithm using initial values determined by applying the self-organizing map, and clustering is performed again using the self-organizing map, thereby constructing a secondary class map and an inter-class distance master.
  • The invention according to claim 8 is a program that causes a computer to execute a similar pattern search method for searching a population including a plurality of leukocyte particle size patterns for a leukocyte particle size pattern highly similar to the leukocyte particle size pattern of a test sample.
  • The leukocyte particle size pattern includes a plurality of cell component fractions, and the program performs clustering by applying a self-organizing map to the plurality of leukocyte particle size patterns obtained by actual measurement.
  • A class whose similarity distance to the class determined in the class determining step is equal to or less than a predetermined threshold is detected as a similar class from the inter-class distance master, and the leukocyte particle size patterns included in the similar class are determined to be patterns highly similar to the leukocyte particle size pattern of the test sample.
  • Each component of the leukocyte particle size is separated by the EM algorithm using initial values determined by applying the self-organizing map, and clustering is performed again using the self-organizing map, so that a secondary class map and an inter-class distance master are constructed.
  • The invention according to claim 9 is a cell component fraction separation device for separating each cell component fraction in a leukocyte particle size pattern including a plurality of cell component fractions, the device comprising:
  • primary clustering means for performing clustering by applying a self-organizing map to a plurality of leukocyte particle size patterns obtained by actual measurement to create a primary class map;
  • parameter determination means for executing the EM algorithm, using predetermined initial values, for each pattern included in the primary class map, thereby determining the number of cell components included in each pattern and mixture distribution model parameters including the mean value, variance, and density of each cell component fraction; and fraction separation means for separating the fractions of each cell component contained in each leukocyte particle size pattern by executing the EM algorithm on that pattern with the mixture distribution model parameters as initial values.
  • the self-organizing map (SOM) is applied to the determination of the initial value of the EM algorithm.
  • The similar pattern search device (claim 1) performs clustering on a plurality of patterns by using model parameters characterizing the plurality of component fractions respectively included in the plurality of patterns to create a class map, and selects from the class map a class similar to the component fraction included in the pattern of the test sample. Therefore, a pattern highly similar to the pattern of the test sample can be searched for with high accuracy from a population containing multiple patterns, and useful information for diagnosis can be provided.
  • the similar pattern search device uses a one-dimensional or multi-dimensional pattern as the pattern. This has the effect that a pattern with a high degree of similarity can be searched for with high accuracy.
  • The similar pattern search device (claim 3) uses a leukocyte particle size pattern, a protein electrophoresis waveform, or a blood cell histogram as the pattern, so that a highly accurate similarity search can be performed for patterns highly similar to a given leukocyte particle size pattern, protein electrophoresis waveform, or blood cell histogram.
  • The similar pattern search method performs clustering on a plurality of patterns by using model parameters characterizing the plurality of component fractions respectively included in the plurality of patterns to create a class map, and selects from the class map a class similar to the component fraction included in the pattern of the test sample. Therefore, a pattern highly similar to the pattern of the test sample can be searched for with high accuracy from a population containing multiple patterns, and useful information for diagnosis can be provided.
  • The similar pattern search program according to the present invention (claim 5) performs clustering on a plurality of patterns by using model parameters characterizing the plurality of component fractions contained in the plurality of patterns to create a class map, and selects from the class map a class similar to the component fraction included in the pattern of the test sample. This makes it possible to search with high accuracy for a pattern highly similar to the pattern of the test sample, and to provide useful information for diagnosis.
  • The similar pattern search device (claim 6) separates each component of the leukocyte particle size by an EM algorithm using initial values determined by applying the self-organizing map, and performs clustering again using the self-organizing map to construct a secondary class map and an inter-class distance master, so that the similarity of the search target can be freely selected.
  • Each component is separated by performing a mixture density approximation using an EM algorithm, and the feature parameters of each fraction are then clustered, which enables a similarity search focusing on the distribution pattern of a target cell group.
  • The similar pattern search method according to the present invention separates each component of the leukocyte particle size by an EM algorithm using initial values determined by applying a self-organizing map, and performs clustering again using the self-organizing map to construct a secondary class map and an inter-class distance master, so that the similarity of the search target can be freely selected.
  • The similar pattern search program according to the present invention separates each component of the leukocyte particle size by an EM algorithm using initial values determined by applying a self-organizing map, and performs clustering again using the self-organizing map to construct a secondary class map and an inter-class distance master, so that the similarity of the search target can be freely selected.
  • The fraction separating apparatus applies a self-organizing map (SOM) to the determination of the initial values of the EM algorithm, which has the effect of solving the problem of convergence to a local maximum.
  • FIG. 1 is a block diagram showing a configuration of a similar pattern search device 1 according to the present embodiment.
  • FIG. 2 is a flowchart of a process performed by a similar pattern search device 1 according to the present embodiment.
  • FIG. 3 is a diagram showing an example of a primary class map obtained as a result of performing primary clustering by an SOM.
  • FIG. 4 shows a two-dimensional histogram of the original particle size data (upper diagram) and a modeled two-dimensional histogram in which each fraction component is synthesized and redrawn using the obtained mixture distribution parameters (lower diagram).
  • FIG. 5 is a diagram showing an example of a secondary class map obtained as a result of clustering the individual mixture model parameters obtained by the EM algorithm with the SOM.
  • FIG. 6 is a view showing the distribution of rod-shaped nuclei and lobulated nuclei distributed in the neutrophil region.
  • FIG. 7 is an enlarged view of distribution of lobulated nuclei based on Class 351.
  • FIG. 8 is a diagram plotting distances of each class based on Class801 of eosinophils.
  • FIG. 9 is a diagram showing an example of a primary class map obtained as a result of performing primary clustering on protein electrophoresis waveforms by SOM.
  • FIG. 10 is a diagram showing an example of a primary class map obtained as a result of performing primary clustering on a blood cell histogram by SOM.
  • FIG. 11 is a diagram showing one embodiment of the present invention.
  • The present inventors found that separating each cell component included in the leukocyte particle size pattern by a mixture density approximation using an EM algorithm, and then clustering the characteristic parameters of each fraction, makes it possible to perform a similarity search focusing on the distribution pattern of the target cell group, and completed the present invention based on this finding.
  • The EM algorithm has the problem that its convergence point depends strongly on the initial conditions, so that a local maximum of the marginal likelihood cannot be avoided. In other words, depending on the initial values, it may converge to a low-quality local solution.
  • In the present invention, the initial values for each class are obtained from the result of clustering the entire leukocyte particle size data by the SOM, which solves the problem of the marginal likelihood converging to a local maximum.
  • an algorithm that enables a high-speed similarity search from a comprehensive viewpoint of each cell component of leukocytes or a combination of each component is developed, and information useful for diagnosis is provided.
  • FIG. 1 is a block diagram illustrating a configuration of a similar pattern search device 1 according to the present embodiment.
  • The similar pattern search device 1 according to the present embodiment includes a primary clustering unit 11, a first parameter determining unit 12, a second parameter determining unit 13, a secondary clustering unit 14, an inter-class distance master creating unit 15, a memory 16, a class determination unit 17, and a similar pattern search unit 18.
  • The present invention is characterized in that each component is separated by performing a mixture density approximation using an EM algorithm and the characteristic parameters of each fraction are clustered, which enables a similarity search focused on the distribution pattern of a target cell group.
  • The EM algorithm is composed of two processing steps, the Expectation step (E-step) and the Maximization step (M-step). These steps are repeated, updating the parameters each time, until convergence is reached, which yields a maximum point of the likelihood.
  • The E-step calculates the conditional expected value of the log-likelihood, and the M-step maximizes that conditional expected value.
  • The EM algorithm has the problem that its convergence point depends strongly on the initial conditions, so that a local maximum of the marginal likelihood cannot be avoided. In other words, depending on the initial values, it may converge to a low-quality local solution.
  • In the present invention, the initial values for each class are obtained from the result of clustering the entire leukocyte particle size data by the SOM, which solves the problem of the marginal likelihood converging to a local maximum.
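The E-step and M-step loop described above can be sketched as follows. This is a minimal illustration only, assuming a one-dimensional mixture of normal distributions (the source works on two-dimensional histograms) and a fixed number of iterations; the names `normal_pdf`, `em_gmm_1d`, `n_iter`, and `var_floor` are hypothetical. The three returned quantities correspond to the mean value, variance, and density (mixing weight) that the text calls the mixture distribution model parameters.

```python
import math

def normal_pdf(x, mean, var):
    """Density of a normal distribution with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def em_gmm_1d(data, means, variances, weights, n_iter=50, var_floor=1e-6):
    """Fit a 1-D normal mixture by EM, starting from the supplied initial values."""
    means, variances, weights = list(means), list(variances), list(weights)
    k = len(means)
    for _ in range(n_iter):
        # E-step: responsibility of each component for each data point.
        resp = []
        for x in data:
            p = [weights[j] * normal_pdf(x, means[j], variances[j]) for j in range(k)]
            total = sum(p) or 1e-300  # guard against underflow to zero
            resp.append([pj / total for pj in p])
        # M-step: re-estimate density (weight), mean, and variance per component.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            if nj == 0.0:
                continue  # component received no mass; keep its old parameters
            weights[j] = nj / len(data)
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            variances[j] = max(
                sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj,
                var_floor,  # assumed floor to avoid a collapsing component
            )
    return means, variances, weights
```

Because the result depends on the starting point, the `means`, `variances`, and `weights` arguments are where SOM-derived initial values would be supplied in the scheme the text describes.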
  • The two-dimensional histogram data of the leukocyte particle size measured by the analyzer 2 is transmitted to the similar pattern search device 1 and stored in the memory 16.
  • the primary clustering unit 11 performs a clustering by applying a self-organizing map to a plurality of the leukocyte particle size patterns obtained by actual measurement, thereby creating a primary class map.
  • The first parameter determination unit 12 executes the EM algorithm for each pattern included in the primary class map using predetermined initial values, thereby determining the number of cell components included in each pattern and a first mixture distribution model parameter consisting of the mean value, variance, and density of each cell component.
  • The second parameter determination unit 13 executes the EM algorithm for each actually measured leukocyte particle size pattern with the first mixture distribution model parameter as the initial value, thereby determining the number of cell components included in each leukocyte particle size pattern and a second mixture distribution model parameter consisting of the mean value, variance, and density of each cell component.
  • the secondary clustering unit 14 creates a secondary class map by performing clustering by applying the self-organizing map to the second mixture distribution model parameters.
  • Although the self-organizing map is used here, K-means clustering or the like may be used instead.
  • The inter-class distance master creating unit 15 calculates the similarity distances of all combinations of the classes included in the secondary class map and creates an inter-class distance master that associates each class combination with the similarity distance between the classes.
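An inter-class distance master of this kind can be pictured as a lookup table keyed by class pair. The sketch below is an assumption-laden illustration: it represents each class by a numeric parameter vector and uses the Euclidean distance, neither of which the source specifies, and the name `build_distance_master` is hypothetical.

```python
import math
from itertools import combinations

def build_distance_master(class_vectors):
    """Map every unordered class pair (i, j), i < j, to the distance between the classes.

    class_vectors: list of equal-length numeric vectors, one per class
    (assumed representation; Euclidean distance is also an assumption).
    """
    master = {}
    for i, j in combinations(range(len(class_vectors)), 2):
        master[(i, j)] = math.dist(class_vectors[i], class_vectors[j])
    return master
```

Precomputing all pairs once means that a later search only needs table lookups rather than repeated distance calculations.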
  • The memory 16 stores data such as the two-dimensional histogram data of the leukocyte particle size measured by the analyzer 2, the secondary class map data created by the secondary clustering unit 14, and the inter-class distance master data created by the inter-class distance master creating unit 15.
  • The class determination unit 17 determines, from the secondary class map, the class to which each cell component fraction contained in the leukocyte particle size pattern of the test sample belongs.
  • The similar pattern search unit 18 detects, from the inter-class distance master, a class whose similarity distance to the class determined by the class determination unit 17 is equal to or less than a predetermined threshold as a similar class, and determines the leukocyte particle size patterns included in the similar class to be patterns highly similar to the leukocyte particle size pattern of the test sample.
  • Here, the distance between classes is used to determine the similarity, but the evaluation criterion for similarity is not limited to this; another distance measure or the like may be used.
  • the external input / output device 2 transmits to the similar pattern search device 1 various parameters and similar pattern search conditions input by the user.
  • The similar patterns retrieved by the similar pattern search device 1 are output on the screen.
  • FIG. 2 shows a flowchart of a process performed by the similar pattern search device 1 according to the present embodiment.
  • The explanation below follows the case of processing two-dimensional histograms of the LMNE channel (128 * 128, 8-bit/sample data) for 8,800 general patient samples analyzed by the automatic blood cell counter PENTRA120 (Horiba, Ltd.).
  • The two-dimensional histogram data output from the analyzer 2 has been smoothed over the eight neighboring points.
  • Clustering was performed using an SOM with an input layer of 128 * 128 (16,384 neurons) and a competitive layer of 12 * 12 units, and the 144 patterns obtained were used as the primary class map.
  • the learning parameters of the SOM were a neighborhood distance of 4 and a learning rate of 0.3.
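The primary clustering step can be sketched with a small self-organizing map. The defaults below take the neighborhood distance of 4 and learning rate of 0.3 from the text; everything else is an assumption made for illustration, including the square neighborhood, the linear decay schedule, the random initialization, the number of epochs, and the names `train_som` and `best_matching_unit`. In the setup described above, each pattern would be a flattened 128 * 128 histogram and the grid would be 12 * 12, which pure Python handles only slowly.

```python
import random

def best_matching_unit(weights, x):
    """Index of the unit whose weight vector is closest (squared Euclidean) to x."""
    return min(range(len(weights)),
               key=lambda i: sum((wi - xi) ** 2 for wi, xi in zip(weights[i], x)))

def train_som(patterns, rows, cols, n_epochs=20, radius=4, rate=0.3, seed=0):
    """Train a rectangular SOM; returns weight vectors indexed by row * cols + col."""
    rng = random.Random(seed)
    dim = len(patterns[0])
    weights = [[rng.random() for _ in range(dim)] for _ in range(rows * cols)]
    for epoch in range(n_epochs):
        # Shrink the neighborhood and learning rate linearly (assumed schedule).
        frac = 1.0 - epoch / n_epochs
        r = max(1, round(radius * frac))
        lr = rate * frac
        for x in patterns:
            br, bc = divmod(best_matching_unit(weights, x), cols)
            for i, w in enumerate(weights):
                ur, uc = divmod(i, cols)
                # Update every unit inside the square neighborhood of the winner.
                if abs(ur - br) <= r and abs(uc - bc) <= r:
                    for d in range(dim):
                        w[d] += lr * (x[d] - w[d])
    return weights
```

After training, each competitive-layer unit's weight vector plays the role of one representative pattern, so a 12 * 12 grid yields the 144 patterns mentioned in the text.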
  • For each two-dimensional histogram, 4 * 4 = 16 divided areas are set, the center of gravity of each area is obtained, and these centers of gravity are used as the initial values for separating the mixture model with the EM algorithm.
  • the distribution model for each fraction was calculated assuming a normal distribution.
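The initial-value step can be illustrated by computing a count-weighted center of gravity for each block of a histogram divided 4 * 4. This is a sketch under assumptions: the histogram is taken to be a square list of lists whose side is divisible by the number of divisions, blocks with no counts are skipped, and the name `region_centroids` is hypothetical.

```python
def region_centroids(hist, n_div=4):
    """Count-weighted centroid (row, col) of each n_div x n_div block of a 2-D histogram.

    Blocks that contain no counts are skipped (an assumption of this sketch).
    """
    size = len(hist)
    step = size // n_div
    centroids = []
    for br in range(n_div):
        for bc in range(n_div):
            total = sum_r = sum_c = 0.0
            for r in range(br * step, (br + 1) * step):
                for c in range(bc * step, (bc + 1) * step):
                    v = hist[r][c]
                    total += v
                    sum_r += v * r
                    sum_c += v * c
            if total > 0:
                centroids.append((sum_r / total, sum_c / total))
    return centroids
```

The returned points would serve as starting means for the EM iterations; starting variances and weights would still have to be chosen separately.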
  • The obtained mixture distribution model parameters (number of components, and the average value, variance, and density of each component) were manually adjusted to determine provisional parameters.
  • The class to which each fraction of the test sample belongs was determined from the secondary class map, the inter-class distance master was read, the threshold was set according to the purpose of the search, and the group of classes matching the conditions was retrieved.
  • By making the threshold variable, the strength of similarity of the search can be freely selected, and the similarity search is realized by searching the group of classes within the threshold under a disjunctive (OR) condition.
  • To search for the overall pattern of each fraction we decided to search using the conjunction of the classes belonging to each fraction.
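The search logic above, a disjunction over the classes within the threshold for one fraction and a conjunction across fractions for the overall pattern, can be sketched as follows. The data layout is assumed for illustration: the distance master is a dictionary keyed by class pair, each stored sample is a mapping from fraction name to class number, and the names `similar_classes` and `search_samples` are hypothetical.

```python
def similar_classes(master, cls, threshold):
    """Classes whose distance to `cls` is at most `threshold` (including `cls` itself)."""
    found = {cls}
    for (a, b), dist in master.items():
        if dist <= threshold:
            if a == cls:
                found.add(b)
            elif b == cls:
                found.add(a)
    return found

def search_samples(sample_classes, query_classes, master, threshold):
    """Return IDs of stored samples that match the query in every fraction.

    sample_classes: {sample_id: {fraction_name: class_id}}
    query_classes:  {fraction_name: class_id} for the test sample
    """
    allowed = {f: similar_classes(master, c, threshold) for f, c in query_classes.items()}
    return [
        sid for sid, classes in sample_classes.items()
        # OR within a fraction (any allowed class), AND across fractions.
        if all(classes.get(f) in allowed[f] for f in allowed)
    ]
```

Raising the threshold widens each allowed set and so loosens the match, which is the sense in which the strength of similarity can be chosen freely.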
  • FIG. 3 shows the result of performing primary clustering by SOM. It shows the contents of the 12 * 12 competitive layer, in which the overall patterns of leukocyte particle size were clustered into 144 clusters.
  • The upper diagram of Fig. 4 shows a two-dimensional histogram of the original particle size data, in which + marks an initial value and x marks the path and convergence point along which the EM algorithm searched for the optimal likelihood.
  • the lower part of Fig. 4 is a modeled two-dimensional histogram in which each fraction component is synthesized and redrawn using the obtained mixture distribution parameters.
  • Fig. 5 shows the results of clustering individual mixture model parameters obtained by the EM algorithm with SOM.
  • the elliptical component drawn in red indicates the fraction of one component cell, and a result in which a similar pattern was arranged around the component was obtained. It can be understood that various patterns exist for each cell group. Pink 1 indicates lymphocytes, yellow 2 indicates monocytes, light blue 3 indicates neutrophils, and purple 4 indicates eosinophils.
  • the clustering of four cell populations with literal LMNE channels was obtained.
  • platelets were mapped in the white area distributed below the lymphocytes, and distributions considered to be abnormal cells were mapped in the other white areas and in the boundary area between each cell group.
  • The cell groups shown in FIGS. 5 and 6 are referred to by sequential numbers in raster order, with the upper left corner being Class0 and the lower right corner being Class899.
  • FIG. 6 shows the distribution of rod-shaped nuclei and lobulated nuclei in the neutrophil region.
  • Class120 is the class with more rod-shaped nuclei than any other class.
  • Class351 is the class with the most lobulated nuclei.
  • The yellow gradation region 31 (left) shows, by color intensity, the distribution of patterns at similar distances centered on Class120, the group of cases containing the most rod-shaped nuclei with a marked nuclear left shift.
  • The blue gradation region 32 shows the pattern centered on Class351, where lobulated nuclei were most numerous.
  • FIG. 7 is an enlarged view of the distribution of lobulated nuclei based on Class351. To perform a similarity search over a wide range, search for the classes in the area enclosed by the red line; to search for cells with strong similarity, search for the classes in the green or blue area, which narrows down the search target.
  • FIG. 8 is a diagram plotting the distance of each class from Class801 of eosinophils.
  • The vertical axis represents the distance from Class801, and the horizontal axis represents the classes sorted in ascending order of distance.
  • Where the distance is 1 or less, the same eosinophils are distributed, which indicates that the similarity of the search target can be changed by changing the distance threshold.
  • A step-like curve was obtained for each cell type, and an interesting result was obtained in which lobulated and rod-shaped neutrophils were separated by monocytes. The curve tended to take various patterns depending on the reference cell.
  • In the above description, the similar pattern search device 1 searches for the similarity of leukocyte particle size patterns.
  • However, the present invention is not limited to this.
  • The similarity of test sample patterns such as protein electrophoresis waveforms and blood cell histograms can also be searched, as can the similarity of various other test sample patterns.
  • The test sample pattern is not limited to two-dimensional information such as the leukocyte particle size pattern described above; the invention can also be applied to one-dimensional information and multidimensional information (including a time axis).
  • FIG. 9 is a diagram showing an example of a primary class map obtained as a result of performing primary clustering on a protein electrophoresis waveform by SOM using the similar pattern search device 1.
  • FIG. 10 is a diagram showing an example of a primary class map obtained as a result of performing primary clustering on blood cell histograms by the SOM in the similar pattern search device 1.
  • A program for realizing the functions of the similar pattern search device 1 may be recorded on the computer-readable recording medium 60 shown in FIG.
  • Each function may be realized by causing the computer 50 shown in the same figure to read and execute the recorded program.
  • The computer 50 shown in the figure includes a CPU (Central Processing Unit) 51 for executing the above program, an input device 52 such as a keyboard and a mouse, a ROM (Read Only Memory) 53 for storing various data, a RAM (Random Access Memory) 54 for storing calculation parameters and the like, a reading device 55 for reading a program from the recording medium 60, and an output device 56 such as a display and a printer.
  • The CPU 51 reads the program recorded on the recording medium 60 via the reading device 55 and executes it to realize the above-described functions.
  • The recording medium 60 includes an optical disk, a flexible disk, a hard disk, and the like.
  • The similar pattern search device can provide useful information for diagnosis and treatment because the similarity scale can be freely changed with respect to the similarity obtained by integrating the components.
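
The initialization step described above (centers of gravity of the 4 * 4 areas used as starting values for the EM algorithm) could be sketched roughly as follows. This is only an illustration under stated assumptions, not the patented implementation: the function name `block_centroids`, the list-of-rows histogram layout, and the choice to skip empty areas are all hypothetical.

```python
def block_centroids(hist, blocks=4):
    """Return the weighted centroid (row, col) of each area of a 2D histogram.

    hist is a list of equal-length rows of non-negative counts.
    The histogram is divided into blocks x blocks areas.
    Assumption: areas with no counts are skipped rather than given a centroid.
    """
    n_rows, n_cols = len(hist), len(hist[0])
    bh, bw = n_rows // blocks, n_cols // blocks  # area height and width
    centroids = []
    for bi in range(blocks):
        for bj in range(blocks):
            total = sum_r = sum_c = 0.0
            for r in range(bi * bh, (bi + 1) * bh):
                for c in range(bj * bw, (bj + 1) * bw):
                    w = hist[r][c]
                    total += w
                    sum_r += w * r
                    sum_c += w * c
            if total > 0:
                centroids.append((sum_r / total, sum_c / total))
    return centroids


# Toy 8 x 8 histogram (a real LMNE histogram would be 128 x 128).
toy = [[0] * 8 for _ in range(8)]
toy[0][0] = 1
toy[1][1] = 1
toy[6][7] = 3
initial_means = block_centroids(toy, blocks=4)
```

Each returned centroid would then serve as the initial mean of one candidate component before the EM iterations start.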
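
The threshold search described above (a disjunction over the classes that fall within the distance threshold for one fraction, and a conjunction across fractions for the overall pattern) can be illustrated with a minimal sketch. The data layout is an assumption made for illustration: the interclass distance master is modelled as a nested dictionary, and the names `classes_within`, `search_samples`, the fraction labels, and all distance values are hypothetical.

```python
def classes_within(distance_master, ref_class, threshold):
    """Return the set of classes whose distance from ref_class is <= threshold."""
    return {c for c, d in distance_master[ref_class].items() if d <= threshold}


def search_samples(samples, query_classes, distance_master, threshold):
    """Return ids of samples that match the query in every queried fraction.

    Within one fraction, any class inside the threshold matches (disjunction);
    across fractions, all queried fractions must match (conjunction).
    """
    allowed = {frac: classes_within(distance_master, cls, threshold)
               for frac, cls in query_classes.items()}
    return [sid for sid, frac_classes in samples.items()
            if all(frac_classes.get(frac) in ok for frac, ok in allowed.items())]


# Hypothetical interclass distances, keyed by reference class.
distance_master = {
    120: {120: 0.0, 121: 0.8, 351: 3.5},
    801: {801: 0.0, 802: 0.9, 770: 2.1},
}
# Hypothetical stored samples: fraction name -> class on the secondary map.
samples = {
    "s1": {"neutrophil": 121, "eosinophil": 802},
    "s2": {"neutrophil": 351, "eosinophil": 801},
    "s3": {"neutrophil": 120, "eosinophil": 770},
}
query = {"neutrophil": 120, "eosinophil": 801}

strict = search_samples(samples, query, distance_master, threshold=1.0)
loose = search_samples(samples, query, distance_master, threshold=4.0)
```

Raising the threshold widens the set of accepted classes per fraction, so the loose search returns more samples than the strict one.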
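
The raster-order class numbering (Class0 at the upper left, Class899 at the lower right) and the ascending-distance ordering used on the horizontal axis of FIG. 8 can be sketched as below. The 30 * 30 map shape is an assumption inferred from the 900 class numbers, and the helper names and distance values are hypothetical.

```python
MAP_WIDTH = 30  # assumption: 900 classes arranged as a 30 x 30 map


def class_to_cell(class_index, width=MAP_WIDTH):
    """Convert a raster-order class number to (row, column) from the upper left."""
    return divmod(class_index, width)


def rank_by_distance(distances):
    """Sort (class, distance) pairs in ascending order of distance."""
    return sorted(distances.items(), key=lambda item: item[1])


# Hypothetical distances from a reference class such as Class801.
ranked = rank_by_distance({801: 0.0, 770: 2.1, 802: 0.9})
```

Cutting the ranked list at a chosen distance gives the group of classes treated as similar to the reference class.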

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Artificial Intelligence (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Medical Informatics (AREA)
  • General Health & Medical Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • Bioethics (AREA)
  • Databases & Information Systems (AREA)
  • Epidemiology (AREA)
  • Biophysics (AREA)
  • Public Health (AREA)
  • Software Systems (AREA)
  • Biotechnology (AREA)
  • General Engineering & Computer Science (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Multimedia (AREA)
  • Investigating Or Analysing Biological Materials (AREA)
  • Image Analysis (AREA)

Abstract

Information for diagnosis is provided by developing an algorithm in which, for a leukocyte particle size pattern containing multiple fractions of cell components, the components are separated from one another by mixture density approximation using an EM algorithm, and a similarity search reflecting the distribution pattern of a target cell group is performed by clustering the characteristic parameters of the individual fractions, thereby achieving a high-precision similarity search for each leukocyte cell component or from an overall view of a combination of cell components.
PCT/JP2004/016841 2003-11-21 2004-11-12 Similar pattern searching apparatus, method of similar pattern searching, program for similar pattern searching, and fractionation apparatus WO2005050479A1 (fr)

Priority Applications (3)

Application Number Priority Date Filing Date Title
JP2005515594A JP4521490B2 (ja) 2003-11-21 2004-11-12 Similar pattern searching apparatus, method of similar pattern searching, program for similar pattern searching, and fractionation apparatus
EP04818872A EP1686494A4 (fr) 2003-11-21 2004-11-12 Similar pattern searching apparatus, method of similar pattern searching, program for similar pattern searching, and fractionation apparatus
US10/580,252 US7697764B2 (en) 2003-11-21 2004-11-12 Similar pattern searching apparatus, method of similar pattern searching, program for similar pattern searching, and fractionation apparatus

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2003392845 2003-11-21
JP2003-392845 2003-11-21

Publications (1)

Publication Number Publication Date
WO2005050479A1 true WO2005050479A1 (fr) 2005-06-02

Family

ID=34616468

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2004/016841 WO2005050479A1 (fr) 2003-11-21 2004-11-12 Similar pattern searching apparatus, method of similar pattern searching, program for similar pattern searching, and fractionation apparatus

Country Status (4)

Country Link
US (1) US7697764B2 (fr)
EP (1) EP1686494A4 (fr)
JP (1) JP4521490B2 (fr)
WO (1) WO2005050479A1 (fr)

Also Published As

Publication number Publication date
EP1686494A1 (fr) 2006-08-02
US20070133855A1 (en) 2007-06-14
JP4521490B2 (ja) 2010-08-11
JPWO2005050479A1 (ja) 2007-06-14
US7697764B2 (en) 2010-04-13
EP1686494A4 (fr) 2011-07-27

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BW BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE EG ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NA NI NO NZ OM PG PH PL PT RO RU SC SD SE SG SK SL SY TJ TM TN TR TT TZ UA UG US UZ VC VN YU ZA ZM ZW

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): GM KE LS MW MZ NA SD SL SZ TZ UG ZM ZW AM AZ BY KG KZ MD RU TJ TM AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IS IT LU MC NL PL PT RO SE SI SK TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG

121 Ep: the epo has been informed by wipo that ep was designated in this application
WWE Wipo information: entry into national phase

Ref document number: 2005515594

Country of ref document: JP

WWE Wipo information: entry into national phase

Ref document number: 2004818872

Country of ref document: EP

WWE Wipo information: entry into national phase

Ref document number: 2007133855

Country of ref document: US

Ref document number: 10580252

Country of ref document: US

NENP Non-entry into the national phase

Ref country code: DE

WWW Wipo information: withdrawn in national office

Country of ref document: DE

WWP Wipo information: published in national office

Ref document number: 2004818872

Country of ref document: EP

WWP Wipo information: published in national office

Ref document number: 10580252

Country of ref document: US