JP2023123838A

JP2023123838A - Continuous image processing

Info

Publication number: JP2023123838A
Application number: JP2023111569A
Authority: JP
Inventors: ペース・チャールズ・ピー; P Pace Charles; ウィルチ・エリック; Wirch Eric
Original assignee: Corista LLC
Current assignee: Corista LLC
Priority date: 2013-11-15
Filing date: 2023-07-06
Publication date: 2023-09-05
Also published as: JP2019049990A; JP2021114305A

Abstract

To provide a method for querying, analyzing and processing data. SOLUTION: A method for searching a database of images based on a query comprises: if a magnification level of the query is greater than a first threshold, returning a first list of result tiles satisfying the query at the magnification level of the query; if the magnification level is less than or equal to the first threshold, retrieving tiles at a next lower magnification level and returning a second list of result tiles satisfying the query at the next lower magnification level; and processing each list of result tiles, the processing including, for each result tile: adding the result tile to a subset of result tiles; if the number of result tiles in the subset is greater than or equal to a second threshold, recursively searching the subset; saving results of each recursive search of the subset in a remaining subset; recursively searching the remaining subset; and saving results of the search of the remaining subset. SELECTED DRAWING: Figure 1

Description

Related application

This application claims priority to U.S. Provisional Patent Application No. 61/905,027, entitled "Continuous Image Analytics," filed November 15, 2013, and to PCT International Application No. PCT/US14/65850, entitled "Continuous Image Analytics," filed November 15, 2014. The entire teachings of each application are incorporated herein by reference.

Copyright Reservation

A portion of the disclosure of this document contains material that is subject to copyright protection. The copyright owner has no objection to the reproduction by anyone of this document or its disclosure as it appears in the patent file or records of the Patent Office, but otherwise reserves all copyright rights.

Methods, systems, and sets of computer-readable instructions in storage media (e.g., non-transitory storage media) are provided for querying, analyzing, and processing data. Specifically, methods, systems, and sets of computer-readable instructions in storage media (e.g., non-transitory storage media) are provided for processing samples used in digital systems, querying a database of images, iteratively processing image data, and generating image data results.

Relevant data can be retrieved from a data repository by requesting it with a query. However, in existing database query systems, there is a semantic gap between the user's conceptual expectations (e.g., the conceptual expectations conveyed in the query) and the low-level representation of the data. Specifically, in tissue imaging, there is a semantic gap in the assignment of meaning between the low-level representation of a digital image of microscopic tissue and the user's intent (i.e., the intent conveyed in the query). This semantic gap is large enough to make general-purpose application of content-based image retrieval techniques impossible. Specifically, a query that is too broad can return irrelevant results, and a query that is too specific can exclude relevant results.

The semantic gap problem stems from the complexity, variability, and size of the data. These factors complicate (e.g., work against) the discriminating means of an algorithm, and in extreme cases, when the algorithm is extended beyond the range over which its constraints apply, the error model may come to dominate the pattern model of the image data. A further problem lies in the practical approximating assumptions used when applying algorithms to large amounts of image data. An approximation here refers to a subset or summary of the image data that makes the algorithm computationally tractable. In the case of tissue image data, the scale of the data is extremely wide, so not only do discriminative features become obscure, but the meaning of a discriminative feature also differs at different scales.

The present invention provides a computer-implemented method of searching a database of images based on a query, the method comprising: in response to determining that a magnification level of the query is greater than a first threshold, returning a first list of result tiles that satisfy the query at the magnification level of the query; in response to determining that the magnification level of the query is less than or equal to the first threshold, retrieving tiles at a next lower magnification level and returning a second list of result tiles that satisfy the query at the next lower magnification level; and processing each list of result tiles, the processing including, for each result tile: adding the result tile to a subset of result tiles; in response to determining that the total number of result tiles in the subset is greater than or equal to a second threshold, recursively searching the subset; saving the results of each recursive search of the subset to a remaining subset; recursively searching the remaining subset; and saving the results of the search of the remaining subset.
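The control flow of this method can be illustrated with a minimal Python sketch. It is one possible reading, not the claimed implementation: the names `initial_result_list`, `process_result_tiles`, `tiles_by_level`, `matches` and `search_subset` are hypothetical, the "next lower magnification level" is assumed to be `query_level - 1`, and the recursive search itself is left as a caller-supplied function because the text does not specify it in detail.

```python
from typing import Callable, Dict, Iterable, List, TypeVar

T = TypeVar("T")


def initial_result_list(
    tiles_by_level: Dict[int, List[T]],
    matches: Callable[[T], bool],
    query_level: int,
    first_threshold: int,
) -> List[T]:
    """Pick the level to search from the query's magnification level.

    Assumption: the "next lower magnification level" is query_level - 1.
    """
    level = query_level if query_level > first_threshold else query_level - 1
    return [tile for tile in tiles_by_level.get(level, []) if matches(tile)]


def process_result_tiles(
    result_tiles: Iterable[T],
    second_threshold: int,
    search_subset: Callable[[List[T]], List[T]],
) -> List[T]:
    """Batch result tiles into subsets and search each subset.

    `search_subset` stands in for the recursive search (hypothetical).
    """
    saved: List[T] = []
    subset: List[T] = []
    for tile in result_tiles:
        subset.append(tile)
        if len(subset) >= second_threshold:
            saved.extend(search_subset(subset))
            subset = []  # start a fresh subset after each search
    if subset:
        # Tiles left over after the loop form the remaining subset.
        saved.extend(search_subset(subset))
    return saved
```

Batching in this way bounds the number of tiles handed to any single recursive search, which is consistent with the embodiments in which saved results become available before the whole search has finished.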

In one embodiment, each magnification level corresponds to one level in a quadtree, and each level in the quadtree contains tiles representing image results. In one embodiment, a child tile is associated with a parent tile. In one embodiment, the parent tile is at least one of a downsampling and a low-pass spatial filtering of at least one corresponding child tile. In one embodiment, retrieving tiles at the next lower magnification level includes generating children at the next lower level in the quadtree. In one embodiment, the query includes at least one of a minimum threshold and a maximum threshold on the number of results. In one embodiment, the threshold on the number of results is based on system resources. In one embodiment, the query includes an image, and the magnification level of the query is the magnification level of that image. In one embodiment, a result tile is included in the returned list based on at least one of the magnification of the result tile, the image of the query, a filename of an index associated with the result tile, a result size, and an index type. In one embodiment, the query includes a time limit within which the search is to be performed. In one embodiment, the query includes a quality level threshold. In one embodiment, the predetermined first threshold is defined such that the method terminates in response to determining that the number of search results falls below a certain number. In one embodiment, the predetermined first threshold is defined to correspond to a number of magnification levels. In one embodiment, the predetermined first threshold is defined such that depth-based search is not used. In one embodiment, a magnification level has a higher resolution than any lower magnification level. In one embodiment, a magnification level has twice the resolution, in each dimension, of the next lower magnification level. In one embodiment, the query is updated after the first list of result tiles is returned. In one embodiment, the method further includes excluding results from at least one of the first list of result tiles and the second list of result tiles before processing each list of result tiles. In one embodiment, processing each list of result tiles further comprises clearing the subset after saving the results of each recursive search, and clearing the remaining subset after saving the results of the recursive search of the remaining subset. In one embodiment, the recursive search is a depth-first search. In one embodiment, the saved results are available before the search has finished.
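The parent/child relationship in the quadtree can be sketched as follows. This is an illustrative assumption rather than the disclosed implementation: tiles are modeled as square lists of grayscale values with even side lengths, the function names are hypothetical, and 2x2 block averaging is used as a simple stand-in for low-pass filtering followed by downsampling.

```python
from typing import List

Tile = List[List[float]]  # rows of grayscale pixel values (hypothetical model)


def downsample_2x(pixels: Tile) -> Tile:
    """Average each 2x2 block: a crude low-pass filter plus decimation."""
    height, width = len(pixels), len(pixels[0])
    return [
        [
            (pixels[y][x] + pixels[y][x + 1]
             + pixels[y + 1][x] + pixels[y + 1][x + 1]) / 4.0
            for x in range(0, width, 2)
        ]
        for y in range(0, height, 2)
    ]


def make_parent(nw: Tile, ne: Tile, sw: Tile, se: Tile) -> Tile:
    """Stitch four equally sized child tiles and downsample the mosaic.

    The parent has the same pixel dimensions as one child but covers twice
    the extent in each dimension, so each child level has twice the
    resolution per dimension of its parent level.
    """
    top = [left + right for left, right in zip(nw, ne)]
    bottom = [left + right for left, right in zip(sw, se)]
    return downsample_2x(top + bottom)
```

For four uniform 2x2 children with values 0, 4, 8 and 12, `make_parent` returns the 2x2 parent `[[0.0, 4.0], [8.0, 12.0]]`.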

In one embodiment, a method and system for performing a recursive search of a tile set based on a query tile includes: for each tile of the tile set, until a result set is populated, retrieving a set of tiles from the next level and adding the next-level tile set to the result set; if the magnification level is determined to be a predetermined target level, evaluating the quality of the matches in the result set; and if the magnification level is determined to be below the target level: for each tile in the result set, in response to determining that the number of tiles in a subset is greater than or equal to a third threshold, adding the tile to the subset, and, in response to determining that the number of tiles in the subset is less than the third threshold, recursively searching the subset, adding the results of the recursive search to a provisional result set, and clearing the subset; then recursively searching the subset, adding the results of the search to the provisional result set, clearing the subset, and returning the provisional result set.

In one embodiment, the query tile is included as a first child tile of a first tile in the tile set. In one embodiment, evaluating the quality of a match comprises determining whether a match value of a first tile in the result set, compared with the query tile, is less than a predetermined value. In one embodiment, the predetermined value is 50%. In one embodiment, the quality of a match is based on a difference between vectors. In one embodiment, the difference between vectors is based on the distance between the vectors of the respective tiles. In one embodiment, the difference between vectors is based on the mean squared error between the vectors of the respective tiles. In one embodiment, each pixel in a tile has at least one numerical value representing at least one of the color and the brightness of that pixel, and each tile has a vector of the at least one numerical value for all pixels in that tile. In one embodiment, the query includes the predetermined value for evaluating the quality of a match. In one embodiment, the returned provisional result set is sorted based on the degree of matching with the query. In one embodiment, the provisional result set is sorted based on the degree of matching with the corresponding parent tile.
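A minimal sketch of a match-quality evaluation based on the mean squared error between tile vectors follows. The text does not say how an error is converted into a percentage match value, so the normalization in `match_value` (dividing by the square of an assumed maximum pixel value of 255) is purely an illustrative assumption, as are all function names.

```python
from typing import List, Sequence


def tile_vector(tile: Sequence[Sequence[float]]) -> List[float]:
    """Flatten a tile's per-pixel values into a single vector."""
    return [value for row in tile for value in row]


def mean_squared_error(a: Sequence[float], b: Sequence[float]) -> float:
    """Mean squared error between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def match_value(a: Sequence[float], b: Sequence[float],
                max_value: float = 255.0) -> float:
    """Map MSE to a score in [0, 1]; 1.0 means identical (assumed mapping)."""
    return 1.0 - mean_squared_error(a, b) / (max_value ** 2)


def is_below_threshold(a: Sequence[float], b: Sequence[float],
                       threshold: float = 0.5) -> bool:
    """True when the match value is less than the predetermined value."""
    return match_value(a, b) < threshold
```

A sorted provisional result set could then be obtained by sorting candidate tiles on `match_value` against the query tile's vector, in descending order.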

In one embodiment, a computer-implemented processing execution plan comprises: at least one selectable probe feature specification that includes at least the spatial location and extent of an image feature; at least one target specification that includes a set of images, the set of images including at least one image of a microscope slide; and a traversal plan that includes a comparison order and a comparison operator for generating correlation samples between the at least one selectable probe feature and the at least one target specification, wherein the correlation samples include a similarity method, a similarity measure, the at least one selectable probe feature specification, the at least one target specification, and the extent of the image feature. In one embodiment, the traversal plan includes at least a method of ordering samples and applying a similarity measure to establish a correlation with the at least one selectable probe feature specification, and data including the correlation is held in persistent computer memory available to the traversal plan, such that the processing execution plan is adapted by the traversal plan to evaluate correlations. In one embodiment, the traversal plan is based on evaluating the samples in an order defined by at least one of statistically uniform sampling with a uniform grid, a quadtree decomposition of the slide, an embedded zerotree of the slide, exhaustive sampling, sparse sampling, and scale- and proximity-biased sampling.
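Two of the sample orderings named above can be sketched in a few lines. The sketch is illustrative only: the function names and the coordinate conventions are assumptions, a stride of 1 is used to stand for exhaustive sampling and a larger stride for sparse sampling, and the quadtree ordering is assumed to run coarse to fine.

```python
from typing import Iterator, List, Tuple


def grid_order(width: int, height: int, stride: int) -> List[Tuple[int, int]]:
    """Uniform-grid sample positions in row-major order.

    stride == 1 visits every position (exhaustive); a larger stride
    gives a sparse, uniformly spaced sampling.
    """
    return [(x, y)
            for y in range(0, height, stride)
            for x in range(0, width, stride)]


def quadtree_order(levels: int) -> Iterator[Tuple[int, int, int]]:
    """Yield (level, column, row) coarse to fine.

    Level n of the quadtree is assumed to have 2**n tiles per side.
    """
    for level in range(levels):
        tiles_per_side = 2 ** level
        for row in range(tiles_per_side):
            for column in range(tiles_per_side):
                yield (level, column, row)
```

Because a traversal plan only fixes the order in which samples are compared, either generator could be swapped for the other without changing the comparison operator.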

In one embodiment, a transitional bias is applied adaptively, such that the correlation with each piece of data can be predicted using at least one correlation with previously correlated data. In one embodiment, online machine learning is used to bias the sampling and the traversal plan. In one embodiment, relevance feedback from the user is used to bias sampling during execution of the traversal plan. In one embodiment, the traversal plan defines result parameters that can be used to determine the samples returned as part of a result set, the parameters including magnification scale and spatial extent, and the processing execution plan defines the order in which samples are evaluated, that order determining the rate at which samples of the result set are returned.

In one embodiment, scale-based dependency trees are defined based on the isolation of each evaluation state, and the scale-based dependency trees are distributed to separate processing elements for parallel evaluation. In one embodiment, defined partitions of the data are provided independently of other data partitions, and intermediate sets of data are generated for partial results. In one embodiment, the partitioned data and the processing specification are stored on the same storage device. In one embodiment, the partitioning is based on the number of result samples returned per sample evaluation, and the partition size undergoes at least one of an increase and a decrease. In one embodiment, a transformation process applies at least one image processing transformation to the result set, and the output of the transformation process includes transformed samples stored in persistent storage.

In one embodiment, probe samples and target samples are selected from the available transformed samples and secondary samples so as to form a traversal plan whose execution returns secondary result samples. In one embodiment, the secondary result set samples are used to adaptively bias the primary traversal plan, and a strong correlation based on the secondary result set indicates that the corresponding samples in the primary traversal plan should be evaluated preferentially. In one embodiment, the secondary result samples are adaptively biased such that the secondary result samples are further transformed to generate transformed tertiary samples, and the adaptive biasing of the primary result set by the secondary result set extends to the biasing of the secondary result set by the tertiary result set. In one embodiment, adaptive biasing between upstream plans and downstream plans is used in a chain. In one embodiment, the topography of a graph is used for the adaptive biasing.

In one embodiment, a computer-implemented method of continuously processing a repository of image data includes: receiving a query specification that includes a request for data; receiving a system specification of the computer executing the method; comparing the query specification with the system specification to determine a domain specification; initiating a query against the repository based on the domain specification; receiving results of the query that include image data; displaying an interactive and iterative exploration of the result image data in a graphical user interface; receiving input on the result image data via the graphical user interface; updating the query based on the received input; and displaying the updated graphical user interface based on the updated result image data. In one embodiment, the repository of image data includes digital microscopy data. In one embodiment, the repository of image data includes tissue image data at a scale such that approximations of the data at the coarse scale are uncorrelated with the data at the fine scale. In one embodiment, the continuous processing of the image data generates results incrementally, such that results are available before the processing has completely finished. In one embodiment, the query specification implicitly defines indices and transformations of the data.

In one embodiment, a computer-implemented method of transforming image data in a data repository based on a query includes: receiving a query for data in the data repository, the query including at least one probe tile and at least one group of slides against which the query is executed; recursively searching the data repository through each magnification level for as long as the overlap between the query and the probe tile is spatially relevant; refining the query based on the results of the recursive search; generating a traversal plan for a target slide based on the recursive search; and transforming data based on the data returned by the query, wherein the transforming comprises adjusting at least one of individual pel positions of the data and the color depth of the data.

In one embodiment, an overlap is deemed spatially relevant in response to determining that there is an overlap of at least 256 pixels. In one embodiment, the query includes a search predicate and a query target. In one embodiment, a probe feature specification includes the search predicate specified as a region of interest, the region of interest including a location on an image with a specified extent. In one embodiment, the probe feature specification includes at least one tile having a set of images that the probe feature specification targets in generating search matches. In one embodiment, a traversal plan includes the order in which targets are compared with at least one probe feature specification. In one embodiment, each of the probe feature specification and the target is at least one of a tile, a single image, and a sub-image of a microscope slide image at a given magnification level.
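The 256-pixel overlap test can be sketched as a rectangle intersection. The text does not state whether the 256 pixels refer to an overlap area or to a linear extent; the sketch below assumes area, and the `Rect` type and function names are hypothetical.

```python
from typing import NamedTuple


class Rect(NamedTuple):
    """Axis-aligned region in pixel coordinates (hypothetical type)."""
    x: int
    y: int
    width: int
    height: int


def overlap_pixels(a: Rect, b: Rect) -> int:
    """Number of pixels shared by two rectangles (0 if they are disjoint)."""
    overlap_w = min(a.x + a.width, b.x + b.width) - max(a.x, b.x)
    overlap_h = min(a.y + a.height, b.y + b.height) - max(a.y, b.y)
    if overlap_w <= 0 or overlap_h <= 0:
        return 0
    return overlap_w * overlap_h


def is_spatially_relevant(a: Rect, b: Rect, min_pixels: int = 256) -> bool:
    """Assumed reading: relevant when the overlap area is >= min_pixels."""
    return overlap_pixels(a, b) >= min_pixels
```

For example, two 32x32 regions offset by 16 pixels in each direction share a 16x16 area of exactly 256 pixels and therefore pass the test under this reading.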

A flow diagram illustrating a method of interactive data exploration in an exemplary embodiment.
A flow diagram illustrating an exploration lifecycle in an exemplary embodiment.
A block diagram illustrating a query unit in an exemplary embodiment.
A block diagram illustrating an analysis unit in an exemplary embodiment.
A diagram illustrating the architecture of a query unit at an intermediate stage in an exemplary embodiment.
A block diagram illustrating components of a repository in an exemplary embodiment.
A diagram illustrating an exemplary embodiment.
A diagram illustrating an exemplary embodiment.
A diagram illustrating an exemplary embodiment.
A diagram illustrating an exemplary embodiment.
A diagram showing an example of a slide layout in one embodiment of the present invention.
A diagram illustrating an example of feature comparison for slides or tiles in an exemplary embodiment.
A diagram illustrating an example of spatial decomposition of tiles in an exemplary embodiment.
A diagram illustrating an example of scale decomposition of tiles in an exemplary embodiment.

To address problems arising from the semantic gap, methods, systems, and computer-readable instructions (stored on a storage medium) for continuously processing data (e.g., image data) are provided. In one embodiment, the method is driven by a query (which represents the user's intent), the capabilities of the system, and guidance of the user by the system. The method provides interactive and iterative exploration of data through at least one of querying, analyzing, and processing the data. For example, the present invention provides exploration of image data focused on the unique requirements of tissue image data and of other biological image data. The invention has a variety of possible applications. For example, it can be applied to any image in any industry, including photography and satellite imagery.

In one embodiment, the exploration method and system provide the user with immediate feedback on the query scope and the query results. This makes it easy to refine the query immediately. The method further allows the user to specify the analysis and processing to be performed on the image data results returned from the query.

In one embodiment, the query specification implicitly defines derived indices of the data and data transformations. In one embodiment, the definition produces results for the query. These results are responsive to the results from that definition. In one embodiment, the results are computed in advance, computed on demand, and/or computed during a previous exploration. In one embodiment, the results are returned incrementally based on at least one of user experience requirements, the user's query specification or refinement, and system capabilities. In one embodiment, multiple query, analysis, and processing steps are chained together by iterative processing of each of these steps. The combination of these steps forms a processing pipeline.

In one embodiment, the components of the system and method provide a means for the system to resolve the processing pipeline continuously. In one embodiment, the results of the processing are generated incrementally and provided to the user and/or to later stages of the pipeline. This arrangement is advantageous because it is not necessary to process all of the data completely in a single pipeline stage. For example, results are provided each time they are found by at least one processor. When a given result is selected as more relevant, the query is updated with this information, and subsequent searches and results by the at least one processor take the updated query into account. The at least one processor then continues the search that was already under way. In an alternative embodiment, the search is started anew when the query is changed.
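Incremental delivery of results, with a query that can be updated while the search continues, maps naturally onto a Python generator. The sketch below is an illustration under stated assumptions: tiles are dictionaries with a numeric `feature`, the query is a mutable dictionary with `target` and `tolerance` keys, and all of these names are hypothetical.

```python
from typing import Dict, Iterable, Iterator


def incremental_search(tiles: Iterable[Dict[str, float]],
                       query: Dict[str, float]) -> Iterator[Dict[str, float]]:
    """Yield each matching tile as soon as it is found.

    The query is re-read for every tile, so changes made by the consumer
    between results affect the remainder of the same search.
    """
    for tile in tiles:
        if abs(tile["feature"] - query["target"]) <= query["tolerance"]:
            yield tile


def run_with_feedback() -> list:
    """Consume results one at a time, tightening the query after the first."""
    tiles = [
        {"id": 0, "feature": 1.0},
        {"id": 1, "feature": 1.4},
        {"id": 2, "feature": 1.1},
        {"id": 3, "feature": 2.0},
    ]
    query = {"target": 1.0, "tolerance": 0.5}
    found = []
    for result in incremental_search(tiles, query):
        found.append(result["id"])
        if len(found) == 1:
            # Simulated relevance feedback: narrow the tolerance mid-search.
            query["tolerance"] = 0.2
    return found
```

With the feedback applied after the first result, the tile with feature 1.4 is no longer accepted and the search continues without restarting. The alternative embodiment, in which a changed query restarts the search, would instead create a new generator over the full tile set.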

In one embodiment, the system and method include archive, storage, transfer, and analysis operations, prioritized in that order. The archive operation includes the retention and replication of data (e.g., tissue image data). This ensures that the data can be kept in long-term storage without being moved frequently. Performing processing on data as archived involves moving the computation so that it is local to the data, rather than moving the data so that it is local to the computation. The storage operation provides access to the data in a multi-scale decimated representation. The storage operation organizes the data so as to reduce access latency and to manage the storing and loading of derived data. The transfer operation limits the need to transmit data, that is, derived data. For example, when the derived data is small, the transfer operation defers the transfer of data to downstream processing, performs the analysis and transformation of the data local to the data, and returns the results.

FIG. 1 is a flow diagram illustrating a method and system 100 for interactive data exploration in an exemplary embodiment. The method and system 100 includes a sequence of processes for exploring information in a repository. At step 102, a user specification is defined based on the user's data extraction requirements. At step 104, the user specification defined at step 102 is intersected with domain-specific information in the repository. This generates an associated domain specification at step 106. A query against the repository is initiated using the combination of the user specification (step 102) and the domain specification (step 106). This generates query results at step 108. The generated results are fed to an analysis process. At step 110, the analysis process generates an analysis of the results.

FIG. 6 is a block diagram illustrating components 600 of a repository in an exemplary embodiment. In this example, the repository components include the underlying data 602 of the repository. The underlying data 602 is a quadtree decomposition of slide image data; specifically, each parent layer is subsampled once, and each layer is divided (e.g., decimated) into tiles that correspond to the component tiles of the tree. Tile index generation 604 is defined based on at least one feature extraction 608. Feature extraction 608 is isolated to the state of each individual tile. A correlation index 610 is defined based on index similarity and indicates the correlation between two or more tiles based on at least one index 604. The underlying data 602 is a transformation 606 of imported data.

FIG. 2 is a flow diagram illustrating a repository exploration lifecycle 200 in an exemplary embodiment. In one embodiment, the exploration is a combination of queries and analyses executed in sequence by at least one query unit 202 and at least one analysis unit 216. For example, the at least one query unit 202 includes a query target, for which an index 208 and/or a correlation 210 is identified, e.g., based on feature extraction. The index and correlation information is stored in a repository 212. Results 206 obtained via the at least one query unit 202 are sent to the analysis unit 216 and/or a refinement unit 204. After the analysis unit 216, the results are sent to a transformation unit 214, and after the transformation unit 214 they are sent to the repository 212. Various information stored in the repository 212 is sent to the at least one query unit 202. Refinements 204 of the results 206 are also sent to the query unit to update the search.

In one embodiment, the at least one query unit 202 includes a search predicate and a query target. As used herein, a "probe" or "probe feature specification" refers to the search predicate specified as a region of interest. A region of interest is, for example, a point on an image together with a specified extent. As used herein, a "target specification" refers to the set of target images that the probe is matched against in generating search matches, and/or the set of images (e.g., digital images) that make up the tiles of at least one microscope slide. In one embodiment, once at least one target image or target tile is selected, that selection becomes a probe. As used herein, a "traversal plan" refers to the construction of the order in which targets from the target specification are compared with at least one probe. In one embodiment, each individual comparison is between one probe and one target. In one embodiment, the comparison is between one or more probes and one or more targets. In one embodiment, the probe and the target are each at least one of a tile, a single image, and a sub-image of a microscope slide image at a particular scale or magnification. In one embodiment, the traversal plan is a breadth-first traversal of a multi-scale quadtree decomposition of the microscope slide image. In one embodiment, the traversal plan is a depth-first traversal of that decomposition. In one embodiment, the traversal plan is a depth traversal followed by a breadth traversal. In one embodiment, the traversal plan is a breadth traversal followed by a depth traversal.

In one embodiment, a user (e.g., a pathologist, an administrator, a processor) selects at least one probe tile to be used in a query against a repository of tissue images. The user also selects the group of slides against which the query is run. When the query is executed, features are extracted from the at least one probe tile image based on the query specification; for example, color histogram feature extraction is used. In one embodiment, once a feature is extracted it is persisted in long-term storage so that it need not be recomputed. A collection of at least one feature vector persisted on disk is defined as an index. Feature vectors are also extracted from the lower magnification scales of the slide that contain the probe tile. In one embodiment, this process recurses through the lower magnification scales until it reaches the last level (e.g., the highest level) at which the overlap with the probe tile is still spatially relevant. In one embodiment, spatial relevance is defined as an overlap of 256 pixels. The features extracted from the highest-level probe-overlapping tile and from the probe-overlapping tiles at the intermediate magnification levels for an individual probe tile are defined as the scale probe tile set for that probe tile. In one embodiment, the collection of all tile sets for all probe tiles (collectively, the total probe tile set) is used to generate a traversal plan for the target slides. In one embodiment, the members of the total probe tile set are grouped by magnification level. These groups (e.g., the features extracted from the tile sets) are used to order the extracted features of the candidate target tiles when generating the traversal plan. In one embodiment, features are extracted from the target tiles at the corresponding magnification level and compared with the probe set, and the target candidates are ordered by a similarity measure on the feature vectors. In one embodiment, the comparison operator is the L2 norm between the two vectors. In one embodiment, the traversal plan is this ordered list: it is traversed in order of similarity, and the tiles at the next higher magnification are compared with the corresponding probe set tiles in the chosen feature space (in this example, the color histogram). Those results are ordered in turn, and the recursion continues to progressively higher magnification levels. In one embodiment, the traversal plan is specified as breadth-first or depth-first, which determines whether higher magnification levels are processed recursively only after every evaluation at the current magnification level has completed (breadth-first) or before that point (depth-first). A depth-first plan, for example, has the advantage of delivering results to the user with low latency, because fewer evaluations are performed.
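The ordering step described above can be sketched as follows. This is only an illustration under stated assumptions: the function names (`color_histogram`, `l2_distance`, `rank_targets`), the 8-bin-per-channel layout, and the use of plain RGB tuples are hypothetical and are not taken from the specification.

```python
import math

def color_histogram(pixels, bins=8):
    """Normalized per-channel histogram of (r, g, b) pixels with 8-bit channels.

    The bin count and layout are illustrative assumptions.
    """
    hist = [0.0] * (3 * bins)
    for pixel in pixels:
        for channel, value in enumerate(pixel):
            hist[channel * bins + min(value * bins // 256, bins - 1)] += 1.0
    total = float(len(pixels)) or 1.0  # avoid dividing by zero for an empty tile
    return [count / total for count in hist]

def l2_distance(a, b):
    """L2 norm of the difference between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def rank_targets(probe_feature, target_features):
    """Return target tile ids ordered from most to least similar to the probe.

    target_features maps a tile id to its feature vector.
    """
    return sorted(
        target_features,
        key=lambda tile_id: l2_distance(probe_feature, target_features[tile_id]),
    )
```

The ordered list returned by `rank_targets` plays the role of the traversal plan at one magnification level; the same ranking would be repeated on the tiles at the next higher magnification.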

In one embodiment, the results returned by executing the query with the traversal plan are displayed in a user interface. In one embodiment, at any point while results are displayed, the user can choose to change the query parameters by adding or removing probe tiles, adding or removing target slides, or entering relevance feedback on the returned results. In one embodiment, when probe tiles or slides are added or removed, simple set operations are applied to the existing set of probe tile features, and the traversal plan is modified accordingly. In one embodiment, the relevance feedback not only changes the traversal order but also acts as a biasing factor in the similarity measure. In the user interface, relevance feedback is given as a plus (+) for positive feedback or a minus (-) for negative feedback.

In one embodiment, while the query is being executed with the traversal plan, the user can specify at least one additional query that targets the results of the first query. Specifically, the later query operates on the results of the earlier query and searches that result set. In one embodiment, the primary query is based on color histogram feature extraction and the secondary query is a more complex feature-based query (e.g., one based on texture characterization). In one embodiment, the secondary query is based on texture orientation (e.g., sparse Gabor histogram feature extraction).

In one embodiment, the user specifies an analytic transformation to be performed on the results of a query. Specifically, as query results are generated, a transformation process converts each result tile into a transformed version. In one embodiment, the transformation performs image-processing morphological operations that dilate or erode image features in the result tiles to show spatial support. In one embodiment, the transformation process is a deconvolution that separates the colors corresponding to the stains used in preparing the imaged tissue. In one embodiment, a tissue quantification process is performed that identifies and locates tissue structures such as cell nuclei, stroma, and glands. In one embodiment, the location and identification of such structures is used to transform the result tiles and to amplify the targeted cellular structures.
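As a rough illustration of the morphological transformation mentioned above, the following sketch dilates and erodes a small binary tile with a 3x3 neighborhood. The binary list-of-lists representation, the 3x3 structuring element, and the clipping of the neighborhood at tile borders are all assumptions made for the example; the specification does not fix any of them.

```python
def _morph(image, combine):
    """Apply `combine` (max or min) over each pel's 3x3 neighborhood.

    The neighborhood is clipped at the tile border (an illustrative choice).
    """
    rows, cols = len(image), len(image[0])
    out = []
    for r in range(rows):
        row = []
        for c in range(cols):
            neighborhood = [
                image[rr][cc]
                for rr in range(max(r - 1, 0), min(r + 2, rows))
                for cc in range(max(c - 1, 0), min(c + 2, cols))
            ]
            row.append(combine(neighborhood))
        out.append(row)
    return out

def dilate(image):
    """Grow foreground features: a pel becomes 1 if any neighbor is 1."""
    return _morph(image, max)

def erode(image):
    """Shrink foreground features: a pel stays 1 only if all neighbors are 1."""
    return _morph(image, min)
```

Both functions return a tile with the same dimensions as the input, so a transformed tile can be handled like an original tile.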

In one embodiment, as query results are generated incrementally, they are passed to a defined analysis process that produces at least one transformed result for each returned result. In one embodiment, this process retains the transformed results for further processing and query operations, so that the transformed tiles can be used in the same ways as the original tiles. For example, at least one transformed tile can be designated as a query tile, and each time a transformed result is generated, a query operation is run over the transformed results. The transformed results themselves derive from the initial query operation on the original tiles. In one embodiment, the relevance feedback operation and/or the operation of adding query tiles can be performed while a query is executing.

In one embodiment, combinations of queries and analytic transformations are chained: the output of a query process becomes the input of a transformation process, and the output of that transformation process becomes the input of another query process. In one embodiment, this chain of query and analysis processing is practically unlimited in length. In such a processing chain, for example, the process begins when the user selects several query tiles from the base layer of an image. The user then selects the slides that the query targets and chooses the index to be used for the query (e.g., an index based on color histograms). When the user chooses to execute, the query runs and begins producing results. The user specifies that an analysis be run on those results and that the analysis perform color deconvolution on the result tiles, so that each tile is converted into other tiles (e.g., hematoxylin and eosin (H&E) stain tiles), and the transformed results are generated and presented to the user. Each time the query process produces further results, they are automatically transformed and presented to the user. The user then selects one of the transformed tiles (e.g., the one corresponding to the hematoxylin stain) as a query tile. The index for this query is chosen based on texture feature extraction. When this query runs, its results are determined from the transformed hematoxylin tiles. As each step in the chain produces further results, the process returns further results.
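One way to picture such a chain is as a pipeline of lazy stages, where each stage consumes results as soon as the previous stage produces them. The sketch below uses Python generators for this; the stage representation (`("query", predicate)` or `("transform", function)`), the function names, and the dictionary-based tiles are hypothetical and serve only to show the incremental hand-off.

```python
def run_query(tiles, predicate):
    """Yield each tile that satisfies the query predicate, one at a time."""
    for tile in tiles:
        if predicate(tile):
            yield tile

def transform(results, fn):
    """Yield a transformed version of each result as it arrives."""
    for tile in results:
        yield fn(tile)

def run_chain(tiles, stages):
    """Chain query and transform stages; nothing runs until results are pulled.

    stages is a list of ("query", predicate) or ("transform", function) pairs.
    """
    stream = iter(tiles)
    for kind, fn in stages:
        stream = run_query(stream, fn) if kind == "query" else transform(stream, fn)
    return stream
```

Because every stage is a generator, the first final result can be delivered before the first query has examined all tiles, which mirrors the incremental behavior described above.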

In one embodiment, the processes described herein are executed when a new slide is added to the image repository. In one such embodiment, the similarity measure of the final results is thresholded, and results that exceed an alert threshold are sent to an alert generation system. In one embodiment, adding a slide to the repository triggers a processing pipeline, and an alert notifies the user of the new slide and of the result set tiles associated with that pipeline so that they can be evaluated. This is an automated screening process for scanned slides: it uses existing processing pipelines to process slides automatically as they are added to the system and to generate notifications or alerts for further processing. Such automated processing can be applied to screening slides for a variety of purposes, including abnormal tissue detection and quality assurance.
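The thresholding step can be sketched in a few lines. The function name, the record layout, and the default threshold of 0.9 are illustrative assumptions; the specification does not state a particular alert threshold.

```python
def alerts_for_new_slide(slide_id, results, alert_threshold=0.9):
    """Return alert records for result tiles whose similarity exceeds the threshold.

    results is a list of (tile_id, similarity) pairs produced by the pipeline
    that ran when the slide was added.
    """
    return [
        {"slide": slide_id, "tile": tile_id, "similarity": similarity}
        for tile_id, similarity in results
        if similarity > alert_threshold
    ]
```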

In one embodiment, query tiles are specified from a source slide. A scanned specimen slide is stored as a tiled arrangement at a series of different scales. The highest scale is the native magnification at which the slide was imaged, typically 40x optical magnification. The high-scale tiles are subsampled to lower scales, each lower scale having half the magnification of the one before it. For example, if the highest scale is 40x, the next is 20x, followed by 10x, 5x, 2.5x, 1.25x, 0.62x, and 0.31x. At the lowest scale, a typical tissue sample occupies, for example, 4 to 8 tiles. Each tile is 256x256 pels. In one embodiment, the target slides to be searched for matches to the query tile are specified. For example, the order in which target tiles are compared with the query tile is an exhaustive traversal of the target tiles at the same magnification as the query tile, using a comparison operator based on the L2 norm of the pels of the tiles being compared. In one embodiment, a multi-scale search and comparison is performed in which at least one available lower magnification scale is used to generate match hypotheses, which are then verified by stepping up in magnification until the target magnification is reached.

The magnification halves at each successive scale, and the number of pels along each dimension halves with it. The set of tiles at each scale is therefore a subsampling of the next higher magnification scale, and one tile at a given scale corresponds to four tiles at the next higher magnification. This correspondence matches a quadtree decomposition of the original full-scale slide image at the highest magnification. The present invention traverses this quadtree, in which each tile is a node of the tree and each spatial correspondence is an edge.
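The scale pyramid and the one-to-four tile correspondence can be written down directly. In this sketch the `(level, column, row)` addressing, with level 0 as the lowest magnification, is an assumed convention chosen for illustration; the specification does not prescribe a tile addressing scheme.

```python
def scale_pyramid(base_magnification=40.0, levels=8):
    """Magnifications from the native scale downward, halving at each step."""
    return [base_magnification / (2 ** i) for i in range(levels)]

def children(level, col, row):
    """The four tiles at the next higher magnification covered by this tile."""
    return [
        (level + 1, 2 * col + dc, 2 * row + dr)
        for dr in (0, 1)
        for dc in (0, 1)
    ]

def parent(level, col, row):
    """The tile at the next lower magnification that covers this tile."""
    return (level - 1, col // 2, row // 2)
```

With the defaults, `scale_pyramid()` yields 40, 20, 10, 5, 2.5, 1.25, 0.625, and 0.3125; the last two are the exact values of the scales that the text abbreviates as 0.62x and 0.31x.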

In one embodiment, the traversal of such a quadtree for multi-scale search uses a match between spatially corresponding tiles at a low magnification as an indication that match candidates exist at the higher magnifications, up to the magnification of the query tile. In one embodiment, the base traversal compares the corresponding tiles at the lowest magnification with the query tile in order to prioritize further comparisons among the subtrees rooted at those lowest-magnification tiles. In one embodiment, the tiles at the lowest magnification level are ordered by a similarity measure. In one embodiment, at least one of the least similar tiles is excluded because its similarity falls below a given threshold. In one embodiment, this threshold is a correlation of 0.70 between the feature vectors derived from the tiles. In one embodiment, the feature vector for each tile is a 32-bin histogram based on the sum of the spectral components of that tile. For each retained tile, the four corresponding tiles at the next higher magnification are added to a new collection of tiles. This collection is evaluated by the same process as the previous level, and the recursion continues until the base magnification is reached. In one embodiment, the correlation threshold starts lower, at 0.50, and increases incrementally at each magnification level up to a maximum of 0.80. The embodiment described here is a breadth-first traversal of the quadtree: the lower-magnification counterparts of the query tile are compared with the corresponding search tiles at the current level before moving to the next higher magnification. This is equivalent to recursively processing the next level, based on all matches at the current level, in a single batch. In one embodiment, the recursion is instead performed on multiple batches obtained by dividing this single batch evenly, and each of these smaller batches is recursively processed to the higher magnification levels. In an embodiment where the batch size is one tile at the current level, the traversal becomes a depth-first traversal of the quadtree. When the batch size at each level can vary between a single batch (breadth-first) and one tile per batch (depth-first), the traversal is referred to as adaptive traversal.
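The batched traversal described above can be sketched as a recursive generator. Several details here are assumptions made for illustration: Pearson correlation stands in for the similarity measure, the threshold rises linearly from 0.50 to 0.80 across levels, batches are formed with a fixed `batch_size` rather than by an even split, and `features`, `probes`, and `children` are hypothetical inputs (feature vector per tile id, probe feature vector per level, and a function returning a tile's child ids).

```python
import math

def correlation(a, b):
    """Pearson correlation of two equal-length feature vectors (0.0 if either is constant)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)

def level_threshold(level, max_level, low=0.50, high=0.80):
    """Threshold rising linearly from `low` at level 0 to `high` at max_level."""
    if max_level == 0:
        return high
    return low + (high - low) * level / max_level

def search(tiles, probes, features, children, level, max_level, batch_size=None):
    """Yield ids of matching tiles at max_level (the query tile's magnification).

    batch_size=None keeps each level's retained tiles in one batch (breadth-first);
    batch_size=1 recurses on one tile at a time (depth-first).
    """
    scored = [(correlation(probes[level], features[t]), t) for t in tiles]
    scored.sort(key=lambda pair: pair[0], reverse=True)  # most similar first
    threshold = level_threshold(level, max_level)
    kept = [t for score, t in scored if score >= threshold]
    if level == max_level:
        yield from kept
        return
    size = batch_size or max(len(kept), 1)
    for start in range(0, len(kept), size):
        batch = kept[start:start + size]
        next_tiles = [c for t in batch for c in children(t) if c in features]
        yield from search(next_tiles, probes, features, children,
                          level + 1, max_level, batch_size)
```

Both settings of `batch_size` visit the same subtrees; what changes is how soon the first full-magnification match is yielded relative to the remaining low-magnification comparisons.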

In one embodiment, adaptive traversal is set to breadth-first at the start of a search. As the search progresses, the computational cost is calculated as the number of similarity operations performed. In terms of this cost, for example, the number of search results returned per comparison determines the incremental result latency. As an example, if 200 similarity comparisons have been performed and 40 results have been returned at a correlation similarity measure of 0.90, the comparison-to-result ratio is 200:40, i.e., a result latency of 5 comparisons per result. In this situation, the breadth-first partition, which places all tiles at the current level in a single batch, is judged to be efficient. If the result latency rises above a given threshold (e.g., 50), this signals the adaptive traversal mechanism to increase the number of batches per level (e.g., to two). If the result latency still exceeds the threshold, the number of batches per level is increased to three. The threshold is set to whatever value the administrator or the system desires or requires. In one embodiment, the adaptive mechanism sacrifices per-level processing efficiency in order to pursue high-similarity results before moving on to low-similarity results (depth-first processing). At some point the number of batches may equal the number of tiles at the current level, which corresponds to a strict depth-first traversal of the quadtree.
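The latency rule in this paragraph reduces to a small amount of arithmetic. In the sketch below, the function names and the one-batch-at-a-time increment are assumptions chosen to match the example in the text (one batch, then two, then three); the cap at the number of tiles reflects the strict depth-first limit.

```python
def result_latency(comparisons, results):
    """Comparisons performed per result returned (infinite if nothing returned yet)."""
    return float("inf") if results == 0 else comparisons / results

def next_batch_count(current_batches, comparisons, results, tiles_at_level,
                     latency_threshold=50.0):
    """Add one batch per level while latency exceeds the threshold.

    The count is capped at the number of tiles at the current level, at which
    point the traversal is strictly depth-first.
    """
    if result_latency(comparisons, results) > latency_threshold:
        return min(current_batches + 1, max(tiles_at_level, 1))
    return current_batches
```

For the worked example in the text, `result_latency(200, 40)` is 5.0, which is below the example threshold of 50, so the search stays with a single batch.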

In one embodiment, the adaptive traversal incrementally subdivides the batch processing, increasing the number of batches per level, in order to reduce result latency. For example, if the process has reached the depth-first batch-partition limit and the result latency has nevertheless increased, the process gradually decreases the number of partitions used for the search in order to reduce result latency. In one embodiment, a computational budget is specified to determine the number of subtrees that are evaluated.

In one embodiment, a sampling function is provided for accessing the data (e.g., tissue image data). The sampling function defines the order in which the data is accessed. It constructs an access plan based on constraints predicted from the data itself and on the user's interactive actions. These constraints on the sampling function make up the query context of the present invention.

In one embodiment, sampling consists of an access plan based on a user specification and a system specification. The user specification includes the range of data to be searched in addition to any predicate specification. The system specification includes existing data and results remaining from past processing. The sampling function returns sets of partial results incrementally. The returned results are used to modify the user specification and the system specification within the current query context. A refinement operation interrupts processing so that the query is steered toward a more relevant result set (e.g., sample).

FIG. 3 is a block diagram illustrating a query unit 300 in an exemplary embodiment. The query unit 300 includes a domain (system) specification 302 and a user specification 310 for the query. The domain specification 302 is combined with the user specification 310 to form a predicate and/or target 304, which produces results 306. Based on the results, the query is refined and/or expanded (308), which modifies the user specification 310 of the query.

FIG. 4 is a block diagram illustrating an analysis unit 400 in an exemplary embodiment. The analysis unit 400 generates transformation results 406, 408 based on data returned from a query and/or user specification 410 of the repository. The transformed data takes the form of an image with the same dimensions and resolution as the data from which it was derived, in which individual pel positions and color depths have been changed by the transformation. A domain specification 402 examines the results and confirms and/or transforms them.

FIG. 5 is a diagram illustrating the architecture of a query unit 500 at an intermediate stage in an exemplary embodiment. An intermediate result (an "intermediate") is the result of a defined process that is executed to produce a result. The intermediate depends on the combination of the base state of the repository and the defined process.

In one embodiment, a region of interest ("ROI") 502 defines at least one adjacent pel in the repository that forms part of a tile 512 in the repository. The ROI 502 may be a complex polygon 522. Such a polygon 522 is defined on at least one tile, and its interior is interpolated to discrete positions on one or more tiles. The tile data is the image data decimated in scale and space, and it forms the basic unit of processing. A vector 514 is a feature vector 504, 516 extracted from a tile and is used to generate the index on which queries operate. A correlation intermediate 506 holds the correlation between two tiles as the similarity between their extracted feature vectors (e.g., the similarity produced when one of the two tiles is the query predicate and the other is a query result). The correlation itself is a feature index that can be searched. A tensor 526 is constructed to carry correlations over to other feature vectors. A layer 508 is any information derived from the data after transformation (518), filtering, and/or visualization. Quantification 524 is a quantitative incremental process involving, for example, aggregate tile processing (e.g., achieved through scale-based constraints). A measure 528 is any of a scale-based constraint, a threshold, a degree of correlation, an index, or other information.

In one embodiment, the structure and content of the data define a processing priority order, and this priority order determines the order in which data is presented to the user. This allows the user to understand both the structure and the content of the data. The priority order guides the user's exploration of the data, which reduces the need for prior knowledge of the data.

In one embodiment, processing is adjusted to the data returned: when a sparse set of results is returned, the sampling range of the data being searched is expanded. Similarly, in one embodiment, the sampling range is restricted when a large amount of data is returned. This restriction allows the data to be sampled broadly while preventing a large amount of data from being returned for a small, localized subset of the total data.
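The adjustment described above can be sketched as a simple feedback rule. This is a minimal illustration only: the function name, the two result-count thresholds, the doubling/halving factor, and the range limits are all assumptions made for the example and do not come from the specification.

```python
def adjust_sampling_range(current_range, result_count,
                          sparse_below=10, dense_above=1000,
                          factor=2, min_range=1, max_range=4096):
    """Widen the sampling range when results are sparse, narrow it when dense.

    All parameters are hypothetical tuning values chosen for illustration.
    """
    if result_count < sparse_below:
        # Sparse results: sample more broadly, up to a ceiling.
        return min(current_range * factor, max_range)
    if result_count > dense_above:
        # Too many results: restrict sampling, down to a floor.
        return max(current_range // factor, min_range)
    # Result volume is acceptable: leave the range unchanged.
    return current_range
```

Under these assumed values, a query that returns 3 results with a range of 64 would be retried with a range of 128, and one that returns 5000 results would be narrowed to 32.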

In one embodiment, spatial continuity gives rise to the presumption that spatially adjacent data is generally more closely related than distant data. Similarly, when nearby data is strongly correlated with distant data, data adjacent to that nearby data is, for the same reason, also strongly correlated with the distant data. In one embodiment, these relationships are used as a basis for predicting search constraints.

In one embodiment, usage statistics are used to extend the processing scope of data that is generated and accessed beyond that of other data. When data is generated and accessed infrequently, intermediate data is restricted and flushed. The frequency of access affects the ranking of data. IAPE (ranking data for user exploration) is adjusted, for quality assurance ("QA"), by a bias relative to rankings based on system processing (for example, a bias based on slide screening).

In one embodiment, the system and method provide a means to expand and/or restrict results as they are returned to the user. This online adjustment capability gives the user a means to interact with the query results. The capability is available to the user at any point in the query execution process, even after the query has finished. In such cases, the query is re-executed with the new constraints. In one embodiment, the query and its incremental results are analyzed to determine whether a given limit has been reached beyond which continued processing of the query is inefficient. If the limit is determined to have been reached, the query may be stopped and the user may be given an opportunity to modify the query.

In one embodiment, data partitioning and traversal are performed so as to optimally maintain storage and computational coherence. In one embodiment, the data is regularly partitioned into non-overlapping spatial regions (for example, blocks or tiles). In one embodiment, the data is subsampled and partitioned. In one embodiment, the partitioned data tiles are ordered based on scale and spatial proximity. The locality of each individual tile serves as the basic unit of computation. This basic unit is processed and the result is provided to the user.

In one embodiment, aggregate tile processing is achieved primarily by scale-based constraints, which use analysis at lower scales to test the suitability of the data processing order at higher scales. For example, the scale-based pyramid structure of tissue image data is represented as a quadtree decomposition of the image data. Parent-node similarity is used to test the suitability of this representation. As access to side information increases, a large number of incremental results become available, so that only part of the pipeline needs to be executed.

In one embodiment, the processing of the algorithm depends on the amount of results returned. A traversal policy determines how the traversal strategy is changed.

In one embodiment, where the organization of the image data storage and of its derived data can be represented by a quadtree decomposition, the granularity of each processing increment is based on the processing of a subtree of the quadtree. The computation required to return a result set is bounded using processing cost estimates for the subtrees. Isolating processing to subtrees makes it easier to apply distributed parallel processing to scale the traversal computation.
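One way to picture a subtree cost estimate is to count the tiles a subtree contains and compare that count against a budget. The sketch below assumes, purely for illustration, that every tile costs the same to process and that the subtree is a full quadtree; the function names and the `cost_per_tile` parameter are hypothetical and are not taken from the specification.

```python
def subtree_tile_count(depth):
    """Tiles in a full quadtree subtree spanning `depth` levels.

    The count is 1 + 4 + 16 + ... + 4**(depth - 1) = (4**depth - 1) / 3.
    """
    return (4 ** depth - 1) // 3


def max_depth_within_budget(budget, cost_per_tile=1):
    """Largest subtree depth whose estimated cost stays within `budget`.

    Assumes a uniform, hypothetical cost per tile.
    """
    depth = 0
    while subtree_tile_count(depth + 1) * cost_per_tile <= budget:
        depth += 1
    return depth
```

Because each subtree is costed independently, separate subtrees could in principle be handed to separate workers with their own budgets.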

In one embodiment, the traversal of the quadtree is defined to process subsets at each level, and the number of subsets at each level determines the degree to which the traversal is breadth-first or depth-first. A breadth-first bias samples a large amount of data and uses a large amount of computational resources before returning a result set. This manner of incrementing is advantageous when matching results are sparse and scale coherence is weak. In one embodiment, a breadth-first traversal is generally exhaustive and makes few assumptions about the distribution of matching samples, which suggests a weak predicate search hypothesis. In one embodiment, a depth-first bias samples a smaller amount of data, takes less computation time, and returns results in fewer increments. This is advantageous when results are dense and scale coherence is strong.

In one embodiment, exploring the data generates many partial solutions to later queries. These partial solutions represent an opportunity to provide results more quickly than results computed from less intermediate data. Because many of these partial results were generated by the activities of other users, a quality bias exists, and a subset of results carrying this bias is returned.

In one embodiment, this quality bias is not only available to improve the efficiency of returning results, but is also referenced (for example, counted) recursively, both per data item and as an overall ranking of data usefulness.

In one embodiment, the challenges of tissue image data are addressed by system responsiveness that enables user-specified refinement in addition to downstream processing in the pipeline. The specified refinement is used to modify the current query processing and thereby change the results produced by the query. Downstream processing operates on query results as they are produced and performs additional transformations on the data. Additional query processing is then performed. The responsiveness of query processing gives the user the flexibility to explore the data through query modification and further processing.

The underlying image data is structured and organized for incremental processing across spatial and spectral scales. On import, the data is normalized spatially and spectrally by a calibration process. Correlation of data is determined by the similarity of feature indices and is maintained in a correlation index. Data access and processing are estimated, and such access and processing are performed based on predicted and actual costs. A preliminary computation that approximates the result of the full computation is performed, providing a computational cost estimate and incrementally computing the final result. Results are rolled up and aggregated for future computations, and intermediate products are retained for update computations that allow the algorithms to operate online.

In one embodiment, the kernel uses a pyramid/hierarchical/quadtree data structure (for example, a multi-scale image pyramid) to facilitate progressive and mutually isolated computations. In one non-limiting embodiment, a tile is a square image with a spatial extent of 256 in each dimension. These tiles are generated from the original image by successive 2x subsampling of the original image, repeated until the dimensions of the recursively subsampled image are less than 256. The file system directories containing these tiles are likewise divided into groups of 256, that is, arrangements of 16×16 tiles. This file system organization provides an arrangement in which storage system locality is optimal for exploiting caching mechanisms. In addition, this isolation allows for a distributed file system in which processing of sub-regions can be performed without sharing context information between separate computing environments.
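The pyramid construction and directory grouping described above can be sketched as follows. This is an illustrative sketch under stated assumptions: the stopping rule is read here as "stop once the larger image dimension falls below 256", integer halving stands in for the actual subsampling filter, and the function names are hypothetical.

```python
def pyramid_levels(width, height, tile_size=256):
    """Image dimensions at each pyramid level, from full resolution down.

    Repeats 2x subsampling until the larger dimension is below `tile_size`
    (an assumed reading of the stopping condition).
    """
    levels = [(width, height)]
    while max(width, height) >= tile_size:
        width, height = max(1, width // 2), max(1, height // 2)
        levels.append((width, height))
    return levels


def tile_group(col, row, group_side=16):
    """Index of the 16x16 (256-tile) directory group containing a tile."""
    return (col // group_side, row // group_side)
```

For a 1024×1024 image this yields four levels (1024, 512, 256, and 128 pixels per side), and the tile at column 17, row 3 falls in directory group (1, 0). Grouping spatially adjacent tiles into one directory is what lets neighbouring reads benefit from storage locality.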

In one embodiment, feature vectors based on histogram bins are used in similarity comparisons between tiles. Such a feature vector represents an approximation of the contents of a tile, and this approximation is used for index generation. For tiles that are similar to one another in this spectral feature space, processing of those tiles is used to estimate the results of processing that involves more computationally expensive feature extraction.
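A minimal sketch of a histogram-bin feature vector and a similarity comparison is shown below. The specification does not name a particular similarity measure; histogram intersection is used here only as one common choice, and the function names, the bin count, and the 8-bit single-channel pixel range are assumptions.

```python
def histogram_feature(pixels, bins=8, max_value=256):
    """Normalized histogram of a flat sequence of pixel intensities.

    `pixels` holds integers in the range [0, max_value).
    """
    counts = [0] * bins
    for value in pixels:
        counts[value * bins // max_value] += 1
    total = len(pixels)
    return [count / total for count in counts]


def histogram_intersection(a, b):
    """Similarity in [0, 1]; 1.0 means the two histograms are identical."""
    return sum(min(x, y) for x, y in zip(a, b))
```

Because the histogram discards pixel positions, two tiles with the same intensity distribution compare as identical regardless of spatial layout, which is what makes the vector a cheap approximation suitable for indexing.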

In one embodiment, the scale aspect of the present approach emphasizes that the spectral feature vector, which approximates the spectral content of a tile, is a ranked approximation. For example, when the feature vector is the size of the tile, each feature vector position corresponds directly to a pixel. In one embodiment, having fewer bins than pixels necessarily means that the scale has been reduced relative to the original data. When histogram bins no longer correspond to pixel positions, spatial invariance is implied.

In one embodiment, the kernel operates online, performing coarse-grained computations and aggregating the results. The computational cost of performing those computations is factored into the scheduling of processing. Cost estimates are made for the subtrees of the access plan and are aggregated. These estimates allow the computation permitted for each subtree to be adjusted.

In one embodiment, the kernel is configured to operate incrementally on a pipeline of processing elements. These elements may be, for example, a query element followed by an analysis element. The query element applies at least one predicate pattern to a set of candidate patterns and returns results based on pattern matching conditions. Analysis is performed on the result set patterns to transform them in some way for at least one of visualization, further analysis, and query operations.

In one embodiment, the query itself generates intermediate products in that it generates indices, and these indices are used in the pattern matching process. In one embodiment, the analysis process generates intermediate products in the form of filtered or transformed input image data or quantitative measures.

In one embodiment, other intermediate products include those from approximation functions that generate approximated results. Intermediate products from online computations are also retained in order to speed up iterative processing. Such intermediate products from online computations are likewise regarded as intermediate products of the pipeline.

In one embodiment, retaining and using these intermediate products allows the kernel to take an alternative path that produces results without the computation needed to repeat these processes.

In one embodiment, data such as tissue image data has different structures at different scales. These structures are not necessarily structurally coherent across scales; that is, structural patterns are not necessarily repeated. The relationships between these patterns may be modeled as generative functions by which a macro-scale model can generate at least one micro-scale model. These models can be used to impose additional constraints on queries. Kernel-specific aspects of these models are that the kernel discovers such macro/micro models independently of the particular data, and that the kernel uses the macro/micro models to provide joint similarity across scales.

In one embodiment, with respect to data management, the usefulness of the system depends on the nature and organization of the data. In tissue image data applications, retention of the original image data takes priority over retention of derived data. The data management system configuration reflects archival priorities and defines storage, transfer, and analysis operations with reference to those priorities.

In one embodiment, the size of the data imposes a practical limit on data replication. Considering this size together with the long-term retention policy associated with the data, the requirements can be met by building the database around the archived data. The choice of archive format and data layout affects the capabilities of all downstream processing.

In one embodiment, data storage is based on the ability to manage multiple distinct layers of data derived from the data. Retention and flushing of this data are performed to meet storage and computational requirements, based on the ability to regenerate such data on demand.

In one embodiment, processing associated with transferring and distributing data is enabled by the physical grouping of data and by the ability to make references beyond the range resolved by a given addressing scheme. By placing limits on dependencies, the system gains an operational advantage when using processing in a distributed environment.

In one embodiment, the system has operations that are performed automatically based on both routine processing and user interaction.

FIGS. 7A and 7B are simplified flow diagrams illustrating a method of initiating a query to search data in an exemplary embodiment. As shown in FIG. 7A, after the search is started (block 705), if the current zoom level (or "magnification"; the terms are used interchangeably) of the query image is greater than a predetermined threshold Th1 (block 710), the search results for the query at the current level are returned (block 715).

In one embodiment, the threshold is set to limit unproductive search operations, for example to end the search when it is not producing many matches. As another example, the threshold is set to limit unproductive search operations before the pixels of the image cease to contain meaningful image information. As another example, the threshold is set to limit the depth of magnification levels, or another depth, in the search. For example, if a subset is so small that few matches are generated from it, the size of the subset is increased for a broader search.

In one embodiment, if the search results were precomputed or were computed during a past search, they are returned without performing a depth-based search.

In one embodiment, the search function is invoked recursively with different thresholds.

In one embodiment, each zoom level or magnification corresponds to one level of the quadtree. For example, at magnification level 1 the image consists of a single tile. The four children of that image are regarded as magnification level 2, the 16 tiles that are children of those four are regarded as zoom level 3, and so on. Each magnification level has a higher resolution than the previous one, typically twice the resolution in each dimension. Once the tiles reach the upper limit of resolution, children of those tiles exist only if pixels are interpolated. If such interpolated pixels are also used, they are regarded as the maximum zoom level or magnification. Because of the quadtree decomposition, a parent tile is a downsampled, low-pass spatially filtered version of its four child tiles, so the child tiles are coherent with their parent. For example, if a given color appears in a parent tile, the same color is likely to appear in the child tiles, or at least the parent's color can be derived from the children's colors by the downsampling process.
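The level structure described above can be made concrete with a short sketch. The `(level, col, row)` tile addressing and the 2×2 box average used as the low-pass downsampling filter are assumptions chosen for illustration; the specification does not fix either.

```python
def tiles_at_level(level):
    """Number of tiles at a magnification level (level 1 is the single root tile)."""
    return 4 ** (level - 1)


def children(level, col, row):
    """Addresses of the four child tiles of tile (level, col, row)."""
    return [(level + 1, 2 * col + dc, 2 * row + dr)
            for dr in (0, 1) for dc in (0, 1)]


def downsample_2x(pixels):
    """Halve a 2D grid of intensities by averaging each 2x2 block.

    A box average is a simple low-pass filter; it is used here only as a
    stand-in for whatever filter an implementation would actually apply.
    """
    rows, cols = len(pixels), len(pixels[0])
    return [[(pixels[r][c] + pixels[r][c + 1]
              + pixels[r + 1][c] + pixels[r + 1][c + 1]) / 4
             for c in range(0, cols - 1, 2)]
            for r in range(0, rows - 1, 2)]
```

Levels 1, 2, and 3 hold 1, 4, and 16 tiles respectively, and each parent pixel is derived from four child pixels, which is why a feature seen in a parent tile constrains what can appear in its children.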

In one embodiment, to retrieve search results, a list of result tiles satisfying the current search criteria is queried from memory. In one embodiment, such a list of tiles is retrieved based on associated data (for example, the tile level, the query, the file name of the index associated with each tile, the result size, and/or the index type).

In one embodiment, the current search criteria or query limit the search results based on priority and system resources. For example, the query includes a limit on the computation available for the search, a limit on the time available to complete the search, a limit on the amount of memory available, a threshold on the quality or quantity of search results, and so on.

In one embodiment, if the current magnification level is not greater than the predetermined threshold Th1 (block 710), the next-level tiles are retrieved (block 720). That is, the quadtree children, which are the next-level tiles corresponding to the current level, are retrieved or generated. From those next-level tiles, a list of tiles matching the current query results is generated (block 725). In one embodiment, the query is refined each time further results are found. For example, the query includes a minimum result size. Then, for example, when a better match is found, broader results or a given previously found result are excluded from the result list by narrowing the query that retrieves results from memory.

In one embodiment, each tile in the retrieved list is added to a subset (block 730). If the size of the subset, that is, the number of tiles in the subset, is greater than or equal to a predetermined threshold (block 735), a recursive depth-first search is performed on the tiles in the subset (block 740). This keeps small the size of the set on which a recursive search is performed. The results of this recursive search are saved (block 745) and the set is cleared (block 750). Next, a recursive search is performed on the remaining set (block 755), the results are saved (block 760), and the set is cleared (block 765). Saved results can be retrieved from memory from the moment they are stored. In one embodiment, this makes a continuously updated set of search results available both while the search is running and after it has finished.
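The flow of FIGS. 7A and 7B can be sketched roughly as below. This is a hedged illustration, not the claimed implementation: tile retrieval, query matching, and the recursive search are passed in as placeholder arguments, the parameter names (including `th2` for the subset-size threshold) are assumptions, and skipping the final search when no tiles remain is a choice made for the example.

```python
def start_search(query_level, th1, tiles_at_query_level, next_level_tiles,
                 matches_query, recursive_search, th2):
    """Batch matching tiles into subsets and search each subset recursively."""
    if query_level > th1:                                   # block 710
        # Return matches at the query's own level (block 715).
        return [t for t in tiles_at_query_level if matches_query(t)]

    # Retrieve next-level tiles and keep those that match (blocks 720, 725).
    candidates = [t for t in next_level_tiles if matches_query(t)]

    saved = []
    subset = []
    for tile in candidates:
        subset.append(tile)                                 # block 730
        if len(subset) >= th2:                              # block 735
            saved.extend(recursive_search(subset))          # blocks 740, 745
            subset = []                                     # block 750
    if subset:
        saved.extend(recursive_search(subset))              # blocks 755, 760
    return saved
```

With five matching tiles and `th2 = 2`, the recursive search is invoked on batches of 2, 2, and 1 tiles. Since `saved` grows after every batch, a caller that exposed it (for example through a shared store) could read partial results while the search is still running.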

FIGS. 8A and 8B are simplified flow diagrams illustrating a method of performing a recursive search in an exemplary embodiment. As shown in FIG. 8A, a recursive search is started for a set of tiles (block 805). To optimize the search, a new query tile is formulated (not shown), for example as the first child tile of the first tile in the input tile set. Then, for each tile in the tile set, the next-level tiles are retrieved (block 810) and added to a result set (block 815).

In one embodiment, once the result set has been populated, if the current zoom level is the target zoom level (block 820), the quality of the matches in the result set is evaluated (block 825). For example, if the match value of the first tile in the result set relative to the query tile is less than 50%, the results differ too much from the query tile, and the depth search does not continue along this edge (branch) of the quadtree (block 830). If, on the other hand, the results are sufficiently accurate, the current result set is returned as the result of the recursive search (block 835). The minimum quality threshold is set as part of the query.

In one embodiment, match quality is evaluated as a difference between vectors. For example, each pixel in a tile has several numerical values representing the color and/or luminance of that pixel, and each tile has an array or vector of such values for all pixels in the tile. Two tiles are then compared by computing the distance or mean squared error between their vectors.
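The two comparisons mentioned above, distance and mean squared error, can be written out directly. The sketch assumes each tile has already been flattened into an equal-length sequence of numbers; the function names are chosen for the example.

```python
import math


def mean_squared_error(a, b):
    """Mean of the squared element-wise differences between two tile vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def euclidean_distance(a, b):
    """Straight-line distance between two tile vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
```

Both measures are zero for identical tiles and grow as the tiles diverge, so a smaller value indicates a better match.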

In one embodiment, if the current zoom level is not the target level (block 820), the recursive search continues as shown in FIG. 8B. For each tile in the result set, if the size of the subset is less than a predetermined threshold Th3 (block 840), the tile is added to the subset (block 845). If, on the other hand, the size of the subset is greater than or equal to the predetermined threshold Th3 (block 840), a new recursive search is started for that subset (block 850). The results of this recursive search are added to an interim result set (block 855) and the subset is cleared (block 860). As described above, this keeps small the size of the set on which a recursive search is performed. Once every tile in the original result set has been processed and added to the subset, a recursive search is performed on the remaining subset (block 865), the results are added to the interim result set (block 870), and the subset is cleared (block 875). The interim result set is then returned as the result of this recursive search (block 880).
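The recursion of FIGS. 8A and 8B can be sketched as follows. Several points are assumptions made so that the example runs: tiles are addressed as `(level, col, row)`, the "current zoom level" is taken to be the level of the tiles just retrieved, the tile that triggers a flush becomes the first member of the next subset, and `match_value` and `min_quality` are placeholder names for the query's quality test.

```python
def children_of(tile):
    """Four quadtree children of a (level, col, row) tile (assumed addressing)."""
    level, col, row = tile
    return [(level + 1, 2 * col + dc, 2 * row + dr)
            for dr in (0, 1) for dc in (0, 1)]


def recursive_search(tiles, level, target_level, children_of, match_value,
                     min_quality, th3):
    # Retrieve the next-level tiles into a result set (blocks 810, 815).
    result_set = [child for tile in tiles for child in children_of(tile)]
    level += 1

    if level >= target_level:                               # block 820
        # Judge quality from the first tile in the result set (block 825).
        if not result_set or match_value(result_set[0]) < min_quality:
            return []                                       # block 830
        return result_set                                   # block 835

    interim = []
    subset = []
    for tile in result_set:
        if len(subset) >= th3:                              # block 840
            interim.extend(recursive_search(                # blocks 850, 855
                subset, level, target_level, children_of,
                match_value, min_quality, th3))
            subset = []                                     # block 860
        subset.append(tile)                                 # block 845
    if subset:
        interim.extend(recursive_search(                    # blocks 865, 870
            subset, level, target_level, children_of,
            match_value, min_quality, th3))
    return interim                                          # block 880
```

Starting from the single root tile with a target level of 3 and `th3 = 2`, the four level-2 tiles are searched in two batches of two, and the call returns all 16 level-3 tiles when every match passes the quality test.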

In one embodiment, the list of one or more tiles at each level is sorted based on the query tile they match or, when not at the target level, based on the degree of match with at least one parent tile of at least one query tile.

FIG. 9 is a diagram illustrating the layout of slides in one embodiment of the present invention. Specifically, biological tissue or another specimen is placed on a query slide. A query slide image is prepared from the query slide. For example, the query slide image is a digital file that is spatially segmented into square or otherwise shaped tiles. From these query slide tiles, at least one tile is identified as at least one query tile (or search tile). Information obtained from the query tile is then input to a similarity measure engine. In one embodiment, the similarity measure engine normalizes the query tile to a desired size and memory value.

In one embodiment, normalization may use any available method. In one embodiment, normalizing at least one tile or slide image includes obtaining metadata about the stored microns-per-pixel value. For example, if image A has a scale of 20 microns per pixel and image B has a scale of 40 microns per pixel, an intermediate level is calculated so that the two slides have the same microns-per-pixel scale (for example, both at 20 microns per pixel, or at some other level). In one embodiment, the search may look for the highest-resolution capture available, a medium-resolution capture, or the lowest-resolution capture, so that different results are obtained depending on the desired image search. In one embodiment, color normalization or color correction is performed. For example, the brightness, luminance, and color of at least two tiles or images are determined and changed to similar levels for the search. For example, when the same slide is processed on two different machines (or in two different scanning operations), color-normalizing or color-correcting the two resulting images makes it possible to correct all other slides from those machines on the basis of the same determination. For example, the luminance values, RGB values, and other light-based or color-based parameters of the two scanned images are compared, a decision is made to change one or both to a particular set of parameter levels, and the same change or correction with respect to color (color, brightness, luminance, and so on) is applied to subsequent images from the same source.
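The scale and brightness normalization described above can be illustrated with a minimal sketch. It assumes that each image carries a microns-per-pixel value in its metadata and that brightness is matched with a single gain factor; the names `TileMeta`, `resample_factor`, `normalized_size`, and `brightness_gain` are hypothetical and do not appear in the specification.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class TileMeta:
    """Hypothetical metadata record for a tile or slide image."""
    name: str
    microns_per_pixel: float
    width_px: int
    height_px: int


def resample_factor(meta: TileMeta, target_mpp: float) -> float:
    """Factor by which pixel dimensions change when resampling to target_mpp."""
    return meta.microns_per_pixel / target_mpp


def normalized_size(meta: TileMeta, target_mpp: float) -> Tuple[int, int]:
    """Pixel dimensions of the image after resampling to target_mpp."""
    factor = resample_factor(meta, target_mpp)
    return round(meta.width_px * factor), round(meta.height_px * factor)


def brightness_gain(reference: List[float], other: List[float]) -> float:
    """Gain that maps the mean luminance of `other` onto that of `reference`."""
    mean_ref = sum(reference) / len(reference)
    mean_other = sum(other) / len(other)
    return mean_ref / mean_other


# Image A at 20 microns per pixel, image B at 40, as in the example in the text.
image_a = TileMeta("A", microns_per_pixel=20.0, width_px=512, height_px=512)
image_b = TileMeta("B", microns_per_pixel=40.0, width_px=256, height_px=256)
target = 20.0  # common scale chosen for this illustration (an assumption)

size_a = normalized_size(image_a, target)  # A already has the target scale
size_b = normalized_size(image_b, target)  # B is upsampled by a factor of 2
```

Once a gain has been derived from two scans of the same slide, the same gain could be reused for later images from the same scanner, which is the reuse the paragraph describes.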

In one embodiment, the similarity measure engine compares at least one query tile with target tiles located at one or more locations. For example, the target tiles are located in one or more databases at one or more physical locations or servers. For example, at least one of the target tiles comes from one or more target slide tiles. The one or more target slide tiles are prepared from a target slide image or target slide digital image. The target slide image is uploaded from a source and/or generated from at least one target slide. The target slide is prepared with a tissue specimen or other sample.

FIG. 10 is a diagram illustrating an example of tile feature comparison. A query tile or digital image query is input to a feature extraction engine. The feature extraction engine determines particular parameters and/or features of the query tile. For example, the feature extraction engine uses a predetermined set of features to generate data and/or measurements on color, pixels, size, and the like. Some or all of the data and/or measurements determined by the feature extraction engine are input to a comparison engine, which compares at least one feature of the query tile for similarity with the data and/or measurements of at least one target tile. The data and/or measurements of the at least one target tile are likewise obtained using a feature extraction engine or the like.
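The two-stage flow of FIG. 10 (feature extraction followed by comparison) can be sketched as follows. The feature set used here (mean R, G, B and a luminance value) and the Euclidean distance are illustrative assumptions only; the specification does not fix a particular feature set or similarity measure, and the function names are hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple

Pixel = Tuple[int, int, int]  # (R, G, B), each in the range 0..255


def extract_features(tile: Sequence[Pixel]) -> List[float]:
    """Return [mean R, mean G, mean B, mean luminance] for a tile."""
    n = len(tile)
    mean_r = sum(p[0] for p in tile) / n
    mean_g = sum(p[1] for p in tile) / n
    mean_b = sum(p[2] for p in tile) / n
    # Rec. 601 luma weights, used here only as an illustrative brightness measure.
    luma = 0.299 * mean_r + 0.587 * mean_g + 0.114 * mean_b
    return [mean_r, mean_g, mean_b, luma]


def distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two feature vectors (smaller is more similar)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def rank_targets(
    query: Sequence[Pixel], targets: Dict[str, Sequence[Pixel]]
) -> List[Tuple[str, float]]:
    """Compare the query tile with every target tile, most similar first."""
    q = extract_features(query)
    scored = [(name, distance(q, extract_features(t))) for name, t in targets.items()]
    return sorted(scored, key=lambda item: item[1])


query_tile = [(200, 100, 150)] * 4
target_tiles = {
    "similar": [(198, 102, 149)] * 4,
    "different": [(20, 20, 20)] * 4,
}
ranking = rank_targets(query_tile, target_tiles)  # most similar target first
```

Keeping extraction and comparison separate means that target features could be computed once and stored, so that only the query tile needs feature extraction at search time.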

FIG. 11 is a diagram illustrating an example of spatial decomposition of tiles in one embodiment of the present invention. A slide image is a digital file or other electronic data or information from which slide tiles, or pieces of the slide image, can be identified. A slide tile is then decomposed into a set of lower-level tiles.
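As an illustration of spatial decomposition, the sketch below splits one tile into four lower-level tiles. The four-way split is an assumption suggested by the quadtree mentioned in the aspects; the paragraph itself only says that a tile is decomposed into a set of lower-level tiles. The tile is modeled as rows of grayscale values, and `split_into_quadrants` is a hypothetical name.

```python
from typing import List

Tile = List[List[int]]  # rows of grayscale pixel values


def split_into_quadrants(tile: Tile) -> List[Tile]:
    """Split a tile with even side lengths into four child tiles (NW, NE, SW, SE)."""
    height, width = len(tile), len(tile[0])
    if height % 2 or width % 2:
        raise ValueError("tile dimensions must be even")
    half_h, half_w = height // 2, width // 2
    return [
        [row[:half_w] for row in tile[:half_h]],  # north-west
        [row[half_w:] for row in tile[:half_h]],  # north-east
        [row[:half_w] for row in tile[half_h:]],  # south-west
        [row[half_w:] for row in tile[half_h:]],  # south-east
    ]


slide_tile = [
    [0, 1, 2, 3],
    [4, 5, 6, 7],
    [8, 9, 10, 11],
    [12, 13, 14, 15],
]
lower_level_tiles = split_into_quadrants(slide_tile)
```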

FIG. 12 is a diagram illustrating an example of scale decomposition of tiles in one embodiment of the present invention. A slide image is segmented into a plurality of tiles, and one of those tiles is decomposed into a plurality of lower-level tiles. One of these lower-level tiles is then used to search for other similar tile images, for example using the various embodiments described herein.
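One way to picture the levels involved in scale decomposition is a resolution pyramid. The sketch below assumes that each coarser level is obtained by averaging non-overlapping 2x2 pixel blocks, which combines simple low-pass filtering with downsampling; this choice, and the names `downsample_2x` and `build_pyramid`, are assumptions made for illustration rather than the method of the specification.

```python
from typing import List

Image = List[List[float]]  # rows of grayscale pixel values


def downsample_2x(image: Image) -> Image:
    """Halve each dimension by averaging non-overlapping 2x2 blocks."""
    height, width = len(image), len(image[0])
    return [
        [
            (image[y][x] + image[y][x + 1] + image[y + 1][x] + image[y + 1][x + 1]) / 4.0
            for x in range(0, width - 1, 2)
        ]
        for y in range(0, height - 1, 2)
    ]


def build_pyramid(image: Image, min_side: int = 1) -> List[Image]:
    """Return [full resolution, half, quarter, ...] down to min_side pixels per side."""
    levels = [image]
    while min(len(levels[-1]), len(levels[-1][0])) // 2 >= min_side:
        levels.append(downsample_2x(levels[-1]))
    return levels


full_resolution = [
    [0.0, 1.0, 2.0, 3.0],
    [4.0, 5.0, 6.0, 7.0],
    [8.0, 9.0, 10.0, 11.0],
    [12.0, 13.0, 14.0, 15.0],
]
pyramid = build_pyramid(full_resolution)
```

A search could then start from a coarse level of such a pyramid and move to finer levels only for regions that still match.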

The present invention, including the embodiments described herein, can be implemented in digital electronic circuitry, computer hardware, firmware, software, a computer program product, a machine-readable storage device, or a propagated signal that controls or is executed by a data processing apparatus such as a processor or computer. The invention, including the embodiments described herein, can be written in any form of programming language and can be implemented as a stand-alone program or as a component of another program. A computer program can be deployed, stored, and/or executed on one or more computers at a single site or at two or more sites, and can be transmitted to and/or from those computers.

In the present invention, any step of the method is performed by at least one programmable processor, computer, tablet, smartphone, portable smart device, or the like executing a computer program that performs functions by operating on input data and generating output. Storage media include EPROM, EEPROM, flash memory devices, chip cards, magnetic cards, barcodes, QR codes (registered trademark), DVD-ROMs, CD-ROMs, internal hard disks, removable disks, magnetic disks, magneto-optical disks, optical discs, and the like. The present invention enables interaction with a user by displaying output from the computer on a display device such as a cathode ray tube (CRT) monitor, liquid crystal display (LCD) monitor, LED monitor, or touch screen.

Depending on the implementation, the present invention processes large amounts of data. Data is stored in back-end components such as data servers, application servers, and cloud servers, and in user interface components such as keyboards, voice input keyboards, graphical user interfaces, and web browsers. In the present invention, the components are interconnected by any form of digital data communication, such as a communication network, for example a local area network (LAN) or a wide area network (WAN) such as the Internet.

The foregoing description and illustration of the embodiments should be construed as illustrative only and not as limiting the present invention. For example, different features or components from different embodiments described above may be implemented in various combinations, with or without one another. Various changes, modifications, and improvements are possible in light of the above teachings and the appended claims, and they are encompassed within the spirit and scope of the invention.

Although the invention has been described with reference to specific examples and specific embodiments, it is to be understood that the invention is not limited to these examples or embodiments. The invention also encompasses variations of the specific examples and embodiments described herein.
The present invention also includes the following aspects as modes of implementation.
[Aspect 1]
A computer-implemented method for searching a database of images based on a query, the method comprising:
in response to determining that a magnification level of the query is greater than a first threshold, returning a first list of result tiles that satisfy the query at the magnification level of the query;
in response to determining that the magnification level of the query is less than or equal to the first threshold, retrieving tiles at a next lower magnification level and returning a second list of result tiles that satisfy the query at the next lower magnification level; and
processing the result tiles of each list, the processing including, for each result tile:
adding the result tile to a subset of result tiles;
in response to determining that the total number of result tiles in the subset is greater than or equal to a second threshold, recursively searching the subset;
saving the results of each recursive search of the subset in a remaining subset;
recursively searching the remaining subset; and
saving the results of the search of the remaining subset.
[Aspect 2]
A method according to aspect 1, wherein each magnification level corresponds to a level in a quadtree, and each level in the quadtree includes tiles representing image results.
[Aspect 3]
The method of aspect 2, wherein a child tile has an association with a parent tile.
[Aspect 4]
The method of aspect 3, wherein the parent tile is at least one of a downsampled version and a low-pass spatially filtered version of the corresponding at least one child tile.
[Aspect 5]
The method of aspect 2, wherein retrieving tiles at the next lower magnification level comprises generating children at the next lower level in the quadtree.
[Aspect 6]
The method of aspect 1, wherein the query includes at least one of a minimum threshold number of results and a maximum threshold number of results.
[Aspect 7]
The method of aspect 1, wherein the query includes an image, and the magnification level of the query is the magnification level of that image.
[Aspect 8]
The method of aspect 7, wherein a result tile is included in the returned list based on at least one of a magnification of the result tile, the image of the query, a file name of an index associated with the result tile, a result size, and an index type.
[Aspect 9]
The method of aspect 1, wherein the predetermined first threshold is defined such that the method terminates in response to determining that the number of search results falls below a certain number.
[Aspect 10]
The method of aspect 1, wherein the query is updated after the first list of result tiles is returned.
[Aspect 11]
The method of aspect 1, further comprising:
excluding results from at least one of the first list of result tiles and the second list of result tiles before processing the result tiles of each list.
[Aspect 12]
The method of aspect 1, wherein the query includes a threshold level of quality.
[Aspect 13]
The method of aspect 1, wherein the query includes a time limit, and the search is performed within the time limit.
[Aspect 14]
A method for performing a recursive search of a tile set based on a query tile, the method comprising:
for each tile in the tile set, until a result set is full:
retrieving a set of tiles from a next level; and
adding the tile set of the next level to the result set;
in response to determining that a magnification level is at a predetermined target level, evaluating the quality of matches in the result set; and
in response to determining that the magnification level is below the target level, for each tile in the result set:
in response to determining that the number of tiles in a subset is greater than or equal to a third threshold, adding the tile to the subset;
in response to determining that the number of tiles in the subset is less than the third threshold:
recursively searching the subset;
adding the results of the recursive search to a provisional result set; and
clearing the subset;
recursively searching the subset;
adding the results of the search to the provisional result set;
clearing the subset; and
returning the provisional result set.
[Aspect 15]
The method of aspect 14, wherein evaluating the quality of matches comprises determining whether a match value of a first tile in the result set, compared with the query tile, is less than a predetermined value.
[Aspect 16]
The method of aspect 14, wherein each pixel in a tile has at least one numerical value representing at least one of the color and the luminance of that pixel, and each tile has a vector of the at least one numerical value for all pixels in the tile.
[Aspect 17]
A computer-implemented method for continuously processing a repository of image data, the method comprising:
receiving a query specification that includes a request for data;
receiving a system specification of the computer on which the method is executed;
comparing the query specification with the system specification to determine a domain specification;
initiating a query against the repository based on the domain specification;
receiving results of the query, the results including image data;
displaying an interactive and iterative exploration of the result image data in a graphical user interface;
receiving input of the result image data via the graphical user interface;
updating the query based on the received input; and
displaying an updated graphical user interface based on updated result image data.
[Aspect 18]
The method of aspect 17, wherein the repository of image data includes digital microscopy data.
[Aspect 19]
The method of aspect 17, wherein the continuous processing of the image data produces results incrementally, such that results are available before the processing is completely finished.
[Aspect 20]
The method of aspect 17, wherein the query specification implicitly defines indexes and transformations of the data.
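The batched recursive search recited in Aspects 1 and 14 can be pictured with the following minimal sketch. It is one possible reading under stated assumptions, not the claimed method: tiles are addressed as (level, x, y) in a quadtree, matching tiles are expanded into their children, the children are collected into a subset, the subset is searched recursively once it reaches a batch size (in the role of the second threshold of Aspect 1), and any remaining subset is searched at the end. The names `children`, `batched_search`, and `contains_target` are hypothetical.

```python
from typing import Callable, List, Tuple

Tile = Tuple[int, int, int]  # (level, x, y); level 0 is the root tile


def children(tile: Tile) -> List[Tile]:
    """Four child tiles one level deeper in a quadtree (assumed layout)."""
    level, x, y = tile
    return [(level + 1, 2 * x + dx, 2 * y + dy) for dy in (0, 1) for dx in (0, 1)]


def batched_search(
    tiles: List[Tile],
    matches: Callable[[Tile], bool],
    target_level: int,
    batch_size: int,
) -> List[Tile]:
    """Return matching tiles at target_level, descending the tree in batches."""
    results: List[Tile] = []
    subset: List[Tile] = []
    for tile in tiles:
        if not matches(tile):
            continue
        if tile[0] >= target_level:
            results.append(tile)  # reached the target level: keep the match
            continue
        subset.extend(children(tile))
        if len(subset) >= batch_size:  # subset is large enough: search it now
            results.extend(batched_search(subset, matches, target_level, batch_size))
            subset = []
    if subset:  # search whatever is left over (the remaining subset)
        results.extend(batched_search(subset, matches, target_level, batch_size))
    return results


def contains_target(tile: Tile) -> bool:
    """Toy predicate: does this tile cover cell (5, 2) of level 3?"""
    level, x, y = tile
    shift = 3 - level
    return shift >= 0 and (5 >> shift, 2 >> shift) == (x, y)


hits = batched_search([(0, 0, 0)], contains_target, target_level=3, batch_size=2)
```

Under these assumptions, the batch size affects only how the work is grouped into recursive calls, not which tiles are returned.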

Claims

1. A computer-implemented method for searching a database of images based on one or more query tiles included in a query, the method comprising:
retrieving the original capture resolution of an image in the database of images using a query tile for comparison and providing this query tile;
in response to determining that the magnification level of the query tile is greater than a first threshold, retrieving an image stored in the database that has a magnification level equal to or greater than the magnification level of the query tile, and returning a first list of result tiles that satisfy the query tile at that magnification level;
in response to determining that the magnification level of the query tile is less than or equal to the first threshold, retrieving, from an image stored in the database, tiles at the next lower magnification level relative to the magnification level of the query tile, and returning a second list of result tiles that satisfy the query tile at the next lower magnification level; and
processing the result tiles of each list, the processing including, for each result tile:
adding the result tile to a subset among a plurality of subsets of result tiles;
in response to determining that the total number of result tiles in the subset is greater than or equal to a second threshold, recursively searching the subset for the query tile;
saving the results of each recursive search of the subset in a remaining subset among the plurality of subsets;
recursively searching the remaining subset for the query tile; and
saving the results of the search of the remaining subset.

2. The method of claim 1, wherein each magnification level corresponds to a level in a quadtree, and each level in the quadtree contains result tiles representing an image.

3. The method of claim 2, wherein child tiles have relationships with parent tiles.

4. The method of claim 3, wherein the parent tile is at least one of a downsampled tile and a low-pass spatially filtered tile of the corresponding at least one child tile.

5. The method of claim 1, further comprising, in response to determining that the total number of result tiles in the subset of result tiles is less than the second threshold:
recursively searching the subset of result tiles for the query tile;
adding the results of the recursive search to a provisional result set;
clearing the subset;
recursively searching the remaining subset for the query tile;
adding the results of the search to the provisional result set;
clearing the remaining subset; and
returning the provisional result set.

6. The method of claim 1, wherein the query includes an image and the magnification level of the query is the magnification level of that image.

7. The method of claim 6, wherein a result tile is included in the returned list based on at least one of a magnification of the result tile, the image of the query, a file name of an index associated with the result tile, a result size, and an index type.

8. The method of claim 1, wherein the predetermined second threshold is defined such that the method terminates in response to determining that the number of search results falls below a certain number.

9. The method of claim 6, wherein the query is updated after a provisional result set is returned.

10. The method of claim 1, wherein the query includes a threshold level of quality.

11. The method of claim 1, wherein the query includes a time limit, and the search is performed within the time limit.

12. A computer-implemented method for performing a recursive search of a tile set based on a query tile, the method comprising:
for each tile of a current tile set in the recursive search, until a result set for the query tile is full:
retrieving a set of tiles from the next level of the current tile set; and
adding the tile set of the next level to the result set;
in response to determining that the magnification level of the query tile is at a predetermined target level, evaluating a degree of match of the result set to the query tile; and
in response to determining that the magnification level of the query tile is below the target level, for each tile in the result set:
in response to determining that the number of tiles in a subset is greater than or equal to a third threshold, adding the tile to the subset;
in response to determining that the number of tiles in the subset is less than the third threshold:
recursively searching the subset for the query tile;
adding the results of the recursive search to a provisional result set; and
clearing the subset;
recursively searching the subset for the query tile;
adding the results of the search to the provisional result set;
clearing the subset; and
returning the provisional result set.

13. The method of claim 12, wherein evaluating the degree of match to the query tile comprises determining whether a match value of a first tile in the result set, compared with the query tile, is less than a predetermined value.

14. The method of claim 12, wherein each pixel in a tile has at least one numerical value representing at least one of the color and the luminance of that pixel, and each tile has a vector of the at least one numerical value for all pixels in the tile.