JP4171919B2 - Structure estimation system, structure estimation method and program

Info

Publication number: JP4171919B2
Application number: JP2005223566A
Authority: JP
Inventors: 正晃川田; 主税佐藤
Original assignee: National Institute of Advanced Industrial Science and Technology AIST
Current assignee: National Institute of Advanced Industrial Science and Technology AIST
Priority date: 2005-08-02
Filing date: 2005-08-02
Publication date: 2008-10-29
Anticipated expiration: 2025-08-02
Also published as: JP2007041738A

Description

The present invention relates to a system, a method, and a program for estimating the three-dimensional structure of a molecule.

A conventional technique for analyzing the three-dimensional structure of a molecule is to prepare a crystallized sample of the biopolymer to be analyzed, such as a protein, and to analyze and evaluate that sample by X-ray crystallography and NMR. Because this approach requires crystallizing the sample, however, it could not be applied to samples that are difficult to crystallize, such as membrane proteins.

For this reason, single particle structure analysis, which analyzes protein structure using a cryogenic transmission electron microscope (Cryo-TEM), has been proposed as a method that does not require crystallization of the sample (Non-Patent Document 1). In single particle structure analysis, images of individual globular protein particles are cut out of TEM images obtained from a particulate protein sample cooled to cryogenic temperature in a transmission electron microscope (TEM), and the three-dimensional structure of the particulate protein sample is estimated from these particle images. This method is said to reduce the noise in the sample images by averaging them.
Non-Patent Document 1: Chikara Sato and three others, "Protein structure analysis without crystals by single particle analysis: the structure of the voltage-dependent Na+ channel as an example", Denshi Kenbikyo (Electron Microscope), 2002, Vol. 37, No. 1, pp. 40-44

However, when molecular structure analysis is performed on image data of a sample, as in single particle structure analysis, the amount of data required and the amount of data processing involved are enormous. This places a heavy load on the terminal of the user who requests the analysis and on the terminal that performs the data processing, and the analysis takes a long time.

The present invention has been made in view of the above circumstances, and provides a technique for performing data processing efficiently in molecular structure analysis that uses image data.

The present inventors conducted intensive studies aimed at making the image processing in image-based molecular structure analysis more efficient and faster. As a result, they found that applying a predetermined form of distributed processing to data processing that uses a plurality of measurement images and a plurality of reference images reduces the load that arises during data processing and makes rapid analysis possible, and thereby arrived at the present invention.

According to the present invention, there is provided a structure estimation system for estimating the three-dimensional structure of a molecule from data of a plurality of images, in which
a user terminal that accepts a user's request and requests a structural analysis based on a plurality of measurement images of the molecule,
a data processing system composed of a plurality of data processing terminals, and
a scheduler that accepts the request from the user terminal and requests data processing from the data processing system
are connected via a network, wherein
the data processing system includes a plurality of reference data storage units, to which data of a plurality of reference images are distributed, and a plurality of data processing units,
each of the data processing units, in response to the data processing request from the scheduler, performs, for each of the plurality of measurement images, a process of evaluating the similarity between the measurement image and the reference image assigned to that data processing unit, by comparing the data of the measurement image, or data of a translated image obtained by translating the measurement image, with a reference data set composed of the data of the assigned reference image and data of a group of rotated images obtained by rotating that reference image, and
the scheduler includes
a measurement data classification unit that associates the plurality of measurement images with any of the plurality of reference images on the basis of the results of the similarity evaluation process, and
a structure estimation unit that acquires, for the plurality of reference images, data of averaged images each obtained by averaging the plurality of measurement images associated with the same reference image by the measurement data classification unit, and that estimates the three-dimensional structure of the molecule on the basis of the plurality of acquired averaged images.
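The comparison performed by each data processing unit can be sketched as follows. This is a minimal illustration, not the patented implementation: the similarity measure (zero-mean normalized cross-correlation), the search range `max_shift`, and the use of `np.roll` (which wraps pixels around the image edge) to produce translated images are all assumptions chosen for brevity. The reference data set is taken to be a list holding the reference image and its rotated images.

```python
import numpy as np

def normalized_correlation(a, b):
    """Zero-mean normalized cross-correlation of two equally sized images."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float((a * b).sum() / denom) if denom > 0 else 0.0

def evaluate_similarity(measurement, reference_data_set, max_shift=2):
    """Return (best score, index of best rotated image, (dy, dx) shift applied).

    reference_data_set: list of 2-D arrays (reference image plus rotated images).
    max_shift: hypothetical search range for translations, in pixels.
    """
    best = (-np.inf, None, None)
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            # Translated image of the measurement (np.roll wraps at the edges).
            shifted = np.roll(measurement, shift=(dy, dx), axis=(0, 1))
            for index, rotated_reference in enumerate(reference_data_set):
                score = normalized_correlation(shifted, rotated_reference)
                if score > best[0]:
                    best = (score, index, (dy, dx))
    return best
```

Only the measurement image is translated and only the precomputed reference images are rotated, so no rotation has to be computed per measurement image inside the loop.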

Further, according to the present invention, there is provided a structure estimation method for estimating the three-dimensional structure of a molecule from data of a plurality of images, the method including:
a step in which a user terminal accepts a user's request and requests a structural analysis of the molecule based on a plurality of measurement images;
a step in which a scheduler accepts the request from the user terminal and requests data processing from a data processing system composed of a plurality of data processing terminals;
a step in which the scheduler causes data of a plurality of reference images to be distributed to a plurality of reference data storage units included in the data processing system;
a step in which the scheduler assigns, to a plurality of data processing units included in the data processing system, a process of evaluating the similarity between the plurality of measurement images and the plurality of reference images;
a step in which each of the plurality of data processing units, in response to the data processing request from the scheduler, performs the process of evaluating the similarity between the measurement image and the reference image assigned to that data processing unit, by comparing the data of the measurement image, or data of a translated image obtained by translating the measurement image, with a reference data set composed of the data of the assigned reference image and data of a group of rotated images obtained by rotating that reference image;
a step in which the scheduler associates each of the plurality of measurement images with one of the plurality of reference images on the basis of the results of the similarity evaluation process; and
a step in which the scheduler acquires data of a plurality of averaged images, each obtained by averaging the plurality of measurement images associated with one of the plurality of reference images, and estimates the three-dimensional structure of the molecule on the basis of the data of the averaged images.

In the present invention, the data of the plurality of reference images are distributed among the plurality of reference data storage units, and the plurality of data processing units share the data processing for evaluating the similarity between the measurement images and the plurality of reference images. Because the enormous amount of data processing needed when a plurality of measurement images and a plurality of reference images are used is spread across the plurality of data processing units, local increases in load within the data processing system can be suppressed. The data processing can therefore be performed efficiently and the processing speed increased. In addition, because no data processing has to be performed on the user terminal, the load on the user terminal can be reduced.

In the structure estimation system of the present invention, the data of the plurality of averaged images may be redistributed to the plurality of reference data storage units as updated reference images, and the updated reference images may be assigned to the plurality of data processing units. In that configuration, each data processing unit performs a process of evaluating the similarity between the measurement images and its assigned updated reference image; the measurement data classification unit associates the plurality of measurement images with any of the plurality of updated reference images on the basis of the results of that process; and the structure estimation unit acquires, for the plurality of updated reference images, data of averaged images each obtained by averaging the plurality of measurement images associated with the same updated reference image by the measurement data classification unit, and estimates the three-dimensional structure of the molecule on the basis of the plurality of acquired averaged images. In this configuration, a plurality of updated reference images are used as the reference images when the three-dimensional structure of the molecule is estimated, so the accuracy of the reference images used for structure estimation can be improved further.

In the structure estimation system of the present invention, the scheduler may include a data processing unit selection unit that selects, on the basis of the operating states of the plurality of data processing units, the plurality of data processing units to which the plurality of reference images are assigned for the similarity evaluation process, and a reference distribution information storage unit that stores information on how the plurality of reference data sets are distributed. The data processing unit selection unit may further select the plurality of data processing units that acquire the reference data sets. This allows the data processing to proceed even more efficiently.
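The text does not say how "operating state" is measured or how units are chosen from it. The following is a minimal sketch under assumed conventions: each unit reports whether it is online and a hypothetical `load` fraction, and the scheduler picks the least loaded units below a hypothetical `max_load` threshold. The class and function names are illustrative only.

```python
from dataclasses import dataclass

@dataclass
class UnitStatus:
    unit_id: str
    online: bool
    load: float  # hypothetical metric: fraction of capacity in use, 0.0 to 1.0

def select_units(statuses, count, max_load=0.8):
    """Pick up to `count` online units, least loaded first."""
    candidates = [s for s in statuses if s.online and s.load <= max_load]
    # Sort by load; the unit id breaks ties so the result is deterministic.
    candidates.sort(key=lambda s: (s.load, s.unit_id))
    return [s.unit_id for s in candidates[:count]]
```

For example, with units at loads 0.7, 0.1, and 0.4, one offline unit, and one unit at 0.95, asking for two units returns the ones at 0.1 and 0.4.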

In the structure estimation system of the present invention, the plurality of data processing units may include a reference data set acquisition unit that acquires the reference data set and a reference data set acquisition method selection unit that selects the method by which the reference data set acquisition unit acquires the reference data set. Alternatively, the plurality of data processing units may include the reference data set acquisition unit, and the scheduler may include the reference data set acquisition method selection unit. In these configurations, acquisition of the reference data sets is spread across the plurality of data processing units, so the reference data can be acquired even more efficiently.

In the present invention, the reference data set may include the data of the reference image and data of a group of rotated images obtained by rotating the reference image from 0 degrees to 360 degrees. This makes the evaluation of the similarity between the reference image and the measurement image more reliable. Alternatively, the reference data set may include the data of the reference image and data of a group of rotated images obtained by rotating the reference image from 0 degrees to 90 degrees. In this case, only the four-fold symmetric portion of the rotated image group needs to be created in advance, which reduces the number of rotated images prepared beforehand, while data for rotated images up to 360 degrees can still be obtained easily by sign inversion as needed when the similarity is evaluated. The load on the system is thus reduced further and the processing can be performed faster.
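One way to read the 0-to-90-degree scheme: a rotation by 90, 180, or 270 degrees of a square image only swaps axes and reverses index order (coordinate sign inversion), so it needs no interpolation. The sketch below derives any rotation angle from a stored quarter-circle set using `np.rot90`. It is an illustration under assumptions, not the patent's procedure: the images are square, the requested angle is a multiple of the pitch, and the stored rotations turn in the same direction as `np.rot90` (counterclockwise); if they turn the other way, the sign of `k` must be flipped.

```python
import numpy as np

def rotated_image_at(quarter_set, angle_deg, pitch_deg=1):
    """Look up a rotation by `angle_deg` from a set covering 0 to 90 degrees.

    quarter_set[i] is assumed to hold the square reference image rotated
    by i * pitch_deg degrees.
    """
    angle = angle_deg % 360
    quarter_turns, remainder = divmod(angle, 90)
    stored = quarter_set[remainder // pitch_deg]
    # A multiple of 90 degrees is an exact index permutation, no resampling.
    return np.rot90(stored, k=quarter_turns)
```

With a pitch of 1 degree this serves 360 angles from the stored quarter-circle images.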

In the present invention, the molecule may be a biopolymer, and in that configuration the biopolymer may be a membrane protein. Because the three-dimensional structure can then be estimated without crystallizing the membrane protein, the three-dimensional structure of membrane proteins, which are generally difficult to crystallize, can be estimated reliably.

Any combination of the above constituent elements, and any conversion of the expression of the present invention among a method, an apparatus, a system, a recording medium, a computer program, and the like, are also valid as aspects of the present invention.

For example, according to the present invention, there is provided a program that causes a computer to function as a structure estimation system for estimating the three-dimensional structure of a molecule from data of a plurality of images, in which
a user terminal that accepts a user's request and requests a structural analysis based on a plurality of measurement images of the molecule,
a data processing system composed of a plurality of data processing terminals, and
a scheduler that accepts the request from the user terminal and requests data processing from the data processing system
are connected via a network, wherein
the data processing system includes a plurality of reference data storage units, to which data of a plurality of reference images are distributed, and a plurality of data processing units,
each of the data processing units, in response to the data processing request from the scheduler, performs, for each of the plurality of measurement images, a process of evaluating the similarity between the measurement image and the reference image assigned to that data processing unit, by comparing the data of the measurement image, or data of a translated image obtained by translating the measurement image, with a reference data set composed of the data of the assigned reference image and data of a group of rotated images obtained by rotating that reference image, and
the scheduler includes
a measurement data classification unit that associates the plurality of measurement images with any of the plurality of reference images on the basis of the results of the similarity evaluation process, and
a structure estimation unit that acquires, for the plurality of reference images, data of averaged images each obtained by averaging the plurality of measurement images associated with the same reference image by the measurement data classification unit, and that estimates the three-dimensional structure of the molecule on the basis of the plurality of acquired averaged images.

According to the present invention, a technique for performing data processing efficiently in molecular structure analysis that uses image data is realized.

Embodiments of the present invention are described below with reference to the drawings. In all the drawings, common constituent elements are denoted by the same reference numerals, and their description is omitted where appropriate. The following embodiments describe, as an example, the case of analyzing the three-dimensional structure of a biopolymer by single particle analysis, but the analysis target of the present invention is not limited to biopolymers; synthetic polymers, various low-molecular-weight compounds, and the like can also be analyzed. The present invention is also applicable to techniques other than single particle analysis, provided that they analyze the three-dimensional structure of a molecule by image processing.

First, to aid understanding of the present invention, the conventional single particle analysis method is outlined with reference to FIG. 7 as one example of a method for analyzing the three-dimensional structure of a molecule. FIG. 7 illustrates the outline of the single particle analysis method presented in Non-Patent Document 1, cited in the background art section above. As described in that document, single particle analysis is a method of protein structure analysis that does not use crystals. In single particle analysis, the structure is analyzed using electron micrographs of the protein.

Electron micrographs of proteins are noisy images whose resolution is limited by damage to the sample from electron beam irradiation. For samples that are easily damaged by the electron beam, such as proteins, the poor signal-to-noise ratio (S/N ratio) limits the resolution of the image, so the inherent resolution of the electron microscope cannot be achieved. As a result, not only is it impossible to obtain high-resolution information from a single image, but reconstructing the three-dimensional structure is also difficult.

To overcome this problem, single particle analysis uses what might be called computer-based crystallization: particle images with the same orientation are selected and averaged, which improves the S/N ratio and raises the resolution (FIG. 7, center). Unstained electron microscope images obtained by the freezing method are considered to give projection images that reflect the density within the molecule. Sinograms, which are one-dimensional projections taken at various angles, are therefore created from the averaged images with improved S/N ratio and compared with one another to estimate the relative angles of the images in three dimensions (FIG. 7, lower left).

That is, sinograms obtained by projecting to one dimension while varying the angle over the full 360 degrees are used to search for components that agree closely with one another. Images in which the particle projection directions are similar are superimposed and averaged. These mutual values are obtained for three or more averaged images, and the relative orientation angle of the particle in each averaged image is estimated. On that basis, the averaged projection images are combined three-dimensionally to reconstruct a three-dimensional structure with little noise (FIG. 7, lower right). Because noise is random and does not fall at the same position from image to image, averaging many superimposed images further improves the S/N ratio.
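The benefit of averaging can be checked numerically. For independent noise of standard deviation sigma, the average of N images has residual noise of about sigma divided by the square root of N. The sketch below uses a synthetic square "particle" and Gaussian noise; the image size, noise level, and seed are arbitrary assumptions and are not taken from the source.

```python
import numpy as np

rng = np.random.default_rng(0)
signal = np.zeros((32, 32))
signal[12:20, 12:20] = 1.0  # hypothetical "particle"
sigma = 2.0                 # noise much stronger than the signal

def residual_noise(n_images):
    """Standard deviation of the noise left after averaging n_images noisy copies."""
    noisy = signal + rng.normal(0.0, sigma, size=(n_images, 32, 32))
    average = noisy.mean(axis=0)
    return float((average - signal).std())
```

Comparing `residual_noise(1)` with `residual_noise(100)` should show roughly a tenfold reduction, in line with the square-root rule.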

The averaged images obtained are then used again as reference images for aligning the original images, producing new, further improved averaged images. Repeating this cycle a predetermined number of times raises the S/N ratio still further.

Single particle analysis is carried out by the above procedure. In this method, however, translated images and rotated images of the target particle have to be created at analysis time, and these images must also be aligned with the reference images. If all of this processing is attempted on a single terminal, the amount of data processing is so large that the load on the terminal is heavy and the analysis takes a long time. The problem becomes even more pronounced when the cycle of creating averaged images is repeated several times.

In the present invention, therefore, rotated images are not created for the measurement images. Instead, rotated images of the reference images are created in advance for the plurality of reference images, so that rotated images of the target particle need not be created when searching for components with high similarity. In addition, a data processing system having a plurality of data processing units is used, and both the acquisition of the rotated images of the reference images and the evaluation of similarity are performed as distributed processing by a plurality of predetermined data processing units. This reduces local load and makes it possible to carry out the enormous amount of data processing efficiently and quickly. A specific configuration is described below.

FIG. 1 is a diagram illustrating the basic configuration of a structure estimation system according to an embodiment of the present invention. The structure estimation system 100 shown in FIG. 1 estimates the three-dimensional structure of a biopolymer or the like, and includes a user terminal 101, a scheduler 103, and a data processing system 110. The case of estimating the three-dimensional structure of a biopolymer is described below as an example.

The user terminal 101 accepts a user's request to use the service and requests a structural analysis of the biopolymer. The scheduler 103 accepts the request from the user terminal 101 and requests data processing from the data processing terminal group 105. The scheduler 103 also estimates the structure of the biopolymer on the basis of the information obtained by the data processing and presents it to the user terminal 101.

The data processing system 110 performs data processing using the data of the measurement images of the biopolymer. The data processing system 110 is composed of a plurality of data processing terminals and includes a plurality of reference data storage units 111 and a plurality of data processing units 113. The reference data storage units 111 and the data processing units 113 are each provided in predetermined ones of the plurality of data processing terminals 105.

FIG. 1 shows an example in which the data processing system 110 is composed of five terminals, data processing terminals 105a to 105e. The data processing terminals 105a, 105b, and 105d each include a data processing unit 113 and a reference data storage unit 111. The data processing terminals 105c and 105e include only a data processing unit 113. Although not shown in FIG. 1, the structure estimation system 100 may also include a terminal that includes a reference data storage unit 111 but does not include the second storage unit 109.

The structure estimation system 100 estimates the three-dimensional structure of a molecule from a plurality of images. In the structure estimation system 100, the three-dimensional structure of the sample is reconstructed using a plurality of measurement images taken from different observation directions. A plurality of reference images, fewer in number than the measurement images, are used in the reconstruction. Each measurement image is matched against the reference images to evaluate the similarity between the measurement image and each reference image, and is then classified under the reference image with which it has the highest similarity. The measurement images classified under the same reference image are averaged to give an averaged image, and the averaged image replaces the reference image as a new reference image. Repeating this update improves the accuracy of the reference images. The three-dimensional structure of the sample is then reconstructed accurately using the plurality of reference images after a predetermined number of updates.
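The classify, average, and update cycle described above can be sketched as follows. This is a simplified illustration under stated assumptions: the images are taken to be already aligned (the translation and rotation search is omitted), similarity is a zero-mean normalized correlation, and a reference with no assigned measurement images is kept unchanged. None of these choices is specified by the source.

```python
import numpy as np

def similarity(a, b):
    """Zero-mean normalized correlation (an assumed similarity measure)."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float((a * b).sum() / denom) if denom > 0 else 0.0

def classify_and_average(measurements, references):
    """One round: assign each measurement to its best reference, then average per class."""
    labels = [int(np.argmax([similarity(m, r) for r in references]))
              for m in measurements]
    updated = []
    for k, old_reference in enumerate(references):
        members = [m for m, label in zip(measurements, labels) if label == k]
        # Assumption: keep the old reference when its class is empty.
        updated.append(np.mean(members, axis=0) if members else old_reference)
    return labels, updated

def refine(measurements, references, rounds=3):
    """Repeat the classify-and-average round a fixed number of times."""
    labels = []
    for _ in range(rounds):
        labels, references = classify_and_average(measurements, references)
    return labels, references
```

The final reconstruction of the three-dimensional structure from the refined references is outside the scope of this sketch.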

In the structure estimation system 100, of the processing described above, the processing for
(i) creation of the reference data sets, and
(ii) evaluation of the similarity between the measurement images and the reference data sets
is distributed among the plurality of data processing units 113.

In (i) above, a reference data set is the data of one reference set. One reference set consists of one reference image and a group of images obtained by rotating that reference image at a predetermined pitch (a rotated image group). One reference data set therefore consists of the data of one reference image and the data of the rotated image group of that reference image. The plurality of reference data sets are distributed among the plurality of reference data storage units 111.
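Building one reference data set can be sketched as follows. The source does not specify how the rotated images are computed, so this example uses nearest-neighbour resampling about the image centre purely as an assumed, dependency-free stand-in; a production system would more likely use an interpolating rotation. The function names and the default pitch and range are illustrative.

```python
import numpy as np

def rotate_nearest(image, degrees):
    """Rotate a 2-D image about its centre with nearest-neighbour sampling.

    Pixels that would come from outside the source image are set to 0.
    With rows increasing downward, positive angles turn clockwise on screen.
    """
    rows, cols = image.shape
    cy, cx = (rows - 1) / 2.0, (cols - 1) / 2.0
    theta = np.deg2rad(degrees)
    ys, xs = np.indices(image.shape)
    dy, dx = ys - cy, xs - cx
    # Inverse mapping: for every output pixel, find the source pixel it comes from.
    src_y = np.rint(cy + np.cos(theta) * dy - np.sin(theta) * dx).astype(int)
    src_x = np.rint(cx + np.sin(theta) * dy + np.cos(theta) * dx).astype(int)
    inside = (src_y >= 0) & (src_y < rows) & (src_x >= 0) & (src_x < cols)
    out = np.zeros_like(image)
    out[inside] = image[src_y[inside], src_x[inside]]
    return out

def build_reference_data_set(reference, pitch_deg=1, max_deg=90):
    """Reference image (angle 0) followed by its rotations at a fixed pitch."""
    return [rotate_nearest(reference, angle)
            for angle in range(0, max_deg + 1, pitch_deg)]
```

With a pitch of 1 degree and a range of 0 to 90 degrees, the returned list holds 91 images, the first of which is the unrotated reference.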

For example, in FIG. 1, reference data sets ref.seta to ref.seti are created, corresponding to the reference image data ref.a to ref.i, respectively. If the rotated images are created at a rotation-angle pitch of 1 degree up to 90 degrees, the sets are
ref.seta: ref.a0 (= ref.a), ref.a1, ref.a2, ..., ref.a90,
ref.setb: ref.b0 (= ref.b), ref.b1, ref.b2, ..., ref.b90,
...
ref.seti: ref.i0 (= ref.i), ref.i1, ref.i2, ..., ref.i90,
so that each reference set consists of the data of 91 images. The sets ref.seta to ref.setc, ref.setd to ref.setf, and ref.setg to ref.seti are distributed to the data processing terminal 105a, the data processing terminal 105b, and the data processing terminal 105d, respectively. The reference data sets are created by distributed processing in the plurality of data processing units 113. This reduces local load within the data processing system 110 and allows the reference data set for each of the reference images to be created efficiently.
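The naming and distribution in this example can be reproduced with a short sketch. The function name `reference_set_names` and the dictionary layout are hypothetical and serve only to illustrate the counts given in the text (91 images per set, three sets per terminal).

```python
def reference_set_names(ref_id, max_angle=90, pitch=1):
    """Names of the images in one reference set, e.g. ref.a0 ... ref.a90."""
    return [f"ref.{ref_id}{angle}" for angle in range(0, max_angle + 1, pitch)]


# Distribution of reference sets to terminals as in the FIG. 1 example.
distribution = {
    "105a": ["a", "b", "c"],
    "105b": ["d", "e", "f"],
    "105d": ["g", "h", "i"],
}

# For each terminal, the reference sets it stores (set id -> image names).
stored = {
    terminal: {ref_id: reference_set_names(ref_id) for ref_id in ref_ids}
    for terminal, ref_ids in distribution.items()
}
```

With a 1 degree pitch from 0 to 90 degrees, each set holds 91 entries, so the nine sets together hold 819 images.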

In (ii) above, the plurality of measurement images is a group of images of one sample viewed from a plurality of directions. For each measurement image, the similarity to each reference image is evaluated by comparing the measurement image data with the reference data set. Assigning the reference images across the plurality of data processing units 113 lets the similarity evaluation proceed efficiently. In FIG. 1, ref.a to ref.b, ref.c to ref.d, ref.e to ref.f, ref.g to ref.h, and ref.i are assigned to the data processing terminals 105a, 105b, 105c, 105d, and 105e, respectively. Each data processing terminal matches each of the plurality of measurement images against the reference sets assigned to it.
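One simple rule that reproduces the assignment shown in FIG. 1 is to split the reference list into contiguous chunks, one per terminal. This is only an illustrative assumption: in the system described here the scheduler chooses the assignment from the operating states of the units, and `assign_references` is a hypothetical name.

```python
def assign_references(ref_ids, terminals):
    """Split ref_ids into contiguous chunks, one chunk per terminal."""
    per_terminal = -(-len(ref_ids) // len(terminals))  # ceiling division
    return {
        terminal: ref_ids[i * per_terminal:(i + 1) * per_terminal]
        for i, terminal in enumerate(terminals)
    }


assignment = assign_references(
    list("abcdefghi"), ["105a", "105b", "105c", "105d", "105e"]
)
```

Nine references over five terminals give chunks of two, with the last terminal receiving the single remaining reference.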

FIG. 2 is a functional block diagram showing the configuration of the structure estimation system 100 shown in FIG. 1. FIG. 3 is a functional block diagram showing the detailed structure of the scheduler 103 and the data processing terminal 105 of the structure estimation system 100. The configuration of the structure estimation system 100 is described in more detail below with reference to FIGS. 2 and 3.

構造推定システム１００は、ユーザ端末１０１と、スケジューラ１０３と、データ処理システム１１０とがネットワーク１１５により接続されたシステムである。 The structure estimation system 100 is a system in which a user terminal 101, a scheduler 103, and a data processing system 110 are connected by a network 115.

The user terminal 101 includes a first storage unit 107 and a first transmission/reception unit 117. The first storage unit 107 stores measurement data of cryo-TEM images of the sample.

The data processing terminal 105 is a terminal that constitutes the data processing terminal group. FIG. 2 illustrates a case in which the data processing system 110 includes three data processing terminals 105, each having both a reference data storage unit 111 and a data processing unit 113. The number of data processing terminals 105 is not particularly limited, however, as long as the data processing system 110 includes a plurality of reference data storage units 111 and a plurality of data processing units 113. The data processing terminal 105 includes a second transmission/reception unit 119, a data processing unit 113, and a reference data storage unit 111. The reference data storage unit 111 is provided in the second storage unit 109 (FIG. 3).

The second transmission/reception unit 119 sends and receives measurement image data, reference data, data processing results, and the like via the network 115.

For each of the plurality of measurement images, the data processing unit 113 compares the measurement image data, or the data of a translated image obtained by translating the measurement image, with a reference data set consisting of the data of the assigned reference image and the data of a group of rotated images obtained by rotating that reference image, and thereby evaluates the similarity between the measurement image and the assigned reference image. When the reference images are updated, an updated reference image is assigned to the data processing unit 113, which then evaluates the similarity between the measurement image and the assigned updated reference image.

データ処理部１１３は、平行移動画像取得部１４１、リファレンスデータセット取得方法選択部１４３、リファレンスデータセット取得部１４５、類似度パラメータ付与部１４７、および平均化処理部１４９を備える。 The data processing unit 113 includes a translation image acquisition unit 141, a reference data set acquisition method selection unit 143, a reference data set acquisition unit 145, a similarity parameter assignment unit 147, and an averaging processing unit 149.

The translation image acquisition unit 141 acquires image data obtained by translating the measurement image data in a predetermined in-plane direction. For example, referring to the calculation formula stored in the translation image information storage unit 135, the translation image acquisition unit 141 creates the data of a group of translated images by shifting the measurement image at a predetermined pitch in each of the X-axis and Y-axis directions, and stores the data in the translation image information storage unit 135 in association with the identifier of the measurement image.
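Creating a group of translated images at a fixed pitch in X and Y can be sketched as below. The representation (a list of pixel rows), the zero padding at the border, and the names `translate`, `translation_group`, and `max_shift` are assumptions made for illustration; the source does not specify the calculation formula held in the storage unit 135.

```python
def translate(image, dx, dy, fill=0.0):
    """Shift a 2D image by dx columns and dy rows, padding with `fill`."""
    height, width = len(image), len(image[0])
    out = [[fill] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            ny, nx = y + dy, x + dx
            if 0 <= ny < height and 0 <= nx < width:
                out[ny][nx] = image[y][x]
    return out


def translation_group(image, max_shift, pitch=1):
    """Translated copies for all X/Y shifts within +/- max_shift, keyed by (dx, dy)."""
    steps = max_shift // pitch
    shifts = [k * pitch for k in range(-steps, steps + 1)]
    return {(dx, dy): translate(image, dx, dy) for dx in shifts for dy in shifts}
```

The entry keyed `(0, 0)` is the untranslated measurement image, so the group can be matched against a reference data set as a whole.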

The reference data set acquisition method selection unit 143 selects the method by which the reference data set acquisition unit 145 acquires a reference data set. For example, it communicates with another data processing terminal 105 via the network 115 and, depending on the communication time, selects either acquiring a reference data set stored in a second storage unit 109 or creating the rotated image data and the reference data set in its own reference data set acquisition unit 145.
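The choice between fetching an existing reference data set and building it locally can be illustrated with a simple time comparison. The cost model below (latency plus size divided by bandwidth) and every name in it are hypothetical; the source states only that the selection depends on the communication time.

```python
def choose_acquisition(set_bytes, bandwidth_bytes_per_s, latency_s, local_build_s):
    """Return 'fetch' if copying a stored set is expected to be faster, else 'build'."""
    transfer_s = latency_s + set_bytes / bandwidth_bytes_per_s
    return "fetch" if transfer_s < local_build_s else "build"
```

For instance, over a slow link the estimated transfer time exceeds the local build time and the function returns `"build"`, while over a fast link it returns `"fetch"`.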

The reference data set acquisition unit 145 acquires a reference data set. For example, it acquires the data stored in the reference data set storage unit 139 of a data processing terminal 105 in the data processing terminal group. Alternatively, based on the reference image data distributed from the scheduler 103, the reference data set acquisition unit 145 creates the data of a rotated image group, associates the created data and the data of the unrotated reference image with the identifier of the reference image to form a reference data set, and stores the reference data set in the reference data set storage unit 139.

類似度パラメータ付与部１４７は、測定画像のデータおよびその平行移動画像のデータとリファレンスデータセットとの位置合わせを行う。また、これらを比較して、リファレンス画像のデータと測定画像のデータとの類似度を示す類似度パラメータを算出する。類似度パラメータを算出する際に、既知の方法を用いてリファレンス画像のデータおよび測定画像のデータのベクトル化処理を行ってもよい。このとき、類似度パラメータは、ベクトル化されたリファレンス画像のデータと測定画像のデータとの内積を計算することにより取得される。また、内積を反映するパラメータを算出して類似度パラメータとしてもよい。内積を算出する際に、各画像を構成するｎ×ｎのデータマトリックスを対角化処理してもよい。また、測定画像のデータのＳ／Ｎ比が低い場合、類似度パラメータの算出に先立ち、既知の方法によるフィルタリングを行ってもよい。得られた類似度パラメータを測定画像の識別子に関連づけて類似度情報記憶部１５１に格納する。 The similarity parameter assigning unit 147 aligns the measurement image data and the translation image data with the reference data set. Also, by comparing these, a similarity parameter indicating the similarity between the reference image data and the measurement image data is calculated. When calculating the similarity parameter, the reference image data and the measurement image data may be vectorized using a known method. At this time, the similarity parameter is obtained by calculating the inner product of the vectorized reference image data and the measurement image data. Also, a parameter reflecting the inner product may be calculated and used as a similarity parameter. When calculating the inner product, the n × n data matrix constituting each image may be diagonalized. When the S / N ratio of the measurement image data is low, filtering by a known method may be performed prior to the calculation of the similarity parameter. The obtained similarity parameter is stored in the similarity information storage unit 151 in association with the identifier of the measurement image.
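A similarity parameter based on the inner product of vectorized images might look like the following sketch. Normalizing the inner product is one example of "a parameter reflecting the inner product" and is an assumption here, as are the names `vectorize`, `similarity`, and `best_similarity`; filtering and diagonalization are omitted.

```python
import math


def vectorize(image):
    """Flatten an n x n image (list of rows) into one vector."""
    return [px for row in image for px in row]


def similarity(measurement, reference):
    """Normalized inner product of two vectorized images."""
    m, r = vectorize(measurement), vectorize(reference)
    inner = sum(a * b for a, b in zip(m, r))
    norm = math.sqrt(sum(a * a for a in m)) * math.sqrt(sum(b * b for b in r))
    return inner / norm if norm else 0.0


def best_similarity(measurement_variants, reference_set):
    """Highest similarity over every (translated measurement, rotated reference) pair."""
    return max(
        similarity(m, r) for m in measurement_variants for r in reference_set
    )
```

Taking the maximum over all translated and rotated variants is one way to combine alignment and scoring in a single pass.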

For each reference image, the averaging processing unit 149 averages the data of the plurality of measurement images classified as having a high similarity (optimum similarity) to that reference image. The data obtained by the averaging is stored in the averaged information storage unit 153 and is sent to the scheduler 103 via the second transmission/reception unit 119.

なお、平均化処理部１４９における平均化処理の方法として、たとえば、ｎ×ｎのマトリックスの各ドットごとに、複数の測定画像のデータの算術平均を求める方法が挙げられる。また、平均化処理が、ノイズ除去の処理を含んでいてもよい。ノイズ除去の処理として、たとえば、ノイズを近似的に白色ノイズ化して除去するノイズアベレージングが挙げられる。 An example of the averaging processing method in the averaging processing unit 149 is a method of obtaining an arithmetic average of data of a plurality of measurement images for each dot of an n × n matrix. The averaging process may include a noise removal process. As the noise removal processing, for example, noise averaging that removes noise by approximating it to white noise can be cited.
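The dot-by-dot arithmetic mean mentioned above can be written as a short sketch. The function name `average_images` and the list-of-rows representation are assumptions for illustration; noise removal is not included.

```python
def average_images(images):
    """Arithmetic mean of same-sized 2D images, computed dot by dot."""
    count = len(images)
    if count == 0:
        raise ValueError("need at least one image")
    height, width = len(images[0]), len(images[0][0])
    return [
        [sum(img[y][x] for img in images) / count for x in range(width)]
        for y in range(height)
    ]
```

Because uncorrelated noise tends to cancel in the mean, averaging many measurement images of the same class raises the signal-to-noise ratio of the resulting reference.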

第二記憶部１０９は、測定データ記憶部１３１、リファレンスデータ記憶部１１１、類似度情報記憶部１５１および平均化情報記憶部１５３を含む。 The second storage unit 109 includes a measurement data storage unit 131, a reference data storage unit 111, a similarity information storage unit 151, and an averaged information storage unit 153.

The measurement data storage unit 131 includes a measurement image data storage unit 133 and a translation image information storage unit 135.

The measurement image data storage unit 133 stores the measurement image data, which is acquired based on the distribution information for the measurement image data presented by the scheduler 103. The translation image information storage unit 135 stores a conversion formula for converting measurement image data into translated image data. Usually, the data of a group of translated images is created from one piece of measurement data. The translation image information storage unit 135 may also store the created translated image data.

リファレンスデータ記憶部１１１は、リファレンス画像データ記憶部１３７およびリファレンスデータセット記憶部１３９を含む。 The reference data storage unit 111 includes a reference image data storage unit 137 and a reference data set storage unit 139.

データ処理システム１１０の複数のリファレンス画像データ記憶部１３７には、複数のリファレンス画像のデータが分散される。また、リファレンス画像が更新された際には、複数の平均化画像のデータが更新リファレンス画像として再分配される。なお、リファレンスデータの更新処理の詳細については後述する。 In the plurality of reference image data storage units 137 of the data processing system 110, data of a plurality of reference images is distributed. Further, when the reference image is updated, the data of the plurality of averaged images is redistributed as the updated reference image. Details of the reference data update process will be described later.

The reference data set storage unit 139 stores the data of the reference set obtained from a distributed reference image, that is, the reference data set, in association with the identifier of the reference image data. The reference set is a group of images obtained by rotating the reference image, for example, from 0 degrees to 360 degrees at a pitch of 1 degree. The image rotated by 0 degrees corresponds to the original reference image. The rotation angle of the reference image preferably covers at least the range from 0 degrees to 90 degrees. In that case, the reference data set includes the reference image data and the data of a group of rotated images obtained by rotating the reference image from 0 degrees to 90 degrees. Creating only the rotation data for 0 to 90 degrees in advance markedly reduces the data volume compared with storing all rotated images up to 360 degrees. Moreover, if the rotation data for 0 to 90 degrees, corresponding to four-fold symmetry, is created in advance, the data of the rotated images from 0 to 360 degrees needed when evaluating the similarity between a measurement image and a reference image can easily be obtained afterward by sign inversion in the X-axis or Y-axis direction. The pitch angle of the rotated images is, for example, 5 degrees or less, and more specifically 1 degree. This improves the accuracy with which the similarity between the measurement data and the reference data set is evaluated.
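One way to read the extension from 0 to 90 degrees up to 360 degrees is that any angle splits into a number of quarter turns plus a remainder below 90 degrees, and a quarter turn of a square pixel grid is an exact index remapping (a coordinate swap with one sign inverted). The sketch below follows that reading; the counterclockwise convention, the dictionary of stored images keyed by integer degrees, and the names `quarter_turn` and `rotated_from_stored` are assumptions.

```python
def quarter_turn(image):
    """Rotate a square image (list of rows) by 90 degrees counterclockwise."""
    n = len(image)
    return [[image[x][n - 1 - y] for x in range(n)] for y in range(n)]


def rotated_from_stored(stored, angle):
    """Return the image for any whole-degree angle from stored 0..89 degree rotations.

    `stored` maps a rotation in degrees (0..89) to the rotated image.
    """
    turns, remainder = divmod(angle % 360, 90)
    image = stored[remainder]
    for _ in range(turns):
        image = quarter_turn(image)
    return image
```

No interpolation is needed for the quarter turns, so the derived images carry no additional resampling error beyond that of the stored ones.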

The similarity information storage unit 151 stores a calculation formula for calculating a parameter indicating the similarity between the reference image data and the measurement image data.

平均化情報記憶部１５３は、リファレンス画像ごとに、当該リファレンス画像に対して最も高い類似度を有すると分類された測定画像のデータを平均化する際の計算式を記憶する。 The averaged information storage unit 153 stores, for each reference image, a calculation formula used when averaging data of measurement images classified as having the highest similarity to the reference image.

The scheduler 103 includes a third transmission/reception unit 121, a calculation unit 123, and a third storage unit 129.

演算部１２３は、データ処理部選択部１５５、測定データ分類部１５７、および構造推定部１２５を有する。 The calculation unit 123 includes a data processing unit selection unit 155, a measurement data classification unit 157, and a structure estimation unit 125.

データ処理部選択部１５５は、複数のデータ処理部１１３の稼働状態に基づいて、類似度を評価する処理について複数のリファレンス画像が割り当てられる複数のデータ処理部１１３を選択する。また、データ処理部選択部１５５は、リファレンスデータセットを取得する複数のデータ処理部１１３をさらに選択する。 The data processing unit selection unit 155 selects a plurality of data processing units 113 to which a plurality of reference images are assigned for the process of evaluating the similarity based on the operating states of the plurality of data processing units 113. In addition, the data processing unit selection unit 155 further selects a plurality of data processing units 113 that acquire the reference data set.

That is, the data processing unit selection unit 155 selects the data processing units 113 that handle the data processing for
(i) creation of the reference data sets, and
(ii) evaluation of the similarity between the measurement images and the reference data sets
described above. In doing so, the data processing unit selection unit 155, for example, detects the load on each data processing unit 113 and selects data processing units 113 with a light load. It may also acquire CPU occupancy data or operating state information for the data processing terminals 105. The data processing unit selection unit 155 can also acquire data on the time required to communicate with each data processing unit 113 and select the data processing unit 113 of a data processing terminal 105 to which data can be sent in a short time.
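Selecting lightly loaded units can be sketched as a sort over reported status values. The status fields `cpu_load` and `comm_s`, the tie-breaking rule, and the name `select_units` are hypothetical; the source names load, CPU occupancy, and communication time as possible criteria without fixing how they are combined.

```python
def select_units(status, count):
    """Pick `count` units with the lowest CPU load, ties broken by communication time."""
    ranked = sorted(
        status, key=lambda unit: (status[unit]["cpu_load"], status[unit]["comm_s"])
    )
    return ranked[:count]


status = {
    "105a": {"cpu_load": 0.2, "comm_s": 0.05},
    "105b": {"cpu_load": 0.9, "comm_s": 0.01},
    "105c": {"cpu_load": 0.2, "comm_s": 0.01},
}
selected = select_units(status, 2)
```

Here the two units with a load of 0.2 are chosen, and the one with the shorter communication time is ranked first.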

The measurement data classification unit 157 associates each of the plurality of measurement images with one of the plurality of reference images based on the result of the similarity evaluation. Specifically, the measurement data classification unit 157 acquires the measurement image data to which similarity parameters have been assigned, selects for each measurement image the reference image with the highest degree of match, creates classification data that associates the identifier of that reference image with the identifier of the measurement image, and stores the classification data in the calculation information storage unit 163. When the references are updated, it likewise associates each of the plurality of measurement images with one of the plurality of updated reference images based on the result of the similarity evaluation.
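The classification step amounts to picking, for each measurement image, the reference with the highest similarity parameter. A minimal sketch, assuming the scores collected from the data processing units are available as a nested dictionary (the layout and the name `classify` are hypothetical):

```python
def classify(similarities):
    """Map each reference id to the measurement ids that match it best.

    `similarities[measurement_id][reference_id]` is the similarity parameter.
    """
    classes = {}
    for measurement_id, scores in similarities.items():
        best_ref = max(scores, key=scores.get)
        classes.setdefault(best_ref, []).append(measurement_id)
    return classes


similarities = {
    "m1": {"a": 0.9, "b": 0.1},
    "m2": {"a": 0.2, "b": 0.7},
    "m3": {"a": 0.8, "b": 0.3},
}
classification = classify(similarities)
```

The resulting mapping plays the role of the classification data: each reference identifier is associated with the identifiers of the measurement images to be averaged for it.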

The structure estimation unit 125 reconstructs the three-dimensional structure of the membrane protein under analysis based on the updated reference images created by the averaging processing unit 149. In doing so, the structure estimation unit 125 acquires, for each of a plurality of reference images, the data of the averaged image obtained by averaging the measurement images that the measurement data classification unit 157 associated with the same reference image, and estimates the three-dimensional structure of the molecule based on the acquired averaged images. The structure estimation unit 125 may also create three-dimensional image data from the data of a plurality of planar images by referring to an arithmetic expression stored in the calculation information storage unit 163. When the references are updated, the structure estimation unit 125 likewise acquires, for each of a plurality of updated reference images, the data of the averaged image obtained by averaging the measurement images associated with the same updated reference image, and estimates the three-dimensional structure of the molecule based on the acquired averaged images.

第三記憶部１２９は、リファレンス情報記憶部１５９、測定データ配置情報記憶部１６１および演算情報記憶部１６３を有する。 The third storage unit 129 includes a reference information storage unit 159, a measurement data arrangement information storage unit 161, and a calculation information storage unit 163.

リファレンス情報記憶部１５９は、リファレンスデータに関する情報を記憶する。リファレンス情報記憶部１５９は、イニシャルリファレンスデータ記憶部（不図示）と、リファレンス分配情報記憶部（不図示）を備える。イニシャルリファレンスデータ記憶部は、構造推定の際に最初のリファレンスとして用いる複数のイニシャルリファレンス画像のデータが格納されている。また、リファレンス分配情報記憶部は、複数のリファレンスデータセットの分配状況に関する情報を記憶する。リファレンス分配情報記憶部には、どのデータ処理端末１０５のデータ処理部１１３にどのリファレンスセットが記憶されているかの対応付け情報が格納されている。具体的には、データ処理端末１０５の識別子とリファレンスセットの識別子とが関連づけて記憶されている。 The reference information storage unit 159 stores information related to reference data. The reference information storage unit 159 includes an initial reference data storage unit (not shown) and a reference distribution information storage unit (not shown). The initial reference data storage unit stores data of a plurality of initial reference images used as an initial reference in structure estimation. The reference distribution information storage unit stores information related to the distribution status of a plurality of reference data sets. In the reference distribution information storage unit, association information indicating which reference set is stored in the data processing unit 113 of which data processing terminal 105 is stored. Specifically, the identifier of the data processing terminal 105 and the identifier of the reference set are stored in association with each other.

測定データ配置情報記憶部１６１は、測定データの配置場所に関する情報を記憶する。測定データは、たとえばユーザ端末１０１に記憶されている。 The measurement data arrangement information storage unit 161 stores information regarding the arrangement location of measurement data. The measurement data is stored in the user terminal 101, for example.

演算情報記憶部１６３は、測定データ分類部１５７にて作成された分類データを記憶する。また、構造推定部１２５における演算に用いる演算式が格納されていてもよい。また、演算情報記憶部１６３には、更新回数に関する情報が記憶されていてもよい。更新回数に関する情報は、たとえば、リファレンスデータの必要な更新回数を試料の種類を示す識別子に対応づけたものである。 The calculation information storage unit 163 stores the classification data created by the measurement data classification unit 157. Moreover, the arithmetic expression used for the calculation in the structure estimation part 125 may be stored. Further, the calculation information storage unit 163 may store information related to the number of updates. The information regarding the number of updates is, for example, the correspondence between the required number of updates of reference data and an identifier indicating the type of sample.

次に、構造推定システム１００を用いた構造推定手順を説明する。構造推定は、図７を参照して前述した単粒子解析法に基づき行われる。前述した（ｉ）（ｉｉ）のデータ処理が、それぞれ、複数のデータ処理部１１３による分散処理により行われる。 Next, a structure estimation procedure using the structure estimation system 100 will be described. The structure estimation is performed based on the single particle analysis method described above with reference to FIG. The data processing (i) and (ii) described above are performed by distributed processing by the plurality of data processing units 113, respectively.

The structure estimation procedure using the structure estimation system 100 is broadly divided into the following four stages.
(1) Structural analysis request through reference data set creation
(2) Comparison of measurement data with reference data sets through reference update
(3) Repetition
(4) Structure estimation
Each stage is described below with reference to FIGS. 1 to 3, 5, and 6 as appropriate. FIG. 5 shows the analysis procedure in the scheduler 103 and the data processing terminal 105. FIG. 6 shows in detail the procedure for acquiring the reference data sets within the procedure of FIG. 5.

(1) Structural analysis request through reference data set creation
First, the user terminal 101 accepts a request from the user and requests structural analysis of the molecule based on a plurality of measurement images. As shown in FIG. 5, the scheduler 103 accepts the structural analysis request from the user terminal 101 (S101) and requests data processing from the data processing system 110, which consists of a plurality of data processing terminals 105. At this point, the scheduler 103 queries each data processing terminal 105 for its operating status (S103). Based on the reports from the data processing terminals (S201), the scheduler 103 determines the terminals that will acquire the reference data sets (S105) and distributes the initial reference data to those terminals. The scheduler 103 also determines the terminals that will process the measurement image data (S105) and sends those terminals information on the distribution of the measurement image data. In this way, the data of the plurality of reference images is distributed to the plurality of reference data storage units 111 included in the data processing system 110. For each reference data storage unit 111 to which reference image data has been distributed, a reference data set is acquired for the reference images distributed to it.

以下においては、まず、図６および図１を参照して、リファレンスデータセットの取得手順について説明する。その後、図５にもどり、測定画像のデータ処理の手順について説明する。 In the following, first, a reference data set acquisition procedure will be described with reference to FIGS. 6 and 1. Thereafter, returning to FIG. 5, the procedure of the data processing of the measurement image will be described.

図６に示したように、スケジューラ１０３は、データ処理端末１０５の稼働状況を確認し（Ｓ１５１）、データ処理端末１０５から報告される稼働状況（Ｓ２５１）に応じて、リファレンスデータの配布端末およびリファレンスデータの作成端末を決定する（Ｓ１５３、Ｓ１５７）。リファレンスデータが配布されるデータ処理端末１０５はリファレンスデータ記憶部１１１を有し、リファレンスデータの作成を担当する端末は、データ処理部１１３を有する。 As shown in FIG. 6, the scheduler 103 confirms the operating status of the data processing terminal 105 (S151), and according to the operating status (S251) reported from the data processing terminal 105, the reference data distribution terminal and the reference A data creation terminal is determined (S153, S157). A data processing terminal 105 to which reference data is distributed has a reference data storage unit 111, and a terminal in charge of creation of reference data has a data processing unit 113.

次に、スケジューラ１０３は、リファレンスデータの配布端末に複数の初期のリファレンスデータ（イニシャルリファレンスデータ）を分配する（Ｓ１５５）。このとき、スケジューラ１０３は、データ処理端末１０５と通信して、データ処理端末１０５から報告される稼働状況を確認しながら、イニシャルリファレンスデータの受信速度に応じてデータ処理端末１０５にイニシャルリファレンスデータを配布する（Ｓ１５５）。たとえば、スケジューラ１０３は一つのデータ処理端末１０５に対し複数回に分けてイニシャルリファレンスデータを送信する。なお、スケジューラ１０３は、リファレンスデータ記憶部１１１を有する複数のデータ処理端末１０５のうちの一部にリファレンスデータを配布してもよい。また、スケジューラ１０３は、一つのリファレンスデータを複数のデータ処理端末１０５のリファレンスデータ記憶部１１１に配布してもよい。 Next, the scheduler 103 distributes a plurality of initial reference data (initial reference data) to reference data distribution terminals (S155). At this time, the scheduler 103 communicates with the data processing terminal 105 and distributes the initial reference data to the data processing terminal 105 according to the reception speed of the initial reference data while confirming the operation status reported from the data processing terminal 105. (S155). For example, the scheduler 103 transmits initial reference data to a single data processing terminal 105 in a plurality of times. The scheduler 103 may distribute reference data to some of the plurality of data processing terminals 105 having the reference data storage unit 111. The scheduler 103 may distribute one reference data to the reference data storage units 111 of the plurality of data processing terminals 105.

For example, in FIG. 1, the data processing terminal 105a, the data processing terminal 105b, and the data processing terminal 105d, each having a reference data storage unit 111, are selected as the terminals to which reference data is distributed. The scheduler 103 distributes ref.a to ref.c to the data processing terminal 105a, ref.d to ref.f to the data processing terminal 105b, and ref.g to ref.i to the data processing terminal 105d.

In step 155, the scheduler 103 also requests the data processing terminals 105 having a data processing unit 113 to create reference data sets (S159). For example, in FIG. 1, it requests the data processing terminal 105a to create the reference data set for each of ref.a to ref.c, the data processing terminal 105b to create the reference data sets for ref.d to ref.f, and the data processing terminal 105d to create the reference data sets for ref.g to ref.i.

A data processing terminal 105 that has accepted the request to create a reference data set selects the method of acquiring the reference data set in the reference data set acquisition method selection unit 143 (S255). The reference data set acquisition unit 145 creates the rotated image data and the reference data set by the selected method (S261) and stores them in the reference data set storage unit 139. For example, the reference data set acquisition unit 145 acquires the reference image data stored in the reference image data storage unit 137 (S257) and, for each reference image, creates the data of a group of rotated images obtained by rotating the image from 1 degree to 360 degrees at a pitch of, for example, 1 degree (S259). It then creates a reference data set that includes the data of the rotated image group and the data of the image rotated by zero degrees, that is, the reference image (S261).

In step 259, instead of creating the data of the rotated image group up to 360 degrees, it is also possible to create the data only up to, for example, 90 degrees. In that case, in the similarity parameter calculation procedure described later, the data of the rotated image group up to 360 degrees can be generated from the data up to 90 degrees by sign inversion. This reduces both the amount of reference data set data created in step 159 and the amount of data stored in the reference data set storage unit 139, which speeds up the creation of the reference data sets and also makes the similarity parameter calculation more efficient.

In the example shown in FIG. 1 and FIG. 3, for instance, the data processing unit 113 of the data processing terminal 105a refers to ref.a to ref.c stored in its own reference image data storage unit 137 and creates a rotated image group for each reference image. It thereby obtains ref.seta and ref.setb as reference data sets. The obtained reference data sets are stored in its own reference data set storage unit 139. Reference data sets are created for the data processing terminals 105b and 105d by the same method as for the data processing terminal 105a.

Note that the data processing terminal 105 need not create the reference data set for a reference stored in its own reference data storage unit 111. The scheduler 103 may instead send the reference image for which a reference data set is to be created, together with the arrangement status of that reference image's data, to the terminal that is to create the reference data set. In this way, reference data sets can be created even more quickly in accordance with the operating status of the data processing terminals 105.

(2) Comparison of measurement data with reference data sets, through to reference update
Returning to FIG. 5, the scheduler 103 assigns, to the plurality of data processing units 113 included in the data processing system 110, the processing of evaluating the similarity between the plurality of measurement images and the plurality of reference images. The scheduler 103 then presents, to each data processing terminal 105 that is to process the measurement images, the measurement data arrangement information stored in the measurement data arrangement information storage unit 161 and the identifiers of the reference images to be processed, and requests data processing (S107). In the example of FIG. 1, data processing for ref.a to ref.b, ref.c to ref.d, ref.e to ref.f, ref.g to ref.h, and ref.i is requested of the data processing terminals 105a to 105e, respectively. A data processing terminal 105 that has received the request acquires the presented measurement data arrangement information (S203).
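
The assignment in the FIG. 1 example can be reproduced with a simple block split, sketched below. The text does not state how the scheduler computes the assignment, so the contiguous-block rule and the name `assign_references` are assumptions made for illustration.

```python
import math

def assign_references(reference_ids, terminal_ids):
    """Split reference identifiers into contiguous blocks, one block per terminal."""
    block = math.ceil(len(reference_ids) / len(terminal_ids))
    return {
        terminal: reference_ids[i * block:(i + 1) * block]
        for i, terminal in enumerate(terminal_ids)
    }

# Nine references (ref.a to ref.i) spread over five terminals (105a to 105e).
refs = ["ref." + c for c in "abcdefghi"]
terminals = ["105" + c for c in "abcde"]
assignment = assign_references(refs, terminals)
```

Here terminal 105a receives ref.a and ref.b, and terminal 105e receives only ref.i, as in the example above.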

Each data processing terminal acquires the reference data set of the reference image assigned to it as an initial reference (S205). The acquisition method is as described above with reference to FIG. 6. The data processing terminal 105 may create the reference data set itself, or may acquire it from the reference data storage unit 111 of another terminal that stores the reference data set of the assigned reference image. The terminal may also communicate with another data processing terminal 105 and, based on the response time, compare the time needed to receive the reference data set with the time needed to create it locally, and decide on that basis whether to create it itself. When the reference data set is to be acquired from another data processing terminal 105 and the target data set is stored on several data processing terminals 105, the terminal also determines which of them can deliver the data fastest. Further, when the reference data set of the target reference is already stored in the terminal's own reference data storage unit 111, as with the data processing terminal 105a in FIG. 1, that data set can be retrieved and used.
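
The choice between using a local copy, fetching from the fastest peer, and creating the data set locally can be sketched as a small decision function. The name `choose_acquisition`, its parameters, and the use of estimated times in seconds are assumptions of this example; how the estimates are obtained (for instance from measured response times) is left open.

```python
def choose_acquisition(stored_locally, local_creation_s, peer_transfer_s):
    """Return (method, peer) for obtaining a reference data set.

    stored_locally   -- True if the data set is already in local storage
    local_creation_s -- estimated seconds to create the data set locally
    peer_transfer_s  -- {peer_id: estimated seconds to receive it from that peer}
    """
    if stored_locally:
        return ("use_local", None)
    if peer_transfer_s:
        # Pick the peer expected to deliver the data set fastest.
        fastest = min(peer_transfer_s, key=peer_transfer_s.get)
        if peer_transfer_s[fastest] < local_creation_s:
            return ("fetch", fastest)
    return ("create", None)
```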

The data processing terminal 105 also acquires the measurement data in the similarity parameter assignment unit 147 on the basis of the measurement data arrangement information (S207). For example, when the measurement data is stored in the first storage unit 107 of the user terminal 101, the terminal acquires the stored measurement data. The data processing terminal 105 further acquires translated images of the measurement data (S209). Here, the data processing terminal 105 may create the translated image data itself by referring to the measurement data stored in the measurement image data storage unit 133 and the arithmetic expression stored in the translated image information storage unit 135. Alternatively, it may acquire translated image data stored in its own measurement image data storage unit 133 or in another terminal.

The data processing terminal 105 then compares the measurement image data, or the data of translated images obtained by translating the measurement image, with the reference data set, which consists of the data of the assigned reference image and the data of the group of rotated images obtained by rotating that reference image, and thereby evaluates the similarity between the measurement image and the assigned reference image. Specifically, the data processing terminal 105 compares the reference data set of its assigned reference with the measurement data and its translated images, calculates a parameter indicating the similarity between the measurement data and the reference, stores the measurement data identifier, the reference identifier, and the similarity parameter in association with one another in the similarity information storage unit 151, and transmits them to the scheduler 103 (S211).
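
The comparison in step S211 can be sketched as an exhaustive search over translated measurement images and precomputed rotated reference images. The text does not name the similarity measure, so the normalized cross-correlation used here is an assumption, as are the function names and the integer-pixel shifts.

```python
import math

def translate(image, dx, dy, fill=0.0):
    """Shift an image (list of rows) by (dx, dy) pixels, padding with `fill`."""
    rows, cols = len(image), len(image[0])
    out = [[fill] * cols for _ in range(rows)]
    for y in range(rows):
        for x in range(cols):
            sx, sy = x - dx, y - dy
            if 0 <= sx < cols and 0 <= sy < rows:
                out[y][x] = image[sy][sx]
    return out

def ncc(a, b):
    """Normalized cross-correlation of two equally sized images (assumed metric)."""
    fa = [v for row in a for v in row]
    fb = [v for row in b for v in row]
    ma, mb = sum(fa) / len(fa), sum(fb) / len(fb)
    num = sum((p - ma) * (q - mb) for p, q in zip(fa, fb))
    den = math.sqrt(sum((p - ma) ** 2 for p in fa) * sum((q - mb) ** 2 for q in fb))
    return num / den if den else 0.0

def similarity_parameter(measurement, reference_data_set, max_shift=1):
    """Return (best score, rotation angle, (dx, dy)) over all shifts and rotations."""
    best = (-2.0, None, None)
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            shifted = translate(measurement, dx, dy)
            # The rotated references are looked up, not recomputed.
            for angle, ref_image in reference_data_set.items():
                score = ncc(shifted, ref_image)
                if score > best[0]:
                    best = (score, angle, (dx, dy))
    return best
```

Because the rotated reference images are already present in `reference_data_set`, the inner loop performs only comparisons and never rotates the measurement image.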

Based on the results of the similarity evaluation, the scheduler 103 associates each of the measurement images with one of the reference images. To do so, the scheduler 103 acquires the measurement data with similarity parameters sent from each data processing terminal 105 (S109), extracts, for each item of measurement data, the reference with the highest similarity, attaches a classification parameter to the measurement data identifier, stores it in the calculation information storage unit 163, and sends it to the data processing terminals 105 (S111). The classification parameter is, for example, the identifier of the reference image with the highest similarity.
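
The classification in steps S109 to S111 amounts to taking, for each measurement image, the reference with the highest similarity. A minimal sketch follows; the record layout `(measurement_id, reference_id, similarity)` and the name `classify` are assumptions.

```python
def classify(similarities):
    """Map each measurement identifier to its most similar reference identifier.

    similarities -- iterable of (measurement_id, reference_id, similarity)
    """
    best = {}
    for measurement_id, reference_id, similarity in similarities:
        if measurement_id not in best or similarity > best[measurement_id][1]:
            best[measurement_id] = (reference_id, similarity)
    # The reference identifier serves as the classification parameter.
    return {m: ref for m, (ref, _) in best.items()}
```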

The data processing terminal 105 acquires the reference image data to which the classification parameter has been assigned (S213). Then, in the averaging processing unit 149, it refers to the data of the measurement images labeled with the identifier of their most similar reference image and averages the measurement images associated with the same reference (S215). The averaged image data obtained is then written to the reference image data storage unit 137 as the new reference image data (S217).
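
The averaging in step S215 can be sketched as a pixel-wise mean over the measurement images that share a classification parameter. The sketch assumes the images are already aligned and of equal size; `average_by_reference` is a hypothetical name.

```python
def average_by_reference(measurements, classification):
    """Average, pixel by pixel, the measurement images assigned to each reference.

    measurements   -- {measurement_id: image as list of rows}
    classification -- {measurement_id: reference_id}
    """
    groups = {}
    for measurement_id, reference_id in classification.items():
        groups.setdefault(reference_id, []).append(measurements[measurement_id])
    return {
        reference_id: [
            [sum(pixels) / len(images) for pixels in zip(*rows)]
            for rows in zip(*images)
        ]
        for reference_id, images in groups.items()
    }
```

Each averaged image returned here would replace the corresponding reference image in step S217.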

(3) Iteration
Of the above procedure, the steps from the acquisition of the reference data set in step S205 through the reference update in step S217 are repeated a predetermined number of times (YES in S219).
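
The repeated cycle of classification and reference update can be illustrated with a deliberately simplified loop. In this toy sketch images are flat lists of pixel values, rotation and translation are omitted, and the score is a negative squared distance; all of these, along with the function names, are assumptions chosen to keep the example short.

```python
def neg_squared_distance(a, b):
    """Toy similarity score: larger (closer to zero) means more similar."""
    return -sum((p - q) ** 2 for p, q in zip(a, b))

def refine_references(references, measurements, n_iterations, score):
    """Repeat classify-then-average for a fixed number of iterations.

    references   -- {reference_id: flat image}
    measurements -- {measurement_id: flat image}
    """
    refs = dict(references)
    for _ in range(n_iterations):
        groups = {rid: [] for rid in refs}
        for image in measurements.values():
            best = max(refs, key=lambda rid: score(image, refs[rid]))
            groups[best].append(image)
        for rid, members in groups.items():
            if members:  # keep the old reference if nothing was assigned to it
                refs[rid] = [sum(px) / len(members) for px in zip(*members)]
    return refs
```

Starting from arbitrary initial references, each pass moves a reference toward the mean of the measurement images currently assigned to it.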

(4) Structure estimation
After the reference update has been repeated the predetermined number of times (NO in S219), the data processing terminal 105 presents the updated reference image data to the scheduler 103 (S221). The scheduler 103 acquires the data of the plurality of averaged images, each obtained by averaging the measurement images associated with one of the reference images, and estimates the three-dimensional structure of the molecule from the averaged image data. Here, the structure estimation unit 125 creates estimated data of the three-dimensional structure of the membrane protein under analysis on the basis of the arithmetic expression stored in the calculation information storage unit 163 and the acquired reference image data (S113), and presents it to the user terminal 101 (S115).

According to this embodiment, when the three-dimensional structure of a biopolymer is estimated by single particle analysis, the data of reference sets, each consisting of a reference image and its group of rotated images (reference data sets), are stored in advance in the plurality of reference data storage units 111. Conventionally, no such reference data set was prepared in advance. Instead, translated images of each measurement image and the corresponding rotated images were created, aligned with the reference image, and evaluated for similarity. The data processing load was therefore high and concentrated on the terminal performing the processing, and analysis took a long time. In this embodiment, by contrast, the data processing system 110 holds the reference set data in advance, so the similarity parameter is easy to calculate and there is no need to evaluate similarity while rotating the measurement image. Data processing can therefore be carried out efficiently and quickly. In addition, even when the user terminal 101 has no data processing unit, it can request structure estimation via the network 115 and obtain the analysis result quickly.

Furthermore, in this embodiment, the reference data sets are created by distributed processing across the plurality of data processing units 113, and the comparison of the reference data sets with the measurement image data is likewise distributed across the plurality of data processing units 113. This keeps the load from concentrating on any particular data processing unit 113 in the data processing system 110 and allows the reference sets to be created efficiently and quickly.

The scheduler 103 also has a data processing unit selection unit 155. The data processing unit selection unit 155 can therefore select, for example, a data processing terminal 105 that is close on the network, a data processing terminal 105 whose data processing unit 113 is lightly loaded, or a data processing terminal 105 with a wide communication bandwidth, and request distributed processing from it or distribute data to it. It can also request the data processing unit 113 of a given data processing terminal 105 to perform the comparison with a reference data set stored in the reference data storage unit 111 of that same terminal. Processing speed can thereby be improved further.
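
A selection of this kind can be sketched as a ranking over per-terminal status records. The status fields (`load`, `hops`, `bandwidth_mbps`), the ranking order, and the name `select_terminals` are all assumptions of this example; the text lists the criteria but not how they are combined.

```python
def select_terminals(status, count):
    """Pick the `count` most suitable terminals.

    status -- {terminal_id: {"load": float, "hops": int, "bandwidth_mbps": float}}
    Terminals are ranked by lower load, then fewer network hops, then higher
    bandwidth; this ordering is an assumption of the sketch.
    """
    ranked = sorted(
        status,
        key=lambda t: (
            status[t]["load"],
            status[t]["hops"],
            -status[t]["bandwidth_mbps"],
        ),
    )
    return ranked[:count]
```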

As described above, the structure estimation system 100 applies distributed processing by the plurality of data processing units 113 to each of the following two processes, which impose a heavy data processing load:
(i) creation of the reference data sets, and
(ii) evaluation of the similarity between the measurement images and the reference data sets.
The reference image data is distributed to the plurality of data processing terminals 105, and the terminals share the work of creating the reference data sets of the reference images. As a result, data processing that uses large amounts of measurement data and large numbers of reference data sets can be carried out efficiently and quickly without placing a load on the user terminal 101 or on any particular data processing terminal 105.

In this embodiment, the similarity between the reference data set and the measurement image data is evaluated and a similarity parameter is attached to the measurement image identifier, so that each item of measurement data can be efficiently associated with, and classified under, the identifier of the reference image it most resembles. Analysis using many measurement images taken from different directions can therefore be performed efficiently.

In this embodiment, the averaged image obtained by averaging the measurement data classified under each reference is adopted as the new reference image, and this update is repeated. Consequently, even for samples such as membrane proteins, for which three-dimensional structure data has not accumulated because crystallization is difficult by conventional methods, the reference images can be brought closer to the real images by starting from a predetermined image as the initial reference and repeating the update. The structure estimation system 100 can also be used effectively to analyze biopolymers other than membrane proteins. For example, it may be used to analyze the three-dimensional structures of polysaccharides and glycoproteins.

(Second embodiment)
The first embodiment illustrated a configuration in which the reference data set acquisition method selection unit 143 is provided in the data processing terminal 105. In the structure estimation system 100 shown in FIG. 2, however, the reference data set acquisition method selection unit 143 may instead be provided in the scheduler 103. This embodiment describes such a configuration.

FIG. 4 is a functional block diagram showing details of the scheduler 103 and the data processing terminal 105 of the structure estimation system of this embodiment. In FIG. 4, the scheduler 103 includes the reference data set acquisition method selection unit 143, and each of the data processing units 113 includes a reference data set acquisition method selection reception unit 165 that selects, for the reference data set acquisition unit 145, the method of acquiring the reference data set.

In this embodiment, the reference data set acquisition method selection unit 143 of the scheduler 103 monitors the operating status of the data processing terminals 105 and selects the method of acquiring the reference data set. Information on the selected acquisition method is then presented to the data processing terminal 105 that is to perform the data processing. The data processing terminal 105 receives this information in the reference data set acquisition method selection reception unit 165, and the reference data set acquisition unit 145 acquires or creates the reference data set according to the method received.

In this embodiment, because the scheduler 103 monitors the operating status of the whole group of data processing terminals and instructs each data processing terminal 105 that performs data processing how to acquire the reference data set, the acquisition of reference data sets across the system as a whole can be carried out even more efficiently.

Embodiments of the present invention have been described above with reference to the drawings, but these are merely examples of the invention, and various configurations other than those described above may also be adopted.

For example, the above embodiments describe the case in which the averaging processing unit 149 is provided in the data processing terminal 105, but the scheduler 103 may have the averaging processing unit 149 instead. An averaging processing unit 149 may also be provided in both the scheduler 103 and the data processing terminal 105. In that case, the scheduler 103 may have an averaging method determination unit that checks the operating status of the data processing terminals 105 and determines whether to have a data processing terminal 105 perform the averaging or to perform it in its own averaging processing unit 149.

In the above embodiments, the scheduler 103 may also consist of a plurality of terminal devices, with its functional blocks distributed among them. Likewise, the user terminal 101 may consist of a plurality of terminals.

In the above embodiments, the substance to be analyzed may consist of a single molecule or may be a supramolecular structure formed by a plurality of molecules.

FIG. 1 is a diagram illustrating the configuration of the structure estimation system according to the embodiment.
FIG. 2 is a functional block diagram showing the configuration of the structure estimation system according to the embodiment.
FIG. 3 is a functional block diagram showing details of the scheduler and the data processing terminal of the structure estimation system of FIG. 2.
FIG. 4 is a functional block diagram showing details of the scheduler and the data processing terminal of the structure estimation system of FIG. 2.
FIG. 5 is a diagram illustrating the structure estimation procedure according to the embodiment.
FIG. 6 is a diagram illustrating the structure estimation procedure according to the embodiment.
FIG. 7 is a diagram illustrating the single particle analysis method.

Explanation of symbols

100 Structure estimation system
101 User terminal
103 Scheduler
105 Data processing terminal group
105a Data processing terminal
105b Data processing terminal
105c Data processing terminal
105d Data processing terminal
105e Data processing terminal
107 First storage unit
109 Second storage unit
111 Reference data storage unit
113 Data processing unit
115 Network
117 First transmission/reception unit
119 Second transmission/reception unit
121 Third transmission/reception unit
123 Calculation unit
125 Structure estimation unit
127 Measurement data classification unit
129 Third storage unit
131 Measurement data storage unit
133 Measurement image data storage unit
135 Translated image information storage unit
137 Reference image data storage unit
139 Reference data set storage unit
141 Translated image acquisition unit
143 Reference data set acquisition method selection unit
145 Reference data set acquisition unit
147 Similarity parameter assignment unit
149 Averaging processing unit
151 Similarity information storage unit
153 Averaging information storage unit
155 Reference data update unit
157 Measurement data classification unit
159 Reference information storage unit
161 Measurement data arrangement information storage unit
163 Calculation information storage unit
165 Reference data set acquisition method selection reception unit

Claims

1. A structure estimation system for estimating the three-dimensional structure of a molecule from data of a plurality of images, in which
a user terminal that accepts a user request and requests a structural analysis based on a plurality of measurement images of the molecule,
a data processing system comprising a plurality of data processing terminals, and
a scheduler that accepts the request from the user terminal and requests data processing from the data processing system
are connected via a network, wherein
the data processing system includes
a plurality of reference data storage units among which data of a plurality of reference images are distributed, and
a plurality of data processing units,
each of the data processing units, in response to the data processing request from the scheduler, performs, for each of the plurality of measurement images, processing that compares the data of the measurement image, or the data of a translated image obtained by translating the measurement image, with a reference data set composed of the data of the reference image assigned to that data processing unit and the data of a group of rotated images obtained by rotating that reference image, and thereby evaluates the similarity between the measurement image and the assigned reference image, and
the scheduler includes
a measurement data classification unit that associates each of the plurality of measurement images with one of the plurality of reference images based on the result of the processing for evaluating the similarity, and
a structure estimation unit that acquires, for the plurality of reference images, data of averaged images each obtained by averaging the measurement images associated with the same reference image by the measurement data classification unit, and estimates the three-dimensional structure of the molecule based on the plurality of acquired averaged images.

2. The structure estimation system according to claim 1, wherein
the data of the plurality of averaged images are redistributed to the plurality of reference data storage units as updated reference images, and the updated reference images are assigned to the plurality of data processing units,
the data processing units perform processing for evaluating the similarity between the measurement images and the assigned updated reference images,
the measurement data classification unit associates each of the plurality of measurement images with one of the plurality of updated reference images based on the result of the processing for evaluating the similarity, and
the structure estimation unit acquires, for the plurality of updated reference images, data of averaged images each obtained by averaging the measurement images associated with the same updated reference image by the measurement data classification unit, and estimates the three-dimensional structure of the molecule based on the plurality of acquired averaged images.

3. The structure estimation system according to claim 1 or 2, wherein the scheduler includes
a data processing unit selection unit that selects, based on the operating states of the plurality of data processing units, the data processing units to which the plurality of reference images are assigned for the processing for evaluating the similarity, and
a reference distribution information storage unit that stores information on the distribution status of the plurality of reference data sets.

4. The structure estimation system according to claim 3, wherein the data processing unit selection unit further selects the data processing units that are to acquire the reference data sets.

5. The structure estimation system according to any one of claims 1 to 4, wherein each of the plurality of data processing units includes
a reference data set acquisition unit that acquires the reference data set, and
a reference data set acquisition method selection unit that selects the method by which the reference data set acquisition unit acquires the reference data set.

6. The structure estimation system according to any one of claims 1 to 4, wherein
each of the plurality of data processing units includes a reference data set acquisition unit that acquires the reference data set, and
the scheduler includes a reference data set acquisition method selection unit that selects the method by which the reference data set acquisition unit acquires the reference data set.

7. The structure estimation system according to claim 1, wherein the reference data set includes the data of the reference image and the data of a group of rotated images obtained by rotating the reference image from 0 degrees to 90 degrees.

8. The structure estimation system according to claim 1, wherein the molecule is a biopolymer.

9. The structure estimation system according to claim 8, wherein the biopolymer is a membrane protein.

10. A structure estimation method for estimating the three-dimensional structure of a molecule from data of a plurality of images, the method including:
a step in which a user terminal accepts a user request and requests a structural analysis of the molecule based on a plurality of measurement images;
a step in which a scheduler accepts the request from the user terminal and requests data processing from a data processing system including a plurality of data processing terminals;
a step in which the scheduler distributes data of a plurality of reference images among a plurality of reference data storage units included in the data processing system;
a step in which the scheduler assigns, to a plurality of data processing units included in the data processing system, processing for evaluating the similarity between the plurality of measurement images and the plurality of reference images;
a step in which each of the plurality of data processing units, in response to the data processing request from the scheduler, performs the processing for evaluating the similarity between a measurement image and the assigned reference image by comparing the data of the measurement image, or the data of a translated image obtained by translating the measurement image, with a reference data set composed of the data of the assigned reference image and the data of a group of rotated images obtained by rotating that reference image;
a step in which the scheduler associates each of the plurality of measurement images with one of the plurality of reference images based on the result of the processing for evaluating the similarity; and
a step in which the scheduler acquires data of a plurality of averaged images obtained by averaging the measurement images associated with each of the plurality of reference images, and estimates the three-dimensional structure of the molecule based on the data of the averaged images.

11. A program that causes a computer to function as a structure estimation system for estimating the three-dimensional structure of a molecule from data of a plurality of images, in which
a user terminal that accepts a user request and requests a structural analysis based on a plurality of measurement images of the molecule,
a data processing system comprising a plurality of data processing terminals, and
a scheduler that accepts the request from the user terminal and requests data processing from the data processing system
are connected via a network, wherein
the data processing system includes
a plurality of reference data storage units among which data of a plurality of reference images are distributed, and
a plurality of data processing units,
each of the data processing units, in response to the data processing request from the scheduler, performs, for each of the plurality of measurement images, processing that compares the data of the measurement image, or the data of a translated image obtained by translating the measurement image, with a reference data set composed of the data of the reference image assigned to that data processing unit and the data of a group of rotated images obtained by rotating that reference image, and thereby evaluates the similarity between the measurement image and the assigned reference image, and
the scheduler includes
a measurement data classification unit that associates each of the plurality of measurement images with one of the plurality of reference images based on the result of the processing for evaluating the similarity, and
a structure estimation unit that acquires, for the plurality of reference images, data of averaged images each obtained by averaging the measurement images associated with the same reference image by the measurement data classification unit, and estimates the three-dimensional structure of the molecule based on the plurality of acquired averaged images.