JP2008097320A

JP2008097320A - Genetic information management system

Info

Publication number: JP2008097320A
Application number: JP2006278268A
Authority: JP
Inventors: Sadahiro Kumagai; 禎洋熊谷; Takahiko Kasuga; 孝彦春日; Takeo Nagai; 健夫永井
Original assignee: Hitachi Software Engineering Co Ltd
Current assignee: Hitachi Software Engineering Co Ltd
Priority date: 2006-10-12
Filing date: 2006-10-12
Publication date: 2008-04-24

Abstract

PROBLEM TO BE SOLVED: To provide a work environment for annotating sequence information that meets the various demands arising during collaborative work among a plurality of users in research fields, such as molecular biology, that handle DNA, RNA, and amino acid sequences.

SOLUTION: When a user searches for sequence information, the system verifies, based on the user ID, whether the user has access authority for the sequence information data being searched. When the user is determined to have access authority, the sequence information data is transmitted to the terminal device the user is using and is displayed on a display or the like of that terminal device. When the user is determined not to have access authority, message data to that effect is transmitted to the terminal device the user is using.

COPYRIGHT: (C)2008,JPO&INPIT

Description

The present invention relates to a mechanism for managing and displaying DNA, RNA, and amino acid sequence data and annotation data on a computer terminal.

Information relating to DNA, RNA, and amino acid sequences is wide-ranging: it includes not only the base and amino acid sequences themselves but also gene information, marker information, and information on papers and researchers related to those sequences.
Gene sequence information falls into three groups: sequence information provided by public databases such as GenBank (The National Center for Biotechnology Information), DDBJ (DNA Data Bank of Japan), and EMBL (European Molecular Biology Laboratory); public-database sequence information to which researchers have independently added analysis results and the like; and sequences that researchers have determined themselves and annotated with comments and the like (annotation here meaning the attachment of notes to a gene sequence).

In recent years, research facilities have introduced computer systems that can communicate with one another over a LAN (Local Area Network). Collaborators often share their data over a network such as this LAN. Access to the data is commonly restricted by access rights set in the OS (Operating System) of a computer terminal.

Thus, when collaborators proceed with work while sharing sequence information, data is increasingly shared through computer terminals connected to a network such as the Internet or an intranet, but the forms that collaborative research takes vary widely. There is therefore a need for a work environment system for annotating sequence information that can respond flexibly to these various forms of work.

For example, users of a gene information management system, such as researchers, can edit data files of sequence information stored in a shared folder on a networked computer terminal, but no record is kept of which user performed an edit or when. Nor does such a system provide the fine-grained access control over data files that collaboration among users requires, for example when gene information data is planned for later publication but should not be disclosed until then. Furthermore, several researchers may unknowingly focus on the same sequence information, so that their research subjects overlap.

An object of the present invention is to provide a work environment for annotating sequence information that meets the various demands of collaborative work among a plurality of users, by making clear which annotations are being attached to, and which analysis work is being performed on, which sequence information.

To achieve the above object, the gene information management system of the present invention uses a computer system in which a database server having storage means and a terminal device used by a user are communicably connected to a network. The storage means stores a sequence information database in which sequence information data relating to gene sequences is registered, and the presence or absence of access authority is set for each item of sequence information data. When a user searches for the sequence information data from the terminal device, the database server determines, based on the access authority, whether to disclose the sequence information data.

According to the gene information management system of the present invention, in research fields such as molecular biology that handle DNA, RNA, and amino acid sequences, a work environment for annotating sequence information can be provided that meets the various demands of collaborative work among a plurality of users.

An embodiment of the gene information management system according to the present invention is described below.
As shown in FIG. 1, the gene information management system of this embodiment comprises a database server 11 and a plurality of terminal devices 12. The database server 11 and the terminal devices 12 are connected to the Internet 13, which serves as the network, by ADSL over a public telephone line, by optical fiber, or the like. A client-server system is thus built with the database server 11 as the server computer and the terminal devices 12 connected to the Internet 13 as client computers.

The database server 11 includes a CPU (Central Processing Unit), a RAM (Random Access Memory), a ROM (Read Only Memory), and the like (not shown), connected to one another by a bus. An interface (not shown) for exchanging data with the terminal devices 12 and the like is also provided. In addition, a hard disk 14, an internal storage device serving as the storage means, is connected.

The hard disk 14 stores a database of gene sequence information built by users, an OS (Operating System), and the like. The OS is a network OS such as NetWare (registered trademark). The OS controls the database server 11 as a whole and also handles control with respect to the Internet 13.

Next, the sequence information data registered in the sequence information database, which stores information on gene sequences, is described.
As shown in FIG. 2, the table elements of the sequence information database are UniqueID 15, access authority 16, data creator 17, sequence type 18, sequence definition 19, sequence 20, and annotation information 21. These table elements together make up one item of sequence information data, and the sequence information database consists of a plurality of such registered items.

The UniqueID 15 column holds consecutive numbers assigned in the order in which users added the sequence information data to the sequence information database.
The access authority 16 column holds information on whether the sequence information data is public, private, or being prepared for publication. For example, the user enters 1 in the access authority 16 column to make the sequence information data public, 2 to keep it private, and 3 when it is being prepared for publication.
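The record layout described above can be sketched as a small in-memory model. This is only an illustration under stated assumptions: the class names, field names, and the `SequenceDatabase.add` helper are hypothetical and do not come from the patent, which names only the table elements 15 to 21 and the access authority values 1, 2, and 3.

```python
from dataclasses import dataclass, field
from enum import IntEnum


class AccessAuthority(IntEnum):
    """Values the description gives for the access authority 16 column."""
    PUBLIC = 1          # public
    PRIVATE = 2         # private
    IN_PREPARATION = 3  # being prepared for publication


@dataclass
class SequenceRecord:
    """One item of sequence information data (field names are illustrative)."""
    unique_id: int                     # UniqueID 15
    access_authority: AccessAuthority  # access authority 16
    data_creator: str                  # data creator 17
    sequence_type: str                 # sequence type 18
    sequence_definition: str           # sequence definition 19
    sequence: str                      # sequence 20
    annotations: list = field(default_factory=list)  # annotation information 21


class SequenceDatabase:
    """Hypothetical in-memory stand-in for the sequence information database."""

    def __init__(self):
        self.records = []

    def add(self, creator, access, seq_type, definition, sequence):
        # UniqueID is a consecutive number in order of addition.
        record = SequenceRecord(len(self.records) + 1, access, creator,
                                seq_type, definition, sequence)
        self.records.append(record)
        return record
```

Modelling the access authority as an integer enumeration keeps the stored values identical to the 1, 2, and 3 given in the description while giving each a readable name.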

When 1 is entered in the access authority 16 column (public), the sequence information data can be freely viewed by every user who browses the sequence information database. When 2 is entered (private), users other than the data creator cannot freely view the sequence information data. When 3 is entered (in preparation), users other than the data creator can freely view only the data creator, sequence type, and sequence definition of the sequence information data.
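The three visibility rules can be expressed as one function that returns whatever part of a record a given user may see. This is a minimal sketch, not the patented implementation: the dictionary keys, the function name `visible_view`, and the idea of identifying the creator by comparing a user name with the data creator field are assumptions made for illustration. It also assumes the data creator can always view the whole record, which the description implies but does not state outright.

```python
PUBLIC, PRIVATE, IN_PREPARATION = 1, 2, 3  # values from the description

# Fields that users other than the creator may see while data is in preparation.
PREVIEW_FIELDS = ("data_creator", "sequence_type", "sequence_definition")


def visible_view(record: dict, user: str):
    """Return the part of `record` that `user` may view, or None if nothing."""
    if user == record["data_creator"] or record["access_authority"] == PUBLIC:
        return dict(record)  # full copy of the record
    if record["access_authority"] == IN_PREPARATION:
        return {key: record[key] for key in PREVIEW_FIELDS}
    # Private data (or an unrecognized value) viewed by a non-creator.
    return None
```

Returning a restricted copy rather than the stored record means the sequence and annotations of in-preparation data never leave the server for users other than the creator.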

The data creator 17 column holds the name of the user who created the sequence information data. The sequence type 18 column holds the type of sequence information: for nucleic acids, mRNA (messenger RNA), cDNA (complementary DNA), cRNA (complementary RNA), and so on; for amino acid sequences, "amino acid sequence". The sequence definition 19 column holds information on the kind of gene, such as the Brca1 gene or the Brca2 gene. The sequence 20 column holds the actual sequence of the gene.

The annotation information 21 column holds annotations on the gene sequence written by the creator or others. Because this column carries multiple pieces of information, such as information on gene function and related images, an annotation table 22 as shown in FIG. 3 is created for each item of sequence information data.
The table elements of the annotation table 22 are ID 23, annotation content 24, annotator 25, creation date and time 26, and update date and time 27.

The ID 23 column holds an identifier made up of letters and digits that differs for each annotation. The annotation content 24 column holds the specific content of the annotation. The annotator 25 column holds the name of the person who attached the annotation. The creation date and time 26 column holds the date and time at which the annotation information was first created. The update date and time 27 column holds the date and time of the most recent update after creation.
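An annotation table row and its two timestamps might be modelled as follows. This is a sketch under assumptions: the names `Annotation`, `add_annotation`, and `edit_annotation`, the "A1"-style identifier format, and the choice to set the update time equal to the creation time for a new annotation are all hypothetical; the description only lists the five table elements.

```python
from dataclasses import dataclass
from datetime import datetime
from itertools import count

_ids = count(1)  # source of per-annotation identifiers (illustrative)


@dataclass
class Annotation:
    annotation_id: str    # ID 23
    content: str          # annotation content 24
    annotator: str        # annotator 25
    created_at: datetime  # creation date and time 26
    updated_at: datetime  # update date and time 27


def add_annotation(table: list, content: str, annotator: str,
                   now: datetime) -> Annotation:
    """Append a new annotation; both timestamps start at `now` (assumption)."""
    note = Annotation(f"A{next(_ids)}", content, annotator, now, now)
    table.append(note)
    return note


def edit_annotation(note: Annotation, content: str, now: datetime) -> None:
    """Change the content and record only the latest update time."""
    note.content = content
    note.updated_at = now
```

Keeping the annotator and both timestamps on every row is what lets the system answer who attached an annotation and when, which the shared-folder approach described earlier cannot do.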

Next, the method of editing the sequence information database and the method of accessing it are described.
The following processing is executed under the control of control means such as the CPU, based on a program stored in the ROM or the like of the database server 11.
As shown in FIG. 4, the user first enters, from the terminal device 12, a user ID and password for logging in to the sequence information database (step S1). If the login succeeds, the user can access the sequence information database.

Next, the user is asked whether to edit the sequence information database (step S2). If the user chooses to edit, the database server 11 transmits a screen for creating and updating sequence information data to the terminal device 12. The user creates or updates the sequence information data based on the content displayed on that screen (step S3).

When creating or updating sequence information data, the user is prompted by the database server 11 to set access authority 16, a table element of the sequence information database. For the sequence information data being created or updated, the user sets the access authority with respect to other users to public, private, or in preparation for publication. If sequence information data is updated and its access authority is changed to a different one, the database server 11 notifies the other users of the change by e-mail or the like.
When the user finishes creating or updating the sequence information data, the database server 11 updates the sequence information database (step S4).
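The update step with its change notification could look like the sketch below. Everything here is illustrative: `update_record`, the dictionary layout, and the `notify` callback (a stand-in for sending e-mail) are hypothetical names, and the sketch does not check whether the editor is allowed to edit, since the description does not say how that is decided.

```python
def update_record(db: dict, unique_id: int, editor: str, changes: dict,
                  users: list, notify) -> None:
    """Apply `changes` to a record; notify other users if access changed."""
    record = db[unique_id]
    old_access = record["access_authority"]
    record.update(changes)  # corresponds to steps S3 and S4
    new_access = record["access_authority"]
    if new_access != old_access:
        for user in users:
            if user != editor:  # only users other than the editor are told
                notify(user, f"Record {unique_id}: access authority "
                             f"changed from {old_access} to {new_access}")
```

Passing the notification mechanism in as a callback keeps the sketch independent of any particular mail system, which matches the description's "e-mail or the like".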

If, on the other hand, the user does not choose to edit the sequence information database in step S2, the database server 11 transmits to the terminal device 12 a confirmation screen asking whether the user wants to search sequence information data (step S5). If the user chooses to search and runs a search for sequence information, the database server 11 checks, based on the user ID, whether the user has access authority for the sequence information data being searched (step S6).

If the user is determined to have access authority, the database server 11 transmits the sequence information data being searched to the terminal device 12 the user is using (step S7), and the sequence information data is displayed on the display or the like of the terminal device 12 (step S8). If the user is determined not to have access authority, the database server 11 transmits message data to that effect to the terminal device 12 the user is using, and the message data is displayed on the display or the like of the terminal device 12.
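The search branch (steps S6 and S7) reduces to a check followed by one of two responses. The sketch below is a simplification under stated assumptions: `has_access` and `handle_search` are hypothetical names, records are looked up by UniqueID rather than by a real search query, the user ID is compared directly with the data creator field, and in-preparation data is treated like private data for brevity, leaving out its partial disclosure.

```python
PUBLIC = 1  # access authority value for public data, per the description


def has_access(record: dict, user_id: str) -> bool:
    """Step S6 (simplified): public data, or the user is the data creator."""
    return (record["access_authority"] == PUBLIC
            or record["data_creator"] == user_id)


def handle_search(db: dict, user_id: str, unique_id: int) -> dict:
    """Build the response the server would send to the terminal device."""
    record = db[unique_id]
    if has_access(record, user_id):
        return {"status": "ok", "data": record}  # step S7
    # No access authority: send only a message saying so.
    return {"status": "denied",
            "message": "You do not have access authority for this data."}
```

The terminal device would then display either the `data` or the `message` field (step S8 in the permitted case).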

The above embodiment may be modified as follows.
・The database server 11 and the terminal devices 12 may be connected via a LAN.
・The configuration of the sequence information database is not limited to the form shown in FIG. 2; necessary information items may be added and unnecessary ones deleted as appropriate.

FIG. 1: Block diagram of the gene information management system.
FIG. 2: Diagram showing an example of the data.
FIG. 3: Annotation table.
FIG. 4: Flowchart showing the flow of creating and updating the sequence information database.

Explanation of symbols

11 ... Database server,
12 ... Terminal device,
13 ... Internet as the network,
14 ... Hard disk as the storage means.

Claims

A gene information management system using a computer system in which a database server provided with storage means and a terminal device used by a user are communicably connected to a network, wherein
the storage means stores a sequence information database in which sequence information data relating to gene sequences is registered,
the presence or absence of access authority is set for each item of sequence information data, and, when the sequence information data is searched for from the terminal device by a user, the database server determines, based on the access authority, whether to disclose the sequence information data.