WO2009157251A1 - Method of diagnosing integration dysfunction syndrome

Info

Publication number: WO2009157251A1
Application number: PCT/JP2009/057861
Authority: WO
Inventors: 秀幸青島; 一男竹村; 健太朗飯嶋; 浩志林
Original assignee: 株式会社エスアールエル
Priority date: 2008-06-25
Filing date: 2009-04-20
Publication date: 2009-12-30
Also published as: JP2010029171A

Abstract

Disclosed is a novel means by which integration dysfunction syndrome (schizophrenia) can be diagnosed objectively and with high accuracy using a patient's blood as a sample. The method detects integration dysfunction syndrome using, as indices, the expression levels of ten specific genes in a sample isolated from a living body. According to this method, integration dysfunction syndrome can be diagnosed objectively and with high accuracy. Using a large number of specimens, it was confirmed that both the detection sensitivity (i.e., true positive rate) and the specificity (i.e., true negative rate) exceed 80%. Because blood can be used as the specimen, the method can be carried out conveniently.

Description

Method for diagnosing schizophrenia

The present invention relates to a method for diagnosing schizophrenia using blood as a sample.

The incidence of schizophrenia in Japan is about 0.8% of the population, and onset occurs mainly in adolescence. The prognosis varies. About one third of patients show significant and lasting improvement. Another third improve somewhat but are left with intermittent relapses and residual disability. The remaining third are severely affected, with permanent impairment of social functioning.

Early treatment is important in schizophrenia. Conventionally, schizophrenia is diagnosed by comprehensive evaluation according to DSM-IV (Diagnostic and Statistical Manual of Mental Disorders, Fourth Edition), the diagnostic and statistical manual for mental disorders established by the American Psychiatric Association (APA). However, because this approach depends heavily on the subjectivity and skill of the diagnostician, it is difficult to make an objective and early diagnosis of the disease.

If an objective diagnostic method using biological markers for schizophrenia is established, early diagnosis and early treatment will become possible, making it possible to prevent the disease from becoming severe and to improve the cure rate. Diagnostic methods using biological markers reported so far include a method of diagnosing schizophrenia using the serum concentration of epidermal growth factor as an index (Patent Document 1) and a method using blood as a sample and the expression level of a specific gene as an index (Patent Document 2). However, the methods described in Patent Documents 1 and 2 still do not provide satisfactory diagnostic accuracy.

Patent Document 1: Japanese Patent No. 3706913
Patent Document 2: JP 2004-135667 A

Therefore, an object of the present invention is to provide means capable of objectively diagnosing schizophrenia with high accuracy using patient blood as a sample.

The inventors of the present application used blood as a sample, compared the expression levels of about 55,000 genes between healthy individuals and schizophrenic patients, selected genes whose expression levels differed significantly, narrowed these down further according to criteria of their own devising (described later), and then performed a primary selection of a gene group using a neural network with the variable increment method and the cross-validation method. The inventors then prepared a low-cost, highly versatile microarray carrying the primarily selected gene group together with a large number of additional candidate genes for classification prediction, processed the measurements with a neural network in the same manner as above, and confirmed with a large number of actual samples that the sensitivity (true positive rate) and specificity (true negative rate) of detection by the constructed classification prediction algorithm were 80% or more, thereby completing the present invention.

That is, the present invention provides a method for detecting schizophrenia using as an index the expression level of the following gene groups (1) to (10) in a sample isolated from a living body.
(1) DLGAP3 (SEQ ID NO: 1)
(2) KCNJ15 (SEQ ID NO: 2)
(3) GPR30 (SEQ ID NO: 3)
(4) NPCR (SEQ ID NO: 4)
(5) TMED1 (SEQ ID NO: 5)
(6) PAFAH2 (SEQ ID NO: 6)
(7) TMEM23 (SEQ ID NO: 7)
(8) ABCG1 (SEQ ID NO: 8)
(9) PGRMC1 (SEQ ID NO: 9)
(10) INSL3 (SEQ ID NO: 10)

The present invention provides for the first time a means capable of objectively diagnosing schizophrenia with high accuracy.

A figure showing the relationship between the number of probes and the correct answer rate in the classification prediction model by the neural network, obtained in the Examples of the present invention. A figure showing the dependent variable of each sample of the learning examples and the test examples, obtained by multiple regression analysis.

As described above, the present invention uses the expression levels of the genes (1) to (10) as an index. The sample in which the expression level of each gene is measured is not particularly limited as long as it is a sample isolated from a living body; however, as described in detail in the examples below, the gene group was selected using blood as a sample, so blood is preferably used. The gene group includes both genes whose expression is increased and genes whose expression is decreased in schizophrenic patients. Further, it is preferable that the determination be based on the expression levels of only the above 10 genes, for which a detection sensitivity (true positive rate) and specificity (true negative rate) of 80% or more were confirmed in the examples below. As specifically described in the examples below, it is preferable to measure the expression levels of other genes, such as various genes for normalization, at the same time in order to ensure measurement accuracy; “based on the expression levels of only the 10 genes” means that only the expression levels of these 10 genes are used as the direct variables for classification prediction. The true positive rate representing sensitivity is a / (a + b) in Table 1 below, and the true negative rate representing specificity is 1 - false positive rate = d / (c + d). The correct answer rate is (a + d) / (a + b + c + d).
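The three ratios defined above can be sketched in a few lines of Python. The assignment of a, b, c and d to the cells of the 2x2 table is inferred from the formulas themselves, and the counts used here are hypothetical values chosen only for illustration.

```python
def diagnostic_metrics(a, b, c, d):
    """Return (sensitivity, specificity, correct answer rate).

    Assumed layout of the 2x2 table, inferred from the formulas:
      a: patients classified as patients   b: patients classified as healthy
      c: healthy classified as patients    d: healthy classified as healthy
    """
    sensitivity = a / (a + b)             # true positive rate
    specificity = d / (c + d)             # true negative rate = 1 - false positive rate
    accuracy = (a + d) / (a + b + c + d)  # correct answer rate
    return sensitivity, specificity, accuracy


# Hypothetical counts for illustration only; they are not results from this document.
sens, spec, acc = diagnostic_metrics(a=17, b=3, c=2, d=16)
print(f"sensitivity={sens:.3f} specificity={spec:.3f} accuracy={acc:.3f}")
```

With these illustrative counts both sensitivity and specificity lie above the 80% threshold discussed in the text.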

The sequences of the above 10 genes are given under the respective SEQ ID NOs above. The GenBank accession number of each gene, its gene product, and the number and SEQ ID NO of the probe used to measure its expression level in the examples below are shown in Table 2 below.

In addition, for those genes whose functions are well known, Table 3 below gives an explanation of the function and shows whether the expression level is increased (upward arrow) or decreased (downward arrow) in schizophrenic patients compared with healthy individuals.

The expression level of each gene in the sample can itself be measured by a known method. The measurement method is not particularly limited, but a method using single-stranded oligonucleotide probes that hybridize to the sense strand or antisense strand of each gene, preferably a DNA array on which DNA probes are immobilized, is simple and preferred. For example, as specifically described in the examples below, total mRNA is extracted from blood, biotin-labeled cRNA is prepared from the extracted mRNA, and the cRNA is applied to an array on which oligonucleotide probes that hybridize with the cRNA derived from each gene are immobilized. After the cRNA and the probes are hybridized, the array is washed and the amount of label remaining on the substrate is measured, which gives the amount of cRNA, hence the amount of mRNA, that is, the gene expression level.

The probe to be immobilized has a size that allows specific hybridization with cRNA, usually about 18 to 50 bases and preferably about 20 to 40 bases. The probe is preferably completely complementary to the region of the RNA to which it hybridizes, but a small number of mismatches (usually 1 or 2) is acceptable as long as the probe hybridizes under the usual hybridization conditions for a DNA array, as specifically described in the examples below. Therefore, even when a naturally occurring SNP is present in a gene, the gene can be measured using the same DNA array.

In the examples below, the expression levels of the gene group are measured using oligonucleotide probes having the base sequences shown in SEQ ID NO: 34, SEQ ID NO: 42, SEQ ID NO: 77, SEQ ID NO: 81, SEQ ID NO: 98, SEQ ID NO: 109, SEQ ID NO: 122, SEQ ID NO: 165, SEQ ID NO: 200 and SEQ ID NO: 218, and a DNA array on which these probes are immobilized can be preferably used.

The determination based on the expression levels of the gene group is basically performed by comparing the measured expression levels with the expression levels of the gene group in known schizophrenia patients and healthy subjects, which are measured in advance. This comparison is preferably performed by a neural network trained by the variable increment method on the expression levels of the gene group in known schizophrenia patients and healthy individuals. The measured expression levels of the above 10 genes are input to the trained neural network (the construction method is described later), the neural network outputs the predicted probability of each class, and schizophrenia can be detected using this predicted probability as the criterion.

Alternatively, the above comparison is preferably performed by multiple regression analysis. A prediction formula (multiple regression formula) can be obtained by performing multiple regression analysis on the expression levels of the genes in known schizophrenic patients and healthy individuals, using the measured expression levels of the 10 genes as explanatory variables. The dependent variable is calculated by inputting the expression levels of the gene group in the subject into the obtained prediction formula, and whether or not the subject has schizophrenia can be determined by comparing this dependent variable with the dependent variables of known schizophrenic patients and healthy persons. For example, based on the dependent variable calculated for each sample of the known schizophrenia patient group and the healthy subject group, a value of the dependent variable that classifies the two groups well is defined as a cut-off value, and the dependent variable of the subject is compared with this cut-off value. If the analysis is set up so that the dependent variable is larger in patients with schizophrenia, a subject whose dependent variable is greater than the cut-off value can be predicted to have schizophrenia. The cut-off value can be appropriately determined by routine statistical processing based on the dependent variables calculated for known schizophrenia patients and healthy individuals. The technique of multiple regression analysis itself is well known, and various software for performing it is known, including many commercially available products. Any such software may be used in the present invention.
In addition, once the analysis of the known patients and healthy persons has been performed and the prediction formula has been determined, it is not necessary to repeat that analysis every time the method is carried out; the prediction formula once obtained can be used in subsequent implementations.

In the present invention, “multiple regression analysis” broadly encompasses any analysis method that includes a step of obtaining the dependent variable of a sample using an obtained multiple regression equation; the analysis step of deriving the multiple regression equation need not be included. Therefore, as described above, any method for detecting schizophrenia using an already obtained multiple regression equation is included in the “detection method in which the comparison is performed by multiple regression analysis” of the present invention.
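The use of an already obtained prediction formula with a cut-off value can be illustrated with a minimal Python sketch. The coefficients, constant and scores below are hypothetical, and the midpoint between the two group means is only one assumed way of choosing a cut-off; the text leaves the choice to routine statistical processing.

```python
def dependent_variable(expression_levels, coefficients, constant):
    """Evaluate Y = A1*X1 + ... + An*Xn + C for one subject."""
    if len(expression_levels) != len(coefficients):
        raise ValueError("one coefficient is needed per gene")
    return sum(a * x for a, x in zip(coefficients, expression_levels)) + constant


def midpoint_cutoff(patient_scores, healthy_scores):
    """Assumed cut-off rule: midpoint between the two group means."""
    mean_patient = sum(patient_scores) / len(patient_scores)
    mean_healthy = sum(healthy_scores) / len(healthy_scores)
    return (mean_patient + mean_healthy) / 2


def classify(score, cutoff):
    """Assumes the formula was fitted so that patients score higher."""
    return "schizophrenia" if score > cutoff else "healthy"


# Hypothetical two-gene formula and reference scores, for illustration only.
coefficients, constant = [1.0, -2.0], 0.5
cutoff = midpoint_cutoff(patient_scores=[1.0, 2.0], healthy_scores=[-1.0, 0.0])
score = dependent_variable([2.0, 0.5], coefficients, constant)
print(score, cutoff, classify(score, cutoff))
```

Because the formula and the cut-off are fixed once, classifying a new subject only requires the two function calls at the end.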

The measured expression level used in the present invention is preferably a value obtained by normalizing the measured signal intensity by the global normalization method, as described in the examples below. Here, the global normalization method calculates a relative expression level by taking the median of the expression levels of all genes mounted on the DNA microarray and dividing the expression level of each gene by this median.
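Global normalization as described here amounts to a division by the array median. A minimal sketch, assuming the raw signals of one array are held in a dictionary keyed by hypothetical probe IDs:

```python
from statistics import median


def global_normalize(signals):
    """signals: dict mapping probe ID -> raw signal intensity for one array."""
    center = median(signals.values())
    if center <= 0:
        raise ValueError("median signal must be positive")
    # Relative expression level = raw signal / median of all signals on the array.
    return {probe: value / center for probe, value in signals.items()}


# Hypothetical raw intensities for three probes.
raw = {"probe_a": 50.0, "probe_b": 100.0, "probe_c": 400.0}
normalized = global_normalize(raw)
print(normalized)
```

After normalization the median probe has a relative expression level of 1, so values from different arrays become comparable.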

When the method of the present invention is carried out using a neural network, the neural network itself is well known and a commercially available one can be used. The feature of the present invention lies, however, in the data on which the neural network is trained, and it is necessary to devise which data to train on so that both sensitivity (true positive rate) and specificity (true negative rate) can be raised to 80% or more (described later).

An optimal classification prediction model using a neural network can be constructed, for example, by the method detailed in the examples below. Briefly, the optimal model can be determined as follows. First, the expression levels of various genes are measured in samples collected from many schizophrenic patients and healthy individuals. The expression levels can be measured using a DNA microarray as described above. In the examples below, a commercially available DNA microarray carrying DNA probes for about 55,000 human genes was used.

Next, data cleansing is performed on the expression levels measured with the DNA microarray. The data cleansing can be performed, for example, by excluding probes of genes whose expression level is below the 30th percentile or at or above the 98th percentile of all expression levels.
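A percentile-based cleansing step of this kind could look like the following sketch. The percentile convention (a simple rank-based cut on the sorted values) and the use of one summary level per probe are assumptions made for the example; the text does not specify them.

```python
def cleanse_by_percentile(levels, low_pct=30, high_pct=98):
    """Keep probes whose level is >= the low percentile and < the high percentile.

    levels: dict mapping probe ID -> expression level (one value per probe).
    """
    if not levels:
        return {}
    ordered = sorted(levels.values())
    n = len(ordered)
    low_cut = ordered[n * low_pct // 100]                 # rank-based percentile (assumed)
    high_cut = ordered[min(n * high_pct // 100, n - 1)]
    return {p: v for p, v in levels.items() if low_cut <= v < high_cut}


# Hypothetical data: 100 probes with levels 1.0 .. 100.0.
levels = {f"probe_{i}": float(i) for i in range(1, 101)}
kept = cleanse_by_percentile(levels)
print(len(kept), min(kept.values()), max(kept.values()))
```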

The model was constructed by the hold-out cross-validation method: the DNA microarray data of a large number of schizophrenia patients and healthy individuals are divided into learning examples and test examples independent of the learning examples, the neural network is trained on the learning examples, and the sensitivity and specificity achieved are calculated and evaluated using the test examples. The classification prediction model was constructed by varying the parameters of the neural network while keeping the samples assigned to the learning examples and the test examples independent, repeating the verification, and adopting the model with the best results. In the examples below, 2/3 of the schizophrenia patients and healthy subjects were used as learning examples and 1/3 as test examples.
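The 2/3 to 1/3 hold-out split can be sketched as follows. Splitting each group separately, the random shuffle, the rounding rule and the sample IDs are assumptions for the example; the group sizes of 60 and 56 are those used for the practical array in the examples.

```python
import random


def hold_out_split(sample_ids, seed=0):
    """Split one group into learning (about 2/3) and test (about 1/3) examples."""
    ids = list(sample_ids)
    random.Random(seed).shuffle(ids)   # assumed: random assignment
    n_learn = -(-2 * len(ids) // 3)    # ceil(2n/3), an assumed rounding rule
    return ids[:n_learn], ids[n_learn:]


# Hypothetical sample IDs for 60 patients and 56 healthy subjects.
patients = [f"SZ{i:02d}" for i in range(60)]
healthy = [f"HC{i:02d}" for i in range(56)]
learn_p, test_p = hold_out_split(patients, seed=1)
learn_h, test_h = hold_out_split(healthy, seed=2)
print(len(learn_p), len(test_p), len(learn_h), len(test_h))
```

Because the two lists returned for each group are disjoint, no test example takes part in training, which is the independence the hold-out method relies on.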

First, the learning data to be learned by the neural network are prepared. In the examples below, probes whose Quality Flag was other than “Good”, probes for genes located on the Y chromosome, probes set distal to the mRNA 3′ end, and the like were excluded, narrowing about 55,000 probes down to 10,498 probes. Here, a Quality Flag of “Good” means that the measured expression level is larger than 1.5 SD of the background around the spot and can be trusted as a measurement value. Genes located on the Y chromosome are present only in males and were excluded because they might lower the sensitivity and/or specificity of detection when females are examined. Probes set distal to the mRNA 3′ end were excluded because they are subject to bias during cRNA preparation, which causes large variation in the measured values. In addition, preliminary analysis was used to exclude probes with 25% or more missing values, probes with a large difference in expression between men and women, and probes with a large difference between array production batches.

Next, the expression levels of the genes from which the hybridized RNA is derived, measured for each probe selected in this way, are input to the neural network, and a two-group significance test (t-test) is performed between the untreated schizophrenia patients and the healthy subjects of the learning examples. In the examples below, the samples were also narrowed down: the median of the 56 healthy subjects was calculated for each probe, the correlation of each sample with this median data set was examined, and samples whose approximate-curve parameters or signal intensity ratios deviated greatly were excluded from the analysis.

Probes for which a significant difference was found in the significance test are then selected by the variable increment method (forward selection). The variable increment method itself is well known: explanatory variables (the measurement results of each probe) are added one at a time to find the combination most strongly correlated with the objective variable (correct answer rate). The variable increment method is run using a neural network installed on a computer, and the number of probes giving the highest correct answer rate is selected. In the examples below, the correct answer rate was highest when the measured values of 14 probes, out of the 10,498 probes described above, were used as explanatory variables. With the optimal neural network model constructed in this way, both sensitivity and specificity exceeded 80%, and schizophrenia could be detected using this optimal model and the DNA microarray.

However, a DNA microarray carrying about 55,000 probes is expensive, and only one specimen can be processed per microarray. A lower-cost microarray is therefore desirable for practical use. In the examples below, 216 probes showing a significant difference between schizophrenia patients and healthy subjects, including the 14 probes selected above, were chosen and mounted on a substrate. Of these 216 probes, the 202 probes other than the above 14 were selected from probes showing a significant difference between schizophrenia patients and healthy subjects and probes of genes showing a statistically significant difference between patients with bipolar disorder, a similar mental disorder, and schizophrenia patients. In addition, probes used for global normalization and management probes (for alignment) were also mounted (details are given in the examples below). The probes for global normalization were selected from those showing small variation between arrays. In this practical array, a plurality of chambers (16 in the examples below) can be formed on a single substrate, so that 16 specimens can be tested simultaneously with a single array, which greatly reduces testing cost and labor.

The measurement results for the 14 previously specified probes, obtained with this practical array, were input to the previously constructed neural network classification prediction model, and the sensitivity and specificity were calculated using the above test examples. The sensitivity and specificity for healthy subjects were less than 80%, and when the calculation was repeated with another set of test examples, the sensitivity and specificity for both untreated schizophrenia patients and healthy subjects were less than 80%.

Therefore, using the values measured with the practical array and the neural network installed on the computer as described above, the combination of probes giving the highest correct answer rate was sought by the cross-validation method and the variable increment method. As a result, the above 10 genes were identified. Note that these 10 genes differ from the 14 genes specified earlier; only 2 genes are common to both sets.

When the measured values of the probes for the above 10 genes were input as explanatory variables and the sensitivity and specificity were calculated for the test examples, the sensitivity and specificity for both schizophrenia patients and healthy subjects exceeded 80%, confirming that schizophrenia can be detected with high sensitivity and high specificity.

In addition, when multiple regression analysis was performed using the measured values of the probes for the above 10 genes as explanatory variables and the sensitivity and specificity were calculated for the test examples, the sensitivity and specificity for both schizophrenia patients and healthy subjects exceeded 80%. Multiple regression analysis thus also confirmed that schizophrenia can be detected with high sensitivity and high specificity using the expression levels of the above 10 genes.

In the present invention, “having a base sequence” means that the bases are arranged in such an order. Therefore, for example, “an oligonucleotide probe having the base sequence represented by SEQ ID NO: 42” means an oligonucleotide probe having a base sequence of tcccacatcc ccttgaatat cccaggaaaa represented by SEQ ID NO: 42 and having a size of 30 bases.

Hereinafter, the present invention will be described more specifically based on examples.

1. Probes
Blood collection and sample storage
Blood was collected from 58 schizophrenia patients not treated with antipsychotics, 56 healthy subjects, and 41 patients with bipolar disorder, another psychiatric disorder, and RNA was extracted using the PAXgene Blood RNA Kit (Qiagen, Valencia, CA, USA). For each subject, 2.5 ml of blood was collected into each of two PAXgene Blood RNA Tubes, mixed by inversion, frozen, and transported to the laboratory. The samples were stored at -80 °C.

RNA extraction
PAXgene Blood RNA Tubes stored at -80 °C were thawed at room temperature, and total RNA was extracted according to the manufacturer's instructions. The extracted total RNA was stored at -80 °C.

Confirmation of concentration and quality of extracted RNA
The extracted total RNA was diluted 50-fold with 10 mM Tris-HCl (pH 7.5), the absorbance at 230, 260, and 280 nm was measured, and the concentration of total RNA was determined. The quality of the extracted RNA was confirmed with an Agilent 2100 bioanalyzer (Agilent Technologies, Inc., Santa Clara, CA, USA).
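The text does not state how the concentration is derived from the absorbance readings. A common convention, assumed here, is that an absorbance of 1.0 at 260 nm over a 1 cm path corresponds to about 40 μg/ml of RNA, with the A260/A280 and A260/A230 ratios used as purity indicators. The readings in this sketch are hypothetical.

```python
def rna_concentration_ug_per_ml(a260, dilution_factor=50, factor_ug_per_ml=40.0):
    """Concentration of the undiluted sample (assumed 40 ug/ml per A260 unit, 1 cm path)."""
    return a260 * factor_ug_per_ml * dilution_factor


def purity_ratios(a230, a260, a280):
    """Return (A260/A280, A260/A230), commonly used as purity indicators."""
    return a260 / a280, a260 / a230


# Hypothetical absorbance readings of a 50-fold diluted sample.
conc = rna_concentration_ug_per_ml(a260=0.25)
r280, r230 = purity_ratios(a230=0.125, a260=0.25, a280=0.125)
print(conc, r280, r230)
```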

Preparation of cRNA
cRNA was prepared from 0.5 μg of the extracted total RNA. Biotin-labeled cRNA was prepared using an iExpress kit (GE Healthcare Bioscience, Chandler, CA, USA) according to the manufacturer's instructions.

The prepared cRNA was quantified and its quality confirmed in the same manner as for the extracted total RNA. That is, the absorbance at 230, 260, and 280 nm of a 50-fold diluted cRNA solution was measured, and the cRNA concentration was determined. The quality of the cRNA was confirmed using an Agilent 2100 bioanalyzer.

Hybridization to the array and washing
The CodeLink™ 55K Bioarray (GE Healthcare Bioscience) was used as the microarray. In the CodeLink (trademark) 55K Bioarray, the surface of a glass slide is coated with specially chemically modified acrylamide, on which 30-mer probes are immobilized three-dimensionally. It is an excellent microarray on which probes corresponding to about 55,000 human genes are immobilized.

10 μg of cRNA was brought to a final volume of 20 μl with RNase-free H₂O, 5 μl of the 5× Fragmentation Buffer from the iExpress kit was added, and the mixture was incubated at 94 °C for 20 minutes to fragment the cRNA.

The 10 μg of fragmented cRNA (25 μl) was mixed with 78 μl of iExpress kit Hybridization Buffer A and 130 μl of iExpress kit Hybridization Buffer B to prepare a total of 260 μl. The mixture was incubated at 90 °C for 5 minutes and then on ice for 5 to 30 minutes.

250 μl of the hybridization solution was injected into the chamber of a CodeLink™ 55K Bioarray (GE Healthcare Bioscience, Chandler, CA, USA), and the array was incubated at 37 °C for 18 to 24 hours with shaking at 300 rpm on a CodeLink™ INNOVA shaker (GE Healthcare Bioscience, Chandler, CA, USA).

The array was held with the Hybridization Removal Tool, the hybridization chamber was peeled off, and the array was set in a Bioarray Rack. The Bioarray Rack holding the array was transferred to a Large Reagent reservoir containing 0.75× TNT Buffer at 46 °C and incubated at 46 °C for 1 hour.

The Bioarray Rack was transferred to a Small Reagent reservoir filled with 3.4 ml of diluted Streptavidin-Cy5 solution and incubated at room temperature for 30 minutes. After staining, the Bioarray Rack was transferred to a Large Reagent reservoir filled with 240 ml of 1× TNT Buffer and washed by incubating at room temperature for 5 minutes, repeated 4 times. Next, the Bioarray Rack was transferred to a Large Reagent reservoir filled with 0.1× SSC / 0.05% Tween-20 and washed for 30 seconds, after which the array was dried by centrifugation and stored in the dark until scanning.

Array scanning
The washed and dried arrays were scanned with an Agilent Scanner (Agilent Technologies, Santa Clara, CA, USA). The scanner settings were Red PMT [%] 70% and Dye Channel Red (red corresponds to Cy5); all other settings were left at their defaults. The scanned array data were saved as TIF files and digitized.

Digitization of array data
According to the manufacturer's instructions, CodeLink (trademark) Expression Analysis was used to digitize the array data saved in the TIF files and to normalize them by global normalization.

Narrowing down probes
From the experimental results obtained above, probes whose Quality Flag was other than “Good”, probes located on the Y chromosome, probes set distal to the mRNA 3′ end, and the like were excluded. In addition, preliminary analysis was used to exclude probes with 25% or more missing values, probes with a large difference in expression level between men and women, and probes with a large difference between array production batches. As a result, about 55,000 probes were narrowed down to 10,498 probes.

2. Statistical processing
From the probes meeting the above conditions, 216 probes were extracted: probes showing a significant difference between the two groups of 43 antipsychotic-untreated schizophrenia patients and 38 healthy subjects, probes showing a statistically significant difference between 32 bipolar disorder patients and 40 antipsychotic-untreated schizophrenia patients, and probes showing a statistically significant difference between 32 bipolar disorder patients and 38 healthy subjects. The sequences of these probes are shown in SEQ ID NOs: 11 to 226 in the sequence listing. In addition, Table 4 below shows the name and GenBank Accession No. of the gene from which each probe is derived.

3. Designing a practical array
The CodeLink (trade name) 55K Bioarray is very expensive, and only one sample can be processed and analyzed per array. For practical use, a microarray that allows analysis at lower cost is required. Therefore, a practical array was designed based on the CodeLink (trade name) 16-Assay Bioarray (Applied Microarrays), which has the same surface treatment as the CodeLink (trade name) 55K Bioarray and is divided into 16 chambers so that up to 16 samples can be processed and analyzed at a time. In addition to the probes for the 216 genes described above, probes used for global normalization (SEQ ID NOs: 227 to 525) and management probes provided by the manufacturer were added, giving the following array design.

CodeLink (trade name) 16-Assay Bioarray probe breakdown
(Total 1714 spots / chamber)
- Classification prediction candidate probes: 216 probes x 4 spots
- Additional probes for normalization: 299 probes x 2 spots
- Management probes by manufacturer: 96 probes x 1 spot each
- Standard reserved probes: Grid (32), Positive Control (60), Negative Control (64)
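The stated total of 1714 spots per chamber can be checked against the breakdown with a short calculation. The sketch assumes that the counts given for the standard reserved probes (Grid, Positive Control, Negative Control) are spot counts, i.e. one spot each.

```python
# (number of probes, spots per probe) for each class in the breakdown above
layout = {
    "classification prediction candidates": (216, 4),
    "additional normalization probes": (299, 2),
    "manufacturer management probes": (96, 1),
    "grid": (32, 1),               # assumed: counts are spots
    "positive control": (60, 1),   # assumed: counts are spots
    "negative control": (64, 1),   # assumed: counts are spots
}

spots_per_class = {name: probes * spots for name, (probes, spots) in layout.items()}
total_spots = sum(spots_per_class.values())
print(spots_per_class, total_spots)
```

Under that assumption the classes contribute 864 + 598 + 96 + 32 + 60 + 64 spots, which agrees with the stated total.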

4. Classification prediction based on CodeLink (trade name) 16-Assay Bioarray measurement results (neural network)
Based on the gene expression information of 60 untreated schizophrenia patients and 56 healthy subjects, we attempted to construct a classification prediction model using a neural network with excellent classification prediction.

In building the classification prediction model, the hold-out cross-validation method was used: about 2/3 of all samples (40 antipsychotic-untreated schizophrenia patients and 38 healthy subjects) served as learning examples for building the model, and the remaining 1/3, which took no part in building the model (20 antipsychotic-untreated schizophrenia patients and 18 healthy subjects), served as test examples for evaluating it.

(1) Acquisition of gene expression information
Using the above practical array carrying the 216 probes, gene expression information was acquired for 60 untreated schizophrenia patients and 56 healthy persons. Blood collection, RNA extraction, cRNA preparation, array hybridization, and fluorescent signal (fluorescent dye Cy5) scanning were performed as described above. The image data of the 1714 spots read by the scanner were digitized and normalized by global normalization using CodeLink™ Expression Analysis software.

(2) Construction of classification prediction model by neural network
The normalized data were analyzed using the neural network installed in the commercially available analysis software ArrayAssist (registered trademark) (STRATAGENE) in an attempt to construct a classification prediction algorithm. A classification prediction algorithm is a series of algorithms that, after “learning and training” on a data set whose attributes have been clarified in advance, can output an optimal solution for input data. Owing to its strong learning effect, a neural network is said to allow construction of an algorithm with high classification accuracy.

Two thirds of the normalized data (40 antipsychotic-untreated schizophrenia patients and 38 healthy subjects) were input to the ArrayAssist neural network as learning examples, and an algorithm was constructed. Feature selection was performed by the variable increment method (Forward Selection), and the classification prediction algorithm was constructed with cross-validation that divides the learning data into three sets (N-fold cross validation, N = 3). Specifically, the data set of the learning examples was divided into three parts, and predictions were made with the 216 significantly different probes one at a time while rotating the data sets. The probe giving the best prediction was adopted as a probe for classification prediction, and further probes were then added in the same way, in order. The number of probes used was thus increased step by step, and learning was terminated when the proportion of samples classified correctly (Number of Class Accuracy (%)) reached a plateau.
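The combination of forward selection and 3-fold cross-validation can be illustrated with a small, self-contained sketch. It is not the ArrayAssist procedure: a nearest-centroid classifier stands in for the neural network to keep the example short, the fold assignment and the plateau rule (stop when no added probe improves accuracy) are assumptions, and the data are synthetic, with only probe index 2 carrying class information.

```python
import random


def centroid_predict(train_X, train_y, x, features):
    """Nearest-centroid stand-in classifier restricted to the chosen features."""
    best_label, best_dist = None, None
    for label in sorted(set(train_y)):
        rows = [r for r, y in zip(train_X, train_y) if y == label]
        dist = 0.0
        for f in features:
            mean_f = sum(r[f] for r in rows) / len(rows)
            dist += (x[f] - mean_f) ** 2
        if best_dist is None or dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def cv_accuracy(X, y, features, n_folds=3):
    """N-fold cross-validated accuracy using only the given features."""
    correct = 0
    for fold in range(n_folds):
        train_idx = [i for i in range(len(X)) if i % n_folds != fold]
        test_idx = [i for i in range(len(X)) if i % n_folds == fold]
        train_X = [X[i] for i in train_idx]
        train_y = [y[i] for i in train_idx]
        for i in test_idx:
            if centroid_predict(train_X, train_y, X[i], features) == y[i]:
                correct += 1
    return correct / len(X)


def forward_selection(X, y, n_folds=3):
    """Add one probe at a time until cross-validated accuracy stops improving."""
    selected, best_acc = [], 0.0
    remaining = list(range(len(X[0])))
    while remaining:
        scored = [(cv_accuracy(X, y, selected + [f], n_folds), f) for f in remaining]
        acc, f = max(scored, key=lambda t: (t[0], -t[1]))
        if acc <= best_acc:  # plateau reached: no candidate improves accuracy
            break
        selected.append(f)
        remaining.remove(f)
        best_acc = acc
    return selected, best_acc


# Synthetic data: 30 samples, 4 probes; only probe 2 separates the classes.
rng = random.Random(0)
y = [i % 2 for i in range(30)]
X = [[rng.uniform(0, 1), rng.uniform(0, 1),
      2.0 * label + rng.uniform(-0.1, 0.1), rng.uniform(0, 1)] for label in y]
selected, accuracy = forward_selection(X, y)
print(selected, accuracy)
```

On this toy data the loop stops after the first informative probe, which mirrors the idea of ending the selection once the correct answer rate no longer rises.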

Next, the data set of the test examples was analyzed using the algorithm trained in this way. The normalized data for the probe set in use at the point where Number of Class Accuracy reached the plateau were input to the trained algorithm to verify how well the classification of the test examples matched the clinical diagnosis.

Numerous algorithms were constructed by varying the parameters of the neural network (learning efficiency, momentum, number of repetitions, number of layers, number of neurons), and the learning accuracy of each was verified by the cross-validation and the test examples described above.

As a result, it was possible to classify the test examples most correctly in the algorithm in which the parameters were set as follows.
Learning efficiency: 0.5
Momentum: 0.3
Number of repetitions: 115
Number of layers: 1
Number of neurons: 3
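
As a rough illustration of what these settings could mean, the sketch below trains a small network with one hidden layer of three sigmoid neurons by full-batch gradient descent with momentum, using 0.5 as the step size, 0.3 as the momentum, and 115 passes. How ArrayAssist actually interprets these parameters is not stated in this document, so the mapping, the cross-entropy loss, the weight initialization, and all names below are assumptions.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(W1, W2, x):
    """Return the biased input, biased hidden activations, and output."""
    xb = list(x) + [1.0]
    hb = [sigmoid(sum(w * v for w, v in zip(row, xb))) for row in W1] + [1.0]
    out = sigmoid(sum(w * v for w, v in zip(W2, hb)))
    return xb, hb, out

def train_mlp(X, y, n_hidden=3, lr=0.5, momentum=0.3, n_iter=115, seed=0):
    """One hidden layer, trained by full-batch gradient descent with momentum.
    The default values mirror the listed parameters (assumed mapping)."""
    rng = random.Random(seed)
    n_in, n = len(X[0]), len(X)
    # Each weight row carries a trailing bias term.
    W1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
          for _ in range(n_hidden)]
    W2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
    V1 = [[0.0] * (n_in + 1) for _ in range(n_hidden)]
    V2 = [0.0] * (n_hidden + 1)
    losses = []
    for _ in range(n_iter):
        G1 = [[0.0] * (n_in + 1) for _ in range(n_hidden)]
        G2 = [0.0] * (n_hidden + 1)
        loss = 0.0
        for x, t in zip(X, y):
            xb, hb, out = forward(W1, W2, x)
            loss -= t * math.log(out) + (1 - t) * math.log(1 - out)
            d_out = out - t  # cross-entropy gradient at the output unit
            for j in range(n_hidden + 1):
                G2[j] += d_out * hb[j]
            for j in range(n_hidden):
                d_h = d_out * W2[j] * hb[j] * (1 - hb[j])
                for i in range(n_in + 1):
                    G1[j][i] += d_h * xb[i]
        losses.append(loss / n)  # mean loss before this pass's update
        for j in range(n_hidden + 1):
            V2[j] = momentum * V2[j] - lr * G2[j] / n
            W2[j] += V2[j]
        for j in range(n_hidden):
            for i in range(n_in + 1):
                V1[j][i] = momentum * V1[j][i] - lr * G1[j][i] / n
                W1[j][i] += V1[j][i]

    def predict(x):
        """Output in (0, 1); larger values indicate class 1."""
        return forward(W1, W2, x)[2]

    return predict, losses
```

In this sketch each input `x` would hold the normalized signals of the selected probes, with label 1 for a patient and 0 for a healthy subject.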

Table 5 and Table 6 show the prediction results for learning examples and test examples based on this algorithm. Further, FIG. 1 shows the result of Forward Selection for this algorithm.

As shown in FIG. 1, the constructed algorithm can classify schizophrenia patients and healthy persons with high accuracy using the expression data obtained with 10 probes (Table 2, supra). When the biological functions of the genes detected by these probes were investigated, as shown in Table 3 above, they included a gene reported to act in the cranial nervous system.

5. Classification prediction based on CodeLink (trade name) 16-Assay Bioarray measurement results (multiple regression analysis)
As above, classification prediction by multiple regression analysis was attempted using the gene expression information of 60 untreated schizophrenia patients and 56 healthy subjects. Using the expression data of the 10 probes as explanatory variables, a multiple regression analysis was performed on the learning examples with commercially available software (SPSS), and a prediction formula was constructed. The analysis was set up so that the dependent variable takes higher values for schizophrenia patients. The dependent variable was then calculated for the test examples using the constructed prediction formula. The obtained prediction formula is as follows.
Y = (A₁X₁ + A₂X₂ + A₃X₃ + A₄X₄ + A₅X₅ + A₆X₆ + A₇X₇ + A₈X₈ + A₉X₉ + A₁₀X₁₀ + C) × 100
Here,
X₁ is the gene expression level of DLGAP3 (GE54859, SEQ ID NO: 42)
X₂ is the gene expression level of KCNJ15 (GE58277, SEQ ID NO: 77)
X₃ is the gene expression level of GPR30 (GE80129, SEQ ID NO: 165)
X₄ is the gene expression level of NPCR (GE540583, SEQ ID NO: 34)
X₅ is the gene expression level of TMED1 (GE85017, SEQ ID NO: 200)
X₆ is the gene expression level of PAFAH2 (GE62881, SEQ ID NO: 122)
X₇ is the gene expression level of TMEM23 (GE60313, SEQ ID NO: 98)
X₈ is the gene expression level of ABCG1 (GE586854, SEQ ID NO: 81)
X₉ is the gene expression level of PGRMC1 (GE62032, SEQ ID NO: 109)
X₁₀ is the gene expression level of INSL3 (GE88024, SEQ ID NO: 218).
The coefficients by which the expression level of each gene is multiplied are as follows.
A₁ is 1.00019621196698
A₂ is 0.273175387505458
A₃ is 0.606651443546423
A₄ is -0.659859665599205
A₅ is -0.287215519193429
A₆ is -0.271285204843002
A₇ is -0.220049126802913
A₈ is -0.00285057386315785
A₉ is 0.478133475554455
A₁₀ is -0.169744977943406
The constant C is 0.0429404615746508.
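
The prediction formula can be evaluated as in the following sketch. The coefficients and the constant are copied from the values listed above, and the gene symbol KCNJ15 follows the spelling used in the claims. The function names, the dictionary input format, and the choice to treat a value exactly equal to the cutoff as positive are assumptions made for illustration; the cutoff of 50 is the value used for the tabulated results.

```python
# Regression coefficients A1..A10, keyed by gene symbol (from the text).
COEFFICIENTS = {
    "DLGAP3": 1.00019621196698,
    "KCNJ15": 0.273175387505458,
    "GPR30": 0.606651443546423,
    "NPCR": -0.659859665599205,
    "TMED1": -0.287215519193429,
    "PAFAH2": -0.271285204843002,
    "TMEM23": -0.220049126802913,
    "ABCG1": -0.00285057386315785,
    "PGRMC1": 0.478133475554455,
    "INSL3": -0.169744977943406,
}
CONSTANT = 0.0429404615746508  # constant C
CUTOFF = 50.0                  # cutoff applied to the dependent variable

def dependent_variable(expression):
    """Y = (sum of A_i * X_i + C) x 100 for one sample.

    `expression` maps each gene symbol to its normalized expression level.
    """
    linear = sum(COEFFICIENTS[g] * expression[g] for g in COEFFICIENTS)
    return (linear + CONSTANT) * 100.0

def classify(expression, cutoff=CUTOFF):
    """Assumption: values at or above the cutoff are called positive."""
    y = dependent_variable(expression)
    return "schizophrenia" if y >= cutoff else "healthy"
```

A sample is passed as a mapping from gene symbol to its normalized expression level; a missing gene raises a `KeyError` rather than being silently treated as zero.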

Tables 7 and 8 show the results tabulated with a dependent variable value of 50 as the cutoff. The dependent variable calculated for each sample of the learning examples and the test examples is shown in FIG.

Table 7 (learning examples): Sensitivity: 92.5% (37/40), Specificity: 92.1% (35/38), Correct answer rate: 92.3% (72/78)

Table 8 (test examples): Sensitivity: 95.0% (19/20), Specificity: 94.4% (17/18), Correct answer rate: 94.7% (36/38)

As described above, with a cutoff value of 50, both the sensitivity and the specificity for the test examples exceeded 80%. Multiple regression analysis was thus also able to classify schizophrenia patients and healthy subjects with high accuracy using the expression levels of the above 10 genes.
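
The reported percentages follow directly from the counts given in parentheses. The short sketch below recomputes them; the function name is hypothetical, and the false-negative and false-positive counts are derived from the stated totals (for example, 40 - 37 = 3 patients missed among the learning examples).

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, correct answer rate) as fractions."""
    sensitivity = tp / (tp + fn)          # true positive ratio
    specificity = tn / (tn + fp)          # true negative ratio
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy

# Counts taken from the reported ratios 37/40, 35/38 and 19/20, 17/18.
learning = diagnostic_metrics(tp=37, fn=3, tn=35, fp=3)
test = diagnostic_metrics(tp=19, fn=1, tn=17, fp=1)

for name, result in (("learning", learning), ("test", test)):
    print(name, [round(100 * value, 1) for value in result])
```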

Claims

1. A method for detecting schizophrenia, wherein the expression levels of the following gene group (1) to (10) in a sample isolated from a living body are used as an index.
(1) DLGAP3 (SEQ ID NO: 1)
(2) KCNJ15 (SEQ ID NO: 2)
(3) GPR30 (SEQ ID NO: 3)
(4) NPCR (SEQ ID NO: 4)
(5) TMED1 (SEQ ID NO: 5)
(6) PAFAH2 (SEQ ID NO: 6)
(7) TMEM23 (SEQ ID NO: 7)
(8) ABCG1 (SEQ ID NO: 8)
(9) PGRMC1 (SEQ ID NO: 9)
(10) INSL3 (SEQ ID NO: 10)
2. The method according to claim 1, wherein only the expression levels of the gene group (1) to (10) are used as an index.
3. The method according to claim 1 or 2, wherein the expression levels of the gene group are measured with oligonucleotide probes having the base sequences shown in SEQ ID NO: 34, SEQ ID NO: 42, SEQ ID NO: 77, SEQ ID NO: 81, SEQ ID NO: 98, SEQ ID NO: 109, SEQ ID NO: 122, SEQ ID NO: 165, SEQ ID NO: 200 and SEQ ID NO: 218.
4. The method according to claim 1, comprising a step of comparing the expression levels of the gene group in the sample with previously measured expression levels of the gene group in known schizophrenia patients and healthy persons.
5. The method according to claim 4, wherein the comparison is performed by a neural network trained by a variable increment method using the expression levels of the gene group in known schizophrenia patients and healthy persons.
6. The method according to claim 4, wherein the comparison is performed by multiple regression analysis using the expression levels of the gene group as explanatory variables.
7. The method according to claim 6, comprising comparing the dependent variable calculated for the sample with a cutoff value determined based on the dependent variable calculated for known schizophrenia patients and healthy persons.
8. The method according to any one of claims 3 to 7, wherein the expression levels of the gene group are obtained by normalizing, by the global normalization method, the signal intensities of gene expression measured with an array spotted with oligonucleotide probes having the base sequences shown in SEQ ID NOs: 11 to 226.
9. The method according to claim 8, wherein the array further comprises oligonucleotide probes having the base sequences shown in SEQ ID NOs: 227 to 525.
10. The method according to any one of claims 1 to 9, wherein the sample is blood.