WO2018169121A1

WO2018169121A1 - Method, device, and program for predicting prognosis of synovial sarcoma by using artificial neural network

Info

Publication number: WO2018169121A1
Application number: PCT/KR2017/004189
Authority: WO
Inventors: 서성욱; 한일규; 김준혁
Original assignee: 사회복지법인 삼성생명공익재단
Priority date: 2017-03-16
Filing date: 2017-04-19
Publication date: 2018-09-20
Also published as: KR102172374B1; KR102172374B9; KR20180105905A

Abstract

A method for predicting the prognosis of synovial sarcoma by using an artificial neural network, according to one embodiment of the present invention, comprises the steps of: acquiring clinical data and survival period data of a plurality of synovial sarcoma patients; acquiring learning input data and learning output data from the clinical data and the survival period data; and teaching an artificial neural network, including an input layer, a hidden layer, and an output layer, by using the learning input data and the learning output data, so as to generate a model for predicting the survival rates of the synovial sarcoma patients.

Description

Method, Apparatus and Program for Predicting the Prognosis of Synovial Sarcoma Using an Artificial Neural Network

The present invention relates to a method, apparatus and program for predicting prognosis of synovial sarcoma using an artificial neural network.

Synovial sarcoma accounts for about 10% of malignant soft tissue tumors. It is a tumor that frequently occurs before the patient's 20s or 30s. Its pathophysiology and prognostic factors are not well understood, and the efficacy of chemotherapy or radiation therapy in addition to surgical resection has not been established.

Synovial sarcoma is difficult to study because its low prevalence makes it hard to enroll a large number of patients, and because study populations are heterogeneous, covering various histological subtypes, tumor locations in the trunk and limbs, and cases of inadequate surgical resection. Clear conclusions are therefore hard to reach. Many case series and other studies have examined the cell types expressed by the tumor, its actual biological behavior, the biological phenotypes more common in adolescence, prognostic factors, and the usefulness of adjuvant chemotherapy, but these questions remain controversial.

Existing studies on synovial sarcoma are limited to case reports or to analyses of the effect of radiation therapy or adjuvant chemotherapy after surgery, and cannot accurately predict the survival or prognosis of synovial sarcoma patients.

Disclosure of Invention

The present invention aims to provide a method, apparatus and program for predicting the prognosis of synovial sarcoma using an artificial neural network.

However, these problems are illustrative, and the scope of the present invention is not limited thereby.

A method for predicting the prognosis of synovial sarcoma using an artificial neural network according to an embodiment of the present invention includes: obtaining clinical data and survival period data of a plurality of synovial sarcoma patients; acquiring learning input data and learning output data from the clinical data and the survival period data; and generating a model for predicting the survival rate of synovial sarcoma patients by training an artificial neural network, including an input layer, a hidden layer, and an output layer, using the learning input data and the learning output data.

In one embodiment, the learning input data may include data on the age, sex, tumor location, initial metastasis, chemotherapy, radiation therapy, resection margin positivity, and pathological subtype of the plurality of synovial sarcoma patients.

In one embodiment, acquiring the learning input data and the learning output data from the clinical data and the survival period data may include filling in missing values (missing data, NaN) using a k-nearest neighbor (kNN) algorithm.

In one embodiment, generating the model for predicting the survival rate may include training the artificial neural network for each time interval.

In one embodiment, generating the model for predicting the survival rate may include: generating an N-th interval survival prediction model using the clinical data and N-th interval survival period data of the plurality of synovial sarcoma patients; and generating an (N+1)-th interval survival prediction model using N-th interval survival prediction data obtained from the N-th interval survival prediction model and (N+1)-th interval survival period data of the plurality of synovial sarcoma patients.

In one embodiment, generating the N-th interval survival prediction model may further include assigning a score according to the survival period to the N-th interval survival period data.

In one embodiment, the score may be proportional to the survival period within the N-th interval.

According to another embodiment of the present invention, an apparatus for predicting the prognosis of synovial sarcoma using an artificial neural network includes: a data acquisition unit configured to acquire clinical data and survival period data of a plurality of synovial sarcoma patients; an artificial neural network learning unit that acquires learning input data and learning output data from the clinical data and the survival period data, and trains an artificial neural network including an input layer, a hidden layer, and an output layer using the learning input data and the learning output data; and a survival prediction model generator that generates a model for predicting the survival rate of synovial sarcoma patients using the trained artificial neural network.

In one embodiment, the artificial neural network learning unit may fill in missing values (missing data, NaN) using a k-nearest neighbor (kNN) algorithm.

In one embodiment, the artificial neural network learning unit may train the artificial neural network for each time interval.

In one embodiment, the survival prediction model generator may generate an N-th interval survival prediction model using the clinical data and N-th interval survival period data of the plurality of synovial sarcoma patients, and may generate an (N+1)-th interval survival prediction model using N-th interval survival prediction data obtained from the N-th interval survival prediction model and (N+1)-th interval survival period data of the plurality of synovial sarcoma patients.

In one embodiment, when generating the N-th interval survival prediction model, the survival prediction model generator may assign a score according to the survival period to the N-th interval survival period data.

Another embodiment of the present invention discloses a computer program stored in a medium for executing, on a computer, the above-described method for predicting the prognosis of synovial sarcoma using an artificial neural network.

Other aspects, features, and advantages other than those described above will become apparent from the following drawings, claims, and detailed description of the invention.

According to the method, apparatus and program for predicting prognosis of synovial sarcoma using the artificial neural network according to the present invention, it is possible to accurately predict the prognosis of the synovial sarcoma patient for each individual. In addition, the prognosis of each treatment method can be simulated using the learned artificial neural network, so that the treatment method tailored to each patient can be determined. Of course, the scope of the present invention is not limited by these effects.

FIG. 1 is a flowchart showing a method for predicting the prognosis of synovial sarcoma using an artificial neural network according to an embodiment of the present invention.

FIG. 2 is a simplified illustration of the topology of the artificial neural network according to an embodiment of the present invention.

FIG. 3 is a diagram illustrating a method of generating a model for predicting the N-th interval survival rate of a synovial sarcoma patient according to the method for predicting the prognosis of synovial sarcoma using an artificial neural network according to an embodiment of the present invention.

FIG. 4 is a diagram schematically showing a part of a heatmap graph of an artificial neural network according to an embodiment of the present invention.

FIG. 5 is an ROC graph showing the prediction accuracy of the method for predicting the prognosis of synovial sarcoma using an artificial neural network according to an embodiment of the present invention.

FIG. 6 is a graph showing the results of Kaplan-Meier survival analysis for each clinical variable.

FIG. 7 is an ROC graph comparing the prediction accuracy of the Cox proportional hazards model and the method for predicting the prognosis of synovial sarcoma using the artificial neural network of the present invention.

FIG. 8 is a view schematically showing the configuration of an apparatus for predicting the prognosis of synovial sarcoma using an artificial neural network according to an embodiment of the present invention.

As the invention allows for various changes and numerous embodiments, particular embodiments will be illustrated in the drawings and described in detail. Effects and features of the present invention, and methods of achieving them will be apparent with reference to the embodiments described below in detail together with the drawings. However, the present invention is not limited to the embodiments disclosed below but may be implemented in various forms.

In the following embodiments, the terms first, second, etc. are used for the purpose of distinguishing one component from other components rather than having a limiting meaning.

In the following examples, the singular forms "a", "an" and "the" include plural forms unless the context clearly indicates otherwise.

In the following embodiments, terms such as "including" or "having" mean that a feature or component described in the specification is present, and do not preclude the possibility that one or more other features or components are added.

In the following embodiments, a 'node' means an abstract object whose value can be changed by a specific algorithm and which can be connected to other nodes.

In the following embodiments, the 'input layer' is a set of one or more nodes holding particular variables assigned by the user, and the 'output layer' is a set of one or more nodes holding the result values of a specific procedure determined by the user. The 'hidden layer' is a set of one or more nodes that store the intermediate results and temporary values that arise while the procedure set by the user is performed.

There may be connections between the nodes of the input layer and the nodes of the hidden layer, and between the nodes of the hidden layer and the nodes of the output layer, and each connection may have a specific weight given by a user-defined procedure.

In the following examples, the term 'prognosis' is a medical term indicating the prediction of survival, progression and recovery of a patient.

Hereinafter, exemplary embodiments of the present invention will be described in detail with reference to the accompanying drawings. The same or corresponding components are denoted by the same reference numerals, and redundant description thereof is omitted.

A method for predicting the prognosis of synovial sarcoma using an artificial neural network according to an embodiment of the present invention includes: obtaining clinical data and survival period data of a plurality of synovial sarcoma patients (S10); acquiring learning input data and learning output data from the clinical data and the survival period data (S20); and generating a model for predicting the survival rate of synovial sarcoma patients by training an artificial neural network, including an input layer, a hidden layer, and an output layer, using the learning input data and the learning output data (S30).

Referring to FIG. 1, a step (S10) of obtaining clinical data of a plurality of synovial sarcoma patients and survival time data after synovial sarcoma onset is performed.

In the present specification, the clinical data includes the patient's physical personal information such as age and sex, surgical records after the onset of synovial sarcoma, and pathological records related to synovial sarcoma such as recurrence. The survival period data after the onset of synovial sarcoma may mean, for a patient who has already died, the period from the time the synovial sarcoma was recognized to the time of death and, for a surviving patient, the period from the time the synovial sarcoma was recognized to the current point in time, but is not limited thereto.

Clinical data and survival data after synovial sarcoma onset can be obtained from synovial sarcoma patients in one or more hospitals or regions. Clinical data may be obtained from a medical image of a patient or may be obtained from a patient's specimen test result, but is not limited thereto. The present inventors obtained clinical data and survival data from 242 synovial sarcoma patients who were followed up from March 2001 to February 2013 at Seoul National University Hospital, Samsung Seoul Hospital, and National Cancer Center. Table 1 below summarizes the clinical data of the 242 synovial sarcoma patients.

[Table 1]
Variable	Value	Variable	Value
Median Age	37.45 (5-90)	Radiation Therapy	
Patient Sex		Yes	128 (52.9%)
M	116 (47.9%)	No	114 (47.1%)
F	126 (52.1%)	Resection Margin	
Tumour Size		Positive	24 (9.9%)
≤ 5 cm	129 (53.3%)	Negative	218 (90.1%)
> 5 cm	113 (46.6%)	Subtype	
Location of Tumour		Monophasic	149 (61.6%)
Trunk	100 (41.3%)	Biphasic	62 (25.6%)
Extremity	142 (58.7%)	Undetermined	31 (12.8%)
Initial Metastasis		Survival Periods (months)	65.26 (0.6-375)
Yes	26 (10.7%)		
No	216 (89.3%)	Overall Mortality	
Chemotherapy		Positive	46 (19%)
Yes	121 (50%)	Negative	196 (81.0%)
No	121 (50%)		

Thereafter, a step (S20) of acquiring the learning input data and the learning output data from the clinical data and the survival period data is performed.

The training input data refers to data to be input to a node of the input layer in order to learn an artificial neural network to be described later.

Table 1 shows variables that can be included in the learning input data and their categories. According to one embodiment, the learning input data may include data such as age, sex, tumor location, initial metastasis, chemotherapy, radiation therapy, resection margin positivity, and pathological subtype. In addition to these examples, various clinical variables such as stage, date of surgery, recurrence, cell division activity, tumor necrosis, malignancy, vascular invasion, and molecular genetic subtype may of course also be included in the learning input data.

Among the learning input data illustrated in [Table 1], variables that have only two categories, such as sex, tumor location, initial metastasis, chemotherapy, radiation therapy, and resection margin positivity, can be converted to numeric values by labeling the categories, for example as 1 and 2 or as 0 and 1.

Pathological subtypes are divided into monophasic, biphasic, and undetermined. The subtype may be treated as three categories including 'undetermined', or 'undetermined' may be treated as NaN so that the subtype has two categories; in either case each category is labeled and converted into a numeric value.

Quantitative variables such as age can be normalized and expressed as a single number.

Tumor size may be treated as two categories, 5 cm or less and more than 5 cm, or may be treated as a quantitative real variable that is normalized and expressed as a single number.
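The encoding described above can be sketched in Python as follows. This is only an illustrative sketch, not the inventors' implementation: the dictionary keys, the function name `encode_patient`, and the min-max normalization of age over the 5-90 range reported in Table 1 are assumptions, while the numeric labels follow the coding used in Table 2.

```python
import math

# Label codes as listed in Table 2; 'undetermined' subtype is treated as NaN
# (one of the two options the text allows).
CODES = {
    "sex": {"M": 1, "F": 2},
    "tumor_size": {"<=5cm": 1, ">5cm": 2},
    "tumor_location": {"trunk": 1, "extremity": 2},
    "initial_metastasis": {"no": 0, "yes": 1},
    "chemotherapy": {"no": 0, "yes": 1},
    "radiation_therapy": {"no": 0, "yes": 1},
    "resection_margin": {"negative": 0, "positive": 1},
    "subtype": {"monophasic": 1, "biphasic": 2, "undetermined": math.nan},
}
FEATURE_ORDER = list(CODES)      # fixed order of the eight categorical inputs
AGE_MIN, AGE_MAX = 5.0, 90.0     # assumed normalization range (age range in Table 1)


def encode_patient(record):
    """Return a 9-element numeric vector: normalized age plus eight label codes."""
    vector = [(record["age"] - AGE_MIN) / (AGE_MAX - AGE_MIN)]
    for name in FEATURE_ORDER:
        vector.append(float(CODES[name][record[name]]))
    return vector


# Patient A from Table 2, written with the hypothetical keys above.
patient_a = {
    "age": 54, "sex": "F", "tumor_size": ">5cm", "tumor_location": "trunk",
    "initial_metastasis": "yes", "chemotherapy": "yes",
    "radiation_therapy": "yes", "resection_margin": "positive",
    "subtype": "biphasic",
}
print(encode_patient(patient_a))
```

Other normalizations (for example, standardizing to zero mean and unit variance) would fit the description equally well; the document does not specify one.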

The learning output data refers to data to be compared with the node values produced at the output layer in order to train the artificial neural network. This learning output data is obtained from the patient's survival period data after the onset of synovial sarcoma. The learning output data may be, for example, information on whether the patient is alive N years after the onset of synovial sarcoma.

For example, suppose that patient A has clinical data and survival data as shown in Table 2 below.

[Table 2]
Clinical data
Variable	Classification	Data
Age	Real value	54
Sex	1 = male; 2 = female	2
Tumor size	1 = 5 cm or less; 2 = greater than 5 cm	2
Tumor location	1 = trunk; 2 = extremity	1
Initial metastasis	0 = negative; 1 = positive	1
Chemotherapy	0 = negative; 1 = positive	1
Radiation therapy	0 = negative; 1 = positive	1
Positive resection margin	0 = negative; 1 = positive	1
Pathological subtype	1 = monophasic; 2 = biphasic	2
Survival data
Survival period (months)	Real value (months)	58

Patient A was alive 4 years (48 months) after the onset of synovial sarcoma but had died by 5 years (60 months) after onset. The yearly survival data of patient A is therefore as shown in Table 3 below.

[Table 3]
N (years)	1	2	3	4	5
Survival of patient A (alive: 1, dead: 0)	1	1	1	1	0

Therefore, for example, when training the artificial neural network that predicts the survival rate of patient A after two years, the learning output data to be compared with the values output at the [survival rate node, mortality node] of the output layer of the artificial neural network, described later, is [1, 0]. When training the artificial neural network that predicts the survival rate after five years, the learning output data to be compared with the values output at the [survival rate node, mortality node] of the output layer may be [0, 1]. However, according to one embodiment of the present invention, when the patient has died, the learning output data need not be set to [0, 1] as above; instead, a method of assigning a graded score is provided, which will be described later.
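The yearly labels of Table 3 can be derived from the survival period with a few lines of Python. This is a minimal sketch under a stated assumption: `survival_months` is taken to be the time to death, so censored (still living) patients are not handled. The function names are hypothetical.

```python
def yearly_survival_labels(survival_months, years=5):
    """Return 1 if the patient is alive at the end of year N, else 0, for N = 1..years."""
    return [1 if survival_months >= 12 * n else 0 for n in range(1, years + 1)]


def one_hot(alive):
    """Map a survival flag to the [survival rate node, mortality node] target."""
    return [1, 0] if alive else [0, 1]


labels = yearly_survival_labels(58)          # patient A survived 58 months
targets = [one_hot(flag) for flag in labels]
print(labels)      # flags for years 1 to 5
print(targets[1])  # two-year target
print(targets[4])  # five-year target
```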

From the clinical data and the survival period data of the patient A, learning input data and learning output data as shown in Table 4 below can be obtained.

[Table 4]
Learning input data									Learning output data (survival after N years)				
Age	Sex	Size	Location	Initial metastasis	Chemotherapy	Radiation therapy	Resection margin	Pathological subtype	1	2	3	4	5
54	2	2	1	1	1	1	1	2	1	1	1	1	0

The learning input data and the learning output data thus obtained are used to train the artificial neural network to be described later.

Meanwhile, according to an embodiment of the present invention, acquiring the learning input data and the learning output data from the clinical data and the survival period data may include filling in missing values (missing data, NaN) using a k-nearest neighbor (kNN) algorithm.

For example, suppose there are patients A, B, and C with clinical data as shown in Table 5 below. In the case of patient C, no test was performed to confirm whether the resection margin was positive, so the corresponding value is missing (NaN).

[Table 5]
Variable	Classification	Patient A	Patient B	Patient C
Age	Real value	54	65	64
Sex	1 = male; 2 = female	2	1	1
Tumor size	1 = 5 cm or less; 2 = greater than 5 cm	2	1	1
Tumor location	1 = trunk; 2 = extremity	1	2	2
Initial metastasis	0 = negative; 1 = positive	1	0	0
Chemotherapy	0 = negative; 1 = positive	1	0	0
Radiation therapy	0 = negative; 1 = positive	1	0	1
Positive resection margin	0 = negative; 1 = positive	1	0	NaN
Pathological subtype	1 = monophasic; 2 = biphasic	2	1	1
Survival period (months)	Real value	58	72	46

In this case, whether the clinical data of patient C is closer to that of patient A or patient B may be determined based on, for example, the distance between the learning input data vectors of the patients. In the case of Table 5, since the clinical data of patient C is closer to that of patient B than to that of patient A, the resection margin value of patient B, 0, can be assigned to patient C.

In practice, the number of patients to be compared is large; the above example merely simplifies the situation to illustrate the kNN algorithm and does not necessarily reflect the actual situation. Since various kNN algorithms are known, a detailed description is omitted here.
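The nearest-neighbor fill for patient C in Table 5 can be illustrated with the following pure-Python sketch. It is one possible variant, not the specific kNN algorithm used by the inventors: the Euclidean distance over the features present in both vectors, the majority vote, and the names `distance` and `knn_impute` are all assumptions. The raw values of Table 5 are used as-is; in practice the features would normally be normalized first so that age does not dominate the distance.

```python
import math
from collections import Counter


def distance(a, b):
    """Euclidean distance over the features that are present (non-NaN) in both vectors."""
    shared = [(x, y) for x, y in zip(a, b) if not (math.isnan(x) or math.isnan(y))]
    return math.sqrt(sum((x - y) ** 2 for x, y in shared))


def knn_impute(target, donors, k=1):
    """Fill each NaN entry of `target` by majority vote among its k nearest donors."""
    filled = list(target)
    for i, value in enumerate(target):
        if not math.isnan(value):
            continue
        candidates = [d for d in donors if not math.isnan(d[i])]
        nearest = sorted(candidates, key=lambda d: distance(target, d))[:k]
        if nearest:  # leave the gap if no donor has a value for this feature
            filled[i] = Counter(d[i] for d in nearest).most_common(1)[0][0]
    return filled


# Clinical rows of Table 5: age, sex, size, location, metastasis,
# chemotherapy, radiation therapy, resection margin, subtype.
patient_a = [54, 2, 2, 1, 1, 1, 1, 1, 2]
patient_b = [65, 1, 1, 2, 0, 0, 0, 0, 1]
patient_c = [64, 1, 1, 2, 0, 0, 1, math.nan, 1]

filled_c = knn_impute(patient_c, [patient_a, patient_b], k=1)
print(filled_c[7])  # resection margin value copied from the nearest patient
```

With these values patient B is the nearest neighbor of patient C, so patient B's resection margin value is the one copied.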

For example, when an item of the learning input data that is input to the input layer of the artificial neural network of the present invention is not included in the clinical data of another region or hospital, the missing item may be filled in using the kNN algorithm. The artificial neural network can therefore be retrained by adding data from other regions even when that data has missing items.

Through this process, the clinical data and the survival period data after the onset of sarcoma are converted into a form that can be handled mathematically, thereby yielding the learning input data and the learning output data.

The clinical data of patients A, B, and C above are exemplary and do not limit the present invention. In addition, although the above description gives an example in which survival periods are expressed in months, the present invention is not limited thereto, and the survival periods may be divided into various units such as half years, quarters, months, and days according to design.

After the learning input data and the learning output data are acquired (S20), a step (S30) of generating a model for predicting the survival rate of synovial sarcoma patients by training the artificial neural network using the learning input data and the learning output data is performed.

FIG. 2 is a diagram briefly illustrating the topology of an artificial neural network (hereinafter also referred to as SNN) according to an embodiment of the present invention. The neural network has an input layer with multiple nodes, one or more hidden layers and an output layer.

The input layer of the artificial neural network has n_in nodes. A learning input data value is input to each node of the input layer, so the input layer has the form of an n_in × 1 matrix. The input layer may include a node to which survival prediction data of the synovial sarcoma patient is input. For example, if i data items are obtained from the clinical data of each synovial sarcoma patient, the learning input data input to the input layer of the artificial neural network that predicts survival after N + 1 years may be i + 1 data items, including the predicted N-year survival rate of the patient. For example, FIG. 2 shows a total of 10 nodes: nine nodes to which the nine learning input data items obtained from the clinical data are input, and one node 210 to which the survival prediction data is input.

On the other hand, the output layer of the artificial neural network has n_out nodes. The node values of the output layer, computed through the connection weights of each node and the activation function, are compared with the learning output data values. The output layer has the form of an n_out × 1 matrix. In one embodiment of the present invention, the output layer has two nodes, [survival rate node, mortality node], but is not limited thereto.

The plurality of hidden layers connect the n_in nodes of the input layer to the n_out nodes of the output layer. In an embodiment of the present invention, the hidden layers connect the input layer, to which the learning input data obtained from the clinical data is input, and the output layer, which includes the 'survival rate node'. The nodes of each hidden layer may be fully connected to the nodes of an adjacent hidden layer. In an embodiment of the present invention, the artificial neural network is trained using three hidden layers, but the number of hidden layers and the type of algorithm are not limited thereto.

When learning input data is input to the input layer and propagated through the hidden layers to the output layer, the artificial neural network is trained by adjusting the weights of the connections between nodes so as to minimize the difference between the learning output data value (actual value) corresponding to each learning input data item and the output value (predicted value).
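A forward pass through a network of this shape, and the difference that training tries to minimize, can be sketched in pure Python as follows. Everything beyond the topology is an assumption made for illustration: the hidden width of 8, the ReLU hidden activation, the softmax output with a cross-entropy difference measure, the random initialization, and all function names. The weight-update step itself (for example, gradient descent with backpropagation) is omitted.

```python
import math
import random


def init_layer(n_in, n_out, rng):
    """Random weights (n_out x n_in) and zero biases for one fully connected layer."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def dense(x, layer):
    """Weighted sum of the inputs plus bias for every node of the layer."""
    weights, biases = layer
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, biases)]


def relu(z):
    return [max(0.0, v) for v in z]


def softmax(z):
    m = max(z)  # subtract the maximum for numerical stability
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def forward(x, layers):
    """Input layer -> hidden layers (ReLU) -> output layer (softmax)."""
    for layer in layers[:-1]:
        x = relu(dense(x, layer))
    return softmax(dense(x, layers[-1]))


def cross_entropy(predicted, target):
    """Difference between the predicted values and the actual (target) values."""
    return -sum(t * math.log(max(p, 1e-12)) for p, t in zip(predicted, target))


rng = random.Random(0)
sizes = [10, 8, 8, 8, 2]  # 10 inputs, three hidden layers (width assumed), 2 outputs
layers = [init_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]

x = [0.58, 2, 2, 1, 1, 1, 1, 1, 2, 1.0]  # nine clinical inputs plus a survival input
survival, mortality = forward(x, layers)
loss = cross_entropy([survival, mortality], [1, 0])
print(survival, mortality, loss)
```

Training would repeatedly adjust `layers` to reduce `loss` over all patients in the training set.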

According to an embodiment of the present invention, generating the model for predicting the survival rate may include training the artificial neural network for each time interval. The time interval may be a year, half year, quarter, month, or the like. Hereinafter, the year is used as an example. For example, the artificial neural network can be trained on the clinical data of synovial sarcoma patients to predict their yearly survival, such as survival from one year to five years after onset.

According to an embodiment of the present invention, generating the model for predicting the survival rate may include: generating an N-th interval survival prediction model (PM_N) using the clinical data and N-th interval survival period data of the plurality of synovial sarcoma patients; and generating an (N+1)-th interval survival prediction model (PM_{N+1}) using N-th interval survival prediction data (P_N) obtained from the N-th interval survival prediction model and (N+1)-th interval survival period data of the plurality of synovial sarcoma patients.

According to one embodiment of the present invention, the survival rate is predicted for each of the 1st, 2nd, ..., (N+1)-th intervals (N: a natural number). The survival rate prediction result for the N-th interval is used to predict the survival rate for the (N+1)-th interval. That is, the survival rate for each interval is predicted in an inductive, step-by-step manner.

FIG. 3 shows the one-year survival prediction model (PM_1) and the N-year survival prediction model (PM_N). The survival prediction model PM_1, an input/output function that outputs the survival rate after one year (P_1) when the clinical data (X) and the initial survival rate (P_0) are input, is generated by training the artificial neural network.

In this case, the learning input data input to the input layer of the artificial neural network includes the clinical data (X) and the initial survival rate (P_0). The clinical data (X) input to each model may be initial values, that is, clinical data at the initial examination. The initial survival rate P_0 may be set to 1, for example.

For the learning output data used to predict the survival rate after one year, the survival status after one year, obtained from the patient's survival period data, is used. For example, if patient D died 15 months after the onset of sarcoma, patient D was alive one year after onset, so the learning output data to be compared with the values output at the [survival rate node, mortality node] of the output layer is [1, 0]. The artificial neural network is trained to predict the one-year survival rate of synovial sarcoma patients using such learning input data and learning output data.

Next, the two-year survival prediction model (PM_2), an input/output function that outputs the survival rate after two years (P_2) when the clinical data (X) and the one-year survival rate prediction result (P_1) are input, is generated by training the artificial neural network. In this case, the learning input data input to the input layer of the artificial neural network includes the clinical data (X) and the one-year survival rate prediction result (P_1).

The survival status after two years, obtained from the patient's survival period data, is used as the learning output data. For example, if patient D survived for 15 months after the onset of sarcoma and had therefore died by two years after onset, the learning output data to be compared with the values output at the [survival rate node, mortality node] of the output layer may be [0, 1].

However, according to one embodiment of the present invention, when the patient has died, the learning output data need not be set to [0, 1] as above; instead, a method of assigning a graded score is provided.

According to one embodiment, generating the N-th interval survival prediction model (PM_N) may further include assigning a score according to the survival period to the N-th interval survival period data. That is, in the present embodiment, the learning output data may be [p, 1-p], where p may be a non-zero score. According to one embodiment, the score may be proportional to the patient's survival period within the N-th interval. In this case, the survival period may be divided at least into monthly units. For example, the interval scores of patient D, who survived for 1 year and 3 months, are as shown in Table 6 below.

[Table 6]
N (years)	1	2	3	4	5
Interval score	1	3/12	0	0	0

Therefore, in this case, when training the artificial neural network that predicts the survival rate after two years, the learning output data to be compared with the values of the [survival rate node, mortality node] of the output layer is [3/12, 1 - 3/12] = [0.25, 0.75].
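The interval scores of Table 6 follow from a short calculation: the score for year N is the fraction of that year the patient survived, clamped to the range 0 to 1. The sketch below assumes monthly resolution and a known time of death; the function names are hypothetical.

```python
def interval_scores(survival_months, years=5):
    """Fraction of each yearly interval that the patient survived (0.0 to 1.0)."""
    scores = []
    for n in range(1, years + 1):
        months_into_interval = survival_months - 12 * (n - 1)
        scores.append(min(1.0, max(0.0, months_into_interval / 12)))
    return scores


def score_target(p):
    """Learning output data [p, 1 - p] for the [survival rate node, mortality node]."""
    return [p, 1 - p]


scores = interval_scores(15)     # patient D survived 1 year and 3 months
print(scores)                    # scores for years 1 to 5
print(score_target(scores[1]))   # target for the two-year model
```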

According to this method, even for patients whose follow-up period is less than five years because of death or the like (right-censored cases), the survival rate is not simply counted as 0; instead, a graded score is given according to the survival period. This increases the amount of meaningful data used to generate the survival prediction model and, as a result, improves the accuracy of the survival prediction.

The artificial neural network may then be trained to predict the two-year survival rate of synovial sarcoma patients using the learning input data and the score-based learning output data.

As this process is repeated (N = N + 1), the N-year survival prediction model (PM_N), an input/output function that outputs the survival rate after N years (P_N) when the clinical data (X) and the survival rate prediction result after N-1 years (P_{N-1}) are input, is generated by training the artificial neural network.
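At prediction time, the chained models are applied one after another, each receiving the clinical data X and the previous interval's prediction. The sketch below shows only this chaining; the toy models are stand-ins invented for illustration and do not represent trained networks, and `predict_survival_curve` is a hypothetical name.

```python
def predict_survival_curve(clinical, models, p0=1.0):
    """Chain the interval models: P_N = PM_N(X, P_{N-1}), starting from P_0."""
    curve = []
    previous = p0
    for model in models:
        previous = model(clinical, previous)
        curve.append(previous)
    return curve


def make_toy_model(retention):
    """Stand-in for a trained PM_N: scales the previous survival rate (illustration only)."""
    def model(clinical, previous):
        return previous * retention
    return model


models = [make_toy_model(r) for r in (0.95, 0.9, 0.9, 0.85, 0.8)]  # PM_1 .. PM_5
clinical_x = [0.58, 2, 2, 1, 1, 1, 1, 1, 2]                        # nine clinical inputs
curve = predict_survival_curve(clinical_x, models)
print(curve)  # P_1 .. P_5
```

In a real setting each entry of `models` would wrap a trained network for the corresponding interval.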

According to an embodiment of the present invention, the survival rate after N years is predicted using the survival rate prediction result after N-1 years (P_{N-1}), which reflects the patient's prognosis at the time point N-1 years after onset. Survival prediction performance therefore improves as the artificial neural network is trained year by year.

Meanwhile, as shown in Equation 1 below, the learning output data (Y_N) may be obtained by adding, to the value indicating actual survival after N years (S_N), the residual (λ_{N-1}) between the value indicating actual survival after N-1 years (S_{N-1}) and the predicted survival rate after N-1 years (P_{N-1}), multiplied by a coefficient β.

[Equation 1]
$$Y_N = S_N + \beta \cdot \lambda_{N-1}, \qquad \lambda_{N-1} = S_{N-1} - P_{N-1}$$

When model generation for predicting survival rate of synovial sarcoma patients is completed, the weight corresponding to the connection of each node of the neural network is learned to optimize the survival rate. Therefore, input data obtained from clinical data of any synovial sarcoma patient can be input to the input layer of the neural network, and the survival rate of the patient can be predicted through the output value of the output layer. That is, according to the prognostic prediction method of the synovial sarcoma using the artificial neural network according to the present invention, it is possible to accurately predict the prognosis of the synovial sarcoma patient for each individual. <Example>

Data acquisition

The present inventors constructed an artificial neural network based on clinical data and survival data from 242 synovial sarcoma patients who were followed up from March 2001 to February 2013 at Seoul National University Hospital, Samsung Seoul Hospital, and National Cancer Center. The training data used 80% of the total data and the test data used the remaining 20%.

Artificial Neural Network Structure

The present inventors modeled an artificial neural network predicting survival after 1 year, 2 years, 3 years, 4 years, and 5 years after the onset of synovial sarcoma. The neural network included an input layer, three hidden layers, and an output layer. The learning input data input to the input layer was composed of clinical data having nine variables and one survival rate data. The output layer consisted of two nodes applying the Softmax function and representing survival / mortality, and the hidden layer was fully-connected.
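The layer structure described above can be sketched as a plain forward pass. This is an untrained illustration only: the hidden layer widths (8, 8, 4), the random weights, and all function names are assumptions, while the 10 inputs, the three fully connected hidden layers, the two softmax outputs, and the tanh/tanh/ReLU activations follow the example.

```python
import math
import random


def dense(x, weights, biases):
    """Fully connected layer: one output per row of `weights`."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, biases)]


def tanh(z):
    return [math.tanh(v) for v in z]


def relu(z):
    return [max(0.0, v) for v in z]


def softmax(z):
    m = max(z)  # subtract the maximum for numerical stability
    e = [math.exp(v - m) for v in z]
    total = sum(e)
    return [v / total for v in e]


def init_layer(n_in, n_out, rng):
    """Random weights and zero biases (placeholder for trained parameters)."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


def forward(x, layers):
    """Three hidden layers (tanh, tanh, ReLU) followed by a softmax output."""
    activations = [tanh, tanh, relu, softmax]
    for (weights, biases), act in zip(layers, activations):
        x = act(dense(x, weights, biases))
    return x


rng = random.Random(0)
sizes = [10, 8, 8, 4, 2]  # hidden widths 8, 8, 4 are assumed, not from the source
layers = [init_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]
x = [0.0] * 9 + [1.0]  # nine clinical inputs plus one survival rate input
probs = forward(x, layers)  # two values: survival / death probabilities
```

Because the output layer applies softmax, the two output values are non-negative and sum to 1, so one of them can be read directly as a survival probability.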

At this time, the survival rate predicted after N years is multiplied by the weight α and input into the input layer of the artificial neural network predicting the survival rate after N + 1 years. Meanwhile, the residual λ, obtained by subtracting the predicted survival rate after N years from the actual survival value after N years, is multiplied by the coefficient β and added to the actual survival value after N + 1 years. The resulting value is compared with the output value of the output layer of the neural network predicting the survival rate after N + 1 years.

Graph 511 shows a state in which learning input data is input to the input layer of the artificial neural network predicting the survival rate after one year. The vertical axis of the heatmap of graph 511 is the serial number of each synovial sarcoma patient, and the horizontal axis corresponds to each node of the input layer of the artificial neural network. In one embodiment, the input layer includes a total of ten nodes: nine nodes obtained from the clinical data shown in [Table 1] and one survival node. The value corresponding to each node is represented by the intensity of the color.

Thereafter, the coefficients for each node are learned. The results of learning are labeled as survival or death, and are finally expressed as survival probabilities through the softmax function. Referring to graph 514, survival and death after one year for the plurality of synovial sarcoma patients are represented by two nodes. In other words, the neural network predicting survival after one year reduces a total of 10 node values (shown in graph 511) to a total of 2 node values (shown in graph 514) through the hidden layers (shown in graphs 512 and 513).

Meanwhile, the survival rate prediction data obtained for one year is input to the input layer of the model predicting the survival rate after two years (Graph 531). This process is repeated, and finally the model predicting survival after five years outputs the predicted survival five years after the onset of synovial sarcoma (Graph 554). This output is then compared with the actual survival data after five years (500), which serves as the reference for evaluating the accuracy of the survival rate prediction.
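The year-by-year chaining can be sketched as a simple loop. The yearly models below are fixed-value placeholders standing in for trained networks, and the choice of 1.0 as the survival input for the first year is an assumption; `chain_predictions` and `make_placeholder_model` are hypothetical names.

```python
def chain_predictions(clinical, models, alpha, initial_survival=1.0):
    """Run the yearly models in order, feeding each prediction into the next.

    clinical: nine clinical input values for one patient
    models:   one callable per year, mapping a 10-value input to a survival rate
    alpha:    weight applied to the previous prediction before it is fed in
    Returns the list of inputs used and the list of yearly predictions.
    """
    inputs, predictions = [], []
    previous = initial_survival  # assumed starting value for the first year
    for model in models:
        x = list(clinical) + [alpha * previous]
        previous = model(x)
        inputs.append(x)
        predictions.append(previous)
    return inputs, predictions


def make_placeholder_model(p):
    """Stand-in for a trained yearly network; always returns probability p."""
    def model(x):
        assert len(x) == 10  # nine clinical values plus one survival input
        return p
    return model


# Illustrative values only; they are not results from the source.
models = [make_placeholder_model(p) for p in (0.95, 0.90, 0.85, 0.80, 0.75)]
inputs, preds = chain_predictions([0.0] * 9, models, alpha=0.01)
```

In a real setting each placeholder would be replaced by the network trained for that year, so the tenth input of the model for year N + 1 carries the prediction made for year N.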

In this case, the data of patients with a follow-up period of less than 5 years (censored data, that is, right-censored cases) were included in the training data, but were excluded from the final test data so that the test data contained only binary survival / death information.

FIG. 5 is a receiver operating characteristic (ROC) graph showing the prediction accuracy of the method for predicting the prognosis of synovial sarcoma using the artificial neural network of the present invention. The accuracy of survival prediction can be quantified by the area under the curve (AUC) of the ROC graph, and the closer the area is to 1, the higher the accuracy.
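As a small illustration of how an AUC value can be obtained, the sketch below uses the rank interpretation of the AUC: the probability that a randomly chosen survivor receives a higher predicted score than a randomly chosen non-survivor, with ties counted as one half. The function name `roc_auc` and the sample labels and scores are hypothetical.

```python
def roc_auc(labels, scores):
    """AUC computed by comparing every positive/negative pair of scores."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos) * len(neg))


# Illustrative data: 1 = survived, 0 = died; scores are predicted survival rates.
labels = [1, 1, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.2]
auc = roc_auc(labels, scores)
```

In this toy data five of the six survivor/non-survivor pairs are ordered correctly, so the AUC is 5/6.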

In an embodiment of the present invention, K-fold cross-validation was used (n = 3). The hyperparameters, namely the number of training epochs (n_epoch), the residual coefficient (β), the probability coefficient (α), the number of hidden layers, and the types of node functions, were tuned to maximize the average AUC. The AUCs of the final ROC curves obtained were 0.93, 0.85 and 0.87, respectively, and the tuned hyperparameter values were n_epoch = 3, residual coefficient (β) = 0.3, probability coefficient (α) = 0.01, and number of hidden layers = 3. The node functions were the tanh, tanh, and ReLU functions for the hidden layers and the softmax function for the output layer.
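The fold construction for such a cross-validation can be sketched as follows. The shuffling, the seed, and the name `k_fold_indices` are assumptions; the source only states that three folds were used.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_indices, validation_indices) for each of k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k nearly equal parts
    for i in range(k):
        validation = folds[i]
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, validation


# Ten samples split into three folds.
splits = list(k_fold_indices(10, 3))
```

Each candidate hyperparameter setting would then be trained on every `train` part and scored on the matching `validation` part, and the setting with the highest average AUC across the folds would be kept.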

Selection of covariates for the survival prediction model

Kaplan-Meier survival analysis and log-rank tests were performed to select, from the clinical data, the covariates to be used in the survival prediction model.
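For reference, the Kaplan-Meier estimate itself can be sketched in a few lines. This shows only the standard product-limit estimator, not the log-rank test, and the function name and sample data are hypothetical.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time where a death was observed.

    times:  follow-up time for each patient
    events: 1 if death was observed at that time, 0 if the patient was censored
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            survival *= 1.0 - deaths / n_at_risk
            curve.append((t, survival))
        n_at_risk -= removed  # deaths and censored patients leave the risk set
    return curve


# Five patients: deaths at t = 1, 2 and 4; censored at t = 2 and 3.
curve = kaplan_meier([1, 2, 2, 3, 4], [1, 1, 0, 0, 1])
```

Censored patients reduce the number at risk without lowering the curve, which is how this estimator uses incomplete follow-up. Comparing the curves of two groups (for example, tumors larger than 5 cm versus smaller ones) with a log-rank test then gives a p-value for each candidate covariate.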

FIG. 6 is a graph showing the results of Kaplan-Meier survival analysis for each clinical variable. The graphs correspond to (a) age over 38 years, (b) male sex (p = 0.021), (c) tumor larger than 5 cm (p = 0.004), (d) axial tumor location (p = 0.007), (e) initial metastasis (p = 0.001), (h) positive resection margin (p = 0.004), and (i) monophasic subtype (p = 0.0043). The variables (f) chemotherapy and (g) radiotherapy were included in order to assess the effects of treatment.

Comparison with the Cox proportional hazard model

For comparison with the method for predicting the prognosis of synovial sarcoma using an artificial neural network according to an embodiment of the present invention, multivariate Cox proportional hazards regression (CoxPHR) was performed using the same training data and test data.

FIG. 7 is an ROC graph comparing the prediction accuracy of the Cox proportional hazard model and the method for predicting the prognosis of synovial sarcoma using the artificial neural network of the present invention. The AUCs of the graphs were compared using the DeLong method.

The AUC was 0.918 (95% confidence interval: 0.829-0.970) for the model according to the invention (SNN) and 0.745 (95% confidence interval: 0.629-0.841) for the Cox model (COX). The AUC difference between the two models was 0.173 (95% confidence interval: 0.008-0.337), which was statistically significant (p = 0.039). Therefore, the performance of the SNN is higher than that of the Cox model.

The performance of the SNN is better because the coefficients of the input layer nodes can differ for each annual interval, and because censored data can also be analyzed in a nonparametric manner.

Prognosis simulation according to treatment method

After constructing the artificial neural network for the prediction of synovial sarcoma prognosis as described above, the survival rate according to the treatment method was simulated using the treatment method as a variable for each patient.

[Table 7] and [Table 8] are tables simulating the survival rate of the patient when the chemotherapy input to the survival prediction model is varied.

Sex	Size	Location	Initial meta	Margin	Subtype	Real outcome: Survive	Real outcome: Death	5-year survival probability: Chemotherapy	5-year survival probability: No adjuvant
male	>5cm	Ext	0	0	unclassified	38	1	0.639	0.835
female	>5cm	Ext	0	nan	unclassified	97	1	0.640	0.705
male	>5cm	Axial	0	0	bi	23	1	0.618	0.680
male	>5cm	Ext	0	0	mono	22	1	0.622	0.682

Referring to [Table 7], when the patient is a female and the tumor size is less than 5 cm, it can be seen that the survival rate is greatly increased by adjuvant chemotherapy. Meanwhile, referring to [Table 8], when the patient is a male and the tumor size is smaller than 5 cm, it can be seen that the survival rate is lowered by the adjuvant chemotherapy.

That is, according to the method for predicting the prognosis of synovial sarcoma using the artificial neural network according to the present invention, the prognosis under each treatment method can be simulated using the trained artificial neural network, so that a patient-specific treatment can be determined.
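Such a simulation amounts to evaluating the trained model twice for the same patient, once with the treatment variable set and once with it cleared. The sketch below assumes a dictionary of covariates and uses a placeholder in place of the trained network; `simulate_treatment`, `placeholder_predict`, the key names, and the numbers are all hypothetical and are not results from the source.

```python
def simulate_treatment(predict, patient, treatment_key="chemotherapy"):
    """Return (survival with treatment, survival without treatment)."""
    with_treatment = dict(patient, **{treatment_key: 1})
    without_treatment = dict(patient, **{treatment_key: 0})
    return predict(with_treatment), predict(without_treatment)


def placeholder_predict(patient):
    """Stand-in for the trained network; the values are illustrative only."""
    return 0.7 - 0.05 * patient["chemotherapy"]


patient = {"sex": "male", "size_gt_5cm": 1, "chemotherapy": 0}
p_chemo, p_none = simulate_treatment(placeholder_predict, patient)
```

The patient record itself is left unchanged; only copies with the treatment flag modified are passed to the model, so the same approach can be repeated for other treatment variables.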

The apparatus 10 for predicting prognosis of synovial sarcoma shown in FIG. 8 shows only components related to the present embodiment in order to prevent the features of the present embodiment from being blurred. Accordingly, it will be understood by those skilled in the art that other general purpose components may be further included in addition to the components illustrated in FIG. 8.

The prognostic prediction apparatus 10 of synovial sarcoma according to an embodiment of the present invention may correspond to at least one processor or may include at least one processor. Accordingly, the prognostic prediction device 10 of synovial sarcoma may be driven in a form included in another hardware device such as a microprocessor or a general purpose computer system.

The invention can be represented by functional block configurations and various processing steps. Such functional blocks may be implemented by any number of hardware and/or software configurations that perform particular functions. For example, the present invention may employ integrated circuit configurations, such as memory, processing, logic, and look-up table elements, that are capable of executing various functions under the control of one or more microprocessors or other control devices. Just as the components of the present invention may be implemented with software programming or software elements, the present invention may be implemented in a programming or scripting language such as C, C++, Java, or assembler, with various algorithms implemented as combinations of data structures, processes, routines, or other programming constructs. The functional aspects may be implemented with an algorithm running on one or more processors. In addition, the present invention may employ conventional techniques for electronic environment setting, signal processing, and/or data processing. Terms such as "mechanism", "element", "means", and "configuration" may be used broadly, and the components of the present invention are not limited to mechanical and physical configurations. These terms may include the meaning of a series of software routines in conjunction with a processor or the like.

Referring to FIG. 8, the apparatus 10 for predicting the prognosis of synovial sarcoma includes a data acquisition unit 11, an artificial neural network learning unit 12, and a survival prediction model generation unit 13.

The data acquisition unit 11 acquires medical data of a plurality of synovial sarcoma patients, such as clinical data and survival time data after the onset of synovial sarcoma. The clinical data may be obtained from a medical image of the patient or may be obtained from a patient's specimen test result, but is not limited thereto.

The artificial neural network learning unit 12 obtains learning input data and learning output data from the clinical data and survival period data of the plurality of synovial sarcoma patients, and trains an artificial neural network including an input layer, a hidden layer, and an output layer using the learning input data and the learning output data.

The survival prediction model generation unit 13 predicts the survival rate of a synovial sarcoma patient using the trained artificial neural network. In this case, predicting the survival rate may mean inputting clinical information of the synovial sarcoma patient and calculating the survival rate of the patient through a predetermined algorithm.

In one embodiment, the artificial neural network learning unit 12 may fill in missing data (NaN) using a k-nearest neighbor algorithm (knn).
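A k-nearest neighbor fill-in of missing values can be sketched as below. The details are assumptions: missing entries are represented by `None`, neighbors are chosen only among complete rows using Euclidean distance over the observed columns, and the missing value is replaced by the neighbors' mean. The source does not specify the distance measure, the value of k, or how categorical variables are handled.

```python
import math


def knn_impute(rows, k=3):
    """Fill None entries with the column mean over the k nearest complete rows."""
    complete = [r for r in rows if all(v is not None for v in r)]
    if not complete:
        raise ValueError("need at least one complete row")
    result = []
    for row in rows:
        missing = [j for j, v in enumerate(row) if v is None]
        if not missing:
            result.append(list(row))
            continue
        observed = [j for j, v in enumerate(row) if v is not None]

        def distance(other, row=row, observed=observed):
            # Euclidean distance over the columns observed in this row.
            return math.sqrt(sum((row[j] - other[j]) ** 2 for j in observed))

        neighbours = sorted(complete, key=distance)[:k]
        filled = list(row)
        for j in missing:
            filled[j] = sum(n[j] for n in neighbours) / len(neighbours)
        result.append(filled)
    return result


# The last row is missing its second value.
rows = [[1.0, 2.0], [1.1, 2.2], [5.0, 9.0], [1.05, None]]
imputed = knn_impute(rows, k=2)
```

In the example the incomplete row is closest to the first two rows, so its missing value is filled with their mean rather than being influenced by the distant third row.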

In one embodiment, the artificial neural network learning unit 12 may learn the artificial neural network for each time interval.

In one embodiment, the survival prediction model generation unit 13 may generate an N-th interval survival prediction model using the clinical data and the N-th interval survival period data of the plurality of synovial sarcoma patients, and may generate an (N+1)-th interval survival prediction model using the N-th interval survival prediction data obtained from the N-th interval survival prediction model and the (N+1)-th interval survival period data of the plurality of synovial sarcoma patients.

In one embodiment, the survival prediction model generation unit 13 may assign a score according to the survival period to the N-th section survival period data when generating the N-th section survival rate prediction model.

Meanwhile, the method for predicting the prognosis of synovial sarcoma using an artificial neural network according to an embodiment of the present invention shown in FIG. 1 may be written as a program that can be executed by a computer, and may be implemented in a general-purpose digital computer that runs the program using a computer-readable recording medium. The computer-readable recording medium may include storage media such as magnetic storage media (e.g., a ROM, a floppy disk, a hard disk, etc.) and optical reading media (e.g., a CD-ROM, a DVD, etc.).

According to the method, apparatus and program for predicting prognosis of synovial sarcoma using the artificial neural network according to the present invention, it is possible to accurately predict the prognosis of the synovial sarcoma patient for each individual. In addition, the prognosis of each treatment method can be simulated using the learned artificial neural network, so that the treatment method tailored to each patient can be determined.

Although the present invention has been described with reference to the embodiments shown in the drawings, this is merely exemplary, and it will be understood by those skilled in the art that various modifications and equivalent other embodiments are possible. Therefore, the true technical protection scope of the present invention will be defined by the technical spirit of the appended claims.

The present invention relates to a method, apparatus, and program for predicting the prognosis of synovial sarcoma using an artificial neural network, and may be used in an industry for predicting the prognosis of a disease.

Claims

A method for predicting prognosis of synovial sarcoma using an artificial neural network, the method comprising:

obtaining clinical data and survival period data of a plurality of synovial sarcoma patients;

acquiring learning input data and learning output data from the clinical data and the survival period data; and

training an artificial neural network including an input layer, a hidden layer, and an output layer using the learning input data and the learning output data to generate a model for predicting the survival rate of a synovial sarcoma patient.
The method of claim 1,

wherein the learning input data includes data on the age, sex, tumor location, initial metastasis status, chemotherapy status, radiation therapy status, resection margin status, and pathological subtype of the synovial sarcoma patients.
The method of claim 1,

wherein acquiring the learning input data and the learning output data from the clinical data and the survival period data comprises filling in missing data (NaN) using a k-nearest neighbor algorithm (knn).
The method of claim 1,

wherein generating the model for predicting the survival rate comprises training the artificial neural network for each time interval.
The method of claim 1,

wherein generating the model for predicting the survival rate comprises:

generating an N-th interval survival prediction model using the clinical data and N-th interval survival period data of the plurality of synovial sarcoma patients; and

generating an (N+1)-th interval survival prediction model using N-th interval survival prediction data obtained from the N-th interval survival prediction model and (N+1)-th interval survival period data of the plurality of synovial sarcoma patients.
The method of claim 5,

wherein generating the N-th interval survival prediction model further comprises assigning a score according to the survival period to the N-th interval survival period data.
The method of claim 6,

wherein the score is proportional to the survival period in the N-th interval.
An apparatus for predicting prognosis of synovial sarcoma using an artificial neural network, the apparatus comprising:

a data acquisition unit for acquiring clinical data and survival period data of a plurality of synovial sarcoma patients;

an artificial neural network learning unit which acquires learning input data and learning output data from the clinical data and the survival period data, and trains an artificial neural network including an input layer, a hidden layer, and an output layer by using the learning input data and the learning output data; and

a survival prediction model generation unit for generating a model for predicting the survival rate of a synovial sarcoma patient using the trained artificial neural network.
The apparatus of claim 8,

wherein the learning input data includes data on the age, sex, tumor location, initial metastasis status, chemotherapy status, radiation therapy status, resection margin status, and pathological subtype of the synovial sarcoma patients.
The apparatus of claim 8,

wherein the artificial neural network learning unit fills in missing data (NaN) using a k-nearest neighbor algorithm (knn).
The apparatus of claim 8,

wherein the artificial neural network learning unit trains the artificial neural network for each time interval.
The apparatus of claim 8,

wherein the survival prediction model generation unit:

generates an N-th interval survival prediction model using the clinical data and N-th interval survival period data of the plurality of synovial sarcoma patients; and

generates an (N+1)-th interval survival prediction model using N-th interval survival prediction data obtained from the N-th interval survival prediction model and (N+1)-th interval survival period data of the plurality of synovial sarcoma patients.
The apparatus of claim 12,

wherein the survival prediction model generation unit, when generating the N-th interval survival prediction model, assigns a score according to the survival period to the N-th interval survival period data.
The apparatus of claim 13,

wherein the score is proportional to the survival period in the N-th interval.
A computer program stored in a medium for executing the method of any one of claims 1 to 5 using a computer.