WO2022086053A1

WO2022086053A1 - Artificial intelligence-based microarray specific determinant extraction system

Info

Publication number: WO2022086053A1
Application number: PCT/KR2021/014237
Authority: WO
Inventors: 이명재; 강신욱; 김원태; 김동민
Original assignee: (주)제이엘케이
Priority date: 2020-10-19
Filing date: 2021-10-14
Publication date: 2022-04-28

Abstract

The present invention relates to a system for extracting the chromosome probes used to classify a specific class in a microarray. The present invention calculates a third loss value by linearly combining a first loss value, calculated using a softmax loss function, with a second loss value, calculated using a root mean square error loss function, and then updates a first weight and a second weight of a neural network module. In the neural network module trained with the third loss value, the present invention then searches for the input-layer neuron whose first weights to the hidden-layer neurons have the largest sum, and determines the bacterial artificial chromosome corresponding to that neuron to be the determinant bacterial artificial chromosome, that is, the chromosome probe that affects class classification.

Description

Artificial intelligence-based microarray specific determinant extraction system

The present invention relates to a system for extracting chromosome probes used when classifying a class of a characteristic in a microarray.

A DNA microarray, also known as a DNA chip, is a glass slide on whose surface a large number of gene fragments are attached at regular intervals.

Gene fragments arranged and attached at regular intervals in a DNA microarray are defined as probes, and may have a known nucleotide sequence of a specific gene.

DNA microarrays can be used to investigate the expression level of large amounts of genes in specific cells. For example, by examining the expression of a large amount of genes using a DNA microarray in various types of cancer cells, the similarity of gene expression patterns among individual cancer cells can be compared.

DNA microarrays can also be used to compare differences in gene expression between different classes of a trait. For example, a gene showing a difference in expression level can be detected by comparing its expression level in cells before and after drug treatment, or between normal tissue and diseased tissue.

One method of analyzing gene expression levels with a DNA microarray is to bind fluorescence-labeled cDNA (complementary DNA) to the DNA microarray and then measure the intensity of the fluorescence displayed on the microarray.

To compare the difference in gene expression between two classes using a DNA microarray, the cDNAs of the two classes can be labeled with different fluorescence (for example, red fluorescence and blue fluorescence) and bound to the DNA microarray. If the proportion of red fluorescence is high, the expression level of the gene is high in one of the two classes; if the proportion of blue fluorescence is high, the expression level of the gene is high in the other class. When the expression levels of the gene in the two classes are the same or similar, yellow fluorescence may be displayed through the combination of red fluorescence and blue fluorescence.

On the other hand, the existing chromosomal testing method for detecting chromosomal abnormalities such as Down's syndrome and Turner's syndrome has a problem in that the diagnosis accuracy of diseases caused by minute chromosomal abnormalities is significantly lowered.

It is an object of the present invention to provide a system for extracting chromosomal probes affecting class classification in a microarray.

More specifically, it is an object of the present invention to provide a system for extracting, from a microarray, the bacterial artificial chromosomes that affect class classification.

In order to achieve the above object, the present invention provides a microarray specific determinant extraction system including: a data extraction module for generating first bacterial artificial chromosome expression ratio data by extracting expression ratio information for each bacterial artificial chromosome from first microarray data; a normalization module for performing Lowess normalization on the first bacterial artificial chromosome expression ratio data; a neural network module comprising an input layer, a hidden layer, and an output layer, which receives the first bacterial artificial chromosome expression ratio data and calculates class information of a characteristic to be classified; a decoding module for generating first classification class information including the values of the neurons of the output layer of the neural network module; a first loss value calculation module for calculating a first loss value by inputting first correct answer class information and the first classification class information into a softmax loss function; a second loss value calculation module for calculating a second loss value by inputting the first correct answer class information and the first classification class information into a root mean square error loss function; and a model design module for calculating a third loss value by inputting the first loss value and the second loss value into a linear combination function.

In another embodiment, the present invention provides a method for extracting microarray specific determinants in the microarray specific determinant extraction system, the method comprising:

(S1) generating, by the data extraction module, first bacterial artificial chromosome expression ratio data by extracting expression ratio information for each bacterial artificial chromosome from the first microarray data;

(S2) performing, by the normalization module, Lowess normalization on the first bacterial artificial chromosome expression ratio data;

(S3) converting, by the encoding module, the data format so that the first bacterial artificial chromosome expression ratio data corresponds to the neurons of the input layer of the neural network module;

(S4) receiving, by the neural network module, the first bacterial artificial chromosome expression ratio data, and calculating class information of a characteristic to be classified for each neuron of the output layer;

(S5) generating, by the decoding module, first classification class information including the values of the neurons of the output layer of the neural network module;

(S6) calculating, by the first loss value calculation module, a first loss value by inputting first correct answer class information and the first classification class information into a softmax loss function;

(S7) calculating, by the second loss value calculation module, a second loss value by inputting the first correct answer class information and the first classification class information into a root mean square error loss function;

(S8) calculating, by the model design module, a third loss value by inputting the first loss value and the second loss value into a linear combination function;

(S9) updating, by the model design module, through backpropagation of the third loss value, a first weight, which is relationship information between each neuron of the input layer and each neuron of the hidden layer of the neural network module, and a second weight, which is relationship information between each neuron of the hidden layer and each neuron of the output layer; and

(S10) searching, by the model design module, among the neurons of the input layer of the neural network module, for the neuron whose first weights to the neurons of the hidden layer have the largest sum, and determining the bacterial artificial chromosome corresponding to the bacterial artificial chromosome identifier of the first bacterial artificial chromosome expression ratio data that corresponds to the retrieved neuron to be the determinant bacterial artificial chromosome,

wherein steps (S4) to (S9) are repeated until the third loss value converges.

In the present invention, the third loss value is calculated by linearly combining the first loss value calculated using the softmax loss function and the second loss value calculated using the root mean square error loss function, and the neural network module can then be trained with it. In the neural network module trained according to the third loss value, the model design module searches for the neuron of the input layer whose first weights to the neurons of the hidden layer have the largest sum, and can determine the bacterial artificial chromosome corresponding to that neuron to be the determinant bacterial artificial chromosome, that is, the chromosome probe affecting class classification. Accordingly, the chromosome probe that most influences each characteristic can be determined quickly, and the accuracy of that determination can be improved.

FIG. 1 is a block diagram schematically illustrating a microarray specific determinant extraction system according to an embodiment of the present invention.

FIG. 2 is a block diagram schematically illustrating microarray data input to a microarray specific determinant extraction system according to an embodiment of the present invention.

FIG. 3 is a block diagram schematically illustrating a configuration included in a neural network module in a microarray specific determinant extraction system according to an embodiment of the present invention.

FIG. 4 is a flowchart schematically illustrating a method for extracting a microarray specific determinant according to an embodiment of the present invention.

The present invention may be practiced with various modifications without departing from its spirit, and may have one or more embodiments. The examples described in the "specific contents for carrying out the invention" and the "drawings" serve to describe the present invention in detail, and do not limit the scope of the present invention.

Accordingly, anything that those having ordinary knowledge in the technical field to which the present invention pertains can easily infer from the "specific contents for carrying out the invention" and the "drawings" may be interpreted as belonging to the scope of the present invention.

In addition, the size and shape of each component shown in the drawings may be exaggerated for the description of the embodiment, and do not limit the size and shape of the actually implemented invention.

Unless a term used in the specification of the present invention is specifically defined, it may have the same meaning as commonly understood by a person of ordinary skill in the art to which the present invention belongs.

Hereinafter, an embodiment of the present invention will be described in detail with reference to the drawings.

FIG. 1 is a block diagram schematically illustrating a microarray specific determinant extraction system according to an embodiment of the present invention. FIG. 2 is a block diagram schematically illustrating microarray data input to a microarray specific determinant extraction system according to an embodiment of the present invention. FIG. 3 is a block diagram schematically illustrating a configuration included in a neural network module in a microarray specific determinant extraction system according to an embodiment of the present invention.

The microarray specific determinant extraction system 100 according to an embodiment of the present invention may include an input/output module 111, a storage module 112, a data extraction module 121, a normalization module 122, an encoding module 130, a neural network module 140, a decoding module 150, a first loss value calculation module 161, a second loss value calculation module 162, a model design module 170, a correction value extraction module 180, and a visualization module 190.

The microarray specific determinant extraction system 100 according to an embodiment of the present invention can process, as its target, a microarray that includes bacterial artificial chromosomes (BACs) as probes.

The input/output module 111 may receive the first microarray data MA1 and the second microarray data MA2 from the outside of the microarray specific determinant extraction system 100 .

The first microarray data MA1 may be training data, and may include bacterial artificial chromosome (BAC) information (MA1-B, hereinafter “first bacterial artificial chromosome information”), positive or negative expression ratio information of the bacterial artificial chromosome (BAC) (MA1-R, hereinafter “first expression ratio information”), and correct answer class information (MA1-C, hereinafter “first correct answer class information”).

The first bacterial artificial chromosome information (MA1-B) may include a bacterial artificial chromosome identifier (MA1-Bi), the position at which the bacterial artificial chromosome is arranged on the microarray (MA1-Bp), and genetic information of the bacterial artificial chromosome (MA1-Bg). The first correct answer class information MA1-C may be correct answer data corresponding to the first expression ratio information MA1-R.

The second microarray data MA2 may be verification data or general data used after verification, and may include bacterial artificial chromosome (BAC) information (MA2-B, hereinafter “second bacterial artificial chromosome information”), positive or negative expression ratio information of the bacterial artificial chromosome (BAC) (MA2-R, hereinafter “second expression ratio information”), and correct answer class information in which probability information is defined in advance for each class of the characteristic to be classified (MA2-C, hereinafter “second correct answer class information”).

The second bacterial artificial chromosome information (MA2-B) may include a bacterial artificial chromosome identifier (MA2-Bi), the position at which the bacterial artificial chromosome is arranged on the microarray (MA2-Bp), and genetic information of the bacterial artificial chromosome (MA2-Bg). The second correct answer class information MA2-C may be correct answer data corresponding to the second expression ratio information MA2-R.

The input/output module 111 may receive the first microarray data MA1 and the second microarray data MA2 from another computing device connected to the microarray specific determinant extraction system 100 through a personal area network (PAN), a local area network (LAN), a metropolitan area network (MAN), or a wide area network (WAN), using a protocol such as Transmission Control Protocol/Internet Protocol (TCP/IP), Server Message Block (SMB), Common Internet File System (CIFS), or Network File System (NFS), or another computer communication protocol.

The input/output module 111 may also receive the first microarray data MA1 and the second microarray data MA2 from a peripheral device connected to a data input/output terminal such as a serial port, a parallel port, Small Computer System Interface (SCSI), Universal Serial Bus (USB), IEEE 1394, Advanced Technology Attachment (ATA), or Serial Advanced Technology Attachment (SATA), or to another data input/output terminal.

The storage module 112 may store all data input to the microarray-specific determinant factor extraction system 100 or generated by the microarray-specific determinant factor extraction system 100 .

In addition, among the modules included in the microarray specific determinant extraction system 100 according to an embodiment of the present invention, the modules other than the storage module 112 can load and use any of the data stored in the storage module 112.

The storage module 112 may include a storage device to store data. The storage device may be a non-volatile memory device such as a hard disk drive, an optical disc drive, a magnetic tape, a floppy disk, a flash memory, or a solid state drive (SSD), or a volatile memory device such as a random access memory (RAM), but is not limited thereto and may be a different type of memory device.

The data extraction module 121 may extract expression ratio information for each bacterial artificial chromosome (BAC) from the microarray data.

The data extraction module 121 may extract the first bacterial artificial chromosome information (MA1-B) and the corresponding first expression ratio information (MA1-R) from the first microarray data (MA1) and pair them. For example, the data extraction module 121 may extract the bacterial artificial chromosome identifier (MA1-Bi) included in the first bacterial artificial chromosome information (MA1-B) and the first expression ratio information (MA1-R), and pair them to generate the first bacterial artificial chromosome expression ratio data (R1).

The data extraction module 121 may extract the second bacterial artificial chromosome information (MA2-B) and the corresponding second expression ratio information (MA2-R) from the second microarray data (MA2) and pair them. For example, the data extraction module 121 may pair the bacterial artificial chromosome identifier (MA2-Bi) included in the second bacterial artificial chromosome information (MA2-B) with the second expression ratio information (MA2-R) to generate the second bacterial artificial chromosome expression ratio data (R2).

That is, the data extraction module 121 may generate expression ratio data for each bacterial artificial chromosome so that other modules can easily obtain expression ratio information for each bacterial artificial chromosome (BAC).
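The pairing performed by the data extraction module 121 can be pictured with a minimal Python sketch. The record layout, the field names, and the identifiers such as `BAC_0001` are hypothetical and chosen only for illustration; the specification does not define a concrete data format.

```python
from typing import NamedTuple


class BacRecord(NamedTuple):
    """One probe entry of the microarray data (hypothetical layout)."""
    identifier: str   # MA1-Bi
    position: tuple   # MA1-Bp, assumed here to be (row, column)
    gene_info: str    # MA1-Bg
    ratio: float      # MA1-R, signed expression ratio


def extract_ratio_data(records):
    """Pair each BAC identifier with its expression ratio (R1 or R2)."""
    return [(rec.identifier, rec.ratio) for rec in records]


ma1 = [
    BacRecord("BAC_0001", (0, 0), "placeholder", 0.42),
    BacRecord("BAC_0002", (0, 1), "placeholder", -0.17),
]
r1 = extract_ratio_data(ma1)
```

Keeping only the identifier and the ratio gives the downstream modules a compact structure that can still be traced back to a specific probe.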

The normalization module 122 may perform normalization on the expression ratio data for each bacterial artificial chromosome generated by the data extraction module 121 .

The normalization module 122 may perform Lowess normalization so that the first bacterial artificial chromosome expression ratio data R1 and the second bacterial artificial chromosome expression ratio data R2 can maintain continuity, respectively.
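As an illustration of what Lowess normalization can look like, the following Python sketch fits a locally weighted linear trend and subtracts it from the ratios. It rests on several assumptions that are not stated in the specification: that each probe also has an intensity value against which the trend is fitted, that tricube weights are used, and that a single pass without robustness iterations is sufficient. The function names and the `frac` parameter are hypothetical.

```python
def lowess_fit(xs, ys, frac=2 / 3):
    """Single-pass locally weighted linear regression (no robustness passes)."""
    n = len(xs)
    k = min(n, max(2, int(round(frac * n))))  # neighbours used per local fit
    fitted = []
    for x0 in xs:
        # The distance to the k-th nearest point sets the local bandwidth.
        dists = sorted(abs(x - x0) for x in xs)
        h = dists[k - 1] or 1e-12
        # Tricube weights: 1 at x0, falling to 0 at the bandwidth edge.
        ws = [(1 - min(abs(x - x0) / h, 1.0) ** 3) ** 3 for x in xs]
        sw = sum(ws)
        mx = sum(w * x for w, x in zip(ws, xs)) / sw
        my = sum(w * y for w, y in zip(ws, ys)) / sw
        sxx = sum(w * (x - mx) ** 2 for w, x in zip(ws, xs))
        sxy = sum(w * (x - mx) * (y - my) for w, x, y in zip(ws, xs, ys))
        slope = sxy / sxx if sxx > 1e-12 else 0.0
        fitted.append(my + slope * (x0 - mx))
    return fitted


def lowess_normalize(intensities, ratios, frac=2 / 3):
    """Subtract the fitted intensity-dependent trend from each ratio."""
    trend = lowess_fit(intensities, ratios, frac)
    return [r - t for r, t in zip(ratios, trend)]


intensities = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]   # assumed per-probe intensities
ratios = [0.9, 1.1, 1.4, 1.2, 1.8, 2.1]        # made-up expression ratios
normalized = lowess_normalize(intensities, ratios)
```

In practice a tested library routine would normally be preferred over a hand-written fit; the sketch only shows the idea of removing a smooth trend.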

The encoding module 130 may convert the expression ratio data for each bacterial artificial chromosome into a data format that the neural network module 140 can process.

The encoding module 130 may convert the data format of the first bacterial artificial chromosome expression ratio data R1 normalized by the normalization module 122 so that it corresponds to the neurons of the input layer included in the neural network module 140.

The encoding module 130 may convert the data format of the second bacterial artificial chromosome expression ratio data R2 normalized by the normalization module 122 so that it corresponds to the neurons of the input layer included in the neural network module 140.
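One way to picture this format conversion is as a mapping from (identifier, ratio) pairs to a fixed-order vector with one slot per input neuron. The sketch below assumes such a fixed identifier order exists and that a missing probe is filled with 0.0; both are illustrative assumptions, and `encode` is a hypothetical name.

```python
def encode(ratio_pairs, identifier_order):
    """Place each ratio at the input-neuron index assigned to its BAC identifier."""
    lookup = dict(ratio_pairs)
    # Filling missing identifiers with 0.0 is an assumption of this sketch.
    return [float(lookup.get(bac_id, 0.0)) for bac_id in identifier_order]


order = ["BAC_0001", "BAC_0002", "BAC_0003"]        # hypothetical identifiers
r1 = [("BAC_0002", -0.17), ("BAC_0001", 0.42)]
input_values = encode(r1, order)
```

Fixing the order once means that input neuron i always refers to the same probe, which is what later allows a neuron to be traced back to a bacterial artificial chromosome identifier.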

The neural network module 140 may calculate class information of a characteristic to be classified according to the expression ratio data for each bacterial artificial chromosome.

The neural network module 140 may include an input layer 141 , a hidden layer 142 , and an output layer 143 . The neural network module 140 may include one or more hidden layers 142 , and the hidden layers 142 may be located between the input layer 141 and the output layer 143 .

The input layer 141 may include one or more neurons 141n. For example, the input layer 141 may include as many neurons as the number of bacterial artificial chromosome identifiers (MA1-Bi, MA2-Bi).

In the neural network module 140, the first bacterial artificial chromosome expression ratio data R1 or the second bacterial artificial chromosome expression ratio data R2, normalized in the normalization module 122 and then converted in data format in the encoding module 130, may be input to the respective neurons 141n of the input layer 141.

The hidden layer 142 may include one or more neurons 142n. Each neuron 142n of the hidden layer 142 may correspond to all neurons 141n of the input layer 141. A first weight W1 may be included as relationship information between the neurons 141n of the input layer 141 and the neurons 142n of the hidden layer 142.

The neural network module 140 may calculate the value of a neuron 142n of the hidden layer 142 by multiplying the value of each neuron 141n of the input layer 141 corresponding to that neuron 142n by the first weight W1 between them, and then adding all of the products together.

The output layer 143 may include one or more neurons 143n. For example, the output layer 143 may include as many neurons 143n as the number of classes of characteristics to be classified.

Each of the neurons 143n of the output layer 143 may correspond to all neurons 142n of the hidden layer 142 . A second weight W2 may be included as relationship information between the neurons 142n of the hidden layer 142 and the neurons 143n of the output layer 143 .

The neural network module 140 may calculate the value of a neuron 143n of the output layer 143 by multiplying the value of each neuron 142n of the hidden layer 142 corresponding to that neuron 143n by the second weight W2 between them, and then adding all of the products together.

After the first bacterial artificial chromosome expression ratio data (R1) or the second bacterial artificial chromosome expression ratio data (R2) is input, the neural network module 140 may calculate the class information of the characteristic to be classified, that is, the probability for each class, for each neuron 143n of the output layer 143.
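The two weighted-sum stages described above can be sketched in a few lines of Python. The description mentions no bias term or activation function, so the sketch uses none; the weight layout (`weights[k][i]` links source neuron i to target neuron k) and the example numbers are assumptions made for illustration.

```python
def weighted_sums(weights, values):
    """weights[k][i] links source neuron i to target neuron k."""
    return [sum(w * v for w, v in zip(row, values)) for row in weights]


def forward(w1, w2, inputs):
    hidden = weighted_sums(w1, inputs)    # values of neurons 142n
    outputs = weighted_sums(w2, hidden)   # values of neurons 143n
    return hidden, outputs


w1 = [[0.5, -1.0, 0.0],
      [1.0, 2.0, 1.0]]      # 2 hidden neurons x 3 input neurons
w2 = [[1.0, 0.0],
      [0.0, 1.0]]           # 2 output neurons x 2 hidden neurons
hidden, outputs = forward(w1, w2, [2.0, 1.0, 3.0])
```

With these numbers the first hidden neuron receives 0.5 x 2.0 - 1.0 x 1.0 + 0.0 x 3.0 and the second receives 1.0 x 2.0 + 2.0 x 1.0 + 1.0 x 3.0.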

The decoding module 150 may convert the data format of the value of the neuron 143n of the output layer so as to calculate the value of the loss function.

The decoding module 150 may generate the first classification class information C1 including the values of all neurons 143n of the output layer 143 that are calculated when the first bacterial artificial chromosome expression ratio data R1, normalized by the normalization module 122 and then converted in data format by the encoding module 130, is input to the neural network module 140.

The decoding module 150 may generate the second classification class information C2 including the values of all neurons 143n of the output layer 143 that are calculated when the second bacterial artificial chromosome expression ratio data R2, normalized by the normalization module 122 and then converted in data format by the encoding module 130, is input to the neural network module 140.

The first loss value calculation module 161 and the second loss value calculation module 162 may respectively calculate the value of the loss function by using the class information classified by the neural network module 140 and the correct answer class information.

The first loss value calculation module 161 may calculate the first loss value L1 by inputting the first classification class information C1 and the first correct answer class information MA1-C included in the first microarray data MA1 into the softmax loss function according to Equation 1 below.

[Equation 1]

$$P_i = \frac{e^{x_i}}{\sum_{j=1}^{N} e^{x_j}}$$

$$L_1 = -\sum_{i=1}^{N} y_i \log P_i$$

In Equation 1, x_i and x_j represent the first classification class information C1. N represents the total number of items of the first classification class information C1, which is the number of neurons 143n in the output layer 143. y_i represents the first correct answer class information MA1-C.

The first loss value calculation module 161 may also calculate the first loss value L1 by inputting the second classification class information C2 and the second correct answer class information MA2-C included in the second microarray data MA2 into Equation 1. In this case, x_i and x_j in Equation 1 represent the second classification class information C2. N represents the total number of items of the second classification class information C2, which is the number of neurons 143n in the output layer 143. y_i represents the second correct answer class information MA2-C.
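A minimal Python sketch of a softmax loss is given below. It assumes the common form P_i = exp(x_i) / Σ_j exp(x_j) with L1 = -Σ_i y_i log P_i; whether the specification's Equation 1 includes an additional scaling factor cannot be confirmed from the text, so this should be read as an assumption. The function names are hypothetical.

```python
import math


def softmax(xs):
    """Convert output-neuron values into probabilities that sum to 1."""
    m = max(xs)                                # subtract the max for stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def softmax_loss(xs, ys):
    """Cross-entropy between the correct answer ys and softmax(xs)."""
    ps = softmax(xs)
    # Terms with y_i = 0 contribute nothing and are skipped.
    return -sum(y * math.log(p) for y, p in zip(ys, ps) if y > 0)


c1 = [2.0, 0.5, -1.0]      # made-up output-layer values
answer = [1.0, 0.0, 0.0]   # made-up correct answer class information
l1 = softmax_loss(c1, answer)
```

The loss approaches 0 as the probability assigned to the correct class approaches 1, and grows without bound as that probability approaches 0.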

The second loss value calculation module 162 may calculate the second loss value L2 by inputting the first classification class information C1 and the first correct answer class information MA1-C included in the first microarray data MA1 into the root mean square error (RMSE) loss function according to Equation 2 below.

[Equation 2]

$$L_2 = \sqrt{\frac{1}{N}\sum_{i=1}^{N} (x_i - y_i)^2}$$

In Equation 2, x_i represents the first classification class information C1. N represents the total number of items of the first classification class information C1, which is the number of neurons 143n in the output layer 143. y_i represents the first correct answer class information MA1-C.

The second loss value calculation module 162 may also calculate the second loss value L2 by inputting the second classification class information C2 and the second correct answer class information MA2-C included in the second microarray data MA2 into Equation 2. In this case, x_i in Equation 2 represents the second classification class information C2. N represents the total number of items of the second classification class information C2, which is the number of neurons 143n in the output layer 143. y_i represents the second correct answer class information MA2-C.
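The root mean square error can be sketched in Python as follows, assuming the usual definition: the square root of the mean squared difference between x_i and y_i over the N output neurons. The function name is hypothetical.

```python
import math


def rmse_loss(xs, ys):
    """Root mean square error between classification values xs and answers ys."""
    n = len(xs)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(xs, ys)) / n)


c1 = [0.8, 0.1, 0.1]       # made-up classification class information
answer = [1.0, 0.0, 0.0]   # made-up correct answer class information
l2 = rmse_loss(c1, answer)
```

Unlike the softmax loss, this value stays bounded when the correct class receives a very low score, which may be one reason for combining the two.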

The model design module 170 may calculate a third loss value L3 by performing a linear combination of the first loss value L1 and the second loss value L2 .

The model design module 170 may calculate the third loss value L3 by inputting the first loss value L1 and the second loss value L2 to the linear combination function according to Equation 3 below.

[Equation 3]

$$L_3 = A_1 \times L_1 + A_2 \times L_2$$

A_1 and A_2 are parameters of the linear combination function, and may be determined in advance according to the relative weight given to the first loss value L1 and the second loss value L2. For example, the sum of A_1 and A_2 may be 1.

The model design module 170 may back propagate the third loss value L3 to update the first weight W1 and the second weight W2 of the neural network module 140 .
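The following Python sketch shows one possible backpropagation step for a network with a single hidden layer, no bias, and no activation function, trained on the combined loss L3 = A_1 x L1 + A_2 x L2. The gradient formulas assume L1 = -Σ y_i log P_i with a correct answer vector that sums to 1, and L2 defined as the root mean square error; the plain gradient-descent update, the learning rate `lr`, and all function names are assumptions of this sketch rather than details taken from the specification.

```python
import math


def forward(w1, w2, r):
    h = [sum(w * v for w, v in zip(row, r)) for row in w1]   # hidden values
    x = [sum(w * v for w, v in zip(row, h)) for row in w2]   # output values
    return h, x


def loss_and_output_grad(x, y, a1, a2):
    """Return L3 and its gradient with respect to the output values x."""
    n = len(x)
    m = max(x)
    e = [math.exp(v - m) for v in x]
    s = sum(e)
    p = [v / s for v in e]
    l1 = -sum(t * math.log(q) for t, q in zip(y, p) if t > 0)
    l2 = math.sqrt(sum((v - t) ** 2 for v, t in zip(x, y)) / n)
    # dL1/dx_i = p_i - y_i (y sums to 1); dL2/dx_i = (x_i - y_i) / (N * L2)
    g = [a1 * (q - t) + (a2 * (v - t) / (n * l2) if l2 > 0 else 0.0)
         for q, t, v in zip(p, y, x)]
    return a1 * l1 + a2 * l2, g


def train_step(w1, w2, r, y, a1=0.5, a2=0.5, lr=0.05):
    """One gradient-descent update of W1 and W2; returns the pre-update L3."""
    h, x = forward(w1, w2, r)
    l3, g = loss_and_output_grad(x, y, a1, a2)
    # Gradient with respect to the hidden values, using W2 before its update.
    gh = [sum(w2[k][j] * g[k] for k in range(len(w2))) for j in range(len(h))]
    new_w2 = [[w2[k][j] - lr * g[k] * h[j] for j in range(len(h))]
              for k in range(len(w2))]
    new_w1 = [[w1[j][i] - lr * gh[j] * r[i] for i in range(len(r))]
              for j in range(len(w1))]
    return new_w1, new_w2, l3


w1 = [[0.1, -0.2, 0.05], [0.3, 0.1, -0.1]]   # made-up initial first weights
w2 = [[0.2, -0.1], [-0.3, 0.4]]              # made-up initial second weights
r, y = [0.5, -1.0, 0.25], [1.0, 0.0]
losses = []
for _ in range(50):
    w1, w2, l3 = train_step(w1, w2, r, y)
    losses.append(l3)
```

Recording the loss at each step, as in the loop above, is one simple way to observe whether the third loss value is converging.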

After the neural network has finished learning, the model design module 170 may analyze the first weight W1 between the neurons 141n of the input layer 141 and the neurons 142n of the hidden layer 142 to search for the bacterial artificial chromosome on the microarray that affects class classification. For example, for each neuron 141n of the input layer 141, the model design module 170 may sum the first weights W1 between that neuron and each neuron 142n of the hidden layer 142, and then search for the neuron 141n of the input layer 141 whose summed first weight W1 is the largest. The model design module 170 may then determine the bacterial artificial chromosome (BAC) corresponding to the bacterial artificial chromosome identifier (MA1-Bi) of the first bacterial artificial chromosome expression ratio data R1 that corresponds to that neuron 141n to be the determinant bacterial artificial chromosome (D-BAC), which most affects the characteristic to be classified.
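The search for the determinant bacterial artificial chromosome can be sketched in Python as a column sum followed by an argmax. The weight layout (`w1[j][i]` is the first weight between input neuron i and hidden neuron j), the function name, and the identifiers are assumptions made for illustration. The sketch sums the signed weights, as the description states.

```python
def find_determinant_bac(w1, identifiers):
    """Return the identifier whose input neuron has the largest summed W1."""
    n_inputs = len(identifiers)
    # For each input neuron, add up its first weights to every hidden neuron.
    sums = [sum(row[i] for row in w1) for i in range(n_inputs)]
    best = max(range(n_inputs), key=lambda i: sums[i])
    return identifiers[best], sums


w1 = [[0.2, 1.5, -0.3],
      [0.1, 0.7, 0.9]]                        # made-up trained first weights
ids = ["BAC_0001", "BAC_0002", "BAC_0003"]    # hypothetical identifiers
d_bac, sums = find_determinant_bac(w1, ids)
```

Because the input neurons keep the same order as the identifiers, the index of the winning neuron maps directly back to a probe on the microarray.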

The correction value extraction module 180 may correct the first weight W1 when the model design module 170 searches for the bacterial artificial chromosome, which is a determining factor affecting class classification.

The correction value extraction module 180 may analyze how the first weight W1 between the neurons 141n of the input layer 141 and the neurons 142n of the hidden layer 142 changes as the first microarray data MA1 changes, and may reflect a correction value in the first weight W1 accordingly.

The visualization module 190 may visually display results calculated by other modules.

The visualization module 190 can individually display the first classification class information (C1) according to the first bacterial artificial chromosome expression ratio data (R1) and the second classification class information (C2) according to the second bacterial artificial chromosome expression ratio data (R2).

In addition, the visualization module 190 may display the bacterial artificial chromosome (D-BAC), a determinant that most affects the characteristics to be classified.

FIG. 4 is a flowchart schematically illustrating a method for extracting a microarray specific determinant according to an embodiment of the present invention.

The first step (S1) of the microarray specific determinant extraction method (S1 to S10) according to an embodiment of the present invention is a step in which the data extraction module 121 extracts the expression ratio information for each bacterial artificial chromosome (BAC) from the microarray data.

The data extraction module 121 may extract one or more items of first bacterial artificial chromosome information (MA1-B) and the corresponding first expression ratio information (MA1-R) from the first microarray data (MA1) and pair them.

For example, the data extraction module 121 extracts the bacterial artificial chromosome identifier (MA1-Bi) and the first expression ratio information (MA1-R) included in the first bacterial artificial chromosome information (MA1-B), By pairing, the first bacterial artificial chromosome expression ratio data (R1) can be generated.

The second step ( S2 ) is a step in which the normalization module 122 performs normalization on the expression ratio data for each bacterial artificial chromosome generated by the data extraction module 121 .

The normalization module 122 may perform Lowess normalization so that the first bacterial artificial chromosome expression ratio data R1 can maintain continuity.

The third step (S3) is a step in which the encoding module 130 converts the expression ratio data for each bacterial artificial chromosome into a data format that the neural network module 140 can process.

The encoding module 130 may convert the data format of the first bacterial artificial chromosome expression ratio data R1 normalized by the normalization module 122 so that it corresponds to the neurons 141n of the input layer 141 included in the neural network module 140.

The fourth step S4 is a step in which the neural network module 140 calculates a value for each class of the characteristic to be classified.

The neural network module 140 may randomly initialize the first weight W1 between the neurons 141n of the input layer 141 and the neurons 142n of the hidden layer 142, and the second weight W2 between the neurons 142n of the hidden layer 142 and the neurons 143n of the output layer 143.

The neural network module 140 may input the first bacterial artificial chromosome expression ratio data R1 to the neurons 141n of the input layer 141 , respectively.

Each value of the neuron 143n of the output layer 143 may be a class probability, which is class information of a characteristic to be classified.

A fifth step ( S5 ) is a step in which the decoding module 150 converts the data format of the value of the neuron 143n of the output layer 143 .

The decoding module 150 may generate the first classification class information C1 including values of all neurons of the output layer 143 .

The sixth step S6 is a step in which the first loss value calculation module 161 calculates the first loss value L1.

The first loss value calculation module 161 may calculate the first loss value L1 by inputting the first classification class information C1 and the first correct answer class information MA1-C included in the first microarray data MA1 into the softmax loss function of Equation 1.

The seventh step S7 is a step in which the second loss value calculation module 162 calculates the second loss value L2.

The second loss value calculation module 162 may input the first classification class information C1 and the first correct answer class information MA1-C included in the first microarray data MA1 to the root mean square error (RMSE) loss function of Equation 2 to calculate the second loss value L2.
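
As a non-limiting illustration, the calculation of the second loss value may be sketched as follows. Equation 2 is defined elsewhere in this specification; the sketch assumes the standard definition of the root mean square error taken over the classes. The function name `rmse_loss` and the sample values are hypothetical.

```python
import math
from typing import Sequence


def rmse_loss(predicted: Sequence[float], target: Sequence[float]) -> float:
    """Root mean square error between predicted and correct class values."""
    if len(predicted) != len(target) or not predicted:
        raise ValueError("predicted and target must be non-empty and equal in length")
    squared_errors = [(p - t) ** 2 for p, t in zip(predicted, target)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical C1 and MA1-C values, matching the shapes used for the first loss.
l2 = rmse_loss([0.7, 0.2, 0.1], [1.0, 0.0, 0.0])
```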

The eighth step S8 is a step in which the model design module 170 calculates the third loss value L3.

The model design module 170 may input the first loss value L1 and the second loss value L2 to the linear combination function of Equation 3 to calculate the third loss value L3.
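
As a non-limiting illustration, the linear combination may be sketched as follows. Equation 3 is defined elsewhere in this specification; the coefficient names `alpha` and `beta`, their default values, and the sample loss values are assumptions made for this sketch.

```python
def combined_loss(l1: float, l2: float, alpha: float = 1.0, beta: float = 1.0) -> float:
    """Third loss value as a weighted sum of the first and second loss values."""
    return alpha * l1 + beta * l2


# Hypothetical loss values and an equal weighting of the two terms.
l3 = combined_loss(0.36, 0.22, alpha=0.5, beta=0.5)
```

The coefficients control how strongly each loss term drives the weight update, so in practice they would be chosen by the designer of the model.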

The ninth step S9 is a step in which the model design module 170 updates the first weight W1 and the second weight W2 of the neural network module 140 .

After the ninth step (S9), when the third loss value L3 converges, the tenth step (S10) may be performed. If the third loss value L3 does not converge, the fourth step S4 to the ninth step S9 may be repeatedly executed.
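
As a non-limiting illustration, the repetition of the fourth step S4 to the ninth step S9 may be sketched as follows. The description above does not define the convergence criterion; the sketch assumes that the third loss value has converged when its change between two consecutive repetitions falls below a tolerance. The names `train_until_converged`, `run_epoch`, `tolerance`, and `max_epochs` are hypothetical, and a one-parameter quadratic problem stands in for the neural network module.

```python
from typing import Callable, Tuple


def train_until_converged(
    run_epoch: Callable[[], float],
    tolerance: float = 1e-8,
    max_epochs: int = 10_000,
) -> Tuple[float, int]:
    """Repeat run_epoch (steps S4 to S9) until the loss change falls below tolerance."""
    previous = run_epoch()
    for epoch in range(2, max_epochs + 1):
        current = run_epoch()
        if abs(previous - current) < tolerance:
            return current, epoch      # converged: proceed to step S10
        previous = current
    return previous, max_epochs        # stopped by the safety limit


# Toy stand-in for S4 to S9: gradient descent on loss(w) = (w - 3)^2.
state = {"w": 0.0}


def toy_epoch() -> float:
    loss = (state["w"] - 3.0) ** 2                  # plays the role of L3
    state["w"] -= 0.1 * 2.0 * (state["w"] - 3.0)    # weight update (S9)
    return loss


final_loss, epochs_run = train_until_converged(toy_epoch)
```

The upper limit on the number of repetitions is a safeguard added for the sketch so that the loop terminates even if the loss never settles.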

In the tenth step (S10), the model design module 170 may search for the bacterial artificial chromosome in the microarray that affects class classification.

The model design module 170 may sum, for each neuron 141n of the input layer 141, the first weights W1 between that neuron and each neuron 142n of the hidden layer 142, and may then search for the neuron 141n of the input layer 141 whose summed first weight W1 has the largest value.

The model design module 170 may determine the bacterial artificial chromosome indicated by the bacterial artificial chromosome identifier (MA1-Bi) of the first bacterial artificial chromosome expression ratio data R1 corresponding to the neuron 141n of the input layer 141 having the largest summed first weight W1 as the determinant bacterial artificial chromosome (D-BAC), that is, the determinant that most affects the characteristic to be classified.
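
As a non-limiting illustration, the search of the tenth step may be sketched as follows, assuming that the first weights are stored as a matrix with one row per input neuron and one column per hidden neuron, and that a list of identifiers records which bacterial artificial chromosome each input neuron represents. The function name `find_determinant_bac`, the identifiers, and the weight values are hypothetical.

```python
from typing import List, Sequence, Tuple


def find_determinant_bac(
    w1: Sequence[Sequence[float]],
    identifiers: Sequence[str],
) -> Tuple[str, List[float]]:
    """w1[i][j] is the first weight between input neuron i and hidden neuron j."""
    if len(w1) != len(identifiers):
        raise ValueError("one identifier is required per input neuron")
    # Sum the first weights of each input neuron across all hidden neurons.
    sums = [sum(row) for row in w1]
    # Index of the input neuron with the largest sum (first one wins a tie).
    best = max(range(len(sums)), key=sums.__getitem__)
    return identifiers[best], sums


w1 = [
    [0.10, -0.20, 0.05],   # input neuron for BAC-001
    [0.90, 0.40, 0.30],    # input neuron for BAC-002
    [-0.50, 0.60, 0.10],   # input neuron for BAC-003
]
d_bac, weight_sums = find_determinant_bac(w1, ["BAC-001", "BAC-002", "BAC-003"])
```

The sketch sums the signed weights, as stated in the description; ranking by the sum of absolute values would be a different criterion and is not what is described here.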

Although an embodiment of the present invention has been described above, the present invention is not limited to that embodiment and may be modified and implemented in various ways within the scope of the detailed description and accompanying drawings, provided that the modification does not depart from the spirit of the present invention or impair its effect. Such modified embodiments naturally fall within the scope of the present invention.

Claims

1. A microarray specific determinant extraction system comprising:

a data extraction module for generating first bacterial artificial chromosome expression ratio data by extracting expression ratio information for each bacterial artificial chromosome from first microarray data;

a normalization module for performing Royce normalization on the first bacterial artificial chromosome expression ratio data;

a neural network module comprising an input layer, a hidden layer, and an output layer, receiving the first bacterial artificial chromosome expression ratio data and calculating class information of a characteristic to be classified;

a decoding module for generating first classification class information including values of neurons of the output layer of the neural network module;

a first loss value calculation module for calculating a first loss value by inputting first correct answer class information and the first classification class information into a softmax loss function;

a second loss value calculation module for calculating a second loss value by inputting the first correct answer class information and the first classification class information into a root mean square error loss function; and

a model design module for calculating a third loss value by inputting the first loss value and the second loss value into a linear combination function.
2. The system of claim 1, wherein the first microarray data comprises:

first bacterial artificial chromosome information;

first expression ratio information, which is positive or negative expression ratio information of the bacterial artificial chromosome; and

the first correct answer class information, in which probability information is defined for each class of the characteristic to be classified.
3. The system of claim 2, wherein the first bacterial artificial chromosome information comprises:

a bacterial artificial chromosome identifier;

a position at which the bacterial artificial chromosome is arranged on the microarray; and

genetic information of the bacterial artificial chromosome.
4. The system of claim 3, wherein the data extraction module extracts the bacterial artificial chromosome identifier of the first bacterial artificial chromosome information and the first expression ratio information from the first microarray data, and then pairs them to generate the first bacterial artificial chromosome expression ratio data.
5. The system of claim 4, wherein the input layer, the hidden layer, and the output layer of the neural network module each include one or more neurons, and

the neural network module comprises:

a first weight, which is relationship information between each neuron of the input layer and each neuron of the hidden layer; and

a second weight, which is relationship information between each neuron of the hidden layer and each neuron of the output layer.
6. The system of claim 5, wherein the input layer of the neural network module includes as many neurons as the number of bacterial artificial chromosome identifiers in the first bacterial artificial chromosome information.
7. The system of claim 6, further comprising an encoding module for converting a data format so that the first bacterial artificial chromosome expression ratio data corresponds to the neurons of the input layer of the neural network module.
8. The system of claim 7, wherein the model design module updates the first weight and the second weight by backpropagating the third loss value.
9. The system of claim 8, wherein the model design module searches for the neuron of the input layer of the neural network module that has the largest sum of first weights between itself and the neurons of the hidden layer of the neural network module, and

determines the bacterial artificial chromosome corresponding to the bacterial artificial chromosome identifier of the first bacterial artificial chromosome expression ratio data corresponding to the searched neuron of the input layer as a determinant bacterial artificial chromosome.
10. The system of claim 9, further comprising a correction value extraction module configured to correct the first weight by analyzing a change in the first weight according to a change in the first microarray data.
11. The system of claim 10, further comprising a visualization module for visually displaying the first classification class information and the determinant bacterial artificial chromosome.
12. A method performed in a microarray specific determinant extraction system, the method comprising:

(S1) generating, by the data extraction module, first bacterial artificial chromosome expression ratio data by extracting expression ratio information for each bacterial artificial chromosome from the first microarray data;

(S2) performing, by the normalization module, Royce normalization on the first bacterial artificial chromosome expression ratio data;

(S3) converting, by the encoding module, the data format so that the first bacterial artificial chromosome expression ratio data corresponds to the neurons of the input layer of the neural network module;

(S4) receiving, by the neural network module, the first bacterial artificial chromosome expression ratio data, and calculating class information of a characteristic to be classified for each neuron of an output layer;

(S5) generating, by a decoding module, first classification class information including values of neurons in an output layer of the neural network module;

(S6) calculating, by the first loss value calculation module, a first loss value by inputting the first correct answer class information and the first classification class information into a softmax loss function;

(S7) calculating, by a second loss value calculation module, a second loss value by inputting the first correct answer class information and the first classification class information into a root mean square error loss function;

(S8) calculating, by the model design module, a third loss value by inputting the first loss value and the second loss value into a linear combination function;

(S9) updating, by the model design module, by backpropagating the third loss value, a first weight, which is relationship information between each neuron of the input layer of the neural network module and each neuron of the hidden layer, and a second weight, which is relationship information between each neuron of the hidden layer and each neuron of the output layer of the neural network module;

(S10) searching, by the model design module, for the neuron of the input layer of the neural network module that has the largest sum of first weights between itself and the neurons of the hidden layer of the neural network module, and determining the bacterial artificial chromosome corresponding to the bacterial artificial chromosome identifier of the first bacterial artificial chromosome expression ratio data corresponding to the searched neuron of the input layer as a determinant bacterial artificial chromosome,

wherein the steps (S4) to (S9) are repeated until the third loss value converges.