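WO2025028333A1 - Information processing device, operating method for information processing device, and operating program for information processing device - Google Patents
Information processing device, operating method for information processing device, and operating program for information processing device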
- Publication number
- WO2025028333A1 (application PCT/JP2024/026195)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- prediction
- candidate substance
- prediction model
- information
- evaluation result
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12M—APPARATUS FOR ENZYMOLOGY OR MICROBIOLOGY; APPARATUS FOR CULTURING MICROORGANISMS FOR PRODUCING BIOMASS, FOR GROWING CELLS OR FOR OBTAINING FERMENTATION OR METABOLIC PRODUCTS, i.e. BIOREACTORS OR FERMENTERS
- C12M1/00—Apparatus for enzymology or microbiology
- C12M1/34—Measuring or testing with condition measuring or sensing means, e.g. colony counters
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01N—INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
- G01N33/00—Investigating or analysing materials by specific methods not covered by groups G01N1/00 - G01N31/00
- G01N33/15—Medicinal preparations ; Physical properties thereof, e.g. dissolubility
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01N—INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
- G01N33/00—Investigating or analysing materials by specific methods not covered by groups G01N1/00 - G01N31/00
- G01N33/48—Biological material, e.g. blood, urine; Haemocytometers
- G01N33/50—Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing
Definitions
- the technology disclosed herein relates to an information processing device, an operating method for an information processing device, and an operating program for an information processing device.
- Ames mutagenicity tests are widely used in fields such as drug discovery.
- the Ames mutagenicity test is a test to evaluate whether candidate substances for pharmaceutical products and other products have mutagenicity. Mutagenicity is the property of causing irreversible mutations in genes, and is one of the factors that induce cancer in cells.
- a candidate substance is added to bacteria such as Salmonella typhimurium, and the subsequent bacterial growth is used to evaluate whether the candidate substance has mutagenicity (positive) or not (negative).
- Types of gene mutations include base pair substitution mutations, in which part of the base sequence changes, and frameshift mutations, in which the reading frame of a triplet of base sequences is shifted by the insertion or deletion of bases.
- Non-Patent Document 1 describes a technology that uses a multitask deep neural network model (hereinafter referred to simply as a deep learning model) to predict the evaluation results of the Ames mutagenicity test.
- however, the amount of past Ames mutagenicity test data that can be used as training data for the deep learning model described in Non-Patent Document 1 is small: only a few thousand items across all five strains.
- the deep learning model described in Non-Patent Document 1 is used for both base pair substitution mutations and frameshift mutations, which have completely different causes of occurrence and structural features. For this reason, the deep learning model described in Non-Patent Document 1 may not have sufficient prediction accuracy.
- One embodiment of the technology disclosed herein provides an information processing device, an operating method for an information processing device, and an operating program for an information processing device that can improve the prediction accuracy of the evaluation results of an Ames mutagenicity test and the identification accuracy of whether the type of genetic mutation is a base pair substitution mutation or a frameshift mutation.
- the information processing device disclosed herein is an information processing device that predicts the evaluation result of an Ames mutagenicity test for a candidate substance of a product using a prediction model, and includes a processor.
- the processor uses, as the prediction models, a first prediction model that outputs a first prediction evaluation result indicating whether or not a candidate substance has mutagenicity related to a base pair substitution mutation when the candidate substance is added to a first type of bacterial strain that is sensitive to a base pair substitution mutation in which a part of the base sequence is changed, and a second prediction model that outputs a second prediction evaluation result indicating whether or not a candidate substance has mutagenicity related to a frameshift mutation when the candidate substance is added to a second type of bacterial strain that is sensitive to a frameshift mutation in which the reading frame of the base sequence is shifted, obtains candidate substance information related to the candidate substance, inputs the candidate substance information or information based on the candidate substance information to the first prediction model and the second prediction model, causes the first prediction model and the second prediction
- if both the first and second predicted evaluation results indicate that the candidate substance is not mutagenic, the processor preferably presents to the user predicted information that the candidate substance is not mutagenic, and if at least one of the first and second predicted evaluation results indicates that the candidate substance is mutagenic, the processor preferably presents to the user predicted information that the candidate substance is mutagenic.
- the processor presents the first prediction evaluation result and the second prediction evaluation result themselves to the user as prediction information.
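The two presentation options above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the class name `PredictionInfo` and its field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PredictionInfo:
    """Hypothetical container for what is presented to the user."""
    first_positive: bool   # first model: base pair substitution mutagenicity
    second_positive: bool  # second model: frameshift mutagenicity

    @property
    def comprehensive_positive(self) -> bool:
        # Mutagenic if at least one model predicts mutagenicity;
        # non-mutagenic only if both models predict non-mutagenicity.
        return self.first_positive or self.second_positive


def label(positive: bool) -> str:
    """Render a boolean result as the positive/negative wording of the test."""
    return "positive" if positive else "negative"


info = PredictionInfo(first_positive=False, second_positive=True)
summary = label(info.comprehensive_positive)
```

Presenting `first_positive` and `second_positive` directly corresponds to the alternative in which the two predicted evaluation results themselves are shown to the user.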
- the candidate substance information is preferably information regarding the chemical structure of the candidate substance.
- the predictive model is preferably constructed using one of the following machine learning techniques: support vector machines, linear separation, gradient boosting trees, AdaBoost, random forests, deep learning, and ensemble learning of these.
- the first prediction model is constructed by deep learning
- the second prediction model is constructed by transfer learning of the trained first prediction model
- the first prediction model and the second prediction model are constructed using different machine learning methods.
- the first prediction model is constructed using deep learning
- the second prediction model is constructed using a machine learning method other than deep learning.
- the first prediction model and the second prediction model are constructed using the same machine learning method, but it is preferable that the internal parameters for deriving the first prediction evaluation result and the second prediction evaluation result are different.
- the processor obtains feature information of the candidate substances and inputs the feature information into the prediction model.
- the feature information preferably includes at least one of features related to the geometric shape of the candidate substance, features related to the electronic properties of the candidate substance, features related to the physicochemical properties of the candidate substance, and features related to the partial structure of the candidate substance.
- the first learning data used to train the first prediction model and the second learning data used to train the second prediction model are at least partially different.
- the first learning data used to train the first prediction model and the second learning data used to train the second prediction model are preferably prepared based on information on the first and second strains.
- the method of operating the information processing device disclosed herein is a method of operating an information processing device that predicts the evaluation results of an Ames mutagenicity test for a candidate substance of a product using a prediction model, and includes using, as the prediction models, a first prediction model that outputs a first predicted evaluation result indicating whether or not a candidate substance has mutagenicity related to a base pair substitution mutation when the candidate substance is added to a first type of bacterial strain that is sensitive to a base pair substitution mutation in which a part of the base sequence is changed, and a second prediction model that outputs a second predicted evaluation result indicating whether or not a candidate substance has mutagenicity related to a frameshift mutation when the candidate substance is added to a second type of bacterial strain that is sensitive to a frameshift mutation in which the reading frame of the base sequence is shifted, obtaining candidate substance information related to the candidate substance, inputting the candidate substance information or information based on the candidate substance information into the first prediction model and the second prediction model, outputting the first predicted evaluation result and
- the operating program of the information processing device disclosed herein is an operating program of an information processing device that predicts the evaluation result of an Ames mutagenicity test for a candidate substance of a product using a prediction model, and causes a computer to execute processes including using a first prediction model that outputs a first predicted evaluation result indicating whether or not a candidate substance has mutagenicity related to a base pair substitution mutation when the candidate substance is added to a first type of bacterial strain that is sensitive to a base pair substitution mutation in which a part of the base sequence is changed, and a second prediction model that outputs a second predicted evaluation result indicating whether or not a candidate substance has mutagenicity related to a frameshift mutation when the candidate substance is added to a second type of bacterial strain that is sensitive to a frameshift mutation in which the reading frame of the base sequence is shifted, obtaining candidate substance information related to the candidate substance, inputting the candidate substance information or information based on the candidate substance information into the first prediction model and the second prediction model, outputting the first predicted evaluation result and
- the technology disclosed herein can provide an information processing device, an operating method for an information processing device, and an operating program for an information processing device that can improve the prediction accuracy of the evaluation results of an Ames mutagenicity test.
- FIG. 1 is a diagram illustrating an information processing server and a user terminal.
- FIG. 2 is a block diagram showing a computer constituting an information processing server and a user terminal.
- 2 is a block diagram showing a processing unit of a CPU of the information processing server;
- FIG. 11 is a diagram showing feature amount information.
- 13 is a diagram showing a process of a prediction unit that inputs feature amount information to a second prediction model and outputs a second prediction evaluation result from the second prediction model.
- FIG. 13 is a diagram illustrating processing in the learning phase of the first prediction model.
- FIG. 13 is a diagram illustrating allocation of first learning data and second learning data.
- 13 is a table showing a first prediction evaluation result, a second prediction evaluation result, and a comprehensive prediction evaluation result.
- FIG. 13 is a diagram showing prediction information.
- 2 is a block diagram showing a processing unit of a CPU of a user terminal;
- FIG. 13 is a diagram showing an information input screen.
- FIG. 14 is a diagram showing a predicted evaluation result display screen.
- 13 is a flowchart showing a processing procedure of the information processing server.
- FIG. 1 is a diagram showing an aspect of constructing a second prediction model by transfer learning of a trained first prediction model.
- FIG. 13 is a diagram showing another method of prediction using the first prediction model and the second prediction model.
- FIG. 1 is a diagram showing an aspect of constructing a second prediction model by transfer learning of a trained first prediction model.
- FIG. 13 is a diagram illustrating another example of allocation of the first learning data and the second learning data.
- FIG. 19 shows a first prediction model and a second prediction model, both of which are constructed by a support vector machine.
- FIG. 20 is a diagram illustrating a boundary line in a feature amount space of the first prediction model shown in FIG. 19.
- FIG. 20 is a diagram illustrating a boundary line in a feature amount space of the second prediction model shown in FIG. 19.
- the information processing server 10 is connected to a user terminal 11 via a network 12.
- the information processing server 10 is an example of an "information processing device" according to the technology of the present disclosure.
- the user terminal 11 is installed, for example, in a pharmaceutical company that develops pharmaceutical products, or an organization that is contracted by a pharmaceutical company to develop pharmaceutical products, that is, a contract research organization (CRO).
- the user terminal 11 is operated by a user U who is involved in the development of pharmaceutical products at the pharmaceutical company or the contract research organization.
- the network 12 is, for example, a wide area network (WAN) such as the Internet or a public communication network. Note that, although only one user terminal 11 is connected to the information processing server 10 in FIG. 1, multiple user terminals 11 of multiple pharmaceutical companies or contract research organizations are actually connected to the information processing server 10.
- the user terminal 11 transmits a prediction request 13 to the information processing server 10.
- the prediction request 13 is a request to have the information processing server 10 predict the evaluation results of an Ames mutagenicity test on a candidate substance of a pharmaceutical.
- the prediction request 13 includes candidate substance information 14 relating to the candidate substance. If there are multiple candidate substances for which the evaluation results of the Ames mutagenicity test are to be predicted, the prediction request 13 includes multiple candidate substance information 14 corresponding to the multiple candidate substances, as shown in the figure.
- the candidate substance information 14 is information relating to the chemical structure of the candidate substance. More specifically, the candidate substance information 14 is a character string that represents the chemical structure of the candidate substance using SMILES (Simplified Molecular Input Line Entry System) notation.
- the prediction request 13 also includes a terminal ID (identification data) for uniquely identifying the user terminal 11 that sent the prediction request 13.
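As a rough illustration of what a prediction request carries, the sketch below models it as a small data class. The class and field names and the terminal ID value are assumptions made for this example; the two SMILES strings (benzene and ethanol) only show the notation and are not candidate substances from the source.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PredictionRequest:
    """Hypothetical shape of prediction request 13."""
    terminal_id: str  # uniquely identifies the sending user terminal
    candidate_smiles: List[str] = field(default_factory=list)  # one SMILES string per candidate


request = PredictionRequest(
    terminal_id="terminal-0001",           # made-up ID for illustration
    candidate_smiles=["c1ccccc1", "CCO"],  # benzene, ethanol (notation examples only)
)
```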
- When the information processing server 10 receives the prediction request 13, it predicts the evaluation results of the Ames mutagenicity test for the candidate substance and derives prediction information 15. The information processing server 10 distributes the prediction information 15 to the user terminal 11 that sent the prediction request 13. When the user terminal 11 receives the prediction information 15, it makes the prediction information 15 available for viewing by the user U.
- the computers that make up the information processing server 10 and the user terminal 11 are basically of the same configuration, and include storage 20, memory 21, a CPU (Central Processing Unit) 22, a communication unit 23, a display 24, and an input device 25. These are interconnected via a bus line 26.
- the storage 20 is a hard disk drive built into the computer that constitutes the information processing server 10 and the user terminal 11, or connected via a cable or network.
- alternatively, the storage 20 is a disk array consisting of multiple hard disk drives.
- the storage 20 stores control programs such as an operating system, various application programs (hereinafter referred to as APs (Application Programs)), and various data associated with these programs.
- a solid state drive may be used instead of a hard disk drive.
- Memory 21 is a work memory for CPU 22 to execute processing.
- CPU 22 loads programs stored in storage 20 into memory 21 and executes processing according to the programs. In this way, CPU 22 comprehensively controls each part of the computer.
- CPU 22 is an example of a "processor" according to the technology of this disclosure. Note that memory 21 may be built into CPU 22.
- the communication unit 23 is a network interface that controls the transmission of various information via the network 12, etc.
- the display 24 displays various screens.
- the various screens are equipped with an operation function using a GUI (Graphical User Interface).
- the computers that make up the information processing server 10 and the user terminal 11 accept input of operation instructions from the input device 25 via the various screens.
- the input device 25 is a keyboard, mouse, touch panel, microphone for voice input, etc.
- the computer parts constituting the information processing server 10 are distinguished by adding the suffix "A" to their reference numbers (storage 20A and CPU 22A), and the computer parts constituting the user terminal 11 are distinguished by adding the suffix "B" to their reference numbers (storage 20B, CPU 22B, display 24B, and input device 25B).
- an operating program 30 is stored in the storage 20A of the information processing server 10.
- the operating program 30 is an AP for causing a computer to function as the information processing server 10.
- the operating program 30 is an example of an "operating program of an information processing device" according to the technology of the present disclosure.
- a first prediction model 311 and a second prediction model 312 are also stored in the storage 20A.
- the first prediction model 311 and the second prediction model 312 are examples of a "prediction model" according to the technology of the present disclosure. In the following description, unless there is a particular need to distinguish between them, the first prediction model 311 and the second prediction model 312 will be collectively referred to as the prediction model 31.
- the request receiving unit 35 receives various requests from the user terminal 11, including the prediction request 13. As described above, the prediction request 13 includes the candidate substance information 14. Therefore, by receiving the prediction request 13, the request receiving unit 35 acquires the candidate substance information 14. When a prediction request 13 is received, the request receiving unit 35 outputs the candidate substance information 14 included in the prediction request 13 to the RW control unit 36. In addition, the request receiving unit 35 outputs the terminal ID of the user terminal 11 included in the prediction request 13 to the screen distribution control unit 39.
- the RW control unit 36 controls the storage of various data in storage 20A and the reading of various data from storage 20A.
- the RW control unit 36 controls the storage of candidate substance information 14 in storage 20A and the reading of candidate substance information 14 from storage 20A.
- the RW control unit 36 outputs the read candidate substance information 14 to the feature derivation unit 37 and the prediction unit 38.
- the RW control unit 36 also reads the first prediction model 311 and the second prediction model 312 from storage 20A, and outputs the read first prediction model 311 and second prediction model 312 to the prediction unit 38.
- the feature derivation unit 37 derives feature information 42 of the candidate substance from the candidate substance information 14. More specifically, the feature derivation unit 37 derives one piece of feature information 42 from the candidate substance information 14 for one candidate substance. Therefore, the feature derivation unit 37 derives the same number of pieces of feature information 42 as the number of pieces of candidate substance information 14 included in the prediction request 13. The feature derivation unit 37 derives the feature information 42, for example, using a machine learning model that outputs feature information 42 when candidate substance information 14 is input. The feature derivation unit 37 outputs the feature information 42 to the prediction unit 38.
- the prediction unit 38 causes the first prediction model 311 and the second prediction model 312 to predict the evaluation results of the Ames mutagenicity test for the candidate substance based on the candidate substance information 14 and the feature information 42.
- the prediction unit 38 generates prediction information 15 and outputs the prediction information 15 to the screen distribution control unit 39.
- the screen distribution control unit 39 controls the distribution of various screens to the user terminal 11. Specifically, the screen distribution control unit 39 distributes the various screens to the user terminal 11 that sent the request, in the form of screen data for web distribution created using a markup language such as XML (Extensible Markup Language). At this time, the screen distribution control unit 39 identifies the user terminal 11 that sent the request based on the terminal ID from the request receiving unit 35. Note that instead of XML, other data description languages such as JSON (JavaScript (registered trademark) Object Notation) may be used.
- the various screens include an information input screen 85 (see FIG. 13) for inputting candidate substance information 14, and a prediction evaluation result display screen 95 (see FIG. 14) for displaying prediction information 15.
- the CPU 22A also includes an instruction receiving unit that receives various operation instructions from the input device 25.
- the feature information 42 includes feature quantities 45 related to the geometric shape of the candidate substance, feature quantities 46 related to the electronic properties of the candidate substance, feature quantities 47 related to the physicochemical properties of the candidate substance, and feature quantities 48 related to the partial structure of the candidate substance.
- the feature quantities 45 related to the geometric shape include the number of bonds 49 of the candidate substance, and the number of benzene rings 50 of the candidate substance.
- the feature quantities 46 related to the electronic properties include the surface charge density distribution 51 of the candidate substance, and the HOMO (Highest Occupied Molecular Orbital)-LUMO (Lowest Unoccupied Molecular Orbital) energy gap 52 of the candidate substance.
- Feature quantities 47 relating to physicochemical properties include the molecular weight 53 of the candidate substance, and the solubility in water 54 indicating the hydrophilicity and hydrophobicity of the candidate substance.
- Feature quantities 48 relating to partial structures include the Klekota-Roth fingerprint 55 of the candidate substance derived by the partial structure extraction algorithm, and the MACCS Keys fingerprint 56 of the candidate substance. Note that feature quantities 48 relating to partial structures may also include topological fingerprints, Morgan fingerprints, MinHash fingerprints, Avalon fingerprints, atom pair fingerprints, topological dihedral angle fingerprints, and Pubchem fingerprints.
- the various features in the feature information 42 have been selected by the developer of the operating program 30 as being useful for predicting whether a candidate substance has mutagenicity related to frameshift mutations.
- the number of features included in the feature information 42 is preferably 200 or more, and more preferably 1000 or more. Therefore, the feature information 42 can be said to be multidimensional feature data with hundreds to thousands of dimensions.
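One way to picture the feature information is as four groups of features that are concatenated into a single vector before being passed to a model. The sketch below is an assumption about layout only: the class name `FeatureInfo`, its field names, and all numeric values are placeholders, not measured or computed properties of any substance.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FeatureInfo:
    """Hypothetical layout of feature information 42."""
    geometric: List[float]        # e.g. number of bonds, number of benzene rings
    electronic: List[float]       # e.g. HOMO-LUMO energy gap
    physicochemical: List[float]  # e.g. molecular weight, solubility in water
    substructure_bits: List[int]  # e.g. fingerprint bits, each 0 or 1

    def to_vector(self) -> List[float]:
        # Concatenate all groups into one flat feature vector.
        return (
            list(self.geometric)
            + list(self.electronic)
            + list(self.physicochemical)
            + [float(bit) for bit in self.substructure_bits]
        )


# Placeholder values chosen only to make the sketch runnable.
features = FeatureInfo(
    geometric=[10.0, 1.0],
    electronic=[0.0],
    physicochemical=[100.0, 0.5],
    substructure_bits=[1, 0, 1, 1],
)
vector = features.to_vector()
```

In practice the fingerprint part alone would contribute hundreds of bits, which is consistent with the several hundred to several thousand dimensions mentioned above.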
- the prediction unit 38 inputs the candidate substance information 14 to the first prediction model 311 and causes the first prediction model 311 to output a first prediction evaluation result 601.
- the first prediction evaluation result 601 indicates whether or not the candidate substance has mutagenicity related to base pair substitution mutations when the candidate substance is added to a first type of bacterial strain that is sensitive to base pair substitution mutations in which a part of the base sequence is changed.
- the prediction unit 38 inputs feature information 42 derived from the candidate substance information 14 to the second prediction model 312, and causes the second prediction model 312 to output a second prediction evaluation result 602.
- the feature information 42 is an example of "information based on candidate substance information" according to the technology of the present disclosure.
- the second prediction evaluation result 602 indicates whether or not the candidate substance has mutagenicity related to frameshift mutations when the candidate substance is added to a second type of bacterial strain that is sensitive to frameshift mutations that shift the reading frame of the base sequence.
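The data flow of the prediction unit described above (the SMILES string goes to the first model, derived features go to the second model) can be sketched as follows. The function names are hypothetical, and the three `dummy_*` functions are meaningless stand-ins that exist only so the sketch runs; they carry no chemical knowledge and their outputs are not predictions.

```python
from typing import Callable, Dict, Sequence


def run_prediction(
    smiles: str,
    first_model: Callable[[str], bool],
    second_model: Callable[[Sequence[float]], bool],
    derive_features: Callable[[str], Sequence[float]],
) -> Dict[str, bool]:
    """Route the inputs to the two models and collect both results."""
    # First model: receives the candidate substance information as is.
    first_result = first_model(smiles)
    # Second model: receives feature information derived from it.
    second_result = second_model(derive_features(smiles))
    return {"first": first_result, "second": second_result}


# Stand-ins with arbitrary rules, used only to exercise the data flow.
def dummy_first_model(smiles: str) -> bool:
    return len(smiles) > 5


def dummy_derive_features(smiles: str) -> Sequence[float]:
    return [float(len(smiles)), float(smiles.count("c"))]


def dummy_second_model(features: Sequence[float]) -> bool:
    return features[1] >= 6.0


result = run_prediction(
    "c1ccccc1", dummy_first_model, dummy_second_model, dummy_derive_features
)
```

Passing the models in as callables keeps the routing independent of how each model was built, which matches the text's point that the two models may use different machine learning methods.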
- the first prediction model 311 is a machine learning model corresponding to base pair substitution mutations constructed by a deep neural network, i.e., deep learning.
- the second prediction model 312 is a machine learning model corresponding to frameshift mutations constructed by a support vector machine.
- the first prediction model 311 and the second prediction model 312 are constructed by different machine learning methods.
- the first prediction model 311 is constructed by deep learning
- the second prediction model 312 is constructed by a machine learning method other than deep learning (here, a support vector machine).
- the first prediction model 311 and the second prediction model 312 may be constructed by any of the machine learning methods of support vector machines, linear separation, gradient boosting trees, AdaBoost, random forests, deep learning, and ensemble learning thereof.
- the first prediction model 311 and the second prediction model 312 are not models that are experimentally constructed by a method that does not use bacterial strains and that uses only nucleic acids and compounds, such as isothermal titration calorimetry or ultraviolet-visible spectroscopy. Furthermore, the first prediction model 311 and the second prediction model 312 are not models that are constructed only by simulations, such as docking simulations and quantum chemical calculations.
- the first prediction model 311 is trained using the first training data 621.
- the first training data 621 is a set of training candidate substance information 14L and first correct answer data 601CA.
- the training candidate substance information 14L is candidate substance information 14 of a candidate substance for which an Ames mutagenicity test was actually performed in the past.
- the first correct answer data 601CA is the result of an evaluation of whether or not the candidate substance provided by the training candidate substance information 14L has mutagenicity in an Ames mutagenicity test that was actually performed in the past.
- the candidate substance information 14L for learning is input to the first prediction model 311. This causes the first prediction evaluation result for learning 601L to be output from the first prediction model 311.
- the first prediction evaluation result for learning 601L is compared with the first correct answer data 601CA, and based on the comparison result, a loss calculation is performed for the first prediction model 311 using a loss function. Then, according to the result of the loss calculation, update settings are made for the internal parameters of the first prediction model 311, such as the filter coefficients, and the first prediction model 311 is updated according to the update settings.
- the above series of processes including input of the candidate substance information 14L for learning to the first prediction model 311, output of the first prediction evaluation result for learning 601L from the first prediction model 311, loss calculation, update setting, and update of the first prediction model 311, are repeated while the first learning data 621 is changed. Then, when the prediction accuracy of the first prediction evaluation result for learning 601L for the first correct answer data 601CA reaches a preset level, the repetition of the above series of processes is terminated.
- the first prediction model 311 whose prediction accuracy has thus reached the preset level is stored in storage 20A. Note that learning may be terminated when the above series of processes have been repeated a predetermined number of times, regardless of the prediction accuracy. Furthermore, learning of the first prediction model 311 may be continued even after storage in storage 20A.
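The learning loop described above (predict, compute a loss, update parameters, stop at a preset accuracy or after a fixed number of repetitions) can be sketched in a few lines. Note the substitution: a tiny logistic regression trained by stochastic gradient descent stands in for the deep neural network, and the toy data, function names, and hyperparameters are all assumptions for illustration.

```python
import math
from typing import List, Sequence, Tuple

Sample = Tuple[Sequence[float], int]  # (input vector, correct answer 0 or 1)


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def predict_positive(w: Sequence[float], b: float, x: Sequence[float]) -> bool:
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) >= 0.5


def train(data: Sequence[Sample], target_accuracy: float = 1.0,
          max_epochs: int = 1000, lr: float = 0.5) -> Tuple[List[float], float, int]:
    w = [0.0] * len(data[0][0])
    b = 0.0
    epoch = 0
    for epoch in range(1, max_epochs + 1):
        for x, y in data:
            p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
            grad = p - y  # gradient of the cross-entropy loss w.r.t. the logit
            w = [wi - lr * grad * xi for wi, xi in zip(w, x)]  # parameter update
            b -= lr * grad
        # Compare predictions with the correct answers after each pass.
        correct = sum(predict_positive(w, b, x) == bool(y) for x, y in data)
        if correct / len(data) >= target_accuracy:
            break  # preset accuracy level reached
    return w, b, epoch


# Toy, linearly separable data (not real test results).
toy_data: List[Sample] = [([0.0, 0.0], 0), ([0.0, 1.0], 0),
                          ([1.0, 0.0], 1), ([1.0, 1.0], 1)]
weights, bias, epochs_used = train(toy_data)
```

The `max_epochs` cap corresponds to the option of ending learning after a predetermined number of repetitions regardless of accuracy.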
- the second prediction model 312 is generated based on the second training data 622.
- the second training data 622 is a set of training feature information 42L and second correct answer data 602CA.
- the training feature information 42L is feature information 42 of a candidate substance for which an Ames mutagenicity test was actually performed in the past.
- the second correct answer data 602CA is the result of an evaluation of whether or not the candidate substance that provided the training feature information 42L has mutagenicity in an Ames mutagenicity test that was actually performed in the past.
- there are a plurality of second learning data 622, including data in which the second correct answer data 602CA indicates that the candidate substance is mutagenic (positive) and data in which the second correct answer data 602CA indicates that the candidate substance is not mutagenic (negative).
- a boundary line 66 is determined that allows classification of the second learning data 622 in which the second correct answer data 602CA indicates that the candidate substance is mutagenic and the second learning data 622 in which the second correct answer data 602CA indicates that the candidate substance is not mutagenic.
- the second prediction model 312 is generated.
- the generated second prediction model 312 is stored in the storage 20A. Note that learning of the second prediction model 312 may continue even after storage in storage 20A.
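To make the idea of determining a boundary line concrete, the sketch below fits a linear soft-margin support vector machine by sub-gradient descent on the regularized hinge loss. This is one common way to train a linear SVM and is an assumption here; the source does not state the kernel, solver, or hyperparameters. The two-dimensional toy points stand in for the Z1 and Z2 axes and are not real data.

```python
from typing import List, Sequence, Tuple

Sample = Tuple[Sequence[float], int]  # (feature vector, label +1 positive / -1 negative)


def train_linear_svm(data: Sequence[Sample], epochs: int = 100,
                     lr: float = 0.1, reg: float = 0.01) -> Tuple[List[float], float]:
    """Sub-gradient descent on (reg / 2) * |w|^2 + hinge loss, one sample at a time."""
    w = [0.0] * len(data[0][0])
    b = 0.0
    for _ in range(epochs):
        for x, y in data:
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            # L2 regularization shrinks the weights on every step.
            w = [wi - lr * reg * wi for wi in w]
            if margin < 1:
                # Sample is misclassified or inside the margin: hinge-loss step.
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
    return w, b


def classify(w: Sequence[float], b: float, x: Sequence[float]) -> int:
    """Return +1 or -1 depending on the side of the boundary w.x + b = 0."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0 else -1


# Toy points in a two-dimensional feature space (Z1, Z2); not real data.
svm_data: List[Sample] = [
    ([1.0, 1.0], 1), ([2.0, 1.0], 1), ([1.0, 2.0], 1),
    ([-1.0, -1.0], -1), ([-2.0, -1.0], -1), ([-1.0, -2.0], -1),
]
svm_w, svm_b = train_linear_svm(svm_data)
```

The line `w.x + b = 0` plays the role of the boundary line 66; in the real feature space it would be a hyperplane in several hundred to several thousand dimensions.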
- for ease of illustration, the feature space 65 is drawn as two-dimensional, with a Z1 axis and a Z2 axis, but the actual feature space 65 has several hundred to several thousand dimensions, as described above.
- the information 70 of the Ames mutagenicity tests actually performed in the past, which is the basis of the first learning data 621 and the second learning data 622, includes the candidate substance information 14, the evaluation result of whether or not the candidate substance has mutagenicity, and information on the strains. The strains comprise first type strains that are sensitive to base pair substitution mutations and second type strains that are sensitive to frameshift mutations.
- the first type strains are three types, TA100, TA1535, and WP2uvrA
- the second type strains are two types, TA98 and TA1537.
- WP2uvrA/pKM101 may be used instead of WP2uvrA.
- TA98NR may be used instead of TA98.
- the first type strain may also include TA102.
- the second type strain may also include TA97 or TA97a, or TA1538.
- as shown in Table 71, among the Ames mutagenicity tests actually conducted in the past, those in which a first type strain is registered are assigned to the first learning data 621, and those in which a second type strain is registered are assigned to the second learning data 622.
- the first learning data 621 and the second learning data 622 are prepared based on information on the first type of bacterial strain and the second type of bacterial strain. Therefore, the first learning data 621 and the second learning data 622 are different.
- the candidate substance information 14 of the Ames mutagenicity test assigned to the first learning data 621 becomes candidate substance information for learning 14L of the first learning data 621, and the evaluation result becomes the first correct answer data 601CA.
- the feature information 42 derived from the candidate substance information 14 of the Ames mutagenicity test assigned to the second learning data 622 becomes feature information for learning 42L of the second learning data 622, and the evaluation result becomes the second correct answer data 602CA.
- the information 70 of the Ames mutagenicity test that was actually conducted in the past may be public information that is widely available to the public, or may be information that has been independently accumulated by a pharmaceutical company or a contracted pharmaceutical development organization. Also, it may be composed of both public information and information that has been independently accumulated by a pharmaceutical company or a contracted pharmaceutical development organization.
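The strain-based assignment described for Table 71 can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the disclosure: the record layout `AmesRecord`, the function `split_by_strain`, and the sample entries (whose labels are arbitrary placeholders, not real test outcomes) are hypothetical. Only the strain names are taken from the text.

```python
from dataclasses import dataclass

# Strain groups named in the text (optional alternatives such as TA102 or TA97 are omitted).
FIRST_TYPE_STRAINS = {"TA100", "TA1535", "WP2uvrA"}  # sensitive to base pair substitution
SECOND_TYPE_STRAINS = {"TA98", "TA1537"}             # sensitive to frameshift


@dataclass(frozen=True)
class AmesRecord:
    """Hypothetical record of one past Ames test (one entry of information 70)."""
    smiles: str       # candidate substance information 14
    strain: str       # strain registered for the test
    mutagenic: bool   # evaluation result, later used as correct answer data


def split_by_strain(records):
    """Assign each past test to the first or second learning data by its strain."""
    first, second = [], []
    for record in records:
        if record.strain in FIRST_TYPE_STRAINS:
            first.append(record)
        elif record.strain in SECOND_TYPE_STRAINS:
            second.append(record)
        # Records with an unlisted strain are skipped in this sketch.
    return first, second


# Placeholder entries; the labels are arbitrary and carry no experimental meaning.
records = [
    AmesRecord("CCO", "TA100", False),
    AmesRecord("c1ccccc1N", "TA98", True),
    AmesRecord("CCO", "TA1537", False),
]
first_data, second_data = split_by_strain(records)
```

In this sketch the same substance can appear in both data sets when it was tested on strains of both types, which is consistent with the two sets being different yet drawn from the same pool of past tests.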
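- when both the first predicted evaluation result 601 and the second predicted evaluation result 602 indicate that the candidate substance is not mutagenic, the prediction unit 38 outputs an overall predicted evaluation result 76 indicating that the candidate substance is not mutagenic.
- when at least one of the first predicted evaluation result 601 and the second predicted evaluation result 602 indicates that the candidate substance is mutagenic, the prediction unit 38 outputs an overall predicted evaluation result 76 indicating that the candidate substance is mutagenic.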
- the prediction unit 38 outputs a first predicted evaluation result 601, a second predicted evaluation result 602, and an overall predicted evaluation result 76 as prediction information 15.
- FIG. 11 illustrates an example in which both the first predicted evaluation result 601 and the second predicted evaluation result 602 indicate that the candidate substance is not mutagenic.
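How the three results might be combined into the prediction information 15 can be sketched as below. The sketch assumes that the overall result is positive whenever at least one of the two individual results is positive, which is how the text reads; the function names and dictionary keys are hypothetical.

```python
def overall_result(first_positive: bool, second_positive: bool) -> bool:
    """Assumed rule: mutagenic overall if either model predicts mutagenicity."""
    return first_positive or second_positive


def build_prediction_info(first_positive: bool, second_positive: bool) -> dict:
    """Bundle the two individual results and the overall result (prediction information 15)."""
    return {
        "first_predicted_evaluation_result": first_positive,
        "second_predicted_evaluation_result": second_positive,
        "overall_predicted_evaluation_result": overall_result(first_positive, second_positive),
    }


# The FIG. 11 case: both individual results are negative.
info = build_prediction_info(False, False)
```

Keeping the two individual results next to the overall result lets the user see which type of mutation drove a positive overall call.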
- a prediction AP 80 is stored in the storage 20B of the user terminal 11.
- the prediction AP 80 is installed in the user terminal 11 by the user U.
- the prediction AP 80 is an AP for predicting the evaluation results of an Ames mutagenicity test.
- the CPU 22B of the user terminal 11 works in cooperation with the memory 21 and the like to function as a browser control unit 82.
- the browser control unit 82 controls the operation of a dedicated web browser for the prediction AP 80.
- the browser control unit 82 reproduces various screens based on various screen data from the information processing server 10, and displays the reproduced various screens on the display 24B.
- the browser control unit 82 also accepts various operation instructions input by the user U from the input device 25B via the various screens.
- the browser control unit 82 transmits various requests, including a prediction request 13, to the information processing server 10 in response to the operation instructions.
- an information input screen 85 as shown in FIG. 13 is displayed on the display 24B under the control of the browser control unit 82.
- the information input screen 85 is provided with input boxes 86 for the candidate substance information 14 of a plurality of candidate substances.
- the chemical structural formula of the candidate substance can be written using a description tool that appears when the description tool display button 87 is selected, or a file of the chemical structural formula of the candidate substance can be dropped.
- the input boxes 86 can be added by selecting the add buttons 88A and 88B at the bottom.
- the add button 88A is a button for adding one input box 86 at a time
- the add button 88B is a button for adding ten input boxes 86 at a time.
- after inputting the chemical structural formula of a desired candidate substance in the input box 86, the user U selects the prediction button 89.
- when the prediction button 89 is selected, the browser control unit 82 generates a prediction request 13 including candidate substance information 14 corresponding to the chemical structural formula input in the input box 86, and transmits the generated prediction request 13 to the information processing server 10.
- a predicted evaluation result display screen 95 shown in FIG. 14 as an example is displayed on the display 24B under the control of the browser control unit 82.
- the predicted evaluation result display screen 95 displays a list of predicted information 15 for each candidate substance. In this way, the predicted information 15 is presented to the user U in the form of screen data distribution.
- a chemical structure formula display button 96 is provided at the top of the predicted evaluation result display screen 95.
- when the chemical structure formula display button 96 is selected, a list screen of the chemical structure formulas of the candidate substances is displayed.
- a save button 97 and an OK button 98 are provided at the bottom of the predicted evaluation result display screen 95.
- when the save button 97 is selected, the candidate substance information 14 and the predicted information 15 are stored in association with each other in the storage 20B of the user terminal 11.
- when the OK button 98 is selected, the predicted evaluation result display screen 95 is closed.
- the CPU 22A of the information processing server 10 functions as a request receiving unit 35, a RW control unit 36, a feature derivation unit 37, a prediction unit 38, and a screen delivery control unit 39, as shown in FIG. 3.
- the CPU 22B of the user terminal 11 functions as a browser control unit 82, as shown in FIG. 12.
- the display 24B of the user terminal 11 displays the information input screen 85 shown in FIG. 13.
- a prediction request 13 is sent from the browser control unit 82 to the information processing server 10.
- the prediction request 13 includes candidate substance information 14, which is a character string expressing the chemical structure of the candidate substance in SMILES notation, and the terminal ID of the user terminal 11, etc.
- the request receiving unit 35 receives the prediction request 13 (YES in step ST100).
- the candidate substance information 14 included in the prediction request 13 is output from the request receiving unit 35 to the RW control unit 36, and is stored in the storage 20A under the control of the RW control unit 36 (step ST110).
- the terminal ID of the user terminal 11 included in the prediction request 13 is output from the request receiving unit 35 to the screen delivery control unit 39.
- the candidate substance information 14 is read from the storage 20A by the RW control unit 36 (step ST120).
- the candidate substance information 14 is output from the RW control unit 36 to the feature derivation unit 37 and the prediction unit 38.
- the feature derivation unit 37 derives feature information 42 from the candidate substance information 14 (step ST130).
- the feature information 42 includes feature amounts 45 related to the geometric shape of the candidate substance, feature amounts 46 related to the electronic properties of the candidate substance, feature amounts 47 related to the physicochemical properties of the candidate substance, and feature amounts 48 related to the partial structure of the candidate substance.
- the feature information 42 is output from the feature derivation unit 37 to the prediction unit 38.
- the candidate substance information 14 is input to the first prediction model 311.
- the first prediction evaluation result 601 is output from the first prediction model 311 (step ST140_1).
- the feature information 42 is input to the second prediction model 312.
- the second prediction evaluation result 602 is output from the second prediction model 312 (step ST140_2).
- prediction information 15 is generated that includes the first prediction evaluation result 601, the second prediction evaluation result 602, and the overall prediction evaluation result 76.
- the prediction information 15 is output from the prediction unit 38 to the screen delivery control unit 39.
- the screen delivery control unit 39 generates screen data for the predicted evaluation result display screen 95 shown in FIG. 14 based on the prediction information 15.
- the screen data for the predicted evaluation result display screen 95 is delivered to the user terminal 11 that sent the prediction request 13 under the control of the screen delivery control unit 39 (step ST150).
- the screen data of the predicted evaluation result display screen 95 is reproduced, and the reproduced predicted evaluation result display screen 95 is displayed on the display 24B. In this way, the predicted information 15 is presented to the user U.
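The server-side flow of steps ST100 to ST150 can be sketched as follows. This is an illustrative outline only: `PredictionRequest`, `handle_prediction_request`, the dictionary used as storage, and the three stub callables are hypothetical, and the stubs have no chemical meaning. The overall result is again assumed to be positive when either individual result is positive.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PredictionRequest:
    """Hypothetical shape of a prediction request 13."""
    terminal_id: str
    smiles_list: List[str]  # candidate substance information 14 in SMILES notation


def handle_prediction_request(request, first_model, second_model, derive_features, storage):
    # ST110: store the candidate substance information.
    storage[request.terminal_id] = list(request.smiles_list)
    predictions = []
    # ST120: read the stored candidate substance information back.
    for smiles in storage[request.terminal_id]:
        features = derive_features(smiles)  # ST130: derive feature information 42
        first = first_model(smiles)         # ST140_1: first model takes the structure itself
        second = second_model(features)     # ST140_2: second model takes the feature information
        predictions.append({
            "smiles": smiles,
            "first": first,
            "second": second,
            "overall": first or second,     # assumed combination rule
        })
    # ST150: return the data to be rendered for the requesting terminal.
    return {"terminal_id": request.terminal_id, "predictions": predictions}


# Stub models and feature derivation: placeholders, not real predictors.
storage = {}
request = PredictionRequest("terminal-001", ["CCO", "c1ccccc1N"])
response = handle_prediction_request(
    request,
    first_model=lambda smiles: "N" in smiles,
    second_model=lambda features: features[0] > 5,
    derive_features=lambda smiles: [len(smiles)],
    storage=storage,
)
```

Passing the models and the feature derivation in as callables keeps the flow independent of the learning method chosen for each model.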
- the information processing server 10 uses the first prediction model 311 and the second prediction model 312 as the prediction model 31 for predicting the evaluation result of the Ames mutagenicity test for a candidate substance.
- the first prediction model 311 outputs a first predicted evaluation result 601 indicating whether or not the candidate substance has mutagenicity related to base pair substitution mutation when the candidate substance is added to a first type of bacterial strain that is sensitive to base pair substitution mutation in which a part of the base sequence is changed.
- the second prediction model 312 outputs a second predicted evaluation result 602 indicating whether or not the candidate substance has mutagenicity related to frameshift mutation when the candidate substance is added to a second type of bacterial strain that is sensitive to frameshift mutation in which the reading frame of the base sequence is shifted.
- the screen delivery control unit 39 presents the prediction information 15 to the user U by delivering screen data of the prediction evaluation result display screen 95 including prediction information 15 corresponding to the first prediction evaluation result 601 and the second prediction evaluation result 602 to the user terminal 11.
- This makes it possible to improve the accuracy of prediction of the evaluation results of the Ames mutagenicity test compared to using a prediction model common to both base pair substitution mutations and frameshift mutations, which have completely different causes, structural features, etc.
- it also makes it possible to improve the accuracy of identifying whether the type of gene mutation is a base pair substitution mutation or a frameshift mutation.
- the candidate substance information 14 is information related to the chemical structure of the candidate substance.
- Information related to the chemical structure is information that most succinctly represents the properties of the candidate substance. Therefore, the first prediction evaluation result 601 that reflects the properties of the candidate substance can be output from the first prediction model 311.
- the prediction model 31 is constructed by one of the machine learning methods: support vector machine, linear separation, gradient boosting tree, AdaBoost, random forest, deep learning, and ensemble learning of these. Since these are all very common machine learning methods, it is easy to construct a prediction model 31 with relatively high prediction accuracy.
- the first prediction model 311 and the second prediction model 312 are constructed using different machine learning methods. Therefore, it is possible to construct the first prediction model 311 suitable for predicting whether or not a candidate substance has mutagenicity associated with a base pair substitution mutation, and the second prediction model 312 suitable for predicting whether or not a candidate substance has mutagenicity associated with a frameshift mutation.
- Base pair substitution mutations are the main cause of mutagenicity, and there is more data available for reference than frameshift mutations, which are a secondary cause of mutagenicity.
- the first learning data 621 is therefore more numerous than the second learning data 622.
- the first prediction model 311 is thus well suited to deep learning, which requires a large amount of learning data to improve prediction accuracy.
- the second prediction model 312 is suitable for a machine learning method other than deep learning, which can improve prediction accuracy by using selected features even with a relatively small amount of learning data. In this way, by constructing the first prediction model 311 and the second prediction model 312 using the machine learning method that each is good at, it is possible to further improve the prediction accuracy of the evaluation results of the Ames mutagenicity test.
- multiple prediction models 31 may be constructed using multiple different machine learning methods, and the prediction model 31 with the highest prediction accuracy among them may be adopted.
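Selecting the most accurate of several candidate models can be sketched as follows. The text does not say how prediction accuracy is measured; this sketch assumes plain accuracy on a held-out validation set. The candidate "models" are toy threshold functions and the validation pairs are arbitrary placeholders.

```python
def accuracy(model, samples):
    """Fraction of (input, label) pairs that the model classifies correctly."""
    correct = sum(1 for x, label in samples if model(x) == label)
    return correct / len(samples)


def select_best(candidates, validation):
    """Return (name, model) of the candidate with the highest validation accuracy."""
    return max(candidates.items(), key=lambda item: accuracy(item[1], validation))


# Toy validation data: (feature value, positive?) pairs with no experimental meaning.
validation = [(0.2, False), (0.6, True), (0.9, True), (0.4, False)]

# Stand-ins for models built with different machine learning methods.
candidates = {
    "threshold_0.5": lambda x: x > 0.5,
    "threshold_0.8": lambda x: x > 0.8,
    "always_positive": lambda x: True,
}

best_name, best_model = select_best(candidates, validation)
```

With imbalanced positive and negative data, a metric other than plain accuracy may be a better selection criterion; that choice is outside what the text specifies.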
- the feature information 42 includes feature 45 related to the geometric shape of the candidate substance, feature 46 related to the electronic properties of the candidate substance, feature 47 related to the physicochemical properties of the candidate substance, and feature 48 related to the partial structure of the candidate substance. This can improve the prediction accuracy of the second prediction evaluation result 602. Note that it is sufficient for the feature information 42 to include at least one of the above feature amounts 45 to 48.
- the first learning data 621 used to train the first prediction model 311 and the second learning data 622 used to train the second prediction model 312 are at least partially different. Therefore, learning can be performed that is appropriate for each of the first prediction model 311 and the second prediction model 312.
- the first learning data 621 and the second learning data 622 are prepared based on information on the first and second strains. Therefore, it is possible to prepare the first learning data 621 and the second learning data 622 suitable for the first prediction model 311 and the second prediction model 312, respectively.
- a first prediction model 311 corresponding to base pair substitution mutation and constructed by a deep neural network may be used, through transfer learning, to construct a second prediction model 312 corresponding to frameshift mutation.
- the first prediction model 311 which has been trained using the first learning data 621 as shown in FIG. 7, is similarly subjected to the process shown in FIG. 7 using the second learning data 622.
- the first prediction model 311 is trained using the first learning data 621, which is greater in number than the second learning data 622, and this is used as the second prediction model 312 by transfer learning.
- the processing can be simplified since there is no need to derive the feature amount information 42.
- the first prediction model 311 and the second prediction model 312 have different internal parameters, such as filter coefficients.
- the prediction unit 38 performs in parallel the prediction of whether or not the candidate substance has mutagenicity related to base pair substitution mutation using the first prediction model 311 and the prediction of whether or not the candidate substance has mutagenicity related to frameshift mutation using the second prediction model 312, but the procedure is not limited to this.
- prediction may be performed according to the procedure shown in FIG. 17. That is, first, the first prediction model 311 is used to predict whether or not the candidate substance has mutagenicity related to base pair substitution mutation. Then, only when the first prediction evaluation result 601 indicates that the candidate substance does not have mutagenicity related to base pair substitution mutation, the second prediction model 312 is used to predict whether or not the candidate substance has mutagenicity related to frameshift mutation.
- when the second prediction evaluation result 602 also indicates that the candidate substance is not mutagenic, the prediction unit 38 outputs an overall prediction evaluation result 76 indicating that the candidate substance is not mutagenic.
- when the first prediction evaluation result 601 or the second prediction evaluation result 602 indicates that the candidate substance is mutagenic, the prediction unit 38 outputs an overall prediction evaluation result 76 indicating that the candidate substance is mutagenic. If the first prediction evaluation result 601 indicates that the candidate substance is mutagenic, prediction using the second prediction model 312 is not performed, so that the processing time can be shortened. However, as in the above example, it is preferable to always perform both the prediction using the first prediction model 311 and the prediction using the second prediction model 312, because this increases the reliability of the overall prediction evaluation result 76.
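The sequential procedure of FIG. 17 can be sketched as a short-circuit: the second model runs only when the first result is negative. The function name, the result dictionary, the stub models, and the placeholder substance identifiers are all hypothetical. A call counter is included only to make the skipped call visible.

```python
def predict_sequential(substance, features, first_model, second_model):
    """Run the first model, and the second model only if the first result is negative."""
    first = first_model(substance)
    if first:
        # First result is positive: the second prediction is skipped to save time.
        return {"first": True, "second": None, "overall": True}
    second = second_model(features)
    return {"first": False, "second": second, "overall": second}


# Stub models; the counter records how often the second model is actually called.
calls = {"second": 0}


def first_stub(substance):
    return substance == "substance_A"


def second_stub(features):
    calls["second"] += 1
    return features[0] > 5


r1 = predict_sequential("substance_A", [9], first_stub, second_stub)  # second model skipped
r2 = predict_sequential("substance_B", [3], first_stub, second_stub)  # second model runs
```

The trade-off noted in the text shows up in `r1`: the overall result is available sooner, but no second prediction evaluation result exists for that substance.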
- the method of sorting the first learning data 621 and the second learning data 622 is not limited to the example shown in Fig. 9.
- as an example, as shown in table 100 in Fig. 18, among the Ames mutagenicity tests actually performed in the past, those whose evaluation result is registered as mutagenic are sorted based on the strain information as in the case of Fig. 9, whereas those whose evaluation result is registered as not mutagenic may be sorted into both the first learning data 621 and the second learning data 622. In other words, it is sufficient that the first learning data 621 and the second learning data 622 are at least partially different.
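The alternative sorting rule of table 100 can be sketched as follows: positive tests are assigned by strain, while negative tests go into both data sets. The tuple layout, the function name `split_with_shared_negatives`, and the sample entries are hypothetical placeholders. Only the strain names come from the text.

```python
FIRST_TYPE_STRAINS = {"TA100", "TA1535", "WP2uvrA"}  # base pair substitution
SECOND_TYPE_STRAINS = {"TA98", "TA1537"}             # frameshift


def split_with_shared_negatives(records):
    """records: iterable of (substance, strain, mutagenic) tuples."""
    first, second = [], []
    for record in records:
        _substance, strain, mutagenic = record
        if not mutagenic:
            # Negative results are assigned to both learning data sets.
            first.append(record)
            second.append(record)
        elif strain in FIRST_TYPE_STRAINS:
            first.append(record)
        elif strain in SECOND_TYPE_STRAINS:
            second.append(record)
    return first, second


# Placeholder entries with arbitrary labels.
records = [
    ("substance_A", "TA100", True),
    ("substance_B", "TA98", True),
    ("substance_C", "TA100", False),
]
first_data, second_data = split_with_shared_negatives(records)
```

The two sets overlap on the negative record but differ on the positives, so they remain at least partially different as the text requires.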
- (Variation 4) FIGS. 5 and 6 show an example in which the first prediction model 311 and the second prediction model 312 are constructed by different machine learning methods (the first prediction model 311 is constructed by a deep neural network, and the second prediction model 312 is constructed by a support vector machine), but the configuration is not limited to this.
- the first prediction model 311 and the second prediction model 312 may be constructed by the same machine learning method.
- FIG. 19 shows an example in which the first prediction model 311 and the second prediction model 312 are both constructed by a support vector machine.
- the first boundary line 661 in the feature space 65 of the first prediction model 311 is naturally different from the second boundary line 662 in the feature space 65 of the second prediction model 312.
- the first boundary line 661 and the second boundary line 662 are examples of the "internal parameters" according to the technology of the present disclosure.
- even when the first prediction model 311 and the second prediction model 312 are constructed by the same machine learning method, it is possible to use a first prediction model 311 suitable for predicting whether or not a candidate substance has mutagenicity related to base pair substitution mutation and a second prediction model 312 suitable for predicting whether or not a candidate substance has mutagenicity related to frameshift mutation.
- the first learning data 621 is composed of a set of learning feature information 42L and first correct answer data 601CA, as shown in FIG. 20.
- the first prediction model 311 and the second prediction model 312 may both be constructed by deep learning such as a deep neural network.
- the internal parameters are, for example, coefficients set in an intermediate layer (hidden layer) or coefficients of a filter used in a convolution operation.
- the candidate substance information 14 is not limited to a character string that represents the chemical structure of the candidate substance using the SMILES notation shown as an example. It may also be a MOL (Molecular Design Limited) file that represents the chemical structure of the candidate substance, or an SDF (Structure-Data File), etc. In any case, it is preferable for the description method to be capable of uniquely determining a three-dimensional structure such as an isomer, and it is even more preferable for the description method to be capable of representing the three-dimensional coordinate information of a molecule.
- the feature information 42 may be derived by a device other than the information processing server 10 and input to the information processing server 10. Also, the user U may input the feature information 42 via the input device 25B of the user terminal 11.
- the prediction information 15 may be only the overall prediction evaluation result 76. Conversely, the prediction information 15 may be only the set of the first prediction evaluation result 601 and the second prediction evaluation result 602.
- the information processing server 10 may be installed in a pharmaceutical company or a pharmaceutical development contract organization, or in a data center independent of the pharmaceutical company or pharmaceutical development contract organization.
- the predicted information 15 itself may be delivered to the user terminal 11.
- the user terminal 11 generates the predicted evaluation result display screen 95 based on the predicted information 15 under the control of the browser control unit 82.
- the method of presenting the prediction information 15 to the user U is not limited to the example of presenting it by distributing screen data.
- the prediction information 15 may be presented to the user U by printing it on a paper medium, or the prediction information 15 may be attached to an e-mail and sent to the user terminal 11.
- the product is not limited to pharmaceuticals. It can be cosmetics, etc.
- the hardware configuration of the computer constituting the information processing server 10 can be modified in various ways.
- the information processing server 10 can be composed of multiple computers separated as hardware in order to improve processing power and reliability.
- the functions of the request reception unit 35 and RW control unit 36 and the functions of the feature derivation unit 37, prediction unit 38, and screen delivery control unit 39 can be distributed and assigned to two computers.
- the information processing server 10 is composed of two computers. Some or all of the functions of the information processing server 10 may be assigned to the user terminal 11.
- the hardware configuration of the computer of the information processing server 10 can be changed as appropriate according to the required performance such as processing power, safety, and reliability.
- APs such as the operating program 30 can of course be duplicated or stored in multiple storage devices in order to ensure safety and reliability.
- the hardware structure of the processing unit that executes various processes such as the request receiving unit 35, RW control unit 36, feature derivation unit 37, prediction unit 38, screen delivery control unit 39, and browser control unit 82 can use the various processors shown below.
- the various processors include the CPUs 22A and 22B, which are general-purpose processors that execute software (operation program 30 and prediction AP 80) and function as various processing units, as well as programmable logic devices (PLDs) such as FPGAs (Field Programmable Gate Arrays) that are processors whose circuit configuration can be changed after manufacture, and dedicated electrical circuits such as ASICs (Application Specific Integrated Circuits), which are processors with circuit configurations designed specifically to execute specific processes.
- a single processing unit may be configured with one of these various processors, or may be configured with a combination of two or more processors of the same or different types (e.g., a combination of multiple FPGAs and/or a combination of a CPU and an FPGA). Also, multiple processing units may be configured with a single processor.
- Examples of configuring multiple processing units with a single processor include, first, a form in which one processor is configured with a combination of one or more CPUs and software, as typified by client and server computers, and this processor functions as multiple processing units. Second, a form in which a processor is used to realize the functions of the entire system, including multiple processing units, with a single IC (Integrated Circuit) chip, as typified by system-on-chip (SoC). In this way, the various processing units are configured as a hardware structure using one or more of the various processors listed above.
- the hardware structure of these various processors can be an electrical circuit that combines circuit elements such as semiconductor elements.
- An information processing device that predicts an evaluation result of an Ames mutagenicity test for a candidate substance of a product using a prediction model
- a processor is provided.
- the processor a first prediction model that outputs a first prediction evaluation result indicating whether or not the candidate substance has mutagenicity related to base pair substitution mutations, in the case where the candidate substance is added to a first species strain that is sensitive to base pair substitution mutations in which a part of a base sequence is changed; a second prediction model that outputs a second prediction evaluation result indicating whether or not the candidate substance has mutagenicity related to the frameshift mutation when the candidate substance is added to a second species strain that is sensitive to a frameshift mutation in which the reading frame of a base sequence is shifted; obtaining candidate substance information regarding said candidate substance; inputting the candidate substance information or information based on the candidate substance information into the first prediction model and the second prediction model, and outputting the first prediction evaluation result and the second prediction evaluation result from the first prediction model and the second prediction model; presenting
- the information processing device, wherein the candidate substance information is information regarding a chemical structure of the candidate substance.
- the predictive model is constructed by any one of machine learning techniques including a support vector machine, a linear separation, a gradient boosting tree, an AdaBoost, a random forest, deep learning, and an ensemble learning thereof.
- the first prediction model is constructed by deep learning;
- the information processing device according to claim 5, wherein the second prediction model is constructed by transfer learning of the first prediction model that has already been trained.
- the information processing device according to claim 5, wherein the first prediction model and the second prediction model are constructed using different machine learning techniques.
- the first prediction model is constructed by deep learning;
- the prediction model is constructed by any one of machine learning techniques including a support vector machine, a linear separation, a gradient boosting tree, an AdaBoost, a random forest, and an ensemble learning thereof
- the processor acquiring feature amount information of the candidate substance;
- the information processing device according to any one of claims 5, 7, 8, and 9, wherein the feature amount information is input to the prediction model.
- the feature information includes at least one of a feature relating to the geometric shape of the candidate substance, a feature relating to the electronic properties of the candidate substance, a feature relating to the physicochemical properties of the candidate substance, and a feature relating to the partial structure of the candidate substance.
- “A and/or B” is synonymous with “at least one of A and B.”
- “A and/or B” means that it may be just A, or just B, or a combination of A and B.
- the same concept as “A and/or B” is also applied when three or more things are linked together with “and/or.”
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| JP2025537335A JPWO2025028333A1 | 2023-07-28 | 2024-07-22 | |
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| JP2023-123812 | 2023-07-28 | | |
| JP2023123812 | 2023-07-28 | | |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2025028333A1 (ja) | 2025-02-06 |
Family
ID=94395262
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
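| PCT/JP2024/026195 (Pending, WO2025028333A1 (ja)) | Information processing device, method for operating information processing device, and program for operating information processing device | 2023-07-28 | 2024-07-22 |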
Country Status (2)
| Country | Link |
|---|---|
| JP (1) | JPWO2025028333A1 |
| WO (1) | WO2025028333A1 |
Citations (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
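| JP2019010095A (ja) * | 2017-06-30 | 2019-01-24 | Meiji Pharmaceutical University | Prediction device, prediction method, prediction program, learning model input data generation device, and learning model input data generation program |
| JP2021071932A (ja) * | 2019-10-31 | 2021-05-06 | Yokogawa Electric Corporation | Device, method, and program |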
2024
- 2024-07-22 WO PCT/JP2024/026195 patent/WO2025028333A1/ja active Pending
- 2024-07-22 JP JP2025537335A patent/JPWO2025028333A1/ja active Pending
Non-Patent Citations (3)
| Title |
|---|
| LUI RAYMOND, GUAN DAVY, MATTHEWS SLADE: "Mechanistic Task Groupings Enhance Multitask Deep Learning of Strain-Specific Ames Mutagenicity", CHEMICAL RESEARCH IN TOXICOLOGY, AMERICAN CHEMICAL SOCIETY, WASHINGTON, DC, US, vol. 36, no. 8, 21 August 2023 (2023-08-21), US , pages 1248 - 1254, XP093271034, ISSN: 0893-228X, DOI: 10.1021/acs.chemrestox.2c00385 * |
| M. J. MARTINEZ ET AL.: "Journal of Chemical Information and Modeling", vol. 62, September 2022, article "Multitask Deep Neural Networks for Ames Mutagenicity Prediction" |
| MARTÍNEZ MARÍA JIMENA, SABANDO MARÍA VIRGINIA, SOTO AXEL J., ROCA CARLOS, REQUENA-TRIGUERO CARLOS, CAMPILLO NURIA E., PÁEZ JUAN A.: "Multitask Deep Neural Networks for Ames Mutagenicity Prediction", JOURNAL OF CHEMICAL INFORMATION AND MODELING, AMERICAN CHEMICAL SOCIETY , WASHINGTON DC, US, vol. 62, no. 24, 26 December 2022 (2022-12-26), US , pages 6342 - 6351, XP093271039, ISSN: 1549-9596, DOI: 10.1021/acs.jcim.2c00532 * |
Also Published As
| Publication number | Publication date |
|---|---|
| JPWO2025028333A1 | 2025-02-06 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| Dai et al. | | Matrix factorization‐based prediction of novel drug indications by integrating genomic space |
| Higham et al. | | Fitting a geometric graph to a protein–protein interaction network |
| Leontis et al. | | The RNA Ontology Consortium: an open invitation to the RNA community |
| CN108140075A (zh) | | Classifying user behavior as anomalous |
| JP7006296B2 (ja) | | Learning program, learning method, and learning device |
| Abdulloh et al. | | Observation of imbalance tracer study data for graduates employability prediction in Indonesia |
| JP2018147280A (ja) | | Data analysis device and data analysis method |
| Luo et al. | | Projecting molecules into synthesizable chemical spaces |
| CN108255706A (zh) | | Method, apparatus, terminal device, and storage medium for editing automated test scripts |
| Carré et al. | | Reverse engineering highlights potential principles of large gene regulatory network design and learning |
| Barros-Justo et al. | | The impact of Use Cases in real-world software development projects: A systematic mapping study |
| Harihar et al. | | Importance of inter-residue contacts for understanding protein folding and unfolding rates, remote homology, and drug design |
| EP4280119A1 (en) | | Rule update program, rule update method, and rule update device |
| Brown et al. | | Large-scale causal discovery using interventional data sheds light on gene network structure in k562 cells |
| WO2025028333A1 (ja) | | Information processing device, method for operating information processing device, and program for operating information processing device |
| Grassi et al. | | SEMtree: tree-based structure learning methods with structural equation models |
| JP4421971B2 (ja) | | Analysis engine exchange type system and data analysis program |
| JP7450187B2 (ja) | | Program, information processing method, and information processing device |
| JP2010250377A (ja) | | Link prediction system, method, and program |
| Karthika et al. | | Genetic algorithm-based feature selection and self-organizing auto-encoder (SOAE) for SNP genomics data classifications |
| Phenix et al. | | Identifiability and inference of pathway motifs by epistasis analysis |
| Smith et al. | | Protocol for CAROM: a machine learning tool to predict post-translational regulation from metabolic signatures |
| WO2019198408A1 (ja) | | Learning device, learning method, and learning program |
| Khatir et al. | | Pairwise test case generation using (1+ 1) evolutionary algorithm for software product line testing |
| Yu et al. | | Combining machine learning, molecular dynamics, and free energy analysis for (5HT)-2A receptor modulator classification |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | EP: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 24848988; Country of ref document: EP; Kind code of ref document: A1 |
| | ENP | Entry into the national phase | Ref document number: 2025537335; Country of ref document: JP; Kind code of ref document: A |
| | WWE | WIPO information: entry into national phase | Ref document number: 2025537335; Country of ref document: JP |
| | WWE | WIPO information: entry into national phase | Ref document number: 2024848988; Country of ref document: EP |
| | NENP | Non-entry into the national phase | Ref country code: DE |