WO2022039092A1 - Learning model generation method, program, and calculation device - Google Patents

Learning model generation method, program, and calculation device

Publication number
WO2022039092A1
WO2022039092A1 PCT/JP2021/029682 JP2021029682W WO2022039092A1 WO 2022039092 A1 WO2022039092 A1 WO 2022039092A1 JP 2021029682 W JP2021029682 W JP 2021029682W WO 2022039092 A1 WO2022039092 A1 WO 2022039092A1
Authority
WO
WIPO (PCT)
Prior art keywords
learning model
information
sample
pathogen
infection
Prior art date
Application number
PCT/JP2021/029682
Other languages
French (fr)
Japanese (ja)
Inventor
孝章 赤池
智史 高木
Original Assignee
孝章 赤池
バイオ・アクセラレーター株式会社
Application filed by 孝章 赤池, バイオ・アクセラレーター株式会社
Publication of WO2022039092A1

Classifications

    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N33/00 Investigating or analysing materials by specific methods not covered by groups G01N1/00 - G01N31/00
    • G01N33/48 Biological material, e.g. blood, urine; Haemocytometers
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N33/00 Investigating or analysing materials by specific methods not covered by groups G01N1/00 - G01N31/00
    • G01N33/48 Biological material, e.g. blood, urine; Haemocytometers
    • G01N33/50 Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing
    • G01N33/68 Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing involving proteins, peptides or amino acids
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B40/00 ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
    • G16B40/20 Supervised data analysis

Definitions

  • the present invention relates to a learning model generation method, a program, and an arithmetic unit.
  • Patent Document 1 discloses a method for diagnosing a microbial infection in an organism, wherein the microbial infection is at least partially present as a basic body in the cell material of the organism. The method comprises the step of subjecting a volume of a test composition comprising the cell material from the organism to mass spectrometry to identify the presence of the microbial infection.
  • The method for generating a learning model according to the first aspect of the present invention is a method for creating a learning model for diagnosing the degree of infection with a pathogen. A plurality of teacher data, each including information on proteins obtained from a sample collected from a sample provider and the degree of infection of that sample provider with the pathogen, are acquired, and the teacher data are used to generate a learning model that takes the protein information as input and outputs the degree of infection with the pathogen.
  • The program according to the second aspect of the present invention acquires evaluation data, which is information on the proteins of a sample collected from an evaluation subject, inputs the acquired evaluation data into a learning model trained using teacher data whose input is information on proteins obtained from a sample collected from a sample provider and whose output is the degree of infection of the sample provider with the pathogen, and outputs the degree of infection of the evaluation subject with the pathogen.
  • The arithmetic unit includes: a reading unit that reads evaluation data, which is information on the proteins of a sample collected from an evaluation subject; a storage unit that stores a learning model trained using teacher data whose input is information on proteins obtained from a sample collected from a sample provider and whose output is the degree of infection of the sample provider with the pathogen; and a calculation unit that inputs the evaluation data into the learning model and outputs the degree of infection of the evaluation subject with the pathogen.
  • Pathogens replicate genetic information such as DNA and RNA in the host's body, but produce various proteins for self-replication.
  • it is effective to detect the genetic information of the replicated pathogen as in the prior art.
  • It is necessary to confirm that the gene of the pathogen does not exist, but if the sample does not contain the pathogen for some reason, or contains it at a level below the detection sensitivity, the pathogen is mistakenly determined not to exist.
  • the number of proteins produced by pathogens and proteins produced by pathogens in host cells is overwhelmingly large compared to genetic information, and can be a feature indicating the presence of pathogens.
  • the relationship between the degree of infection with a pathogen and a protein is learned by machine learning, and inference is performed using the learning model obtained by this learning.
  • FIG. 1 is an overall configuration diagram of an infection determination system S for determining infection with a pathogen P.
  • the pathogen P is not particularly limited and may be a virus, a bacterium, or a fungus.
  • the infection determination system S includes an arithmetic unit 10, a pretreatment device 92, a mass spectrometer 93, and a gene analysis device 94.
  • The arithmetic unit 10 includes a calculation unit 1, a communication unit 2, and a storage unit 3.
  • the storage unit 3 is a non-volatile storage device such as a hard disk drive.
  • the storage unit 3 stores the teacher data 31, the learning model 32, the learning model program 32A, and the evaluation data 33.
  • The calculation unit 1 is, for example, a central processing unit (not shown), and performs the learning processing and inference processing described later by loading a program stored in a read-only memory (not shown) into a volatile memory (not shown) and executing it.
  • The calculation unit 1 may be realized by using a hardware circuit or a reconfigurable logic circuit, or may be realized by a combination thereof.
  • The calculation unit 1 performs learning processing and inference processing.
  • the learning process is the creation and updating of the learning model 32 using the teacher data 31.
  • the inference process is the evaluation of the evaluation data 33 using the learning model 32.
  • the communication unit 2 is a communication module capable of communicating with the mass spectrometer 93, the gene analysis device 94, and the examination terminal 95.
  • the communication unit 2 may be directly connected to the mass spectrometer 93, the gene analysis device 94, and the examination terminal 95, or may be connected via a communication network, for example, the Internet. Since the communication unit 2 acquires the evaluation data 33 from the mass spectrometer 93, the gene analysis device 94, and the examination terminal 95, it can also be called a “reading unit” that performs a process of reading the evaluation data 33 from the outside.
  • the teacher data 31 includes a plurality of sets of teacher protein information 311, teacher genome information 312, teacher medical examination information 313, and infection confirmation information 314.
  • The teacher protein information 311 is the identification information of each protein detected by the mass spectrometer 93 and the quantitative information of the detected protein, and is created by the mass spectrometer 93. Quantitative information on a protein is either the specific amount of the protein detected or the weight ratio of that protein to the total amount of all proteins detected.
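  • As an illustration of the weight-ratio form of the quantitative information, the following minimal sketch converts detected amounts into ratios of the total. The function name `weight_ratios`, the protein names, and the amounts are hypothetical and do not come from the specification.

```python
from typing import Dict


def weight_ratios(amounts: Dict[str, float]) -> Dict[str, float]:
    """Convert detected protein amounts into weight ratios of the total amount."""
    total = sum(amounts.values())
    if total <= 0:
        raise ValueError("total detected protein amount must be positive")
    return {name: amount / total for name, amount in amounts.items()}


# Hypothetical protein names and amounts (arbitrary weight units).
detected = {"protein_A": 2.0, "protein_B": 6.0, "protein_C": 2.0}
ratios = weight_ratios(detected)
```

  • The same structure would apply if moles or molecule counts were recorded instead of weights; only the unit of the input amounts changes.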
  • the teacher genome information 312 is information created by the gene analysis device 94 and indicating whether or not the sample contains the known genome information of the pathogen P.
  • Teacher consultation information 313 is various objective and subjective information such as body temperature, cough, headache, myalgia, and fatigue.
  • the infection confirmation information 314 is the presence or absence of infection with the pathogen P, the presence or absence of infectivity when infected, and the number of days until the infectivity is generated when there is no infectivity.
  • Each infection confirmation information 314 is associated with teacher protein information 311, teacher genomic information 312, and teacher consultation information 313.
  • The teacher data 31 is derived from the group 90 and from the teacher samples 911 collected from the group 90, as described later.
  • FIG. 2 is a diagram showing an example of teacher data 31.
  • the teacher protein information 311 describes the name of the protein and the weight ratio of the protein to the total amount of the protein. As described above, the weight may be described instead of the ratio, or the number of moles or the number of molecules may be described.
  • the presence or absence of the pathogen P is described in the teacher genomic information 312 under the assumption that the pathogen P has only DNA.
  • the teacher genomic information 312 is provided with a DNA column and an RNA column independently.
  • the teacher consultation information 313 of FIG. 2 describes body temperature, cough, and headache.
  • the infection confirmation information 314 of FIG. 2 describes that the person is infected and currently has no infectivity, but it is about 3 days before the infectivity develops.
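  • One possible in-memory layout for a single entry of the teacher data 31 is sketched below. The class and field names and all values are assumptions made for illustration; only the grouping into items 311 to 314 and the "infected, not yet infectious, about 3 days until infectivity" example follow the description of FIG. 2.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class InfectionConfirmation:
    """Infection confirmation information 314 (field names are hypothetical)."""
    infected: bool
    infectious: Optional[bool] = None
    days_until_infectious: Optional[int] = None


@dataclass
class TeacherRecord:
    """One entry of teacher data 31 (field names are hypothetical)."""
    protein_ratios: Dict[str, float]   # teacher protein information 311
    pathogen_dna_detected: bool        # teacher genome information 312
    consultation: Dict[str, object]    # teacher consultation information 313
    confirmation: InfectionConfirmation


# Illustrative values only; they are not taken from the specification.
record = TeacherRecord(
    protein_ratios={"protein_A": 0.12, "protein_B": 0.03},
    pathogen_dna_detected=False,
    consultation={"body_temperature_c": 36.8, "cough": False, "headache": True},
    confirmation=InfectionConfirmation(
        infected=True, infectious=False, days_until_infectious=3
    ),
)
```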
  • the learning model 32 includes one or more models such as a predictive model, a regression model, a stochastic model, an artificial neural network, and a long short-term memory (LSTM) neural network.
  • regression models include decision trees, classifiers, linear regression models, and so on.
  • probabilistic models include support vector machines, Markov models and hidden Markov models.
  • Artificial neural networks include recurrent neural networks and the like.
  • the learning model 32 may include a model for determining the presence or absence of infection with the pathogen P and a model for determining the presence or absence of infectivity for the pathogen P.
  • the learning model 32 is created and updated using the teacher data 31. That is, the learning model 32 also includes various parameters required for the calculation.
  • the learning model 32 is a learning model in which learning is performed by inputting teacher protein information 311, teacher genome information 312, and teacher examination information 313 and outputting infection confirmation information 314 in the learning phase. Further, in the inference phase, the learning model 32 infers the degree of infection by inputting the evaluation protein information 331, the evaluation genome information 332, and the evaluation consultation information 333 described below.
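  • The learning phase and the inference phase can be sketched as follows. The specification does not prescribe a particular algorithm, so a nearest-centroid rule is used here purely as a stand-in for the learning model 32; the feature names and values are hypothetical, and the features are left unscaled for brevity.

```python
from typing import Dict, List, Tuple

Features = Dict[str, float]


def train(teacher_data: List[Tuple[Features, bool]]) -> Dict[bool, Features]:
    """Learning phase: average the feature vectors of each label (centroids)."""
    sums: Dict[bool, Features] = {}
    counts: Dict[bool, int] = {}
    for features, infected in teacher_data:
        acc = sums.setdefault(infected, {})
        for name, value in features.items():
            acc[name] = acc.get(name, 0.0) + value
        counts[infected] = counts.get(infected, 0) + 1
    return {
        label: {name: value / counts[label] for name, value in acc.items()}
        for label, acc in sums.items()
    }


def infer(model: Dict[bool, Features], features: Features) -> bool:
    """Inference phase: return the label whose centroid is nearest."""
    def squared_distance(centroid: Features) -> float:
        names = set(centroid) | set(features)
        return sum((centroid.get(n, 0.0) - features.get(n, 0.0)) ** 2 for n in names)

    return min(model, key=lambda label: squared_distance(model[label]))


# Hypothetical teacher data: (protein, genome, consultation features) -> infected?
teacher_data = [
    ({"protein_A": 0.30, "protein_B": 0.05, "pathogen_genome": 1.0, "body_temp": 38.2}, True),
    ({"protein_A": 0.28, "protein_B": 0.07, "pathogen_genome": 0.0, "body_temp": 37.9}, True),
    ({"protein_A": 0.05, "protein_B": 0.25, "pathogen_genome": 0.0, "body_temp": 36.5}, False),
    ({"protein_A": 0.04, "protein_B": 0.27, "pathogen_genome": 0.0, "body_temp": 36.7}, False),
]
model = train(teacher_data)
result = infer(
    model,
    {"protein_A": 0.29, "protein_B": 0.06, "pathogen_genome": 0.0, "body_temp": 38.0},
)
```

  • In this toy data the evaluation record is classified as infected even though the pathogen genome was not detected, because the protein and body-temperature features lie close to the infected centroid.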
  • The learning model program 32A is a program created for the convenience of executing inference processing using the learning model 32, and is executed by the calculation unit 1.
  • The learning model 32 is premised on being updated, and in order to execute the inference process, it must be converted into a format suitable for execution in the calculation unit 1.
  • The learning model program 32A is the learning model 32 extracted and fixed in a format suitable for this execution.
  • The learning model program 32A may be optimized according to the hardware configuration of the calculation unit 1.
  • the evaluation data 33 is data to be inferred using the learning model 32.
  • the evaluation data 33 includes the evaluation protein information 331, the evaluation genome information 332, and the evaluation consultation information 333.
  • the evaluation protein information 331 is protein information obtained by processing the evaluation sample 912 collected from the evaluation subject 90A, and the type of information is the same as that of the teacher protein information 311.
  • The evaluation genome information 332 is information indicating whether or not the genomic information of the pathogen P is included, obtained by processing the evaluation sample 912 collected from the evaluation subject 90A, and the type of information is the same as that of the teacher genome information 312.
  • the evaluation medical examination information 333 is information obtained by a medical examination for the evaluation target person 90A, and the type of information is the same as that of the teacher medical examination information 313.
  • The difference is that the teacher data 31 is mainly generated using the teacher samples 911 collected from the group 90, whereas the evaluation data 33 is generated using the evaluation sample 912 collected from the evaluation subject 90A. Further, the information corresponding to the infection confirmation information 314 of the teacher data 31 is not included in the evaluation data 33. The information corresponding to the infection confirmation information 314 in the evaluation data 33 is output by inference using the learning model 32.
  • the group 90 is a group of humans of various genders and ages, including a person infected with the pathogen P targeted by the present embodiment.
  • Each person belonging to the population 90 has a teacher sample 911 taken from his or her body over a predetermined period, e.g., 3 months, at a predetermined cycle, e.g., every day.
  • the teacher specimen 911 is nasal mucosal fluid, pharyngeal mucosal fluid, saliva, exhaled breath condensate, and blood. It is sufficient that at least one person belonging to the population 90 is infected with the pathogen P during at least a part of the above-mentioned predetermined period.
  • the mucous membrane is, for example, the mucous membrane of the eyes, ears, nose, mouth and the like.
  • The group 90 can also be referred to as a "sample provider" in the sense that it provides the samples used for training the learning model 32.
  • the evaluation target 90A is a human whose infection with the pathogen P is unknown.
  • the evaluation sample 912 is collected at least once.
  • the evaluation sample 912 is nasal mucosal fluid, pharyngeal mucosal fluid, saliva, exhaled breath condensate, and blood.
  • the teacher sample 911 and the evaluation sample 912 are collectively referred to as a "sample" 91.
  • the pretreatment device 92 is a device that performs pretreatment for detecting proteins, DNA, and RNA contained in the sample 91. Since the content of processing by the pretreatment device 92 is known, it will not be described in detail in this specification.
  • the pretreatment device 92 may be fully automatic, or may be partially or wholly operated manually.
  • the mass spectrometer 93 is a device that measures the name and content ratio of the protein contained in the sample 91 by a known method.
  • the mass spectrometer 93 may be, for example, either a liquid chromatograph mass spectrometer or a gas chromatograph mass spectrometer, or may include both.
  • the mass spectrometer 93 transmits the measurement result to the storage unit 3 as teacher protein information 311 or evaluation protein information 331.
  • the mass spectrometer 93 stores the measurement result as the teacher protein information 311 when the sample 91 is the teacher sample 911, and stores the measurement result as the evaluation protein information 331 when the sample 91 is the evaluation sample 912.
  • the mass spectrometer 93 may be fully automated, or may be partially or wholly operated manually.
  • the gene analysis device 94 analyzes the gene contained in the sample 91 by using one or more known measurement methods such as a nucleic acid amplification method, a Q probe method, and a Tm analysis method.
  • the gene analysis device 94 transmits information indicating whether or not the DNA or RNA of the pathogen P is detected from the sample 91 to the storage unit 3 as the teacher genomic information 312 or the evaluation genomic information 332.
  • the gene analysis apparatus 94 stores the measurement result as the teacher genome information 312 when the sample 91 is the teacher sample 911, and stores the measurement result as the evaluation genome information 332 when the sample 91 is the evaluation sample 912.
  • the gene analyzer 94 may be fully automatic, or may be partially or wholly operated manually.
  • the medical examination terminal 95 is a computer operated by the expert 96, and has a human interface and a communication module for communicating with the storage unit 3.
  • the medical examination terminal 95 is, for example, a general-purpose personal computer or a smartphone.
  • The expert 96 examines each member of the group 90 and creates the teacher examination information 313. However, the teacher consultation information 313 can be added to or modified later. Details will be described later.
  • teacher specimens 911 such as nasal mucosal fluid, pharyngeal mucosal fluid, saliva, exhaled condensate, and blood are collected from each of the population 90.
  • the teacher sample 911 is sent to the mass spectrometer 93 and the gene analyzer 94 after being processed by the pretreatment device 92. Specifically, the sample obtained by the known protein extraction operation is sent to the mass spectrometer 93, and the sample obtained by the known RNA or DNA extraction operation is sent to the gene analyzer 94.
  • the mass spectrometer 93 writes the name of the detected protein and the weight ratio of the detected protein in the storage unit 3 as teacher protein information 311.
  • the gene analysis apparatus 94 writes in the storage unit 3 whether or not the teacher sample 911 contains the DNA or RNA of the pathogen P as the teacher genomic information 312.
  • Each person belonging to the group 90 receives a medical examination by an expert 96 at the same timing as the collection of the sample 91.
  • In the medical examination, the expert 96 confirms various conditions, including the known symptoms of infection with the pathogen P, and creates the teacher medical examination information 313 and the infection confirmation information 314 in the storage unit 3 using the medical examination terminal 95.
  • the infection confirmation information 314 is also added or corrected after the fact. Subsequent additions and corrections of the infection confirmation information 314 will be described with specific examples.
  • When a symptom observed upon infection with the pathogen P (hereinafter referred to as an "infectious symptom") is first confirmed for an individual belonging to the group 90 by a medical examination by the expert 96, the following processing is performed. That is, the infection confirmation information 314 up to the previous day is amended to record that the individual was infected. Further, the number of days until the onset of infectivity is estimated from a measurement of infectivity or from the known characteristics of infection with the pathogen P, and this number of days is added to the infection confirmation information 314 up to the previous day.
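  • The after-the-fact correction can be sketched as a backfill over the daily labels of one individual. How far back the labels are amended is not stated in the text, so the `lookback_days` parameter, like the function name and the dictionary keys, is an assumption made for this sketch.

```python
from typing import Dict, List


def backfill_confirmation(daily: List[Dict[str, object]],
                          symptom_day: int,
                          infectivity_onset_day: int,
                          lookback_days: int) -> None:
    """Amend, in place, the labels of the days preceding the first symptom.

    daily[i] holds the infection confirmation information for day i.
    lookback_days is a hypothetical bound on how far back labels are amended.
    """
    first = max(0, symptom_day - lookback_days)
    for day in range(first, symptom_day):
        label = daily[day]
        label["infected"] = True
        if day < infectivity_onset_day:
            # Infected but not yet infectious: record the remaining days.
            label["infectious"] = False
            label["days_until_infectious"] = infectivity_onset_day - day
        else:
            label["infectious"] = True
            label["days_until_infectious"] = None


# Ten days of labels, initially "not infected"; symptoms first seen on day 7.
daily = [
    {"infected": False, "infectious": None, "days_until_infectious": None}
    for _ in range(10)
]
backfill_confirmation(daily, symptom_day=7, infectivity_onset_day=5, lookback_days=4)
```

  • With these illustrative numbers, days 3 to 6 are relabeled as infected, days 3 and 4 carry the remaining days until infectivity, and the labels from day 7 onward are left for the examination on that day to set.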
  • Teacher data 31 is created by the above processing.
  • In this way, one teacher data 31 is created each time one teacher sample 911 is collected from one person. Therefore, if teacher samples 911 are collected from 100 people every day for 30 days, a total of 3,000 teacher data 31 will be created.
  • the method of creating the evaluation data 33 is substantially the same as the method of creating the teacher data 31. However, since the evaluation data 33 does not include the information corresponding to the infection confirmation information 314 of the teacher data 31, no ex post facto correction is necessary as in the teacher data 31.
  • The method for creating the learning model 32 for diagnosing the degree of infection with the pathogen P, which is executed in the infection determination system S, acquires a plurality of teacher data 31, each including the teacher protein information 311, which is information on proteins obtained from the teacher samples 911 collected from the group 90, that is, a plurality of sample providers, and the infection confirmation information 314, which is the degree of infection of the sample provider with the pathogen, and uses the teacher data 31 to generate the learning model 32, which takes the protein information as input and outputs the degree of infection with the pathogen. Therefore, it is possible to create a learning model 32 capable of more sensitive detection than in the case of detecting the genomic information of the pathogen P itself. Specifically, it is as follows.
  • The detection is impossible unless the collected sample 91 contains the genomic information of the pathogen P. Therefore, if there is a problem in the collection method of the sample 91 and the sample is collected from a place where the pathogen P does not exist, or if the pathogen P is not collected by chance, the genomic information of the pathogen P is not detected and it may be erroneously judged that there is no infection.
  • The learning model 32 to be created can detect infection with the pathogen P with higher sensitivity than the conventional method.
  • the learning model 32 further inputs the teacher examination information 313. Therefore, non-protein information such as body temperature, cough, headache, myalgia, and fatigue can be included in the learning and reasoning elements.
  • the learning model 32 further inputs the teacher genomic information 312 regarding the pathogen P. Therefore, the learning model 32 created can more reliably detect infection with the pathogen P.
  • the degree of infection is the presence or absence of infection with a pathogen.
  • the degree of infection is the presence or absence of infection with a pathogen and the presence or absence of infectivity.
  • the degree of infection is the presence or absence of infection with a pathogen, the presence or absence of infectivity, and the number of days remaining until the person has infectivity.
  • The learning model program 32A causes the arithmetic unit 10 to execute a process of acquiring, from the storage unit 3, the evaluation data 33, which is information on the proteins of the sample 91 collected from the evaluation target person 90A, inputting the acquired evaluation data 33 into the learning model 32 trained using the teacher data 31, whose input is the information on proteins obtained from the teacher samples 911 collected from the sample providers, that is, the group 90, and whose output is the degree of infection of the group 90 with the pathogen P, and outputting the degree of infection of the evaluation target person 90A with the pathogen P.
  • The arithmetic unit 10 includes: a reading unit, that is, the communication unit 2, that reads the evaluation data 33, which is information on the proteins of the evaluation sample 912 collected from the evaluation target person 90A; a storage unit 3 that stores the learning model 32 trained using the teacher data 31, whose input is the teacher protein information 311, which is information on proteins obtained from the teacher samples 911 collected from the group 90, and whose output is the infection confirmation information 314, which is the degree of infection of the group 90 with the pathogen; and a calculation unit 1 that inputs the evaluation data 33 into the learning model 32 and outputs the degree of infection of the evaluation target person 90A with the pathogen.
  • the evaluation target 90A does not have to be examined by the expert 96.
  • the evaluation data 33 does not include the evaluation consultation information 333.
  • the group 90 does not have to be examined by the expert 96, and the teacher data 31 does not include the teacher examination information 313.
  • the infection determination system S does not have to include the gene analysis device 94.
  • the teacher data 31 does not include the teacher genome information 312, and the evaluation data 33 does not include the evaluation genome information 332.
  • the infection confirmation information 314, that is, the degree of infection includes not only the presence or absence of infection but also the presence or absence of infectivity and the number of days until the occurrence of infectivity.
  • the infection confirmation information 314, that is, the degree of infection may be the presence or absence of infection and the presence or absence of infectivity, or may be only the presence or absence of infection. The severity of the symptoms may be added to the degree of infection.
  • the sample 91 contained nasal mucosal fluid, pharyngeal mucosal fluid, saliva, exhaled breath condensate, and blood.
  • the sample 91 may contain at least one of nasal mucosal fluid, pharyngeal mucosal fluid, saliva, exhaled breath condensate, and blood.
  • the teacher sample 911 contains two or more types of nasal mucosal fluid, pharyngeal mucosal fluid, blood, exhaled condensate, and saliva, and is treated as information on proteins different for each type of sample 91. Therefore, the learning model 32 can increase the types of information used in the calculation and improve the accuracy of inference.
  • The learning model 32 may be created for each type of the sample 91, and inference may be performed for each type of the sample 91.
  • Five learning models 32, one for each of nasal mucosal fluid, pharyngeal mucosal fluid, saliva, exhaled breath condensate, and blood, are created, and learning and inference are performed on each of them.
  • An inference result is also obtained for each type of sample 91.
  • a human can infer the state of infection by referring to the inference result for each type of the sample 91. Therefore, it becomes easier to give meaning to the diagnosis result as compared with the case where the inference is performed only by the learning model 32. For example, if it is inferred from the exhaled breath condensate of the evaluation subject 90A that there is an infection, it can be inferred that the infection has spread to the lungs and the condition is serious.
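  • A small sketch of how per-sample-type inference results might be presented for human interpretation follows. The per-type results are hypothetical stand-ins for the outputs of separately trained learning models, and the function and variable names are assumptions.

```python
from typing import Dict, List

SAMPLE_TYPES = [
    "nasal mucosal fluid",
    "pharyngeal mucosal fluid",
    "saliva",
    "exhaled breath condensate",
    "blood",
]


def summarize(per_type_result: Dict[str, bool]) -> List[str]:
    """List, in a fixed order, the sample types for which infection was inferred."""
    return [t for t in SAMPLE_TYPES if per_type_result.get(t, False)]


# Hypothetical outputs of the per-sample-type learning models.
results = {
    "nasal mucosal fluid": True,
    "pharyngeal mucosal fluid": True,
    "saliva": False,
    "exhaled breath condensate": True,
    "blood": False,
}
positive = summarize(results)
# A positive exhaled breath condensate result is the cue mentioned in the text
# for suspecting that the infection has reached the lungs.
lungs_suspected = "exhaled breath condensate" in positive
```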
  • the population 90 has a teacher sample 911 collected at predetermined cycles over a predetermined period.
  • the number of times the teacher sample 911 is collected from each individual belonging to the group 90 may be one.
  • the progress after collection is observed, and the infection confirmation information 314 is added or corrected as necessary.
  • the population 90 is a human population.
  • the population 90 may be non-human organisms such as cows, pigs, chickens, fish, insects and the like.
  • a learning model 32 corresponding to each of the plurality of pathogens P may be created, and the degree of infection to the plurality of pathogens P may be estimated at the same time in the inference phase.
  • FIG. 3 is a configuration diagram of the infection determination system S1 in the second embodiment.
  • the infection determination system S1 is different from the first embodiment in that the learning device 10A and the inference device 10B are provided instead of the arithmetic unit 10.
  • the learning device 10A includes a learning calculation unit 1A, a learning communication unit 2A, and a learning storage unit 3A.
  • the teacher data 31 and the learning model 32 are stored in the learning storage unit 3A.
  • the learning calculation unit 1A is, for example, a central processing unit (not shown), and performs learning processing by expanding a program stored in a read-only memory (not shown) into a volatile memory (not shown) and executing the program.
  • the learning calculation unit 1A may be realized by using a hardware circuit or a reconfigurable logic circuit, or may be realized by a combination thereof.
  • the learning communication unit 2A is a communication module capable of communicating with the mass spectrometer 93, the gene analysis device 94, the examination terminal 95, and the inference device 10B.
  • the learning communication unit 2A may be directly connected to the mass spectrometer 93, the gene analysis device 94, the examination terminal 95, and the inference device 10B, or may be connected via a communication network, for example, the Internet.
  • the learning storage unit 3A is a non-volatile storage device such as a hard disk drive.
  • the inference device 10B includes an inference calculation unit 1B, an inference communication unit 2B, and an inference storage unit 3B.
  • the learning model program 32A and the evaluation data 33 are stored in the inference storage unit 3B.
  • the inference calculation unit 1B is, for example, a central processing unit (not shown), and performs inference processing by expanding and executing the learning model program 32A in a volatile memory (not shown).
  • the inference calculation unit 1B may be realized by using a hardware circuit or a reconfigurable logic circuit, or may be realized by a combination thereof. That is, the learning model program 32A may be a computer program read by the processor or circuit information for reconstructing a logic circuit.
  • the inference communication unit 2B is a communication module capable of communicating with the mass spectrometer 93, the gene analysis device 94, the examination terminal 95, and the learning device 10A.
  • the inference communication unit 2B may be directly connected to the mass spectrometer 93, the gene analysis device 94, the examination terminal 95, and the learning device 10A, or may be connected via a communication network, for example, the Internet. Since the inference communication unit 2B acquires the evaluation data 33 from the mass spectrometer 93, the gene analysis device 94, and the examination terminal 95, it can also be called a “reading unit” that performs a process of reading the evaluation data 33 from the outside.
  • the inference storage unit 3B is a non-volatile storage device such as a hard disk drive.
  • The mass spectrometer 93 and the gene analysis device 94 output the processing result to the learning device 10A when the teacher sample 911 is the processing target, and output the processing result to the inference device 10B when the evaluation sample 912 is the processing target.
  • the examination terminal 95 outputs the result to the learning device 10A when the group 90 is examined, and outputs the result to the inference device 10B when the evaluation target person 90A is examined.
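  • The routing rule of the second embodiment can be sketched as below. The `Device` class, the `route` function, and the `"teacher"` / `"evaluation"` tags are hypothetical; the sketch only shows that results for teacher samples go to the learning device 10A and results for evaluation samples go to the inference device 10B.

```python
from typing import Dict, List


class Device:
    """Stand-in for the learning device 10A or the inference device 10B."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.received: List[Dict[str, object]] = []

    def receive(self, result: Dict[str, object]) -> None:
        self.received.append(result)


def route(result: Dict[str, object], sample_kind: str,
          learning_device: Device, inference_device: Device) -> Device:
    """Send a result to 10A for teacher samples and to 10B for evaluation samples."""
    if sample_kind == "teacher":
        target = learning_device
    elif sample_kind == "evaluation":
        target = inference_device
    else:
        raise ValueError(f"unknown sample kind: {sample_kind!r}")
    target.receive(result)
    return target


learning = Device("10A")
inference = Device("10B")
route({"protein_A": 0.12}, "teacher", learning, inference)
route({"protein_A": 0.30}, "evaluation", learning, inference)
```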
  • the learning device 10A creates a learning model program 32A and transmits it to the inference device 10B each time the learning model 32 is created and updated.
  • the inference device 10B executes the learning model program 32A to infer the evaluation data 33.
  • learning and inference can be performed by separate devices.
  • the learning model program 32A is transmitted from the learning device 10A to the inference device 10B via the communication line. That is, in the second embodiment, the learning model program 32A is transmitted using a communication medium, that is, a network such as wired, wireless, or optical, or a carrier wave or a digital signal propagating in the network.
  • The learning device 10A and the inference device 10B may be provided with input/output interfaces (not shown), and the learning model program 32A may be transferred via a recording medium usable with those interfaces, such as a USB memory, a hard disk drive, or a CD-R.
  • the learning device 10A creates a learning model program 32A and transmits it to the inference device 10B.
  • The learning device 10A may instead transmit the learning model 32 as it is to the inference device 10B, and the inference device 10B may create the learning model program 32A from the received learning model 32.
  • A third embodiment of the method of generating a learning model will be described with reference to FIG. 4.
  • the same components as those in the first embodiment are designated by the same reference numerals, and the differences will be mainly described.
  • the points not particularly described are the same as those in the first embodiment.
  • This embodiment differs from the first embodiment mainly in the evaluation sample 912.
  • FIG. 4 is a configuration diagram of the infection determination system S2 according to the third embodiment.
  • the learning model 32 created by the process described in the first embodiment is applied to the environment E, and the presence of the pathogen in the environment E is evaluated.
  • the environment E may be a closed space or an open space.
  • The environment E is, for example, a lobby of a building, an office, a hotel room, a living room at home, a vehicle, a train, a passenger plane, a gymnasium, a competition space of a stadium, a viewing space of a stadium, or a street.
  • the evaluation sample 912 can be extracted from the environment E by using various methods.
  • the liquid obtained by repeatedly passing the air in the environment E through a filter having a fine mesh and bringing a predetermined liquid into contact with the filter can be used as the evaluation sample 912.
  • the condensed liquid obtained by compressing the air in the environment E may be used as the evaluation sample 912, or the air in the environment E may be pumped into the liquid for a predetermined time to use the liquid as the evaluation sample 912.
  • the processing method of the acquired evaluation sample 912 is the same as that of the first embodiment.
  • For example, the evaluation sample 912 is extracted from the air in an office, the evaluation data 33 is created by processing with the pretreatment device 92 and the like, and the presence or absence of the pathogen P in the office can be estimated using the arithmetic unit 10. Since the presence or absence of the pathogen P can thus be determined for any environment E, this can help prevent infection with the pathogen P among an unspecified number of people.
  • When the pathogen P is detected in the air of the environment E, each person present in the environment E may be treated as an evaluation target person 90A, and the evaluation sample 912 described in the first embodiment may be obtained from each person and evaluated.
  • The inference in the third embodiment may be executed using a learning model 32 generated as follows. The teacher data 31 does not include the teacher examination information 313 and includes only the teacher protein information 311, the teacher genome information 312, and the infection confirmation information 314. For example, a liquid obtained by treating the air in a room occupied by a person who is infected with the pathogen P and is infectious, that is, a sample 91 derived from a person infected with the pathogen, is used as a teacher sample 911 in which the pathogen P is present. Likewise, a liquid obtained by treating the air in a room occupied by a person who is not infected with the pathogen P is used as a teacher sample 911 in which the pathogen P is absent.
  • the learning model 32 in this modification may estimate the presence or absence of the pathogen P in the evaluation sample 912, or may estimate the certainty of the presence of the pathogen P.
  • A plurality of teacher data 31 including protein information obtained from a sample derived from a person infected with a pathogen are acquired, and the teacher data 31 are used to generate a learning model 32 that takes the protein information as input and outputs the presence or absence of the pathogen P.
  • A learning model 32 may be created for a plurality of pathogens P so that the type of pathogen P contained in a sample can be inferred. That is, a plurality of teacher data 31 including protein information obtained from a sample 91 and the type of the pathogen P in that sample are acquired, and the teacher data 31 are used to generate a learning model 32 that takes the protein information as input and outputs the type of pathogen contained in the sample. In this case, in the inference phase, the type of pathogen P contained in the sample 91 is output by inputting, into the learning model 32, the protein information obtained by processing the sample 91 acquired from the environment E.
  • A fourth embodiment of the method of generating a learning model will be described with reference to FIGS. 5 to 8.
  • the same components as those in the first embodiment are designated by the same reference numerals, and the differences will be mainly described.
  • the points not particularly described are the same as those in the first embodiment.
  • This embodiment differs from the first embodiment mainly in that ions described later are included in the teacher data.
  • exhaled breath collector manufactured by GL Sciences Co., Ltd.
  • exhaled breath recovery device manufactured by GL Sciences Co., Ltd.
  • The exhaled breath collection device was cooled to -20 degrees Celsius in advance, and the subject breathed through the device (mouthpiece type or mask type) for 5 to 10 minutes so that the exhaled aerosol was rapidly cooled, and about 1 ml of exhaled breath condensate was recovered.
  • HPE-IAM: β-(4-hydroxyphenyl)ethyl iodoacetamide
  • FIG. 5 shows the quantitative analysis results for these three ions in 12 healthy subjects and 12 subjects infected with the novel coronavirus.
  • The concentrations of the various sulfur metabolites contained in the exhaled breath condensate were generally lower than those in the living body. This is considered to be because sulfur metabolites in the living body are diluted in the exhaled breath and because the efficiency of exhaled breath recovery is not always good.
  • It was confirmed that the levels of sulfite ion (HSO₃⁻), thiosulfate ion (HS₂O₃⁻), and hydrogen sulfide ion (HS₂⁻) are significantly higher in moderately ill patients than in healthy subjects.
  • FIG. 6 shows the measurement results of sulfite ion (HSO₃⁻) and thiosulfate ion (HS₂O₃⁻) for a patient X whose condition became severe. The production level of thiosulfate ion (HS₂O₃⁻) was significantly higher than in healthy subjects in both moderate and severe cases. Patient X showed a marked increase in sulfite ion (HSO₃⁻) in specimens collected before the condition became severe, whereas moderately ill patients who did not become severe showed no such increase. Sulfite ion (HSO₃⁻) is therefore a promising biomarker for assessing the risk of progression to severe pneumonia in coronavirus infection.
  • Active sulfur has antioxidant activity; that is, it is oxidized by oxidative stress, reactive oxygen species, and the like.
  • It can therefore be said that the aggravation of pneumonia, that is, the exacerbation of oxidative stress, shifts the profile of sulfur metabolites toward oxidation.
  • FIG. 7 is a configuration diagram of the arithmetic unit 10C according to the fourth embodiment.
  • the teacher data 31 further includes teacher ion information 315.
  • the evaluation data 33 further includes evaluation ion information 335.
  • The teacher ion information 315 is the identification information and quantitative information of detected ions, including at least one of sulfite ion (HSO₃⁻), thiosulfate ion (HS₂O₃⁻), and hydrogen sulfide ion (HS₂⁻), and is produced by the mass spectrometer 93.
  • the quantitative information of ions is, for example, the concentration of each ion.
  • the evaluation ion information 335 is ion information obtained by processing the evaluation sample 912 collected from the evaluation target person 90A, and the type of information is the same as that of the teacher ion information 315.
  • FIG. 8 is a diagram showing an example of teacher data 31 in the present embodiment. The difference from the teacher data 31 shown in FIG. 2 in the first embodiment is that the teacher ion information 315 is added.
  • The teacher ion information 315 and the evaluation ion information 335 are obtained by the above-mentioned method using the exhaled breath collector, the exhaled breath recovery device, and the mass spectrometer described in the present embodiment.
  • The teacher protein information 311, the teacher genome information 312, the evaluation protein information 331, and the evaluation genome information 332 may be obtained through the same processing as the teacher ion information 315 and the evaluation ion information 335, or through different processing, that is, the processing of the first embodiment.
  • The learning model 32 in the present embodiment is trained with the teacher ion information 315 as an input in addition to the teacher protein information 311, the teacher genome information 312, and the teacher examination information 313, and with the infection confirmation information 314 as the output. In the inference phase, the learning model 32 infers the degree of infection from the evaluation ion information 335 in addition to the evaluation protein information 331, the evaluation genome information 332, and the evaluation examination information 333.
  • The learning model program 32A, which is created for the convenience of executing inference processing with the learning model 32, likewise infers the degree of infection from the evaluation ion information 335 in addition to the evaluation protein information 331, the evaluation genome information 332, and the evaluation examination information 333, as the learning model 32 does in the inference phase.
  • The teacher data includes, as ion information, information on at least one of sulfite ion, thiosulfate ion, and hydrogen sulfide ion obtained from the teacher sample 911.
  • The learning model 32 takes the protein information and the ion information as inputs and outputs the degree of infection with the pathogen. A learning model 32 can therefore be generated using at least one of sulfite ion, thiosulfate ion, and hydrogen sulfide ion, which are promising biomarkers, and the degree of infection with the pathogen in the evaluation sample 912 can be inferred accurately.
  • the configuration of the functional block is only an example.
  • Several functional configurations shown as separate functional blocks may be integrally configured, or the configuration represented by one functional block diagram may be divided into two or more functions. Further, a configuration in which a part of the functions of each functional block is provided in another functional block may be provided.
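As an illustration of the fourth embodiment's input described above, the following minimal Python sketch appends ion information to a protein feature vector. The names `ION_ORDER`, `ion_features`, and `combined_features`, the fixed ion ordering, and the "measured" flag for ions that were not quantified are all assumptions made for this example; the source only states that at least one of the three ions is included.

```python
# Hypothetical fixed ordering of the three ions named in the fourth embodiment.
ION_ORDER = ("sulfite", "thiosulfate", "hydrogen_sulfide")

def ion_features(ion_concentration):
    """Turn a {ion name: concentration} mapping into a fixed-length vector.

    Each ion contributes two values: its concentration (0.0 if it was not
    measured) and a flag that is 1.0 only when it was measured. The flag is
    an assumption of this sketch, not something the source specifies.
    """
    if not any(name in ion_concentration for name in ION_ORDER):
        raise ValueError("at least one of the three ions is required")
    vec = []
    for name in ION_ORDER:
        measured = name in ion_concentration
        vec.append(float(ion_concentration.get(name, 0.0)))
        vec.append(1.0 if measured else 0.0)
    return vec

def combined_features(protein_vec, ion_concentration):
    """Concatenate protein features with ion features (input of model 32)."""
    return list(protein_vec) + ion_features(ion_concentration)

# Example with made-up numbers: two protein ratios and one measured ion.
example = combined_features([0.2, 0.8], {"sulfite": 1.5})
```

Using one fixed ordering for both teacher ion information 315 and evaluation ion information 335 keeps the training and inference inputs aligned.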

Landscapes

  • Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • Chemical & Material Sciences (AREA)
  • Urology & Nephrology (AREA)
  • Biomedical Technology (AREA)
  • General Physics & Mathematics (AREA)
  • Hematology (AREA)
  • Immunology (AREA)
  • Theoretical Computer Science (AREA)
  • Molecular Biology (AREA)
  • Medical Informatics (AREA)
  • General Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Food Science & Technology (AREA)
  • Biotechnology (AREA)
  • Pathology (AREA)
  • Medicinal Chemistry (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Analytical Chemistry (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Biochemistry (AREA)
  • Epidemiology (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Mathematical Physics (AREA)
  • Cell Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Computing Systems (AREA)
  • Microbiology (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Public Health (AREA)
  • Databases & Information Systems (AREA)
  • Biophysics (AREA)
  • Bioethics (AREA)

Abstract

This learning model generation method is for creating a learning model for making a diagnosis of a degree of infection by a pathogen. The learning model generation method involves: acquiring a plurality of teaching data items that each include information of a protein obtained from a sample collected from a sample provider and a degree of infection at which the sample provider is infected with the pathogen; and generating, by using the teaching data items, a learning model to which the information of the protein is to be inputted and from which the degree of infection by the pathogen is outputted.

Description

Learning model generation method, program, and arithmetic unit
The present invention relates to a learning model generation method, a program, and an arithmetic unit.
There is growing demand for methods that determine with high accuracy whether a subject is infected with a pathogen. Patent Document 1 discloses a method for diagnosing a microbial infection in an organism, in which the microbial infection is at least partially present as elementary bodies in the cell material of the organism. The method includes a step of obtaining a sample of cell material from the organism; a step of processing the sample to obtain a test composition enriched in elementary bodies, insofar as the sample contains elementary bodies; and a step of subjecting a volume of the test composition containing at most a predetermined maximum number of elementary bodies to mass spectrometry to identify the presence of the microbial infection.
Patent Document 1: Japanese Translation of PCT Application Publication (Tokuhyo) No. 2020-510829
In the invention described in Patent Document 1, the pathogen itself is the detection target, so there is room for improvement in the accuracy of detecting infection with the pathogen.
A learning model generation method according to a first aspect of the present invention is a method for creating a learning model for diagnosing the degree of infection with a pathogen. The method acquires a plurality of teacher data items, each including information on proteins obtained from a sample collected from a sample provider and the degree of infection of the sample provider with the pathogen, and uses the teacher data to generate a learning model that takes the protein information as input and outputs the degree of infection with the pathogen.
A program according to a second aspect of the present invention causes a computer to execute processing that acquires evaluation data, which is information on proteins in a sample collected from an evaluation target person; inputs the acquired evaluation data into a learning model trained using teacher data in which information on proteins obtained from a sample collected from a sample provider is the input and the degree of infection of the sample provider with the pathogen is the output; and outputs the degree of infection of the evaluation target person with the pathogen.
An arithmetic unit according to a third aspect of the present invention includes a reading unit that reads evaluation data, which is information on proteins in a sample collected from an evaluation target person; a storage unit that stores a learning model trained using teacher data in which information on proteins obtained from a sample collected from a sample provider is the input and the degree of infection of the sample provider with the pathogen is the output; and a calculation unit that inputs the evaluation data into the learning model and outputs the degree of infection of the evaluation target person with the pathogen.
According to the present invention, a learning model that detects infection with a pathogen with high accuracy can be created.
FIG. 1: Overall configuration diagram of the infection determination system in the first embodiment
FIG. 2: Diagram showing an example of teacher data in the first embodiment
FIG. 3: Overall configuration diagram of the infection determination system in the second embodiment
FIG. 4: Overall configuration diagram of the infection determination system in the third embodiment
FIG. 5: Diagram showing analysis results in the fourth embodiment
FIG. 6: Diagram showing measurement results in the fourth embodiment
FIG. 7: Configuration diagram of the arithmetic unit in the fourth embodiment
FIG. 8: Diagram showing an example of teacher data in the fourth embodiment
This specification focuses on the presence of proteins when determining infection with a pathogen. A pathogen replicates genetic information such as DNA or RNA inside the host, and it produces various proteins in order to replicate itself. To confirm that a pathogen is present, detecting the replicated genetic information of the pathogen, as in the conventional art, is effective. To confirm that a pathogen is absent, however, it is necessary to confirm that the pathogen's genes are absent, so if the sample for some reason contains no pathogen, or contains the pathogen only below the detection sensitivity, the pathogen is wrongly judged to be absent. In contrast, the proteins produced by the pathogen and the proteins that the pathogen causes host cells to produce are far more numerous than the genetic information and can serve as features that indicate the presence of the pathogen.
However, the variety of proteins is enormous and proteins change readily under other influences, so it is very difficult for a human to grasp their relationship to a pathogen. This specification therefore learns the relationship between the degree of infection with a pathogen and proteins by machine learning, and performs inference using the learning model obtained by that learning.
-First embodiment-
Hereinafter, a first embodiment of the learning model generation method will be described with reference to FIGS. 1 and 2.
FIG. 1 is an overall configuration diagram of an infection determination system S that determines infection with a pathogen P. The pathogen P is not particularly limited and may be any of a virus, a bacterium, and a fungus. The infection determination system S includes an arithmetic unit 10, a pretreatment device 92, a mass spectrometer 93, and a gene analysis device 94. The arithmetic unit 10 includes a calculation unit 1, a communication unit 2, and a storage unit 3. The storage unit 3 is a non-volatile storage device such as a hard disk drive. The storage unit 3 stores teacher data 31, a learning model 32, a learning model program 32A, and evaluation data 33.
The calculation unit 1 is, for example, a central processing unit (not shown) that performs the learning processing and inference processing described later by loading a program stored in a read-only memory (not shown) into a volatile memory (not shown) and executing it. The calculation unit 1 may instead be realized with a hardware circuit or a reconfigurable logic circuit, or with a combination of these. The calculation unit 1 performs learning processing and inference processing. The learning processing is the creation and updating of the learning model 32 using the teacher data 31. The inference processing is the evaluation of the evaluation data 33 using the learning model 32. These processes are described later.
The communication unit 2 is a communication module capable of communicating with the mass spectrometer 93, the gene analysis device 94, and the examination terminal 95. The communication unit 2 may be connected to the mass spectrometer 93, the gene analysis device 94, and the examination terminal 95 directly, or via a communication network such as the Internet. Because the communication unit 2 acquires the evaluation data 33 from the mass spectrometer 93, the gene analysis device 94, and the examination terminal 95, it can also be called a "reading unit" that reads the evaluation data 33 from the outside.
The teacher data 31 includes a plurality of sets of teacher protein information 311, teacher genome information 312, teacher examination information 313, and infection confirmation information 314. The teacher protein information 311 is the identification information and quantitative information of the proteins detected by the mass spectrometer 93, and is created by the mass spectrometer 93. The quantitative information of a protein is either the specific amount of the detected protein or the weight ratio of that protein to the total amount of all detected proteins. The teacher genome information 312 is information, created by the gene analysis device 94, that indicates whether the sample contains the known genome information of the pathogen P.
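The weight-ratio form of the quantitative information described above can be sketched in Python as follows. This is a minimal illustration only: the function name `weight_ratios`, the placeholder protein names, and the amounts are assumptions for the example and do not come from the source.

```python
def weight_ratios(amounts):
    """Convert detected protein amounts into weight ratios of the total.

    `amounts` maps a protein name to its detected amount (any mass unit,
    as long as the same unit is used for every protein).
    """
    total = sum(amounts.values())
    if total <= 0:
        raise ValueError("total detected protein amount must be positive")
    return {name: amount / total for name, amount in amounts.items()}

# Hypothetical amounts for three placeholder proteins.
detected = {"protein_A": 2.0, "protein_B": 6.0, "protein_C": 2.0}
ratios = weight_ratios(detected)
```

Ratios make samples with different total protein yields comparable, which is one plausible reason to prefer them over absolute amounts.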
The teacher examination information 313 is various objective and subjective information such as body temperature, cough, headache, muscle pain, and fatigue. The infection confirmation information 314 is the presence or absence of infection with the pathogen P, the presence or absence of infectivity if infected, and, if there is no infectivity, the number of days until infectivity develops. Each item of infection confirmation information 314 is associated with teacher protein information 311, teacher genome information 312, and teacher examination information 313. As described later, the teacher data 31 is derived from the group 90 and from the teacher samples 911 collected from the group 90.
FIG. 2 is a diagram showing an example of the teacher data 31. The teacher protein information 311 lists the name of each protein and the weight ratio of that protein to the total amount of protein. As described above, the weight may be listed instead of the ratio, or the number of moles or the number of molecules may be listed. The example in FIG. 2 assumes that the pathogen P has only DNA, and the teacher genome information 312 records whether the pathogen P is present. When the pathogen P has both DNA and RNA, the teacher genome information 312 has separate DNA and RNA columns. The teacher examination information 313 in FIG. 2 records body temperature, cough, and headache. The infection confirmation information 314 in FIG. 2 records that the person is infected and is not currently infectious, but that infectivity will develop in about 3 days.
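One set of teacher data as described for FIG. 2 could be represented as in the following Python sketch. The class and field names, the protein names, and every numeric value are hypothetical stand-ins chosen for illustration; only the grouping into items 311 to 314 follows the text.

```python
from dataclasses import dataclass
from typing import Dict, Optional

@dataclass
class InfectionConfirmation:
    """Stand-in for infection confirmation information 314."""
    infected: bool
    infectious: bool
    days_until_infectious: Optional[float] = None  # meaningful only if not yet infectious

@dataclass
class TeacherRecord:
    """One set of teacher data 31 (field names are assumptions)."""
    protein_ratio: Dict[str, float]      # teacher protein information 311
    genome_detected: Dict[str, bool]     # teacher genome information 312
    examination: Dict[str, object]       # teacher examination information 313
    confirmation: InfectionConfirmation  # infection confirmation information 314

# A record mirroring the situation described for FIG. 2, with made-up values.
record = TeacherRecord(
    protein_ratio={"protein_A": 0.12, "protein_B": 0.88},
    genome_detected={"DNA": True},
    examination={"body_temperature_c": 36.8, "cough": False, "headache": True},
    confirmation=InfectionConfirmation(
        infected=True, infectious=False, days_until_infectious=3
    ),
)
```

If the pathogen had both DNA and RNA, `genome_detected` would simply carry a second `"RNA"` key, matching the separate columns mentioned above.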
The learning model 32 includes one or more models such as a predictive model, a regression model, a probabilistic model, an artificial neural network, and a long short-term memory (LSTM) neural network. Regression models include, for example, decision trees, classifiers, and linear regression models. Probabilistic models include, for example, support vector machines, Markov models, and hidden Markov models. Artificial neural networks include recurrent neural networks and the like. The learning model 32 may include a model for determining whether there is infection with the pathogen P and a model for determining whether there is infectivity for the pathogen P. The learning model 32 is created and updated using the teacher data 31. That is, the learning model 32 also includes the various parameters required for the calculation.
In the learning phase, the learning model 32 is trained with the teacher protein information 311, the teacher genome information 312, and the teacher examination information 313 as inputs and the infection confirmation information 314 as the output. In the inference phase, the learning model 32 infers the degree of infection from the evaluation protein information 331, the evaluation genome information 332, and the evaluation examination information 333 described next.
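The two phases described above can be sketched with a deliberately small model. The following Python example uses plain logistic regression trained by gradient descent as a stand-in for the learning model 32; the source lists many possible model types and does not prescribe this one. The three features (a marker protein ratio, a genome-detected flag, and a fever flag), the binary infected/not-infected label, and all data values are assumptions made for illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train(features, labels, lr=0.5, epochs=2000):
    """Learning phase: fit weights and bias by batch gradient descent."""
    n = len(features)
    weights = [0.0] * len(features[0])
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, y in zip(features, labels):
            err = sigmoid(sum(w * v for w, v in zip(weights, x)) + bias) - y
            for j, v in enumerate(x):
                grad_w[j] += err * v
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias

def infer(model, x):
    """Inference phase: return an estimated probability of infection."""
    weights, bias = model
    return sigmoid(sum(w * v for w, v in zip(weights, x)) + bias)

# Made-up teacher data: [marker protein ratio, genome detected, fever flag].
teacher_x = [[0.30, 1.0, 1.0], [0.25, 1.0, 0.0], [0.05, 0.0, 0.0], [0.02, 0.0, 1.0]]
teacher_y = [1, 1, 0, 0]  # simplified stand-in for infection confirmation 314

model = train(teacher_x, teacher_y)
likely_infected = infer(model, [0.28, 1.0, 1.0])
likely_uninfected = infer(model, [0.03, 0.0, 0.0])
```

A probability output is one way to express a "degree of infection"; the infectivity and days-until-infectious outputs mentioned in the text would need additional models or output heads.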
The learning model program 32A is a program created for the convenience of executing inference processing with the learning model 32, and is executed by the calculation unit 1. The learning model 32 is assumed to be updated, so to execute inference processing it must be converted into a format suitable for execution by the calculation unit 1. The learning model program 32A is this execution-ready form, extracted and made explicit. The learning model program 32A may be optimized for the hardware configuration of the calculation unit 1.
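One way to picture the relationship between the updatable learning model 32 and the execution-ready learning model program 32A is to freeze the current parameters into a serialized snapshot and load that snapshot as an inference-only function. The Python sketch below assumes a linear model with a sigmoid output and a JSON snapshot; both choices, and the function names, are assumptions for illustration rather than anything the source specifies.

```python
import json
import math

def export_model_program(weights, bias):
    """Freeze the current parameters into a serialized snapshot (a string)."""
    return json.dumps({"weights": list(weights), "bias": bias})

def load_model_program(blob):
    """Rebuild an inference-only function from a snapshot.

    The returned function holds immutable copies of the parameters, so later
    updates to the training-side model do not affect it.
    """
    params = json.loads(blob)
    weights = tuple(params["weights"])
    bias = params["bias"]

    def run(x):
        z = sum(w * v for w, v in zip(weights, x)) + bias
        return 1.0 / (1.0 + math.exp(-z))

    return run

# Hypothetical parameters; in practice they would come from training.
program = load_model_program(export_model_program([2.0, -1.0], 0.0))
score = program([1.0, 0.0])
```

Because the snapshot is a plain string, it could be sent over a network or copied to a recording medium, which matches the transfer options the document describes for the second embodiment.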
The evaluation data 33 is the data on which inference is performed using the learning model 32. The evaluation data 33 includes evaluation protein information 331, evaluation genome information 332, and evaluation examination information 333. The evaluation protein information 331 is protein information obtained by processing the evaluation sample 912 collected from the evaluation target person 90A, and is the same type of information as the teacher protein information 311. The evaluation genome information 332 is information, obtained by processing the evaluation sample 912 collected from the evaluation target person 90A, that indicates whether the genome information of the pathogen P is contained, and is the same type of information as the teacher genome information 312. The evaluation examination information 333 is information obtained by examining the evaluation target person 90A, and is the same type of information as the teacher examination information 313.
Comparing the teacher data 31 with the evaluation data 33, the teacher data 31 is generated mainly from the teacher samples 911 collected from the group 90, whereas the evaluation data 33 is generated from the evaluation sample 912 collected from the evaluation target person 90A. They also differ in that the evaluation data 33 contains no information corresponding to the infection confirmation information 314 of the teacher data 31. The information corresponding to the infection confirmation information 314 for the evaluation data 33 is output by inference using the learning model 32.
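Because the evaluation data carries the same types of information as the teacher data, minus the label, a single feature-building routine can serve both. The Python sketch below illustrates this; `PROTEIN_ORDER`, `to_features`, the choice of body temperature as the only examination feature, and all values are assumptions for the example.

```python
# Hypothetical fixed protein ordering shared by training and inference.
PROTEIN_ORDER = ["protein_A", "protein_B"]

def to_features(protein_ratio, genome_detected, body_temperature_c):
    """Build one input vector from protein, genome, and examination info.

    Proteins missing from a sample are encoded as a ratio of 0.0 so that
    every vector has the same length and column meaning.
    """
    vec = [float(protein_ratio.get(name, 0.0)) for name in PROTEIN_ORDER]
    vec.append(1.0 if genome_detected else 0.0)
    vec.append(float(body_temperature_c))
    return vec

# Teacher data: input vector plus a label (stand-in for information 314).
teacher_x = to_features({"protein_A": 0.3, "protein_B": 0.7}, True, 37.9)
teacher_label = 1

# Evaluation data: the same kind of input vector, but no label; the label is
# what inference with the learning model is expected to produce.
evaluation_x = to_features({"protein_B": 0.4}, False, 36.5)
```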
The group 90 is a group of people of various sexes and ages that includes people infected with the pathogen P targeted by the present embodiment. From each person in the group 90, a teacher sample 911 is collected from the person's own body over a predetermined period, for example three months, at a predetermined interval, for example every day. The teacher sample 911 is nasal mucosal fluid, pharyngeal mucosal fluid, saliva, exhaled breath condensate, and blood. It is sufficient that at least one person in the group 90 is infected with the pathogen P during at least part of the predetermined period. The mucous membrane is, for example, the mucous membrane of the eyes, ears, nose, or mouth. The group 90 can also be called "sample providers" in the sense that they are the people who provide the sample data from which the learning model 32 learns.
The evaluation subject 90A is a person whose infection status with respect to the pathogen P is unknown. An evaluation sample 912 is collected from the evaluation subject 90A at least once. The evaluation sample 912 consists of nasal mucosal fluid, pharyngeal mucosal fluid, saliva, exhaled breath condensate, and blood. Hereinafter, the teacher sample 911 and the evaluation sample 912 are collectively referred to as the "sample" 91.
The pretreatment device 92 performs pretreatment for detecting the proteins, DNA, and RNA contained in the sample 91. Because the processing performed by the pretreatment device 92 is publicly known, it is not described in detail in this specification. The pretreatment device 92 may operate fully automatically, or may be operated partly or entirely by hand.
The mass spectrometer 93 measures, by a known method, the names and content ratios of the proteins contained in the sample 91. The mass spectrometer 93 may be, for example, either a liquid chromatograph mass spectrometer or a gas chromatograph mass spectrometer, or may include both. The mass spectrometer 93 transmits its measurement results to the storage unit 3 as teacher protein information 311 or evaluation protein information 331: when the sample 91 is a teacher sample 911, the results are stored as teacher protein information 311, and when the sample 91 is an evaluation sample 912, they are stored as evaluation protein information 331. The mass spectrometer 93 may operate fully automatically, or may be operated partly or entirely by hand.
The gene analysis device 94 analyzes the genes contained in the sample 91 using one or more known measurement methods such as a nucleic acid amplification method, the Q-probe method, and Tm analysis. The gene analysis device 94 transmits information indicating whether DNA or RNA of the pathogen P was detected in the sample 91 to the storage unit 3 as teacher genome information 312 or evaluation genome information 332: when the sample 91 is a teacher sample 911, the result is stored as teacher genome information 312, and when the sample 91 is an evaluation sample 912, it is stored as evaluation genome information 332. The gene analysis device 94 may operate fully automatically, or may be operated partly or entirely by hand.
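The routing rule described for the mass spectrometer 93 and the gene analysis device 94 can be sketched as follows. This is a minimal illustration, not the implementation of the specification: the `Storage` class, the `store_measurement` function, and all field and key names are hypothetical. Only the rule itself comes from the text, namely that results for teacher samples are stored as teacher information (311, 312) and results for evaluation samples as evaluation information (331, 332).

```python
from dataclasses import dataclass, field


@dataclass
class Storage:
    """Hypothetical stand-in for the storage unit 3."""
    teacher_protein: list = field(default_factory=list)      # 311
    teacher_genome: list = field(default_factory=list)       # 312
    evaluation_protein: list = field(default_factory=list)   # 331
    evaluation_genome: list = field(default_factory=list)    # 332


def store_measurement(storage, sample_kind, device, result):
    """Store a result on the teacher or evaluation side by sample kind."""
    if sample_kind not in ("teacher", "evaluation"):
        raise ValueError(f"unknown sample kind: {sample_kind}")
    if device not in ("mass_spectrometer", "gene_analyzer"):
        raise ValueError(f"unknown device: {device}")
    # The mass spectrometer yields protein information,
    # the gene analysis device yields genome information.
    suffix = "protein" if device == "mass_spectrometer" else "genome"
    getattr(storage, f"{sample_kind}_{suffix}").append(result)


storage = Storage()
store_measurement(storage, "teacher", "mass_spectrometer",
                  {"protein_a": 0.7, "protein_b": 0.3})
store_measurement(storage, "evaluation", "gene_analyzer",
                  {"pathogen_detected": False})
```

Keeping the teacher and evaluation sides in separate containers mirrors the separate reference numerals used in the text.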
The examination terminal 95 is a computer operated by the expert 96, and has a human interface and a communication module for communicating with the storage unit 3. The examination terminal 95 is, for example, a general-purpose personal computer or a smartphone. The expert 96 examines each member of the group 90 and creates the teacher examination information 313. The teacher examination information 313 can, however, be supplemented or corrected later, as described in detail below.
(Creation of teacher data)
The method of creating the teacher data 31, that is, the method of creating the teacher protein information 311, the teacher genome information 312, and the teacher examination information 313, is as follows. First, teacher samples 911 such as nasal mucosal fluid, pharyngeal mucosal fluid, saliva, exhaled breath condensate, and blood are collected from each member of the group 90. Each teacher sample 911 is processed by the pretreatment device 92 and then sent to the mass spectrometer 93 and the gene analysis device 94. Specifically, the material obtained by a known protein extraction procedure is sent to the mass spectrometer 93, and the material obtained by a known RNA and DNA extraction procedure is sent to the gene analysis device 94.
The mass spectrometer 93 writes the names of the detected proteins and their weight ratios to the storage unit 3 as the teacher protein information 311. The gene analysis device 94 writes whether the teacher sample 911 contains DNA or RNA of the pathogen P to the storage unit 3 as the teacher genome information 312.
Each person belonging to the group 90 is examined by the expert 96 at the same timing as the collection of the sample 91. During the examination, the expert 96 checks various conditions, including the known symptoms of infection with the pathogen P, and uses the examination terminal 95 to create the teacher examination information 313 and the infection confirmation information 314 in the storage unit 3. The infection confirmation information 314, however, is also supplemented or corrected after the fact. These after-the-fact additions and corrections are explained below with a concrete example.
For example, when a symptom observed upon infection with the pathogen P (hereinafter, an "infection symptom") is confirmed for the first time in an individual belonging to the group 90 through examination by the expert 96, the following processing is performed. The infection confirmation information 314 of that individual up to the previous day is amended to state that infection was present. In addition, the number of days until the onset of infectivity is estimated from a measurement of infectivity or from known characteristics of infection with the pathogen P, and that number of days is added to the infection confirmation information 314 up to the previous day.
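The after-the-fact amendment described above can be sketched as follows. This is an illustrative sketch under stated assumptions: the `InfectionLabel` record, the `backfill_labels` function, and the `infectious_from` argument are hypothetical names, and the text does not say how far back the amendment reaches, so the sketch simply amends every earlier record passed to it. How the onset of infectivity is estimated is outside the sketch.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional


@dataclass
class InfectionLabel:
    """Hypothetical per-day form of the infection confirmation information 314."""
    day: date
    infected: bool = False
    days_until_infectious: Optional[int] = None


def backfill_labels(labels, first_symptom_day, infectious_from):
    """Amend the labels dated before the first confirmed infection symptom.

    `infectious_from` is the day on which infectivity is estimated to begin
    (assumed to be supplied by a measurement or by known pathogen traits).
    """
    for label in labels:
        if label.day < first_symptom_day:
            label.infected = True
            # Remaining days until infectivity; 0 once already infectious.
            remaining = (infectious_from - label.day).days
            label.days_until_infectious = max(remaining, 0)
    return labels


start = date(2021, 1, 1)
# Three daily records created before any symptom was seen.
labels = [InfectionLabel(start + timedelta(days=i)) for i in range(3)]
# On January 4 the expert confirms an infection symptom for the first time.
labels.append(InfectionLabel(date(2021, 1, 4), infected=True))
backfill_labels(labels,
                first_symptom_day=date(2021, 1, 4),
                infectious_from=date(2021, 1, 3))
```

In this example the records for January 1 to 3 are rewritten as infected, with 2, 1, and 0 days remaining until infectivity, while the record for the day of the examination is left as the expert entered it.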
The teacher data 31 is created by the above processing. One item of teacher data 31 is created each time one person provides one teacher sample 911. Thus, if 100 people each provide a teacher sample 911 once a day for 30 days, a total of 3,000 items of teacher data 31 are created.
(Creation of evaluation data)
The method of creating the evaluation data 33 is substantially the same as the method of creating the teacher data 31. However, because the evaluation data 33 contains no information corresponding to the infection confirmation information 314 of the teacher data 31, it does not require after-the-fact correction as the teacher data 31 does.
According to the first embodiment described above, the following effects are obtained.
(1) The method of creating the learning model 32 for diagnosing the degree of infection with the pathogen P, executed in the infection determination system S, acquires a plurality of items of teacher data 31, each including teacher protein information 311, which is protein information obtained from a teacher sample 911 collected from the group 90 of sample providers, and infection confirmation information 314, which is the degree of infection of the sample provider with the pathogen. It then uses the teacher data 31 to generate a learning model 32 that takes protein information as input and outputs the degree of infection with the pathogen. It is therefore possible to create a learning model 32 capable of more sensitive detection than when the genome information of the pathogen P itself is detected. The details are as follows.
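The generation of a model that takes protein information as input and outputs the degree of infection can be sketched as below. This section of the specification does not name a learning algorithm, so logistic regression is swapped in purely for illustration, and only the simplest output (presence or absence of infection) is modeled. The function names, the two-protein feature layout, and the toy values are all assumptions and are not measured data.

```python
import math


def train_logistic(features, labels, epochs=500, lr=0.5):
    """Fit a logistic regression by batch gradient descent."""
    n_features = len(features[0])
    n_samples = len(features)
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(features, labels):
            z = bias + sum(w * xi for w, xi in zip(weights, x))
            p = 1.0 / (1.0 + math.exp(-z))
            err = p - y  # gradient of the log loss with respect to z
            for j, xi in enumerate(x):
                grad_w[j] += err * xi
            grad_b += err
        weights = [w - lr * g / n_samples for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n_samples
    return weights, bias


def predict_infected(model, x):
    """Return True when the estimated probability of infection is >= 0.5."""
    weights, bias = model
    z = bias + sum(w * xi for w, xi in zip(weights, x))
    return 1.0 / (1.0 + math.exp(-z)) >= 0.5


# Toy teacher data: weight ratios of two hypothetical proteins per sample,
# paired with the infection label (1 = infected, 0 = not infected).
teacher_protein = [[0.9, 0.1], [0.8, 0.2], [0.7, 0.3],
                   [0.2, 0.8], [0.1, 0.9], [0.3, 0.7]]
infection_confirmed = [1, 1, 1, 0, 0, 0]

model = train_logistic(teacher_protein, infection_confirmed)
```

A richer output such as infectivity or days until infectivity would need additional outputs; the sketch deliberately stops at the binary case.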
When the genome information of the pathogen P is detected, as in Patent Document 1 mentioned above, detection is impossible unless the collected sample 91 contains the genome information of the pathogen P. Consequently, if the sample 91 was collected improperly from a site where the pathogen P is absent, or if the pathogen P simply happened not to be collected, the genome information of the pathogen P is not detected and the subject may be wrongly judged to be uninfected. In the present embodiment, by contrast, machine learning is applied to the proteins produced by the pathogen upon infection with the pathogen P and to the proteins that the pathogen P causes host cells to produce, so the resulting learning model 32 can detect infection with the pathogen P more sensitively than the conventional method.
(2) The learning model 32 further takes the teacher examination information 313 as input. Information other than proteins, such as body temperature, cough, headache, muscle pain, and fatigue, can therefore be included as factors in learning and inference.
(3) The learning model 32 further takes the teacher genome information 312 on the pathogen P as input. The resulting learning model 32 can therefore detect infection with the pathogen P more reliably.
(4) The degree of infection is the presence or absence of infection with the pathogen.
(5) The degree of infection is the presence or absence of infection with the pathogen and the presence or absence of infectivity.
(6) The degree of infection is the presence or absence of infection with the pathogen, the presence or absence of infectivity, and the number of days remaining until infectivity develops.
(7) The learning model program 32A causes the arithmetic unit 10 to execute processing that acquires from the storage unit 3 the evaluation data 33, which is protein information of the sample 91 collected from the evaluation subject 90A, inputs the acquired evaluation data 33 into the learning model 32, and outputs the degree of infection of the evaluation subject 90A with the pathogen P. The learning model 32 has been trained with the teacher data 31, whose input is the protein information obtained from the teacher samples 911 collected from the sample providers, that is, the group 90, and whose output is the degree of infection of the group 90 with the pathogen P.
(8) The arithmetic unit 10 includes: a reading unit, namely the communication unit 2, that reads the evaluation data 33, which is protein information of the evaluation sample 912 collected from the evaluation subject 90A; the storage unit 3, which stores the learning model 32 trained with the teacher data 31, whose input is the teacher protein information 311, which is protein information obtained from the teacher samples 911 collected from the group 90, and whose output is the infection confirmation information 314, which is the degree of infection of the group 90 with the pathogen; and the calculation unit 1, which inputs the evaluation data 33 into the learning model 32 and outputs the degree of infection of the evaluation subject 90A with the pathogen.
(Modification 1)
The evaluation subject 90A need not be examined by the expert 96. In that case, the evaluation data 33 does not include the evaluation examination information 333. In this case, the group 90 also need not be examined by the expert 96, and the teacher data 31 does not include the teacher examination information 313.
(Modification 2)
The infection determination system S need not include the gene analysis device 94. In this case, the teacher data 31 does not include the teacher genome information 312, and the evaluation data 33 does not include the evaluation genome information 332.
(Modification 3)
In the first embodiment described above, the infection confirmation information 314, that is, the degree of infection, included not only the presence or absence of infection but also the presence or absence of infectivity and the number of days until infectivity develops. However, the infection confirmation information 314, that is, the degree of infection, may consist of the presence or absence of infection and the presence or absence of infectivity, or of the presence or absence of infection alone. The severity of symptoms may also be added to the degree of infection.
(Modification 4)
In the first embodiment described above, the sample 91 included nasal mucosal fluid, pharyngeal mucosal fluid, saliva, exhaled breath condensate, and blood. However, the sample 91 need only include at least one of nasal mucosal fluid, pharyngeal mucosal fluid, saliva, exhaled breath condensate, and blood.
(Modification 5)
In the first embodiment described above, learning and inference did not use information on the type of the sample 91. However, the outputs of the mass spectrometer 93 and the gene analysis device 94, tagged with information on the type of the sample 91, may be used as the teacher data 31. Specifically, even for samples 91 collected from the same person at the same time, the information on the proteins contained in the nasal mucosal fluid, in the pharyngeal mucosal fluid, in the saliva, in the exhaled breath condensate, and in the blood is treated as separate information for each sample type.
According to Modification 5, the following effect is obtained.
(9) The teacher sample 911 includes two or more of nasal mucosal fluid, pharyngeal mucosal fluid, blood, exhaled breath condensate, and saliva, and the protein information is treated as different information for each type of the sample 91. This increases the variety of information available to the learning model 32 for its computation and can improve the accuracy of inference.
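One way to treat the same protein as different information for each sample type is to prefix each feature name with the sample type. The sketch below illustrates this; the function name, the `sample_type:protein` key format, and the protein names and values are hypothetical and are not taken from the specification.

```python
def tag_by_sample_type(measurements):
    """Flatten {sample_type: {protein: ratio}} into a single feature dict.

    The same protein measured in different sample types becomes
    distinct features, e.g. "saliva:protein_a" and "blood:protein_a".
    """
    features = {}
    for sample_type, proteins in measurements.items():
        for protein, ratio in proteins.items():
            features[f"{sample_type}:{protein}"] = ratio
    return features


# Samples collected from the same person at the same time.
measurements = {
    "saliva": {"protein_a": 0.6, "protein_b": 0.4},
    "blood": {"protein_a": 0.1, "protein_c": 0.9},
}
features = tag_by_sample_type(measurements)
```

Here `protein_a` contributes two separate features, one for saliva and one for blood, which is how the number of information types available to the model grows.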
(Modification 6)
In Modification 5, a learning model 32 may additionally be created for each type of the sample 91, and the inference results may likewise be produced for each type of the sample 91. Specifically, a separate learning model 32 is created for each of nasal mucosal fluid, pharyngeal mucosal fluid, saliva, exhaled breath condensate, and blood, and learning and inference are performed with each model. The inference results are then also produced for each type of the sample 91. According to Modification 6, a person can refer to the inference result for each type of the sample 91 and deduce the state of infection. This makes the diagnostic result easier to interpret than when inference is left to the learning model 32 alone. For example, if infection is inferred from the exhaled breath condensate of the evaluation subject 90A as well, it can be presumed that the infection has spread to the lungs and that the condition is serious.
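The per-sample-type arrangement of Modification 6 can be sketched as a table of models keyed by sample type, with one inference result reported per type. Everything here is illustrative: the type identifiers, the `infer_per_sample_type` function, and the threshold-based stand-in models (including the `marker` key) are hypothetical placeholders for trained learning models 32.

```python
SAMPLE_TYPES = (
    "nasal_mucosal_fluid",
    "pharyngeal_mucosal_fluid",
    "saliva",
    "breath_condensate",
    "blood",
)


def make_threshold_model(threshold):
    """Hypothetical stand-in model: flags infection above a marker ratio."""
    return lambda protein_info: protein_info.get("marker", 0.0) >= threshold


def infer_per_sample_type(models, samples):
    """Run each sample through the model for its own type.

    Returns {sample_type: result}; types with no sample are omitted.
    """
    return {t: models[t](samples[t]) for t in SAMPLE_TYPES if t in samples}


models = {t: make_threshold_model(0.5) for t in SAMPLE_TYPES}
samples = {
    "saliva": {"marker": 0.7},
    "breath_condensate": {"marker": 0.2},
}
results = infer_per_sample_type(models, samples)
```

Because the results stay separated by sample type, a person reading them can see which body sites indicate infection, for instance saliva but not breath condensate in this example.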
(Modification 7)
In the first embodiment described above, teacher samples 911 were collected from the group 90 at a predetermined cycle over a predetermined period. However, the teacher sample 911 may be collected only once from each individual belonging to the group 90. Even in this case, the progress after collection is observed, and the infection confirmation information 314 is supplemented or corrected as necessary.
(Modification 8)
In the first embodiment described above, the group 90 was a group of humans. However, the group 90 may consist of organisms other than humans, for example cattle, pigs, chickens, fish, or insects.
(Modification 9)
A learning model 32 may be created for each of a plurality of pathogens P, and the degrees of infection with the plurality of pathogens P may be estimated simultaneously in the inference phase.
-Second embodiment-
A second embodiment of the learning model generation method is described with reference to FIG. 3. In the following description, components identical to those of the first embodiment are given the same reference signs, and mainly the differences are described. Points not specifically described are the same as in the first embodiment. The present embodiment differs from the first embodiment mainly in that the learning processing and the inference processing are executed on different devices.
FIG. 3 is a configuration diagram of the infection determination system S1 in the second embodiment. The infection determination system S1 differs from the first embodiment in that it includes a learning device 10A and an inference device 10B in place of the arithmetic unit 10.
The learning device 10A includes a learning calculation unit 1A, a learning communication unit 2A, and a learning storage unit 3A. The learning storage unit 3A stores the teacher data 31 and the learning model 32. The learning calculation unit 1A is, for example, a central processing unit (not shown), and performs the learning processing by loading a program stored in a read-only memory (not shown) into a volatile memory (not shown) and executing it. The learning calculation unit 1A may, however, be implemented with a hardware circuit or a reconfigurable logic circuit, or with a combination of these.
The learning communication unit 2A is a communication module capable of communicating with the mass spectrometer 93, the gene analysis device 94, the examination terminal 95, and the inference device 10B. The learning communication unit 2A may be connected to these devices directly, or via a communication network such as the Internet. The learning storage unit 3A is a non-volatile storage device such as a hard disk drive.
The inference device 10B includes an inference calculation unit 1B, an inference communication unit 2B, and an inference storage unit 3B. The inference storage unit 3B stores the learning model program 32A and the evaluation data 33. The inference calculation unit 1B is, for example, a central processing unit (not shown), and performs the inference processing by loading the learning model program 32A into a volatile memory (not shown) and executing it. The inference calculation unit 1B may, however, be implemented with a hardware circuit or a reconfigurable logic circuit, or with a combination of these. In other words, the learning model program 32A may be a computer program read by a processor, or circuit information for reconfiguring a logic circuit.
The inference communication unit 2B is a communication module capable of communicating with the mass spectrometer 93, the gene analysis device 94, the examination terminal 95, and the learning device 10A. The inference communication unit 2B may be connected to these devices directly, or via a communication network such as the Internet. Because the inference communication unit 2B acquires the evaluation data 33 from the mass spectrometer 93, the gene analysis device 94, and the examination terminal 95, it can also be called a "reading unit" that reads the evaluation data 33 from outside. The inference storage unit 3B is a non-volatile storage device such as a hard disk drive.
When processing a teacher sample 911, the mass spectrometer 93 and the gene analysis device 94 output the processing results to the learning device 10A; when processing an evaluation sample 912, they output the results to the inference device 10B. The examination terminal 95 outputs its results to the learning device 10A when the group 90 is examined, and to the inference device 10B when the evaluation subject 90A is examined. Each time the learning device 10A creates or updates the learning model 32, it creates a learning model program 32A and transmits it to the inference device 10B. The inference device 10B executes the learning model program 32A to perform inference on the evaluation data 33.
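The hand-off from the learning device 10A to the inference device 10B can be sketched as below. This is an assumption-laden illustration: the specification only says that the learning model program 32A may be a computer program or circuit information, so the JSON payload, the logistic score, and the names `export_model_program` and `InferenceDevice` are all hypothetical.

```python
import json
import math


def export_model_program(weights, bias, version):
    """Learning-device side: serialize the current model for transfer."""
    return json.dumps({"version": version, "weights": weights, "bias": bias})


class InferenceDevice:
    """Hypothetical stand-in for the inference device 10B."""

    def __init__(self):
        self.model = None

    def receive_model_program(self, payload):
        # Replace the stored model whenever a newer one arrives.
        self.model = json.loads(payload)

    def infer(self, protein_features):
        if self.model is None:
            raise RuntimeError("no learning model program received yet")
        z = self.model["bias"] + sum(
            w * x for w, x in zip(self.model["weights"], protein_features)
        )
        return 1.0 / (1.0 + math.exp(-z))  # score between 0 and 1


device = InferenceDevice()
# The model is created, exported, and used for inference ...
device.receive_model_program(export_model_program([2.0, -2.0], 0.0, version=1))
score_v1 = device.infer([0.9, 0.1])
# ... and later updated, which triggers a new transfer.
device.receive_model_program(export_model_program([4.0, -4.0], 0.0, version=2))
score_v2 = device.infer([0.9, 0.1])
```

Sending a fresh payload on every update keeps the inference device stateless with respect to training: it only ever holds the most recent model it has received.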
According to the second embodiment described above, learning and inference can be executed on separate devices.
(Modification 1 of the second embodiment)
In the second embodiment, the learning model program 32A was transmitted from the learning device 10A to the inference device 10B over a communication line. That is, the learning model program 32A was transmitted using a communication medium, namely a wired, wireless, optical, or other network, or a carrier wave or digital signal propagating through such a network. However, the learning device 10A and the inference device 10B may each include an input/output interface (not shown), and the learning model program 32A may be transferred via a recording medium usable with those interfaces, such as a USB memory, a hard disk drive, or a CD-R.
(Modification 2 of the second embodiment)
In the second embodiment, the learning device 10A created the learning model program 32A and transmitted it to the inference device 10B. However, the learning device 10A may transmit the learning model 32 as it is to the inference device 10B, and the inference device 10B may create the learning model program 32A from the received learning model 32.
-Third embodiment-
A third embodiment of the learning model generation method is described with reference to FIG. 4. In the following description, components identical to those of the first embodiment are given the same reference signs, and mainly the differences are described. Points not specifically described are the same as in the first embodiment. The present embodiment differs from the first embodiment mainly in the evaluation sample 912.
FIG. 4 is a configuration diagram of the infection determination system S2 in the third embodiment. In the present embodiment, the learning model 32 created by the processing described in the first embodiment is applied to an environment E to evaluate the presence of the pathogen in the environment E. The environment E may be an enclosed space or an open space. Examples of the environment E include a building lobby, an office, a hotel guest room, a room in a home, a vehicle, a train, a passenger aircraft, a gymnasium, the playing area of a stadium, the spectator area of a stadium, and a street.
In the present embodiment, the evaluation sample 912 can be extracted from the environment E by various techniques. For example, air in the environment E can be passed repeatedly through a fine-mesh filter, and the liquid obtained by bringing a predetermined liquid into contact with that filter can be used as the evaluation sample 912. Alternatively, the condensate obtained by compressing air in the environment E may be used as the evaluation sample 912, or air in the environment E may be pumped into a liquid for a predetermined time and that liquid used as the evaluation sample 912. The acquired evaluation sample 912 is processed in the same way as in the first embodiment.
According to the present embodiment, for example, an evaluation sample 912 can be extracted from the air in an office, the evaluation data 33 can be created through processing by the pretreatment device 92 and the other devices, and the presence or absence of the pathogen P in the office can be estimated using the arithmetic unit 10. Because the presence or absence of the pathogen P can thus be determined for any environment E, the embodiment can contribute to preventing infection with the pathogen P among an unspecified large number of people. When the pathogen P is detected in the air of the environment E, each person present in the environment E may be treated as an evaluation subject 90A, and the collection and evaluation of the evaluation sample 912 described in the first embodiment may be carried out.
(Modification of the third embodiment)
The inference in the third embodiment may be executed using a learning model 32 generated as follows. The teacher data 31 excludes the teacher examination information 313 and consists only of the teacher protein information 311, the teacher genome information 312, and the infection confirmation information 314. Then, for example, the liquid obtained by processing the air of a room occupied by an infected person who is infected with the pathogen P and is infectious, that is, a sample 91 derived from a person infected with the pathogen, is used as a teacher sample 911 in which the pathogen P is present. Likewise, for example, the liquid obtained by processing the air of a room that is unoccupied or is occupied by a person not infected with the pathogen P is used as a teacher sample 911 in which the pathogen P is absent.
The learning model 32 in this modification may estimate the presence or absence of the pathogen P in the evaluation sample 912, or may estimate the likelihood that the pathogen P is present. In this modification, a plurality of teacher data 31 including protein information obtained from samples derived from persons infected with the pathogen are acquired, and the teacher data 31 are used to generate a learning model 32 that takes the protein information as input and outputs the presence or absence of the pathogen P.
Furthermore, a learning model 32 may be created for a plurality of pathogens P so as to infer the type of pathogen P contained in a sample. That is, a plurality of teacher data 31 may be acquired, each including protein information obtained from the sample 91 (specimen) collected from a source sample and the type of pathogen P in that source sample, and the teacher data 31 may be used to generate a learning model 32 that takes the protein information as input and outputs the type of pathogen contained in the sample. In this case, in the inference phase, inputting to the learning model 32 the protein information obtained by processing a sample 91 acquired from the environment E yields the type of pathogen P contained in the sample 91.
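The multi-pathogen variant above can be illustrated with a minimal sketch. The text does not specify the learning algorithm, so a nearest-centroid classifier is used here purely as a stand-in. The two-element protein vectors, the labels `pathogen_A`, `pathogen_B`, and `none`, and the function names are all hypothetical.

```python
from collections import defaultdict
from math import dist


def train_centroids(teacher_data):
    """Learning phase: map each pathogen type to the mean protein vector.

    teacher_data is a list of (protein_vector, pathogen_type) pairs.
    """
    grouped = defaultdict(list)
    for vector, pathogen_type in teacher_data:
        grouped[pathogen_type].append(vector)
    return {
        label: [sum(column) / len(vectors) for column in zip(*vectors)]
        for label, vectors in grouped.items()
    }


def infer_pathogen_type(model, protein_vector):
    """Inference phase: return the type whose centroid is nearest to the input."""
    return min(model, key=lambda label: dist(model[label], protein_vector))


# Hypothetical teacher data: two protein abundance features per sample.
teacher_data = [
    ([0.9, 0.1], "pathogen_A"),
    ([0.8, 0.2], "pathogen_A"),
    ([0.1, 0.9], "pathogen_B"),
    ([0.2, 0.7], "pathogen_B"),
    ([0.0, 0.0], "none"),
    ([0.1, 0.1], "none"),
]

model = train_centroids(teacher_data)
predicted = infer_pathogen_type(model, [0.85, 0.15])
```

Including a `none` class is one possible way to let the same model report that no pathogen is present. The text itself treats presence or absence and type as separate outputs.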
-Fourth Embodiment-
A fourth embodiment of the learning model generation method will be described with reference to FIGS. 5 to 8. In the following description, components that are the same as those in the first embodiment are given the same reference signs, and mainly the differences are described. Points not specifically described are the same as in the first embodiment. The present embodiment differs from the first embodiment mainly in that information on the ions described below is included in the teacher data.
(Test data)
Sulfur metabolites in exhaled breath were analyzed in 12 healthy subjects not infected with the novel coronavirus and 12 subjects infected with it. Infection status was determined by PCR testing. Of the 12 infected subjects, 2 had mild disease and 10 had moderate disease, one of whom later progressed to severe disease. The classification into mild, moderate, and severe was based on oxygen saturation, respiratory symptoms, and chest CT images. This classification is described in, for example, Novel Coronavirus Infection (COVID-19): Guide to Clinical Practice, Version 4.1.
Exhaled breath was collected using an exhaled breath collector (manufactured by GL Sciences Inc.) and an exhaled breath collection device (manufactured by GL Sciences Inc.). The collection device was cooled to minus 20 degrees Celsius in advance, and the subject breathed for 5 to 10 minutes through the collector (mouthpiece type or mask type), so that the exhaled aerosol was rapidly cooled and approximately 1 ml of exhaled breath condensate was collected.
To 25 μl of the exhaled breath condensate, β-(4-hydroxyphenyl)ethyl iodoacetamide (HPE-IAM), an electrophilic alkylating agent, was added to a final concentration of 5 mM, and the alkylation reaction was allowed to proceed at 37 degrees Celsius for 30 minutes. Formic acid was then added to a final concentration of 1% to stabilize the sulfur metabolites as HPE-IAM adducts. The electrophilic alkylating agent alkylates the electron pair of sulfur and thereby suppresses decomposition of the sulfur metabolites.
Furthermore, for each of the sulfur metabolites, a stable isotope-labeled HPE-IAM adduct standard was added to a concentration of 10 nM in the solution, and the resulting solution was used as the mass spectrometry sample. The HPE-IAM adducts of sulfite ion (HSO₃⁻), thiosulfate ion (HS₂O₃⁻), and hydrogen disulfide ion (HS₂⁻) were quantitatively analyzed with a mass spectrometer according to the method described in the known literature (Ida T, Sawa T, Ihara H, Tsuchiya Y, Watanabe Y, Kumagai Y, Suematsu M, Motohashi H, Fujii S, Matsunaga T, Yamamoto M, Ono K, Devarie-Baez NO, Xian M, Fukuto JM, Akaike T. Reactive cysteine persulfides and S-polythiolation regulate oxidative stress and redox signaling. Proc Natl Acad Sci USA 111:7606-7611 (2014)). Specifically, 35 μl of the above mass spectrometry sample containing the standard was injected into a high-performance liquid chromatograph mass spectrometer, LCMS™-8060 (Shimadzu), and quantitatively analyzed under the MRM (multiple reaction monitoring) conditions shown in Table 1.
[Table 1: MRM (multiple reaction monitoring) conditions. Table image JPOXMLDOC01-appb-T000001 is not reproduced here.]
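Because each stable isotope-labeled standard is added at a known concentration (10 nM), the concentration of the corresponding adduct can be estimated from the ratio of peak areas. The following is a minimal sketch of that internal-standard calculation. It assumes that the labeled and unlabeled adducts have the same detector response, which the text does not state. The peak areas are hypothetical values, not measured data.

```python
STANDARD_CONCENTRATION_NM = 10.0  # labeled standard concentration given in the text


def quantify_nm(analyte_peak_area, standard_peak_area,
                standard_concentration_nm=STANDARD_CONCENTRATION_NM):
    """Estimate the analyte concentration (nM) from the peak-area ratio.

    Assumes equal response factors for the analyte and its labeled standard.
    """
    if standard_peak_area <= 0:
        raise ValueError("standard peak area must be positive")
    return analyte_peak_area / standard_peak_area * standard_concentration_nm


# Hypothetical (analyte area, labeled standard area) pairs per ion.
peak_areas = {
    "HSO3-": (10000.0, 40000.0),
    "HS2O3-": (20000.0, 40000.0),
    "HS2-": (5000.0, 40000.0),
}

concentrations_nm = {
    ion: quantify_nm(analyte, standard)
    for ion, (analyte, standard) in peak_areas.items()
}
```

With these illustrative areas, the sulfite adduct is estimated at a quarter of the standard concentration, that is, 2.5 nM.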
FIG. 5 shows the quantitative analysis results for these three ions in each of the 12 healthy subjects and the 12 infected subjects. The concentrations of the various sulfur metabolites in the exhaled breath condensate were lower overall than in vivo. This is thought to reflect dilution of in vivo sulfur metabolites in exhaled breath and the limited efficiency of exhaled breath collection. Even so, the levels of sulfite ion (HSO₃⁻), thiosulfate ion (HS₂O₃⁻), and hydrogen disulfide ion (HS₂⁻) were confirmed to be significantly higher in patients with moderate disease than in healthy subjects.
X, one of the 12 infected subjects mentioned above, progressed from moderate to severe disease. FIG. 6 shows the measured sulfite ion (HSO₃⁻) and thiosulfate ion (HS₂O₃⁻) levels when X's disease became severe. The thiosulfate ion level was found to be significantly higher than in healthy subjects in both moderate and severe disease. In X, a marked increase in sulfite ion was observed in the sample collected before progression to severe disease, whereas no such increase was seen in patients with moderate disease who did not progress. Sulfite ion (HSO₃⁻) is therefore a promising biomarker for assessing the risk of progression to severe pneumonia in novel coronavirus infection.
Reactive sulfur has antioxidant activity; that is, it is oxidized by oxidative stress, reactive oxygen species, and the like. As pneumonia becomes more severe and oxidative stress worsens, the profile of sulfur metabolites can therefore be said to shift toward oxidation.
FIG. 7 is a configuration diagram of the calculation device 10C in the fourth embodiment. The teacher data 31 further includes teacher ion information 315. The evaluation data 33 further includes evaluation ion information 335.
The teacher ion information 315 consists of identification information for ions, including at least one of sulfite ion (HSO₃⁻), thiosulfate ion (HS₂O₃⁻), and hydrogen disulfide ion (HS₂⁻), together with quantitative information on the detected ions, and is created by the mass spectrometer 93. The quantitative information is, for example, the concentration of each ion. The evaluation ion information 335 is ion information obtained by processing the evaluation sample 912 collected from the evaluation target person 90A, and contains the same types of information as the teacher ion information 315.
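One possible in-memory representation of the ion information, pairing an ion identifier with its concentration, is sketched below. The names `IonMeasurement`, `ION_IDS`, and `to_feature_vector` are hypothetical. Treating an ion that was not reported as 0.0 is also an assumption made only for this illustration.

```python
from typing import NamedTuple

# Identifiers for the three ions named in the text (ASCII spellings are illustrative).
ION_IDS = ("HSO3-", "HS2O3-", "HS2-")


class IonMeasurement(NamedTuple):
    ion_id: str              # identification information of the ion
    concentration_nm: float  # quantitative information (concentration, nM)


def to_feature_vector(measurements):
    """Return concentrations in the fixed order of ION_IDS.

    Ions that were not reported are filled with 0.0 (an assumption).
    """
    by_id = {}
    for measurement in measurements:
        if measurement.ion_id not in ION_IDS:
            raise ValueError(f"unknown ion: {measurement.ion_id}")
        by_id[measurement.ion_id] = measurement.concentration_nm
    return [by_id.get(ion_id, 0.0) for ion_id in ION_IDS]


# Hypothetical ion information for one sample.
ion_information = [IonMeasurement("HSO3-", 2.0), IonMeasurement("HS2-", 1.5)]
features = to_feature_vector(ion_information)
```

Using the same structure for both the teacher ion information and the evaluation ion information reflects the statement that the two contain the same types of information.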
FIG. 8 shows an example of the teacher data 31 in the present embodiment. It differs from the teacher data 31 shown in FIG. 2 for the first embodiment in that the teacher ion information 315 is added.
The teacher ion information 315 and the evaluation ion information 335 are obtained by the method described above, using the exhaled breath collector, exhaled breath collection device, and mass spectrometer described in the present embodiment. The teacher protein information 311, the teacher genome information 312, the evaluation protein information 331, and the evaluation genome information 332 may be obtained through the same processing as the teacher ion information 315 and the evaluation ion information 335, or through different processing, namely the processing in the first embodiment.
In the learning phase, the learning model 32 in the present embodiment is trained with the teacher ion information 315, in addition to the teacher protein information 311, the teacher genome information 312, and the teacher examination information 313, as input and with the infection confirmation information 314 as output. In the inference phase, the learning model 32 infers the degree of infection from the evaluation ion information 335 in addition to the evaluation protein information 331, the evaluation genome information 332, and the evaluation examination information 333.
The learning model program 32A, which is created for convenience in executing inference processing with the learning model 32, infers the degree of infection from the evaluation ion information 335 in addition to the evaluation protein information 331, the evaluation genome information 332, and the evaluation examination information 333, in the same way as the learning model 32 in the inference phase.
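The learning and inference phases of this embodiment can be sketched as follows. The text does not name a learning algorithm, so plain logistic regression trained by stochastic gradient descent stands in for the learning model 32. The numeric encoding of each information type, all feature values, and the function names are hypothetical. The "degree of infection" is reduced here to an estimated probability of infection.

```python
from math import exp


def build_input(protein, genome, examination, ion):
    """Concatenate the four information types into one flat input vector."""
    return [*protein, *genome, *examination, *ion]


def sigmoid(z):
    return 1.0 / (1.0 + exp(-z))


def predict(model, x):
    """Inference phase: return an estimated probability of infection."""
    weights, bias = model
    return sigmoid(bias + sum(w * v for w, v in zip(weights, x)))


def train(inputs, labels, epochs=500, learning_rate=0.5):
    """Learning phase: fit logistic regression by stochastic gradient descent."""
    model = ([0.0] * len(inputs[0]), 0.0)
    for _ in range(epochs):
        for x, y in zip(inputs, labels):
            error = predict(model, x) - y
            weights, bias = model
            model = (
                [w - learning_rate * error * v for w, v in zip(weights, x)],
                bias - learning_rate * error,
            )
    return model


# Hypothetical teacher data, with every feature scaled to the range 0..1.
teacher_inputs = [
    build_input([0.9], [1.0], [0.8], [0.7, 0.6, 0.5]),  # infected
    build_input([0.8], [1.0], [0.6], [0.9, 0.7, 0.6]),  # infected
    build_input([0.1], [0.0], [0.1], [0.1, 0.2, 0.1]),  # not infected
    build_input([0.2], [0.0], [0.2], [0.2, 0.1, 0.1]),  # not infected
]
teacher_labels = [1, 1, 0, 0]  # stands in for the infection confirmation information

model = train(teacher_inputs, teacher_labels)
evaluation_input = build_input([0.85], [1.0], [0.7], [0.8, 0.6, 0.6])
degree_of_infection = predict(model, evaluation_input)
```

Concatenating the inputs keeps the model agnostic about which measurement pipeline produced each information type, which matches the statement that the protein and genome information may come from either processing route.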
According to the fourth embodiment described above, the following effect is obtained.
(10) The teacher data includes, as ion information, information on at least one of sulfite ion, thiosulfate ion, and hydrogen disulfide ion obtained from the teacher sample 911. In generating the learning model 32, the protein information and the ion information are used as input and the degree of infection with the pathogen as output. The learning model 32 can therefore be generated using at least one of sulfite ion, thiosulfate ion, and hydrogen disulfide ion, which are promising biomarkers, and the degree of infection with the pathogen can be inferred accurately for the evaluation sample 912.
In each of the embodiments and modifications described above, the configuration of the functional blocks is merely an example. Several functional configurations shown as separate functional blocks may be integrated, and a configuration shown as a single functional block may be divided into two or more functions. Some of the functions of each functional block may also be provided in another functional block.
The embodiments and modifications described above may be combined with one another. Although various embodiments and modifications have been described above, the present invention is not limited to them. Other aspects conceivable within the scope of the technical idea of the present invention are also included in the scope of the present invention.
1…Calculation unit
1A…Learning calculation unit
1B…Inference calculation unit
3…Storage unit
3A…Learning storage unit
3B…Inference storage unit
10…Calculation device
10A…Learning device
10B…Inference device
31…Teacher data
32…Learning model
32A…Learning model program
33…Evaluation data
90A…Evaluation target person
311…Teacher protein information
312…Teacher genome information
313…Teacher examination information
314…Infection confirmation information
331…Evaluation protein information
332…Evaluation genome information
333…Evaluation examination information
91…Sample
911…Teacher sample
912…Evaluation sample

Claims (20)

  1.  A method of generating a learning model for diagnosing the degree of infection with a pathogen, the method comprising:
     acquiring a plurality of teacher data, each including protein information obtained from a sample collected from a sample provider and the degree of infection of the sample provider with the pathogen; and
     generating, using the teacher data, a learning model that takes the protein information as input and outputs the degree of infection with the pathogen.
  2.  The method of generating a learning model according to claim 1,
     wherein the learning model further takes examination information as input.
  3.  The method of generating a learning model according to claim 1 or claim 2,
     wherein the learning model further takes genome information on the pathogen as input.
  4.  The method of generating a learning model according to any one of claims 1 to 3,
     wherein the sample includes two or more of nasal mucosal fluid, pharyngeal mucosal fluid, blood, exhaled breath condensate, and saliva, and the protein information is treated as distinct information for each type of sample.
  5.  The method of generating a learning model according to any one of claims 1 to 4,
     wherein the degree of infection is the presence or absence of infection with the pathogen.
  6.  The method of generating a learning model according to any one of claims 1 to 4,
     wherein the degree of infection is the presence or absence of infection with the pathogen and the presence or absence of infectivity.
  7.  The method of generating a learning model according to any one of claims 1 to 4,
     wherein the degree of infection is the presence or absence of infection with the pathogen, the presence or absence of infectivity, and the number of days remaining until infectivity is acquired.
  8.  A program that causes a computer to execute processing comprising:
     acquiring evaluation data, which is protein information of an evaluation sample collected from an evaluation target person; and
     inputting the acquired evaluation data to a learning model trained using teacher data in which protein information obtained from a teacher sample collected from a sample provider is the input and the degree of infection of the sample provider with a pathogen is the output, and outputting the degree of infection of the evaluation target person with the pathogen.
  9.  The program according to claim 8,
     wherein the learning model further takes examination information as input.
  10.  The program according to claim 8 or claim 9,
     wherein the learning model further takes genome information on the pathogen as input.
  11.  The program according to any one of claims 8 to 10,
     wherein the evaluation sample and the teacher sample each include two or more of nasal mucosal fluid, pharyngeal mucosal fluid, blood, exhaled breath condensate, and saliva.
  12.  The program according to any one of claims 8 to 11,
     wherein the degree of infection is the presence or absence of infection with the pathogen.
  13.  The program according to any one of claims 8 to 11,
     wherein the degree of infection is the presence or absence of infection with the pathogen and the presence or absence of infectivity.
  14.  The program according to any one of claims 8 to 11,
     wherein the degree of infection is the presence or absence of infection with the pathogen, the presence or absence of infectivity, and the number of days remaining until infectivity is acquired.
  15.  A calculation device comprising:
     a reading unit that reads evaluation data including protein information of an evaluation sample collected from an evaluation target person;
     a storage unit that stores a learning model trained using teacher data in which protein information obtained from a teacher sample collected from a sample provider is the input and the degree of infection of the sample provider with a pathogen is the output; and
     a calculation unit that inputs the evaluation data to the learning model and outputs the degree of infection of the evaluation target person with the pathogen.
  16.  A method of generating a learning model for diagnosing the presence of a pathogen, the method comprising:
     acquiring a plurality of teacher data including protein information obtained from a sample derived from an infected person infected with the pathogen; and
     generating, using the teacher data, a learning model that takes the protein information as input and outputs the presence or absence of the pathogen.
  17.  A method of generating a learning model for diagnosing the presence of a pathogen, the method comprising:
     acquiring a plurality of teacher data, each including protein information obtained from a specimen collected from a sample and the type of pathogen in the sample; and
     generating, using the teacher data, a learning model that takes the protein information as input and outputs the type of pathogen contained in the sample.
  18.  A program that causes a computer to execute processing comprising:
     acquiring evaluation data, which is protein information of a collected sample; and
     inputting the acquired evaluation data to a learning model trained using teacher data in which protein information obtained from a sample derived from an infected person infected with a pathogen is the input and the presence or absence of the pathogen is the output, and outputting the presence or absence of the pathogen in the collected sample.
  19.  The method of generating a learning model according to any one of claims 1 to 4,
     wherein the teacher data includes, as ion information, information on at least one of sulfite ion (HSO₃⁻), thiosulfate ion (HS₂O₃⁻), and hydrogen disulfide ion (HS₂⁻) obtained from the sample, and
     in generating the learning model, the protein information and the ion information are used as input and the degree of infection with the pathogen is used as output.
  20.  The calculation device according to claim 15,
     wherein the evaluation data includes, as ion information, information on at least one of sulfite ion (HSO₃⁻), thiosulfate ion (HS₂O₃⁻), and hydrogen disulfide ion (HS₂⁻) obtained from the evaluation sample, and
     the learning model is obtained by further using as input information on at least one of sulfite ion (HSO₃⁻), thiosulfate ion (HS₂O₃⁻), and hydrogen disulfide ion (HS₂⁻) obtained from the teacher sample.
PCT/JP2021/029682 2020-08-17 2021-08-11 Learning model generation method, program, and calculation device WO2022039092A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2020-137219 2020-08-17
JP2020137219A JP2023145811A (en) 2020-08-17 2020-08-17 Learning model generation method, program, and computation device

Publications (1)

Publication Number Publication Date
WO2022039092A1 true WO2022039092A1 (en) 2022-02-24

Family

ID=80322742

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2021/029682 WO2022039092A1 (en) 2020-08-17 2021-08-11 Learning model generation method, program, and calculation device

Country Status (2)

Country Link
JP (1) JP2023145811A (en)
WO (1) WO2022039092A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2022244789A1 (en) * 2021-05-19 2022-11-24 国立大学法人東北大学 Biomarker for infectious disease

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2008545960A (en) * 2005-05-23 2008-12-18 セント ジョージズ エンタープライゼズ リミテッド Tuberculosis diagnosis
JP2013513111A (en) * 2009-12-02 2013-04-18 ザ・リサーチ・フアウンデーシヨン・オブ・ステイト・ユニバーシテイ・オブ・ニユーヨーク Gas sensor with correction of baseline variation
JP2013066474A (en) * 2006-08-11 2013-04-18 Baylor Research Inst Gene expression signature in blood leukocyte permits differential diagnosis of acute infection
JP2019039896A (en) * 2017-08-26 2019-03-14 FAIMStech Japan株式会社 Disease determination system and healthcare service provision system
WO2019244646A1 (en) * 2018-06-18 2019-12-26 日本電気株式会社 Disease risk prediction device, disease risk prediction method, and disease risk prediction program
JP2020113285A (en) * 2014-08-14 2020-07-27 メメド ダイアグノスティクス リミテッド Computational analysis of biological data using manifold and hyperplane

Also Published As

Publication number Publication date
JP2023145811A (en) 2023-10-12

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21858239

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21858239

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: JP