CN113889279B - Combination therapy information mining and inquiring method, device and electronic equipment - Google Patents

Combination therapy information mining and inquiring method, device and electronic equipment

Info

Publication number
CN113889279B
Authority
CN
China
Prior art keywords
combination therapy
information
clinical
test
text
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202111143489.4A
Other languages
Chinese (zh)
Other versions
CN113889279A (en)
Inventor
周立运
陈田甜
谢伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Huabin Licheng Technology Co ltd
Original Assignee
Beijing Huabin Licheng Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Huabin Licheng Technology Co ltd
Priority to CN202111143489.4A
Publication of CN113889279A
Application granted
Publication of CN113889279B
Legal status: Active
Anticipated expiration

Classifications

    • G: PHYSICS
    • G16: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H: HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00: ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/70: ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for mining of medical data, e.g. analysing previous cases of other patients
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30: Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33: Querying
    • G06F16/3331: Query processing
    • G06F16/334: Query execution
    • G06F16/3344: Query execution using natural language analysis
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00: Handling natural language data
    • G06F40/20: Natural language analysis
    • G06F40/279: Recognition of textual entities
    • G06F40/289: Phrasal analysis, e.g. finite state techniques or chunking
    • G06F40/295: Named entity recognition

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Public Health (AREA)
  • Databases & Information Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Primary Health Care (AREA)
  • Epidemiology (AREA)
  • Pathology (AREA)
  • Biomedical Technology (AREA)
  • Medical Treatment And Welfare Office Work (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a method, a device and electronic equipment for mining and querying combination therapy information. The combination therapy information mining method comprises the following steps: acquiring trial texts of combination therapy trials; acquiring clinical treatment information of each combination therapy trial based on its trial text; acquiring clinical evaluation information and/or clinical result information of each combination therapy trial based on its trial text; and constructing a combination therapy information set based on the clinical treatment information of each combination therapy trial and the clinical evaluation information and/or clinical result information. By constructing the combination therapy information set, the method, device and electronic equipment provided by the invention achieve comprehensive and reliable mining of combination therapy information while effectively improving mining efficiency and reducing mining cost.

Description

Combination therapy information mining and inquiring method, device and electronic equipment
Technical Field
The invention relates to the technical field of computers, in particular to a method, a device and electronic equipment for mining and querying combination therapy information.
Background
Combination therapy with two or more drugs is one of the effective ways to pursue long-term disease control, prolong survival, improve efficacy and overcome drug resistance.
Clinicians often face combination drug use in practice, whether by choice or by necessity, and need to know whether a relevant clinical research basis and relevant clinical research results exist, so as to obtain an implementation basis or a prognosis assessment for a combination treatment regimen. This information is scattered across large volumes of clinical trial registration data and literature and is disclosed irregularly. Because such information is hard to obtain, it is difficult for doctors to track the latest research results worldwide and apply them to clinical practice.
For drug research and development enterprises, developing a new combination treatment regimen may effectively drive the clinical adoption of a drug and give it new application scenarios and a longer life cycle. However, introducing each new combination regimen requires extensive clinical trials. Because clinical trials are costly, when planning clinical trials of a combination therapy, staff at pharmaceutical enterprises need to consult large volumes of clinical trial registration data and literature, or consult clinicians, and pay close attention to the combinations already explored for similar competing products and the results obtained, so as to identify or design a potentially beneficial combination clinical trial protocol. Like doctors, these staff must spend a great deal of time searching and organizing massive amounts of data, which is very inefficient.
It should be noted that both physicians and pharmaceutical enterprises often need to consider not only combinations of two or more drugs but also combinations of different targets and different therapies, for example PD1 drugs used in combination with targeted therapy drugs or chemotherapy drugs. Existing clinical trial data and literature mention only the drugs involved and do not indicate or label targets or therapies. Completing such organization or research therefore requires manually organizing the entire body of relevant data, which is difficult to achieve.
In addition, drug and indication information from different sources or documents follows no uniform standard, so manual organization is time-consuming and inefficient, and the results have poor accuracy and reliability.
Disclosure of Invention
The invention provides a method, a device and electronic equipment for mining and querying combination therapy information, to solve the problems that existing combination therapy information mining requires manual organization, is time-consuming and labor-intensive, and has poor reliability.
The invention provides a combination therapy information mining method, which comprises the following steps:
acquiring trial texts of combination therapy trials;
acquiring clinical treatment information of each combination therapy trial based on the trial text of each combination therapy trial;
acquiring clinical evaluation information and/or clinical result information of each combination therapy trial based on the trial text of each combination therapy trial;
constructing a combination therapy information set based on the clinical treatment information of each combination therapy trial and the clinical evaluation information and/or clinical result information.
According to the combination therapy information mining method provided by the invention, acquiring the trial texts of the combination therapy trials comprises the following steps:
extracting treatment regimen texts from clinical trial texts;
acquiring the drug entities contained in each treatment regimen text;
screening out the trial texts of the combination therapy trials based on the number of drug entities contained in each treatment regimen text.
According to the combination therapy information mining method provided by the invention, the clinical treatment information comprises at least one of indications, associated targets, associated treatment modes, clinical stages and treatment types.
According to the combination therapy information mining method provided by the invention, acquiring the clinical treatment information of each combination therapy trial based on the trial text of each combination therapy trial comprises at least one of the following steps:
selecting the text indication of each combination therapy trial from its trial text, and standardizing the text indication based on an indication dictionary to obtain the indication of each combination therapy trial;
determining the associated target of each combination therapy trial based on the drug names contained in its trial text and a pre-established relationship between drug names and targets;
determining the associated treatment mode of each combination therapy trial based on the drug names contained in its trial text and a pre-established relationship between drug names and treatment modes;
selecting the clinical stage of each combination therapy trial from its trial text and cleaning it;
generating the treatment type of each combination therapy trial based on the clinical trial title and inclusion criteria contained in its trial text.
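The dictionary-based steps above can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the tables `INDICATION_DICT`, `DRUG_TO_TARGET` and `DRUG_TO_MODE` and all function names are hypothetical, and a real system would load curated dictionaries instead of the few illustrative entries shown here.

```python
# Hypothetical lookup tables with a few illustrative entries only.
INDICATION_DICT = {
    "nsclc": "non-small cell lung cancer",
    "non small cell lung carcinoma": "non-small cell lung cancer",
    "hcc": "hepatocellular carcinoma",
}
DRUG_TO_TARGET = {"pembrolizumab": "PD-1", "bevacizumab": "VEGF"}
DRUG_TO_MODE = {
    "pembrolizumab": "immunotherapy",
    "bevacizumab": "targeted therapy",
    "cisplatin": "chemotherapy",
}


def normalize_indication(text_indication):
    """Standardize a free-text indication via the indication dictionary."""
    # Lowercase, treat hyphens as spaces, and collapse repeated whitespace.
    key = " ".join(text_indication.lower().replace("-", " ").split())
    # Unknown indications are returned unchanged (only trimmed).
    return INDICATION_DICT.get(key, text_indication.strip())


def lookup_associations(drug_names, relation):
    """Collect the distinct values that a drug-name relation assigns to the drugs."""
    return sorted({relation[d.lower()] for d in drug_names if d.lower() in relation})


def associated_targets(drug_names):
    return lookup_associations(drug_names, DRUG_TO_TARGET)


def associated_modes(drug_names):
    return lookup_associations(drug_names, DRUG_TO_MODE)
```

Drugs missing from a relation table are simply skipped in this sketch; a production system would more likely flag them for review so the relation tables can be extended.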
According to the combination therapy information mining method provided by the invention, acquiring the clinical evaluation information and/or clinical result information of each combination therapy trial based on the trial text of each combination therapy trial comprises the following steps:
acquiring the efficacy indicators of each combination therapy trial based on its trial text, wherein the efficacy indicators comprise at least one of a control group, a primary endpoint, a secondary endpoint, adverse reactions, a preset target and the authors' self-evaluation;
constructing the clinical result information of each combination therapy trial based on at least one of the primary endpoint, the secondary endpoint and the adverse reactions;
and/or evaluating the efficacy of each combination therapy trial based on its efficacy indicators to obtain the clinical evaluation information of each combination therapy trial.
According to the combination therapy information mining method provided by the invention, constructing the combination therapy information set based on the clinical treatment information of each combination therapy trial and the clinical evaluation information and/or clinical result information comprises the following steps:
acquiring approval information for the drugs in each combination therapy trial based on the marketed-drug data of each country;
constructing the combination therapy information set based on the clinical treatment information, the clinical evaluation information and/or clinical result information, and the drug approval information of each combination therapy trial.
The invention also provides a combination therapy information query method, which comprises the following steps:
acquiring a target search term input by a user;
screening out, from a combination therapy information set, the clinical treatment information, clinical evaluation information and/or clinical result information of the combination therapy trials corresponding to the target search term, wherein the combination therapy information set is determined based on the combination therapy information mining method described above.
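As a rough illustration of the query step, the sketch below filters a small in-memory information set by a search term. The record layout (a list of dictionaries with fields such as `drugs` and `indication`), the substring matching rule and the function name are all assumptions made for this example; the patent does not prescribe a storage format or a matching strategy.

```python
def query_information_set(information_set, search_term):
    """Return the records in which any field contains the search term.

    Matching is a case-insensitive substring test, a simplifying assumption.
    """
    term = search_term.strip().lower()
    if not term:
        return []
    hits = []
    for record in information_set:
        # Flatten list-valued fields (e.g. drug lists) into single values.
        values = []
        for value in record.values():
            values.extend(value if isinstance(value, (list, tuple)) else [value])
        if any(term in str(v).lower() for v in values if v is not None):
            hits.append(record)
    return hits


# Made-up example records, for illustration only.
information_set = [
    {"trial_id": "T1", "drugs": ["pembrolizumab", "cisplatin"],
     "indication": "non-small cell lung cancer", "evaluation": "positive"},
    {"trial_id": "T2", "drugs": ["bevacizumab", "pemetrexed"],
     "indication": "hepatocellular carcinoma", "evaluation": "similar"},
]
matches = query_information_set(information_set, "Cisplatin")
```

A search term can therefore be a drug name, an indication, a target or any other value stored in the set, which matches the multi-dimensional querying the description motivates.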
The present invention also provides a combination therapy information mining device, including:
a text acquisition unit, configured to acquire trial texts of combination therapy trials;
an information acquisition unit, configured to acquire clinical treatment information of each combination therapy trial based on the trial text of each combination therapy trial;
an efficacy evaluation unit, configured to acquire clinical evaluation information and/or clinical result information of each combination therapy trial based on the trial text of each combination therapy trial;
and a set construction unit, configured to construct a combination therapy information set based on the clinical registration information of each combination therapy trial and the clinical evaluation information and/or clinical result information.
The present invention also provides a combination therapy information inquiry apparatus, including:
a search term acquisition unit, configured to acquire a target search term input by a user;
and an information screening unit, configured to screen out, from a combination therapy information set, the clinical treatment information, clinical evaluation information and/or clinical result information of the combination therapy trials corresponding to the target search term, wherein the combination therapy information set is determined based on the combination therapy information mining method described above.
The present invention also provides an electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the steps of the method for information mining or the method for information query of a combination therapy as described in any of the above.
The present invention also provides a non-transitory computer readable storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of the combination therapy information mining method or the combination therapy information query method as described in any of the above.
According to the combination therapy information mining and querying method, device and electronic equipment provided by the invention, text analysis is performed on the trial text of each combination therapy trial to obtain its clinical treatment information and its clinical evaluation information and/or clinical result information, and the combination therapy information set is constructed on that basis. This achieves comprehensive and reliable mining of combination therapy information while effectively improving mining efficiency and reducing mining cost.
Drawings
In order to more clearly illustrate the technical solutions of the present invention or the prior art, the drawings needed for the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and those skilled in the art can also obtain other drawings according to the drawings without creative efforts.
FIG. 1 is a schematic flow chart of the combination therapy information mining method provided by the present invention;
FIG. 2 is a schematic flow chart of step 110 of the combination therapy information mining method provided by the present invention;
FIG. 3 is a schematic flow chart of step 130 of the combination therapy information mining method provided by the present invention;
FIG. 4 is a schematic flow chart of step 140 of the combination therapy information mining method provided by the present invention;
FIG. 5 is a schematic flow chart of the combination therapy information query method provided by the present invention;
FIG. 6 is a schematic structural diagram of the combination therapy information mining device provided by the present invention;
FIG. 7 is a schematic structural diagram of the combination therapy information query device provided by the present invention;
FIG. 8 is a schematic structural diagram of the electronic device provided by the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention clearer, the technical solutions of the present invention will be clearly and completely described below with reference to the accompanying drawings, and it is obvious that the described embodiments are some, but not all embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Combination therapy is one of the effective ways to pursue long-term disease control, prolong survival, improve efficacy and overcome drug resistance.
In addition, for pharmaceutical enterprises, compared with developing drugs against new targets, broadening indications at the clinical trial stage, whether to help a new drug reach the market or to expand the market of an already marketed drug, is a lower-risk approach with a faster return. The goal is to demonstrate at the clinical trial stage that the drug improves efficacy, and combining it with one or more other drugs is one of the main ways of doing so. Alternatively, target combinations known to be synergistic (1+1 > 2) can be combined with innovative technologies such as PROTAC, ADC and bispecific antibodies to develop a new drug that acts on multiple targets simultaneously.
In the field of tumor therapy, the development of combination therapies with two or more drugs is a current hotspot. At the same time, clinical trials are the longest and most capital-intensive stage of the new drug development cycle, and avoiding risk, reducing cost and shortening time are the keys to improving a drug's competitiveness at the clinical stage. At present, when pharmaceutical enterprises run clinical trials of combination therapy, many simply pair up the company's existing products and launch trials quickly, which undoubtedly increases the risk of failure and wastes resources. Therefore, before starting clinical trials of a combination therapy, pharmaceutical enterprises need to understand the information available on existing combination therapies.
To acquire combination therapy information, pharmaceutical enterprises currently tend to extract it from clinical trial information, but they can usually start only from the drug dimension and cannot statistically analyze the data along the different dimensions of a clinical treatment regimen, let alone combinations of those dimensions. In fact, pharmaceutical enterprises want to know more than the research status of a combination of two specific drugs; their demand for combined information across multiple dimensions is more urgent. In addition, to learn the multi-dimensional information of the clinical treatment regimens of combination therapies, pharmaceutical enterprises have to read published literature and internal research and development data or consult clinicians. This information is voluminous, scattered and highly redundant, so it is difficult to obtain the full body of information and to extract useful information from it, and the conclusions drawn are strongly influenced by personal experience.
In conclusion, an efficient, fast, objective and reliable combination therapy information mining method is urgently needed, to help pharmaceutical enterprises screen out the combination therapy directions that a target drug can explore at the clinical trial stage, avoid failure risk as far as possible, and reduce clinical trial cost.
In view of the above problems, embodiments of the present invention provide a combination therapy information mining method. FIG. 1 is a schematic flow chart of the combination therapy information mining method provided by the present invention. As shown in FIG. 1, the method includes:
Step 110, acquiring trial texts of combination therapy trials.
Specifically, a combination therapy trial is a clinical trial conducted on a combination therapy. Combination therapy is a method of treating a disease with a reasonable combination of drugs that have similar effects but are of different types, and a combination therapy trial is usually needed to determine the efficacy and safety of the combination. The trial text of a combination therapy trial is text data that records information related to that trial. Sources of trial texts include, but are not limited to, the ISRCTN registered clinical trial database, the European Union clinical trial database EudraCT, the Chinese Clinical Trial Registry, the drug clinical trial registration and information disclosure platform (China Drug Trials), papers and related documents. The trial text of each combination therapy trial is the material from which combination therapy information is to be mined.
Step 120, acquiring clinical treatment information of each combination therapy trial based on the trial text of each combination therapy trial.
Specifically, combination therapy information can be mined from the perspective of the clinical treatment information of each combination therapy trial. Clinical treatment information represents information related to the clinical treatment regimen of the combination therapy. It may specifically include the standard of care (SOC) information of the combined drugs of the combination therapy, and may also include indication information, associated target information, associated treatment mode information, or clinical development stage information.
Clinical treatment information can be mined by analyzing the trial text of each combination therapy trial. For example, performing entity extraction on the trial text of any combination therapy trial yields the names of the drugs involved in that text, that is, the drug name information contained in the clinical treatment information of that trial.
Clinical treatment information mostly comes from information registered and disclosed directly on registration platforms. When acquiring the clinical treatment information of each combination therapy trial, emphasis can therefore be placed on the trial text that the trial has registered and disclosed on a clinical trial registration platform. The clinical trial registration platform here may be at least one of the Clinical Trials clinical trial database, the ISRCTN registered clinical trial database, the European Union clinical trial database EudraCT, the Chinese Clinical Trial Registry, and the drug clinical trial registration and information disclosure platform.
Step 130, acquiring clinical evaluation information and/or clinical result information of each combination therapy trial based on the trial text of each combination therapy trial.
Specifically, combination therapy information can also be mined from the perspective of the clinical effect of each combination therapy trial, where the clinical effect is reflected in clinical evaluation information and/or clinical result information. Clinical evaluation information represents the evaluation of the clinical trial result of the combined drugs in the combination therapy trial. It may specifically be a preset clinical result label, with labels ordered from worst to best according to the clinical result, such as "termination", "not good", "similar", "positive" and "effective". Clinical result information represents the clinical trial result of the combination therapy trial. It may specifically include the primary endpoint and secondary endpoint of the trial, and may also include the adverse reactions observed in the trial. Here, clinical evaluation information is the more subjective assessment, while clinical result information is the more objective actual trial result.
Clinical evaluation information can be mined by first analyzing the trial text of each combination therapy trial and then evaluating the efficacy of the trial. For example, analyzing the trial text of any combination therapy trial yields efficacy indicators such as its primary endpoint, secondary endpoint and preset target. These indicators are then considered together to produce an overall efficacy evaluation of the trial, which is its clinical evaluation information.
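One way to picture the mapping from efficacy indicators to an evaluation label is the sketch below. The label list comes from the description; the scoring rule (counting favourable indicators), the indicator field names and both function names are purely hypothetical, since the text does not state how the indicators are combined.

```python
# Labels from the description, ordered from worst to best.
LABELS = ["termination", "not good", "similar", "positive", "effective"]


def evaluate_trial(indicators):
    """Map a dictionary of efficacy indicators to one clinical evaluation label.

    Hypothetical rule: a terminated trial gets "termination"; otherwise the
    label improves by one step for each favourable indicator that is present.
    """
    if indicators.get("terminated"):
        return "termination"
    keys = ("primary_endpoint_met", "secondary_endpoint_met", "preset_target_met")
    score = sum(1 for key in keys if indicators.get(key))
    # score 0 -> "not good", 1 -> "similar", 2 -> "positive", 3 -> "effective"
    return LABELS[1 + score]


def rank_trials(evaluations):
    """Order trial ids from the best label to the worst label."""
    return sorted(evaluations, key=lambda tid: LABELS.index(evaluations[tid]), reverse=True)
```

Keeping the labels in an ordered list makes them directly usable as a sort key, which is convenient if display order is meant to reflect how favourable each trial's evaluation is.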
Clinical result information can be mined from the trial results actually recorded in the trial text of each combination therapy trial, for example by rule matching, passage classification or entity recognition; the embodiment of the invention does not specifically limit this.
It should be noted that clinical evaluation information and clinical result information mostly come from the literature. When acquiring the clinical evaluation information of each combination therapy trial, emphasis can therefore be placed on the trial text recorded in paper documents.
In addition, steps 120 and 130 mine combination therapy information from the two perspectives of clinical treatment information and clinical effect, respectively. Step 120 may be performed before, after or in parallel with step 130; the embodiment of the invention does not specifically limit this.
The registration platform information emphasized in step 120 and the paper documents emphasized in step 130 are both trial texts, and trial texts from different sources can be associated with each other through clinical registration numbers. Specifically, on a clinical trial registration platform, the information of each clinical trial corresponds to one clinical registration number; in a paper document, the clinical registration number of the trial under discussion is usually disclosed in the abstract. Combining the registration platform information and the paper information that share the same clinical registration number ensures the comprehensiveness of the trial text.
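The association through registration numbers amounts to grouping texts by a shared key. The sketch below assumes each source record is a dictionary with hypothetical `registration_number` and `text` fields and that the number has already been identified for every record; the example values are made up.

```python
from collections import defaultdict


def merge_by_registration_number(registry_records, paper_records):
    """Group registry texts and paper texts that share a registration number."""
    merged = defaultdict(lambda: {"registry": [], "papers": []})
    for record in registry_records:
        merged[record["registration_number"]]["registry"].append(record["text"])
    for record in paper_records:
        merged[record["registration_number"]]["papers"].append(record["text"])
    return dict(merged)


# Made-up records for illustration only.
registry_records = [
    {"registration_number": "NCT00000001", "text": "Registered regimen: drug A plus drug B."},
]
paper_records = [
    {"registration_number": "NCT00000001", "text": "Primary endpoint was met."},
    {"registration_number": "NCT00000002", "text": "Trial was terminated early."},
]
trial_texts = merge_by_registration_number(registry_records, paper_records)
```

A trial that appears in only one source still gets an entry, with an empty list for the missing source, so gaps in coverage stay visible.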
Step 140, constructing a combination therapy information set based on the clinical treatment information of each combination therapy trial and the clinical evaluation information and/or clinical result information.
Specifically, once the clinical treatment information and the clinical evaluation information and/or clinical result information of each combination therapy trial have been obtained, a combination therapy information set can be constructed from them. The set collects this information for every combination therapy trial and can be embodied as an information database of combination therapy trials.
Specifically, the clinical treatment information and the clinical evaluation information and/or clinical result information of each combination therapy trial may be integrated to obtain the combination therapy information set. The set may include not only the above information but also related information derived from it, for example which combination therapy trials include a certain drug, or which combination therapy trials treat a certain indication. For display, the information may be presented in table form to make it easier for users to query and analyze, and the display order may take into account how favorable the clinical evaluation information of the different combination therapy trials is; the embodiment of the invention does not specifically limit this.
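The integration and the derived "which trials include drug X" views can be sketched as follows. The per-trial dictionaries keyed by a trial identifier, the field names and both function names are assumptions for illustration; the patent only says the set can be embodied as an information database.

```python
from collections import defaultdict


def build_information_set(treatment_info, evaluation_info, result_info):
    """Merge per-trial information into one record per combination therapy trial."""
    records = []
    for trial_id, treatment in treatment_info.items():
        record = {"trial_id": trial_id, **treatment}
        # Evaluation and result information are optional ("and/or" in the text).
        record["evaluation"] = evaluation_info.get(trial_id)
        record["results"] = result_info.get(trial_id)
        records.append(record)
    return records


def index_by(records, field):
    """Derive a lookup such as drug -> trial ids or indication -> trial ids."""
    index = defaultdict(list)
    for record in records:
        values = record.get(field) or []
        if isinstance(values, str):
            values = [values]
        for value in values:
            index[value].append(record["trial_id"])
    return dict(index)


# Made-up inputs for illustration only.
treatment_info = {
    "T1": {"drugs": ["pembrolizumab", "cisplatin"], "indication": "non-small cell lung cancer"},
    "T2": {"drugs": ["cisplatin", "pemetrexed"], "indication": "non-small cell lung cancer"},
}
evaluation_info = {"T1": "positive"}
result_info = {"T1": {"primary_endpoint": "met"}}
records = build_information_set(treatment_info, evaluation_info, result_info)
trials_by_drug = index_by(records, "drugs")
```

Because each record is a flat dictionary, the list maps naturally onto rows of a table for display.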
According to the method provided by the embodiment of the invention, text analysis is performed on the trial text of each combination therapy trial to obtain its clinical treatment information and its clinical evaluation information and/or clinical result information, and the combination therapy information set is constructed from the obtained information. This achieves comprehensive and reliable mining of combination therapy information while effectively improving mining efficiency and reducing mining cost.
Based on the above embodiment, FIG. 2 is a schematic flow chart of step 110 of the combination therapy information mining method provided by the present invention. As shown in FIG. 2, step 110 includes:
Step 111, extracting treatment regimen texts from clinical trial texts;
Step 112, acquiring the drug entities contained in each treatment regimen text;
Step 113, screening out the trial texts of the combination therapy trials based on the number of drug entities contained in each treatment regimen text.
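Steps 112 and 113 can be sketched with a simple dictionary lookup standing in for drug entity recognition. The `KNOWN_DRUGS` vocabulary, the threshold of two distinct drugs and the function names are assumptions for this example; the patent does not fix the recognition technique, and a trained named entity recognizer could replace the lookup.

```python
import re

# Hypothetical drug vocabulary; a real system would use a full drug dictionary
# or a trained named entity recognition model instead.
KNOWN_DRUGS = {"pembrolizumab", "cisplatin", "bevacizumab", "pemetrexed"}


def extract_drug_entities(regimen_text):
    """Return the distinct known drug names mentioned in a treatment regimen text."""
    tokens = re.findall(r"[a-z][a-z0-9-]*", regimen_text.lower())
    return {token for token in tokens if token in KNOWN_DRUGS}


def screen_combination_trials(regimen_texts, min_drugs=2):
    """Keep the trials whose regimen text mentions at least `min_drugs` distinct drugs."""
    selected = {}
    for trial_id, text in regimen_texts.items():
        drugs = extract_drug_entities(text)
        if len(drugs) >= min_drugs:
            selected[trial_id] = sorted(drugs)
    return selected


# Made-up regimen texts for illustration only.
regimen_texts = {
    "T1": "Pembrolizumab plus cisplatin every 3 weeks",
    "T2": "Cisplatin monotherapy; cisplatin dose may be reduced",
}
combination_trials = screen_combination_trials(regimen_texts)
```

Counting distinct drug names rather than mentions keeps a single-drug trial that repeats its drug name, such as `T2` here, out of the result.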
Specifically, the clinical trial text refers to text data containing information about clinical trials. Sources of clinical trial text include, but are not limited to, the ClinicalTrials.gov clinical trial database, the ISRCTN clinical trial registry, the European Union clinical trial database EudraCT, the Chinese Clinical Trial Registry, the drug clinical trial registration and information disclosure platform (Chinese drug trials), and papers and related literature. The clinical trial text may include trial texts of clinical trials involving a single drug as well as trial texts of combination therapy trials, so the clinical trial text needs to be screened in order to obtain the trial texts of the combination therapy trials.
Specifically, when the clinical trial text is screened, the treatment plan text of each clinical trial can be extracted first. For example, the clinical trial registration number of each clinical trial may first be obtained from the clinical trial text; then, with the registration number as a search term, the text passage containing the treatment plan information for that registration number is retrieved from the clinical trial text and taken as the treatment plan text corresponding to that registration number. It should be noted that each clinical trial corresponds to one clinical trial registration number, and registration numbers can be identified by matching against preset rules. The registration number of a clinical trial may also be obtained from an existing registration platform, by querying at least one of the ClinicalTrials.gov database, the ISRCTN registry, the European Union clinical trial database EudraCT, the Chinese Clinical Trial Registry, and the drug clinical trial registration and information disclosure platform.
For the treatment plan text corresponding to each clinical trial registration number, the drug entities it contains can be obtained, and the number of drug entities is then the count of distinct drug entity names in the treatment plan text of that registration number. To count them, the passages describing drugs in the treatment plan text may be located directly and the drug entities counted by rule matching; or the drugs in those passages may be separated by word segmentation and then counted; or entity recognition may be performed on the treatment plan text to obtain each drug entity it contains, and the entities then counted. To enable entity recognition on the treatment plan text, an entity recognition model may be trained in advance. The treatment plan text is input into the pre-trained entity recognition model, which performs entity recognition and outputs an entity label for each word in the treatment plan text. The labeling scheme may be BIO, BIOES or the like, where B marks the beginning of an entity, I an intermediate word of an entity, E the end of an entity, O a non-entity word, and S a single-word entity.
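As a non-limiting illustration of how per-word BIO or BIOES labels could be turned into a count of drug entities, a minimal sketch follows. The function names `decode_entities` and `count_distinct_drugs` are hypothetical, the tags are assumed to come from some entity recognition model that is not shown, and joining multi-word entities with a space is an assumption suited to English text.

```python
def decode_entities(tokens, tags):
    """Collect entity strings from per-token BIO or BIOES tags."""
    entities, current = [], []

    def flush():
        # Close the entity being built, if any.
        if current:
            entities.append(" ".join(current))
            current.clear()

    for token, tag in zip(tokens, tags):
        prefix = tag.split("-", 1)[0]  # accepts "B" as well as "B-DRUG"
        if prefix == "S":              # single-word entity (BIOES only)
            flush()
            entities.append(token)
        elif prefix == "B":            # beginning of a new entity
            flush()
            current.append(token)
        elif prefix in ("I", "E") and current:
            current.append(token)
            if prefix == "E":          # explicit end of entity (BIOES only)
                flush()
        else:                          # "O", or a stray I/E with no open entity
            flush()
    flush()
    return entities


def count_distinct_drugs(entities):
    """Number of different drug entity names, ignoring case."""
    return len({name.lower() for name in entities})
```

For example, with the words "nivolumab plus ipilimumab" tagged B, O, B, the sketch yields two entities, so the treatment plan text would contain two drug entities.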
After the number of drug entities contained in each treatment plan text is obtained, the trial texts of the combination therapy trials can be screened out of the clinical trial text according to that number. In a specific implementation, the treatment plan texts may be filtered directly by the number of drug entities, and a treatment plan text containing 2 or more drug entities is determined to belong to a combination therapy trial. Alternatively, the treatment plan texts may be labeled first: a treatment plan containing 2 or more drug entities is labeled as combination medication, and a treatment plan containing fewer than 2 drug entities is labeled as single medication. The clinical trial registration numbers whose treatment plans are labeled as combination medication are then screened out, and the trial texts of those registration numbers are determined to be the trial texts of the combination therapy trials.
It should be noted that each combination therapy trial corresponds to one clinical trial registration number. The trial text of a combination therapy trial retrieved by its clinical trial registration number can include the treatment plan text of the combination therapy corresponding to that registration number.
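The labeling and screening just described can be sketched as follows. This is only an illustrative sketch: the function names and the registration numbers "REG-A", "REG-B" and "REG-C" are hypothetical placeholders, and SOC is assumed to count as one drug entity.

```python
def label_treatment_plan(drug_names):
    """Label a treatment plan by its number of distinct drug entities."""
    distinct = {name.strip().lower() for name in drug_names if name.strip()}
    return "combination drug" if len(distinct) >= 2 else "single drug"


def screen_combination_trials(plans):
    """Return the registration numbers whose plan is a combination drug plan.

    `plans` maps a clinical trial registration number to the list of drug
    entity names found in its treatment plan text.
    """
    return [
        reg_no
        for reg_no, drug_names in plans.items()
        if label_treatment_plan(drug_names) == "combination drug"
    ]


plans = {
    "REG-A": ["lorlatinib"],
    "REG-B": ["nivolumab", "ipilimumab"],
    "REG-C": ["nivolumab", "ipilimumab", "SOC"],
}
combination_trials = screen_combination_trials(plans)
```

The trial texts of the returned registration numbers would then be taken as the trial texts of the combination therapy trials.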
The method provided by the embodiment of the invention determines the test texts of each combination therapy test by analyzing the number of the drug entities contained in the treatment scheme texts corresponding to the registration numbers of each clinical test, provides an information source for the subsequent information mining of each combination therapy, and is favorable for realizing comprehensive and reliable information mining of the combination therapy.
Based on the above embodiment, the treatment plan text together with the drug entities it contains may be presented in the form shown in Table 1. Table 1 lists the standard drug names corresponding to each clinical trial registration number, where the standard drug names are obtained by matching the drug entity names produced by entity recognition against the drug dictionary established in the database. In addition, drug entity names that denote standard of care (SOC), such as chemotherapy or radiotherapy, are replaced with SOC.
The labeled treatment plan text may be presented in the form shown in Table 2, where each treatment plan is labeled as single drug or combination drug.
TABLE 1
Clinical trial registration number    Drug name
NCTxxxxxxxx                           Lorlatinib
NCTxxxxxxxx                           Nivolumab + ipilimumab
NCTxxxxxxxx                           Nivolumab + ipilimumab + SOC
TABLE 2
Clinical trial registration number    Drug name (treatment plan label)
NCTxxxxxxxx                           Lorlatinib (single drug)
NCTxxxxxxxx                           Nivolumab + ipilimumab (combination drug)
NCTxxxxxxxx                           Nivolumab + ipilimumab + SOC (combination drug)
Subsequently, a test text corresponding to the clinical test registration number with the treatment scheme label of "combination drug" can be screened from table 2, and the test text is the test text of the combination therapy test.
Based on the above embodiments, the clinical treatment information includes at least one of an indication, an associated target, an associated treatment mode, a clinical stage, and a treatment type.
In particular, when a pharmaceutical company investigates whether to set up a combination therapy trial, it needs not only information on combinations of specific drugs; its need for combination information along several dimensions is even more pressing, for example information on combining a drug with a target (e.g., a drug + PD1) or combining a drug with a treatment mode (e.g., a drug + chemotherapy). Accordingly, clinical treatment information may be mined from at least one of the following aspects: indication, associated target, associated treatment mode, clinical stage, and treatment type.
Here, the indication characterizes the range of diseases that the combination therapy can treat; for example, the indication corresponding to the combination drug name "nivolumab + SOC" is gastric cancer, and the indication corresponding to the combination drug name "nivolumab + cabozantinib" is renal cell carcinoma. The associated target characterizes the target name corresponding to a drug name of the combination therapy; for example, the associated target may be PD1 or c-Kit. The associated treatment mode characterizes the treatment mode corresponding to a drug name of the combination therapy; for example, the associated treatment mode may be immunotherapy or targeted therapy. The clinical stage characterizes the clinical phase of the combination therapy trial; for example, the clinical stage may be phase I, phase II, phase III or phase IV. The treatment type characterizes the type of treatment of the combination therapy trial, and may be first-line treatment, second-line and above treatment, neoadjuvant treatment and the like.
Further, when mining information for each combination therapy, the clinical treatment information may include any one of the indication, associated target, associated treatment mode, clinical stage and treatment type described above, any two, three or four of them, or all five.
Based on the above embodiment, step 120 includes at least one of the following steps:
selecting the text indications of the combination therapy tests from the test texts of the combination therapy tests, and standardizing the text indications based on an indication dictionary to obtain the indications of the combination therapy tests;
determining the associated target point of each combination therapy test based on the drug name contained in the test text of each combination therapy test and the relationship between the drug name and the target point established in advance;
determining an associated treatment mode of each combination therapy experiment based on the drug name contained in the experiment text of each combination therapy experiment and a pre-established relationship between the drug name and the treatment mode;
selecting and cleaning clinical stages of the combination therapy trials from the trial texts of the combination therapy trials;
generating the treatment type of each combination therapy trial based on the clinical trial title and inclusion criteria contained in the trial text of each combination therapy trial.
Specifically, mining of the indication of each combination therapy trial can start from its text indication, which is the indication information corresponding to the clinical trial registration number of the trial. The indication dictionary is preset and contains the correspondence between drugs and indications. In a specific implementation, the indication information corresponding to each clinical trial registration number, namely the text indication, is obtained from the trial text of each combination therapy trial; the text indication is then cleaned and matched against the indication dictionary established in the database, which completes the standardization of the text indication and yields the indication of each combination therapy trial. Here, the text indication may be obtained by performing indication entity recognition on the trial text of each combination therapy trial, or by performing rule matching on the trial text, which is not specifically limited in the embodiments of the present invention.
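The cleaning and dictionary matching of a text indication might look like the following minimal sketch. The dictionary entries are illustrative assumptions rather than the contents of any real indication dictionary, and a simple synonym-to-standard-name mapping is assumed here for brevity.

```python
import re

# Hypothetical indication dictionary: cleaned surface form -> standard name.
INDICATION_DICTIONARY = {
    "gastric cancer": "gastric cancer",
    "stomach cancer": "gastric cancer",
    "renal cell carcinoma": "renal cell carcinoma",
    "rcc": "renal cell carcinoma",
}


def normalize_indication(text_indication):
    """Clean a raw text indication and map it to its standard name."""
    cleaned = re.sub(r"[^a-z0-9 ]+", " ", text_indication.lower())
    cleaned = re.sub(r"\s+", " ", cleaned).strip()
    # Returns None when the cleaned text is not in the dictionary.
    return INDICATION_DICTIONARY.get(cleaned)
```

Under these assumptions, a raw value such as "  Stomach  Cancer." is cleaned to "stomach cancer" and standardized to "gastric cancer".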
The mining of the associated target points of each combination therapy test can be started from the medicine names contained in the test texts of each combination therapy test, and after the medicine names contained in the treatment scheme texts of each combination therapy test are obtained, the target point names corresponding to the medicine names are obtained based on the established association relationship between the medicine names and the target points, and the associated target points of each combination therapy test are determined.
The mining of the association treatment pattern for each combination therapy test may also be started from the drug name included in the test text of each combination therapy test, and after the drug name included in the treatment plan text of each combination therapy test is acquired, the treatment pattern corresponding to the drug name is acquired based on the established association relationship between the drug name and the treatment pattern, and the association treatment pattern for each combination therapy test is determined.
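The two lookups described above, from drug name to target and from drug name to treatment mode, can be sketched with plain dictionaries. The mapping contents below are a small illustrative sample chosen for this sketch, not the association relationships actually established in the database, and the function name is hypothetical.

```python
# Illustrative sample mappings (assumptions for this sketch).
DRUG_TO_TARGET = {
    "nivolumab": "PD1",
    "ipilimumab": "CTLA-4",
    "lorlatinib": "ALK",
}
DRUG_TO_TREATMENT_MODE = {
    "nivolumab": "immunotherapy",
    "ipilimumab": "immunotherapy",
    "lorlatinib": "targeted therapy",
}


def associated_values(drug_names, mapping):
    """Look up each drug name and return the distinct values, in order."""
    found = []
    for name in drug_names:
        value = mapping.get(name.strip().lower())
        if value is not None and value not in found:
            found.append(value)
    return found


drugs = ["Nivolumab", "ipilimumab"]
targets = associated_values(drugs, DRUG_TO_TARGET)
modes = associated_values(drugs, DRUG_TO_TREATMENT_MODE)
```

Drug names that are absent from a mapping, such as SOC in this sample, simply contribute nothing to the result.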
Mining of the clinical stage of each combination therapy trial can start from the trial text of the trial. The clinical stage information is obtained from the trial text, for example by keyword matching or rule matching, and is then standardized to give the clinical stage of the combination therapy trial. For example, if the raw data obtained are "Phase 1" and "Phase 2", the cleaned and standardized clinical stages are "Phase I" and "Phase II", respectively.
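One possible rule-matching implementation of this standardization is sketched below. The regular expression and the function name are assumptions made for illustration; real registry data may use other spellings that this pattern does not cover.

```python
import re

_ARABIC_TO_ROMAN = {"1": "I", "2": "II", "3": "III", "4": "IV"}
# Matches "Phase 1" ... "Phase 4" as well as "Phase I" ... "Phase IV".
_PHASE_PATTERN = re.compile(r"phase\s*([1-4]|IV|I{1,3})\b", re.IGNORECASE)


def normalize_phases(raw_text):
    """Extract clinical phases and rewrite them with Roman numerals."""
    phases = []
    for match in _PHASE_PATTERN.finditer(raw_text):
        value = match.group(1).upper()
        label = "Phase " + _ARABIC_TO_ROMAN.get(value, value)
        if label not in phases:  # keep each phase once, in order of appearance
            phases.append(label)
    return phases
```

With the raw data "Phase 1, Phase 2" from the example above, the sketch returns "Phase I" and "Phase II".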
Mining of the treatment type of each combination therapy trial can start from the clinical trial title and inclusion criteria contained in the trial text of the trial. After the clinical trial title and inclusion criteria of each combination therapy trial are obtained, they are analyzed to obtain the treatment type of the trial. Specifically, keyword detection may be performed on the clinical trial title and inclusion criteria, the detected keywords matched against preset rules, and the treatment type associated with the matched rule taken as the treatment type of the combination therapy trial. Alternatively, the clinical trial title and inclusion criteria may be input into a pre-trained treatment type generation model, which generates and outputs the treatment type of the combination therapy trial based on the semantics of the clinical trial title and inclusion criteria.
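The keyword-and-rule variant can be sketched as follows. The keyword lists are illustrative assumptions only; the preset rules themselves are not given in this description, and the function name is hypothetical.

```python
# Hypothetical preset rules: treatment type -> keywords that indicate it.
TREATMENT_TYPE_RULES = [
    ("neoadjuvant treatment", ("neoadjuvant",)),
    ("first-line treatment",
     ("first-line", "first line", "previously untreated", "treatment-naive")),
    ("second-line and above treatment",
     ("second-line", "second line", "previously treated", "refractory")),
]


def infer_treatment_types(title, inclusion_criteria):
    """Return every treatment type whose keywords occur in the texts."""
    text = f"{title} {inclusion_criteria}".lower()
    return [
        treatment_type
        for treatment_type, keywords in TREATMENT_TYPE_RULES
        if any(keyword in text for keyword in keywords)
    ]
```

Simple substring matching is used here for brevity; a practical rule set would likely need negation handling and a richer vocabulary.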
After obtaining at least one of the indications, associated targets, associated treatment modes, clinical phases and treatment types of each combination therapy trial, the information can be integrated to obtain clinical treatment information for each combination therapy trial.
According to the method provided by the embodiment of the invention, at least one of the indication, the associated target, the associated treatment mode and the clinical stage of each combination therapy is obtained by mining the test text of each combination therapy test, and information of multiple dimensions such as diseases, targets and medicines is provided, so that the integration of the information of each combination therapy is realized.
Based on the above embodiments, the treatment type of each combination therapy trial may be mined with a generative model whose input is the clinical trial title explicitly labeled "title:" together with the inclusion criteria explicitly labeled "Inclusion criterion:", and whose output is the treatment type of the combination therapy trial. Including the explicit labels "title:" and "Inclusion criterion:" in the input makes it easier for the generative model to learn the boundary between the clinical trial title and the inclusion criteria during training, which reduces the amount of training data the generative model requires.
For example, the generative model may be an mT5 generative model. In the following table, input_text is an input sample and thermal_labels is an output sample. Each input may correspond to an output consisting of multiple labels, with different labels separated by commas. During model prediction, if the output of the model contains commas, the combination therapy trial corresponds to multiple treatment types, which are obtained by splitting the output at the commas.
Figure BDA0003284857340000111
Figure BDA0003284857340000121
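The construction of the explicitly labeled model input and the splitting of a multi-label model output can be sketched as below. The generative model itself is not invoked; the function names are hypothetical, and the exact spacing between the two labeled parts is an assumption.

```python
def build_input_text(title, inclusion_criteria):
    """Join title and inclusion criteria with the explicit labels."""
    return (
        f"title: {title.strip()} "
        f"Inclusion criterion: {inclusion_criteria.strip()}"
    )


def split_treatment_types(model_output):
    """Split a comma-separated model output into treatment types."""
    return [part.strip() for part in model_output.split(",") if part.strip()]
```

For instance, a model output of "first-line treatment, neoadjuvant treatment" would be split into two treatment types, while an output without commas yields a single treatment type.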
Based on the above embodiments, the clinical treatment information of each combination therapy trial, combining its indication, associated target, associated treatment mode and clinical stage, can be presented in the form shown in Table 3.
TABLE 3
Figure BDA0003284857340000122
Based on any of the above embodiments, fig. 3 is a schematic flowchart of step 130 in the method for mining information of combination therapy provided by the present invention, as shown in fig. 3, step 130 includes:
step 131, acquiring the efficacy indexes of each combination therapy trial based on the trial text of each combination therapy trial, wherein the efficacy indexes include at least one of a control group, a primary endpoint, a secondary endpoint, adverse reactions, a preset target and the author's self-evaluation.
In particular, the assessment of efficacy with respect to clinical outcome of each combination therapy trial may be based on at least one of several dimensions: control, primary endpoint, secondary endpoint, adverse reaction, pre-set target and author self-evaluation.
The treatment plan of each clinical trial may include experimental group information and control group information. To mine these two pieces of information, the experimental group and control group information corresponding to each clinical trial registration number can be obtained from the trial text of each combination therapy trial, for example by keyword matching or rule matching. For instance, the experimental group and control group information may be extracted directly from a clinical trial registration platform using the clinical trial registration number, or entity recognition or rule matching may be performed on the titles of papers and on the trial method passages of their abstracts. For example, the experimental group and control group information corresponding to each clinical trial registration number is shown in Table 4 below.
TABLE 4
Figure BDA0003284857340000131
The primary endpoint reflects the main purpose of the clinical trial and is the main efficacy index that accurately reflects the effectiveness of the drug; it may be a clinical endpoint or a recognized surrogate endpoint for the same indication and research purpose. The secondary endpoint is an important supportive efficacy index related to the main purpose of the clinical trial, or an efficacy index related to a secondary purpose.
Mining of the primary and secondary endpoints can start from the trial method and trial results of each combination therapy trial. In a specific implementation, the trial method and trial results corresponding to each clinical trial registration number are obtained from the trial text of each combination therapy trial. By the usual writing conventions of papers, an abstract is written either in four sections or as a single paragraph. In the four-section form, the abstract is written as four separate sections in the order "trial purpose", "trial method", "trial results" and "trial conclusion", so the "trial method" and "trial results" sections can be located directly in the abstracts of the papers related to the combination therapy trial. In the single-paragraph form, all content related to the trial is written in one paragraph; in this case the trial text of the combination therapy trial can be split into sentences and each sentence input into a classification model. The classification model may be pre-trained to determine the passage type of the input text; its output may be either "trial method" or "non-trial method", or one of "trial purpose", "trial method", "trial results" and "trial conclusion", so that the trial method and trial results are located.
On this basis, the preset clinical primary/secondary endpoint indexes can be obtained from the trial method by entity recognition or rule matching. Based on the obtained preset primary/secondary endpoint index information, the "primary endpoint data", "secondary endpoint data" or "adverse reaction data" of the target clinical trial are obtained from the corresponding trial results and formatted, and the endpoint indexes and their corresponding data are extracted.
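For abstracts written in the four-section form, locating the method and results sections can be sketched with a heading pattern, as below. The English heading words are assumptions about how such abstracts are labeled, and the function name is hypothetical; a single-paragraph abstract produces an empty result, which would signal a fallback to the sentence classification model described above.

```python
import re

# Assumed section headings at the start of a line, each followed by a colon.
_HEADING = re.compile(
    r"^(objective|purpose|background|methods?|results?|conclusions?)\s*:\s*",
    re.IGNORECASE | re.MULTILINE,
)


def split_structured_abstract(abstract):
    """Map each heading (singular, lower case) to its section text."""
    sections = {}
    matches = list(_HEADING.finditer(abstract))
    for i, match in enumerate(matches):
        end = matches[i + 1].start() if i + 1 < len(matches) else len(abstract)
        key = match.group(1).lower().rstrip("s")  # "methods" -> "method"
        sections[key] = abstract[match.end():end].strip()
    return sections
```

The "method" and "result" entries of the returned dictionary would then be passed on to endpoint extraction.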
Specifically, when a primary endpoint, a secondary endpoint and an adverse reaction are mined, rule matching can be performed directly based on preset rules, or entity recognition or rule matching can be performed on a corpus expressing the primary endpoint/the secondary endpoint in a test method paragraph, and if the primary endpoint and the secondary endpoint cannot be obtained through the entity recognition or the rule matching, entity recognition or rule matching can be performed on the whole test method paragraph.
For example, the trial method of any combination therapy trial may be input as text into the entity recognition model, and the index types of the entities output by the model are obtained, as shown in Table 5 below, where the text is the clinical trial method and the primary endpoint and secondary endpoint are the index types obtained by entity recognition. Table 6 shows, for one combination therapy trial, the endpoint indexes obtained after formatting and the corresponding trial result data.
TABLE 5
Figure BDA0003284857340000141
TABLE 6
Figure BDA0003284857340000142
The preset target is the clinical trial objective set in advance by the authors of the combination therapy trial, and mining of the preset target can start from the trial method of each combination therapy trial. The trial method information corresponding to each clinical trial registration number is obtained from the trial text of the combination therapy trial, and the clinical trial objective preset by the authors is determined by sentiment analysis. For example, the text from the primary endpoint to the end of the trial method passage is extracted and sentiment analysis is performed on it; if this part of the text cannot be found, sentiment analysis is performed on the whole trial method passage. The clinical trial objectives include: the expected clinical result of the experimental group is better than that of the control group, and the expected clinical result of the experimental group is not worse than that of the control group. The clinical trial objective preset by the authors is labeled to give a preset target of "superior" or "non-inferior", where "superior" means that the expected clinical result of the experimental group is better than that of the control group, and "non-inferior" means that the expected clinical result of the experimental group is not worse than that of the control group. Before the preset target is mined, sample trial methods and their preset target labels may be collected in advance and used to train a sentiment analysis model, so that the model learns the correspondence between the sentiment expressed in a trial method and the preset target. When the preset target is to be obtained, the trial method text is input directly into the trained sentiment analysis model to obtain the preset target.
For example, the preset target of any combination therapy trial can be presented as shown in Table 7, where the text is the clinical trial objective preset by the authors and the result is the preset target.
TABLE 7
Figure BDA0003284857340000151
Mining of the author's self-evaluation can start from the text in which the authors of each combination therapy trial discuss the clinical result. The authors' discussion text corresponding to each clinical trial registration number is obtained from the trial text of each combination therapy trial, and self-evaluation labels for the clinical result are preset, such as "superior", "positive", "non-inferior", "similar" and "negative". Natural language processing (NLP) is then used to perform sentiment analysis on the target text, based on rules or on an automatic system, to obtain the author's self-evaluation of the clinical trial result. Here, the discussion text of a combination therapy trial may be the "trial conclusion" section of the abstract of a paper. If the abstract has no such section, the abstract of the paper on the combination therapy trial may be split into sentences and each sentence input into a classification model. The classification model may be pre-trained to determine the passage type of the input text; its output may be either "discussion" or "non-discussion", or one of "trial purpose", "trial method", "trial results" and "trial conclusion", so that the "trial conclusion" is located.
For example, the author's self-evaluation for any combination therapy trial is shown in Table 8, where the text is the authors' discussion of the clinical trial result and the label is the resulting self-evaluation.
TABLE 8
Figure BDA0003284857340000152
Figure BDA0003284857340000161
Step 132, constructing clinical outcome information for each combination therapy trial based on at least one of the primary endpoint, the secondary endpoint, and the adverse reaction;
and/or evaluating the curative effect of each combination therapy experiment based on the curative effect index of each combination therapy experiment to obtain the clinical evaluation information of each combination therapy experiment.
Specifically, in the efficacy indexes of the combination therapy tests obtained in step 131, the primary endpoint, the secondary endpoint and the adverse reaction are objective results obtained by the tests, and can be used for constructing clinical result information of the combination therapy tests.
In addition, the efficacy of each combination therapy trial may be evaluated based on the efficacy indexes obtained in step 131 to obtain the clinical assessment information of each combination therapy trial. In a specific implementation, clinical assessment labels such as "terminated", "poor", "similar", "non-inferior", "positive" and "superior" may be preset based on preset standard clinical assessment rules.
For example, the standard clinical assessment rules may be as follows:
① if the author's self-evaluation of the clinical trial result is "superior";
the control group drug in the clinical trial protocol is a positive drug;
the preset target of the trial protocol is "superior";
the primary/secondary endpoints in the clinical trial result are met;
then: the clinical assessment information is "superior";
② if the author's self-evaluation is "positive";
the control group drug in the clinical trial protocol is blank or placebo;
the preset target of the trial protocol is "superior";
the primary/secondary endpoints in the clinical trial result are met;
then: the clinical assessment information is "positive";
③ if the author's self-evaluation is "positive";
the primary endpoint in the clinical trial result is met;
then: the clinical assessment information is "positive";
④ if the author's self-evaluation is "similar";
the control group drug in the clinical trial protocol is a positive drug;
the primary/secondary endpoints in the clinical trial result are met;
then: the clinical assessment information is "similar";
⑤ if the control group drug in the clinical trial protocol is a positive drug;
the preset target of the trial protocol is "non-inferior";
the primary endpoint in the clinical trial result is not met;
then: the clinical assessment information is "poor";
⑥ if the author's self-evaluation is "negative";
then: the clinical assessment information is "poor";
⑦ if the control group drug in the clinical trial protocol is blank or placebo;
the preset target of the trial protocol is "superior";
the primary endpoint in the clinical trial result is not met;
then: the clinical assessment information is "poor";
⑧ if the information corresponding to a clinical trial registration number obtained from the clinical trial text contains any information related to termination of the trial, the clinical assessment information is "terminated".
For example, if the original information obtained is "Recruitment Status: Terminated (corporation decision to terminate study after Lead-In portion of the study complete.)", the clinical assessment information is "terminated".
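The example rules above can be expressed as an ordered chain of checks, as in the following sketch. Several points are assumptions made for illustration: the label strings ("superior", "non-inferior", "poor" and so on) are English renderings chosen here, termination is checked first although the description does not state a precedence, and "primary/secondary endpoints are met" is read as the primary endpoint being met with no secondary endpoint reported as missed.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class EfficacyIndex:
    author_self_evaluation: Optional[str] = None   # e.g. "superior", "negative"
    control_group: Optional[str] = None            # "positive drug", "placebo", "blank"
    preset_target: Optional[str] = None            # "superior" or "non-inferior"
    primary_endpoint_met: Optional[bool] = None
    secondary_endpoint_met: Optional[bool] = None
    terminated: bool = False


def assess(index):
    """Apply the example rules in order; return None if no rule matches."""
    if index.terminated:  # termination rule, checked first by assumption
        return "terminated"
    author = index.author_self_evaluation
    positive_control = index.control_group == "positive drug"
    inert_control = index.control_group in ("blank", "placebo")
    primary_met = index.primary_endpoint_met is True
    primary_missed = index.primary_endpoint_met is False
    endpoints_met = primary_met and index.secondary_endpoint_met is not False

    if author == "superior" and positive_control \
            and index.preset_target == "superior" and endpoints_met:
        return "superior"
    if author == "positive" and inert_control \
            and index.preset_target == "superior" and endpoints_met:
        return "positive"
    if author == "positive" and primary_met:
        return "positive"
    if author == "similar" and positive_control and endpoints_met:
        return "similar"
    if positive_control and index.preset_target == "non-inferior" and primary_missed:
        return "poor"
    if author == "negative":
        return "poor"
    if inert_control and index.preset_target == "superior" and primary_missed:
        return "poor"
    return None
```

Returning None when no rule applies is a design choice of this sketch; the description does not say how unmatched trials are labeled.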
According to the method provided by the embodiment of the invention, the curative effect evaluation is carried out on the combined therapy test through a plurality of dimensions such as a control group, a primary endpoint, a secondary endpoint, adverse reactions, a preset target, self evaluation of an author and the like, so that objective and accurate evaluation on the test result of the combined therapy is realized.
Based on any of the above embodiments, fig. 4 is a schematic flowchart of step 140 in the method for mining information of combination therapy provided by the present invention, as shown in fig. 4, step 140 includes:
step 141, obtaining drug approval information in each combination therapy test based on drug data on the market in each country;
step 142, constructing a combination therapy information set based on the clinical treatment information, clinical evaluation information and/or clinical result information, and drug approval information of each combination therapy trial.
Specifically, the marketed drug data of each country refers to the drugs approved in each country and their related information. It may include the lists of approved drugs published by agencies such as the U.S. Food and Drug Administration (FDA), the Center for Drug Evaluation (CDE), the European Medicines Agency (EMA) and the National Medical Products Administration (NMPA), or the drug names, indications and clinical trial registration numbers extracted and standardized from drug package inserts.
A traversal search is performed on the marketed drug data of each country using the clinical treatment information obtained in step 120 or the clinical trial registration number. If the clinical treatment information or clinical trial registration number of a combination therapy trial is contained in the marketed drug data of a country, the approving regulatory agency is recorded in the drug approval information; for example, the drug approval information may be one of FDA/EMA/CDE/NMPA. If the clinical treatment information or clinical trial registration number of the combination therapy trial is not contained in the marketed drug data of any country, the drug approval information is null.
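The traversal search might be sketched as follows. The record layout, the function name and the registration number "REG-0001" are hypothetical, and matching on the combination drug name is used here as a simple stand-in for matching on clinical treatment information.

```python
def find_approvals(registration_number, combination_name, marketed_data):
    """Return the agencies whose marketed drug data contain the trial.

    `marketed_data` maps an agency name to a list of records, each a dict
    with optional "registration_numbers" and "drug_name" entries.
    Returns None (null) when no agency matches.
    """
    agencies = []
    for agency, records in marketed_data.items():
        for record in records:
            same_trial = registration_number in record.get("registration_numbers", ())
            same_combination = (
                record.get("drug_name", "").lower() == combination_name.lower()
            )
            if same_trial or same_combination:
                agencies.append(agency)
                break  # one match per agency is enough
    return agencies or None


marketed_data = {
    "FDA": [{"drug_name": "drug A + drug B", "registration_numbers": ["REG-0001"]}],
    "EMA": [{"drug_name": "drug C", "registration_numbers": []}],
}
approval = find_approvals("REG-0001", "drug A + drug B", marketed_data)
```

In this hypothetical data only the FDA list contains the trial, so the drug approval information would record FDA.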
After obtaining clinical treatment information, drug approval information, and clinical assessment information and/or clinical outcome information for each combination therapy trial, the information can be integrated to obtain each combination therapy information set.
For example, the resulting sets of combination therapy information can be presented in the form of table 9.
Table 9 (image BDA0003284857340000181 in the original publication)
The method provided by this embodiment of the invention obtains each combination therapy information set by integrating the clinical treatment information, the drug approval information, and the clinical evaluation information and/or clinical result information. It thereby achieves comprehensive and reliable combination therapy information mining while improving the efficiency and reducing the cost of that mining.
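As a rough sketch of the integration step, the per-trial pieces of information can be merged into one record per trial. The choice of the registration number as the join key, the field names, and the function `build_information_set` are assumptions made for illustration; the source does not specify a data layout.

```python
def build_information_set(treatment, evaluation, outcome, approval):
    """Merge per-trial dictionaries (keyed by registration number) into
    one combined record per combination therapy trial."""
    records = []
    for reg_no, info in treatment.items():
        record = {"registration_number": reg_no, **info}
        # Evaluation and outcome are "and/or" in the text, so either may be missing.
        record["clinical_evaluation"] = evaluation.get(reg_no)
        record["clinical_outcome"] = outcome.get(reg_no)
        record["drug_approval"] = approval.get(reg_no)  # None means not approved anywhere
        records.append(record)
    return records

# Placeholder inputs; none of these values come from the patent.
info_set = build_information_set(
    treatment={"REG-0001": {"indication": "hepatocellular carcinoma", "clinical_stage": "Phase II"}},
    evaluation={"REG-0001": "positive"},
    outcome={},
    approval={"REG-0001": ["FDA"]},
)
print(info_set)
```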
Fig. 5 is a schematic flow chart of a method for querying information of a combination therapy provided by the present invention, as shown in fig. 5, the method includes:
step 510, acquiring a target search term input by a user;
step 520, screening the clinical treatment information, the clinical evaluation information and/or the clinical result information of the combined therapy test corresponding to the target search term from the combined therapy information set, wherein the combined therapy information set is determined based on the combined therapy information mining method.
Specifically, based on the combination therapy information set obtained by the above method, a user can search and statistically analyze the data along any single dimension or combination of dimensions, such as drug, indication, target and treatment mode, to find effective drug or target combination schemes and select suitable indications, helping marketed drugs expand their indications more quickly or new drugs reach the market more quickly.
In a specific implementation, the data can be aggregated by identical "combined drug name". The user inputs any one or any combination of search terms for indication, drug, target and treatment mode to obtain the treatment data of the corresponding combination schemes; for example, searching for the "PD-1/PD-L1" target + "hepatocellular carcinoma" returns the combination therapy trial information the user wants to query, on which statistical analysis is then performed.
With the method provided by this embodiment of the invention, single or combined screening strategies over multiple dimensions such as indication, target, drug and treatment mode flexibly and accurately help the user screen out the combination therapy directions that a target product can explore at the clinical trial stage, thereby avoiding failure risk to a great extent and reducing clinical trial cost.
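The screening and aggregation just described might look like the sketch below. The record fields (`drugs`, `targets`, `indications`), the helper names `query` and `aggregate_by_regimen`, and all trial data are hypothetical; only the search terms "PD-1/PD-L1" and "hepatocellular carcinoma" are taken from the text.

```python
from collections import defaultdict

def query(info_set, **criteria):
    """Keep the records whose fields contain every requested search term."""
    def matches(record):
        return all(term in record.get(field, set()) for field, term in criteria.items())
    return [record for record in info_set if matches(record)]

def aggregate_by_regimen(records):
    """Group registration numbers under the same combined drug name."""
    groups = defaultdict(list)
    for record in records:
        name = " + ".join(sorted(record["drugs"]))  # order-independent regimen name
        groups[name].append(record["registration_number"])
    return dict(groups)

# Placeholder information set; drug names and numbers are not real data.
INFO_SET = [
    {"registration_number": "REG-0001", "drugs": {"drug_a", "drug_b"},
     "targets": {"PD-1/PD-L1", "VEGFR"}, "indications": {"hepatocellular carcinoma"}},
    {"registration_number": "REG-0002", "drugs": {"drug_b", "drug_a"},
     "targets": {"PD-1/PD-L1", "VEGFR"}, "indications": {"hepatocellular carcinoma"}},
    {"registration_number": "REG-0003", "drugs": {"drug_a", "drug_c"},
     "targets": {"PD-1/PD-L1", "CTLA4"}, "indications": {"renal cell carcinoma"}},
]

hits = query(INFO_SET, targets="PD-1/PD-L1", indications="hepatocellular carcinoma")
print(aggregate_by_regimen(hits))
```

Passing the dimensions as keyword arguments lets a single dimension or any combination of dimensions be used with the same function.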
Based on the above embodiments, the query and data analysis of the combination therapy information set can be performed as follows:
If the search terms "PD-1/PD-L1" (target) + "hepatocellular carcinoma" are input, the combination therapy information shown in Table 10 below is obtained.
Table 10 (images BDA0003284857340000191 and BDA0003284857340000201 in the original publication)
Analysis of these data gives the following results. The top four combination regimens are shown above (only three tumor types, renal cell carcinoma, hepatocellular carcinoma and triple-negative breast cancer, are shown). "Nivolumab + ipilimumab" is still the most popular combination regimen at present and is relatively mature, so the likelihood of research success is greatly improved whether the user combines an existing PD-1/PD-L1 inhibitor and CTLA4 inhibitor from its own pipeline in clinical trials, or uses a new technology such as bispecific antibodies to develop a new drug targeting both targets. In addition, combining a PD-1/PD-L1 inhibitor with a "VEGFR"-targeting drug is also a frequently used regimen in current clinical trials. In particular, the combination of atezolizumab and bevacizumab has been approved in hepatocellular carcinoma, and there are currently many "PD-1/PD-L1" + "VEGFR" combinations in hepatocellular carcinoma, such as camrelizumab + apatinib, pembrolizumab + lenvatinib, and nivolumab + cabozantinib.
Furthermore, the data can be further analyzed by the dimensions of drugs, targets, indications and the like.
Dimension 1, drugs
For the target drug selected by the user, all combination regimens containing the target drug are aggregated by disease, e.g.,
(Table image BDA0003284857340000202 in the original publication)
Data analysis: there are 64 combination treatment regimens for cemiplimab, concentrated mainly in non-small cell lung cancer, head and neck squamous cell carcinoma, and melanoma. In non-small cell lung cancer, the drug combined with cemiplimab is mainly ipilimumab, and this combination has progressed to a phase III clinical trial; in head and neck squamous cell carcinoma, cemiplimab is mainly combined with chemotherapy, and this combination has progressed to a phase II clinical trial. Therefore, a drug with a target similar to that of cemiplimab that is just entering clinical trials can refer to these combination modes in non-small cell lung cancer and head and neck squamous cell carcinoma. At the same time, it should be noted that because such a drug is developed later than cemiplimab, cemiplimab will have pre-empted the corresponding market share.
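The drug-dimension aggregation can be sketched as a count of partner regimens per disease for the selected target drug. The function name `partners_by_disease`, the record fields, and the placeholder trials are assumptions for illustration and do not reproduce the data in the table above.

```python
from collections import Counter

def partners_by_disease(info_set, target_drug):
    """For one target drug, count its combination partners separately per disease."""
    counts = {}
    for record in info_set:
        if target_drug not in record["drugs"]:
            continue  # only regimens that contain the target drug are aggregated
        partners = " + ".join(sorted(record["drugs"] - {target_drug}))
        for disease in record["indications"]:
            counts.setdefault(disease, Counter())[partners] += 1
    return counts

# Placeholder trials; drug and disease names are not real data.
TRIALS = [
    {"drugs": {"drug_x", "drug_a"}, "indications": {"disease_1"}},
    {"drugs": {"drug_x", "drug_a"}, "indications": {"disease_1"}},
    {"drugs": {"drug_x", "drug_b"}, "indications": {"disease_2"}},
    {"drugs": {"drug_a", "drug_b"}, "indications": {"disease_1"}},
]

summary = partners_by_disease(TRIALS, "drug_x")
print(summary["disease_1"].most_common(1))
```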
Dimension 2, target
For the target selected by the user, clinical trials are aggregated by the target of the combined drug, e.g.,
(Table images BDA0003284857340000211 and BDA0003284857340000221 in the original publication)
It should be noted that in the above table, rows separated by dashed inner borders each represent a combination therapy trial for the target shown in the solid-bordered cell above them.
On this basis, a clinical trial start time dimension can be added; the start and end dates corresponding to the registration number of each clinical trial can be obtained from the clinical trial text. For example, the targets most commonly combined with PD-1 are VEGFR, CTLA4, RET, etc. Among the regimens combined with VEGFR, several, such as axitinib + pembrolizumab, have been approved by the FDA, most of the study results are positive, and 2018-. Among the regimens combined with CTLA4, the PD-1/PD-L1 drug is mainly nivolumab; the ipilimumab + nivolumab regimen has been approved by the FDA, and judging from trial start dates, 2017 and 2019 were the periods when this target combination was most popular, and its popularity has since declined.
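Adding the start-time dimension amounts to counting, for the selected target, how many trials with each partner target started in each year. The following sketch assumes each record carries a `targets` set and a `start_date`; the function name `partner_heat_by_year` and the dates are placeholders, not values from the patent.

```python
from collections import Counter
from datetime import date

def partner_heat_by_year(info_set, target):
    """Count trials per (partner target, start year) for the selected target."""
    heat = Counter()
    for record in info_set:
        if target not in record["targets"]:
            continue
        for partner in record["targets"] - {target}:
            heat[(partner, record["start_date"].year)] += 1
    return heat

# Placeholder trials with hypothetical start dates.
TRIALS = [
    {"targets": {"PD-1", "VEGFR"}, "start_date": date(2018, 5, 1)},
    {"targets": {"PD-1", "VEGFR"}, "start_date": date(2018, 9, 1)},
    {"targets": {"PD-1", "CTLA4"}, "start_date": date(2017, 1, 1)},
    {"targets": {"VEGFR"}, "start_date": date(2018, 1, 1)},
]

heat = partner_heat_by_year(TRIALS, "PD-1")
for (partner, year), n in sorted(heat.items()):
    print(partner, year, n)
```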
Dimension 3, disease
For the target disease selected by the user, clinical trials are aggregated by combination treatment mode, e.g.,
(Table images BDA0003284857340000222 and BDA0003284857340000231 in the original publication)
Data analysis: the most popular mode in the liver cancer field is still immunotherapy + targeted therapy. The immunotherapy drugs are mostly camrelizumab, pembrolizumab, certilizumab and cetirizumab, and the drugs combined with immunotherapy are mostly small-molecule drugs such as lenvatinib, regorafenib, cabozantinib and apatinib. Most of the results are positive, and atezolizumab + bevacizumab has been approved.
In the following, a description will be given of a combination therapy information mining device according to the present invention, and the combination therapy information mining device described below and the combination therapy information mining method described above may be referred to in correspondence with each other. Fig. 6 is a schematic structural diagram of a combination therapy information mining device provided by the present invention, and as shown in fig. 6, the device includes:
a text acquisition unit 610 for acquiring test texts for each combination therapy test;
an information obtaining unit 620 configured to obtain clinical treatment information of each combination therapy test based on the test text of each combination therapy test;
a curative effect evaluation unit 630, configured to obtain clinical evaluation information and/or clinical result information of each combination therapy trial based on the trial text of each combination therapy trial;
a set constructing unit 640, configured to construct a combination therapy information set based on the clinical registration information of each combination therapy trial, and the clinical evaluation information and/or the clinical result information.
According to the device provided by the embodiment of the invention, the test texts of the combination therapy tests are subjected to text analysis to obtain the clinical treatment information, the clinical evaluation information and/or the clinical result information of the combination therapy tests, and the combination therapy information set is constructed on the basis of the obtained information, so that the realization efficiency of the information mining of the combination therapy is effectively improved and the cost of the information mining of the combination therapy is reduced while the comprehensive and reliable information mining of the combination therapy is realized.
Based on the above embodiment, the text obtaining unit 610 is further configured to:
extracting a treatment scheme text from the clinical trial text;
acquiring a drug entity contained in the treatment plan text;
screening the test texts of each combination therapy trial based on the number of drug entities contained in the treatment scheme text.
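The screening by number of drug entities can be illustrated with a dictionary-based lookup. This is a simplified stand-in: the source does not say how drug entities are recognized, so the dictionary matching, the threshold of two drugs, the names `extract_drug_entities` and `is_combination_trial`, and the sample sentences are all assumptions.

```python
import re

# Hypothetical drug dictionary; a real system would use a far larger vocabulary
# or a trained entity recognizer.
DRUG_DICTIONARY = {"nivolumab", "ipilimumab", "bevacizumab"}

def extract_drug_entities(treatment_text, dictionary):
    """Return the distinct dictionary drugs mentioned in a treatment scheme text."""
    tokens = re.findall(r"[a-z0-9\-]+", treatment_text.lower())
    return {token for token in tokens if token in dictionary}

def is_combination_trial(treatment_text, dictionary, min_drugs=2):
    """Assumed rule: a trial counts as combination therapy when its treatment
    scheme text names at least `min_drugs` distinct drug entities."""
    return len(extract_drug_entities(treatment_text, dictionary)) >= min_drugs

print(is_combination_trial("Nivolumab plus ipilimumab every 3 weeks", DRUG_DICTIONARY))
print(is_combination_trial("Nivolumab monotherapy, nivolumab 240 mg", DRUG_DICTIONARY))
```

Counting distinct entities rather than mentions keeps a monotherapy text that repeats one drug name from being mistaken for a combination.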
Based on the above embodiment, the information obtaining unit 620 is configured to perform at least one of the following steps:
selecting the text indications of the combination therapy tests from the test texts of the combination therapy tests, and standardizing the text indications based on an indication dictionary to obtain the indications of the combination therapy tests;
determining the associated target point of each combination therapy test based on the drug name contained in the test text of each combination therapy test and the relationship between the drug name and the target point established in advance;
determining an associated treatment mode of each combination therapy experiment based on the medicine name contained in the test text of each combination therapy experiment and the relationship between the medicine name and the treatment mode which are established in advance;
selecting and cleaning clinical stages of the combination therapy trials from the trial texts of the combination therapy trials;
generating the treatment type of each combination therapy trial based on the clinical trial title and inclusion criteria contained in the trial text of each combination therapy trial.
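Two of the steps listed above, standardizing indications against an indication dictionary and mapping drug names to targets through a pre-established relationship, can be sketched as simple lookups. The dictionary contents and the function names `normalize_indication` and `associated_targets` are illustrative assumptions; the patent does not disclose the actual dictionaries.

```python
# Hypothetical indication dictionary: raw text variants -> standard term.
INDICATION_DICTIONARY = {
    "hcc": "hepatocellular carcinoma",
    "liver cell carcinoma": "hepatocellular carcinoma",
    "hepatocellular carcinoma": "hepatocellular carcinoma",
}

# Hypothetical pre-established drug name -> target relationship.
DRUG_TO_TARGET = {"nivolumab": "PD-1", "ipilimumab": "CTLA4"}

def normalize_indication(text_indication, dictionary):
    """Map a raw text indication to its standard term; keep it unchanged if unknown."""
    cleaned = text_indication.strip()
    return dictionary.get(cleaned.lower(), cleaned)

def associated_targets(drug_names, drug_to_target):
    """Collect the targets of all drugs in a trial that appear in the mapping."""
    return sorted({drug_to_target[name] for name in drug_names if name in drug_to_target})

print(normalize_indication(" HCC ", INDICATION_DICTIONARY))
print(associated_targets(["nivolumab", "ipilimumab", "unknown_drug"], DRUG_TO_TARGET))
```

The associated treatment mode could be derived the same way from a drug name to treatment mode mapping.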
Based on any of the above embodiments, the curative effect evaluation unit 630 is further configured to:
acquiring the curative effect index of each combination therapy test based on the test text of each combination therapy test, wherein the curative effect index comprises at least one of a control group, a primary endpoint, a secondary endpoint, an adverse reaction, a preset target and an author self-evaluation;
constructing clinical outcome information for each combination therapy trial based on at least one of the primary endpoint, the secondary endpoint, and the adverse reaction;
and/or evaluating the curative effect of each combination therapy experiment based on the curative effect index of each combination therapy experiment to obtain the clinical evaluation information of each combination therapy experiment.
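The two outputs of the curative effect evaluation unit can be sketched as follows. The clinical result information is built from whichever of the primary endpoint, secondary endpoint and adverse reaction are present, as the text states. The evaluation rule shown, comparing a numeric primary endpoint value with the preset target, is purely an assumed toy rule, since this passage does not specify how the efficacy is evaluated; all field names are hypothetical.

```python
def build_outcome(indices):
    """Build clinical result information from the available endpoint fields."""
    keys = ("primary_endpoint", "secondary_endpoint", "adverse_reaction")
    return {key: indices[key] for key in keys if key in indices}

def evaluate(indices):
    """Toy rule (assumption): 'positive' when the primary endpoint value
    reaches the preset target, 'negative' when it does not, else 'unknown'."""
    value = indices.get("primary_endpoint_value")
    target = indices.get("preset_target")
    if value is None or target is None:
        return "unknown"
    return "positive" if value >= target else "negative"

# Placeholder efficacy indices; the numbers are invented for the example.
indices = {
    "primary_endpoint": "objective response rate",
    "primary_endpoint_value": 0.35,
    "preset_target": 0.30,
    "adverse_reaction": "grade 3 events reported",
}
print(build_outcome(indices))
print(evaluate(indices))
```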
Based on any of the above embodiments, the set constructing unit 640 is further configured to:
obtaining drug approval information in each combination therapy test based on drug data on the market of each country;
constructing a combination therapy information set based on the clinical treatment information, the clinical evaluation information and/or clinical result information, and the drug approval information of each combination therapy trial.
In the following, a combination therapy information query device according to the present invention is described, and the combination therapy information query device described below and the combination therapy information query method described above may be referred to in correspondence with each other. Fig. 7 is a schematic structural diagram of a combination therapy information query device provided by the present invention, and as shown in fig. 7, the device includes:
a search term obtaining unit 710, configured to obtain a target search term input by a user;
an information screening unit 720, configured to screen clinical registration information of the combination therapy trial corresponding to the target search term, and clinical evaluation information and/or clinical result information from a combination therapy information set, where the combination therapy information set is determined based on the combination therapy information mining method as described above.
The device provided by the embodiment of the invention helps a user to screen out the combined treatment direction which can be explored by a target product in a clinical test stage through single or combined screening strategies of multiple dimensions such as indications, targets, medicines, treatment modes and the like, so that the failure risk is avoided to a great extent, and the clinical test cost is reduced.
Fig. 8 illustrates a physical structure diagram of an electronic device, and as shown in fig. 8, the electronic device may include: a processor (processor)810, a communication Interface 820, a memory 830 and a communication bus 840, wherein the processor 810, the communication Interface 820 and the memory 830 communicate with each other via the communication bus 840. Processor 810 may invoke logic instructions in memory 830 to perform a combination therapy information mining method or a combination therapy information query method. The information mining method of the combination therapy comprises the following steps:
acquiring test texts of all combination therapy tests;
acquiring clinical treatment information of each combination therapy test based on the test text of each combination therapy test;
acquiring clinical evaluation information and/or clinical result information of each combination therapy test based on the test text of each combination therapy test;
a set of combination therapy information is constructed based on clinical treatment information, as well as clinical assessment information and/or clinical outcome information for each combination therapy trial.
The combined therapy information query method comprises the following steps: acquiring a target search term input by a user;
and screening clinical treatment information, clinical evaluation information and/or clinical result information of the combined therapy test corresponding to the target search term from a combined therapy information set, wherein the combined therapy information set is determined based on the combined therapy information mining method.
In addition, the logic instructions in the memory 830 may be implemented in software functional units and stored in a computer readable storage medium when the logic instructions are sold or used as independent products. Based on such understanding, the technical solution of the present invention or a part thereof which substantially contributes to the prior art may be embodied in the form of a software product, which is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes.
In another aspect, the present invention also provides a computer program product, which includes a computer program, which can be stored on a non-transitory computer-readable storage medium, and when the computer program is executed by a processor, the computer can execute the combination therapy information mining method or the combination therapy information query method provided by the above methods. The information mining method of the combination therapy comprises the following steps:
acquiring test texts of all combined therapy tests;
acquiring clinical treatment information of each combination therapy test based on the test text of each combination therapy test;
acquiring clinical evaluation information and/or clinical result information of each combination therapy test based on the test text of each combination therapy test;
a set of combination therapy information is constructed based on clinical treatment information, as well as clinical assessment information and/or clinical outcome information for each combination therapy trial.
The combined therapy information query method comprises the following steps: acquiring a target search term input by a user;
and screening clinical treatment information, clinical evaluation information and/or clinical result information of the combined therapy test corresponding to the target search term from a combined therapy information set, wherein the combined therapy information set is determined based on the combined therapy information mining method.
In yet another aspect, the present invention also provides a non-transitory computer-readable storage medium on which a computer program is stored, the computer program, when executed by a processor, performing the combination therapy information mining method or the combination therapy information query method provided above. The information mining method of the combination therapy comprises the following steps: acquiring test texts of all combined therapy tests;
acquiring clinical treatment information of each combination therapy test based on the test text of each combination therapy test;
acquiring clinical evaluation information and/or clinical result information of each combination therapy test based on the test text of each combination therapy test;
a set of combination therapy information is constructed based on clinical treatment information, as well as clinical assessment information and/or clinical outcome information for each combination therapy trial.
The combined therapy information query method comprises the following steps: acquiring a target search term input by a user;
and screening clinical treatment information, clinical evaluation information and/or clinical result information of the combined therapy test corresponding to the target search term from a combined therapy information set, wherein the combined therapy information set is determined based on the combined therapy information mining method.
The above-described embodiments of the apparatus are merely illustrative, and the units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one position, or may be distributed on multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.
Through the above description of the embodiments, those skilled in the art will clearly understand that each embodiment can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware. With this understanding in mind, the above-described technical solutions may be embodied in the form of a software product, which can be stored in a computer-readable storage medium such as ROM/RAM, magnetic disk, optical disk, etc., and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the methods described in the embodiments or some parts of the embodiments.
Finally, it should be noted that: the above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims (6)

1. A combination therapy information mining method, comprising:
acquiring test texts of all combined therapy tests;
acquiring clinical treatment information of each combination therapy test based on the test text of each combination therapy test;
acquiring clinical evaluation information and clinical result information of each combination therapy experiment based on a test text of each combination therapy experiment, wherein the clinical evaluation information is obtained by evaluating the curative effect of each combination therapy experiment based on the curative effect index of each combination therapy experiment, the curative effect index comprises at least one of a control group, a primary endpoint, a secondary endpoint, an adverse reaction, a preset target and an author self-evaluation, and the clinical result information is constructed based on at least one of the primary endpoint, the secondary endpoint and the adverse reaction;
constructing a combination therapy information set based on clinical treatment information of each combination therapy experiment, and the clinical evaluation information and the clinical result information;
the obtaining of test texts for each combination therapy trial comprises:
extracting a treatment scheme text from the clinical trial text;
acquiring a drug entity contained in the treatment plan text;
screening test texts of each combination therapy test based on the number of drug entities contained in the treatment plan text;
the clinical treatment information comprises at least one of an indication, an associated target, an associated treatment mode, a clinical stage, and a treatment type;
the method for acquiring the clinical treatment information of each combination therapy test based on the test text of each combination therapy test comprises at least one of the following steps:
selecting text indications of the combination therapy tests from test texts of the combination therapy tests, and standardizing the text indications based on an indication dictionary to obtain the indications of the combination therapy tests;
determining the associated target point of each combination therapy test based on the drug name contained in the test text of each combination therapy test and the relationship between the drug name and the target point established in advance;
determining an associated treatment mode of each combination therapy experiment based on the drug name contained in the experiment text of each combination therapy experiment and a pre-established relationship between the drug name and the treatment mode;
selecting and cleaning clinical stages of the combination therapy trials from the trial texts of the combination therapy trials;
generating the treatment type of each combination therapy trial based on the clinical trial title and inclusion criteria contained in the trial text of each combination therapy trial.
2. The combination therapy information mining method according to claim 1, wherein constructing a combination therapy information set based on clinical treatment information of each combination therapy trial, and the clinical assessment information and clinical outcome information comprises:
obtaining drug approval information in each combination therapy test based on drug data on the market of each country;
constructing a set of combination therapy information based on clinical treatment information, the clinical assessment information and/or clinical outcome information, and the drug approval information for each combination therapy trial.
3. A method for querying combination therapy information, comprising:
acquiring a target search term input by a user;
screening clinical treatment information of a combination therapy trial corresponding to the target search term, and clinical evaluation information and/or clinical result information from a combination therapy information set determined based on the combination therapy information mining method according to claim 1 or 2.
4. A combination therapy information mining device, comprising:
the text acquisition unit is used for acquiring test texts of all the combination therapy tests;
an information acquisition unit for acquiring clinical treatment information of each combination therapy trial based on the trial text of each combination therapy trial;
the curative effect evaluation unit is used for obtaining clinical evaluation information and clinical result information of each combined therapy experiment based on the test text of each combined therapy experiment, the clinical evaluation information is obtained by evaluating the curative effect of each combined therapy experiment based on the curative effect index of each combined therapy experiment, the curative effect index comprises at least one of a control group, a primary endpoint, a secondary endpoint, an adverse reaction, a preset target and an author self-evaluation, and the clinical result information is constructed based on at least one of the primary endpoint, the secondary endpoint and the adverse reaction;
the set construction unit is used for constructing a combined therapy information set based on clinical registration information of each combined therapy experiment, the clinical evaluation information and the clinical result information;
the text obtaining unit is further configured to:
extracting a treatment scheme text from the clinical trial text;
acquiring a drug entity contained in the treatment plan text;
screening test texts of each combination therapy test based on the number of drug entities contained in the treatment plan text;
the clinical treatment information comprises at least one of an indication, an associated target, an associated treatment mode, a clinical stage, and a treatment type;
the information obtaining unit is further configured to perform at least one of the following steps:
selecting text indications of the combination therapy tests from test texts of the combination therapy tests, and standardizing the text indications based on an indication dictionary to obtain the indications of the combination therapy tests;
determining the associated target point of each combination therapy test based on the drug name contained in the test text of each combination therapy test and the relationship between the drug name and the target point established in advance;
determining an associated treatment mode of each combination therapy experiment based on the drug name contained in the experiment text of each combination therapy experiment and a pre-established relationship between the drug name and the treatment mode;
selecting and cleaning clinical stages of the combination therapy trials from the trial texts of the combination therapy trials;
generating the treatment type of each combination therapy trial based on the clinical trial title and inclusion criteria contained in the trial text of each combination therapy trial.
5. An electronic device comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor when executing the program implements the steps of the combination therapy information mining method of claim 1 or 2 or the steps of the combination therapy information query method of claim 3.
6. A non-transitory computer readable storage medium having stored thereon a computer program, which when executed by a processor, performs the steps of the combination therapy information mining method of claim 1 or 2 or the steps of the combination therapy information query method of claim 3.
CN202111143489.4A 2021-09-28 2021-09-28 Combination therapy information mining and inquiring method, device and electronic equipment Active CN113889279B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111143489.4A CN113889279B (en) 2021-09-28 2021-09-28 Combination therapy information mining and inquiring method, device and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111143489.4A CN113889279B (en) 2021-09-28 2021-09-28 Combination therapy information mining and inquiring method, device and electronic equipment

Publications (2)

Publication Number Publication Date
CN113889279A CN113889279A (en) 2022-01-04
CN113889279B true CN113889279B (en) 2022-08-05

Family

ID=79007506

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111143489.4A Active CN113889279B (en) 2021-09-28 2021-09-28 Combination therapy information mining and inquiring method, device and electronic equipment

Country Status (1)

Country Link
CN (1) CN113889279B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114822859B (en) * 2022-03-31 2023-11-03 数魔方(北京)医药科技有限公司 Treatment thread mining and searching method and device

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109830302A (en) * 2019-01-28 2019-05-31 北京交通大学 Medication mode excavation method, apparatus and electronic equipment
CN110364266A (en) * 2019-06-28 2019-10-22 深圳裕策生物科技有限公司 For instructing the database and its construction method and device of clinical tumor personalized medicine
CN111223543A (en) * 2020-02-13 2020-06-02 曹庆恒 Method, system and equipment for intelligently guiding treatment scheme
CN112489812A (en) * 2020-11-30 2021-03-12 北京华彬立成科技有限公司 Drug development analysis method, drug development analysis device, electronic device, and storage medium

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2013006783A1 (en) * 2011-07-07 2013-01-10 Georgetown University System and method for performing pharmacovigilance
CN105144179B (en) * 2013-01-29 2019-05-17 分子健康股份有限公司 System and method for clinical decision support
US20170372018A1 (en) * 2016-06-28 2017-12-28 Melrose Pain Solutions LLC Melrose Pain Solutions® Method and Algorithm: Managing Pain in Opioid Dependent Patients
CN109545284A (en) * 2018-10-16 2019-03-29 中国人民解放军军事科学院军事医学研究院 Drug integrated information database building method and system based on drug and target information
CN109830303A (en) * 2019-02-01 2019-05-31 上海众恒信息产业股份有限公司 Clinical data mining analysis and aid decision-making method based on internet integration medical platform

Similar Documents

Publication Publication Date Title
US11714839B2 (en) Apparatus and method for automated and assisted patent claim mapping and expense planning
Liu et al. DrugCombDB: a comprehensive database of drug combinations toward the discovery of combinatorial therapy
Cheung et al. Current trends in flow cytometry automated data analysis software
Tari et al. Systematic drug repurposing through text mining
Wu et al. Ranking gene-drug relationships in biomedical literature using latent Dirichlet allocation
US10878010B2 (en) System and method for clinical trial candidate matching
Droste et al. Information on ethical issues in health technology assessment: how and where to find them
Chen et al. Spreadsheet property detection with rule-assisted active learning
Lever et al. Text-mining clinically relevant cancer biomarkers for curation into the CIViC database
CN109684468B (en) Document screening and labeling system aiming at evidence-based medicine
CN113539515A (en) Clinical demand mining method and device, electronic equipment and storage medium
Rinaldi et al. Relation mining experiments in the pharmacogenomics domain
French et al. Text mining for neuroanatomy using WhiteText with an updated corpus and a new web application
CN113889279B (en) Combination therapy information mining and inquiring method, device and electronic equipment
CN113674867A (en) Clinical data mining method and device, electronic equipment and storage medium
Sanchez-Graillet et al. An annotated corpus of clinical trial publications supporting schema-based relational information extraction
CN112466463A (en) Intelligent answering system based on tumor accurate diagnosis and treatment knowledge graph
Chang et al. Understanding common key indicators of successful and unsuccessful cancer drug trials using a contrast mining framework on ClinicalTrials.gov
CN114121293A (en) Clinical trial information mining and inquiring method and device
Duan et al. The top 100 most-cited papers in pheochromocytomas and paragangliomas: A bibliometric study
San Torcuato et al. Tracking Openness and Topic Evolution of COVID-19 Publications January 2020-March 2021: Comprehensive Bibliometric and Topic Modeling Analysis
Zeng et al. Adapting a natural language processing tool to facilitate clinical trial curation for personalized cancer therapy
CN109086570B (en) Multi-database sequential interaction method and device
Samuel et al. Mining online full-text literature for novel protein interaction discovery
Hou et al. Mining and standardizing Chinese consumer health terms

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant