CN119889552B - Medical examination project reference interval suitability evaluation system - Google Patents

Medical examination project reference interval suitability evaluation system

Info

Publication number
CN119889552B
Authority
CN
China
Prior art keywords
reference interval
data
test
test item
abnormal rate
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202510348983.6A
Other languages
Chinese (zh)
Other versions
CN119889552A (en)
Inventor
杨大干
赵敏
胡长爱
范利娜
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
First Hospital of China Medical University
First Affiliated Hospital of Zhejiang University School of Medicine
Original Assignee
First Hospital of China Medical University
First Affiliated Hospital of Zhejiang University School of Medicine
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by First Hospital of China Medical University, First Affiliated Hospital of Zhejiang University School of Medicine
Priority to CN202510348983.6A
Publication of CN119889552A
Application granted
Publication of CN119889552B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H10/00 ICT specially adapted for the handling or processing of patient-related medical or healthcare data
    • G16H10/40 ICT specially adapted for the handling or processing of patient-related medical or healthcare data for data related to laboratory analysis, e.g. patient specimen analysis
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/243 Classification techniques relating to the number of classes
    • G06F18/2433 Single-class perspective, e.g. one-against-all classification; Novelty detection; Outlier detection
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H10/00 ICT specially adapted for the handling or processing of patient-related medical or healthcare data
    • G16H10/60 ICT specially adapted for the handling or processing of patient-related medical or healthcare data for patient-specific data, e.g. for electronic patient records
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/30 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for calculating health indices; for individual health risk assessment
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/70 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for mining of medical data, e.g. analysing previous cases of other patients

Landscapes

  • Engineering & Computer Science (AREA)
  • Public Health (AREA)
  • Medical Informatics (AREA)
  • Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Primary Health Care (AREA)
  • General Health & Medical Sciences (AREA)
  • Epidemiology (AREA)
  • Theoretical Computer Science (AREA)
  • Pathology (AREA)
  • Databases & Information Systems (AREA)
  • Biomedical Technology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Evolutionary Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Medical Treatment And Welfare Office Work (AREA)

Abstract

The application discloses a medical test item reference interval suitability evaluation system. Using a preset reference interval and a reference interval reconstructed from a collected set of physical examination data, the system evaluates the abnormal rates relative to each reference interval and the abnormal grade distribution of test results. It also evaluates the consistency between the preset reference interval and the reference interval reported in the literature for the test item, and from these generates a suitability evaluation result for the preset reference interval of each test item. The system can therefore evaluate the suitability of preset reference intervals, which is of great significance for value-based healthcare. Because the data it extracts are real-world test data, no additional subjects need to be recruited for testing, which greatly reduces the cost of evaluating reference interval suitability.

Description

Medical examination project reference interval suitability evaluation system
Technical Field
The application relates to the field of reference interval evaluation, and in particular to a medical test item reference interval suitability evaluation system.
Background
The reference intervals of the test items used in a medical laboratory are derived mainly from reagent instructions, health industry standards, authoritative literature, textbooks, and the like. Because of differences in socioeconomic level, regional living environment, dietary habits, and so on, the existing reference interval of an individual test item may not suit the population served by the laboratory. A reference interval that is too wide delays treatment for some patients, while one that is too narrow leads to excessive examination and wastes medical resources. Periodically evaluating whether a reference interval is suitable is therefore of great significance for value-based healthcare.
At present, the suitability of a reference interval is evaluated as follows: at least 20 or 60 qualified reference individuals are screened and tested according to a standard operating procedure to obtain preliminary test results; outliers in the preliminary results are identified and removed; the preset reference interval is considered verified when at least 90% of the reference individuals' remaining results fall within it; and a preset reference interval that passes verification is considered suitable.
However, the verification procedure described above is costly and cumbersome, relies on a small amount of data, and is prone to bias in the selection of reference individuals. It incorporates neither real-world big data from testing nor the latest research literature, so its results are of limited reliability. A new scheme that evaluates reference interval suitability quickly and accurately is urgently needed.
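The conventional verification rule described in this background (at least 90% of the reference individuals' results inside the preset interval) can be illustrated with a minimal sketch. The function name, the inclusive treatment of the interval limits, and the sample values are assumptions made for illustration only; they are not taken from the application.

```python
def verify_reference_interval(results, lower, upper, min_fraction=0.90):
    """Return True when at least min_fraction of results lie in [lower, upper].

    Assumption: the interval limits themselves count as "inside".
    Outlier removal is assumed to have been done before this call.
    """
    if not results:
        raise ValueError("no results to verify")
    inside = sum(1 for x in results if lower <= x <= upper)
    return inside / len(results) >= min_fraction


# 20 invented platelet counts (10^9/L), used only to exercise the rule.
sample = [150, 180, 210, 240, 260, 275, 290, 300, 310, 320,
          190, 205, 230, 255, 265, 285, 295, 330, 100, 400]

# 18 of the 20 values fall inside the illustrative interval 125 to 350.
passed = verify_reference_interval(sample, 125, 350)
```

With 18 of 20 values inside the interval the rule passes exactly at the 90% boundary, which shows how sensitive such a small sample is to one additional result outside the interval.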
Disclosure of Invention
An advantage of the present application is to provide a medical test item reference interval suitability evaluation system that can evaluate the suitability of a preset reference interval, which is of great significance for value-based healthcare.
Another advantage of the present application is to provide a medical test item reference interval suitability evaluation system whose data come from real-world test data, so that no additional subjects need to be recruited for testing and the cost of evaluating reference interval suitability can be greatly reduced.
Another advantage of the present application is to provide a medical test item reference interval suitability evaluation system that can further evaluate the suitability of reference intervals reported in the literature.
According to one aspect of the present application, there is provided a medical test item reference interval suitability evaluation system, comprising:
a preset reference interval extraction module for extracting a preset reference interval of at least one test item from a test item reference interval database;
a physical examination data acquisition and analysis module for acquiring a set of physical examination data and determining a reconstructed reference interval of the test item using the EP28 nonparametric method;
a test data acquisition module for extracting a test data set of the test item for a predetermined time period from a laboratory information system;
a data grouping module for segmenting the test data set of the test item by three visit types, namely physical examination, outpatient, and inpatient, to obtain a first test data subset, a second test data subset, and a third test data subset;
a reference interval abnormal rate evaluation module for calculating the upper limit abnormal rate and the lower limit abnormal rate of each of the first, second, and third test data subsets relative to the preset reference interval and the reconstructed reference interval, to obtain a reference interval abnormal rate evaluation result; and
a suitability management module for generating a suitability evaluation result of the test item based on the reference interval abnormal rate evaluation result.
In the medical test item reference interval suitability evaluation system according to the present application, the reference interval abnormal rate evaluation module includes:
a preset reference interval abnormal rate evaluation unit for calculating the upper limit abnormal rate and the lower limit abnormal rate of each of the first, second, and third test data subsets relative to the preset reference interval, to obtain a preset reference interval abnormal rate evaluation result; and
a reconstructed reference interval abnormal rate evaluation unit for calculating the upper limit abnormal rate and the lower limit abnormal rate of each of the first, second, and third test data subsets relative to the reconstructed reference interval, to obtain a reconstructed reference interval abnormal rate evaluation result.
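The grouping and abnormal rate steps above can be sketched as follows. This is a minimal illustration under stated assumptions: the upper limit abnormal rate is taken to be the share of results strictly above the upper limit and the lower limit abnormal rate the share strictly below the lower limit; the record layout, visit-type labels, function names, and all numbers are hypothetical.

```python
from collections import defaultdict


def abnormal_rates(values, lower, upper):
    """Return (upper-limit abnormal rate, lower-limit abnormal rate) as fractions."""
    n = len(values)
    if n == 0:
        return (0.0, 0.0)
    above = sum(1 for v in values if v > upper)
    below = sum(1 for v in values if v < lower)
    return (above / n, below / n)


def evaluate(records, intervals):
    """Split records by visit type, then rate each subset against each interval.

    records:   iterable of (visit_type, value) pairs (hypothetical layout)
    intervals: mapping of interval name -> (lower, upper)
    """
    subsets = defaultdict(list)
    for visit_type, value in records:
        subsets[visit_type].append(value)
    return {
        (visit_type, name): abnormal_rates(values, lo, hi)
        for visit_type, values in subsets.items()
        for name, (lo, hi) in intervals.items()
    }


# Invented values, for illustration only.
records = [("physical_exam", 200), ("physical_exam", 120),
           ("outpatient", 360), ("outpatient", 250),
           ("inpatient", 90), ("inpatient", 410)]
intervals = {"preset": (125, 350), "reconstructed": (130, 340)}
result = evaluate(records, intervals)
```

Keying the result by (visit type, interval name) keeps the six combinations side by side, which is convenient for the later comparison of preset and reconstructed rates.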
In the medical test item reference interval suitability evaluation system according to the present application, the suitability management module includes:
a consistency evaluation unit for performing a consistency evaluation analysis on the preset reference interval abnormal rate evaluation result and the reconstructed reference interval abnormal rate evaluation result to obtain a consistency evaluation result; and
a suitability evaluation result determining unit for generating a suitability evaluation result of the test item based on the consistency evaluation result, the preset reference interval abnormal rate evaluation result, and the reconstructed reference interval abnormal rate evaluation result.
In the medical test item reference interval suitability evaluation system according to the present application, the consistency evaluation unit is further configured to:
calculate the difference between the upper limit abnormal rate of the reconstructed reference interval and the upper limit abnormal rate of the preset reference interval to obtain a first difference degree, and calculate the difference between the lower limit abnormal rate of the reconstructed reference interval and the lower limit abnormal rate of the preset reference interval to obtain a second difference degree; and
generate the consistency evaluation result based on a comparison of the first difference degree and the second difference degree with a preset threshold.
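A minimal sketch of this comparison is given below. The use of absolute differences, the rule that both difference degrees must be at or below the threshold, and the threshold value of 0.02 are all assumptions made for illustration; the text above only states that the two difference degrees are compared with a preset threshold.

```python
def consistency(preset_rates, reconstructed_rates, threshold=0.02):
    """Compare abnormal rates of the preset and reconstructed intervals.

    Each *_rates argument is (upper_limit_rate, lower_limit_rate).
    Returns (first_difference, second_difference, consistent).
    Assumptions: absolute differences; both must not exceed the threshold.
    """
    first = abs(reconstructed_rates[0] - preset_rates[0])
    second = abs(reconstructed_rates[1] - preset_rates[1])
    return first, second, (first <= threshold and second <= threshold)


# Invented rates: the lower-limit rates differ by 4 percentage points.
_, _, ok = consistency((0.03, 0.02), (0.025, 0.06))
```

In this example the upper limit rates agree closely, but the lower limit rates differ by more than the assumed threshold, so the pair is flagged as inconsistent.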
In the medical test item reference interval suitability evaluation system according to the present application, the consistency evaluation unit is further configured to calculate, for any two of the first, second, and third test data subsets, the ratios between their upper limit abnormal rates and between their lower limit abnormal rates relative to the reference interval, to obtain a plurality of abnormal rate ratios, and to generate the consistency evaluation result based on a comparison of each of the abnormal rate ratios with a preset threshold.
In the medical test item reference interval suitability evaluation system according to the present application, the system further comprises an abnormal grade criterion determination module for determining abnormal grade criteria based on the set of physical examination data, and an abnormal grade calculation module for analyzing the second test data subset and the third test data subset against the abnormal grade criteria to obtain the abnormal grade distribution of outpatient test results and the abnormal grade distribution of inpatient test results.
In the medical test item reference interval suitability evaluation system according to the present application, the abnormal grade criterion determination module is further configured to:
determine the upper and lower limits of the normal grade based on a two-sided grading condition of greater than 2.5% and less than 97.5% of the distribution of the set of physical examination data, or a one-sided grading condition of less than 95% of that distribution;
determine the upper and lower limits of the mild grade based on a two-sided grading condition of greater than 1.5% and less than or equal to 2.5%, or greater than 97.5% and less than 98.5%, of the distribution of the set of physical examination data, or a one-sided grading condition of greater than or equal to 95% and less than 97% of that distribution;
determine the upper and lower limits of the moderate grade based on a two-sided grading condition of greater than or equal to 1.0% and less than or equal to 1.5%, or greater than or equal to 98.5% and less than 99.0%, of the distribution of the set of physical examination data, or a one-sided grading condition of greater than or equal to 97.0% and less than 98.0% of that distribution;
determine the upper and lower limits of the severe grade based on a two-sided grading condition of greater than or equal to 0.5% and less than or equal to 1.0%, or greater than or equal to 99.0% and less than 99.5%, of the distribution of the set of physical examination data, or a one-sided grading condition of greater than or equal to 98.0% and less than 99.0% of that distribution; and
determine the upper and lower limits of the extreme grade based on a two-sided grading condition covering the tails beyond 0.5% and 99.5% of the distribution of the set of physical examination data, or a one-sided grading condition of greater than or equal to 99.0% of that distribution.
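The one-sided grading conditions listed above (below 95%, 95% to below 97%, 97% to below 98%, 98% to below 99%, and 99% or above) map directly onto a sorted list of cut points. The sketch below is illustrative only: the function names, the label "moderate" for the middle grade, and the way a percentile rank is derived from the physical examination data (share of reference values less than or equal to the result) are assumptions.

```python
from bisect import bisect_right

# One-sided cut points in percent, taken from the grading conditions above.
ONE_SIDED_CUTS = [95.0, 97.0, 98.0, 99.0]
GRADES = ["normal", "mild", "moderate", "severe", "extreme"]


def percentile_rank(value, sorted_reference):
    """Percent of reference values <= value (assumed definition of rank)."""
    return 100 * bisect_right(sorted_reference, value) / len(sorted_reference)


def one_sided_grade(rank_percent):
    """Map a percentile rank to a grade; each lower bound is inclusive."""
    return GRADES[bisect_right(ONE_SIDED_CUTS, rank_percent)]


# Hypothetical reference distribution: the integers 1..100.
reference = list(range(1, 101))
grade_50 = one_sided_grade(percentile_rank(50, reference))
grade_95 = one_sided_grade(percentile_rank(95, reference))
grade_100 = one_sided_grade(percentile_rank(100, reference))
```

Using `bisect_right` makes every lower bound inclusive and every upper bound exclusive, matching the "greater than or equal to ... and less than ..." wording of the one-sided conditions.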
In the medical test item reference interval suitability evaluation system according to the present application, the system further comprises a literature acquisition module for retrieving reference literature, a literature grade evaluation module for evaluating the quality of the reference literature to obtain a literature quality grade evaluation result, and a reference interval management module for using a large language model to extract from the reference literature, and to manage, data relating to the suggested reference interval of the test item.
In the medical test item reference interval suitability evaluation system according to the present application, the reference interval management module is further configured to:
calculate a bias ratio between the suggested reference interval of the test item and the reconstructed reference interval;
when the bias ratio is greater than a preset threshold, determine that the consistency between the suggested reference interval of the test item and the reconstructed reference interval of the test item does not meet the preset requirement; and
when the bias ratio is less than or equal to the preset threshold, determine that the consistency between the suggested reference interval of the test item and the reconstructed reference interval of the test item meets the preset requirement.
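The threshold decision above can be sketched as follows. The formula for the bias ratio is not given in this passage, so the definition used here (the larger of the relative deviations of the lower and upper limits from the reconstructed limits) is purely an assumption, as are the function names, the threshold of 0.10, and the example limits.

```python
def bias_ratio(suggested, reconstructed):
    """Largest relative deviation between matching limits.

    Assumed definition, not taken from the application. Both arguments are
    (lower, upper) pairs; the reconstructed limits must be non-zero.
    """
    return max(abs(s - r) / abs(r) for s, r in zip(suggested, reconstructed))


def meets_requirement(suggested, reconstructed, threshold=0.10):
    """True when the bias ratio is less than or equal to the threshold."""
    return bias_ratio(suggested, reconstructed) <= threshold


# Invented limits, for illustration only.
close = meets_requirement((125, 350), (130, 340))
far = meets_requirement((100, 300), (130, 340))
```

The comparison uses "less than or equal to" for a pass, which follows the two branches described above: strictly greater than the threshold fails, anything else passes.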
Further objects and advantages of the present application will become fully apparent from the following description and the accompanying drawings.
These and other objects, features and advantages of the present application will become more fully apparent from the following detailed description, the accompanying drawings and the appended claims.
Drawings
The above and other objects, features and advantages of the present application will become more apparent from the following more detailed description of embodiments of the present application with reference to the accompanying drawings. The accompanying drawings are included to provide a further understanding of the embodiments of the application, are incorporated in and constitute a part of this specification, and serve together with the embodiments to explain the application without limiting it. In the drawings, like reference numerals generally refer to like parts.
FIG. 1 is a block diagram of a medical test item reference interval suitability assessment system according to an embodiment of the present application.
FIG. 2 illustrates an interface schematic diagram of the EP28 nonparametric calculation of a medical test item reference interval suitability evaluation system according to an embodiment of the present application.
FIG. 3 illustrates an interface schematic of a preset reference interval of a medical test item reference interval suitability assessment system according to an embodiment of the present application.
FIG. 4 illustrates yet another interface schematic of a preset reference interval of a medical test item reference interval suitability assessment system according to an embodiment of the present application.
FIG. 5 illustrates a block diagram of the suitability management module in a medical test item reference interval suitability assessment system according to an embodiment of the present application.
FIG. 6 illustrates an interface diagram of anomaly rate level calculation for a medical test item reference interval suitability assessment system according to an embodiment of the present application.
FIG. 7 illustrates yet another interface schematic of anomaly rate level computation for a medical test item reference interval suitability assessment system in accordance with an embodiment of the present application.
FIG. 8 illustrates an interface diagram of an anomaly level criteria determination for a medical test item reference interval suitability assessment system according to an embodiment of the present application.
FIG. 9 illustrates yet another interface schematic of an anomaly level criterion determination for a medical test item reference interval suitability assessment system according to an embodiment of the present application.
FIG. 10 illustrates an interface schematic diagram of a reference interval literature list of a medical test item reference interval suitability assessment system according to an embodiment of the present application.
FIG. 11 illustrates yet another interface schematic diagram of reference interval literature content management for a medical test item reference interval suitability assessment system according to an embodiment of the present application.
FIG. 12 illustrates an interface schematic diagram of reference interval literature comparisons of a medical test item reference interval suitability assessment system according to an embodiment of the present application.
FIG. 13 illustrates yet another interface schematic of a reference interval document comparison of a medical test item reference interval suitability assessment system according to an embodiment of the present application.
Detailed Description
Hereinafter, exemplary embodiments according to the present application will be described in detail with reference to the accompanying drawings. It should be apparent that the described embodiments are only some embodiments of the present application and not all embodiments of the present application, and it should be understood that the present application is not limited by the example embodiments described herein.
It will be understood that the terms "a" and "an" should be interpreted as referring to "at least one" or "one or more," i.e., in one embodiment, the number of elements may be one, while in another embodiment, the number of elements may be plural, and the term "a" should not be interpreted as limiting the number. "plurality" means two or more.
Although ordinal numbers of "first," "second," etc., for example, will be used to describe various components, those components are not limited herein. The term is used merely to distinguish one component from another. For example, a first component may be referred to as a second component, and likewise, a second component may be referred to as a first component, without departing from the teachings of the present inventive concept. The term "and/or" as used herein includes any and all combinations of one or more of the associated listed items.
The terminology used herein is for the purpose of describing various embodiments only and is not intended to be limiting. As used herein, the singular is intended to include the plural unless the context clearly indicates otherwise. It will be further understood that the terms "comprises" and/or "comprising," when used in this specification, specify the presence of stated features, integers, operations, elements, or groups thereof, but do not preclude the presence or addition of one or more other features, integers, operations, elements, or groups thereof.
A medical test item reference interval suitability evaluation system according to an embodiment of the present application is described with reference to FIGS. 1 to 13. The medical test item reference interval suitability evaluation system 1000 can evaluate the suitability of a preset reference interval, which is of great significance for value-based healthcare. The data it extracts are real-world test data, so no additional subjects need to be recruited for testing, which greatly reduces the cost of evaluating reference interval suitability. The medical test item reference interval suitability evaluation system 1000 can also evaluate the suitability of reference intervals reported in the literature.
Specifically, as shown in FIG. 1, the medical test item reference interval suitability evaluation system 1000 comprises a preset reference interval extraction module 10, a physical examination data acquisition and analysis module 20, a test data acquisition module 30, a data grouping module 40, a reference interval abnormal rate evaluation module 50, and a suitability management module 60. The preset reference interval extraction module 10 extracts a preset reference interval of at least one test item from a test item reference interval database. The physical examination data acquisition and analysis module 20 acquires a set of physical examination data and determines a reconstructed reference interval of the test item using the EP28 nonparametric method. The test data acquisition module 30 extracts a test data set of the test item for a predetermined time period from a laboratory information system. The data grouping module 40 segments the test data set of the test item by three visit types, namely physical examination, outpatient, and inpatient, to obtain a first test data subset, a second test data subset, and a third test data subset. The reference interval abnormal rate evaluation module 50 calculates the upper limit abnormal rate and the lower limit abnormal rate of each of the first, second, and third test data subsets relative to the preset reference interval and the reconstructed reference interval, to obtain a reference interval abnormal rate evaluation result. The suitability management module 60 generates a suitability evaluation result of the test item based on the reference interval abnormal rate evaluation result.
Specifically, in operation, the preset reference interval extraction module 10 extracts a preset reference interval of at least one test item from the test item reference interval database. The preset reference interval of the at least one test item includes, but is not limited to, a preset reference interval of platelet count, a preset reference interval of total protein, a preset reference interval of albumin, a preset reference interval of glutamic pyruvic transaminase, and the like.
It is worth mentioning that the preset reference interval of each test item may also be divided into a plurality of groups. For example, the preset reference interval of the platelet count test item includes preset reference intervals of a first group, a second group, a third group, a fourth group, a fifth group, a sixth group, and further groups. In one specific example, each group of preset reference intervals corresponds to a group of test subjects, the test subjects being grouped by age and sex: the first group is the preset reference interval of the platelet count test item for females aged 6 months to <1 year, the second group for females aged 1 year to <2 years, the third group for females aged 2 years to <6 years, the fourth group for females aged 6 years to <12 years, the fifth group for females aged 12 years to <18 years, and the sixth group for females aged 18 years to <53 years, while the preset reference intervals of the other groups correspond to other groups of test subjects.
It is also worth mentioning that in other examples, the data in the test item reference interval database may be grouped by test object first and then by test item type.
In operation, the physical examination data acquisition and analysis module 20 first acquires a set of physical examination data and processes it using the EP28 nonparametric method to determine a reconstructed reference interval of at least one test item, as shown in FIG. 2. Besides the EP28 nonparametric method, other suitable methods for establishing reference intervals may be adopted, such as the EP28 parametric method, refineR, TMC, and kosmic. One of ordinary skill in the art will appreciate that the EP28 nonparametric method includes the exclusion of abnormal data. Specifically, abnormal physical examination data are excluded by deleting incomplete records; keeping only the most recent result when the same individual has multiple test results; deleting data with positive findings (chest X-ray, ultrasound, CT, etc.); and using the LAVE method to exclude data in which items such as WBC, Hb, MCV, MCHC, ALT, GGT, BUN, Cr, CEA, AFP, CA-9, etc. exceed their reference intervals. After out-of-range values are excluded using the Tukey method, the remaining set of physical examination data contains n test data. The n test data are arranged in ascending order and denoted $x_1, x_2, \ldots, x_n$, where $x_1$ is the minimum and $x_n$ the maximum of the test item data. When the n test data are divided into 100 equal parts, the value corresponding to the r% rank is the r-th percentile, denoted $P_r$. The ranks of the lower reference limit and the upper reference limit of the reconstructed reference interval are denoted $P_{2.5}$ and $P_{97.5}$, respectively, and are calculated according to the EP28 nonparametric method. If a calculated rank is not an integer, it may be rounded.
The confidence interval of the lower limit of the reconstructed reference interval is $P_1$ to $P_5$, and the confidence interval of the upper limit of the reconstructed reference interval is $P_{95}$ to $P_{99}$.
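The exclusion and rank steps described above can be sketched as follows. The rank equations are not legible in this text, so the sketch assumes the conventional nonparametric rank formulas, lower rank 0.025(n + 1) and upper rank 0.975(n + 1), rounded to the nearest integer. The Tukey fence multiplier of 1.5, the quartile method, and all names and data are likewise assumptions, not details taken from the application.

```python
import statistics


def tukey_filter(values, k=1.5):
    """Drop values outside [Q1 - k*IQR, Q3 + k*IQR] (k = 1.5 is assumed)."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    return [v for v in values if q1 - k * iqr <= v <= q3 + k * iqr]


def nonparametric_limits(values):
    """Return (lower, upper) reference limits from rank positions.

    Assumed ranks: 0.025*(n+1) and 0.975*(n+1), computed with integers to
    avoid floating-point drift. Python's round() rounds halves to the
    nearest even integer; ranks are clipped to the range 1..n.
    """
    xs = sorted(values)
    n = len(xs)
    r_low = max(1, min(n, round(25 * (n + 1) / 1000)))
    r_high = max(1, min(n, round(975 * (n + 1) / 1000)))
    return xs[r_low - 1], xs[r_high - 1]


# Invented data: 1..119 gives ranks 3 and 117 under the assumed formulas.
limits = nonparametric_limits(list(range(1, 120)))
# One gross outlier is removed by the Tukey fences.
filtered = tukey_filter([1, 2, 3, 4, 5, 6, 7, 8, 9, 1000])
```

In practice the filter would be applied before the limits are computed, after the record-level exclusions (incomplete records, repeat results, positive findings, LAVE) have already been carried out.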
Accordingly, the reconstructed reference interval of each test item of the physical examination class can be obtained by the EP28 nonparametric method, for example, the reconstructed reference intervals of the platelet count, total protein, albumin and glutamic pyruvic transaminase test items of the physical examination class, and so on. The reconstructed reference interval of each test item of the physical examination class may further be divided into a plurality of grouped reconstructed reference intervals. For example, the reconstructed reference interval of the platelet count of the physical examination class includes the reconstructed reference intervals of the platelet count for a first group, a second group, a third group, a fourth group, a fifth group and a sixth group, as well as for other groups.
The reconstructed reference interval of each group corresponds to one group of test subjects, the test subjects being grouped by age and sex. For example, for the platelet count of the physical examination class, the first group corresponds to females aged 6 months to <1 year, the second group to females aged 1 year to <2 years, the third group to females aged 2 years to <6 years, the fourth group to females aged 6 years to <12 years, the fifth group to females aged 12 years to <18 years, the sixth group to females aged 18 years to <53 years, and the remaining groups to other groups of test subjects.
During operation, the test data acquisition module 30 extracts a test data set of at least one test item within a predetermined period of time from the laboratory information system. In particular, compared with existing reference interval suitability evaluation procedures, real-world test data can be extracted from the laboratory information system without recruiting additional subjects for testing, which significantly reduces the cost of evaluating reference interval suitability.
In one implementation, the test data acquisition module 30 first extracts test data sets from the laboratory information system for a predetermined period of time, and then groups the test data sets for the predetermined period of time first by test subjects (in one particular example, test subjects are grouped by age and gender), as shown in fig. 3 and 4, then by category of visit, and then by type of test item.
Likewise, the test data set within the predetermined period of time may be split into a plurality of grouped test data subsets, e.g., a first grouped test data subset, a second grouped test data subset, a third grouped test data subset, a fourth grouped test data subset, a fifth grouped test data subset, a sixth grouped test data subset, and further grouped test data subsets. The first grouped test data subset covers females aged 6 months to 1 year, the second covers females aged 1 to 2 years, the third covers females aged 2 to 6 years, the fourth covers females aged 6 to 12 years, the fifth covers females aged 12 to 18 years, the sixth covers females aged 18 to 53 years, and the further grouped test data subsets cover the remaining groups of test subjects.
The test data subset of each group includes a first type of test item subset whose visit type is physical examination, a second type of test item subset whose visit type is clinic, and a third type of test item subset whose visit type is hospitalization. For example, the test data subset of the group of test subjects aged 12 to 18 years includes one first type of test item subset, one second type of test item subset and one third type of test item subset, and the test data subset of the group of test subjects aged 18 to 53 years includes another first type of test item subset, another second type of test item subset and another third type of test item subset.
The test item subset of each visit type in each grouped test data subset includes test data of at least one test item. Accordingly, the first type of test item subset of each grouped test data subset comprises a single test item data subset of at least one physical examination class, the second type of test item subset comprises a single test item data subset of at least one clinic class, and the third type of test item subset comprises a single test item data subset of at least one hospitalization class. For example, for the grouped test data subset whose test subjects are females aged 18 to 53 years, the first type of test item subset comprises the platelet count test data of the physical examination class for that group, and so forth. It should be understood that the test data set extracted from the laboratory information system for the predetermined period of time may alternatively be split first by test subject group (the test subjects being grouped by age and sex), then by test item, and then by visit type.
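The splitting order described above (subject group, then visit type, then test item) can be sketched as a single keyed grouping. The record fields (`sex`, `age`, `visit_type`, `item`, `value`), the visit type labels and the sample records are hypothetical; only the female age bands come from the text.

```python
from collections import defaultdict

# Female age bands in years, taken from the grouping example in the text.
FEMALE_AGE_BANDS = [(0.5, 1), (1, 2), (2, 6), (6, 12), (12, 18), (18, 53)]


def age_band(sex, age_years):
    """Return the (low, high) band for a subject, or "other" if none applies."""
    if sex == "F":
        for low, high in FEMALE_AGE_BANDS:
            if low <= age_years < high:
                return (low, high)
    return "other"


def group_records(records):
    """Split records by subject group, then visit type, then test item."""
    groups = defaultdict(list)
    for rec in records:
        key = (rec["sex"], age_band(rec["sex"], rec["age"]),
               rec["visit_type"], rec["item"])
        groups[key].append(rec["value"])
    return dict(groups)


# Hypothetical records standing in for a laboratory information system extract.
records = [
    {"sex": "F", "age": 30, "visit_type": "physical", "item": "PLT", "value": 250},
    {"sex": "F", "age": 45, "visit_type": "physical", "item": "PLT", "value": 190},
    {"sex": "F", "age": 30, "visit_type": "clinic", "item": "PLT", "value": 95},
    {"sex": "F", "age": 14, "visit_type": "inpatient", "item": "ALB", "value": 38},
]
grouped = group_records(records)
for key, values in grouped.items():
    print(key, values)
```

Because the key is a tuple, the alternative order mentioned in the text (subject group, test item, visit type) only changes the position of the fields in the key, not the resulting subsets.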
Accordingly, in the process of performing data segmentation on the inspection data set of each inspection item according to three visit types of physical examination, clinic and hospitalization by the data grouping module 40 to obtain a first inspection data subset, a second inspection data subset and a third inspection data subset, the first inspection data subset in the inspection data set of each inspection item is the inspection data subset with the visit type of the inspection item being physical examination, the second inspection data subset in the inspection data set of each inspection item is the inspection data subset with the visit type of the inspection item being clinic, and the third inspection data subset in the inspection data set of each inspection item is the inspection data subset with the visit type of the inspection item being hospitalization.
Specifically, during operation of the reference interval anomaly rate evaluation module 50, the first test data subset, the second test data subset, and the third test data subset are respectively calculated with respect to the upper limit anomaly rate and the lower limit anomaly rate of the preset reference interval and the reconstructed reference interval to obtain a reference interval anomaly rate evaluation result. For ease of illustration, a specific test item is exemplified, such as platelet count.
In a specific example, the reference interval abnormality rate evaluation module comprises a preset reference interval abnormality rate evaluation unit and a reconstruction reference interval abnormality rate evaluation unit, wherein the preset reference interval abnormality rate evaluation unit is used for respectively calculating the upper limit abnormality rate and the reference interval lower limit abnormality rate of a preset reference interval corresponding to the first test data subset, the second test data subset and the third test data subset to obtain a preset reference interval abnormality rate evaluation result, and the reconstruction reference interval abnormality rate evaluation unit is used for respectively calculating the upper limit abnormality rate and the lower limit abnormality rate of a reconstruction reference interval corresponding to the first test data subset, the second test data subset and the third test data subset to obtain a reconstruction reference interval abnormality rate evaluation result.
Specifically, the preset reference interval abnormality rate evaluation unit is further configured to calculate a ratio of the numbers of pieces of inspection data exceeding the upper limit value of the preset reference interval in the first, second and third inspection data subsets of the inspection item, respectively, to obtain upper limit abnormality rates of preset reference intervals corresponding to the first, second and third inspection data subsets. Here, the ratio is a ratio of the number of pieces of inspection data exceeding the upper limit value of the preset reference section divided by the cardinality of the respective inspection data subsets. Similarly, the preset reference interval abnormality rate evaluation unit is further configured to calculate ratios of numbers of inspection data below a lower limit value of the preset reference interval in the first, second, and third inspection data subsets of the inspection item, respectively, to obtain lower limit abnormality rates of preset reference intervals corresponding to the first, second, and third inspection data subsets.
Accordingly, the reconstruction reference interval abnormality rate evaluation unit may calculate the upper limit abnormality rate of the reconstruction reference interval and the lower limit abnormality rate of the reconstruction reference interval corresponding to the first, second and third inspection data subsets in a similar operation mode to the preset reference interval abnormality rate evaluation unit, and for brevity of text, the description will not be repeated.
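As an illustration of the two evaluation units, the sketch below computes the upper limit and lower limit abnormality rates of each visit type subset against both intervals. The function names and the sample values are assumptions; the two intervals are the female PLT intervals quoted later in this description (preset 101-320, reconstructed 144-384, in 10^9/L).

```python
def abnormal_rates(values, lower, upper):
    """Share of results above the upper limit and below the lower limit."""
    n = len(values)
    if n == 0:
        return {"upper": 0.0, "lower": 0.0}
    above = sum(1 for v in values if v > upper)
    below = sum(1 for v in values if v < lower)
    return {"upper": above / n, "lower": below / n}


def evaluate(subsets, preset, rebuilt):
    """Abnormality rates per visit type for the preset and reconstructed intervals."""
    report = {}
    for visit_type, values in subsets.items():
        report[visit_type] = {
            "preset": abnormal_rates(values, *preset),
            "rebuilt": abnormal_rates(values, *rebuilt),
        }
    return report


# Made-up platelet counts (10^9/L) for the three visit types.
subsets = {
    "physical": [150, 200, 250, 310, 330],
    "clinic": [90, 160, 240, 330, 400],
    "inpatient": [60, 95, 120, 300, 390],
}
report = evaluate(subsets, preset=(101, 320), rebuilt=(144, 384))
for visit_type, rates in report.items():
    print(visit_type, rates)
```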
As shown in fig. 5, in the embodiment of the present application, the suitability management module 60 includes a consistency evaluation unit 61 and a suitability evaluation result determination unit 62, where the consistency evaluation unit 61 is configured to perform consistency evaluation analysis on the preset reference interval abnormality rate evaluation result and the reconstruction reference interval abnormality rate evaluation result to obtain a consistency evaluation result, and the suitability evaluation result determination unit 62 is configured to generate a suitability evaluation result of the test item based on the consistency evaluation result, the preset and reconstruction reference interval abnormality rate evaluation result.
In the embodiment of the present application, the operation logic of the consistency evaluation unit 61 is as follows, calculating a difference between the upper limit anomaly rate of the reconstruction reference interval and the upper limit anomaly rate of the preset reference interval and calculating a difference between the lower limit anomaly rate of the reconstruction reference interval and the lower limit anomaly rate of the preset reference interval to obtain a first difference and a second difference, and generating the consistency evaluation result based on a comparison between the first difference and the second difference and a preset threshold.
In a specific example, the degree of difference between the upper limit abnormality rate of the reconstructed reference interval and the upper limit abnormality rate of the preset reference interval is calculated, that is, the abnormality rate of the preset reference interval is compared with the abnormality rate estimated from the reference interval obtained by the EP28 nonparametric method. If the difference is greater than 50%, the preset reference interval is not suitable; if the difference is less than 50%, it may be suitable.
In another embodiment of the present application, the operation logic of the consistency evaluation unit 61 is as follows: first, for any two of the first, second and third test data subsets, the ratios between their upper limit abnormality rates and between their lower limit abnormality rates for the preset reference interval are calculated to obtain a plurality of abnormality rate ratios; then, the consistency evaluation result is generated based on a comparison between each of the plurality of abnormality rate ratios and a preset threshold. Specifically, in one specific example, the reference interval may be unsuitable when the clinic or hospitalization abnormality rate differs too much or too little from the physical examination abnormality rate. For instance, the reference interval may be unsuitable if the clinic abnormality rate is more than 5 times the physical examination abnormality rate, or if the hospitalization abnormality rate is more than 20 times the physical examination abnormality rate.
It should be noted that, in implementation, the judgment range can be set according to the characteristics of different test items, the population distribution and the particularities of the hospital, which is not limited by the present application.
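The two consistency checks can be sketched as below. Several points are assumptions rather than statements of the described method: the difference is treated as a relative difference with the reconstructed-interval rate as the denominator, a difference exactly equal to the threshold is treated as acceptable, and the function names are hypothetical. The rates in the ratio example are the PLT figures reported in this description; the rates in the difference example are made up.

```python
def relative_difference(preset_rate, rebuilt_rate):
    """Relative gap between two rates, using the reconstructed rate as reference."""
    if rebuilt_rate == 0:
        return 0.0 if preset_rate == 0 else float("inf")
    return abs(preset_rate - rebuilt_rate) / rebuilt_rate


def difference_check(preset, rebuilt, threshold=0.5):
    """True if upper and lower rates both agree within the threshold."""
    return (relative_difference(preset["upper"], rebuilt["upper"]) <= threshold
            and relative_difference(preset["lower"], rebuilt["lower"]) <= threshold)


def ratio_flags(rates_by_visit, clinic_limit=5.0, inpatient_limit=20.0):
    """Flag visit types whose rate exceeds the allowed multiple of the physical rate."""
    base = rates_by_visit["physical"]

    def ratio(rate):
        if base == 0:
            return 0.0 if rate == 0 else float("inf")
        return rate / base

    return {
        "clinic": ratio(rates_by_visit["clinic"]) > clinic_limit,
        "inpatient": ratio(rates_by_visit["inpatient"]) > inpatient_limit,
    }


# PLT abnormality rates (%) for the preset interval, as reported in the text.
upper_rates = {"physical": 15.75, "clinic": 13.82, "inpatient": 10.56}
lower_rates = {"physical": 0.16, "clinic": 2.71, "inpatient": 18.27}
print(ratio_flags(upper_rates))
print(ratio_flags(lower_rates))

# Made-up rates for the difference check.
print(difference_check({"upper": 0.1575, "lower": 0.0016},
                       {"upper": 0.025, "lower": 0.025}))
```

The thresholds are parameters because the text leaves the judgment range to be set per test item, population and hospital.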
In a specific example, according to the age, sex and other conditions set for the reference interval, data from October 22, 2024 to November 22, 2024 are queried in the laboratory information system. For the three visit types of physical examination, clinic and hospitalization, the upper limit and lower limit abnormality rates of the preset PLT reference interval (male: 83-303×10^9/L, female: 101-320×10^9/L) are calculated, and the upper limit and lower limit abnormality rates are then calculated again using the reference interval recalculated from the physical examination data by the EP28 nonparametric method (male: 137-354×10^9/L, female: 144-384×10^9/L).
As a result of the suitability evaluation, the preset reference interval of PLT is significantly lower than both the interval obtained by the EP28 nonparametric method and the intervals in the reference literature. As shown in fig. 7, for the preset reference interval, the upper limit abnormality rates for physical examination, clinic and hospitalization are 15.75%, 13.82% and 10.56%, respectively, and the lower limit abnormality rates are 0.16%, 2.71% and 18.27%, respectively. In the abnormality grade distribution, the proportion of extremely abnormal results among hospitalized patients is 20.51%. The suitability of the PLT reference interval therefore needs further study.
As shown in FIG. 1, in a preferred embodiment of the present application, the medical test item reference interval suitability evaluation system 1000 further includes an abnormality grade determination criterion determination module 70 and an abnormality grade calculation module 80. The abnormal level judgment criterion determination module 70 is configured to determine an abnormal level judgment criterion based on the set of physical examination data. The abnormality level calculation module 80 is configured to analyze the second subset of test data and the third subset of test data based on the abnormality level determination criteria to obtain an out-patient test result abnormality level distribution and an in-patient test result abnormality level distribution.
In a specific example, the abnormality grade judgment criterion determining module 70 determines, based on the set of physical examination data, the upper and lower limit values of each of five grades (as shown in fig. 6, 7, 8 and 9): normal, slightly abnormal, moderately abnormal, severely abnormal and extremely abnormal. Physical examination data not belonging to the normal grade are regarded as abnormal physical examination data; that is, abnormal physical examination data include the slightly abnormal, moderately abnormal, severely abnormal and extremely abnormal grades.
In this specific example, the abnormality grade judgment criterion determining module is further configured to determine the upper and lower limits of each grade from the distribution of the set of physical examination data as follows: the normal grade, based on a two-sided condition of more than 2.5% and less than 97.5%, or a one-sided condition of less than 95%; the mild grade, based on a two-sided condition of 1.5% to 2.5% or 97.5% to 98.5%, or a one-sided condition of 95% to 97%; the moderate grade, based on a two-sided condition of 1.0% to 1.5% or 98.5% to 99.0%, or a one-sided condition of 97.0% to 98.0%; the severe grade, based on a two-sided condition of 0.5% to 1.0% or 99.0% to 99.5%, or a one-sided condition of 98.0% to 99.0%; and the extreme grade, based on a two-sided condition of not more than 0.5% or not less than 99.5%, or a one-sided condition of not less than 99.0%.
In this particular example, the anomaly level calculation module 80 operates logic to count the proportion of the second subset of test data and the third subset of test data that are normal, slightly anomalous, moderately anomalous, severely anomalous, and extremely anomalous, respectively, based on the anomaly level criteria, in such a manner as to determine an outpatient test result anomaly level and an inpatient test result anomaly level.
In one specific example, for a given group of test item data within a given period in the laboratory, the data whose visit type is physical examination are screened out, and the abnormality grades of outpatient and inpatient test results are calculated based on the distribution of the physical examination big data. That is, the distribution of the physical examination big data serves as the basis for dividing the abnormality grades. Normal: the two-sided range of the physical examination data distribution is >2.5% and <97.5%, and the one-sided range is <95%. Mild: the two-sided range is 1.5-2.5% or 97.5-98.5%, and the one-sided range is 95-97%. Moderate: the two-sided range is 1.0-1.5% or 98.5-99.0%, and the one-sided range is 97.0-98.0%. Severe: the two-sided range is 0.5-1.0% or 99.0-99.5%, and the one-sided range is 98.0-99.0%. Extreme: the two-sided range is not more than 0.5% or not less than 99.5%, and the one-sided range is not less than 99.0%. Then, a cyclic algorithm updates the counts every day with the test results of the previous day; the cumulative percentage of each result value is calculated as the proportion of results at or below that value in the total number of results, and the population percentages are tabulated at the cut points 0.5, 1, 1.5, 2, 2.5, 3, 4, 5, 6.
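The grading rules can be sketched as a lookup on the cumulative percentage of a result within the physical examination distribution. The handling of values that fall exactly on a boundary is an assumption, since the text gives ranges without stating which end is inclusive; the function names and sample data are also hypothetical.

```python
from bisect import bisect_right
from collections import Counter

GRADES = ["normal", "mild", "moderate", "severe", "extreme"]


def cumulative_percent(sorted_reference, value):
    """Percentage of physical examination results at or below the value."""
    return 100.0 * bisect_right(sorted_reference, value) / len(sorted_reference)


def grade_two_sided(pct):
    """Grade from the distance to the nearer tail of the distribution."""
    tail = min(pct, 100.0 - pct)
    if tail <= 0.5:
        return "extreme"
    if tail <= 1.0:
        return "severe"
    if tail <= 1.5:
        return "moderate"
    if tail <= 2.5:
        return "mild"
    return "normal"


def grade_one_sided(pct):
    """Grade for items where only high values are abnormal."""
    if pct >= 99.0:
        return "extreme"
    if pct >= 98.0:
        return "severe"
    if pct >= 97.0:
        return "moderate"
    if pct >= 95.0:
        return "mild"
    return "normal"


def grade_distribution(reference_values, patient_values, two_sided=True):
    """Proportion of patient results falling in each abnormality grade."""
    ref = sorted(reference_values)
    grader = grade_two_sided if two_sided else grade_one_sided
    counts = Counter(grader(cumulative_percent(ref, v)) for v in patient_values)
    total = len(patient_values)
    return {g: counts.get(g, 0) / total for g in GRADES}


# Illustrative physical examination distribution: the integers 1..1000.
reference = list(range(1, 1001))
distribution = grade_distribution(reference, [5, 500, 500, 980])
print(distribution)
```

Applying `grade_distribution` to the clinic subset and to the hospitalization subset gives the two abnormality grade distributions described above.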
A practical example is as follows: on a certain day, 5 platelet count results are 109×10^9/L. If the value 109×10^9/L already exists in the cumulative percentages counted before, its count is updated: adding the previously accumulated count of 27 gives 32 in total, the number of results at or below 109×10^9/L becomes 104, and the cumulative percentage is 0.501%. If the value 109×10^9/L does not yet exist, it is added as a new result value with a count of 5, and the cumulative percentage is then calculated.
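The daily update in this example can be sketched with a running counter of result values. The counts for 109 (27 before the update, 5 added) and the 104 results at or below 109 come from the example; the other two entries in the counter are filler values chosen only so that the totals reproduce the stated 0.501%, and the function names are hypothetical.

```python
from collections import Counter


def add_daily_results(counts, daily_values):
    """Merge one day's result values into the running counts."""
    counts.update(daily_values)
    return counts


def cumulative_percent(counts, value):
    """Percentage of all accumulated results at or below the given value."""
    total = sum(counts.values())
    at_or_below = sum(c for v, c in counts.items() if v <= value)
    return 100.0 * at_or_below / total


# Running counts before the update; 100 and 250 are filler result values.
counts = Counter({100: 72, 109: 27, 250: 20654})
add_daily_results(counts, [109] * 5)

print(counts[109])                              # count for 109 after the update
print(round(cumulative_percent(counts, 109), 3))
```

If 109 had not been present, `Counter.update` would simply create the entry with a count of 5, which matches the second branch of the example.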
It should be noted that, when suitability management is performed on the preset reference intervals of a plurality of test items, the medical test item reference interval suitability evaluation system 1000 may evaluate the suitability of these preset reference intervals in parallel using multithreading. It should also be noted that, in evaluating the suitability of a reference interval, the abnormality grade evaluation result may be included as an additional index so that the actual suitability of the reference interval is evaluated more comprehensively, which is not limited by the present application.
As shown in FIG. 1, in this preferred embodiment of the application, the medical test item reference interval suitability assessment system 1000 also performs suitability management of reference intervals in conjunction with references. It is worth mentioning that suitability management of the reference interval in combination with the reference is an option.
Specifically, as shown in fig. 5, the medical test item reference interval suitability evaluation system 1000 further includes a reference acquisition module 90, a reference grade evaluation module 100, and a reference interval management module 110. The reference acquisition module 90 is used to search for references. The reference grade assessment module 100 is configured to perform quality assessment on a reference to obtain a reference quality grade assessment result. The reference section management module 110 is configured to extract and manage relevant data of a suggested reference section of the reference about at least one test item based on a large language model.
Accordingly, the medical test item reference interval suitability evaluation system 1000 also searches for a reference through the reference acquisition module 90, performs quality evaluation on the reference through the reference level evaluation module 100 to obtain a reference quality level evaluation result, and extracts and manages related data about a suggested reference interval of at least one test item in the reference through the reference interval management module 110 based on a large language model.
When the reference acquisition module 90 searches for references, references may be searched from large databases, for example, the Chinese literature database, the world wide web database, the Wiley database, etc., or references may be added by linking to online databases and library resources (as shown in fig. 11). The keywords (test item name, detection system, reference interval, etc.) are converted into a high-dimensional feature vector that captures their semantic and syntactic information; the similarity between this vector and the high-dimensional feature vector of each corpus knowledge text block is calculated to retrieve the most relevant content in the corpus knowledge text base; the search results are sorted by similarity to obtain a plurality of preliminary screening documents; and the preliminary screening documents are further screened by journal partition, impact factor, citation count, publication time and document type to obtain a plurality of target documents. External documents can also be imported into the document library by uploading them.
That is, in one specific example of the present application, the process of searching for references includes the steps of:
S1, inputting reference search keywords;
S2, carrying out semantic embedded coding on the reference search keywords to obtain reference search keyword semantic embedded coding vectors;
S3, extracting a set of corpus knowledge text blocks from a corpus knowledge base;
S4, respectively carrying out semantic coding on each corpus knowledge text block in the corpus knowledge text block set to obtain a corpus knowledge text block semantic embedded coding vector set;
S5, calculating the similarity of each corpus knowledge text block semantic embedded coding vector in the reference search keyword semantic embedded coding vector and the corpus knowledge text block semantic embedded coding vector set to obtain a similarity set;
and S6, sorting the set of similarity to obtain a plurality of preliminary screening documents.
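Steps S1 to S6 can be sketched end to end as below. The embedding here is a plain bag-of-words count vector and the similarity is cosine similarity; both are stand-ins chosen so the example is self-contained, not the semantic encoder or similarity estimator of the described system. The function names and the corpus text blocks are hypothetical.

```python
import math


def tokenize(text):
    return text.lower().split()


def build_vocab(texts):
    """Map every distinct token to a vector position."""
    vocab = sorted({tok for t in texts for tok in tokenize(t)})
    return {tok: i for i, tok in enumerate(vocab)}


def embed(text, vocab):
    """Toy embedding: token counts over the shared vocabulary."""
    vec = [0.0] * len(vocab)
    for tok in tokenize(text):
        if tok in vocab:
            vec[vocab[tok]] += 1.0
    return vec


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(keywords, blocks, top_k=3):
    vocab = build_vocab(blocks + [keywords])
    query_vec = embed(keywords, vocab)                  # S1-S2: encode keywords
    block_vecs = [embed(b, vocab) for b in blocks]      # S3-S4: encode text blocks
    sims = [cosine(query_vec, v) for v in block_vecs]   # S5: similarity set
    ranked = sorted(zip(sims, blocks), key=lambda p: p[0], reverse=True)  # S6
    return ranked[:top_k]


blocks = [
    "platelet count reference interval adults",
    "liver ultrasound imaging protocol",
    "albumin reference interval children",
]
results = search("platelet count reference interval", blocks)
for score, block in results:
    print(round(score, 3), block)
```

The further screening by journal partition, impact factor, citation count, publication time and document type would be applied to `results` as an additional filter.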
In a preferred embodiment, in order to improve the accuracy of calculating the similarity, the process of calculating the similarity between the semantic embedded coding vector of the reference search keyword and each semantic embedded coding vector of the corpus-knowledge text block in the set of semantic embedded coding vectors of the corpus-knowledge text block includes the following steps:
S51, carrying out semantic fine granularity interactive coding on the semantic embedded coding vector of the reference search keyword and the semantic embedded coding vector of the corpus knowledge text block to obtain a search keyword-corpus knowledge text block bidirectional fine granularity semantic association coding vector;
S52, inputting the search keyword-corpus knowledge text block bidirectional fine granularity semantic association coding vector into a similarity estimation module based on a decoder to obtain the similarity.
In this specific example, performing semantic fine granularity interactive coding on the reference search keyword semantic embedded coding vector and the corpus knowledge text block semantic embedded coding vector includes:
S511, performing principal component analysis on the semantic embedded coding vectors of the corpus knowledge text blocks to obtain a set of principal component feature coding vectors of the corpus knowledge text blocks, wherein the process is expressed as follows:
$$C = \frac{1}{n} v v^{T}, \qquad C = P \Lambda P^{T}, \qquad P = \{p_1, p_2, \ldots, p_k\}, \qquad \Lambda = \mathrm{diag}(\lambda_1, \lambda_2, \ldots, \lambda_k)$$
where $v$ represents the corpus knowledge text block semantic embedded coding vector, $(\cdot)^{T}$ represents the transpose of a vector, $n$ represents the length of the corpus knowledge text block semantic embedded coding vector, $C$ represents the covariance matrix, $P$ represents the set of principal component feature coding vectors of the corpus knowledge text block, $p_1$, $p_2$ and $p_k$ represent the first, second and $k$-th principal component feature coding vectors of the corpus knowledge text block, respectively, $k$ represents the total number of principal component feature coding vectors of the corpus knowledge text block, $\Lambda$ represents a diagonal matrix, $\mathrm{diag}(\cdot)$ represents the elements on the diagonal, and $\lambda_1$, $\lambda_2$ and $\lambda_k$ represent the first, second and $k$-th eigenvalues on the diagonal of the diagonal matrix.
It should be appreciated that the high-dimensional corpus knowledge text block semantic embedded coding vectors are first converted by principal component analysis into a lower-dimensional but more compact and information-rich representation, i.e., a set of corpus knowledge text block principal component feature coding vectors. This process is not just a simple dimension reduction operation; it is in effect a deep purification of the information in the original data. In this process, information that would otherwise be dispersed in a high-dimensional space is reorganized and concentrated on an ordered set of principal component axes. This means that a small number of leading principal components can capture most of the variance in the data, while redundant or noisy information is naturally filtered out. For example, when a document retrieval system is constructed, performing principal component analysis on each text block in the corpus can effectively eliminate multicollinearity in the text, so that the differences between text blocks become more obvious and the relevance and accuracy of the retrieval results are further improved. In addition, because the feature dimension is reduced, the time cost of model training is greatly reduced, and the risk of overfitting is also reduced, since fewer features mean fewer model parameters and therefore lower model complexity.
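A generic principal component analysis over a small set of embedding vectors can be sketched as follows. This is a textbook construction (covariance matrix, then power iteration with deflation), offered only to illustrate the idea of concentrating variance on a few axes; it is not claimed to match the exact computation in step S511, and the function names and sample vectors are assumptions.

```python
import math


def covariance_matrix(rows):
    """Population covariance matrix of row vectors (each row is one embedding)."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    return [[sum(c[i] * c[j] for c in centered) / n for j in range(d)]
            for i in range(d)]


def mat_vec(mat, v):
    return [sum(a * b for a, b in zip(row, v)) for row in mat]


def principal_components(cov, k, iters=200):
    """Top-k (eigenvalue, eigenvector) pairs by power iteration with deflation."""
    d = len(cov)
    mat = [row[:] for row in cov]
    components = []
    for _ in range(k):
        v = [1.0 + 0.1 * i for i in range(d)]  # arbitrary non-symmetric start
        norm = math.sqrt(sum(x * x for x in v))
        v = [x / norm for x in v]
        for _ in range(iters):
            w = mat_vec(mat, v)
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-12:  # no variance left in the remaining directions
                break
            v = [x / norm for x in w]
        eigenvalue = sum(a * b for a, b in zip(v, mat_vec(mat, v)))
        components.append((eigenvalue, v))
        # Deflate: remove the component just found before looking for the next.
        mat = [[mat[i][j] - eigenvalue * v[i] * v[j] for j in range(d)]
               for i in range(d)]
    return components


# Four made-up 2-dimensional embeddings lying on one line.
rows = [[1.0, 1.0], [2.0, 2.0], [3.0, 3.0], [4.0, 4.0]]
cov = covariance_matrix(rows)
components = principal_components(cov, k=2)
for eigenvalue, vector in components:
    print(round(eigenvalue, 6), [round(x, 6) for x in vector])
```

In this toy case the first component carries all of the variance and the second carries none, which is the "small number of leading components" effect described above in its most extreme form.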
S512, performing linear transformation on each corpus knowledge text block principal component feature coding vector in the set of corpus knowledge text block principal component feature coding vectors to obtain a set of corpus knowledge text block principal component linear transformation feature coding vectors, wherein each corpus knowledge text block principal component linear transformation feature coding vector in the set has the same feature scale as the reference search keyword semantic embedded coding vector, and the process is expressed as follows:
$$P' = \{\mathrm{Linear}(p_1), \mathrm{Linear}(p_2), \ldots, \mathrm{Linear}(p_k)\}$$
where $\mathrm{Linear}(\cdot)$ represents performing a linear transformation on its argument, $P'$ represents the set of principal component linear transformation feature coding vectors of the corpus knowledge text block, and $p_1$, $p_2$ and $p_k$ represent the first, second and $k$-th principal component feature coding vectors of the corpus knowledge text block, respectively.
It should be appreciated that by linear transformation, the scale of the principal component feature encoding vectors of each corpus knowledge text block can be adjusted so that they can be compared and interacted with under the same frame of reference. This scaling is not only to make the data of different modalities closer in value, but more important to ensure that the data of each modality can fairly contribute its unique information during the information fusion process, rather than that the information of some modalities is excessively amplified or ignored due to the scale difference.
S513, performing inter-modality independence coding on the reference search keyword semantic embedded coding vector and each corpus knowledge text block principal component feature coding vector in the set of corpus knowledge text block principal component feature coding vectors to obtain a set of search keyword-corpus knowledge text block principal component inter-modality independence coding matrices, wherein the process is expressed as follows:
where $p_i$ represents the $i$-th principal component feature coding vector of the corpus knowledge text block, $\phi$ and $\psi$ are feature mapping functions, such as a linear mapping or a nonlinear kernel function, $q$ represents the reference search keyword semantic embedded coding vector, $m$ represents the length of the reference search keyword semantic embedded coding vector, and $M_i$ represents the $i$-th search keyword-corpus knowledge text block principal component inter-modality independence coding matrix.
It should be appreciated that by inter-modality independence encoding, an optimized space can be created in which each text block in the search keyword and corpus is represented in a new, more refined manner, i.e., a collection of search keyword-corpus knowledge text block principal component inter-modality independence encoding matrices is formed. This way of coding emphasizes the difference and unique contribution of the two in terms of the expressed content, not just by formally combining the two types of vectors, but based on a deep understanding of the relationship between them. In particular, when comparing search keywords with a large number of corpus knowledge text blocks, direct fusion of these vectors can lead to problems, such as that certain high-dimensional features can have high correlation between different modalities, which can lead to duplication of information and increase complexity of the model. By means of inter-modality independence coding, one can identify features that are truly independent, so that the features can be utilized more effectively to enhance the performance of the model. This approach encourages the model to focus on those parts that provide unique information between different modalities, rather than simply superimposing all available information. By means of inter-modality independence coding, the system can better identify which documents contain not only terms directly related to query terms, but also unique information capable of supplementing and expanding the meaning of the query terms. The method is helpful to promote the relevance of the search result, so that the documents finally presented to the user are not only matched on the surface, but also high-quality resources closely connected with the query intention in a deep sense. 
Furthermore, such coding strategies also help balance the importance of the different modalities, ensuring that none of the modalities dominates the final decision due to their inherent characteristics (e.g., scale size). Therefore, the text blocks in the keyword and corpus can play the maximum role in the due range, and more accurate and valuable retrieval results are provided for users.
S514, calculating a set of search keyword-corpus knowledge text block principal component inter-modality independence soft constraint factors based on the set of search keyword-corpus knowledge text block principal component inter-modality independence encoding matrices, wherein the process is expressed by the following formula:
Wherein the terms of the formula denote the square of the Frobenius norm (F-norm) of a matrix and the i-th search keyword-corpus knowledge text block principal component inter-modality independence soft constraint factor.
It should be appreciated that this step calculates an inter-modality independence soft constraint factor between the search keyword and each corpus knowledge text block. This essentially creates a dynamic adjustment mechanism that lets the model adaptively weigh the importance of each modality according to the specifics of the input data. In particular, the soft constraint factor gives the model a flexible way to emphasize inter-modality independence while still accounting for the unavoidable correlation between modalities. This flexibility is critical for handling complex real-world data, which often contains information that is partly complementary and partly overlapping. Furthermore, because the soft constraint factor is dynamic, it adjusts as the input changes: if a query term is especially relevant to certain types of documents, the factor adapts so that the most relevant documents are prioritized. In this way, the system can maintain high accuracy and relevance both for highly specialized queries and for broad subject searches.
S515, inputting the reference search keyword semantic embedding encoding vector and the corpus knowledge text block principal component feature encoding vectors into a feature interaction response unit to obtain a set of search keyword-corpus knowledge text block principal component inter-modality fine-grained response interaction encoding vectors, wherein the process is expressed as follows:
Wherein the terms of the formula denote, in order: position-wise (element-wise) multiplication; an operation expressed in terms of a position vector; position-wise (element-wise) division; the concatenation function; the i-th weight matrix; the i-th bias vector; and the i-th search keyword-corpus knowledge text block principal component inter-modality fine-grained response interaction encoding vector.
It should be appreciated that when these vectors are input to the feature interaction response unit, the system begins fine-grained feature interactions. This is not just a simple vector addition or dot product operation, but rather a deep mining of complex relationships and potential patterns between different modality features. In particular, the feature interaction response unit explores multi-level associations between search keywords and each text block, including but not limited to semantic similarity, topical overlap, and subtle associations in context. This fine-grained interaction not only resides at the surface level, but it also goes deep into the feature dimension level, mining nonlinear relationships and hierarchies. This means that the system is able to capture information deeper in the literature. Finally, the fine granularity response interactive coding vector set among the principal component modes of the search keyword-corpus knowledge text block generated through the process provides a more comprehensive and fine representation form. These code vectors not only reflect the degree of direct matching between the query terms and the documents, but also reveal more complex interactions and complementary relationships between the two.
S516, dynamically and adaptively aggregating the set of search keyword-corpus knowledge text block principal component inter-modality fine-grained response interaction encoding vectors based on the set of search keyword-corpus knowledge text block principal component inter-modality independence soft constraint factors to obtain the interaction response encoding vector, wherein the process is expressed as follows:
Wherein the terms of the formula denote the normalized exponential (softmax) function and the interaction response encoding vector.
It should be understood that, with the previously calculated set of soft constraint factors for independence between principal component modalities of the search keyword-corpus knowledge text block, the system starts dynamic adaptive aggregation of fine-grained response interactive coding vectors between principal component modalities of the search keyword-corpus knowledge text block. The core of this process is to dynamically adjust the contribution weights of each fine-grained response according to soft constraint factors, thereby selectively fusing the most relevant feature information. The dynamic adjustment mechanism enables the system to flexibly cope with the characteristics of different input data. For example, when a query term is particularly relevant to a particular type of document, the system automatically increases the weight of that type of document in the final aggregate result. Conversely, if some documents, while containing relevant information about a portion of the query terms, deviate significantly from the query intent as a whole, their weights will be correspondingly reduced. This ensures that the resulting interaction response encoding vector contains not only the direct matching information between the query terms and the document, but also reflects a more complex complementary relationship between the two. In addition, the dynamic self-adaptive aggregation process can effectively reduce the influence of redundant information. Because soft constraint factors emphasize independence and complementarity between modalities, the system tends to preserve those portions that provide unique perspectives and information, while suppressing duplicate or extraneous information. In this way, the finally obtained interactive response coding vector is not only more compact and refined, but also has higher degree of distinction and expressive power.
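The aggregation in S516 can be illustrated with a minimal sketch. The formula itself is not reproduced in this text, so the sketch assumes one plausible reading: the soft constraint factors are passed through the normalized exponential (softmax) function to obtain weights, and the interaction response encoding vector is the weighted sum of the fine-grained response interaction encoding vectors. The function names `softmax` and `aggregate_responses` and the list-based data layout are hypothetical.

```python
import math


def softmax(scores):
    """Normalized exponential function over a list of floats."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def aggregate_responses(soft_constraint_factors, response_vectors):
    """Weighted sum of response vectors, with weights = softmax(factors).

    Assumption: one scalar factor per response vector, and all response
    vectors have the same length.
    """
    if len(soft_constraint_factors) != len(response_vectors):
        raise ValueError("need exactly one factor per response vector")
    weights = softmax(soft_constraint_factors)
    dim = len(response_vectors[0])
    aggregated = [0.0] * dim
    for weight, vector in zip(weights, response_vectors):
        for j in range(dim):
            aggregated[j] += weight * vector[j]
    return aggregated


if __name__ == "__main__":
    factors = [0.2, 1.5, 0.7]          # illustrative values only
    vectors = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    print(aggregate_responses(factors, vectors))
```

Under this reading, a text block with a larger soft constraint factor contributes more to the aggregated vector, which matches the description of weights being raised for relevant documents and reduced for documents that deviate from the query intent.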
When the reference level assessment module 100 assesses the quality of a reference to obtain a reference quality level assessment result, the target content of the reference may be extracted and/or translated using reference content extraction and translation techniques. PDF documents may be scanned with optical character recognition (OCR) to convert them into editable text. Foreign-language documents can be translated by a deep learning model with an encoder-decoder architecture: the encoder converts the source-language text into an intermediate representation in the form of a vector sequence, the decoder translates that intermediate representation into the target text, and an attention mechanism helps the model capture the meaning more accurately, so that the translation better matches the expression habits of the target language. The large language model is trained on a large corpus of medical documents and is thereby optimized for the medical field, with a stronger ability to understand and translate medical terms. Using the large language model to search the literature automatically and extract the data related to reference intervals avoids manually searching for documents and manually locating content within them, saving researchers time and effort.
Further, the reading order of the text can be determined with a page layout detection model, and text blocks can be processed with cleaning and formatting algorithms, for example abstract sentence function classification, chapter function recognition and citation function recognition, that is, attribute or relation discrimination on one or more types of knowledge units. Merging and post-processing algorithms are then applied to improve text quality. Finally, natural language processing (NLP) and machine learning techniques in a PDF parsing library are used to extract structured information from the document, such as the year of testing, region, population ethnicity, testing instrument, testing method, reference interval establishment method, reference interval and confidence interval. After extraction, these fields are automatically entered into the system to form a managed document library. In addition, because medical literature also contains unstructured information, such as inclusion criteria and exclusion criteria, this information needs to be extracted and consolidated into structured information for comparison and analysis.
Documents that study the reference interval of the same type of test item are grouped into one class by a machine learning classification model. A graph neural network is used to analyze the document citation network and the academic social network, which improves understanding of the relationships and influence among documents, and a pre-trained language model is used for deeper text understanding and analysis, thereby improving the accuracy of document quality assessment, as shown in Table 1.
To evaluate document quality more accurately, each document can be graded A, B, C or D against a document quality evaluation standard checklist. An overall grade of A means that the document reaches grade A on every quality evaluation criterion; an overall grade of B, C or D means that the lowest grade reached on any quality evaluation criterion is B, C or D, respectively. Grade D indicates that the reference interval evaluated in the document is not applicable to clinical practice.
Table 1 reference interval document quality evaluation checklist
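The overall-grade rule described above (the overall grade equals the lowest grade reached on any checklist criterion) can be sketched as follows. The checklist criteria themselves are not reproduced in this text, so the sketch simply takes a list of per-criterion grades; the names `GRADE_ORDER` and `overall_grade` are hypothetical.

```python
GRADE_ORDER = "ABCD"  # A is the best grade, D is the worst


def overall_grade(criterion_grades):
    """Return the lowest (worst) grade among the per-criterion grades."""
    if not criterion_grades:
        raise ValueError("at least one criterion grade is required")
    for grade in criterion_grades:
        if grade not in GRADE_ORDER:
            raise ValueError(f"unknown grade: {grade!r}")
    # The worst grade is the one that appears latest in GRADE_ORDER.
    return max(criterion_grades, key=GRADE_ORDER.index)


if __name__ == "__main__":
    print(overall_grade(["A", "B", "A", "C"]))
```

A document is therefore only graded A overall when every criterion is graded A, and a single D on any criterion marks the reference interval as not applicable to clinical practice.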
Subsequent data processing may be performed on references with higher quality grades. When the reference interval management module 110 extracts and manages, based on a large language model, the data in the references that relate to the suggested reference interval of at least one test item, the document library allows browsing of the document name, document quality, impact factor, publication date, journal, authors, full text, abstract, citations and so on. It also allows management of item names, units, nations, instruments (as shown in fig. 11), reagents, test methods, regions, reference interval establishment methods, thresholds, inclusion and exclusion criteria, reference intervals grouped by gender and age, and so on. In addition, documents and items are linked in a one-to-many relationship, so that the suggested reference intervals for the same test item from different documents can be grouped under the same item name and displayed graphically. By importing data from a laboratory information system, the abnormal rate and abnormality grade of a reference interval can be analyzed on real-world data for the different visit types (physical examination, outpatient, inpatient); that is, the abnormal rate and abnormality grade of the reconstructed reference interval corresponding to the first test data subset, the second test data subset and the third test data subset are analyzed and displayed graphically.
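The abnormal rate analysis by visit type described above can be sketched as follows. This passage does not define the abnormal rate, so the sketch assumes that the lower-limit abnormal rate is the share of results strictly below the lower limit and the upper-limit abnormal rate is the share strictly above the upper limit. The record layout `(visit_type, value)` and the function names are hypothetical.

```python
def abnormal_rates(values, lower, upper):
    """Return (lower_rate, upper_rate) for one subset of test results."""
    n = len(values)
    if n == 0:
        raise ValueError("cannot compute rates for an empty subset")
    below = sum(1 for v in values if v < lower)
    above = sum(1 for v in values if v > upper)
    return below / n, above / n


def rates_by_visit_type(records, lower, upper):
    """Split records by visit type, then compute both rates per subset.

    records: iterable of (visit_type, value) pairs, e.g. visit types
    "physical", "outpatient" and "inpatient".
    """
    subsets = {}
    for visit_type, value in records:
        subsets.setdefault(visit_type, []).append(value)
    return {
        visit_type: abnormal_rates(values, lower, upper)
        for visit_type, values in subsets.items()
    }


if __name__ == "__main__":
    demo = [
        ("physical", 5.0), ("physical", 12.0),
        ("outpatient", 1.0), ("outpatient", 5.0),
        ("inpatient", 20.0),
    ]
    for visit_type, rates in rates_by_visit_type(demo, 2.0, 10.0).items():
        print(visit_type, rates)
```

Running the same function once with the preset reference interval and once with the reconstructed reference interval gives the two sets of rates that the system compares.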
In one specific example, the process of extracting and managing relevant data in the reference about a suggested reference interval for at least one test item based on a large language model by the reference interval management module 110 includes evaluating consistency of the test item with a reference-based suggested reference interval. The operation logic is as follows:
The quality of a reference interval is evaluated automatically by calculating the bias ratio between the reference interval from the reference document and the reference interval of the test item, in order to judge whether the reference interval is suitable, and the result is displayed visually.
Wherein LL_0, UL_0 and Me_0 represent the lower limit, upper limit and median of the reference interval of the test item, and LL, UL and Me represent the lower limit, upper limit and median of the reference interval from the literature. When BR_LL or BR_UL is greater than 0.375, the bias is large and the reference interval of the test item agrees poorly with the reference interval in the document; otherwise, the bias is small and the two agree well. The comparison results are displayed graphically: as shown in fig. 12 and fig. 13, a reference list including authors (in English), year, sex, age, gestation period and so on is shown on the left of the graph, and reference intervals with a bias ratio greater than 0.375 are shown in red. BR_Me serves only as a reference, and no consistency criterion is set for it.
Based on data from the literature reference intervals, such as the study region and ethnicity, the testing instrument and method, the reference interval establishment method, the population grouping and the sample size, and in combination with the current situation of the test item in the laboratory, the upper and lower limits of the recommended reference interval for a given test item can be selected or estimated (by median or mode).
That is, the technical logic can be summarized as follows: the bias ratio between the upper limit of the preset reference interval of each group corresponding to each class of test data subset of each test item and the upper limit of the recommended reference interval of the group corresponding to that class of test data subset of the test item is calculated according to the following formula:
Wherein SD_RI represents a reference value; UL_0 represents the upper limit of the recommended reference interval of the group corresponding to a class of test data subset of a test item; LL_0 represents the lower limit of the recommended reference interval of that group; BR_UL represents the bias ratio of the upper limit of the preset reference interval of the group to the upper limit of the recommended reference interval of the group; and UL represents the upper limit of the preset reference interval of the group corresponding to a class of test data subset of a test item;
calculating a bias ratio between a lower limit value of a preset reference interval of each test item and a lower limit value of a suggested reference interval of the test item according to the following formula:
Wherein BR_LL represents the bias ratio of the lower limit of the preset reference interval of the group corresponding to a class of test data subset of a test item to the lower limit of the recommended reference interval of that group; LL represents the lower limit of the preset reference interval of the group; BR_Me represents the bias ratio of the median of the preset reference interval of the group to the median of the recommended reference interval of the group; and Me represents the median of the preset reference interval of the group.
In a specific example, the preset threshold is equal to 0.375. When BR_LL or BR_UL is greater than 0.375, the consistency between the recommended reference interval of the test item and the reconstructed reference interval of the test item is judged not to meet the preset requirement; when BR_LL and BR_UL are both less than or equal to 0.375, the consistency between the recommended reference interval of the test item and the reconstructed reference interval of the test item is judged to meet the preset requirement. BR_Me serves only as a reference, and no consistency criterion is set for it.
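The bias ratio check can be illustrated with a short sketch. The formulas are not reproduced in this text, so the sketch assumes a commonly used form: SD_RI = (UL_0 - LL_0) / 3.92, and each bias ratio is the absolute difference between the compared limit (or median) and the corresponding subscript-0 value, divided by SD_RI. The divisor 3.92, the use of absolute values, and the function names are assumptions rather than details confirmed by the source; only the 0.375 threshold and the rule that BR_Me is not used for the decision come from the text.

```python
def bias_ratios(ll, ul, me, ll0, ul0, me0):
    """Bias ratios of a compared interval against a baseline interval.

    ll, ul, me    : lower limit, upper limit, median being compared
    ll0, ul0, me0 : lower limit, upper limit, median of the baseline
    Assumption: SD_RI = (ul0 - ll0) / 3.92.
    """
    if ul0 <= ll0:
        raise ValueError("baseline upper limit must exceed lower limit")
    sd_ri = (ul0 - ll0) / 3.92
    return {
        "BR_LL": abs(ll - ll0) / sd_ri,
        "BR_UL": abs(ul - ul0) / sd_ri,
        "BR_Me": abs(me - me0) / sd_ri,  # reported for reference only
    }


def is_consistent(ratios, threshold=0.375):
    """Consistent only if both limit bias ratios are within the threshold."""
    return ratios["BR_LL"] <= threshold and ratios["BR_UL"] <= threshold


if __name__ == "__main__":
    ratios = bias_ratios(ll=3.6, ul=6.0, me=4.8, ll0=3.9, ul0=6.1, me0=5.0)
    print(ratios, is_consistent(ratios))
```

Keeping BR_Me in the returned dictionary but out of `is_consistent` mirrors the statement that the median bias ratio is informative only.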
In summary, the medical test item reference interval suitability assessment system 1000 according to an embodiment of the present application has been described. The system 1000 groups test data by gender and age and establishes reference intervals for the age bands of each gender, which can improve the accuracy of the reference intervals to some extent. The test data of subjects collected by the system 1000 come from the real medical records of real patients, so on the one hand the data source is reliable, and on the other hand there is no need to recruit subjects specifically for a reference interval study, which can improve feasibility to some extent and reduce cost.
The application and its embodiments have been described above without limitation, and the actual construction is not limited to the embodiments shown in the drawings. In summary, if a person of ordinary skill in the art, informed by this disclosure, devises without creative effort a structure or embodiment similar to this technical solution without departing from the gist of the present application, that structure or embodiment falls within the protection scope of the present application.
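The grouping by gender and age followed by interval reconstruction can be sketched as follows. The source only names the EP28 nonparametric method (in claim 1) without giving its steps, so this sketch assumes the usual rank-based form: the limits are the order statistics at ranks 0.025(n+1) and 0.975(n+1), with linear interpolation between neighbouring ranks and a minimum of 120 results per group. The record layout `(sex, age, value)`, the age bands and the function names are hypothetical.

```python
def nonparametric_interval(values, min_n=120):
    """Rank-based 2.5th and 97.5th percentile limits (assumed EP28 form)."""
    n = len(values)
    if n < min_n:
        raise ValueError(f"need at least {min_n} results, got {n}")
    ordered = sorted(values)

    def value_at_rank(rank):
        # rank is 1-based and may be fractional; clamp it to [1, n]
        rank = min(max(rank, 1.0), float(n))
        low = int(rank)
        frac = rank - low
        high = min(low + 1, n)
        return ordered[low - 1] + frac * (ordered[high - 1] - ordered[low - 1])

    return value_at_rank(0.025 * (n + 1)), value_at_rank(0.975 * (n + 1))


def intervals_by_group(records, age_bands, min_n=120):
    """Group (sex, age, value) records by sex and age band, then
    compute an interval for each group that has enough results."""
    groups = {}
    for sex, age, value in records:
        for low, high in age_bands:
            if low <= age < high:
                groups.setdefault((sex, (low, high)), []).append(value)
                break
    return {
        key: nonparametric_interval(vals, min_n)
        for key, vals in groups.items()
        if len(vals) >= min_n
    }


if __name__ == "__main__":
    demo = [("F", 30, float(v)) for v in range(1, 200)]
    print(intervals_by_group(demo, age_bands=[(18, 60), (60, 120)]))
```

Groups with fewer than `min_n` results are skipped rather than estimated, which is one conservative design choice among several possible ones.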

Claims (9)

1. A medical test item reference interval suitability evaluation system, comprising:
the preset reference interval extraction module is used for extracting at least one preset reference interval of the test item from the test item reference interval database;
The physical examination data acquisition and analysis module is used for acquiring a physical examination data set, and processing the physical examination data set by using an EP28 nonparametric method to determine a reconstruction reference interval of the test item;
A test data collection module for extracting a test data set of the test item for a predetermined period of time from a laboratory information system;
The data grouping module is used for carrying out data segmentation on the inspection data set of the inspection item according to three visit types of physical examination, clinic and hospitalization to obtain a first inspection data subset, a second inspection data subset and a third inspection data subset;
The reference interval abnormal rate evaluation module is used for calculating the upper limit abnormal rate and the lower limit abnormal rate of the first test data subset, the second test data subset and the third test data subset relative to a preset reference interval to obtain a preset reference interval abnormal rate evaluation result;
the suitability management module is used for carrying out consistency evaluation analysis on the preset reference interval abnormal rate evaluation result and the reconstruction reference interval abnormal rate evaluation result so as to obtain a consistency evaluation result.
2. The medical test item reference interval suitability assessment system of claim 1, wherein the reference interval abnormality rate assessment module comprises:
The preset reference interval abnormal rate evaluation unit is used for respectively calculating the upper limit abnormal rate and the lower limit abnormal rate of the preset reference interval corresponding to the first test data subset, the second test data subset and the third test data subset to obtain a preset reference interval abnormal rate evaluation result;
the reconstruction reference interval abnormal rate evaluation unit is used for respectively calculating the upper limit abnormal rate and the lower limit abnormal rate of the reconstruction reference interval corresponding to the first detection data subset, the second detection data subset and the third detection data subset to obtain a reconstruction reference interval abnormal rate evaluation result.
3. The medical test item reference interval suitability assessment system of claim 2, wherein the suitability management module comprises:
The consistency evaluation unit is used for carrying out consistency evaluation analysis on the preset reference interval abnormal rate evaluation result and the reconstruction reference interval abnormal rate evaluation result to obtain a consistency evaluation result;
And the suitability evaluation result determining unit is used for generating a suitability evaluation result of the test item based on the consistency evaluation result, the preset reference interval abnormal rate evaluation result and the reconstruction reference interval abnormal rate evaluation result.
4. The medical test item reference interval suitability assessment system of claim 3, wherein said consistency assessment unit is further configured to:
Calculating the difference between the upper limit abnormal rate of the reconstruction reference interval and the upper limit abnormal rate of the preset reference interval and calculating the difference between the lower limit abnormal rate of the reconstruction reference interval and the lower limit abnormal rate of the preset reference interval to obtain a first difference and a second difference;
and generating the consistency evaluation result based on the comparison between the first difference degree and the second difference degree and a preset threshold value.
5. The medical test item reference interval suitability assessment system of claim 3, wherein said consistency assessment unit is further configured to:
Calculating the ratio between the upper limit abnormal rate of a preset reference interval or the lower limit abnormal rate of the preset reference interval of any two of the first test data subset, the second test data subset and the third test data subset to obtain a plurality of abnormal rate ratios;
and generating the consistency evaluation result based on the comparison between each abnormal rate ratio in the abnormal rate ratios and a preset threshold value.
6. The medical test item reference interval suitability assessment system of claim 1, further comprising:
The abnormal grade judgment standard determining module is used for determining an abnormal grade judgment standard based on the set of physical examination data;
And the abnormality grade calculation module is used for analyzing the second test data subset and the third test data subset based on the abnormality grade standard to obtain an abnormality grade distribution of the examination result of the outpatient and an abnormality grade distribution of the examination result of the inpatient.
7. The medical test item reference interval suitability assessment system of claim 6, wherein the abnormality rating criteria determination module is further configured to:
determining the upper and lower limits of the normal grade based on a two-sided grading condition of greater than 2.5% and less than 97.5% of the distribution of the physical examination data set, or a one-sided grading condition of less than 95% of the distribution of the physical examination data set;
determining the upper and lower limits of the mild grade based on a two-sided grading condition of greater than 1.5% and less than or equal to 2.5%, or greater than 97.5% and less than 98.5%, of the distribution of the physical examination data set, or a one-sided grading condition of greater than or equal to 95% and less than 97% of the distribution of the physical examination data set;
determining the upper and lower limits of the moderate grade based on a two-sided grading condition of greater than or equal to 1.0% and less than or equal to 1.5%, or greater than or equal to 98.5% and less than 99.0%, of the distribution of the physical examination data set, or a one-sided grading condition of greater than or equal to 97.0% and less than 98.0% of the distribution of the physical examination data set;
determining the upper and lower limits of the severe grade based on a two-sided grading condition of greater than or equal to 0.5% and less than or equal to 1.0%, or greater than or equal to 99.0% and less than 99.5%, of the distribution of the physical examination data set, or a one-sided grading condition of greater than or equal to 98.0% and less than 99.0% of the distribution of the physical examination data set;
determining the upper and lower limits of the extreme grade based on a two-sided grading condition of 0.5% or less, or 99.5% or more, of the distribution of the physical examination data set, or a one-sided grading condition of 99.0% or more of the distribution of the physical examination data set.
8. The medical test item reference interval suitability assessment system of claim 1, further comprising:
the reference acquisition module is used for searching references;
The reference grade evaluation module is used for carrying out quality evaluation on the reference to obtain a reference quality grade evaluation result;
And the reference section management module is used for extracting and managing relevant data of the suggested reference section of the test item from the reference based on a large language model.
9. The medical test item reference interval suitability assessment system of claim 8, wherein the reference interval management module is further configured to:
calculating a bias ratio between a proposed reference interval of the test item and the reconstructed reference interval;
When the bias ratio is larger than a preset threshold value, judging that the consistency between the recommended reference interval of the test item and the reconstructed reference interval of the test item does not meet the preset requirement;
And when the bias ratio is smaller than or equal to a preset threshold value, judging that the consistency between the recommended reference interval of the test item and the reconstructed reference interval of the test item meets the preset requirement.
CN202510348983.6A 2025-03-24 2025-03-24 Medical examination project reference interval suitability evaluation system Active CN119889552B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202510348983.6A CN119889552B (en) 2025-03-24 2025-03-24 Medical examination project reference interval suitability evaluation system

Publications (2)

Publication Number Publication Date
CN119889552A CN119889552A (en) 2025-04-25
CN119889552B true CN119889552B (en) 2025-05-27

Family

ID=95441963

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202510348983.6A Active CN119889552B (en) 2025-03-24 2025-03-24 Medical examination project reference interval suitability evaluation system

Country Status (1)

Country Link
CN (1) CN119889552B (en)


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant