CN111724374B - Evaluation method and terminal of analysis result - Google Patents

Publication number
CN111724374B
Authority
CN
China
Prior art keywords
result
test
marking result
marking
gold standard
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010572829.4A
Other languages
Chinese (zh)
Other versions
CN111724374A (en)
Inventor
林晨
喻碧莺
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Lin Chen
Wisdom Medical Shenzhen Co ltd
Original Assignee
Wisdom Medical Shenzhen Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wisdom Medical Shenzhen Co ltd
Priority to CN202010572829.4A
Publication of CN111724374A
Application granted
Publication of CN111724374B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/0002 Inspection of images, e.g. flaw detection
    • G06T7/0012 Biomedical image inspection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/70 Determining position or orientation of objects or cameras
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10072 Tomographic images
    • G06T2207/10101 Optical tomography; Optical coherence tomography [OCT]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/30 Subject of image; Context of image processing
    • G06T2207/30004 Biomedical image processing
    • G06T2207/30041 Eye; Retina; Ophthalmic
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02P CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00 Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/30 Computing systems specially adapted for manufacturing

Abstract

The invention discloses a method and a terminal for evaluating an analysis result. A preset number of files is acquired as a test set; a first marking result of a first device on the test set and a second marking result of an AI model on the test set are acquired; a gold standard is acquired, and a t-test is performed on a first difference between the first marking result and the gold standard and a second difference between the second marking result and the gold standard to obtain a first test result; whether the first test result is greater than a threshold is judged, and if so, the second marking result is judged to be accurate. By acquiring the first mark of the first device and the second mark of the AI model for the same test set, acquiring the gold standard of the test set, calculating the difference between the first mark and the gold standard and the difference between the second mark and the gold standard, and performing a t-test on these differences, the accuracy of the AI model relative to the first device is obtained and accuracy evaluation is realized.

Description

Evaluation method and terminal of analysis result
Technical Field
The present invention relates to the field of statistical methods, and in particular, to a method and a terminal for evaluating an analysis result.
Background
Existing methods for evaluating the accuracy of an AI analysis result mainly calculate the difference between the AI analysis result and the gold standard. This approach does not reveal the spatial distribution of the errors. For example, in some scenarios the AI measurement results tend to deviate in the horizontal direction while deviating very little in the vertical direction. Such a spatial distribution is an important cue for further improvement, but existing methods cannot show it. An evaluation method that compares the accuracy of the AI with that of manual marking therefore needs to be designed.
Disclosure of Invention
The technical problem to be solved by the invention is to provide a method and a terminal for evaluating an analysis result that accurately evaluate an AI analysis result.
In order to solve the technical problem, the invention adopts the following technical scheme:
a method of evaluating an analysis result, comprising the steps of:
S1, acquiring a preset number of files as a test set;
S2, acquiring a first marking result of a first device on the test set and a second marking result of an AI model on the test set;
S3, acquiring a gold standard, and performing a t-test on a first difference between the first marking result and the gold standard and a second difference between the second marking result and the gold standard to obtain a first test result;
S4, judging whether the first test result is greater than a threshold, and if so, judging that the second marking result is accurate.
In order to solve the technical problem, the invention adopts another technical scheme:
an evaluation terminal of analysis results, comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the following steps when executing the computer program:
S1, acquiring a preset number of files as a test set;
S2, acquiring a first marking result of a first device on the test set and a second marking result of an AI model on the test set;
S3, acquiring a gold standard, and performing a t-test on a first difference between the first marking result and the gold standard and a second difference between the second marking result and the gold standard to obtain a first test result;
S4, judging whether the first test result is greater than a threshold, and if so, judging that the second marking result is accurate.
The beneficial effects of the invention are as follows: the first device and the AI model both mark the same test set, and the difference between each marking result and the gold standard is calculated, so that the deviations of the different marking modes from the gold standard can be compared directly; performing a t-test on these differences quantifies the accuracy criterion, so that whether the marking result of the AI model is accurate can be judged directly from the t-test result, thereby realizing accuracy evaluation of the AI analysis result.
Drawings
FIG. 1 is a flow chart showing the steps of a method for evaluating analysis results according to an embodiment of the present invention;
FIG. 2 is a schematic structural diagram of an evaluation terminal for analysis results according to an embodiment of the present invention;
FIG. 3 is a scatter plot of an embodiment of the present invention;
Description of the reference numerals:
1. evaluation terminal of analysis results; 2. processor; 3. memory.
Detailed Description
In order to describe the technical content, objects and effects of the present invention in detail, the following description is made with reference to the embodiments and the accompanying drawings.
Referring to FIG. 1, a method for evaluating an analysis result includes the steps of:
S1, acquiring a preset number of files as a test set;
S2, acquiring a first marking result of a first device on the test set and a second marking result of an AI model on the test set;
S3, acquiring a gold standard, and performing a t-test on a first difference between the first marking result and the gold standard and a second difference between the second marking result and the gold standard to obtain a first test result;
S4, judging whether the first test result is greater than a threshold, and if so, judging that the second marking result is accurate.
From the above description, the beneficial effects of the invention are as follows: the first device and the AI model both mark the same test set, and the difference between each marking result and the gold standard is calculated, so that the deviations of the different marking modes from the gold standard can be compared directly; performing a t-test on these differences quantifies the accuracy criterion, so that whether the marking result of the AI model is accurate can be judged directly from the t-test result, thereby realizing accuracy evaluation of the AI analysis result.
Further, the first difference and the second difference in step S3 are calculated as follows:
acquiring the coordinates of the first marking result, the coordinates of the second marking result and the coordinates of the gold standard;
and calculating, by using a trigonometric function, a first difference between the coordinates of the first marking result and the coordinates of the gold standard and a second difference between the coordinates of the second marking result and the coordinates of the gold standard.
From the above description, the deviations of the first marking result and of the second marking result from the gold standard are obtained from their coordinates, which makes it convenient to compare the deviation of the first mark from the gold standard with the deviation of the second mark from the gold standard.
Further, the step S3 further includes:
generating a scatter diagram from the first difference and the second difference, taking the gold standard as the circle center and the difference as the radius;
or generating a scatter diagram from a third difference between the coordinates of the first marking result and the coordinates of the second marking result, taking the first marking result as the circle center;
or generating a scatter diagram from the third difference, taking the second marking result as the circle center.
As can be seen from the above description, a scatter diagram generated from the first difference, the second difference and the gold standard shows intuitively how far the first mark and the second mark each deviate from the gold standard, that is, how accurate the first and second marking results are. When the gold standard cannot be obtained, the first mark or the second mark can be used as the circle center and the scatter diagram generated from the third difference between the first mark and the second mark, so that the deviation between the two marks is shown intuitively and the accuracy of the second mark is easier to judge.
Further, the step S2 further includes:
acquiring a third marking result of a second device on the test set and a fourth marking result of the first device on the test set, wherein the fourth marking result and the first marking result are generated at different times;
the step S4 further includes:
calculating a first intraclass correlation coefficient between the first marking result and the fourth marking result, a second intraclass correlation coefficient between the first marking result and the third marking result, and a third intraclass correlation coefficient between the first marking result and the second marking result;
performing a t-test on the first intraclass correlation coefficient and the third intraclass correlation coefficient to obtain a second test result, and performing a t-test on the second intraclass correlation coefficient and the third intraclass correlation coefficient to obtain a third test result;
judging whether the second test result and the third test result are both greater than a threshold, and if so, judging that the second marking result is repeatable.
As can be seen from the above description, adding a second device to obtain the third marking result, and obtaining from the first device a fourth marking result generated at a different time from the first marking result, adds control groups and makes the evaluation result more reliable. Performing t-tests on the intraclass correlation coefficients of the control groups and of the AI group further evaluates the repeatability of the second marking result, i.e. the AI marking result, so that the evaluation covers more dimensions and is more reliable.
Further, calculating the first intraclass correlation coefficient, the second intraclass correlation coefficient and the third intraclass correlation coefficient specifically comprises:
calculating the first intraclass correlation coefficient, the second intraclass correlation coefficient and the third intraclass correlation coefficient by using a bootstrap method, to obtain a plurality of first intraclass correlation coefficients, a plurality of second intraclass correlation coefficients and a plurality of third intraclass correlation coefficients, respectively.
From the above description, calculating the intraclass correlation coefficients by the bootstrap method yields a large number of coefficients from a single group of samples, so that t-tests can subsequently be performed on the coefficients of different groups to obtain comparison results and finally realize the repeatability evaluation.
Referring to FIG. 2, an evaluation terminal of analysis results includes a memory, a processor and a computer program stored in the memory and executable on the processor, wherein the processor implements the following steps when executing the computer program:
S1, acquiring a preset number of files as a test set;
S2, acquiring a first marking result of a first device on the test set and a second marking result of an AI model on the test set;
S3, acquiring a gold standard, and performing a t-test on a first difference between the first marking result and the gold standard and a second difference between the second marking result and the gold standard to obtain a first test result;
S4, judging whether the first test result is greater than a threshold, and if so, judging that the second marking result is accurate.
The beneficial effects of the invention are as follows: the first device and the AI model both mark the same test set, and the difference between each marking result and the gold standard is calculated, so that the deviations of the different marking modes from the gold standard can be compared directly; performing a t-test on these differences quantifies the accuracy criterion, so that whether the marking result of the AI model is accurate can be judged directly from the t-test result, thereby realizing accuracy evaluation of the AI analysis result.
Further, the processor calculates the first difference and the second difference in step S3 as follows:
acquiring the coordinates of the first marking result, the coordinates of the second marking result and the coordinates of the gold standard;
and calculating, by using a trigonometric function, a first difference between the coordinates of the first marking result and the coordinates of the gold standard and a second difference between the coordinates of the second marking result and the coordinates of the gold standard.
From the above description, the deviations of the first marking result and of the second marking result from the gold standard are obtained from their coordinates, which makes it convenient to compare the deviation of the first mark from the gold standard with the deviation of the second mark from the gold standard.
Further, the step S3 further includes:
generating a scatter diagram from the first difference and the second difference, taking the gold standard as the circle center and the difference as the radius;
or generating a scatter diagram from a third difference between the coordinates of the first marking result and the coordinates of the second marking result, taking the first marking result as the circle center;
or generating a scatter diagram from the third difference, taking the second marking result as the circle center.
As can be seen from the above description, a scatter diagram generated from the first difference, the second difference and the gold standard shows intuitively how far the first mark and the second mark each deviate from the gold standard, that is, how accurate the first and second marking results are. When the gold standard cannot be obtained, the first mark or the second mark can be used as the circle center and the scatter diagram generated from the third difference between the first mark and the second mark, so that the deviation between the two marks is shown intuitively and the accuracy of the second mark is easier to judge.
Further, the step S2 further includes:
acquiring a third marking result of a second device on the test set and a fourth marking result of the first device on the test set, wherein the fourth marking result and the first marking result are generated at different times;
the step S4 further includes:
calculating a first intraclass correlation coefficient between the first marking result and the fourth marking result, a second intraclass correlation coefficient between the first marking result and the third marking result, and a third intraclass correlation coefficient between the first marking result and the second marking result;
performing a t-test on the first intraclass correlation coefficient and the third intraclass correlation coefficient to obtain a second test result, and performing a t-test on the second intraclass correlation coefficient and the third intraclass correlation coefficient to obtain a third test result;
judging whether the second test result and the third test result are both greater than a threshold, and if so, judging that the second marking result is repeatable.
As can be seen from the above description, adding a second device to obtain the third marking result, and obtaining from the first device a fourth marking result generated at a different time from the first marking result, adds control groups and makes the evaluation result more reliable. Performing t-tests on the intraclass correlation coefficients of the control groups and of the AI group further evaluates the repeatability of the second marking result, i.e. the AI marking result, so that the evaluation covers more dimensions and is more reliable.
Further, calculating the first intraclass correlation coefficient, the second intraclass correlation coefficient and the third intraclass correlation coefficient specifically comprises:
calculating the first intraclass correlation coefficient, the second intraclass correlation coefficient and the third intraclass correlation coefficient by using a bootstrap method, to obtain a plurality of first intraclass correlation coefficients, a plurality of second intraclass correlation coefficients and a plurality of third intraclass correlation coefficients, respectively.
From the above description, calculating the intraclass correlation coefficients by the bootstrap method yields a large number of coefficients from a single group of samples, so that t-tests can subsequently be performed on the coefficients of different groups to obtain comparison results and finally realize the repeatability evaluation.
Referring to FIG. 1, a first embodiment of the present invention is as follows:
the evaluation method of the analysis result specifically comprises the following steps:
S1, acquiring a preset number of files as a test set;
in an alternative embodiment, the files are images;
S2, acquiring a first marking result of a first device on the test set and a second marking result of an AI model on the test set;
a first marker marks the test set through the first device to obtain the first marking result;
S3, acquiring a gold standard, and performing a t-test on a first difference between the first marking result and the gold standard and a second difference between the second marking result and the gold standard to obtain a first test result;
the gold standard is the correct position of the mark. Taking the determination of the gold standard of the macular fovea in an ultra-wide-angle fundus image as an example, two examinations, OCTA (Optical Coherence Tomography Angiography) and ultra-wide-angle fundus imaging, can be carried out on the same patient. The position of the macular fovea is determined in the OCTA tomographic image, and from it the relative positional relationship between the macular fovea and the retinal blood vessels in the OCTA tomographic image is determined. Because OCTA and ultra-wide-angle fundus imaging capture the same blood vessels, a linear regression equation is established from the relative positions of the macular fovea and the retinal blood vessels in the OCTA tomographic image, and the accurate position of the macular fovea, i.e. the gold standard of the macular fovea, is then obtained from the positions of the retinal blood vessels in the ultra-wide-angle fundus image;
a corresponding gold standard is calculated for each file in the test set;
on this basis, the first difference and the second difference in step S3 are calculated as follows:
acquiring the coordinates of the first marking result, the coordinates of the second marking result and the coordinates of the gold standard; specifically, the marked files in the test set can be placed in the same coordinate system in the same manner, and the coordinates of the first marking result, the second marking result and the gold standard are obtained;
in an alternative embodiment, the images in the test set are all the same size, for example 3000×4000 pixels, and the pixel position of each mark point can be used directly as the coordinates of the marking result;
calculating, from the coordinates and by using a trigonometric function, the first difference between the coordinates of the first marking result and the coordinates of the gold standard and the second difference between the coordinates of the second marking result and the coordinates of the gold standard; for example, if the coordinates of the mark for a file in the second marking result are $[X_{AI}, Y_{AI}]$ and the coordinates of the mark for the same file in the first marking result are $[X_{Human1}, Y_{Human1}]$, the difference is:
$$\sqrt{(X_{AI} - X_{Human1})^2 + (Y_{AI} - Y_{Human1})^2}$$
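The distance formula above can be sketched in a few lines of Python. This is only an illustration: the function name `mark_distance` and the pixel coordinates are hypothetical and are not taken from the patent.

```python
import math

def mark_distance(mark, reference):
    """Euclidean distance (in pixels) between two (x, y) marks."""
    return math.hypot(mark[0] - reference[0], mark[1] - reference[1])

# Hypothetical pixel coordinates for three files of a test set.
gold = [(1500, 2000), (1480, 2100), (1525, 1990)]
human = [(1503, 2004), (1480, 2095), (1519, 1998)]
ai = [(1500, 2010), (1492, 2105), (1525, 1990)]

# First difference: manual mark vs. gold standard, one value per file.
first_differences = [mark_distance(h, g) for h, g in zip(human, gold)]
# Second difference: AI mark vs. gold standard, one value per file.
second_differences = [mark_distance(a, g) for a, g in zip(ai, gold)]
```

Each list holds one distance per file, which is the form in which the differences are later passed to the t-test.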
After the first difference and the second difference are obtained in step S3, the method further includes:
generating a scatter diagram from the first difference and the second difference, taking the coordinates of the gold standard as the circle center and the difference as the radius;
or generating a scatter diagram from a third difference between the coordinates of the first marking result and the coordinates of the second marking result, taking the coordinates of the first marking result as the circle center;
or generating a scatter diagram from the third difference, taking the coordinates of the second marking result as the circle center;
in an alternative embodiment, the direction of the coordinates of the first marking result relative to the coordinates of the gold standard and the direction of the coordinates of the second marking result relative to the coordinates of the gold standard are also obtained, and the scatter diagram is generated from the differences and the directions; or the scatter diagram is generated directly from the coordinates of the first marking result, the coordinates of the second marking result and the coordinates of the gold standard;
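As a minimal sketch of how the scatter-diagram data might be prepared, the following Python computes, for each file, the offset of a mark from its center point (here the gold standard) together with its distance and direction. The helper names and the coordinates are hypothetical, and the actual plotting is left to whatever charting tool is in use.

```python
import math

def polar_offset(mark, center):
    """Return (distance, angle in degrees) of a mark relative to a center point."""
    dx = mark[0] - center[0]
    dy = mark[1] - center[1]
    return math.hypot(dx, dy), math.degrees(math.atan2(dy, dx))

def scatter_points(marks, centers):
    """Offsets (dx, dy) of each mark from its own center, to be plotted around the origin."""
    return [(m[0] - c[0], m[1] - c[1]) for m, c in zip(marks, centers)]

# Hypothetical coordinates: gold standard and AI marks for two files.
gold = [(1500, 2000), (1480, 2100)]
ai = [(1500, 2010), (1492, 2100)]

points = scatter_points(ai, gold)            # one (dx, dy) point per file
radius, angle = polar_offset(ai[0], gold[0])  # distance and direction for the first file

# Mean absolute offset per axis hints at a horizontal or vertical trend.
mean_abs_dx = sum(abs(dx) for dx, _ in points) / len(points)
mean_abs_dy = sum(abs(dy) for _, dy in points) / len(points)
```

Plotting `points` around the origin places the gold standard at the circle center, so a cloud stretched along one axis indicates that the deviations are concentrated in that direction.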
S4, judging whether the first test result is greater than a threshold, and if so, judging that the second marking result is accurate;
the first test result is the P value of the t-test, and the threshold may be 0.05; that is, when the P value is greater than 0.05, the difference between the second mark and the gold standard is considered not to differ significantly from the difference between the first mark and the gold standard, i.e. the AI marking result is as accurate as the manual marking result of the first marker;
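The decision in step S4 can be illustrated with the sketch below. The text does not state which form of t-test is used; a paired t-test is assumed here because both sets of differences come from the same files. The function names, the sample distances and the Simpson-rule integration used to obtain the P value without external libraries are all illustrative choices.

```python
import math
import statistics

def t_two_sided_p(t_stat, df, steps=2000):
    """Two-sided P value of Student's t, by Simpson integration of the density."""
    upper = abs(t_stat)
    if upper == 0.0:
        return 1.0
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))

    def pdf(x):
        return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))

    h = upper / steps  # steps must be even for Simpson's rule
    total = pdf(0.0) + pdf(upper)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    central = total * h / 3  # P(0 < T < |t|)
    return max(0.0, 1.0 - 2.0 * central)

def paired_t_test(a, b):
    """Paired t-test (assumed variant); returns (t statistic, two-sided P value)."""
    diffs = [x - y for x, y in zip(a, b)]
    sd = statistics.stdev(diffs)
    if sd == 0:
        raise ValueError("all paired differences are identical")
    t_stat = statistics.mean(diffs) / (sd / math.sqrt(len(diffs)))
    return t_stat, t_two_sided_p(t_stat, len(diffs) - 1)

THRESHOLD = 0.05  # threshold value given in the text

# Hypothetical per-file distances to the gold standard, in pixels.
first_differences = [5.0, 5.0, 10.0, 7.0, 6.0, 8.0]    # manual marks
second_differences = [6.0, 4.0, 11.0, 7.5, 5.0, 9.0]   # AI marks

t_stat, p_value = paired_t_test(first_differences, second_differences)
second_marking_is_accurate = p_value > THRESHOLD
```

With these sample values the P value is well above 0.05, so the AI marks would be judged as accurate as the manual marks.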
in an alternative embodiment, when no gold standard is obtained, a Bland-Altman plot can be used to describe the consistency between the first mark and the second mark, i.e. the consistency between the second marking result of the AI and the first marking result of the first marker;
if the second marking result of the AI is in good agreement with the first marking result of the first marker, the marking result of the AI model is considered to be as accurate as the manual marking result of the first marker.
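The quantities behind a Bland-Altman plot can be sketched as follows. The text does not say which scalar is compared, so this example assumes a single coordinate (the X pixel position) per file; the function name and the data are hypothetical, and the 1.96 multiplier is the conventional choice for 95% limits of agreement.

```python
import statistics

def bland_altman(a, b):
    """Per-pair means and differences, bias, and 95% limits of agreement."""
    diffs = [x - y for x, y in zip(a, b)]
    means = [(x + y) / 2 for x, y in zip(a, b)]
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)
    limits = (bias - 1.96 * sd, bias + 1.96 * sd)
    return means, diffs, bias, limits

# Hypothetical X coordinates (pixels) of the manual and AI marks on four files.
human_x = [1503, 1480, 1519, 1490]
ai_x = [1500, 1492, 1525, 1488]

means, diffs, bias, limits = bland_altman(human_x, ai_x)
```

Plotting `diffs` against `means` with horizontal lines at `bias` and at the two `limits` gives the usual diagram; agreement is better the closer the bias is to zero and the narrower the limits are.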
The second embodiment of the invention is as follows:
a method for evaluating an analysis result, which differs from the first embodiment in that:
the step S2 further includes:
acquiring a third marking result of a second device on the test set and a fourth marking result of the first device on the test set, wherein the fourth marking result and the first marking result are generated at different times;
a second marker marks the test set through the second device to obtain the third marking result; the first marker marks the test set again through the first device to obtain the fourth marking result, which is generated at a different time from the first marking result; for example, the first marker marks the test set to obtain the first marking result and, four days later, marks the test set again to obtain the fourth marking result;
the step S4 further includes:
S5, calculating a first intraclass correlation coefficient (ICC, Intraclass Correlation Coefficient) between the first marking result and the fourth marking result, a second intraclass correlation coefficient between the first marking result and the third marking result, and a third intraclass correlation coefficient between the first marking result and the second marking result;
that is, calculating the consistency of the marks made by the first marker on the same test set at different times, the consistency of the marks made by the first marker and the second marker on the same test set, and the consistency of the marks made by the first marker and the AI model on the same test set; in other words, the consistency on the same test set between the same person at different times, between different persons, and between a person and the AI is obtained;
in this embodiment, a bootstrap method may be used to calculate the first intraclass correlation coefficient, the second intraclass correlation coefficient and the third intraclass correlation coefficient, so as to obtain a plurality of first intraclass correlation coefficients, a plurality of second intraclass correlation coefficients and a plurality of third intraclass correlation coefficients, respectively;
specifically, the first marking result and the fourth marking result are resampled by the bootstrap method, an intraclass correlation coefficient is calculated from each resample, and a plurality of first intraclass correlation coefficients is finally obtained; the first marking result and the third marking result are resampled by the bootstrap method, an intraclass correlation coefficient is calculated from each resample, and a plurality of second intraclass correlation coefficients is finally obtained; the first marking result and the second marking result are resampled by the bootstrap method, an intraclass correlation coefficient is calculated from each resample, and a plurality of third intraclass correlation coefficients is finally obtained;
in an alternative embodiment, at least 50 first intraclass correlation coefficients, at least 50 second intraclass correlation coefficients and at least 50 third intraclass correlation coefficients are generated, to ensure the accuracy of the t-test;
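The bootstrap step can be sketched as below. The text does not specify which form of ICC is used; this example assumes the one-way random-effects ICC(1,1) applied to a single coordinate (the X pixel position) of two marking results. The function names, the data and the fixed random seed are illustrative assumptions.

```python
import random
import statistics

def icc_one_way(ratings_a, ratings_b):
    """One-way random-effects ICC(1,1) for two ratings per file (assumed ICC form)."""
    n, k = len(ratings_a), 2
    subject_means = [(a + b) / 2 for a, b in zip(ratings_a, ratings_b)]
    grand_mean = statistics.mean(subject_means)
    ms_between = k * sum((m - grand_mean) ** 2 for m in subject_means) / (n - 1)
    ms_within = sum((a - m) ** 2 + (b - m) ** 2
                    for a, b, m in zip(ratings_a, ratings_b, subject_means)) / (n * (k - 1))
    denominator = ms_between + (k - 1) * ms_within
    if denominator == 0:
        raise ValueError("ratings have no variance")
    return (ms_between - ms_within) / denominator

def bootstrap_icc(ratings_a, ratings_b, n_resamples=50, seed=0):
    """Resample files with replacement and compute one ICC per resample."""
    rng = random.Random(seed)
    n = len(ratings_a)
    coefficients = []
    while len(coefficients) < n_resamples:
        idx = [rng.randrange(n) for _ in range(n)]
        try:
            coefficients.append(icc_one_way([ratings_a[i] for i in idx],
                                            [ratings_b[i] for i in idx]))
        except ValueError:
            continue  # skip a degenerate resample with no variance
    return coefficients

# Hypothetical X coordinates: first and fourth marking results of the same marker.
first_x = [1503, 1480, 1519, 1490, 1476, 1510, 1498, 1522]
fourth_x = [1501, 1483, 1517, 1492, 1479, 1508, 1499, 1520]

first_iccs = bootstrap_icc(first_x, fourth_x, n_resamples=50)
```

Running the same function on the first and third marking results, and on the first and second marking results, would give the second and third groups of coefficients in the same way.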
performing a t-test on the first intraclass correlation coefficients and the third intraclass correlation coefficients to obtain a second test result, and performing a t-test on the second intraclass correlation coefficients and the third intraclass correlation coefficients to obtain a third test result;
judging whether the second test result and the third test result are both greater than a threshold, and if so, judging that the second marking result is repeatable;
the t-test on the first and third intraclass correlation coefficients tests the difference between the repeatability of the marking results of the same person (the first marker) on the same test set at different times and the repeatability of the marking results of that person and the AI model on the same test set; the t-test on the second and third intraclass correlation coefficients tests the difference between the repeatability of the marking results of one person (the first marker) and another person (the second marker) on the same test set and the repeatability of the marking results of the person (the first marker) and the AI model on the same test set; if the threshold is 0.05, then when the P value of each t-test is greater than 0.05, the repeatability between the AI model and a person on the same test set is considered to be the same as the repeatability between different persons and the repeatability of the same person at different times, i.e. the repeatability of the AI model is the same as that of manual marking;
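The comparison of the bootstrap coefficient groups can be illustrated as follows. The form of the t-test is not specified in the text; Welch's unequal-variance two-sample t-test is assumed here. The three coefficient lists are hypothetical stand-ins for bootstrap output, and the P value is obtained by Simpson integration of the t density so that no external library is needed.

```python
import math
import statistics

def t_two_sided_p(t_stat, df, steps=2000):
    """Two-sided P value of Student's t, by Simpson integration of the density."""
    upper = abs(t_stat)
    if upper == 0.0:
        return 1.0
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))

    def pdf(x):
        return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))

    h = upper / steps  # steps must be even for Simpson's rule
    total = pdf(0.0) + pdf(upper)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    return max(0.0, 1.0 - 2.0 * total * h / 3)

def welch_t_test(a, b):
    """Welch's two-sample t-test (assumed variant); returns (t statistic, P value)."""
    va = statistics.variance(a) / len(a)
    vb = statistics.variance(b) / len(b)
    if va + vb == 0:
        raise ValueError("both samples have zero variance")
    t_stat = (statistics.mean(a) - statistics.mean(b)) / math.sqrt(va + vb)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t_stat, t_two_sided_p(t_stat, df)

THRESHOLD = 0.05  # threshold value given in the text

# Hypothetical bootstrap ICCs, 50 per group.
same_person = [0.950 + 0.01 * ((i % 5) - 2) for i in range(50)]       # first ICCs
different_people = [0.949 + 0.01 * ((i % 5) - 2) for i in range(50)]  # second ICCs
person_vs_ai = [0.951 + 0.01 * ((i % 5) - 2) for i in range(50)]      # third ICCs

_, second_test_result = welch_t_test(same_person, person_vs_ai)
_, third_test_result = welch_t_test(different_people, person_vs_ai)
second_marking_is_repeatable = (second_test_result > THRESHOLD
                                and third_test_result > THRESHOLD)
```

With these stand-in values both P values exceed 0.05, so the AI marking would be judged as repeatable as manual marking.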
in an alternative embodiment, the reproducibility of the AI model is evaluated, that is, a fifth marking result of the AI model on the test set is obtained, wherein the generation time of the fifth marking result differs from that of the second marking result; a plurality of fourth intra-group correlation coefficients between the second marking result and the fifth marking result is calculated by the bootstrap method, and a t-test is performed on the first intra-group correlation coefficients and the fourth intra-group correlation coefficients to obtain a fourth test result; whether the fourth test result and the fourth intra-group correlation coefficients are both greater than a threshold value is judged, and if so, the marking result of the AI model is judged to have reproducibility.
Referring to fig. 2, a third embodiment of the present invention is as follows:
an evaluation terminal 1 for analysis results comprises a processor 2, a memory 3 and a computer program stored on the memory 3 and executable on the processor 2, and the processor 2 implements the steps of embodiment one or embodiment two when executing the computer program.
In summary, the present invention provides an evaluation method and a terminal for an analysis result. A first marker marks a test set through a first device to obtain a first marking result and a fourth marking result, an AI model marks the test set to obtain a second marking result, and a second marker marks the test set through a second device to obtain a third marking result. A gold standard is obtained, the difference between the first marking result and the gold standard and the difference between the second marking result and the gold standard are calculated, the differences are represented in a scatter diagram, and a t-test is performed; the scatter diagram shows the spatial distribution trend of the marking results, and if the t-test result exceeds a threshold, the AI model marking result and the manual marking result are considered to have consistent accuracy. In addition, with the fourth marking result of the first marker and the third marking result of the second marker, the repeatability of marking results on the same test set can be obtained for the same person (the first marking result and the fourth marking result), for different persons (the first marking result and the third marking result), and for a person and the AI (the first marking result and the second marking result). A large number of intra-group correlation coefficients can be obtained by the bootstrap method so that t-tests can be performed, which yields the difference between the person-versus-AI repeatability and the same-person and different-person repeatability on the same test set. This solves the problem that the repeatability of the AI model's marking results cannot be judged systematically and comprehensively from single values such as the ICC (intra-group correlation coefficient) and gram, and achieves accurate evaluation of the AI analysis result.
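The difference values and the scatter-diagram coordinates mentioned in the summary can be sketched in a few lines of Python. This is an illustration under assumptions, not the patented implementation: marks and gold-standard positions are assumed to be `(x, y)` tuples already placed in one shared coordinate system, and the function names and sample coordinates are hypothetical.

```python
import math


def euclidean_differences(marks, reference):
    """Distance between each mark and its reference point, file by file."""
    return [
        math.hypot(mx - rx, my - ry)
        for (mx, my), (rx, ry) in zip(marks, reference)
    ]


def offsets_from_center(marks, centers):
    """Coordinates of each mark relative to its center (e.g. the gold standard).

    Plotting these offsets places every center at the origin, so the scatter
    shows how the marks are distributed around the reference position.
    """
    return [(mx - cx, my - cy) for (mx, my), (cx, cy) in zip(marks, centers)]


# Hypothetical coordinates for three files.
gold = [(100.0, 200.0), (150.0, 180.0), (120.0, 210.0)]
human_marks = [(103.0, 204.0), (149.0, 181.0), (118.0, 213.0)]
ai_marks = [(101.0, 199.0), (152.0, 182.0), (121.0, 208.0)]

first_differences = euclidean_differences(human_marks, gold)   # human vs gold
second_differences = euclidean_differences(ai_marks, gold)     # AI vs gold
ai_offsets = offsets_from_center(ai_marks, gold)
```

The two lists of distances are the inputs to the accuracy t-test, and the offsets can be passed to any plotting library to draw the scatter diagram centered on the gold standard.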
The foregoing description is only illustrative of the present invention and is not intended to limit the scope of the invention, and all equivalent changes made by the specification and drawings of the present invention, or direct or indirect application in the relevant art, are included in the scope of the present invention.

Claims (6)

1. A method of evaluating an analysis result, comprising the steps of:
S1, acquiring a preset number of files as a test set;
S2, acquiring a first marking result of a first device on the test set and a second marking result of an AI model on the test set;
a first marker marks the test set through the first device to obtain the first marking result;
S3, acquiring a gold standard, and performing a t-test on a first difference value between the first marking result and the gold standard and a second difference value between the second marking result and the gold standard to obtain a first test result; the gold standard is the correct position of the mark, and a corresponding gold standard is calculated for each file in the test set, including: performing OCTA and ultra-wide-angle fundus photography on the same patient, determining the position of the macular fovea in an OCTA tomographic image, determining from that position the relative positional relationship between the macular fovea and the retinal blood vessels in the OCTA tomographic image, establishing a linear regression equation according to the relative position between the macular fovea and the retinal blood vessels in the OCTA tomographic image, and obtaining the gold standard of the macular fovea according to the position of the retinal blood vessels in the obtained ultra-wide-angle fundus image and the linear regression equation;
the first difference value and the second difference value are calculated by the following steps:
placing the marked files in the test set in the same coordinate system in the same manner to obtain the coordinates of the first marking result, the second marking result and the gold standard;
the coordinates of the mark for a file in the second marking result are $[X_{\text{AI}}, Y_{\text{AI}}]$, the coordinates of the mark for the same file in the first marking result are $[X_{\text{Human 1}}, Y_{\text{Human 1}}]$, and the difference is:
$$\sqrt{(X_{\text{AI}} - X_{\text{Human 1}})^2 + (Y_{\text{AI}} - Y_{\text{Human 1}})^2};$$
S4, judging whether the first test result is larger than a threshold value, and if so, judging that the second marking result has accuracy;
the step S2 further includes:
obtaining a third marking result of a second device on the test set and a fourth marking result of the first device on the test set, wherein the fourth marking result and the first marking result are generated at different times;
a second marker marks the test set through the second device to obtain the third marking result;
the step S4 further includes:
calculating first intra-group correlation coefficients between the first marking result and the fourth marking result, second intra-group correlation coefficients between the first marking result and the third marking result, and third intra-group correlation coefficients between the first marking result and the second marking result;
performing a t-test on the first intra-group correlation coefficients and the third intra-group correlation coefficients to obtain a second test result, and performing a t-test on the second intra-group correlation coefficients and the third intra-group correlation coefficients to obtain a third test result;
judging whether the second test result and the third test result are both larger than a threshold value, and if so, considering that the second marking result has repeatability.
2. The method for evaluating an analysis result according to claim 1, wherein the step S3 further comprises:
generating a scatter diagram according to the first difference value and the second difference value by taking the gold standard as the circle center and the difference value as the radius;
or generating a scatter diagram by taking the first marking result as the circle center, according to a third difference value between the coordinates of the first marking result and the coordinates of the second marking result;
or generating a scatter diagram by taking the second marking result as the circle center, according to the third difference value.
3. The method for evaluating an analysis result according to claim 1, wherein calculating the first intra-group correlation coefficients, the second intra-group correlation coefficients and the third intra-group correlation coefficients comprises:
calculating the first, second and third intra-group correlation coefficients by the bootstrap method to obtain a plurality of first intra-group correlation coefficients, a plurality of second intra-group correlation coefficients and a plurality of third intra-group correlation coefficients, respectively.
4. An evaluation terminal of analysis results, comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor implements the following steps when executing the computer program:
S1, acquiring a preset number of files as a test set;
S2, acquiring a first marking result of a first device on the test set and a second marking result of an AI model on the test set;
a first marker marks the test set through the first device to obtain the first marking result;
S3, acquiring a gold standard, and performing a t-test on a first difference value between the first marking result and the gold standard and a second difference value between the second marking result and the gold standard to obtain a first test result; the gold standard is the correct position of the mark, and a corresponding gold standard is calculated for each file in the test set, including: performing OCTA and ultra-wide-angle fundus photography on the same patient, determining the position of the macular fovea in an OCTA tomographic image, determining from that position the relative positional relationship between the macular fovea and the retinal blood vessels in the OCTA tomographic image, establishing a linear regression equation according to the relative position between the macular fovea and the retinal blood vessels in the OCTA tomographic image, and obtaining the gold standard of the macular fovea according to the position of the retinal blood vessels in the obtained ultra-wide-angle fundus image and the linear regression equation;
the first difference value and the second difference value are calculated by the following steps:
placing the marked files in the test set in the same coordinate system in the same manner to obtain the coordinates of the first marking result, the second marking result and the gold standard;
the coordinates of the mark for a file in the second marking result are $[X_{\text{AI}}, Y_{\text{AI}}]$, the coordinates of the mark for the same file in the first marking result are $[X_{\text{Human 1}}, Y_{\text{Human 1}}]$, and the difference is:
$$\sqrt{(X_{\text{AI}} - X_{\text{Human 1}})^2 + (Y_{\text{AI}} - Y_{\text{Human 1}})^2};$$
S4, judging whether the first test result is larger than a threshold value, and if so, judging that the second marking result has accuracy;
the step S2 further includes:
obtaining a third marking result of a second device on the test set and a fourth marking result of the first device on the test set, wherein the fourth marking result and the first marking result are generated at different times;
a second marker marks the test set through the second device to obtain the third marking result;
the step S4 further includes:
calculating first intra-group correlation coefficients between the first marking result and the fourth marking result, second intra-group correlation coefficients between the first marking result and the third marking result, and third intra-group correlation coefficients between the first marking result and the second marking result;
performing a t-test on the first intra-group correlation coefficients and the third intra-group correlation coefficients to obtain a second test result, and performing a t-test on the second intra-group correlation coefficients and the third intra-group correlation coefficients to obtain a third test result;
judging whether the second test result and the third test result are both larger than a threshold value, and if so, considering that the second marking result has repeatability.
5. The terminal for evaluating an analysis result according to claim 4, wherein the step S3 further comprises:
generating a scatter diagram according to the first difference value and the second difference value by taking the gold standard as the circle center and the difference value as the radius;
or generating a scatter diagram by taking the first marking result as the circle center, according to a third difference value between the coordinates of the first marking result and the coordinates of the second marking result;
or generating a scatter diagram by taking the second marking result as the circle center, according to the third difference value.
6. The terminal for evaluating an analysis result according to claim 4, wherein calculating the first intra-group correlation coefficients, the second intra-group correlation coefficients and the third intra-group correlation coefficients comprises:
calculating the first, second and third intra-group correlation coefficients by the bootstrap method to obtain a plurality of first intra-group correlation coefficients, a plurality of second intra-group correlation coefficients and a plurality of third intra-group correlation coefficients, respectively.
CN202010572829.4A 2020-06-22 2020-06-22 Evaluation method and terminal of analysis result Active CN111724374B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010572829.4A CN111724374B (en) 2020-06-22 2020-06-22 Evaluation method and terminal of analysis result

Publications (2)

Publication Number Publication Date
CN111724374A CN111724374A (en) 2020-09-29
CN111724374B true CN111724374B (en) 2024-03-01

Family

ID=72569897

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010572829.4A Active CN111724374B (en) 2020-06-22 2020-06-22 Evaluation method and terminal of analysis result

Country Status (1)

Country Link
CN (1) CN111724374B (en)

Also Published As

Publication number Publication date
CN111724374A (en) 2020-09-29

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20210309

Address after: Unit G7, block a, floor 1, building 9, Baoneng Science Park, Qinghu village, Qinghu community, Longhua street, Longhua District, Shenzhen, Guangdong 518000

Applicant after: Lin Chen

Applicant after: Huishili medical (Shenzhen) Co.,Ltd.

Address before: Room 604, block 4, ginkgo garden, 296 Shangdu Road, Cangshan District, Fuzhou City, Fujian Province 350000

Applicant before: Lin Chen

Applicant before: Ke Junlong

TA01 Transfer of patent application right

Effective date of registration: 20220909

Address after: Room 804, Building 3A, Qiaoxiang Mansion, Qiaoxiang Road, Futian District, Shenzhen, Guangdong 518000

Applicant after: Wisdom Medical (Shenzhen) Co.,Ltd.

Applicant after: Lin Chen

Address before: Unit G7, block a, floor 1, building 9, Baoneng Science Park, Qinghu village, Qinghu community, Longhua street, Longhua District, Shenzhen, Guangdong 518000

Applicant before: Lin Chen

Applicant before: Huishili medical (Shenzhen) Co.,Ltd.

GR01 Patent grant