CN111681731A - Method for automatically marking colors of inspection report - Google Patents

Publication number
CN111681731A
Authority
CN
China
Prior art keywords
character string
report
algorithm
detected
utilizing
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010525930.4A
Other languages
Chinese (zh)
Inventor
张路
俞富裕
高文琪
李小满
高静怡
屈怀瑾
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Meitong Technology Co ltd
Original Assignee
Hangzhou Meitong Technology Co ltd
Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H15/00 ICT specially adapted for medical reports, e.g. generation or transmission thereof
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/34 Browsing; Visualisation therefor
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36 Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367 Ontology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/103 Formatting, i.e. changing of presentation of documents
    • G06F40/109 Font handling; Temporal or kinetic typography

Abstract

The invention discloses a method for automatically color-labeling an examination report, which comprises the following steps: classifying the content of an acquired first examination report by diseased organ using a string matching algorithm, and outputting a second examination report; processing the second examination report with a start-label lexicon and an end-label lexicon, and outputting a string to be detected; and finally, using artificial intelligence language processing technology and knowledge graph technology, determining whether the string to be detected is content that needs labeling and, if so, which color label applies. The method can correct irregularly formatted passages in examination reports, classify report content as required, identify abnormal organs or parts, and label them in color.

Description

Method for automatically marking colors of inspection report
Technical Field
The invention relates to the field of medicine, and in particular to a method for automatically color-labeling an examination report.
Background
The medical examination report is an important basis for assessing a patient's condition and is currently produced mainly as text. Its content is either written by a doctor or adapted from a template. The prior art mainly digitizes paper reports, that is, stores a backup on a computer so that reports are easy to archive and retrieve. When consulting a report, however, the doctor still has to pick out the descriptions of abnormalities relevant to the patient's condition from a mass of text, which is time-consuming and laborious. Prior-art examination reports therefore suffer from non-standard formatting and from the absence of any marking of abnormalities.
Disclosure of Invention
The invention provides a method for automatically color-labeling an examination report, which aims to solve the prior-art problem that examination reports carry no marking of abnormalities.
To achieve this purpose, the invention adopts the following technical solution:
A method for automatically color-labeling an examination report comprises the following steps:
acquiring a first examination report, classifying its content according to diseased organ information by using a string matching algorithm, and outputting a second examination report;
processing the second examination report by using a start-label lexicon and an end-label lexicon, and outputting a string to be detected;
and determining, by using artificial intelligence language processing technology and knowledge graph technology, whether the string to be detected is content that needs labeling, and if so, applying the corresponding color label.
First, the content of the acquired first examination report is classified by diseased organ using a string matching algorithm, and a second examination report is output. Next, the second examination report is processed with the start-label lexicon and the end-label lexicon to obtain the string to be detected. Finally, artificial intelligence language processing technology and knowledge graph technology are used to determine whether the string to be detected is content that needs labeling and which color label applies. The method can correct irregularly formatted passages in examination reports, classify report content as required, identify abnormal organs or parts, and label them in color.
Preferably, processing the second examination report by using the start-label lexicon and the end-label lexicon and outputting the string to be detected comprises:
performing sentence segmentation on the second examination report by using natural language processing technology, comparing the result with a keyword database, and outputting a string to be labeled;
applying, for the start-label lexicon, the indexOf algorithm and the substr algorithm to the string to be labeled, and outputting a start string;
and applying, for the end-label lexicon, the lastIndexOf algorithm and the substr algorithm to the start string, and outputting the string to be detected.
Preferably, determining whether the string to be detected is content that needs labeling by using artificial intelligence language processing technology and knowledge graph technology, and applying the corresponding color label if so, comprises:
repeating the steps in claim 2 by using a recursive algorithm, and outputting a to-be-detected string database, wherein the database contains N strings to be detected and N is a positive integer;
determining, by using artificial intelligence language processing technology and knowledge graph technology, whether each string to be detected in the database is content that needs labeling, and outputting the strings to be labeled;
and color-labeling the strings to be labeled by using the replace algorithm.
Preferably, acquiring the first examination report, classifying its content according to diseased organ information by using a string matching algorithm, and outputting the second examination report comprises:
cleaning the data of the first examination report by using a regular expression algorithm, and outputting a complete examination report;
and classifying the complete examination report by using a medical organ database and a medical report content sentence-break database in combination with a string matching algorithm, and outputting the second examination report.
An apparatus for automatically color-labeling an examination report comprises:
a classification processing module, which acquires a first examination report, classifies its content according to diseased organ information by using a string matching algorithm, and outputs a second examination report;
a second examination report processing module, which processes the second examination report by using the start-label lexicon and the end-label lexicon and outputs a string to be detected;
and a color labeling module, which determines, by using artificial intelligence language processing technology and knowledge graph technology, whether the string to be detected is content that needs labeling, and if so, applies the corresponding color label.
Preferably, the second examination report processing module comprises:
a sentence segmentation unit, which performs sentence segmentation on the second examination report by using natural language processing technology, compares the result with the keyword database, and outputs a string to be labeled;
a start-label lexicon unit, which applies the indexOf algorithm and the substr algorithm to the string to be labeled and outputs a start string;
and an end-label lexicon unit, which applies the lastIndexOf algorithm and the substr algorithm to the start string and outputs the string to be detected.
Preferably, the color labeling module comprises:
a recursive algorithm unit, which repeats the steps in claim 2 by using a recursive algorithm and outputs a to-be-detected string database, wherein the database contains N strings to be detected and N is a positive integer;
a to-be-labeled string unit, which determines, by using artificial intelligence language processing technology and knowledge graph technology, whether each string to be detected in the database is content that needs labeling, and outputs the strings to be labeled;
and a color labeling unit, which color-labels the strings to be labeled by using the replace algorithm.
Preferably, the classification processing module comprises:
a data cleaning unit, which cleans the data of the first examination report by using a regular expression algorithm and outputs a complete examination report;
and a classification processing unit, which classifies the complete examination report by using the medical organ database and the medical report content sentence-break database in combination with a string matching algorithm, and outputs a second examination report.
An electronic device comprising a memory and a processor, the memory for storing one or more computer instructions, wherein the one or more computer instructions are executed by the processor to implement a method of automatic color labeling of an inspection report as claimed in any one of the above.
A computer-readable storage medium having stored thereon a computer program for causing a computer to carry out a method of automatic color labeling of an inspection report as claimed in any one of the preceding claims when executed.
The invention has the following beneficial effects:
The content of the acquired first examination report is classified by diseased organ using a string matching algorithm, and a second examination report is output. The second examination report is then processed with the start-label lexicon and the end-label lexicon to obtain the string to be detected. Finally, artificial intelligence language processing technology and knowledge graph technology are used to determine whether the string to be detected is content that needs labeling and which color label applies. The method can correct irregularly formatted passages in examination reports, classify report content as required, identify abnormal organs or parts, and label them in color.
Drawings
FIG. 1 is a first flowchart of a method for implementing automatic color labeling of inspection reports according to an embodiment of the present invention;
FIG. 2 is a second flowchart of a method for implementing automatic color labeling of inspection reports according to an embodiment of the present invention;
FIG. 3 is a third flowchart of a method for implementing automatic color labeling for inspection reports according to an embodiment of the present invention;
FIG. 4 is a fourth flowchart of a method for implementing automatic color labeling for inspection reports according to an embodiment of the present invention;
FIG. 5 is a flowchart of a specific implementation of a method for automatic color labeling of inspection reports according to an embodiment of the present invention;
FIG. 6 is a schematic diagram of an apparatus for automatically color-labeling inspection reports according to an embodiment of the present invention;
FIG. 7 is a schematic diagram of an acquisition module of an apparatus for automatically color-labeling inspection reports according to an embodiment of the present invention;
FIG. 8 is a schematic diagram of a matching module implementing an apparatus for automatically color-labeling inspection reports according to an embodiment of the present invention;
FIG. 9 is a schematic diagram of an output module of an apparatus for automatically color-labeling inspection reports according to an embodiment of the present invention;
FIG. 10 is a flowchart illustrating an embodiment of an apparatus for automatically color labeling inspection reports according to the present invention;
FIG. 11 is a schematic diagram of an electronic device implementing a method for automatically color labeling an inspection report according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Before the technical solution of the present invention is introduced, a scenario to which the technical solution of the present invention may be applicable is exemplarily described.
Example 1
As shown in FIG. 1, a method for automatically color-labeling an examination report includes the following steps:
S110, acquiring a first examination report, classifying its content according to diseased organ information by using a string matching algorithm, and outputting a second examination report;
S120, processing the second examination report by using the start-label lexicon and the end-label lexicon, and outputting a string to be detected;
S130, determining, by using artificial intelligence language processing technology and knowledge graph technology, whether the string to be detected is content that needs labeling, and if so, applying the corresponding color label.
In embodiment 1, the first examination report is cleaned and its missing words are supplemented by using a regular expression algorithm and a string replacement algorithm. The content of the first examination report is then classified by diseased organ using a string matching algorithm, and a second examination report is output. The established start-label lexicon and end-label lexicon are applied to the second examination report to determine the sentence breaks and obtain the string to be detected. Finally, it is determined whether the content needs labeling and which color label applies. The method can correct irregularly formatted passages in examination reports, classify report content as required, identify abnormal organs or parts, and label them in color.
Example 2
As shown in FIG. 2, a method for automatically color-labeling an examination report includes:
S210, acquiring a first examination report, classifying its content according to diseased organ information by using a string matching algorithm, and outputting a second examination report;
S220, processing the second examination report by using the start-label lexicon and the end-label lexicon, and outputting a string to be detected;
S230, performing sentence segmentation on the second examination report by using natural language processing technology, comparing the result with a keyword database, and outputting a string to be labeled;
S240, applying, for the start-label lexicon, the indexOf algorithm and the substr algorithm to the string to be labeled, and outputting a start string;
S250, applying, for the end-label lexicon, the lastIndexOf algorithm and the substr algorithm to the start string, and outputting the string to be detected.
As can be seen from embodiment 2, natural language processing technology is used to split the second examination report into sentences, giving a sentence set. Each item in the sentence set is compared with the keyword database, and an item that contains a keyword is placed into a storage container chosen according to that keyword. The indexOf algorithm returns the position at which a specified string value first occurs in a string. The substr algorithm copies a substring that starts at a specified position and has a specified length. Applying indexOf and substr to the sentence items in the storage containers yields the start string. The lastIndexOf algorithm searches for a substring starting from the end of the string and returns its position, counted from the front of the string. Applying lastIndexOf and substr to the start string yields the string to be detected. In this way the method screens the examination report for abnormal parts or organs; using the keyword database together with the start-label and end-label lexicons, the report content can be screened and judged more quickly and accurately.
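The three string operations described in embodiment 2 can be illustrated with a minimal Python sketch. The source names the operations indexOf, lastIndexOf and substr; here Python's `str.find`, `str.rfind` and slicing stand in for them. The sentence and the two lexicon entries are hypothetical examples, not data from the source.

```python
def substr(text: str, start: int, length: int) -> str:
    """Copy `length` characters of `text` beginning at `start` (start >= 0)."""
    return text[start:start + length]


# Hypothetical sentence item and lexicon entries, for illustration only.
sentence = "Liver: a low-density nodule is seen in the right lobe, boundary unclear"
start_word = "low-density"
end_word = "lobe"

# indexOf: first occurrence of the start word (str.find returns -1 if absent).
inx = sentence.find(start_word)
start_string = substr(sentence, inx, len(sentence) - inx)

# lastIndexOf: last occurrence of the end word, position counted from the front.
inx2 = start_string.rfind(end_word)
to_detect = substr(start_string, 0, inx2 + len(end_word))

print(to_detect)
```

A real implementation would need to handle the case where either word is absent, since both `find` and `rfind` return -1 in that case.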
Example 3
As shown in FIG. 3, a method for automatically color-labeling an examination report includes:
S310, acquiring a first examination report, classifying its content according to diseased organ information by using a string matching algorithm, and outputting a second examination report;
S320, processing the second examination report by using the start-label lexicon and the end-label lexicon, and outputting a string to be detected;
S330, determining, by using artificial intelligence language processing technology and knowledge graph technology, whether the string to be detected is content that needs labeling, and if so, applying the corresponding color label;
S340, repeating the steps in claim 2 by using a recursive algorithm, and outputting a to-be-detected string database, wherein the database contains N strings to be detected and N is a positive integer;
S350, determining, by using artificial intelligence language processing technology and knowledge graph technology, whether each string to be detected in the database is content that needs labeling, and outputting the strings to be labeled;
S360, color-labeling the strings to be labeled by using the replace algorithm.
In embodiment 3, a recursive algorithm repeats the step of acquiring a string to be detected, yielding a set of strings to be detected, that is, the to-be-detected string database. Artificial intelligence language processing technology and the knowledge graph are used to eliminate interference and to analyze whether each string to be detected is valid content to label. The replace algorithm is a replacement operation: a replacement string of the same length as the string to be detected is created and given a color label, where the color may be any color expressible as a hexadecimal color code, and the replace algorithm then substitutes the replacement string for the string to be detected, completing the color labeling. Labeling by string replacement makes it possible to color-label the diseased organ or part precisely.
Example 4
As shown in FIG. 4, a method for automatically color-labeling an examination report includes:
S410, acquiring a first examination report, classifying its content according to diseased organ information by using a string matching algorithm, and outputting a second examination report;
S420, cleaning the data of the first examination report by using a regular expression algorithm, and outputting a complete examination report;
S430, classifying the complete examination report by using the medical organ database and the medical report content sentence-break database in combination with a string matching algorithm, and outputting a second examination report.
In embodiment 4, a regular expression is used to clean the data of the first examination report, which includes building a correction lexicon and supplementing missing words. The medical organ database and the medical report content sentence-break database are loaded in advance; the first examination report is segmented according to the sentence-break database, each segment is checked against the medical organ database, and a second examination report is output.
Example 5
As shown in fig. 5, one specific embodiment may be:
S510, cleaning the data of the first examination report;
A correction lexicon is established and common errors are added to it. The format is "a{|}b{m}", where "a" is the data to be cleaned, or a regular expression matching it, and "b" is the cleaned data; "{|}" is the delimiter between the data to be cleaned and the cleaned data; "{m}" is the delimiter before the next entry. The program then automatically reads the correction lexicon "c" and applies the replacements one by one to clean the report.
A keyword lexicon is established for supplementing missing words. The format is "a1{m}a2", where "a1" and "a2" are keywords and "{m}" is the delimiter before the next keyword. The report content is split into sentences at punctuation marks to obtain a sentence set "b1", each item of which is denoted "y". The set "b1" is traversed and, taking the context into account, the start of each "y" is checked against the keyword lexicon; the missing characters are then supplemented automatically according to the specific business logic.
Cleaning the first examination report solves the problem of irregular report formatting and has the technical effect of eliminating non-standard content, wrong punctuation, wrongly written characters, missing characters, and repeated content from the report.
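The correction-lexicon format described in S510 can be sketched as follows. This is a minimal illustration, assuming every "a" entry is a valid regular expression and that entries are applied in the order they appear; the function names and the sample lexicon are hypothetical and do not come from the source.

```python
import re


def parse_correction_library(raw: str) -> list[tuple[str, str]]:
    """Split an "a{|}b{m}a{|}b{m}" string into (pattern, replacement) pairs."""
    rules = []
    for entry in raw.split("{m}"):
        pattern, sep, replacement = entry.partition("{|}")
        if not sep or not pattern:
            continue  # skip empty or malformed entries
        rules.append((pattern, replacement))
    return rules


def clean_report(text: str, rules: list[tuple[str, str]]) -> str:
    """Apply each correction in turn, treating "a" as a regular expression."""
    for pattern, replacement in rules:
        text = re.sub(pattern, replacement, text)
    return text


# Hypothetical lexicon: collapse repeated commas, collapse repeated
# whitespace, and fix one common misspelling.
library = r",,+{|},{m}\s\s+{|} {m}nodual{|}nodule{m}"
rules = parse_correction_library(library)
cleaned = clean_report("Liver:  a nodual is seen,, boundary clear", rules)
print(cleaned)
```

Literal entries that contain regular-expression metacharacters would need escaping before being stored in such a lexicon.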
S520, establishing a medical organ database and a medical report content sentence-break database, classifying the examination report, and outputting a second examination report;
A medical organ database "a1" is established in the format "x1{m}x2{m}", where "x1" and "x2" are organ-related words and "{m}" is the delimiter before the next keyword. A medical report content sentence-break database "a2" is established in the format "w1{m}w2{m}", where "w1" and "w2" are sentence-break words and "{m}" is the delimiter before the next keyword. The lexicons "a1" and "a2" are loaded in advance. The first examination report is segmented according to lexicon "a2" to obtain a segment set "b1", each item of which is denoted "y". The set "b1" is traversed, and for each "y" it is determined whether its beginning appears in lexicon "a1". If it does, "y" is placed into the storage container corresponding to the matching "a1" entry; if not, the classification of the sentence preceding the current "y" is looked up and "y" is stored in the same container as that preceding sentence. The classification result is displayed in the corresponding frame of the page, giving the second examination report.
Classifying the report content by diseased organ or part solves the problem that examination reports are laborious to interpret, and has the technical effect that important information in the report can be obtained quickly.
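The classification rule of S520 (match the start of each segment against the organ lexicon, otherwise inherit the category of the preceding segment) can be sketched in Python as below. The lexicon contents, the function names and the "unclassified" fallback for leading segments with no preceding sentence are assumptions made for this illustration.

```python
import re
from collections import defaultdict


def parse_lexicon(raw: str) -> list[str]:
    """Split an "x1{m}x2{m}" string into its entries."""
    return [word for word in raw.split("{m}") if word]


def classify(report: str, organs: list[str], breaks: list[str]) -> dict[str, list[str]]:
    """Group report segments by the organ word each segment starts with."""
    splitter = re.compile("|".join(re.escape(b) for b in breaks))
    segments = [s.strip() for s in splitter.split(report) if s.strip()]
    containers = defaultdict(list)
    current = "unclassified"  # assumed fallback before any organ is seen
    for y in segments:
        for organ in organs:
            if y.startswith(organ):
                current = organ
                break
        # A segment that matches no organ stays with the previous category.
        containers[current].append(y)
    return dict(containers)


organs = parse_lexicon("Liver{m}Spleen{m}")   # hypothetical "a1"
breaks = parse_lexicon(".{m};{m}")            # hypothetical "a2"
report = "Liver is enlarged. A nodule is seen in the right lobe; Spleen is normal."
second_report = classify(report, organs, breaks)
print(second_report)
```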
S530, establishing a keyword lexicon, a start-label lexicon and an end-label lexicon;
Keyword lexicons "deckey1" and "deckey2" are established for deciding whether content is to be labeled. The format is "h1{m}h2", where "deckey1" and "deckey2" are lexicons associated with different label colors, "h1" and "h2" are words to be labeled, and "{m}" is the delimiter before the next keyword.
A start-label lexicon "start-a" for the content to be labeled is established. The format is "x1{m}x2", where "x1" and "x2" are characters or words to be labeled and "{m}" is the delimiter before the next character or word.
An end-label lexicon "end-b" for the content to be labeled is established. The format is "y1{m}y2", where "y1" and "y2" are characters or words to be labeled and "{m}" is the delimiter before the next character or word.
With these lexicons, abnormal organs or parts in the second examination report that need color labeling can be identified quickly, which improves the recognition speed.
S540, processing the second examination report by using natural language processing technology together with the keyword lexicons, the start-label lexicon and the end-label lexicon, outputting a string to be labeled, and repeating these steps by using a recursive algorithm;
Let the content of the second examination report be "n". Natural language processing technology is used to split "n" into sentences, giving a sentence set "r", each item of which is denoted "p". The set "r" is traversed and each "p" is checked for keywords from "deckey1" or "deckey2"; if a keyword is present, "p" is placed into a storage container "save" chosen according to the keyword. Using the start-label lexicon "start-a", the indexOf function looks up the index at which each character or word of the lexicon appears in "save". The indexes are sorted in ascending order and the first index "inx" is taken. The string function "substr" then extracts, from the content of the second examination report, the text from index "inx" up to the nearest punctuation mark after "inx" (a full stop, "," or ";"); the extracted string is denoted "str1". Using the end-label lexicon "end-b", the lastIndexOf function looks up the index at which each character or word of the lexicon appears in "str1". The results are sorted by index in ascending order and then by keyword length, giving the first index "inx2" and the length "L" of the corresponding keyword or word "y1". The function "substr" extracts from "str1" the text from index 0 to "inx2" + "L", giving a new result "str2". A recursive algorithm repeats this to obtain the set of content to be labeled, each item of which is denoted "p".
This solves the problem that text data is difficult to mine and has the technical effect of mining text data quickly and accurately.
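The extraction loop of S540 can be sketched in Python under several simplifying assumptions: a plain loop over sentences stands in for the recursion the source mentions, sentence splitting is a simple regular expression rather than a natural language processing step, ties between end words at the same index go to the longer word, and a sentence with no end word keeps the whole of "str1". All lexicon contents and function names are hypothetical.

```python
import re

PUNCTUATION = "。.,，;；"  # assumed set of marks that end "str1"


def extract(p: str, start_words: list[str], end_words: list[str]):
    """Return "str2" for one sentence item, or None if no start word occurs."""
    # indexOf for every start word; keep the smallest index found.
    starts = sorted(i for i in (p.find(w) for w in start_words) if i != -1)
    if not starts:
        return None
    inx = starts[0]
    # Cut at the nearest punctuation mark after inx (or at the end).
    cut = len(p)
    for pos in range(inx, len(p)):
        if p[pos] in PUNCTUATION:
            cut = pos
            break
    str1 = p[inx:cut]
    # lastIndexOf for every end word; sort by index, then longer word first.
    ends = []
    for w in end_words:
        i = str1.rfind(w)
        if i != -1:
            ends.append((i, -len(w)))
    if not ends:
        return str1
    inx2, neg_length = sorted(ends)[0]
    return str1[:inx2 - neg_length]


def collect(n: str, keyword_lexicons: dict, start_words, end_words) -> dict:
    """Bucket extracted strings by the keyword lexicon their sentence matched."""
    r = [s.strip() for s in re.split(r"[。.;；]", n) if s.strip()]
    save = {name: [] for name in keyword_lexicons}
    for p in r:
        for name, keywords in keyword_lexicons.items():
            if any(k in p for k in keywords):
                str2 = extract(p, start_words, end_words)
                if str2:
                    save[name].append(str2)
                break
    return save


n = ("Liver: a low-density nodule is seen in the right lobe of the liver, "
     "boundary unclear. Spleen: no abnormality. "
     "Kidney: a small cyst in the left kidney, about 5 mm")
lexicons = {"deckey1": ["nodule"], "deckey2": ["cyst"]}
result = collect(n, lexicons, ["a low-density", "a small"], ["lobe", "kidney"])
print(result)
```

Keeping one bucket per keyword lexicon mirrors the source's idea that "deckey1" and "deckey2" map to different label colors.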
S550, determining the valid content to label among the strings to be labeled by using artificial intelligence language processing technology and a knowledge graph, and labeling it in color;
Artificial intelligence language processing technology and a knowledge graph are used to eliminate interference and to analyze whether "str2" is valid content to label; if it is, it is placed into a storage container "res1". A string "str3" of the same length as "str2" is created, and the "replace" function substitutes "str3" for "str2" in the report content "n". The items "p" of the labeled content set "strInfo1" are then color-labeled: the function "replace" is applied to the report content "n", where the first parameter is the content to be replaced and the second parameter is an HTML (hypertext markup language) tag of the form "<font style='color'>p</font>", in which "color" is a color value such as an RGB value or a hexadecimal color code. The labeled content "n" is then displayed on the page.
This solves the problem that examination reports carry no marking of abnormalities and has the technical effect of color-labeling the important information in the report.
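The replace-based color labeling of S550 can be sketched in Python as follows. The sketch skips the intermediate same-length placeholder "str3" and replaces each labeled string directly; the `color:` property prefix inside the style attribute and the HTML escaping step are assumptions added so that the output is valid markup, and the function name is hypothetical.

```python
import html


def color_label(n: str, labeled: dict[str, str]) -> str:
    """Wrap each labeled string found in report text `n` in a colored font tag.

    `labeled` maps a string to be marked to a CSS color value such as
    "#FF0000" or "rgb(255, 0, 0)".
    """
    out = html.escape(n)  # escape first so report text cannot inject markup
    for p, color in labeled.items():
        escaped = html.escape(p)
        tag = f"<font style='color:{color}'>{escaped}</font>"
        out = out.replace(escaped, tag)
    return out


n = "Liver: a low-density nodule is seen. Spleen normal."
marked = color_label(n, {"a low-density nodule": "#FF0000"})
print(marked)
```

One limitation of this direct approach is that a labeled string which also occurs inside an already inserted tag would be replaced again; this may be a reason for the placeholder string described in the source.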
Example 6
As shown in fig. 6, an apparatus for automatically color-labeling an inspection report includes:
the classification processing module 10 is used for acquiring a first examination report, classifying the content of the first medical report according to the diseased organ information by using a character string matching algorithm, and outputting a second examination report;
the second inspection report processing module 20 is used for processing the second inspection report by using the beginning tagging word bank and the ending tagging word bank and outputting a character string to be detected;
and the color labeling module 30 is used for determining whether the character string to be detected is content to be labeled by using an artificial intelligence language processing technology and a knowledge graph technology, and if so, performing the corresponding color labeling.
One embodiment of the above apparatus may be: the classification processing module 10 acquires a first examination report, classifies the content of the first medical report according to the diseased organ information by using a character string matching algorithm, and outputs a second examination report; the second inspection report processing module 20 processes the second inspection report by using a beginning tagging word bank and an ending tagging word bank and outputs a character string to be detected; finally, the color labeling module 30 determines whether the character string to be detected is content to be labeled by using an artificial intelligence language processing technology and a knowledge graph technology, and if so, performs the corresponding color labeling.
Example 7
As shown in fig. 7, the second inspection report processing module 20 of the apparatus for automatically color-labeling an inspection report includes:
a sentence-breaking processing unit 22, which performs sentence-breaking processing on the second inspection report by using a natural language processing technology, compares the result with the keyword database, and outputs a character string to be labeled;
a beginning tagging word bank unit 24, which processes the character string to be labeled by using an IndexOf algorithm and a substr algorithm and outputs a start character string;
and an ending tagging word bank unit 26, which processes the start character string by using a lastIndexOf algorithm and a substr algorithm and outputs the character string to be detected.
One embodiment of the second inspection report processing module 20 of the above apparatus may be: the sentence-breaking processing unit 22 performs sentence-breaking processing on the second inspection report by using a natural language processing technology, compares the result with the keyword database, and outputs a character string to be labeled; the beginning tagging word bank unit 24 then processes the character string to be labeled by using an IndexOf algorithm and a substr algorithm with the beginning tagging word bank and outputs a start character string; finally, the ending tagging word bank unit 26 processes the start character string by using a lastIndexOf algorithm and a substr algorithm with the ending tagging word bank and outputs the character string to be detected.
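The work of the sentence-breaking processing unit can be illustrated with a short sketch: split the report at sentence punctuation, then keep only sentences containing a keyword, grouped per keyword. The keyword list and the name `bucket_sentences` are hypothetical, and a regular expression stands in for the unspecified natural language processing technology.

```python
import re
from collections import defaultdict

# Hypothetical keyword database; the patent does not list its contents.
KEYWORDS = ["nodule", "effusion"]


def bucket_sentences(report, keywords=KEYWORDS):
    """Split a report into sentences and group them by the keyword they contain."""
    save = defaultdict(list)
    # A sentence is a run of non-terminator characters plus an optional terminator.
    for raw in re.findall(r"[^.;。；]+[.;。；]?", report):
        sentence = raw.strip()
        if not sentence:
            continue
        for keyword in keywords:
            if keyword in sentence:
                save[keyword].append(sentence)
    return dict(save)
```

Sentences without any keyword are dropped, so only candidate sentences reach the beginning and ending tagging word bank units.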
Example 8
As shown in fig. 8, a color labeling module 30 of an apparatus for automatically color labeling an inspection report includes:
a recursive algorithm unit 32, which repeats the steps in claim 2 by using a recursive algorithm, and outputs a database of character strings to be detected, wherein the database includes N character strings to be detected, and N is a positive integer;
a to-be-labeled character string unit 34, which determines whether each character string to be detected in the database of character strings to be detected is content to be labeled by using an artificial intelligence language processing technology and a knowledge graph technology, and outputs the character strings to be labeled;
and the color labeling unit 36 is used for performing color labeling on the character strings to be labeled by using a replace algorithm.
One embodiment of the color labeling module 30 of the above device may be: the recursive algorithm unit 32 repeats the steps in claim 2 by using a recursive algorithm and outputs a database of character strings to be detected, which includes N character strings to be detected, N being a positive integer; the to-be-labeled character string unit 34 then determines, by using an artificial intelligence language processing technology and a knowledge graph technology, whether each character string to be detected in the database is content to be labeled, and outputs the character strings to be labeled; finally, the color labeling unit 36 performs color labeling on the character strings to be labeled by using a replace algorithm.
Example 9
As shown in fig. 9, a classification processing module 10 of an apparatus for automatically color-labeling an inspection report includes:
the data cleaning unit 12 is used for cleaning the data of the first inspection report by using a regular expression algorithm and outputting a complete inspection report;
and the classification processing unit 14 is used for classifying the complete examination report by utilizing the medical organ database and the medical report content sentence break database and combining a character string matching algorithm and outputting a second examination report.
One embodiment of the classification processing module 10 of the above apparatus may be: the data cleaning unit 12 is used for cleaning the data of the first inspection report by using a regular expression algorithm and outputting a complete inspection report, and the classification processing unit 14 is used for classifying the complete inspection report by using a medical organ database and a medical report content sentence break database and combining a character string matching algorithm and outputting a second inspection report.
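A minimal sketch of the two units follows, assuming that data cleaning means stripping markup remnants and redundant whitespace and that classification means matching organ terms sentence by sentence. The organ dictionary and both function names are hypothetical; the patent does not publish its medical organ database or its sentence break database.

```python
import re

# Hypothetical organ database: organ name -> terms that indicate it.
ORGANS = {"lung": ["lung", "pulmonary"], "liver": ["liver", "hepatic"]}


def clean_report(raw):
    """Regular-expression cleaning: drop tag remnants and collapse whitespace."""
    text = re.sub(r"<[^>]+>", "", raw)
    text = re.sub(r"\s+", " ", text)
    return text.strip()


def classify_by_organ(report, organs=ORGANS):
    """Assign each sentence to every organ whose terms it mentions."""
    sentences = [s.strip() for s in re.split(r"[.;。；]", report) if s.strip()]
    grouped = {}
    for sentence in sentences:
        lowered = sentence.lower()
        for organ, terms in organs.items():
            if any(term in lowered for term in terms):
                grouped.setdefault(organ, []).append(sentence)
    return grouped
```

Calling `classify_by_organ(clean_report(raw))` gives a per-organ view of the report that plays the role of the second examination report in this sketch.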
Example 10
As shown in fig. 10, one specific implementation may be:
s1010, obtaining a web text, calculating the web text according to a fine-grained emotion dictionary method, and outputting a word vector text;
The fine-grained emotion dictionary method comprises text preprocessing and word vector representation; preprocessing of the web text is divided into two steps, data cleaning and Chinese word segmentation. Web text sources include social networks and shopping platforms such as Weibo (microblog), Twitter, WeChat, QQ, Facebook, Taobao, and JD.com (Jingdong); the web text used in this method is microblog text. The web text is processed with the fine-grained emotion dictionary method, and a word vector text is output.
S1020, preprocessing the acquired web text using jieba word segmentation, and outputting a web preprocessed text;
The microblog text preprocessing comprises two steps: data cleaning and Chinese word segmentation. Data cleaning deletes information irrelevant to emotion analysis, such as links, user names, and punctuation marks in the microblog text. Commonly used Chinese word segmentation tools include jieba, the LTP word segmentation tool from Harbin Institute of Technology, and the NLPIR word segmentation tool designed by the Chinese Academy of Sciences.
S1030, splicing word vectors of words in the network preprocessed text to obtain a network preprocessed word vector text, wherein the formula is as follows:
Figure BDA0002533798530000151
where $V_i \in V^{d \times N}$ is the element of the dictionary corresponding to $t_i$, and $\oplus$ denotes the row-vector concatenation operation;
The word vector consists of four parts: the text vector $V_T$, the part-of-speech vector $V_P$, the sentiment vector $V_E$, and the emotion vector $V_M$. Obtaining the text vector can be regarded as a dictionary lookup. The dimensionality of a single vector in the dictionary is $d$, the number of words is $N$, and the dictionary $V^{d \times N}$ is obtained by training a word vector model on a large-scale corpus. The text uses the open-source Chinese microblog word vectors [16-17] from the Chinese information processing research institute of Beijing university and the DBIIR laboratory of Chinese university. For a text sequence $T = \{t_1, t_2, \ldots, t_n\}$, the word vectors of the words in the text are concatenated to obtain the word vector representation of the entire text sequence.
S1040, fusing the network preprocessed word vector text with the fine-grained emotion dictionary according to the formula
Figure BDA0002533798530000152
to output the word vector text X, where $V_P$ represents part-of-speech information, $V_M$ represents emotion information, and $V_E$ represents sentiment information;
According to the classification standard of the "emotional vocabulary ontology library", parts of speech are divided into 7 classes: noun (Noun), verb (Verb), adjective (Adj), adverb (Adv), network word (Nw), idiom (Idiom), and prepositional phrase (Prep). Emotions are also classified into 7 types: Happy, Like, Anger, Sadness, Fear, Disgust, and Surprise. The part-of-speech information and the emotion information are represented as 7-dimensional vectors $V_P$ and $V_M$, respectively, in a manner similar to one-hot encoding. Sentiment words are classified into 6 categories, namely positive emotion words, negative emotion words, degree adverbs, advocacy words, negation words, and neutral words, which are represented as a 6-dimensional vector $V_E$. To reduce sparsity, $V_P$, $V_M$, and $V_E$ are all initialized to random values in [-0.1, 0.1]. Finally, the text vector and the emotion information are fused to construct the word vector X used as input.
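The fusion step can be sketched as plain concatenation of the text vector with the three feature vectors. Since the fusion formula itself is only available as a figure, concatenation is an assumption here, as are the lookup-table layout, the English category labels, and the helper names `make_table` and `fuse`.

```python
import random

POS = ["Noun", "Verb", "Adj", "Adv", "Nw", "Idiom", "Prep"]
EMOTIONS = ["Happy", "Like", "Anger", "Sadness", "Fear", "Disgust", "Surprise"]
SENTIMENTS = ["positive", "negative", "degree", "advocacy", "negation", "neutral"]


def make_table(categories, rng):
    """One vector per category, initialized to random values in [-0.1, 0.1]."""
    dim = len(categories)
    return {c: [rng.uniform(-0.1, 0.1) for _ in range(dim)] for c in categories}


def fuse(v_t, pos, emotion, sentiment, tables):
    """Concatenate the text vector with its V_P, V_M, and V_E feature vectors."""
    v_p = tables["pos"][pos]
    v_m = tables["emotion"][emotion]
    v_e = tables["sentiment"][sentiment]
    return list(v_t) + v_p + v_m + v_e


rng = random.Random(0)
tables = {
    "pos": make_table(POS, rng),
    "emotion": make_table(EMOTIONS, rng),
    "sentiment": make_table(SENTIMENTS, rng),
}
# Toy 4-dimensional text vector; real embeddings would be much larger.
x = fuse([0.5, -0.2, 0.1, 0.3], "Adj", "Happy", "positive", tables)
```

With a d-dimensional text vector the fused word vector has d + 7 + 7 + 6 dimensions, so the toy example above produces 24 values.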
S1050, inputting the word vector text into an Attention layer, and outputting an Attention sequence after the Attention calculation;
The Attention calculation is divided into three main steps: the first step computes the similarity between the Query and each Key to obtain weights, where commonly used similarity functions include dot product, concatenation, and perceptron; the second step normalizes these weights using the Softmax function; the final step computes the weighted sum of the weights and the corresponding Values to obtain the final Attention. The Attention model proposed by the Google machine translation team uses the dot product for similarity calculation, and the factor $d_k$ acts as a scaling term so that the inner product does not become too large. Currently, in NLP research, Key and Value are usually represented by the same value, i.e., Key = Value.
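The three steps can be sketched in plain Python as follows. Dividing the dot product by the square root of d_k follows the widely used scaled dot-product formulation and is an assumption about how the factor d_k is applied here; the function names are illustrative.

```python
import math


def softmax(scores):
    """Normalize scores into weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention over lists of equal-length vectors."""
    d_k = len(keys[0])
    outputs = []
    for q in queries:
        # Step 1: similarity between the Query and each Key (dot product).
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d_k) for k in keys]
        # Step 2: normalize the weights with Softmax.
        weights = softmax(scores)
        # Step 3: weighted sum of the Values.
        outputs.append(
            [sum(w * v[j] for w, v in zip(weights, values)) for j in range(len(values[0]))]
        )
    return outputs
```

Passing the same sequence as queries, keys, and values corresponds to the Key = Value case mentioned above.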
S1060, inputting the Attention sequence into a convolution layer, performing convolution operation, and outputting a characteristic matrix C;
The convolutional layer performs local feature extraction on the input sequence using different convolution kernels. A convolution kernel of length $h$ divides the sequence into the windows $\{X_{1:h}, X_{2:h+1}, \ldots, X_{i:i+h-1}, \ldots, X_{n-h+1:n}\}$, and performing the convolution operation on each window gives the convolution features $C = (c_1, c_2, \ldots, c_{n-h+1})$, where $c_i$ is the feature extracted from the window $X_{i:i+h-1}$. The $c_i$ for each sliding window is calculated as $c_i = \mathrm{relu}(W \cdot X_{i:i+h-1} + b)$, where $W$ is the convolution kernel weight and $b$ is the bias.
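A small sketch of the sliding-window computation for a single kernel is given below. The name `conv_features` and the list-of-lists representation are illustrative choices; the product W · X is taken as the sum of element-wise products over the window.

```python
def conv_features(X, W, b):
    """Compute c_i = relu(W . X[i:i+h] + b) for every window of length h.

    X is a sequence of n word vectors, W is an h-row kernel whose rows have
    the same dimension as the word vectors, and b is a scalar bias.
    """
    h = len(W)
    features = []
    for i in range(len(X) - h + 1):
        window = X[i : i + h]
        total = b
        for w_row, x_row in zip(W, window):
            total += sum(w * x for w, x in zip(w_row, x_row))
        features.append(max(0.0, total))  # relu
    return features
```

A sequence of n vectors and a kernel of length h give n - h + 1 features, matching the length of C in the text.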
And S1070, inputting the feature matrix C into a pooling layer for sampling operation, and outputting the original text feature set.
The pooling layer performs a downsampling operation on the feature matrix C obtained after convolution and selects the locally optimal features. Max pooling is used for sampling, and the feature obtained for each filter is expressed as $l_i = \max(c_1, c_2, \ldots, c_{n-h+1})$. The resulting features are then combined into a vector $L = (l_1, l_2, \ldots, l_n)$. The convolutional layer uses a multi-channel mode, i.e., multiple filters are selected to extract features from the sequence, and the features of the original text sentence are obtained through the above operations.
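Max pooling over several filters reduces each feature map to its largest value, as in the sketch below. The name `max_pool` and the example numbers are illustrative only.

```python
def max_pool(feature_maps):
    """Keep the maximum of each filter's feature map: l_i = max(c_1, ..., c_m)."""
    return [max(feature_map) for feature_map in feature_maps]


# Two hypothetical filters with different kernel lengths produce feature maps
# of different lengths; pooling gives one value per filter.
pooled = max_pool([[3.0, 5.0, 0.0], [0.5, 0.2]])
```

Because each filter contributes exactly one value, the pooled vector has a fixed length regardless of sentence length, which is what lets it feed a fixed-size classifier.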
And S1080, inputting the original text feature set into the multilayer perceptron, outputting emotion tag score vectors, performing Softmax calculation on the emotion tag score vectors, and outputting the relative probability of the emotion tag score vectors.
The output of the previous layer is input to a multi-layer perceptron (MLP) to obtain a higher-level representation of the features. The model here uses an MLP without any hidden layer, applies a non-linear function f to its output vector to obtain the score vector of emotion labels, and then performs a Softmax operation on that score vector.
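As an illustration, the sketch below applies one linear layer, a non-linear function, and Softmax to a pooled feature vector. The source does not name the function f, so tanh is an assumption, and the weights in the example are made up.

```python
import math


def label_probabilities(features, weights, biases):
    """Linear layer with no hidden layer, then tanh (assumed f), then Softmax."""
    scores = [
        math.tanh(sum(w * x for w, x in zip(row, features)) + b)
        for row, b in zip(weights, biases)
    ]
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical 2-dimensional features and three emotion labels.
probs = label_probabilities([1.0, 2.0], [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]], [0.0, 0.0, 0.0])
```

The label with the highest relative probability would be taken as the predicted emotion.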
And S1090, correspondingly setting the network emotion analysis method parameters according to the relative probability of the emotion label score vector, inputting the network emotion analysis method parameters into an algorithm for calculating an F value, and using the output F value as an index of network emotion analysis.
The parameter setting can directly influence the model effect, and through continuous parameter adjustment and optimization, the DB-AC model parameters provided by the method are shown as the following table:
Figure BDA0002533798530000171
the F value is calculated as follows:
Figure BDA0002533798530000172
Figure BDA0002533798530000173
Figure BDA0002533798530000174
where gold is the number of manually labeled results, system_correct is the number of submitted results that match the manual labels, and system_proposed is the number of submitted results. The accuracy of prediction is improved.
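The three formulas are only available as figures, so the sketch below assumes the standard definitions: precision is system_correct divided by system_proposed, recall is system_correct divided by gold, and F is their harmonic mean. The function name and the zero-division handling are illustrative.

```python
def f_value(system_correct, system_proposed, gold):
    """Return (precision, recall, F) under the standard definitions (assumed)."""
    if system_proposed == 0 or gold == 0:
        return 0.0, 0.0, 0.0
    precision = system_correct / system_proposed
    recall = system_correct / gold
    if precision + recall == 0:
        return precision, recall, 0.0
    f = 2 * precision * recall / (precision + recall)
    return precision, recall, f
```

For instance, 8 correct results out of 10 submitted against 16 manual labels gives a precision of 0.8 and a recall of 0.5.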
Example 11
As shown in fig. 11, an electronic device comprises a memory 1101 and a processor 1102, the memory 1101 storing one or more computer instructions, wherein the one or more computer instructions are executed by the processor 1102 to implement one of the above-mentioned methods for automatic color labeling of an inspection report.
It can be clearly understood by those skilled in the art that, for convenience and brevity of description, the specific working process of the electronic device described above may refer to the corresponding process in the foregoing method embodiment, and is not described herein again.
A computer-readable storage medium having stored thereon a computer program for causing a computer to execute a method for automatic color labeling of an examination report as described above.
Illustratively, the computer program may be divided into one or more modules/units, which are stored in the memory 1101 and executed by the processor 1102 to implement the present invention. One or more modules/units may be a series of computer program instruction segments capable of performing certain functions, the instruction segments being used to describe the execution of a computer program in a computer device.
The computer device may be a desktop computer, a notebook, a palm computer, a cloud server, or other computing devices. The computer device may include, but is not limited to, a memory 1101, a processor 1102. Those skilled in the art will appreciate that the present embodiments are merely exemplary of a computing device and are not intended to limit the computing device, and may include more or fewer components, or some of the components may be combined, or different components, e.g., the computing device may also include input output devices, network access devices, buses, etc.
The processor 1102 may be a central processing unit (CPU), another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, discrete gate or transistor logic, discrete hardware components, etc. The general-purpose processor may be a microprocessor, or the processor 1102 may be any conventional processor or the like.
The memory 1101 may be an internal storage unit of the computer device, such as a hard disk or a memory of the computer device. The memory 1101 may also be an external storage device of the computer device, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, a flash card, etc. provided on the computer device. Further, the memory 1101 may also include both an internal storage unit and an external storage device of the computer device. The memory 1101 is used to store computer programs and other programs and data required by the computer device. The memory 1101 may also be used to temporarily store data that has been output or is to be output.
The above description is only an embodiment of the present invention, but the technical features of the present invention are not limited thereto, and any changes or modifications within the technical field of the present invention by those skilled in the art are covered by the claims of the present invention.

Claims (10)

1. A method for automatic color labeling of an inspection report, comprising:
acquiring a first examination report, classifying the content of the first report according to diseased organ information by using a character string matching algorithm, and outputting a second examination report;
processing the second inspection report by utilizing a beginning tagging word bank and an ending tagging word bank, and outputting a character string to be detected;
and judging whether the character string to be detected is the content to be labeled or not by utilizing an artificial intelligence language processing technology and a knowledge graph technology, and if the content is required to be labeled, carrying out corresponding color labeling.
2. The method of claim 1, wherein the processing the second inspection report with a beginning tagging thesaurus and an ending tagging thesaurus to output a character string to be detected comprises:
carrying out sentence-breaking processing on the second inspection report by utilizing a natural language processing technology, comparing the second inspection report with a keyword database, and outputting a character string to be marked;
processing the character string to be labeled with the beginning tagging word bank by utilizing an IndexOf algorithm and a substr algorithm, and outputting a start character string;
and processing the start character string with the ending tagging word bank by utilizing a lastIndexOf algorithm and a substr algorithm, and outputting the character string to be detected.
3. The method according to claim 2, wherein the determining whether the character string to be detected is the content to be labeled by using an artificial intelligence language processing technique and a knowledge graph technique, and if the content is to be labeled, performing corresponding color labeling comprises:
repeating the steps in claim 2 by using a recursive algorithm, and outputting a character string database to be detected, wherein the character string database to be detected comprises N character strings to be detected, and N is a positive integer;
judging whether the character string to be detected in the character string database to be detected is the content to be labeled or not by utilizing an artificial intelligence language processing technology and a knowledge graph technology, and outputting the character string to be labeled;
and performing color labeling on the character string to be labeled by using a replace algorithm.
4. The method of claim 1, wherein the step of obtaining a first examination report, classifying the content of the first medical report according to the diseased organ information by using a character string matching algorithm, and outputting a second examination report comprises:
carrying out data cleaning on the first inspection report by utilizing a regular expression algorithm, and outputting a complete inspection report;
and classifying the complete examination report by utilizing a medical organ database and a medical report content sentence break database and combining a character string matching algorithm, and outputting a second examination report.
5. An apparatus for automatic color labeling of an inspection report, comprising:
the classification processing module is used for acquiring a first examination report, classifying the content of the first medical report according to diseased organ information by using a character string matching algorithm, and outputting a second examination report;
the second inspection report processing module is used for processing the second inspection report by utilizing the beginning tagging word stock and the ending tagging word stock and outputting a character string to be detected;
and the color labeling module judges whether the character string to be detected is the content needing to be labeled or not by utilizing an artificial intelligent language processing technology and a knowledge graph technology, and performs corresponding color labeling if the content needs to be labeled.
6. The apparatus of claim 5, wherein the second inspection report processing module comprises:
the sentence-breaking processing unit is used for carrying out sentence-breaking processing on the second inspection report by utilizing a natural language processing technology, comparing the second inspection report with the keyword database and outputting a character string to be labeled;
the beginning tagging word bank unit processes the character string to be labeled by utilizing an IndexOf algorithm and a substr algorithm and outputs a start character string;
and the ending tagging word bank unit processes the start character string by utilizing a lastIndexOf algorithm and a substr algorithm and outputs the character string to be detected.
7. The apparatus of claim 6, wherein the color labeling module comprises:
a recursive algorithm unit, which repeats the steps in claim 2 by using a recursive algorithm and outputs a character string database to be detected, wherein the character string database to be detected comprises N character strings to be detected, and N is a positive integer;
the to-be-labeled character string unit judges whether the to-be-detected character string in the to-be-detected character string database is the content to be labeled or not by utilizing an artificial intelligence language processing technology and a knowledge graph technology, and outputs the to-be-labeled character string;
and the color labeling unit is used for performing color labeling on the character string to be labeled by utilizing a replace algorithm.
8. The apparatus of claim 5, wherein the classification processing module comprises:
the data cleaning unit is used for cleaning the data of the first inspection report by utilizing a regular expression algorithm and outputting a complete inspection report;
and the classification processing unit is used for classifying the complete inspection report by utilizing the medical organ database and the medical report content sentence break database and combining a character string matching algorithm and outputting a second inspection report.
9. An electronic device comprising a memory and a processor, the memory configured to store one or more computer instructions, wherein the one or more computer instructions are executable by the processor to implement a method for automatic color labeling of an inspection report as claimed in any one of claims 1 to 4.
10. A computer-readable storage medium storing a computer program, the computer program causing a computer to implement a method for automatic color labeling of an inspection report according to any one of claims 1 to 4 when executed.
CN202010525930.4A 2020-06-10 2020-06-10 Method for automatically marking colors of inspection report Pending CN111681731A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010525930.4A CN111681731A (en) 2020-06-10 2020-06-10 Method for automatically marking colors of inspection report

Publications (1)

Publication Number Publication Date
CN111681731A true CN111681731A (en) 2020-09-18

Family

ID=72454623

Country Status (1)

Country Link
CN (1) CN111681731A (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107273657A (en) * 2017-05-15 2017-10-20 慧影医疗科技(北京)有限公司 The generation method and storage device of diagnostic imaging picture and text report
CN110097969A (en) * 2019-05-10 2019-08-06 安徽科大讯飞医疗信息技术有限公司 A kind of analysis method of diagnosis report, device and equipment
CN110556173A (en) * 2019-08-09 2019-12-10 刘丽丽 intelligent classification management system and method for inspection report
CN110599289A (en) * 2019-07-31 2019-12-20 长春市万易科技有限公司 Method for formatting official document
CN111009296A (en) * 2019-12-06 2020-04-14 安翰科技(武汉)股份有限公司 Capsule endoscopy report labeling method, apparatus, and medium
CN111222325A (en) * 2019-12-30 2020-06-02 北京富通东方科技有限公司 Medical semantic labeling method and system of bidirectional stack type recurrent neural network
WO2020109177A1 (en) * 2018-11-26 2020-06-04 Algotec Systems Ltd. System and method for matching medical concepts in radiological reports

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112927814A (en) * 2021-03-30 2021-06-08 善诊(上海)信息技术有限公司 Physical examination recommendation method, device, equipment and storage medium for placeholder lesions
CN116484802A (en) * 2023-06-20 2023-07-25 苏州浪潮智能科技有限公司 Character string color marking method, device, computer equipment and storage medium
CN116484802B (en) * 2023-06-20 2023-09-05 苏州浪潮智能科技有限公司 Character string color marking method, device, computer equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination