CN111341457A - Medical diagnosis information visualization method and device based on big data retrieval - Google Patents

Info

Publication number
CN111341457A
Authority
CN
China
Prior art keywords
medical diagnosis
retrieval
information
background server
displayed
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010116976.0A
Other languages
Chinese (zh)
Other versions
CN111341457B (en)
Inventor
林瞰
徐莉
罗国基
石万美
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangzhou Qilekang Digital Health Medical Technology Co ltd
Original Assignee
Guangzhou 7lk Pharmaceutical Chain Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangzhou 7lk Pharmaceutical Chain Co ltd
Priority to CN202010116976.0A
Publication of CN111341457A
Application granted
Publication of CN111341457B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/70 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for mining of medical data, e.g. analysing previous cases of other patients
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9538 Presentation of query results
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H10/00 ICT specially adapted for the handling or processing of patient-related medical or healthcare data

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Public Health (AREA)
  • Medical Informatics (AREA)
  • General Engineering & Computer Science (AREA)
  • Epidemiology (AREA)
  • General Health & Medical Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • Primary Health Care (AREA)
  • Physics & Mathematics (AREA)
  • Biomedical Technology (AREA)
  • Pathology (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Medical Treatment And Welfare Office Work (AREA)

Abstract

The invention discloses a medical diagnosis information visualization method and device based on big data retrieval. The method comprises the following steps: a user terminal receives inquiry text information input by a user and uploads it to a background server based on an HTTPS protocol; the background server extracts retrieval keywords from the received inquiry text information; index retrieval is performed on a medical diagnosis database using the retrieval keywords, and the retrieved medical diagnosis information is sorted according to the degree of correlation; the background server determines visual features for the sorted medical diagnosis information; the medical diagnosis information is rendered for display according to the determined visual features to obtain a to-be-displayed rendering result; and the background server loads the to-be-displayed rendering result to the user terminal for visual display. Embodiments of the invention thereby realize the retrieval and visual display of medical diagnosis information.

Description

Medical diagnosis information visualization method and device based on big data retrieval
Technical Field
The invention relates to the technical field of medical big data visualization, in particular to a medical diagnosis information visualization method and device based on big data retrieval.
Background
In recent years, as living standards have continued to improve, people have paid more and more attention to their health. At the same time, with the aging of the domestic population, people suffer from various chronic diseases, such as hypertension and diabetes, or from common illnesses, such as colds and fever. With the development of the internet, online intelligent inquiry has become generally available, but existing online inquiries are answered by an online doctor and cannot be answered in real time. The problem to be solved is therefore how to perform big data retrieval and matching according to the inquiry information input by a user and quickly match medical diagnosis information for visualization, so that online intelligent inquiry is realized quickly and the user experience is improved.
Disclosure of Invention
The invention aims to overcome the defects of the prior art, and provides a medical diagnosis information visualization method and device based on big data retrieval, which can realize quick matching and visualization of medical diagnosis information through big data retrieval matching according to inquiry information input by a user, thereby quickly realizing online intelligent inquiry and improving the use experience of the user.
In order to solve the technical problem, an embodiment of the present invention provides a medical diagnosis information visualization method based on big data retrieval, where the method includes:
a user terminal receives inquiry text information input by a user and uploads the inquiry text information to a background server based on an HTTPS protocol;
the background server extracts and processes the retrieval keywords based on the received inquiry text information to obtain the retrieval keywords;
index retrieval is carried out on the medical diagnosis database by utilizing the retrieval key words, and the medical diagnosis information retrieved by the index retrieval is sorted according to the degree of correlation;
performing visual feature determination on the sequenced medical diagnosis information on the background server to obtain determined visual features, wherein the visual features comprise the screen size, the screen resolution and the user expected display resolution of a user terminal for displaying the medical diagnosis information;
performing to-be-displayed rendering processing on the medical diagnosis information according to the determined visual characteristics to obtain to-be-displayed rendering results;
and the background server loads the rendering result to be displayed to the user terminal for visual display.
Optionally, the background server performs retrieval keyword extraction processing based on the received inquiry text information to obtain a retrieval keyword, including:
performing initial keyword extraction processing on the inquiry text information based on a keyword extraction algorithm to obtain initial keywords;
performing semantic analysis processing on the inquiry text information based on an NLP analysis model to obtain a semantic analysis label;
and screening and matching the initial keywords and the semantic analysis labels to obtain retrieval keywords.
Optionally, the performing semantic analysis processing on the inquiry text information based on the NLP analysis model to obtain a semantic analysis tag includes:
constructing a character feature vector list based on the inquiry text information to obtain a character feature vector list;
inputting the character feature vector list into the NLP analysis model, and presetting a weight value of each character feature vector in the character feature vector list in the NLP analysis model by using an N-Gram statistical language algorithm;
and analyzing and processing the character feature vector with the preset weight through the NLP model, and outputting a semantic analysis label.
Optionally, the performing an index search on the medical diagnosis database by using the search keyword includes:
clustering with the index key words in the medical diagnosis database by using the search key words as centers to obtain clustering results;
performing Euclidean distance calculation based on the clustering result, and obtaining a final index keyword based on the Euclidean distance calculation result;
and constructing a keyword retrieval formula by using the final index keyword, and performing index retrieval on the medical diagnosis database based on the constructed keyword retrieval formula.
Optionally, the sorting the medical diagnosis information retrieved by the index according to the degree of correlation includes:
acquiring index keyword information corresponding to each piece of medical diagnosis information;
assigning a corresponding weight to the index keyword information based on the Euclidean distance calculation result;
and giving corresponding weights by using the index keyword information for accumulation processing, and sequencing according to the size of an accumulation result.
Optionally, the giving of the corresponding weight value by using the index keyword information for accumulation processing includes:
and performing accumulation processing according to the corresponding weight value given by the index keyword in each piece of medical diagnosis information.
Optionally, the determining the visual characteristics of the sequenced medical diagnosis information on the background server to obtain the determined visual characteristics includes:
the background server acquires the display characteristics of the user terminal based on the HTTPS protocol;
determining visual features of the sequenced medical diagnosis information in the background server based on the display features to obtain determined visual features;
the display characteristics comprise the screen size of the user terminal for display and the display resolution set by the user;
the visual features include a screen size, a screen resolution, and a user desired display resolution of a user terminal for displaying the medical diagnostic information.
Optionally, the rendering processing to be displayed on the medical diagnosis information according to the determined visual features to obtain a rendering result to be displayed includes:
mapping data elements in the medical diagnosis information according to the determined visual characteristics to obtain a document frame to be displayed;
and performing to-be-displayed rendering processing on the to-be-displayed document frame to obtain to-be-displayed rendering results.
Optionally, the loading, by the background server, the rendering result to be displayed to the user terminal for visual display includes:
and the background server loads the rendering result to be displayed to the user terminal according to the display sequence for visual display.
In addition, an embodiment of the present invention further provides a medical diagnosis information visualization apparatus based on big data retrieval, where the apparatus includes:
an input module: used for a user terminal to receive inquiry text information input by a user and upload the inquiry text information to a background server based on an HTTPS protocol;
a keyword extraction module: used for the background server to perform retrieval keyword extraction processing based on the received inquiry text information to obtain retrieval keywords;
an index retrieval module: used for performing index retrieval on the medical diagnosis database by using the retrieval keywords and sorting the retrieved medical diagnosis information according to the degree of correlation;
a visual feature determination module: used for performing visual feature determination on the sorted medical diagnosis information on the background server to obtain determined visual features;
a to-be-displayed rendering module: used for performing to-be-displayed rendering processing on the medical diagnosis information according to the determined visual features to obtain a to-be-displayed rendering result;
a loading and visualization module: used for the background server to load the to-be-displayed rendering result to the user terminal for visual display.
In the embodiment of the invention, inquiry text information input by a user is received and uploaded to a background server; keywords are extracted and used for index retrieval; the retrieved results are sorted and rendered for display according to the determined visual features; and the to-be-displayed rendering result is loaded to the user terminal for visual display. Medical diagnosis information can thus be quickly matched and visualized through big data retrieval and matching according to the inquiry information input by the user, so that online intelligent inquiry is quickly realized and the user experience is improved.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to the drawings without creative efforts.
Fig. 1 is a flowchart illustrating a medical diagnosis information visualization method based on big data retrieval according to an embodiment of the present invention;
Fig. 2 is a schematic structural composition diagram of a medical diagnosis information visualization device based on big data retrieval according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Examples
Referring to fig. 1, fig. 1 is a schematic flowchart of a medical diagnosis information visualization method based on big data retrieval according to an embodiment of the present invention.
As shown in fig. 1, a method for visualizing medical diagnosis information based on big data retrieval, the method comprises:
S11: a user terminal receives inquiry text information input by a user and uploads the inquiry text information to a background server based on an HTTPS protocol;
In a specific implementation of the invention, the user terminal is provided with a corresponding APP, applet, or PC application program capable of establishing a connection with the background server. The corresponding operation interface on the user terminal provides an input area for receiving inquiry information entered by the user. After receiving the inquiry information, the user terminal converts it into inquiry text information through a corresponding algorithm: redundancy is first removed from the inquiry information, generally based on grammar rules, and the inquiry information remaining after redundancy removal is then integrated into the inquiry text information in order of input time. After obtaining the inquiry text information, the user terminal uploads it to the background server through the HTTPS protocol. Processing the user's inquiry information in this way facilitates subsequent processing and speeds up the subsequent steps, and the HTTPS protocol is used to ensure the security and transmission speed of the text transmission.
S12: the background server extracts and processes the retrieval keywords based on the received inquiry text information to obtain the retrieval keywords;
in a specific implementation process of the present invention, the background server performs retrieval keyword extraction processing based on the received inquiry text information to obtain a retrieval keyword, including: performing initial keyword extraction processing on the inquiry text information based on a keyword extraction algorithm to obtain initial keywords; performing semantic analysis processing on the inquiry text information based on an NLP analysis model to obtain a semantic analysis label; and screening and matching the initial keywords and the semantic analysis labels to obtain retrieval keywords.
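The initial keyword extraction step can be illustrated with a small TF-IDF scorer. This is only a sketch under stated assumptions: the function name `tfidf_keywords`, the smoothed IDF variant, the pre-tokenized toy corpus, and the `top_k` cut-off are all hypothetical choices and are not specified by the patent.

```python
import math
from collections import Counter


def tfidf_keywords(doc_tokens, corpus_tokens, top_k=3):
    """Return the top_k tokens of doc_tokens ranked by TF-IDF score."""
    n_docs = len(corpus_tokens)
    # Document frequency: how many corpus documents contain each token.
    df = Counter()
    for tokens in corpus_tokens:
        df.update(set(tokens))
    tf = Counter(doc_tokens)
    scores = {}
    for token, count in tf.items():
        term_freq = count / len(doc_tokens)
        # Smoothed IDF (an assumption) so unseen tokens never divide by zero.
        idf = math.log((1 + n_docs) / (1 + df[token])) + 1.0
        scores[token] = term_freq * idf
    # Highest score first; ties are broken alphabetically for determinism.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [token for token, _ in ranked[:top_k]]


# Toy corpus of already-tokenized inquiry texts (hypothetical data).
corpus = [
    ["i", "have", "headache", "fever"],
    ["i", "have", "cough"],
    ["i", "have", "sore", "throat"],
]
query = ["i", "have", "fever", "cough", "fever"]
keywords = tfidf_keywords(query, corpus, top_k=2)
```

Words such as "i" and "have" occur in every corpus document, so their IDF is low and the symptom words rank ahead of them.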
Further, the performing semantic analysis processing on the inquiry text information based on the NLP analysis model to obtain a semantic analysis tag includes: constructing a character feature vector list based on the inquiry text information to obtain a character feature vector list; inputting the character feature vector list into the NLP analysis model, and presetting a weight value of each character feature vector in the character feature vector list in the NLP analysis model by using an N-Gram statistical language algorithm; and analyzing and processing the character feature vector with the preset weight through the NLP model, and outputting a semantic analysis label.
Specifically, the keywords in the inquiry text information are initially extracted by a keyword extraction algorithm to obtain the initial keywords. The keyword extraction algorithm can be any one of a TF-IDF keyword extraction algorithm, a semantic-based statistical language model, a TF-IWF automatic document keyword extraction algorithm, a text keyword extraction algorithm based on a separation model, a semantics-based Chinese text keyword extraction (SKE) algorithm, and a Chinese keyword extraction algorithm based on a naive Bayesian model; each algorithm has its own advantages. In the embodiment of the invention, the TF-IDF keyword extraction algorithm is preferred. TF-IDF (term frequency-inverse document frequency) is a common weighting technique for information retrieval and data mining, where TF means term frequency and IDF means inverse document frequency. TF-IDF is a statistical method for evaluating the importance of a word to one document in a document set or corpus: the importance of a word increases in proportion to the number of times it appears in the document, but decreases with the frequency with which it appears across the corpus. Various forms of TF-IDF weighting are often applied by search engines as a measure of the degree of relevance between a document and a user query; in addition to TF-IDF, search engines on the internet also use ranking methods based on link analysis to determine the order in which documents appear in search results. After the initial keywords are extracted, semantic analysis is performed on the inquiry text information using an NLP (natural language processing) analysis model to obtain the semantic analysis labels. The initial keywords and the semantic analysis labels are then screened and matched to obtain the retrieval keywords, where the screening and matching is carried out by similarity calculation; the specific algorithm is as follows:
Figure BDA0002391786080000071
where $S_i$ and $S_j$ are the initial keyword set and the semantic analysis tag set being compared; $t_{ik}$ and $t_{jh}$ are an initial keyword in $S_i$ and a semantic analysis tag in $S_j$, respectively; $wup(t_{ik}, t_{jh})$ is the wup (Wu-Palmer) similarity between the initial keyword and the semantic analysis tag; $\max_h(wup(t_{ik}, t_{jh}))$ is the maximum wup similarity between $t_{ik}$ and all semantic analysis tags in $S_j$; $\max_k(wup(t_{jh}, t_{ik}))$ is the maximum wup similarity between $t_{jh}$ and all initial keywords in $S_i$; and $size(S)$ represents the number of elements in the set $S$.
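The formula image itself is not reproduced in this text, so the following is only one set similarity that is consistent with the symbol definitions above: the best-match wup scores are summed in both directions and divided by the total number of elements. That combination rule, the function name `set_similarity`, and the stand-in `toy_wup` scorer are all assumptions; a real system would plug in a Wu-Palmer similarity computed over a concept taxonomy.

```python
def set_similarity(keywords, tags, wup):
    """Symmetric best-match average similarity between two term sets.

    Assumed form: (sum_k max_h wup(t_ik, t_jh) + sum_h max_k wup(t_jh, t_ik))
                  / (size(S_i) + size(S_j)).
    """
    if not keywords or not tags:
        return 0.0
    # For each initial keyword, its best match among the semantic tags.
    forward = sum(max(wup(k, t) for t in tags) for k in keywords)
    # For each semantic tag, its best match among the initial keywords.
    backward = sum(max(wup(t, k) for k in keywords) for t in tags)
    return (forward + backward) / (len(keywords) + len(tags))


def toy_wup(a, b):
    """Hypothetical stand-in for a real Wu-Palmer similarity function."""
    if a == b:
        return 1.0
    toy_scores = {frozenset(("fever", "pyrexia")): 0.9}
    return toy_scores.get(frozenset((a, b)), 0.0)


score = set_similarity(["fever", "cough"], ["pyrexia", "cough"], toy_wup)
```

A threshold on `score` could then decide whether a candidate keyword set and tag set are close enough to be kept as retrieval keywords; the patent does not state such a threshold.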
The character feature vector list is constructed from the inquiry text information by combining the bag-of-words model (BOF) from the field of natural language processing with N-Gram features, so that words can be segmented accurately and the order of the segmented words can be adjusted. The bag-of-words model (BOF) is a standard target classification framework consisting of feature extraction, feature clustering, feature coding, feature aggregation, and classifier classification. The N-Gram feature is based on a statistical language model (for N = 2 it corresponds to a first-order Markov chain): a sliding window of size N is moved over the text content byte by byte to form a sequence of byte fragments of length N. Each byte fragment is called a gram; the occurrence frequencies of all grams are counted and filtered according to a preset threshold to form a key gram list, which is the vector feature space of the text, and each gram in the list is one feature vector dimension. First, the input inquiry text information is roughly divided into a sequence of word segments; Bi-gram segmentation is then performed; finally, filtering yields the feature vector list.
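The sliding-window gram extraction, frequency counting, and threshold filtering described above can be sketched as follows. This is a minimal illustration that slides over characters rather than raw bytes, and the names `char_ngrams`, `build_gram_list`, `vectorize`, and the `min_count` threshold are hypothetical.

```python
from collections import Counter


def char_ngrams(text, n=2):
    """Slide a window of size n over the text and return every gram."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def build_gram_list(texts, n=2, min_count=2):
    """Count all grams and keep those reaching the preset threshold."""
    counts = Counter()
    for text in texts:
        counts.update(char_ngrams(text, n))
    # The surviving grams form the feature space, one dimension per gram.
    return sorted(g for g, c in counts.items() if c >= min_count)


def vectorize(text, gram_list, n=2):
    """Map a text to its gram-count vector over the key gram list."""
    counts = Counter(char_ngrams(text, n))
    return [counts[g] for g in gram_list]


texts = ["cough", "coughing", "tough"]
gram_list = build_gram_list(texts)       # bi-grams seen at least twice
vector = vectorize("cough", gram_list)
```

Rare grams such as "hi" or "ng" fall below the threshold and are dropped, which keeps the feature space small.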
The NLP analysis model adopts an input, mapping (hidden), and output architecture, where X(1) to X(n) represent the feature vectors of the words in a text. A paragraph can be represented by the mean of the accumulated embeddings of all its words, and the labels of the output layer are finally obtained from the hidden layer through a further nonlinear transformation. The model takes a word sequence (a text or a sentence) as input and outputs the probability that the word sequence belongs to each category. The hidden layer is obtained by summing and averaging the input layer and multiplying the result by a weight matrix A; the output layer is obtained by multiplying the hidden layer by a weight matrix B. To reduce computation and running time, the model uses the hierarchical Softmax technique, which is built on Huffman coding of the labels and can greatly reduce the number of targets the model must evaluate at prediction time.
Specifically, the formula by which the output layer is obtained by multiplying the hidden layer by the weight matrix B is as follows:
Figure BDA0002391786080000081
where $y_n$ denotes the true label, $x_n$ denotes the feature vector list (the normalized N-Gram features of document $n$), and $A$ and $B$ denote the weight matrices; $n = 1, 2, 3, \dots, N$, where $N$ is a positive integer.
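The formula image is not reproduced in this text, so the sketch below follows only the prose description: average the input feature vectors, multiply by A to get the hidden layer, multiply by B to get the output scores, and normalize them into class probabilities. It uses a plain softmax instead of the hierarchical softmax mentioned above, and the average negative log-likelihood loss is an assumption; all function names and the toy matrices are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def predict_proba(feature_vectors, A, B):
    """Average the inputs, apply A (hidden layer), then B and softmax."""
    dim = len(feature_vectors[0])
    mean = [sum(v[d] for v in feature_vectors) / len(feature_vectors)
            for d in range(dim)]
    hidden = matvec(A, mean)
    return softmax(matvec(B, hidden))


def negative_log_likelihood(samples, A, B):
    """Average -log p(true label) over (feature_vectors, label) samples."""
    loss = 0.0
    for feature_vectors, label in samples:
        loss -= math.log(predict_proba(feature_vectors, A, B)[label])
    return loss / len(samples)


# Toy weights: 3 input features -> 2 hidden units -> 2 labels.
A = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
B = [[2.0, 0.0], [0.0, 0.0]]
x = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]   # two feature vectors of one text
probs = predict_proba(x, A, B)
loss = negative_log_likelihood([(x, 0)], A, B)
```

With many labels, a hierarchical softmax would replace the final `softmax` call with a walk down a Huffman tree, so that only a logarithmic number of nodes is evaluated per prediction.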
Firstly, constructing a character feature vector by inquiry text information, inputting a constructed character feature vector list into an NLP analysis model, and presetting a weight value of each character feature vector in the character feature vector list by using an N-Gram statistical language algorithm in the NLP analysis model; and analyzing and processing the character feature vector with the preset weight through an NLP model, and outputting a semantic analysis label.
S13: index retrieval is carried out on the medical diagnosis database by utilizing the retrieval key words, and the medical diagnosis information retrieved by the index retrieval is sorted according to the degree of correlation;
in a specific implementation process of the present invention, the index retrieval performed on the medical diagnosis database by using the retrieval key includes: clustering with the index key words in the medical diagnosis database by using the search key words as centers to obtain clustering results; performing Euclidean distance calculation based on the clustering result, and obtaining a final index keyword based on the Euclidean distance calculation result; and constructing a keyword retrieval formula by using the final index keyword, and performing index retrieval on the medical diagnosis database based on the constructed keyword retrieval formula.
Further, the sorting the medical diagnosis information retrieved by the index according to the degree of correlation includes: acquiring index keyword information corresponding to each piece of medical diagnosis information; assigning a corresponding weight to the index keyword information based on the Euclidean distance calculation result; and giving corresponding weights by using the index keyword information for accumulation processing, and sequencing according to the size of an accumulation result.
Further, the giving of the corresponding weight value by using the index keyword information for accumulation processing includes: and performing accumulation processing according to the corresponding weight value given by the index keyword in each piece of medical diagnosis information.
Specifically, the search keyword is used as a center to perform clustering with the index keyword in the medical diagnosis database, and k-means clustering is used in the embodiment of the invention, so that a clustering result with the search keyword as the center is obtained; then, calculating the Euclidean distance by using the keywords in the clustering result, and determining the final index keyword according to the Euclidean distance obtained by calculation; and constructing a retrieval formula suitable for the medical diagnosis database through the final index key words, and then performing index retrieval on the medical diagnosis database by using the retrieval formula.
To sort the results, the index keywords corresponding to each piece of medical diagnosis information are first acquired; the Euclidean distances among the index keywords are calculated, and each index keyword is given a corresponding weight according to these distances. Finally, for each piece of medical diagnosis information, the weights of its index keywords are accumulated, and the pieces of medical diagnosis information are sorted according to the size of the accumulated results.
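The distance-based weighting and cumulative ranking can be sketched as below. The patent does not say how keywords are embedded as vectors, how a distance is turned into a weight, or where the cut-off lies, so the 2-D vectors, the `1 / (1 + d)` weight, the `max_distance` threshold, and all names here are assumptions. The k-means clustering step is omitted; distances are measured from the retrieval keyword taken as the cluster center.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def keyword_weights(query_vec, index_vecs, max_distance=1.0):
    """Keep index keywords near the retrieval keyword and weight them."""
    weights = {}
    for keyword, vec in index_vecs.items():
        d = euclidean(query_vec, vec)
        if d <= max_distance:
            # Closer keywords get larger weights (assumed 1 / (1 + d)).
            weights[keyword] = 1.0 / (1.0 + d)
    return weights


def rank_records(records, weights):
    """Sum keyword weights per record and sort by descending total."""
    scored = [(sum(weights.get(k, 0.0) for k in kws), rid)
              for rid, kws in records.items()]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [rid for score, rid in scored if score > 0]


# Hypothetical keyword embeddings and diagnosis records.
index_vecs = {
    "fever": (0.0, 0.0),
    "high temperature": (0.3, 0.4),
    "fracture": (3.0, 4.0),
}
records = {
    "rec-a": ["fracture"],
    "rec-b": ["fever", "high temperature"],
    "rec-c": ["high temperature"],
}
weights = keyword_weights((0.0, 0.0), index_vecs)
ranking = rank_records(records, weights)
```

Records whose keywords all fall outside the distance threshold receive a zero total and are left out of the ranking.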
S14: performing visual feature determination on the sequenced medical diagnosis information on the background server to obtain determined visual features, wherein the visual features comprise the screen size, the screen resolution and the user expected display resolution of a user terminal for displaying the medical diagnosis information;
in a specific implementation process of the present invention, the determining the visual characteristics of the sequenced medical diagnosis information on the background server to obtain the determined visual characteristics includes: the background server acquires the display characteristics of the user terminal based on the HTTPS protocol; determining visual features of the sequenced medical diagnosis information in the background server based on the display features to obtain determined visual features; the display characteristics comprise the screen size of the user terminal for display and the display resolution set by the user; the visual features include a screen size, a screen resolution, and a user desired display resolution of a user terminal for displaying the medical diagnostic information.
Specifically, the display characteristics of the user terminal are obtained in the background server by using an HTTPS protocol, where the display characteristics include a screen size for display of the user terminal and a display resolution set by a user; determining the visual features of the sequenced medical diagnosis information in a background server according to the display features to obtain the determined visual features; specifically, the visual characteristics include a screen size of a user terminal for displaying medical diagnosis information, a screen resolution, and a user desired display resolution.
S15: performing to-be-displayed rendering processing on the medical diagnosis information according to the determined visual characteristics to obtain to-be-displayed rendering results;
in a specific implementation process of the present invention, the rendering to be displayed of the medical diagnosis information according to the determined visual characteristics to obtain a rendering result to be displayed includes: mapping data elements in the medical diagnosis information according to the determined visual characteristics to obtain a document frame to be displayed; and performing to-be-displayed rendering processing on the to-be-displayed document frame to obtain to-be-displayed rendering results.
Specifically, mapping all data elements in the medical diagnosis information to be displayed according to the determined visual characteristics, so as to map the data elements to a corresponding document frame to obtain a document frame to be displayed; and finally, rendering the document frame to be displayed to obtain a rendering result to be displayed.
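As a rough illustration of determining visual features from the terminal's display characteristics and mapping data elements into a document frame, the sketch below renders sorted records into an HTML fragment. The choice of HTML, the dictionary field names, and the rule of capping the user's desired resolution at the physical resolution are all assumptions and are not taken from the patent.

```python
from html import escape


def determine_visual_features(display):
    """Derive visual features from the terminal's reported display data."""
    width, height = display["resolution"]
    wanted_w, wanted_h = display.get("desired_resolution", (width, height))
    return {
        "screen_inches": display["screen_inches"],
        "resolution": (width, height),
        # Never render wider or taller than the physical screen (assumed).
        "render_resolution": (min(width, wanted_w), min(height, wanted_h)),
    }


def render_results(records, features):
    """Map each record's data elements into an ordered HTML document frame."""
    width, _ = features["render_resolution"]
    items = "".join(
        f"<li><h3>{escape(r['title'])}</h3><p>{escape(r['summary'])}</p></li>"
        for r in records
    )
    return f'<ol style="max-width:{width}px">{items}</ol>'


display = {
    "screen_inches": 6.1,
    "resolution": (1080, 1920),
    "desired_resolution": (720, 1280),
}
features = determine_visual_features(display)
page = render_results(
    [{"title": "Common cold", "summary": "Rest & fluids <advice>"},
     {"title": "Influenza", "summary": "See a doctor if fever persists"}],
    features,
)
```

Because the records are passed in their sorted order, the ordered list preserves the relevance ranking when it is loaded on the user terminal.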
S16: and the background server loads the rendering result to be displayed to the user terminal for visual display.
In a specific implementation process of the present invention, the loading, by the background server, the rendering result to be displayed to the user terminal for visual display includes: and the background server loads the rendering result to be displayed to the user terminal according to the display sequence for visual display.
In the embodiment of the invention, by receiving inquiry text information input by a user and uploading the inquiry text information to a background server, extracting keywords, performing index retrieval by using the extracted keywords, then performing sequencing, performing to-be-displayed rendering on a sequencing result according to determined visual characteristics, and then loading the to-be-displayed rendering result to a user terminal for visual display; the medical diagnosis information can be matched and visualized quickly through big data retrieval and matching according to the inquiry information input by the user, so that online intelligent inquiry is realized quickly, and the use experience of the user is improved.
Examples
Referring to fig. 2, fig. 2 is a schematic structural composition diagram of a medical diagnosis information visualization apparatus based on big data retrieval in an embodiment of the present invention.
As shown in fig. 2, a medical diagnosis information visualization apparatus based on big data retrieval, the apparatus comprising:
the input module 21: used for the user terminal to receive inquiry text information input by a user and upload the inquiry text information to a background server based on an HTTPS protocol;
In a specific implementation process of the present invention, the user terminal is provided with a corresponding APP, applet, PC application program or the like capable of establishing a connection with the background server, and a corresponding operation interface on the user terminal receives the inquiry information input by the user. After receiving the inquiry information, the user terminal converts it into inquiry text information through a corresponding algorithm: redundancy is first removed from the inquiry information, generally based on grammar rules, and the remaining inquiry information is then integrated into the inquiry text information according to the input time sequence. After obtaining the inquiry text information, the user terminal uploads it to the background server through the HTTPS protocol. Processing the inquiry information in this way facilitates subsequent processing and increases the processing speed of the subsequent steps, and the HTTPS protocol safeguards the security and transmission speed of the text transmission.
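As a non-limiting illustration, the conversion of inquiry information into inquiry text information on the user terminal might be sketched as follows. The specification removes redundancy with grammar rules that it does not spell out; this example substitutes a much simpler rule (dropping empty and exactly repeated messages), and the function name and message format are assumptions.

```python
from typing import List, Tuple


def build_inquiry_text(messages: List[Tuple[float, str]]) -> str:
    """Merge raw inquiry messages into one inquiry text.

    messages: (input_timestamp, text) pairs as entered by the user.
    """
    seen = set()
    kept = []
    for _, text in sorted(messages, key=lambda m: m[0]):  # input time order
        cleaned = " ".join(text.split())                  # collapse whitespace
        normalized = cleaned.lower()
        # Simplified redundancy removal: skip empty and exactly repeated messages.
        if normalized and normalized not in seen:
            seen.add(normalized)
            kept.append(cleaned)
    return " ".join(kept)
```

The resulting string would then be sent to the background server over HTTPS by whatever client library the terminal application uses.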
Keyword extraction module 22: the background server is used for extracting and processing retrieval keywords based on the received inquiry text information to obtain the retrieval keywords;
in a specific implementation process of the present invention, the background server performs retrieval keyword extraction processing based on the received inquiry text information to obtain a retrieval keyword, including: performing initial keyword extraction processing on the inquiry text information based on a keyword extraction algorithm to obtain initial keywords; performing semantic analysis processing on the inquiry text information based on an NLP analysis model to obtain a semantic analysis label; and screening and matching the initial keywords and the semantic analysis labels to obtain retrieval keywords.
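By way of example only, the initial keyword extraction step could be realized with TF-IDF, the algorithm this embodiment prefers. The sketch assumes the inquiry text and the corpus documents have already been segmented into tokens; the smoothed IDF variant and the function name are choices made for this example.

```python
import math
from collections import Counter
from typing import List


def tfidf_keywords(doc: List[str], corpus: List[List[str]], top_k: int = 3) -> List[str]:
    """Return the top_k tokens of doc ranked by TF-IDF against corpus."""
    tf = Counter(doc)
    n_docs = len(corpus)
    scores = {}
    for term, count in tf.items():
        df = sum(1 for d in corpus if term in d)       # documents containing the term
        idf = math.log((1 + n_docs) / (1 + df)) + 1.0  # smoothed inverse document frequency
        scores[term] = (count / len(doc)) * idf
    # Highest score first; ties are broken alphabetically for stable output.
    return sorted(scores, key=lambda t: (-scores[t], t))[:top_k]
```

Terms that are frequent in the inquiry but rare in the corpus rank highest, while terms that appear in every corpus document fall to the bottom.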
Further, the performing semantic analysis processing on the inquiry text information based on the NLP analysis model to obtain a semantic analysis tag includes: constructing a character feature vector list based on the inquiry text information to obtain a character feature vector list; inputting the character feature vector list into the NLP analysis model, and presetting a weight value of each character feature vector in the character feature vector list in the NLP analysis model by using an N-Gram statistical language algorithm; and analyzing and processing the character feature vector with the preset weight through the NLP model, and outputting a semantic analysis label.
Specifically, the keywords in the inquiry text information are initially extracted through a keyword extraction algorithm to obtain initial keywords. The keyword extraction algorithm can be any one of a TF-IDF keyword extraction algorithm, a semantic-based statistical language model, a TF-IWF document keyword automatic extraction algorithm, a text keyword extraction algorithm based on a separation model, a semantics-based Chinese text keyword extraction (SKE) algorithm, and a Chinese keyword extraction algorithm based on a naive Bayesian model; each algorithm has its own advantages. In the embodiment of the invention, the TF-IDF keyword extraction algorithm is preferred. TF-IDF (term frequency-inverse document frequency) is a common weighting technique for information retrieval and data mining; TF means term frequency and IDF means inverse document frequency. TF-IDF is a statistical method for evaluating the importance of a word to one document in a document set or corpus: the importance of a word increases in direct proportion to the number of times it appears in the document, but decreases in inverse proportion to the frequency with which it appears in the corpus. Various forms of TF-IDF weighting are often applied by search engines as a measure or rating of the degree of relevance between a document and a user query; in addition to TF-IDF, search engines on the internet also use ranking methods based on link analysis to determine the order in which documents appear in search results. After the initial keywords are extracted, semantic analysis is performed on the inquiry text information by using an NLP (natural language processing) analysis model to obtain semantic analysis labels. The initial keywords and the semantic analysis labels are then screened and matched to obtain the retrieval keywords, where the screening and matching is carried out by similarity calculation; the specific algorithm is as follows:
Figure BDA0002391786080000121
wherein $S_i$ and $S_j$ are the initial keyword set and the semantic analysis tag set being compared; $t_{ik}$ and $t_{jh}$ are, respectively, an initial keyword in $S_i$ and a semantic analysis tag in $S_j$; $wup(t_{ik}, t_{jh})$ is the wup similarity between the initial keyword and the semantic analysis tag; $\max_h(wup(t_{ik}, t_{jh}))$ is the maximum wup similarity between $t_{ik}$ and all semantic analysis tags in $S_j$; $\max_k(wup(t_{jh}, t_{ik}))$ is the maximum wup similarity between $t_{jh}$ and all initial keywords in $S_i$; and $size(S)$ represents the number of elements in set $S$.
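The formula itself is only available as a figure. As a non-limiting illustration, one form consistent with the symbol definitions above is a symmetric best-match average: each term contributes its best wup similarity against the other set, and the total is divided by the combined set size. This form is an assumption, not a transcription of the figure. The pairwise `wup` function is passed in as a parameter because wup similarity is normally computed over a concept taxonomy that is not described here; `toy_wup` is a hypothetical stand-in used only to make the sketch runnable.

```python
from typing import Callable, Sequence


def set_similarity(keywords: Sequence[str], tags: Sequence[str],
                   wup: Callable[[str, str], float]) -> float:
    """Symmetric best-match similarity between a keyword set and a tag set."""
    if not keywords or not tags:
        return 0.0
    # For each initial keyword, its best match among the semantic analysis tags.
    best_for_keywords = sum(max(wup(k, t) for t in tags) for k in keywords)
    # For each semantic analysis tag, its best match among the initial keywords.
    best_for_tags = sum(max(wup(t, k) for k in keywords) for t in tags)
    return (best_for_keywords + best_for_tags) / (len(keywords) + len(tags))


def toy_wup(a: str, b: str) -> float:
    """Hypothetical stand-in: 1.0 for identical terms, otherwise 0.0."""
    return 1.0 if a == b else 0.0
```

With `toy_wup`, comparing the keywords `["cough", "fever"]` against the single tag `["cough"]` gives (1 + 0 + 1) / 3, since only "cough" finds a match.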
The character feature vector list is constructed from the corresponding inquiry text information. Combining the bag-of-words model (BOF) from the natural language processing field with N-Gram features allows words to be segmented accurately and the sequence of the segmented words to be adjusted. The bag-of-words model (BOF) is a standard target classification framework consisting of four parts: feature extraction, feature clustering, feature coding, and feature aggregation with classifier classification. The N-Gram feature is an algorithm based on a statistical language model (its Bi-gram form is also called a first-order Markov chain): a sliding window of size N is moved over the text content byte by byte to form a sequence of byte fragments of length N. Each byte fragment is called a gram; the occurrence frequency of all grams is counted and filtered according to a preset threshold to form a key gram list, which is the vector feature space of the text, and each gram in the list is one feature vector dimension. The input inquiry text information is first roughly divided into a sequence of word segments; Bi-gram cutting is then carried out; finally, filtering yields the feature vector list.
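By way of example only, the sliding-window construction of the key gram list described above might be sketched as follows. The sketch slides over characters of a Python string rather than raw bytes, and the function names and the default threshold are assumptions made for illustration.

```python
from collections import Counter
from typing import List


def key_gram_list(text: str, n: int = 2, min_count: int = 2) -> List[str]:
    """Slide a window of size n over text and keep grams seen at least min_count times."""
    grams = [text[i:i + n] for i in range(len(text) - n + 1)]
    counts = Counter(grams)
    # First-seen order is kept so each gram maps to a stable vector dimension.
    return [g for g in counts if counts[g] >= min_count]


def gram_vector(text: str, vocabulary: List[str], n: int = 2) -> List[int]:
    """Count how often each vocabulary gram occurs in text."""
    counts = Counter(text[i:i + n] for i in range(len(text) - n + 1))
    return [counts[g] for g in vocabulary]
```

Each entry of the key gram list becomes one dimension of the feature vector, so a text is represented by the counts of its surviving grams.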
The NLP analysis model adopts an input, mapping (hidden) and output architecture, where X(1) to X(n) represent the feature vector of each word in a text; a paragraph can be represented by the mean of the accumulated embeddings of all its words, and the label of the output layer is finally obtained from the hidden layer through one more nonlinear transformation. The model takes a word sequence (a text or a sentence) as input and outputs the probability that the word sequence belongs to each category. The hidden layer is obtained by summing and averaging the input layer and multiplying the result by a weight matrix A; the output layer is obtained by multiplying the hidden layer by a weight matrix B. To reduce computation time and running time, the model uses the hierarchical Softmax technique, which is built on Huffman coding and encodes the labels, greatly reducing the number of model prediction targets.
Specifically, the formula by which the output layer is obtained by multiplying the hidden layer by the weight matrix B is as follows:
Figure BDA0002391786080000131
wherein $y_n$ denotes the true label; $x_n$ denotes the feature vector list (the normalized N-Gram features of document $n$); $A$ and $B$ respectively denote the weight matrices; and $n = 1, 2, 3, \ldots, N$, where $N$ is a positive integer.
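As a non-limiting illustration, the forward pass of a model with this input, hidden and output structure might be sketched as follows. Because the formula is only available as a figure, the sketch assumes the form usual for linear text classifiers of this shape: a softmax over $B A x_n$ scored by negative log-likelihood against $y_n$. It uses an ordinary softmax instead of the hierarchical Softmax mentioned in the text, and all function names are hypothetical.

```python
import math
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def matvec(m: Matrix, v: Vector) -> Vector:
    """Multiply matrix m by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in m]


def softmax(z: Vector) -> Vector:
    """Numerically stable softmax."""
    shift = max(z)
    exps = [math.exp(v - shift) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def predict(features: List[Vector], A: Matrix, B: Matrix) -> Vector:
    """Average the input feature vectors, apply A (hidden layer), then B (output layer)."""
    dim = len(features[0])
    mean = [sum(f[d] for f in features) / len(features) for d in range(dim)]
    hidden = matvec(A, mean)
    return softmax(matvec(B, hidden))


def negative_log_likelihood(probs: Vector, true_label: int) -> float:
    """Per-document loss; averaging it over N documents gives the training objective."""
    return -math.log(probs[true_label])
```

The category with the highest output probability would serve as the semantic analysis label for the inquiry text.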
First, character feature vectors are constructed from the inquiry text information, and the constructed character feature vector list is input into the NLP analysis model; in the NLP analysis model, an N-Gram statistical language algorithm is used to preset a weight for each character feature vector in the list; the character feature vectors with preset weights are then analyzed by the NLP model, and a semantic analysis label is output.
The index retrieval module 23: the system is used for performing index retrieval on the medical diagnosis database by using the retrieval key words and sequencing the medical diagnosis information retrieved by the index retrieval according to the degree of correlation;
in a specific implementation process of the present invention, the index retrieval performed on the medical diagnosis database by using the retrieval key includes: clustering with the index key words in the medical diagnosis database by using the search key words as centers to obtain clustering results; performing Euclidean distance calculation based on the clustering result, and obtaining a final index keyword based on the Euclidean distance calculation result; and constructing a keyword retrieval formula by using the final index keyword, and performing index retrieval on the medical diagnosis database based on the constructed keyword retrieval formula.
Further, the sorting the medical diagnosis information retrieved by the index according to the degree of correlation includes: acquiring index keyword information corresponding to each piece of medical diagnosis information; assigning a corresponding weight to the index keyword information based on the Euclidean distance calculation result; and giving corresponding weights by using the index keyword information for accumulation processing, and sequencing according to the size of an accumulation result.
Further, the giving of the corresponding weight value by using the index keyword information for accumulation processing includes: and performing accumulation processing according to the corresponding weight value given by the index keyword in each piece of medical diagnosis information.
Specifically, the search keyword is used as a center to perform clustering with the index keyword in the medical diagnosis database, and k-means clustering is used in the embodiment of the invention, so that a clustering result with the search keyword as the center is obtained; then, calculating the Euclidean distance by using the keywords in the clustering result, and determining the final index keyword according to the Euclidean distance obtained by calculation; and constructing a retrieval formula suitable for the medical diagnosis database through the final index key words, and then performing index retrieval on the medical diagnosis database by using the retrieval formula.
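By way of example only, the selection of final index keywords and the construction of a retrieval formula could be sketched as follows. The sketch performs a single nearest-centre assignment with the search keywords as fixed centres, rather than a full iterative k-means; the keyword vectors, the distance threshold and the OR-joined query syntax are all assumptions, since the specification does not define them.

```python
import math
from typing import Dict, List, Sequence


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two keyword vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def final_index_keywords(search_vectors: Dict[str, Sequence[float]],
                         index_vectors: Dict[str, Sequence[float]],
                         max_distance: float) -> Dict[str, float]:
    """Keep index keywords whose nearest search keyword (cluster centre)
    lies within max_distance. Returns {index_keyword: distance}."""
    selected = {}
    for name, vec in index_vectors.items():
        nearest = min(euclidean(vec, centre) for centre in search_vectors.values())
        if nearest <= max_distance:
            selected[name] = nearest
    return selected


def build_retrieval_formula(keywords: List[str]) -> str:
    """Join the final index keywords into a simple boolean query string."""
    return " OR ".join(f'"{k}"' for k in sorted(keywords))
```

The returned distances can be reused when the retrieved records are ranked.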
First, the index keywords corresponding to each piece of medical diagnosis information are acquired, the Euclidean distances among all index keywords are calculated, and a corresponding weight is assigned to each index keyword according to these distances; finally, the weights assigned to the index keywords in each piece of medical diagnosis information are accumulated, and the pieces of medical diagnosis information are sorted according to the size of the accumulated results.
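As a non-limiting illustration, the weighting and accumulation used for relevance sorting might be sketched as follows. The specification does not state how a Euclidean distance is turned into a weight; the mapping 1 / (1 + distance) used here is an assumption, as are the function name and the record format.

```python
from typing import Dict, List


def rank_by_relevance(records: Dict[str, List[str]],
                      keyword_distance: Dict[str, float]) -> List[str]:
    """Sort record ids by the accumulated weight of their index keywords.

    records: {record_id: index keywords of that piece of medical diagnosis information}
    keyword_distance: Euclidean distance calculated for each index keyword
    """
    def weight(keyword: str) -> float:
        # Assumption: closer keywords weigh more; unknown keywords add nothing.
        if keyword not in keyword_distance:
            return 0.0
        return 1.0 / (1.0 + keyword_distance[keyword])

    totals = {rid: sum(weight(k) for k in kws) for rid, kws in records.items()}
    # Largest accumulated weight first; ties are broken by record id.
    return sorted(totals, key=lambda rid: (-totals[rid], rid))
```

A record matching several close keywords therefore outranks a record matching only one of them.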
Visual feature determination module 24: used for performing visual feature determination on the sequenced medical diagnosis information on the background server to obtain the determined visual features;
in a specific implementation process of the present invention, the determining the visual characteristics of the sequenced medical diagnosis information on the background server to obtain the determined visual characteristics includes: the background server acquires the display characteristics of the user terminal based on the HTTPS protocol; determining visual features of the sequenced medical diagnosis information in the background server based on the display features to obtain determined visual features; the display characteristics comprise the screen size of the user terminal for display and the display resolution set by the user; the visual features include a screen size, a screen resolution, and a user desired display resolution of a user terminal for displaying the medical diagnostic information.
Specifically, the display characteristics of the user terminal are obtained in the background server by using an HTTPS protocol, where the display characteristics include a screen size for display of the user terminal and a display resolution set by a user; determining the visual features of the sequenced medical diagnosis information in a background server according to the display features to obtain the determined visual features; specifically, the visual characteristics include a screen size of a user terminal for displaying medical diagnosis information, a screen resolution, and a user desired display resolution.
The to-be-displayed rendering module 25: the rendering processing module is used for rendering the medical diagnosis information to be displayed according to the determined visual characteristics to obtain a rendering result to be displayed;
in a specific implementation process of the present invention, the rendering to be displayed of the medical diagnosis information according to the determined visual characteristics to obtain a rendering result to be displayed includes: mapping data elements in the medical diagnosis information according to the determined visual characteristics to obtain a document frame to be displayed; and performing to-be-displayed rendering processing on the to-be-displayed document frame to obtain to-be-displayed rendering results.
Specifically, mapping all data elements in the medical diagnosis information to be displayed according to the determined visual characteristics, so as to map the data elements to a corresponding document frame to obtain a document frame to be displayed; and finally, rendering the document frame to be displayed to obtain a rendering result to be displayed.
The loading visualization module 26: used for the background server to load the rendering result to be displayed to the user terminal for visual display.
In a specific implementation process of the present invention, the loading, by the background server, the rendering result to be displayed to the user terminal for visual display includes: and the background server loads the rendering result to be displayed to the user terminal according to the display sequence for visual display.
In the embodiment of the invention, by receiving inquiry text information input by a user and uploading the inquiry text information to a background server, extracting keywords, performing index retrieval by using the extracted keywords, then performing sequencing, performing to-be-displayed rendering on a sequencing result according to determined visual characteristics, and then loading the to-be-displayed rendering result to a user terminal for visual display; the medical diagnosis information can be matched and visualized quickly through big data retrieval and matching according to the inquiry information input by the user, so that online intelligent inquiry is realized quickly, and the use experience of the user is improved.
Those skilled in the art will appreciate that all or part of the steps in the methods of the above embodiments may be implemented by associated hardware instructed by a program, which may be stored in a computer-readable storage medium, and the storage medium may include: a Read Only Memory (ROM), a Random Access Memory (RAM), a magnetic or optical disk, or the like.
In addition, the method and the device for visualizing medical diagnosis information based on big data retrieval provided by the embodiment of the present invention are described in detail above. Specific examples are used herein to explain the principle and the implementation of the present invention, and the description of the above embodiment is only intended to help in understanding the method and the core idea of the present invention. Meanwhile, for a person skilled in the art, there may be variations in the specific embodiments and the application scope according to the idea of the present invention. In summary, the content of the present specification should not be construed as a limitation to the present invention.

Claims (10)

1. A medical diagnosis information visualization method based on big data retrieval is characterized by comprising the following steps:
the method comprises the steps that a user terminal receives inquiry text information input by a user and uploads the inquiry text information to a background server based on an HTTPS protocol;
the background server extracts and processes the retrieval keywords based on the received inquiry text information to obtain the retrieval keywords;
index retrieval is carried out on the medical diagnosis database by utilizing the retrieval key words, and the medical diagnosis information retrieved by the index retrieval is sorted according to the degree of correlation;
performing visual feature determination on the sequenced medical diagnosis information on the background server to obtain determined visual features, wherein the visual features comprise the screen size, the screen resolution and the user expected display resolution of a user terminal for displaying the medical diagnosis information;
performing to-be-displayed rendering processing on the medical diagnosis information according to the determined visual characteristics to obtain to-be-displayed rendering results;
and the background server loads the rendering result to be displayed to the user terminal for visual display.
2. The medical diagnosis information visualization method according to claim 1, wherein the background server performs retrieval keyword extraction processing based on the received inquiry text information to obtain a retrieval keyword, and the retrieval keyword includes:
performing initial keyword extraction processing on the inquiry text information based on a keyword extraction algorithm to obtain initial keywords;
performing semantic analysis processing on the inquiry text information based on an NLP analysis model to obtain a semantic analysis label;
and screening and matching the initial keywords and the semantic analysis labels to obtain retrieval keywords.
3. The method for visualizing medical diagnostic information according to claim 2, wherein performing semantic analysis processing on the inquiry text information based on the NLP analysis model to obtain a semantic analysis tag comprises:
constructing a character feature vector list based on the inquiry text information to obtain a character feature vector list;
inputting the character feature vector list into the NLP analysis model, and presetting a weight value of each character feature vector in the character feature vector list in the NLP analysis model by using an N-Gram statistical language algorithm;
and analyzing and processing the character feature vector with the preset weight through the NLP model, and outputting a semantic analysis label.
4. The method for visualizing medical diagnosis information according to claim 1, wherein said performing an index search on a medical diagnosis database using said search key comprises:
clustering with the index key words in the medical diagnosis database by using the search key words as centers to obtain clustering results;
performing Euclidean distance calculation based on the clustering result, and obtaining a final index keyword based on the Euclidean distance calculation result;
and constructing a keyword retrieval formula by using the final index keyword, and performing index retrieval on the medical diagnosis database based on the constructed keyword retrieval formula.
5. The method for visualizing medical diagnostic information of claim 4, wherein the sorting the indexed medical diagnostic information by relevance comprises:
acquiring index keyword information corresponding to each piece of medical diagnosis information;
assigning a corresponding weight to the index keyword information based on the Euclidean distance calculation result;
and giving corresponding weights by using the index keyword information for accumulation processing, and sequencing according to the size of an accumulation result.
6. The medical diagnosis information visualization method according to claim 5, wherein the assigning of the corresponding weight value by using the index keyword information for accumulation processing includes:
and performing accumulation processing according to the corresponding weight value given by the index keyword in each piece of medical diagnosis information.
7. The method for visualizing medical diagnostic information according to claim 1, wherein the performing visual feature determination on the sequenced medical diagnostic information on the background server to obtain the determined visual feature comprises:
the background server acquires the display characteristics of the user terminal based on the HTTPS protocol;
determining visual features of the sequenced medical diagnosis information in the background server based on the display features to obtain determined visual features;
the display characteristics comprise the screen size of the user terminal for display and the display resolution set by the user;
the visual features include a screen size, a screen resolution, and a user desired display resolution of a user terminal for displaying the medical diagnostic information.
8. The medical diagnosis information visualization method according to claim 1, wherein the rendering to be displayed of the medical diagnosis information according to the determined visual feature to obtain a rendering result to be displayed comprises:
mapping data elements in the medical diagnosis information according to the determined visual characteristics to obtain a document frame to be displayed;
and performing to-be-displayed rendering processing on the to-be-displayed document frame to obtain to-be-displayed rendering results.
9. The medical diagnosis information visualization method according to claim 1, wherein the background server loads the rendering result to be displayed to the user terminal for visualization display, and the method comprises:
and the background server loads the rendering result to be displayed to the user terminal according to the display sequence for visual display.
10. An apparatus for visualizing medical diagnostic information based on big data retrieval, the apparatus comprising:
an input module: used for the user terminal to receive inquiry text information input by a user and upload the inquiry text information to a background server based on an HTTPS protocol;
the keyword extraction module: the background server is used for extracting and processing retrieval keywords based on the received inquiry text information to obtain the retrieval keywords;
an index retrieval module: the system is used for performing index retrieval on the medical diagnosis database by using the retrieval key words and sequencing the medical diagnosis information retrieved by the index retrieval according to the degree of correlation;
a visual feature determination module: used for performing visual feature determination on the sequenced medical diagnosis information on the background server to obtain the determined visual features;
a to-be-displayed rendering module: the rendering processing module is used for rendering the medical diagnosis information to be displayed according to the determined visual characteristics to obtain a rendering result to be displayed;
a loading visualization module: used for the background server to load the rendering result to be displayed to the user terminal for visual display.
CN202010116976.0A 2020-02-25 2020-02-25 Medical diagnosis information visualization method and device based on big data retrieval Active CN111341457B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010116976.0A CN111341457B (en) 2020-02-25 2020-02-25 Medical diagnosis information visualization method and device based on big data retrieval

Publications (2)

Publication Number Publication Date
CN111341457A true CN111341457A (en) 2020-06-26
CN111341457B CN111341457B (en) 2024-05-10

Family

ID=71187889

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010116976.0A Active CN111341457B (en) 2020-02-25 2020-02-25 Medical diagnosis information visualization method and device based on big data retrieval

Country Status (1)

Country Link
CN (1) CN111341457B (en)

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2008117239A (en) * 2006-11-06 2008-05-22 Techmatrix Corp Medical information processing system, observation data editing device, observation data editing method and program
US20130117659A1 (en) * 2011-11-09 2013-05-09 Microsoft Corporation Dynamic Server-Side Image Sizing For Fidelity Improvements
US20160012038A1 (en) * 2014-07-10 2016-01-14 International Business Machines Corporation Semantic typing with n-gram analysis
CN106462325A (en) * 2014-05-27 2017-02-22 三星电子株式会社 Method of controlling display and electronic device for providing the same
CN107194163A (en) * 2017-05-15 2017-09-22 上海联影医疗科技有限公司 A kind of display methods and system
CN107766400A (en) * 2017-05-05 2018-03-06 平安科技(深圳)有限公司 Text searching method and system
CN108153843A (en) * 2017-12-18 2018-06-12 广州七乐康药业连锁有限公司 A kind of medical information personalized recommendation method and system based on cloud platform
US20180260309A1 (en) * 2017-03-11 2018-09-13 Wipro Limited Method and system for semantic test suite reduction
CN109376230A (en) * 2018-12-18 2019-02-22 广东博维创远科技有限公司 Crime is determined a crime prediction technique, system, storage medium and server

Also Published As

Publication number Publication date
CN111341457B (en) 2024-05-10

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant
CP03 Change of name, title or address

Address after: Room 109, First Floor, Building 2, No. 23 Hejing Road, Guangzhou, Guangdong Province, 510000 (for office purposes only)

Patentee after: Guangzhou qilekang Digital Health Medical Technology Co.,Ltd.

Country or region after: China

Address before: Room 701-707, No. 2, Luju Road, Liwan District, Guangzhou, Guangdong 510000

Patentee before: GUANGZHOU 7LK PHARMACEUTICAL CHAIN CO.,LTD.

Country or region before: China

CP03 Change of name, title or address