CN112100498A - Disease public opinion monitoring method and device - Google Patents

Publication number
CN112100498A
Authority
CN
China
Prior art keywords
disease
information
query
hit
classified
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010972725.2A
Other languages
Chinese (zh)
Inventor
杨哲
方军
黄强
潘旭
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN202010972725.2A
Publication of CN112100498A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/18 File system types
    • G06F16/182 Distributed file systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36 Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367 Ontology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/22 Matching criteria, e.g. proximity measures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/80 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for detecting, monitoring or modelling epidemics or pandemics, e.g. flu

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Health & Medical Sciences (AREA)
  • Public Health (AREA)
  • Evolutionary Computation (AREA)
  • Medical Informatics (AREA)
  • Biomedical Technology (AREA)
  • Pathology (AREA)
  • Epidemiology (AREA)
  • General Health & Medical Sciences (AREA)
  • Primary Health Care (AREA)
  • Animal Behavior & Ethology (AREA)
  • Computational Linguistics (AREA)
  • Measuring And Recording Apparatus For Diagnosis (AREA)

Abstract

The application discloses a disease public opinion monitoring method and apparatus, an electronic device, and a computer-readable storage medium, and relates to the technical fields of computers, artificial intelligence, data processing, public opinion monitoring, and disease early warning. The specific implementation is as follows: acquire disease query information within a preset time and aggregate it into a query information set; determine the set of disease information hit in the query information set based on a pre-constructed disease knowledge graph; aggregate the different categories of disease information in the disease information set using a predetermined similarity algorithm to obtain a plurality of pieces of classified disease information; and generate a monitoring list based on the number of times each piece of classified disease information is hit in the query information set. The disease knowledge graph is used to determine the disease information present in the disease query information, and disease public opinion monitoring and disease early warning are realized on that basis.

Description

Disease public opinion monitoring method and device
Technical Field
The present application relates to the field of computer technologies, in particular to artificial intelligence, data processing, public opinion monitoring, and disease early warning technologies, and more particularly to a disease public opinion monitoring method and apparatus, an electronic device, and a computer-readable storage medium.
Background
Outbreaks of epidemic and infectious diseases often cause huge losses to human society, so people want clear information about such diseases and the risks they pose. Research on these diseases therefore not only helps people understand them but also makes it possible to give early warning of diseases that are at risk of breaking out, which can effectively reduce their impact on daily life.
In the prior art, a model can only be built from user search data and historical disease control data, and the resulting model is then used to monitor epidemic and infectious diseases. The monitoring effect of this approach is not ideal.
Disclosure of Invention
The application provides a disease public opinion monitoring method and apparatus, an electronic device, and a storage medium.
In a first aspect, an embodiment of the present application provides a disease public opinion monitoring method, including: acquiring disease query information within a preset time, and aggregating it to generate a query information set; determining a hit disease information set in the query information set based on a pre-constructed disease knowledge graph; aggregating the different categories of disease information in the disease information set using a predetermined similarity algorithm to obtain a plurality of pieces of classified disease information; and generating a monitoring list based on the number of times each of the plurality of pieces of classified disease information is hit in the query information set.
In a second aspect, an embodiment of the present application provides a disease public opinion monitoring apparatus, including: a query information acquisition unit configured to acquire disease query information within a preset time and aggregate it to generate a query information set; a disease information determination unit configured to determine a hit disease information set in the query information set based on a pre-constructed disease knowledge graph; a disease information aggregation unit configured to aggregate the different categories of disease information in the disease information set using a predetermined similarity algorithm to obtain a plurality of pieces of classified disease information; and a monitoring list generation unit configured to generate a monitoring list based on the number of times each of the plurality of pieces of classified disease information is hit in the query information set.
In a third aspect, an embodiment of the present application provides an electronic device, including: at least one processor; and a memory communicatively coupled to the at least one processor. The memory stores instructions executable by the at least one processor, and the instructions, when executed by the at least one processor, enable the at least one processor to perform the disease public opinion monitoring method described in any implementation of the first aspect.
In a fourth aspect, an embodiment of the present application provides a non-transitory computer-readable storage medium storing computer instructions, the computer instructions being used to cause a computer to perform the disease public opinion monitoring method described in any implementation of the first aspect.
According to the present application, disease query information within a preset time is acquired and aggregated into a query information set; a hit disease information set in the query information set is determined based on a pre-constructed disease knowledge graph; the different categories of disease information in the disease information set are aggregated using a predetermined similarity algorithm to obtain a plurality of pieces of classified disease information; and a monitoring list is generated based on the number of times each piece of classified disease information is hit in the query information set. The disease knowledge graph is used to determine the disease information present in the disease query information, and disease public opinion monitoring and disease early warning are realized on that basis.
It should be understood that the statements in this section do not necessarily identify key or critical features of the embodiments of the present application, nor do they limit the scope of the present application. Other features of the present application will become apparent from the following description.
Drawings
The drawings are included to provide a better understanding of the present solution and are not intended to limit the present application. Wherein:
FIG. 1 is an exemplary system architecture to which embodiments of the present application may be applied;
FIG. 2 is a flowchart of an embodiment of a disease public opinion monitoring method according to the present application;
FIG. 3 is a flowchart of an embodiment of constructing a disease knowledge graph in advance in a disease public opinion monitoring method according to the present application;
FIG. 4 is a schematic structural diagram of an embodiment of a disease public opinion monitoring apparatus according to the present application;
FIG. 5 is a block diagram of an electronic device suitable for implementing the disease public opinion monitoring method according to an embodiment of the present application.
Detailed Description
The following description of the exemplary embodiments of the present application, taken in conjunction with the accompanying drawings, includes various details of the embodiments of the application for the understanding of the same, which are to be considered exemplary only. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present application. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
It should be understood that the disease public opinion monitoring method disclosed in the present application relates not only to the technical fields of computers, artificial intelligence, data processing, public opinion monitoring, and disease early warning, but may also be adapted for use in other technical fields that support the monitoring method, which is not limited in the present application.
It should be noted that the embodiments and features of the embodiments in the present application may be combined with each other without conflict. The present application will be described in detail below with reference to the embodiments with reference to the attached drawings.
FIG. 1 shows an exemplary system architecture 100 to which embodiments of the disease public opinion monitoring method, apparatus, electronic device, and computer-readable storage medium of the present application may be applied.
As shown in fig. 1, the system architecture 100 may include terminal devices 101, 102, 103, a network 104, and a server 105. The network 104 serves as a medium for providing communication links between the terminal devices 101, 102, 103 and the server 105. Network 104 may include various connection types, such as wired, wireless communication links, or fiber optic cables, to name a few.
The user may use the terminal devices 101, 102, 103 to interact with the server 105 via the network 104 for the purpose of sending disease query information or the like. Various applications that can provide disease information search services, such as public service applications, encyclopedia search applications, diagnosis applications, and the like, may be installed on the terminal devices 101, 102, 103.
The terminal devices 101, 102, 103 may be hardware or software. When they are hardware, they may be various electronic devices with display screens, including but not limited to smart phones, tablet computers, laptop computers, and desktop computers. When they are software, they may be installed in the electronic devices listed above and may be implemented as multiple pieces of software or software modules (for example, for sending disease query information) or as a single piece of software or software module. This is not specifically limited herein.
The server 105 may be a server that provides various services, such as a server that provides a disease inquiry result for the terminal devices 101, 102, 103 or a server that acquires disease inquiry information issued by the terminal devices 101, 102, 103. For example, after disease query information within a preset time is acquired, a query information set is generated in a summarizing manner; determining a hit disease information set in the query information set based on a pre-constructed disease knowledge graph; aggregating the different categories of disease information in the disease information set by adopting a predetermined similarity algorithm to obtain a plurality of classified disease information; a monitoring list is generated based on a number of times each of the plurality of classified disease information is hit in the set of query information. It should be noted that the monitoring method for disease public opinion provided by the embodiments of the present application is generally executed by the server 105, and accordingly, the monitoring device for disease public opinion is generally disposed in the server 105.
The server may be hardware or software. When the server is hardware, it may be implemented as a distributed server cluster formed by multiple servers, or may be implemented as a single server. When the server is software, it may be implemented as multiple pieces of software or software modules, for example, to provide distributed services, or as a single piece of software or software module. And is not particularly limited herein.
Further, the disease public opinion monitoring method may also be executed by the terminal devices 101, 102, 103, in which case the disease public opinion monitoring apparatus may be provided in the terminal devices 101, 102, 103. In this case, the exemplary system architecture 100 may omit the server 105 and the network 104.
It should be understood that the number of terminal devices, networks, and servers in fig. 1 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.
With continuing reference to FIG. 2, a flow 200 of one embodiment of a disease public opinion monitoring method according to the present application is shown. The disease public opinion monitoring method comprises the following steps:
Step 201, acquiring disease query information within a preset time, and aggregating it to generate a query information set.
In this embodiment, the execution body of the disease public opinion monitoring method (for example, the server 105 shown in FIG. 1) may obtain disease queries from historical data on a local or non-local human-computer interaction device (for example, the terminal devices 101, 102, 103 shown in FIG. 1), or may directly receive disease query information sent by a user from such a device, which is not limited in this application.
It should be understood that disease query information may include any query information issued by a user. The term is used only as a convenient label for user-issued query information in the present application and should not be read as limiting the content of that query information.
When acquiring the query information, the execution body may set acquisition conditions to restrict its range. For example, a time condition may be set in advance: when query information is acquired from historical data, one or more time intervals may be preset as required, and only query information within those intervals is acquired; when directly received queries are acquired, one or more time intervals may likewise be preset.
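The time-based acquisition condition can be illustrated with a minimal sketch. The record layout (a timestamp paired with the query text), the sample queries, and the function name `collect_queries` are assumptions made for illustration only; the application does not prescribe a data format.

```python
from datetime import datetime

# Hypothetical query records: (timestamp, query text). Not taken from the application.
RAW_QUERIES = [
    (datetime(2020, 9, 1, 8, 30), "dizziness and fatigue"),
    (datetime(2020, 9, 2, 14, 0), "red itchy rash on arm"),
    (datetime(2020, 9, 9, 9, 15), "persistent cough"),
]

def collect_queries(records, intervals):
    """Keep the records whose timestamp falls inside any preset [start, end) interval."""
    return [
        (ts, text)
        for ts, text in records
        if any(start <= ts < end for start, end in intervals)
    ]

# One preset interval covering the first week of September 2020.
WINDOW = [(datetime(2020, 9, 1), datetime(2020, 9, 8))]
query_set = collect_queries(RAW_QUERIES, WINDOW)
```

Passing several intervals in `WINDOW` would correspond to presetting more than one time interval.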
After a plurality of pieces of disease query information have been acquired within the preset time, they are aggregated to generate a query information set. In practice, a large amount of disease query information is usually acquired to improve the monitoring effect, so the aggregation is generally carried out with a big data tool, for example Remote Dictionary Server (Redis), the enterprise search server Solr, or Hadoop.
After the disease query information has been aggregated into a query information set, the set may be saved in a local or non-local storage medium, or stored in a database for subsequent retrieval, for example in an Elasticsearch (ES) database.
In some optional implementations of this embodiment, the query information is aggregated using Hadoop to determine the query information set.
Specifically, Hadoop is a distributed system infrastructure. It allows users to develop distributed programs without knowing the low-level details of the distributed system, making full use of the power of a cluster for high-speed computation and storage. One of its components is the Hadoop Distributed File System (HDFS). HDFS is highly fault-tolerant and designed for deployment on low-cost hardware; it provides high-throughput access to application data and is suitable for applications with very large data sets. HDFS allows data in the file system to be accessed as a stream. Hadoop thus provides a reliable, efficient, and scalable way of processing data, enabling reliable and efficient aggregation of disease query information.
Step 202, determining a hit disease information set in the query information set based on a pre-constructed disease knowledge graph.
In this embodiment, the query information set determined in step 201 is screened using the pre-constructed disease knowledge graph to determine the disease information set it contains. The pre-constructed disease knowledge graph usually records information related to diseases, such as disease names and symptom information, so that hit disease information can be determined from the content of each piece of disease query information and a disease information set can be determined from the hit disease information. For example, if a piece of disease query information is "dizziness and fatigue, erythema and mass appear in the body", then, according to the "urticaria" entry recorded in the disease knowledge graph, the disease information hit by this piece of disease query information is determined to be skin disease and urticaria.
It should be understood that the disease information obtained may differ depending on how the disease knowledge graph is constructed. The disease information may be a specific disease name, a pathological classification of a disease, or a symptom classification, which is not limited in this application.
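A minimal sketch of this matching step follows, using a plain dictionary as a stand-in for the disease knowledge graph. The entries, the symptom keywords, and the substring-matching rule are illustrative assumptions; the application does not specify how the graph is queried.

```python
# Toy stand-in for a disease knowledge graph: disease name -> category and symptom keywords.
# All entries are illustrative assumptions, not content from the application.
KNOWLEDGE_GRAPH = {
    "urticaria": {"category": "skin disease", "symptoms": {"erythema", "wheal"}},
    "hepatitis a": {"category": "infectious disease", "symptoms": {"jaundice"}},
}

def hit_diseases(query, graph):
    """Return the disease names whose name or symptom keywords occur in the query text."""
    text = query.lower()
    hits = []
    for name, entry in graph.items():
        if name in text or any(symptom in text for symptom in entry["symptoms"]):
            hits.append(name)
    return hits

queries = [
    "dizziness and fatigue, erythema and mass appear in the body",
    "is hepatitis A contagious",
    "best hiking boots",
]
# Pair every query with the disease information it hits (possibly none).
disease_set = [(q, hit_diseases(q, KNOWLEDGE_GRAPH)) for q in queries]
```

Queries that hit nothing, such as the last one, simply contribute no disease information to the set.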
Step 203, aggregating the different categories of disease information in the disease information set by adopting a predetermined similarity algorithm to obtain a plurality of classified disease information.
In this embodiment, a predetermined similarity algorithm is used to aggregate the different categories of disease information in the acquired disease information set, for example a word2vec word-vector aggregation algorithm, a relative entropy aggregation algorithm, or a dictionary tree (trie) query algorithm.
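As a rough illustration of one of the measures named above, the relative entropy (Kullback-Leibler divergence) between two discrete distributions P and Q is D(P‖Q) = Σ p_i log(p_i / q_i). The sketch below is an assumption about how it might be applied: each category and each piece of disease information is represented by a hypothetical word-frequency distribution, and the information is assigned to the category with the smallest divergence. The application does not specify these details.

```python
import math

def relative_entropy(p, q):
    """D(P || Q) for two discrete distributions given as equal-length lists.
    Assumes q[i] > 0 wherever p[i] > 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def closest_category(distribution, category_profiles):
    """Pick the category whose profile has the smallest divergence from the input."""
    return min(
        category_profiles,
        key=lambda name: relative_entropy(distribution, category_profiles[name]),
    )

# Hypothetical word-frequency profiles over the vocabulary ["rash", "itch", "fever"].
PROFILES = {
    "skin disease": [0.5, 0.4, 0.1],
    "infectious disease": [0.1, 0.1, 0.8],
}
label = closest_category([0.6, 0.3, 0.1], PROFILES)
```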
The categories of disease information may be divided according to classification criteria such as the pathological features, the affected body part, and the harmfulness of the disease, so as to obtain a plurality of pieces of classified disease information.
It should be understood that the categories of disease information are generally related to the information used in the pre-constructed disease knowledge graph, that is, to the content of the disease information set finally produced with the disease knowledge graph. For example, if the recognition results are urticaria, chickenpox, tinea alba, tinea pedis, hepatitis A, and AIDS, they may be classified as skin diseases (urticaria, chickenpox, tinea alba, tinea pedis) and infectious diseases (hepatitis A, AIDS, tinea pedis, chickenpox).
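The example above, in which one disease may fall under more than one category, can be sketched as follows. The category labels follow the text; the lookup table `DISEASE_CATEGORIES` and the function `aggregate` are hypothetical, and in practice the labels would come from the knowledge graph or a similarity algorithm rather than a hand-written table.

```python
from collections import defaultdict

# Category labels per disease, following the example in the text (assumed lookup table).
DISEASE_CATEGORIES = {
    "urticaria": ["skin disease"],
    "chickenpox": ["skin disease", "infectious disease"],
    "tinea alba": ["skin disease"],
    "tinea pedis": ["skin disease", "infectious disease"],
    "hepatitis A": ["infectious disease"],
    "AIDS": ["infectious disease"],
}

def aggregate(diseases, categories):
    """Group disease names under every category they belong to."""
    grouped = defaultdict(list)
    for disease in diseases:
        for category in categories.get(disease, []):
            grouped[category].append(disease)
    return dict(grouped)

classified = aggregate(
    ["urticaria", "chickenpox", "tinea alba", "tinea pedis", "hepatitis A", "AIDS"],
    DISEASE_CATEGORIES,
)
```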
In some optional implementations of this embodiment, aggregating the different categories of disease information in the acquired disease information set using a predetermined similarity algorithm includes: aggregating the different categories of disease information in the disease information set using a dictionary tree query algorithm.
Specifically, a dictionary tree, also known as a word lookup tree or trie, is a tree structure and a variant of the hash tree. It is used for counting, sorting, and storing large numbers of character strings (though it is not limited to strings). Its advantage is that common prefixes are shared, which saves storage space. As a simple example, suppose three words are to be stored: nyist, nyistacm, and nyisttc. If they are stored as plain character arrays, three separate string arrays must be defined; with a dictionary tree, only one tree needs to be defined. Disease information with a high degree of similarity can therefore be aggregated through the dictionary tree query algorithm, for example by clustering skin redness and swelling, skin eruption, and skin chapping under skin diseases, so that disease information of the same category is aggregated accurately and efficiently.
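The prefix sharing described above can be seen in a minimal trie sketch that stores the three example words. The class and method names are illustrative; this is a generic trie, not the specific implementation used in the application.

```python
class TrieNode:
    def __init__(self):
        self.children = {}    # character -> TrieNode
        self.is_word = False  # True if a stored word ends at this node

class Trie:
    def __init__(self):
        self.root = TrieNode()
        self.node_count = 0   # number of nodes below the root

    def insert(self, word):
        node = self.root
        for ch in word:
            if ch not in node.children:
                node.children[ch] = TrieNode()
                self.node_count += 1
            node = node.children[ch]
        node.is_word = True

    def with_prefix(self, prefix):
        """Return all stored words that start with the given prefix, sorted."""
        node = self.root
        for ch in prefix:
            if ch not in node.children:
                return []
            node = node.children[ch]
        found, stack = [], [(node, prefix)]
        while stack:
            current, text = stack.pop()
            if current.is_word:
                found.append(text)
            for ch, child in current.children.items():
                stack.append((child, text + ch))
        return sorted(found)

trie = Trie()
for word in ["nyist", "nyistacm", "nyisttc"]:
    trie.insert(word)
```

The three words contain 20 characters in total, but because the prefix "nyist" is stored once, the tree needs only 10 nodes. In the same way, phrases that share a prefix such as "skin" would be retrieved together by a single prefix query.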
Step 204, generating a monitoring list based on the number of times each of the plurality of classified disease information is hit in the query information set.
In this embodiment, the classified disease information is obtained in step 203 by aggregating the different categories of disease information in the disease information set with the predetermined similarity algorithm, so each piece of classified disease information corresponds to at least one piece of disease information.
On this basis, the number of hits of each piece of classified disease information is taken as the total number of hits of all the disease information corresponding to it, and a suitable number of pieces of classified disease information are selected according to the monitoring requirement to generate the monitoring list.
The monitoring requirement may be, for example, a popularity (heat) requirement, a comprehensive survey requirement, a requirement to monitor diseases within a geographic range, or a requirement to monitor diseases within a time range.
In some optional implementations of this embodiment, generating the monitoring list based on the number of times each of the plurality of pieces of classified disease information is hit in the query information set includes: in response to determining that the number of times a piece of classified disease information is hit in the query information set satisfies a predetermined threshold condition, marking the classified disease information as valid disease category information; and determining the monitoring list based on the valid disease category information.
Specifically, to improve the quality of the classified disease information in the monitoring list, interfering disease query information that is hit only a few times in the query information set is eliminated. A suitable threshold condition on the number of queries is determined in advance. When the total number of hits in the query information set of the disease information corresponding to a piece of classified disease information satisfies the threshold condition, that is, when the query information set contains enough related disease queries for the category not to be low-frequency interference, the classified disease information is marked as valid disease category information. The monitoring list is then determined based on the valid disease category information, which ensures its quality.
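A minimal sketch of the threshold filter follows. The condition used here (hit count greater than or equal to `min_hits`) and the sample data are assumptions; the application leaves the exact threshold condition open.

```python
from collections import Counter

def valid_categories(hits, min_hits):
    """hits: iterable of category labels, one per hit in the query information set.
    Returns the categories whose hit count is at least min_hits (assumed condition)."""
    counts = Counter(hits)
    return {category: n for category, n in counts.items() if n >= min_hits}

# Hypothetical hits: "eye disease" appears once and is treated as interference.
hits = ["skin disease"] * 5 + ["infectious disease"] * 3 + ["eye disease"]
monitoring = valid_categories(hits, min_hits=3)
```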
In some optional implementations of this embodiment, the disease query information includes geographic identification information and/or time identification information recorded when the disease query information was generated, and the monitoring list further includes the geographic identification information and/or time identification information corresponding to the classified disease information.
Specifically, when the disease query information is acquired, the geographic identification information and/or time identification information recorded at the time the query was generated, that is, the geographic location of the user who sent the query and the time of the query, may be acquired at the same time or separately. When the monitoring list is subsequently generated, this information is added to it accordingly, so that the location of the user who sent the disease query information and the time of the query can later be determined. This makes it possible not only to monitor disease public opinion by region and by time period, but also to determine regional and/or temporal disease risk from the monitoring result.
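One way to carry the geographic and time identifiers into the monitoring list is to count hits per (category, region) and per (category, day). The record layout, region names, and function name below are hypothetical and serve only to illustrate the idea.

```python
from collections import Counter
from datetime import date

# Hypothetical hit records: (classified disease information, region, day of query).
HITS = [
    ("skin disease", "Beijing", date(2020, 9, 1)),
    ("skin disease", "Beijing", date(2020, 9, 1)),
    ("skin disease", "Shanghai", date(2020, 9, 2)),
    ("infectious disease", "Beijing", date(2020, 9, 2)),
]

def count_by(hits, key):
    """Count hits grouped by the value that `key` extracts from each record."""
    return Counter(key(hit) for hit in hits)

by_region = count_by(HITS, lambda h: (h[0], h[1]))  # regional monitoring
by_day = count_by(HITS, lambda h: (h[0], h[2]))     # monitoring over time
```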
In some optional implementations of this embodiment, determining the monitoring list based on the number of times the classified disease information is hit in the query information set includes: acquiring the number of times the classified disease information of each classification is hit in the query information set, and generating a hit-count ranking list; and determining the monitoring list based on the ranking of the classified disease information in the hit-count ranking list.
Specifically, the number of times the classified disease information of each classification is hit in the query information set is acquired and a hit-count ranking list is generated accordingly. A suitable preset number is determined according to the monitoring requirement, and that number of pieces of classified disease information is extracted from the ranking list to generate the monitoring list. The monitoring list thus reflects how popular each piece of disease information is, which improves the usefulness of the information it contains.
In some optional implementations of this embodiment, the disease public opinion monitoring method further includes: selecting a predetermined number of pieces of classified disease information from the monitoring list according to the ranking of their hit counts, to generate a disease heat information list.
Specifically, the number of times the classified disease information of each classification is hit in the query information set is acquired, a hit-count ranking list is generated accordingly, and a predetermined number of pieces of classified disease information are selected from it to generate a disease heat information list. The most popular information can then be reported from this list, making it easier to learn which diseases users are currently concerned about and to analyze disease trends.
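Ranking by hit count and keeping the top entries can be sketched as below. The tie-breaking rule (alphabetical order) and the sample counts are assumptions added so the result is deterministic.

```python
def heat_list(hit_counts, top_n):
    """Rank classified disease information by hit count (descending) and keep top_n.
    Ties are broken alphabetically; this rule is an assumption, not from the application."""
    ranked = sorted(hit_counts.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:top_n]

# Hypothetical hit counts per classified disease information.
counts = {"skin disease": 5, "infectious disease": 8, "eye disease": 1}
top_two = heat_list(counts, top_n=2)
```

The same function with a larger `top_n` would yield the monitoring list, and with a smaller one the disease heat information list.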
In some optional implementations of this embodiment, the disease public opinion monitoring method further includes: generating disease early warning information based on the monitoring list; and outputting the disease early warning information.
Specifically, after the monitoring list is determined, diseases that may break out are identified from the disease information appearing in the monitoring list, and corresponding disease early warning information is generated. The disease early warning information is output to a suitable recipient, such as the operator of the execution body, a health department, and/or the users who sent the disease query information, so that early warning can subsequently be given and the losses caused by a disease outbreak can be reduced.
It should be understood that, where they do not conflict, the features of the above implementations may be combined with each other so that their respective effects add up to a better overall result.
In the disease public opinion monitoring method provided by the embodiments of the present application, disease query information within a preset time is acquired and aggregated into a query information set; a hit disease information set in the query information set is determined based on a pre-constructed disease knowledge graph; the different categories of disease information in the disease information set are aggregated using a predetermined similarity algorithm to obtain a plurality of pieces of classified disease information; and a monitoring list is generated based on the number of times each piece of classified disease information is hit in the query information set. The disease knowledge graph is used to determine the disease information present in the disease query information, and disease public opinion monitoring and disease early warning are realized on that basis.
To better understand how the pre-constructed disease knowledge graph is built, refer to Fig. 3, which shows a flow 300 of an embodiment of constructing the disease knowledge graph in advance in the disease public opinion monitoring method according to the present application. The flow specifically includes the following steps:
Step 301, obtaining a disease information set from historical data.
Specifically, a disease information set is obtained from information in the historical data. The disease information set includes at least disease names and symptom information, so that the constructed disease knowledge graph can subsequently be used to match disease-related content in the query information and obtain disease information of reference value.
Step 302, generating serialized data from the disease information set.
Specifically, the content of the disease information set collected in step 301 is converted into serialized data. Serialization is the process of converting the state information of an object into a form that can be stored or transmitted. During serialization, the object writes its current state to temporary or persistent storage, and the object can later be recreated by reading, or deserializing, its state from storage. For example, the content of the disease information set may be arranged into data in JSON (JavaScript Object Notation) format, CSV (Comma-Separated Values) format, YAML (YAML Ain't Markup Language) format, and the like.
Step 303, constructing a disease knowledge graph using a graph database based on the serialized data.
Specifically, after the serialized data is obtained in step 302, the disease knowledge graph is constructed using a graph database, for example, a Neo4j graph database, a MySQL relational database management system, or a Titan graph database.
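To illustrate the serialization step, the sketch below writes a small disease information set to JSON and to CSV using only the Python standard library. The record fields (`name`, `symptoms`) and the one-row-per-symptom CSV layout are assumptions chosen for the example, not a format prescribed by the application.

```python
import csv
import io
import json

# Hypothetical disease information set: disease name plus symptom list.
disease_info_set = [
    {"name": "pneumonia", "symptoms": ["cough", "fever"]},
    {"name": "urticaria", "symptoms": ["skin reddish eruptions"]},
]


def to_json(records):
    """Serialize the records to a JSON string."""
    return json.dumps(records, ensure_ascii=False, sort_keys=True)


def to_csv(records):
    """Serialize the records to CSV text, one (name, symptom) row each."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["name", "symptom"])
    for record in records:
        for symptom in record["symptoms"]:
            writer.writerow([record["name"], symptom])
    return buffer.getvalue()


json_text = to_json(disease_info_set)
csv_text = to_csv(disease_info_set)
# Deserializing recreates the original state of the object.
restored = json.loads(json_text)
print(csv_text)
```

A flat (name, symptom) row layout is convenient when the rows are later loaded as edges of a graph, whereas the JSON form keeps each disease record intact.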
It should be understood that the selection of the graph database is related to the format of the serialized data and the amount of serialized data obtained in step 302 above.
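The role the knowledge graph plays in later matching can be sketched without a database. The class below is an in-memory stand-in, not a graph database client: the names `DiseaseGraph`, `add_record`, and `match`, and the substring-based matching rule, are assumptions made purely for illustration.

```python
from collections import defaultdict


class DiseaseGraph:
    """Tiny in-memory stand-in for a disease knowledge graph.

    Nodes are disease names and symptoms; an edge links each symptom
    to the diseases that list it.
    """

    def __init__(self):
        self.diseases = set()
        self.symptom_to_diseases = defaultdict(set)

    def add_record(self, name, symptoms):
        name = name.lower()
        self.diseases.add(name)
        for symptom in symptoms:
            self.symptom_to_diseases[symptom.lower()].add(name)

    def match(self, query):
        """Return diseases whose name or symptom occurs in the query text."""
        text = query.lower()
        hits = {disease for disease in self.diseases if disease in text}
        for symptom, diseases in self.symptom_to_diseases.items():
            if symptom in text:
                hits |= diseases
        return hits


graph = DiseaseGraph()
graph.add_record("pneumonia", ["cough", "fever"])
graph.add_record("urticaria", ["skin reddish eruptions"])
graph.add_record("avian influenza", [])

print(graph.match("cough, uncomfortable throat"))
print(graph.match("restaurant recommendation in Beijing"))
```

In a deployment backed by a graph database, `add_record` would correspond to loading the serialized data and `match` to a query against the stored nodes and relationships.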
With the method for constructing the pre-constructed disease knowledge graph in the disease public opinion monitoring method provided by this embodiment of the application, the disease information set can be obtained from information in historical data and the disease knowledge graph can be built from it, so that the constructed graph can be used to match disease-related content in the query information and determine the hit disease information set.
In some optional implementations of this embodiment, the disease information set further includes: at least one of disease infectivity information, morbidity information, and cure method information.
Specifically, when the disease information set used to construct the disease knowledge graph in advance is obtained, one or more of disease infectivity information, morbidity information, and cure method information may also be obtained. This increases the amount of information in the pre-constructed disease knowledge graph, so that the graph can better determine the hit disease information set from the query information set and improve the quality of disease public opinion monitoring.
To deepen understanding, the application also provides a specific implementation in combination with a specific application scenario. In this application scenario, the acquired disease query information is: from a Nanchang user, "cough, uncomfortable throat"; from a Jiujiang user, "fever"; from an Anhui user, "avian influenza"; from a Beijing user, "skin reddish eruptions"; and from a Beijing user, "skin pox, fever" and "restaurant recommendation in Beijing".
These queries are summarized to generate query information set A. Based on the pre-constructed disease knowledge graph, the hit disease information set in query set A includes pneumonia, chicken pox, avian influenza, and urticaria.
The disease information in the disease information set is clustered by using a predetermined similarity algorithm to obtain two items of classified disease information: skin disease and pneumonia.
Pneumonia corresponds to the queries "cough, uncomfortable throat", "fever", and "avian influenza", and skin disease corresponds to the queries "skin reddish eruptions" and "skin pox, fever". A monitoring list is generated based on pneumonia being hit 3 times in the query set, skin disease being hit 2 times in the query set, and the geographic identification information in the query information.
Disease early warning information related to pneumonia is then generated based on the geographic identification information in the monitoring list and is output to the user group in the Jiangxi region.
In the above application scenario, the disease public opinion monitoring method acquires disease query information within a preset time and summarizes it to generate a query information set; determines the hit disease information set in the query information set based on a pre-constructed disease knowledge graph; aggregates the different categories of disease information in the disease information set by using a predetermined similarity algorithm to obtain a plurality of classified disease information; and generates a monitoring list based on the number of times each of the plurality of classified disease information is hit in the query information set. The disease knowledge graph is thus used to determine the disease information present in the disease query information, so that disease public opinion monitoring and disease early warning are realized on the basis of the disease query information.
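The scenario can be traced end to end with a short sketch. The two lookup tables below are hypothetical stand-ins for the knowledge graph matching and the similarity-based aggregation; in particular, which query hits which disease is inferred from the hit counts given in the scenario and is an assumption, as is the "unknown" region of the restaurant query.

```python
from collections import Counter

# (region, query text) pairs from the scenario. The region of the last
# query is not stated in the text, so "unknown" is a placeholder.
QUERIES = [
    ("Nanchang", "cough, uncomfortable throat"),
    ("Jiujiang", "fever"),
    ("Anhui", "avian influenza"),
    ("Beijing", "skin reddish eruptions"),
    ("Beijing", "skin pox, fever"),
    ("unknown", "restaurant recommendation in Beijing"),
]

# Assumed result of matching each query against the knowledge graph.
QUERY_TO_DISEASE = {
    "cough, uncomfortable throat": "pneumonia",
    "fever": "pneumonia",
    "avian influenza": "avian influenza",
    "skin reddish eruptions": "urticaria",
    "skin pox, fever": "chicken pox",
}

# Assumed result of the similarity-based aggregation step.
DISEASE_TO_CATEGORY = {
    "pneumonia": "pneumonia",
    "avian influenza": "pneumonia",
    "urticaria": "skin disease",
    "chicken pox": "skin disease",
}


def build_monitoring_list(queries):
    """Count hits per category and collect the regions they came from."""
    hits = Counter()
    regions = {}
    for region, text in queries:
        disease = QUERY_TO_DISEASE.get(text)
        if disease is None:
            continue  # queries unrelated to any disease are skipped
        category = DISEASE_TO_CATEGORY[disease]
        hits[category] += 1
        seen = regions.setdefault(category, [])
        if region not in seen:
            seen.append(region)
    return [
        {"category": category, "hits": count, "regions": regions[category]}
        for category, count in hits.most_common()
    ]


monitoring_list = build_monitoring_list(QUERIES)
for entry in monitoring_list:
    print(entry)
```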
As shown in Fig. 4, the disease public opinion monitoring apparatus 400 of this embodiment may include: a query information acquisition unit 401 configured to acquire disease query information within a preset time and summarize it to generate a query information set; a disease information determination unit 402 configured to determine a hit disease information set in the query information set based on a pre-constructed disease knowledge graph; a disease information aggregation unit 403 configured to aggregate the different categories of disease information in the disease information set by using a predetermined similarity algorithm to obtain a plurality of classified disease information; and a monitoring list generating unit 404 configured to generate a monitoring list based on the number of times each of the plurality of classified disease information is hit in the query information set.
In some optional implementations of this embodiment, the disease information determination unit 402 further includes: a disease knowledge graph construction subunit configured to construct the pre-constructed disease knowledge graph based on the following steps: acquiring a disease information set from historical data, wherein the disease information set at least comprises disease name information and disease symptom information; generating serialized data according to the disease information set; and constructing the disease knowledge graph using a graph database based on the serialized data.
In some optional implementations of this embodiment, the disease information set further includes: at least one of disease infectivity information, morbidity information, and cure method information.
In some optional implementations of this embodiment, summarizing to generate the query information set in the query information acquisition unit 401 includes: aggregating the query information by using a Hadoop tool to determine the query information set.
In some optional implementations of this embodiment, aggregating the different categories of disease information in the disease information set by using a predetermined similarity algorithm in the disease information aggregation unit 403 includes: aggregating the different categories of disease information in the disease information set by using a dictionary tree (trie) query algorithm.
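The application does not spell out how the dictionary tree query is applied, so the following is only one plausible reading: a trie maps each known disease term to a category, and the hit diseases are grouped by the category found for them. All class names, function names, and term-to-category assignments here are illustrative assumptions.

```python
class TrieNode:
    __slots__ = ("children", "category")

    def __init__(self):
        self.children = {}
        self.category = None  # set only on nodes that end a stored term


class CategoryTrie:
    """Dictionary tree mapping disease terms to a category label."""

    def __init__(self):
        self.root = TrieNode()

    def insert(self, term, category):
        node = self.root
        for char in term:
            node = node.children.setdefault(char, TrieNode())
        node.category = category

    def lookup(self, term):
        """Return the category stored for an exact term, or None."""
        node = self.root
        for char in term:
            node = node.children.get(char)
            if node is None:
                return None
        return node.category


def aggregate(diseases, trie):
    """Group disease names by the category the trie returns for them."""
    groups = {}
    for disease in diseases:
        category = trie.lookup(disease)
        if category is not None:
            groups.setdefault(category, []).append(disease)
    return groups


trie = CategoryTrie()
trie.insert("pneumonia", "pneumonia")
trie.insert("avian influenza", "pneumonia")
trie.insert("chicken pox", "skin disease")
trie.insert("urticaria", "skin disease")

groups = aggregate(["pneumonia", "chicken pox", "avian influenza", "urticaria"], trie)
print(groups)
```

Lookup cost in a trie depends on the length of the term rather than on the number of stored terms, which is one reason such a structure may suit large disease vocabularies.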
In some optional implementations of this embodiment, the monitoring list generating unit 404 further includes: a valid disease category information marking subunit configured to mark the classified disease information as valid disease category information in response to determining that the number of times the classified disease information is hit in the query information set satisfies a predetermined threshold condition; and the monitoring list generating unit 404 is further configured to determine the monitoring list based on the valid disease category information.
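A minimal sketch of the threshold step follows. The application only says that a "predetermined threshold condition" must be satisfied; treating it as "hit count greater than or equal to a fixed number" is an assumption, as are the function name and the sample counts.

```python
def mark_valid_categories(hit_counts, threshold):
    """Keep only the categories whose hit count reaches the threshold.

    hit_counts maps a classified disease category to the number of
    times it was hit in the query information set.
    """
    return {
        category: count
        for category, count in hit_counts.items()
        if count >= threshold
    }


hit_counts = {"pneumonia": 3, "skin disease": 2, "gastritis": 1}
valid = mark_valid_categories(hit_counts, threshold=2)
print(valid)
```

Filtering out categories with very few hits is a simple way to keep isolated or accidental queries from reaching the monitoring list.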
In some optional implementation manners of this embodiment, the query information obtaining unit 401 further includes: a geographic identification acquisition subunit configured to acquire geographic identification information at the time of generation of the disease query information; and/or a time identification obtaining subunit configured to obtain time identification information when the disease query information is generated; and the monitoring list generating unit 404 is further configured to, when generating the monitoring list, add the geographic identification information and/or the time identification information corresponding to the classified disease information in the monitoring list.
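One way to picture how geographic and time identification information could be attached to the monitoring list is shown below. The tuple layout of each hit and the summary fields (`regions`, `first_seen`, `last_seen`) are hypothetical; the application only requires that the identifiers corresponding to the classified disease information be added to the list.

```python
from datetime import datetime


def attach_identifiers(hits):
    """Summarize hits per category with their regions and time span.

    hits is an iterable of (category, region, timestamp) tuples, an
    assumed shape for a query that has already been classified.
    """
    summary = {}
    for category, region, timestamp in hits:
        entry = summary.setdefault(
            category,
            {
                "hits": 0,
                "regions": set(),
                "first_seen": timestamp,
                "last_seen": timestamp,
            },
        )
        entry["hits"] += 1
        entry["regions"].add(region)
        entry["first_seen"] = min(entry["first_seen"], timestamp)
        entry["last_seen"] = max(entry["last_seen"], timestamp)
    return summary


sample_hits = [
    ("pneumonia", "Nanchang", datetime(2020, 9, 1, 9, 0)),
    ("pneumonia", "Jiujiang", datetime(2020, 9, 1, 8, 30)),
    ("skin disease", "Beijing", datetime(2020, 9, 2, 14, 0)),
]
summary = attach_identifiers(sample_hits)
print(summary["pneumonia"])
```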
In some optional implementations of this embodiment, the monitoring list generating unit 404 further includes: a hit times sequence counting subunit configured to obtain the number of times that the classified disease information of different classifications is hit in the query information set, and to generate a hit times sequence list; and the monitoring list generating unit 404 is further configured to determine the monitoring list based on the ranking of the hit times of the classified disease information in the hit times sequence list.
In some optional implementation manners of this embodiment, the apparatus 400 further includes a heat information generating unit, configured to select a preset number of classified disease information from the monitoring list according to the sorting of the hit times of the selected classified disease information, and generate a disease heat information list.
In some optional implementations of this embodiment, the apparatus 400 further includes: an early warning information generating unit configured to generate disease early warning information based on the monitoring list; and an early warning information output unit configured to output the disease early warning information.
This embodiment is the apparatus embodiment corresponding to the above method embodiment; for the same content, refer to the description of the method embodiment, which is not repeated here. The disease public opinion monitoring apparatus provided by this embodiment of the application can use the disease knowledge graph to determine the disease information present in the disease query information, and thereby realize disease public opinion monitoring and disease early warning based on the disease query information.
According to an embodiment of the present application, an electronic device and a readable storage medium are also provided.
Fig. 5 is a block diagram of an electronic device for the disease public opinion monitoring method according to an embodiment of the present application. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital assistants, cellular phones, smart phones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be examples only, and are not meant to limit implementations of the present application that are described and/or claimed herein.
As shown in Fig. 5, the electronic device includes: one or more processors 501, memory 502, and interfaces for connecting the various components, including high-speed interfaces and low-speed interfaces. The various components are interconnected using different buses and may be mounted on a common motherboard or in other manners as desired. The processor may process instructions for execution within the electronic device, including instructions stored in or on the memory to display graphical information of a GUI on an external input/output apparatus (such as a display device coupled to the interface). In other embodiments, multiple processors and/or multiple buses may be used, along with multiple memories, as desired. Also, multiple electronic devices may be connected, with each device providing portions of the necessary operations (e.g., as a server array, a group of blade servers, or a multi-processor system). In Fig. 5, one processor 501 is taken as an example.
Memory 502 is a non-transitory computer readable storage medium as provided herein. The memory stores instructions executable by the at least one processor, so that the at least one processor executes the disease public opinion monitoring method provided by the application. The non-transitory computer-readable storage medium of the present application stores computer instructions for causing a computer to execute the disease public opinion monitoring method provided by the present application.
The memory 502, which is a non-transitory computer-readable storage medium, may be used to store non-transitory software programs, non-transitory computer-executable programs, and modules, such as program instructions/modules corresponding to the disease public opinion monitoring method in the embodiments of the present application (for example, the query information acquisition unit 401, the disease information determination unit 402, the disease information aggregation unit 403, and the monitoring list generating unit 404 shown in Fig. 4). The processor 501 executes various functional applications of the server and data processing by running non-transitory software programs, instructions and modules stored in the memory 502, so as to implement the disease public opinion monitoring method in the above method embodiment.
The memory 502 may include a storage program area and a storage data area, wherein the storage program area may store an operating system and an application program required for at least one function; the storage data area may store data created through use of the electronic device for disease public opinion monitoring, and the like. Further, the memory 502 may include high speed random access memory, and may also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid state storage device. In some embodiments, memory 502 may optionally include memory located remotely from processor 501, which may be connected to the electronic device for disease public opinion monitoring via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The electronic device for the disease public opinion monitoring method may further include: an input device 503 and an output device 504. The processor 501, the memory 502, the input device 503 and the output device 504 may be connected by a bus or other means, and Fig. 5 takes connection by a bus as an example.
The input device 503 may receive input numeric or character information and generate key signal inputs related to user settings and function control of the electronic device for disease public opinion monitoring. Examples of the input device include a touch screen, a keypad, a mouse, a track pad, a touch pad, a pointing stick, one or more mouse buttons, a track ball, a joystick, and the like. The output devices 504 may include a display device, auxiliary lighting devices (e.g., LEDs), and haptic feedback devices (e.g., vibrating motors), among others. The display device may include, but is not limited to, a Liquid Crystal Display (LCD), a Light Emitting Diode (LED) display, and a plasma display. In some implementations, the display device can be a touch screen.
Various implementations of the systems and techniques described here can be realized in digital electronic circuitry, integrated circuitry, ASICs (application-specific integrated circuits), computer hardware, firmware, software, and/or combinations thereof. These various implementations may include: implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special or general purpose, receiving data and instructions from, and transmitting data and instructions to, a storage system, at least one input device, and at least one output device.
These computer programs (also known as programs, software applications, or code) include machine instructions for a programmable processor, and may be implemented using high-level procedural and/or object-oriented programming languages, and/or assembly/machine languages. As used herein, the terms "machine-readable medium" and "computer-readable medium" refer to any computer program product, apparatus, and/or device (e.g., magnetic discs, optical disks, memory, Programmable Logic Devices (PLDs)) used to provide machine instructions and/or data to a programmable processor, including a machine-readable medium that receives machine instructions as a machine-readable signal. The term "machine-readable signal" refers to any signal used to provide machine instructions and/or data to a programmable processor.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic, speech, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: Local Area Networks (LANs), Wide Area Networks (WANs), and the Internet.
The computer system may include clients and servers. A client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other.
According to the technical solution of the embodiments of the application, disease query information within a preset time is acquired and summarized to generate a query information set; the hit disease information set in the query information set is determined based on a pre-constructed disease knowledge graph; the different categories of disease information in the disease information set are aggregated by using a predetermined similarity algorithm to obtain a plurality of classified disease information; and a monitoring list is generated based on the number of times each of the plurality of classified disease information is hit in the query information set. The disease knowledge graph is thus used to determine the disease information present in the disease query information, so that disease public opinion monitoring and disease early warning are realized on the basis of the disease query information.
It should be understood that various forms of the flows shown above may be used, with steps reordered, added, or deleted. For example, the steps described in the present application may be executed in parallel, sequentially, or in different orders, as long as the desired results of the technical solutions disclosed in the present application can be achieved, and the present invention is not limited herein.
The above-described embodiments should not be construed as limiting the scope of the present application. It should be understood by those skilled in the art that various modifications, combinations, sub-combinations and substitutions may be made in accordance with design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present application shall be included in the protection scope of the present application.

Claims (22)

1. A disease public opinion monitoring method comprises the following steps:
acquiring disease query information within preset time, and summarizing to generate a query information set;
determining a hit disease information set in the query information set based on a pre-constructed disease knowledge graph;
aggregating the different categories of disease information in the disease information set by adopting a predetermined similarity algorithm to obtain a plurality of classified disease information;
generating a monitoring list based on the number of times each of the plurality of classified disease information is hit in the query information set.
2. The method of claim 1, wherein the pre-constructed disease knowledge graph is determined based on the following steps:
acquiring a disease information set from historical data, wherein the disease information set at least comprises disease name information and disease symptom information;
generating serialized data according to the disease information set; and
constructing the disease knowledge graph using a graph database based on the serialized data.
3. The method of claim 2, wherein the set of disease information further comprises: at least one of disease infectivity information, morbidity information, and cure method information.
4. The method of claim 1, wherein the summarizing to generate a query information set comprises:
aggregating the query information by using a Hadoop tool to determine the query information set.
5. The method of claim 1, wherein said aggregating, using a predetermined similarity algorithm, different categories of disease information in the set of disease information comprises:
aggregating the different categories of disease information in the disease information set by using a dictionary tree (trie) query algorithm.
6. The method of claim 1, wherein generating a monitoring list based on a number of times each of the plurality of classified disease information is hit in the query information set comprises:
in response to determining that the number of times the classified disease information is hit in the query information set satisfies a predetermined threshold condition, marking the classified disease information as valid disease category information; and
determining the monitoring list based on the valid disease category information.
7. The method of claim 1, wherein the disease query information comprises:
geographic identification information and/or time identification information when the disease query information is generated; and
the monitoring list further comprises:
the geographic identification information and/or the time identification information corresponding to the classified disease information.
8. The method of claim 1, wherein generating a monitoring list based on the number of times each of the plurality of classified disease information is hit in the query information set comprises:
acquiring the number of times that the classified disease information of different classifications is hit in the query information set, and generating a hit times sequence list; and
determining the monitoring list based on the ranking of the hit times of the classified disease information in the hit times sequence list.
9. The method of claim 8, further comprising:
selecting a preset number of classified disease information from the monitoring list according to the ranking of the hit times of the classified disease information, and generating a disease heat information list.
10. The method according to any one of claims 1-9, further comprising:
generating disease early warning information based on the monitoring list; and
outputting the disease early warning information.
11. A disease public opinion monitoring device, comprising:
a query information acquisition unit configured to acquire disease query information within a preset time and summarize the disease query information to generate a query information set;
a disease information determination unit configured to determine a hit disease information set in the query information set based on a pre-constructed disease knowledge graph;
a disease information aggregation unit configured to aggregate different categories of disease information in the disease information set by using a predetermined similarity algorithm to obtain a plurality of classified disease information; and
a monitoring list generating unit configured to generate a monitoring list based on the number of times that each of the plurality of classified disease information is hit in the query information set.
12. The apparatus of claim 11, wherein the disease information determination unit further comprises:
a disease knowledge graph construction subunit configured to construct the pre-constructed disease knowledge graph based on the following steps: acquiring a disease information set from historical data, wherein the disease information set at least comprises disease name information and disease symptom information;
generating serialized data according to the disease information set; and
constructing the disease knowledge graph using a graph database based on the serialized data.
13. The apparatus of claim 12, wherein the set of disease information further comprises: at least one of disease infectivity information, morbidity information, and cure method information.
14. The apparatus of claim 11, wherein summarizing to generate the query information set in the query information acquisition unit comprises:
aggregating the query information by using a Hadoop tool to determine the query information set.
15. The apparatus according to claim 11, wherein aggregating the different categories of disease information in the disease information set by using a predetermined similarity algorithm in the disease information aggregation unit comprises:
aggregating the different categories of disease information in the disease information set by using a dictionary tree (trie) query algorithm.
16. The apparatus of claim 11, wherein the monitoring list generating unit further comprises:
a valid disease category information marking subunit configured to mark the classified disease information as valid disease category information in response to determining that the number of times the classified disease information is hit in the query information set satisfies a predetermined threshold condition; and
the monitoring list generating unit is further configured to determine the monitoring list based on the valid disease category information.
17. The apparatus of claim 11, wherein the query information acquisition unit further comprises:
a geographic identification acquisition subunit configured to acquire geographic identification information at the time of generation of the disease query information; and/or
a time identification acquisition subunit configured to acquire time identification information at the time of generation of the disease query information; and
the monitoring list generating unit is further configured to add, when generating the monitoring list, the geographic identification information and/or the time identification information corresponding to the classified disease information to the monitoring list.
18. The apparatus of claim 11, wherein the monitoring list generating unit further comprises:
a hit times sequence counting subunit configured to obtain the number of times that the classified disease information of different classifications is hit in the query information set, and to generate a hit times sequence list; and
the monitoring list generating unit is further configured to determine the monitoring list based on the ranking of the hit times of the classified disease information in the hit times sequence list.
19. The apparatus of claim 18, further comprising:
a heat information generating unit configured to select a preset number of classified disease information from the monitoring list according to the ranking of the hit times of the classified disease information, and to generate a disease heat information list.
20. The apparatus of any of claims 11-19, further comprising:
an early warning information generation unit configured to generate disease early warning information based on the monitoring list;
an early warning information output unit configured to output the disease early warning information.
21. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-10.
22. A non-transitory computer readable storage medium storing computer instructions, wherein the computer instructions are for causing a computer to perform the method of any one of claims 1-10.
CN202010972725.2A 2020-09-16 2020-09-16 Disease public opinion monitoring method and device Pending CN112100498A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010972725.2A CN112100498A (en) 2020-09-16 2020-09-16 Disease public opinion monitoring method and device

Publications (1)

Publication Number Publication Date
CN112100498A true CN112100498A (en) 2020-12-18

Family

ID=73759689

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010972725.2A Pending CN112100498A (en) 2020-09-16 2020-09-16 Disease public opinion monitoring method and device

Country Status (1)

Country Link
CN (1) CN112100498A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112951441A (en) * 2021-02-25 2021-06-11 平安科技(深圳)有限公司 Monitoring and early warning method, device, equipment and storage medium based on multiple dimensions
CN113871019A (en) * 2021-12-06 2021-12-31 江西易卫云信息技术有限公司 Disease public opinion monitoring method, system, storage medium and equipment
CN114023447A (en) * 2021-12-06 2022-02-08 清华大学 Training method and device for rare patient number prediction model
CN114067935A (en) * 2021-11-03 2022-02-18 广西壮族自治区通信产业服务有限公司技术服务分公司 Epidemic disease investigation method, system, electronic equipment and storage medium

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109767842A (en) * 2018-12-13 2019-05-17 平安科技(深圳)有限公司 Disease early-warning method, disease early-warning device and computer readable storage medium
CN111444429A (en) * 2020-03-27 2020-07-24 腾讯科技(深圳)有限公司 Information pushing method and device and server

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112951441A (en) * 2021-02-25 2021-06-11 平安科技(深圳)有限公司 Monitoring and early warning method, device, equipment and storage medium based on multiple dimensions
CN114067935A (en) * 2021-11-03 2022-02-18 广西壮族自治区通信产业服务有限公司技术服务分公司 Epidemic disease investigation method, system, electronic equipment and storage medium
CN114067935B (en) * 2021-11-03 2022-05-20 广西壮族自治区通信产业服务有限公司技术服务分公司 Epidemic disease investigation method, system, electronic equipment and storage medium
CN113871019A (en) * 2021-12-06 2021-12-31 江西易卫云信息技术有限公司 Disease public opinion monitoring method, system, storage medium and equipment
CN114023447A (en) * 2021-12-06 2022-02-08 清华大学 Training method and device for rare patient number prediction model

Similar Documents

Publication Publication Date Title
CN112100498A (en) Disease public opinion monitoring method and device
US11275642B2 (en) Tuning context-aware rule engine for anomaly detection
US11500880B2 (en) Adaptive recommendations
CN111681726B (en) Processing method, device, equipment and medium of electronic medical record data
US11657612B2 (en) Method and apparatus for identifying video
US20200267057A1 (en) Systems and methods for automatically detecting, summarizing, and responding to anomalies
Doyle et al. Forecasting significant societal events using the embers streaming predictive analytics system
US20180102938A1 (en) Cluster-based processing of unstructured log messages
US9633088B1 (en) Event log versioning, synchronization, and consolidation
JP2021166098A (en) Retrieval word recommendation method and apparatus, target model training method and apparatus, electronic device, storage medium, and program
US20230004536A1 (en) Systems and methods for a data search engine based on data profiles
US9824312B2 (en) Domain specific languages and complex event handling for mobile health machine intelligence systems
Chen et al. Bert-log: Anomaly detection for system logs based on pre-trained language model
CN112509690A (en) Method, apparatus, device and storage medium for controlling quality
CN112269885A (en) Method, apparatus, device and storage medium for processing data
CN110852780A (en) Data analysis method, device, equipment and computer storage medium
CN111756832A (en) Method and device for pushing information, electronic equipment and computer readable storage medium
Roy et al. A proposal for optimization of data node by horizontal scaling of name node using big data tools
US20220164377A1 (en) Method and apparatus for distributing content across platforms, device and storage medium
CN111125362B (en) Abnormal text determination method and device, electronic equipment and medium
CN112308127A (en) Method, apparatus, device and storage medium for processing data
JP6159002B1 (en) Estimation apparatus, estimation method, and estimation program
CN112329427B (en) Method and device for acquiring short message samples
CN112559747B (en) Event classification processing method, device, electronic equipment and storage medium
JP2018049587A (en) Estimation device, estimation method and estimation program

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination