CN115544214B - Event processing method, device and computer readable storage medium - Google Patents

Event processing method, device and computer readable storage medium

Info

Publication number
CN115544214B
Authority
CN
China
Prior art keywords
event
historical
entity
target
cosine similarity
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202211533022.5A
Other languages
Chinese (zh)
Other versions
CN115544214A (en
Inventor
牟昊
邓钢清
何宇轩
徐亚波
李旭日
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangzhou Datastory Information Technology Co ltd
Original Assignee
Guangzhou Datastory Information Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangzhou Datastory Information Technology Co ltd filed Critical Guangzhou Datastory Information Technology Co ltd
Priority to CN202211533022.5A priority Critical patent/CN115544214B/en
Publication of CN115544214A publication Critical patent/CN115544214A/en
Application granted granted Critical
Publication of CN115544214B publication Critical patent/CN115544214B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33Querying
    • G06F16/3331Query processing
    • G06F16/334Query execution
    • G06F16/3344Query execution using natural language analysis
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/31Indexing; Data structures therefor; Storage structures
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33Querying
    • G06F16/3331Query processing
    • G06F16/334Query execution
    • G06F16/3346Query execution using probabilistic model
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/279Recognition of textual entities
    • G06F40/289Phrasal analysis, e.g. finite state techniques or chunking
    • G06F40/295Named entity recognition
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The invention discloses an event processing method, a device, and a computer readable storage medium. The method comprises the following steps: extracting events from text information using an event extraction model; extracting entities from the text information using an entity recognition model; determining a target event according to the extracted event information and the entities; calculating the cosine similarity between the target event and each historical event in an event database, and judging whether the target event and any one of the top K historical events with the highest cosine similarity are the same event, according to the cosine similarity of those top K historical events, the entities of those historical events, and the entities of the target event; if not, incrementally adding the target event to the event database; otherwise, updating the corresponding historical event in the event database. Because the event is extracted with an event extraction model, the entities are extracted with an entity recognition model, and event similarity is judged by combining the cosine similarity of the events with the similarity of their entities, the accuracy of event extraction and merging can be improved.

Description

Event processing method, device and computer readable storage medium
Technical Field
The present invention relates to the field of computer technologies, and in particular, to an event processing method, an event processing device, and a computer readable storage medium.
Background
Events occur around the world all the time. Extracting the context of an event is particularly important for following its development, and the relationships between events and people, enterprises, industries, and the like are especially important: they help people quickly understand how an event has developed and support natural language applications such as intelligent search, question answering systems, recommendation, and text generation. However, the same event can be described in many ways, especially in Chinese, and the descriptions found on the Internet are more varied still. If these descriptions of the same event are not merged, downstream applications suffer. In intelligent search, for example, the results returned for a keyword are likely to be different descriptions of the same event, which makes it hard for users to find the results they want. How to merge identical events is therefore particularly important.
Existing event merging methods merge similar events according to the edit distance between their characters. However, computing the edit distance is time-consuming, and different events that differ by only one or two characters are treated as similar. For example, the two events "Apple releases iPhone 12" and "Apple releases iPhone 13" differ by a single character, so an edit-distance calculation may treat them as the same event.
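The weakness described above can be made concrete with a small sketch. It assumes that the character-level distance referred to here is the Levenshtein edit distance; the function name `edit_distance` and the exact wording of the two event strings are illustrative and not taken from the patent.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance via the classic dynamic-programming recurrence."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute ca -> cb
        prev = curr
    return prev[-1]


# Two different events that differ by a single character.
d = edit_distance("Apple releases iPhone 12", "Apple releases iPhone 13")
print(d)  # 1
```

A distance of 1 over 24 characters looks like a near-duplicate to a purely character-based rule, even though the two strings describe different product launches.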
Disclosure of Invention
The embodiment of the invention provides an event processing method, event processing equipment and a computer readable storage medium, which can effectively improve the accuracy of event extraction and merging.
In a first aspect, an embodiment of the present invention provides an event processing method, including:
acquiring text information, and carrying out event extraction on the text information by adopting an event extraction model to obtain event information;
extracting the entity from the text information by adopting an entity identification model to obtain the entity in the text information;
determining a target event according to the event information and the entity;
calculating cosine similarity between the target event and each historical event in an event database, and selecting the first K historical events with highest cosine similarity from the event database;
judging whether any one of the target event and the previous K historical events is the same event or not according to the cosine similarity of the selected previous K historical events, the entity of the selected previous K historical events and the entity of the target event;
if not, incrementally adding the target event to the event database;
if yes, updating the historical event which belongs to the same event with the target event in the event database.
As an improvement of the above scheme, the event information includes an event, an event type, and a probability of the event type.
As an improvement of the above solution, the determining a target event according to the event information and the entity includes:
judging whether the probability of the event type of the current extracted event is larger than a set probability threshold value;
if not, discarding the currently extracted event;
if yes, judging whether the entity exists in the currently extracted event;
when the entity exists in the current extracted event, outputting the current extracted event as a target event;
discarding the currently extracted event when the entity does not exist in the currently extracted event.
As an improvement of the above solution, the calculating the cosine similarity between the target event and each historical event in the event database, and selecting the first K historical events with the highest cosine similarity from the event database includes:
inputting the target event into a vector model to obtain an event vector of the target event;
calculating cosine similarity between the event vector and each historical event in the event database;
and selecting the first K historical events with highest cosine similarity from the event database.
As an improvement of the above solution, the method further includes:
and carrying out standardization processing on the currently extracted entity through a preset normalization code table.
As an improvement of the above solution, the judging whether the target event and any one of the first K historical events are the same event according to the cosine similarity of the selected first K historical events, the entities of the selected first K historical events, and the entity of the target event includes:
for the first K historical events, judging whether cosine similarity between the ith historical event and the target event is larger than a preset similarity threshold value or not;
if not, determining that the target event and the ith historical event are not the same event;
if yes, judging whether the standardized entity is the same as the entity corresponding to the ith historical event;
when the standardized entity is different from the entity corresponding to the ith historical event, extracting the (i+1)th historical event and returning to the cosine similarity judging flow, where 1 ≤ i ≤ K-1;
when the standardized entity is the same as the entity corresponding to the ith historical event, inputting the target event and the ith historical event into an event similarity judgment model to obtain an event judgment result; the event judgment result comprises the same event and not the same event.
As an improvement of the above scheme, before the (i+1)th historical event is extracted, the method further includes:
judging whether the ith historical event is the last historical event in the previous K historical events or not;
if yes, determining that the target event and the ith historical event are not the same event;
if not, the (i+1) th historical event is extracted.
As an improvement of the above solution, the updating the historical event in the event database, which belongs to the same event as the target event, includes:
for a historical event belonging to the same event with the target event in the event database, updating a field of the historical event;
wherein the fields include the occurrence time and the volume (the number of times the event has been identified) of the corresponding event.
In a second aspect, an embodiment of the present invention provides an event processing device, including: a processor, a memory, and a computer program stored in the memory and configured to be executed by the processor, the processor implementing the event processing method according to any one of the first aspect when executing the computer program.
In a third aspect, an embodiment of the present invention provides a computer readable storage medium storing a computer program, where the computer program when run controls a device in which the computer readable storage medium is located to perform the event processing method according to any one of the first aspects.
Compared with the prior art, the embodiments of the present invention have the following beneficial effects. Text information is acquired, and an event extraction model is used to extract event information from it; an entity recognition model is used to extract the entities in the text information; a target event is determined according to the event information and the entities; the cosine similarity between the target event and each historical event in an event database is calculated, and the first K historical events with the highest cosine similarity are selected from the event database; whether the target event and any one of the selected first K historical events are the same event is judged according to the cosine similarity of the selected first K historical events, their entities, and the entities of the target event; if not, the target event is incrementally added to the event database; if yes, the historical event in the event database that belongs to the same event as the target event is updated. Because the event is extracted with an event extraction model, the entities are extracted with an entity recognition model, and event similarity is judged by combining the cosine similarity of the events with the similarity of their entities, the accuracy of event extraction and merging can be improved.
Drawings
In order to more clearly illustrate the technical solutions of the present invention, the drawings that will be used in the embodiments will be briefly described below, and it will be apparent that the drawings in the following description are only some embodiments of the present invention, and other drawings can be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flow chart of a method for event processing according to an embodiment of the present invention;
FIG. 2 is a flow chart of event extraction provided by an embodiment of the present invention;
FIG. 3 is a flow chart of event merging provided by an embodiment of the present invention;
FIG. 4 is a schematic diagram of an event processing device according to an embodiment of the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
Example 1
Referring to FIG. 1, a flowchart of an event processing method according to an embodiment of the present invention is shown. The method includes:
S1: acquiring text information, and carrying out event extraction on the text information by adopting an event extraction model to obtain event information;
By way of example, an API tool may be used to crawl text information from web pages such as websites, microblogs, and WeChat official accounts, for example: the regenerated me of Xuqin XuXX brand of the substitution name # Xiato// @ is Dicheng: # Gong XuXX brand of the substitution name # delicious and remaining between lips and happy and free. The sweet blessing is given to you, so that the best taste melts you-together with XU XX@ XU XX Chinese brand code words Gong X@ Gong X Simon, hope you to get happy in summer-!
The crawled text information is input into a pre-constructed event extraction model, which extracts the event information of the text information. The event extraction model is constructed using a BERT model. The BERT model belongs to the prior art and is not described in detail in this embodiment.
The event information includes the event, the event type, and the probability of the event type. An event consists of three elements: a subject, a predicate, and an object. Taking the above text information as an example, inputting it into the BERT model yields the event ("Gong X", "endorses", "Xu XX"), the event type "endorsement", and an event-type probability of 0.9988. Note that the event types may be predefined and configured in the event extraction model, so that the model classifies the input text information by event type and predicts the corresponding probability.
S2: extracting the entity from the text information by adopting an entity identification model to obtain the entity in the text information;
In the embodiment of the invention, the text information is input into the entity recognition model to obtain the entities, for example: {brand: Xu XX, name: Gong X}.
S3: determining a target event according to the event information and the entity;
S4: calculating cosine similarity between the target event and each historical event in an event database, and selecting the first K historical events with highest cosine similarity from the event database;
S5: judging whether the target event and any one of the first K historical events are the same event according to the cosine similarity of the selected first K historical events, the entities of the selected first K historical events, and the entity of the target event;
S6: if not, incrementally adding the target event to the event database;
S7: if yes, updating the historical event in the event database that belongs to the same event as the target event.
In the embodiment of the invention, an event extraction model is used to extract the event, an entity recognition model is used to extract the entities, and event similarity is judged by combining the cosine similarity of the events with the similarity of their entities. If the target event is judged to be the same as a historical event, the corresponding historical event in the event database is updated directly; otherwise, the target event is incrementally added to the event database. The accuracy of event extraction and merging can thereby be improved.
In an alternative embodiment, said determining a target event based on said event information and said entity comprises:
judging whether the probability of the event type of the current extracted event is larger than a set probability threshold value;
if not, discarding the currently extracted event;
if yes, judging whether the entity exists in the currently extracted event;
when the entity exists in the current extracted event, outputting the current extracted event as a target event;
discarding the currently extracted event when the entity does not exist in the currently extracted event.
Illustratively, the entity recognition model is constructed using a BERT model. The event extraction flow is shown in FIG. 2: the text information is input into the event extraction model to obtain the event and the probability of the event type, and into the entity recognition model to obtain the entities. To reduce computation, the probability of the event type may be checked first: if it is not greater than the preset probability threshold, the event is discarded directly and the corresponding text information is not input into the entity recognition model; if it is greater, the event is retained and the corresponding text information is input into the entity recognition model to obtain the entities. It is then judged whether the retained event contains an entity output by the entity recognition model; if not, the event is discarded, and if so, the event is output as the target event. For example, with the probability threshold set to 0.7, the event "Gong X endorses Xu XX" is retained, the entities {brand: Xu XX, name: Gong X} are obtained, and the currently extracted event is output as the target event. In the embodiment of the invention, the event extraction model, the entity recognition model, and the event probability threshold are used together to extract events, so that the accuracy of event extraction can be improved.
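The filtering flow described above might be sketched as follows. This is a minimal illustration, not the patented implementation: the `ExtractedEvent` class, the `select_target_event` function, and the substring test used to decide whether an entity "exists in" the event are all assumptions, and the predicate wording "endorses" is an assumed rendering of the example. The outputs of the two BERT-based models are supplied as plain values.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class ExtractedEvent:
    """Hypothetical container for the event extraction model's output."""
    subject: str
    predicate: str
    obj: str
    event_type: str
    type_probability: float


def select_target_event(event: ExtractedEvent,
                        entities: Dict[str, str],
                        prob_threshold: float = 0.7) -> Optional[ExtractedEvent]:
    """Return the event as the target event, or None if it is discarded."""
    # Step 1: discard events whose type probability is not above the threshold.
    if event.type_probability <= prob_threshold:
        return None
    # Step 2: keep the event only if a recognized entity appears in it
    # (assumed here to mean a substring match on the event text).
    event_text = f"{event.subject} {event.predicate} {event.obj}"
    if any(name and name in event_text for name in entities.values()):
        return event
    return None


event = ExtractedEvent("Gong X", "endorses", "Xu XX", "endorsement", 0.9988)
entities = {"brand": "Xu XX", "name": "Gong X"}
target = select_target_event(event, entities)
print(target is event)  # True
```

Checking the probability before running entity recognition, as the text suggests, means the second model only needs to be called for events that survive the first check.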
In an optional embodiment, the calculating the cosine similarity between the target event and each historical event in the event database, and selecting the first K historical events with the highest cosine similarity from the event database includes:
inputting the target event into a vector model to obtain an event vector of the target event;
calculating cosine similarity between the event vector and each historical event in the event database;
and selecting the first K historical events with highest cosine similarity from the event database.
The vector model is constructed using a BERT model. Inputting the target event "Gong X endorses Xu XX" into the vector model yields a 768-dimensional event vector representing the target event. The cosine similarity between this event vector and each historical event in the event database is calculated, the top-K historical events with the highest cosine similarity are selected, and the top-K events are then sorted by cosine similarity in descending order. The cosine similarity between the event vector and a historical event is calculated as follows:
$$\cos(x, y) = \frac{\sum_{i=1}^{n} x_i y_i}{\sqrt{\sum_{i=1}^{n} x_i^{2}}\,\sqrt{\sum_{i=1}^{n} y_i^{2}}}$$
where n represents the dimension of the event vector, x_i represents the i-th component of the event vector corresponding to the target event, and y_i represents the i-th component of the event vector corresponding to the historical event. Note that inputting a historical event into the vector model likewise yields a 768-dimensional event vector that characterizes the historical event. To keep the illustration short, K is set to 3; the 3 historical events with the highest cosine similarity are selected and sorted by cosine similarity in descending order, for example:
Historical event 1: Gong X becomes Xu XX brand spokesperson; entities: Gong X, Xu XX; cosine similarity: 0.98.
Historical event 2: Gong X is Yanshuxx; entities: Jun, comfort XX; cosine similarity: 0.62.
Historical event 3: Gong X, origin Charlotte Tilbury; entities: Gong X, Charlotte Tilbury; cosine similarity: 0.45.
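The similarity calculation and top-K selection can be sketched in a few lines. The sketch below uses toy 3-dimensional vectors in place of the 768-dimensional BERT vectors, and a plain dictionary in place of the event database; the names `cosine_similarity` and `top_k_similar` and the event IDs are illustrative assumptions.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(x: Sequence[float], y: Sequence[float]) -> float:
    """Dot product of x and y divided by the product of their Euclidean norms."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    if norm_x == 0.0 or norm_y == 0.0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_x * norm_y)


def top_k_similar(target_vec: Sequence[float],
                  history: Dict[str, Sequence[float]],
                  k: int = 3) -> List[Tuple[str, float]]:
    """Return the k (event_id, similarity) pairs with the highest similarity,
    sorted in descending order of similarity."""
    scored = [(event_id, cosine_similarity(target_vec, vec))
              for event_id, vec in history.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy vectors standing in for model outputs.
history = {"a": [1.0, 0.0, 0.0], "b": [1.0, 1.0, 0.0],
           "c": [0.0, 1.0, 0.0], "d": [-1.0, 0.0, 0.0]}
top3 = top_k_similar([1.0, 0.0, 0.0], history, k=3)
print([event_id for event_id, _ in top3])  # ['a', 'b', 'c']
```

Scanning every historical event is the simplest reading of "each historical event in the event database"; for a large database an approximate nearest-neighbor index could replace the linear scan, though the patent text does not describe one.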
In an alternative embodiment, the method further comprises:
carrying out standardization processing on the currently extracted entity through a preset normalization code table.
The normalization code table records the alternative names and the standard name of each brand entity. For example, the normalization code table comprises a key column, which records the standard name, and a word column, which records the alternative names of that standard name. If a brand entity in the target event appears in the word column of the normalization code table, it is replaced by the standard name in the corresponding key column. The event entities are thereby standardized, which makes it easier to merge similar events.
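One way to picture the normalization code table is as a lookup from every alternative name to its standard name. The sketch below is an assumption about representation only: the table is modeled as a list of (key, words) rows, the brand names are hypothetical placeholders, and the function names are not taken from the patent.

```python
from typing import Dict, Iterable, List, Tuple


def build_alias_index(code_table: Iterable[Tuple[str, List[str]]]) -> Dict[str, str]:
    """Flatten (key, words) rows into a dict mapping each name to its standard name."""
    index: Dict[str, str] = {}
    for key, words in code_table:
        index[key] = key          # the standard name maps to itself
        for word in words:
            index[word] = key     # each alternative name maps to the standard name
    return index


def normalize_entity(name: str, alias_index: Dict[str, str]) -> str:
    """Replace a known alternative name by its standard name; leave others unchanged."""
    return alias_index.get(name, name)


# Hypothetical table: key column = standard name, word column = alternative names.
code_table = [("Brand A", ["BrandA", "brand-a"]), ("Brand B", ["B Co."])]
alias_index = build_alias_index(code_table)
normalized = [normalize_entity(n, alias_index) for n in ["brand-a", "Brand B", "Unknown"]]
print(normalized)  # ['Brand A', 'Brand B', 'Unknown']
```

Names that are absent from the table pass through unchanged, so the later entity comparison still works for brands the table does not cover.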
Further, the judging whether the target event and any one of the first K historical events are the same event according to the cosine similarity of the selected first K historical events, the entities of the selected first K historical events, and the entity of the target event comprises:
for the first K historical events, judging whether cosine similarity between the ith historical event and the target event is larger than a preset similarity threshold value or not;
if not, determining that the target event and the ith historical event are not the same event;
if yes, judging whether the standardized entity is the same as the entity corresponding to the ith historical event;
when the standardized entity is different from the entity corresponding to the ith historical event, extracting the (i+1)th historical event and returning to the cosine similarity judging flow, where 1 ≤ i ≤ K-1;
when the standardized entity is the same as the entity corresponding to the ith historical event, inputting the target event and the ith historical event into an event similarity judgment model to obtain an event judgment result; the event judgment result comprises the same event and not the same event.
Further, before the (i+1)th historical event is extracted, the method further comprises:
judging whether the ith historical event is the last historical event in the previous K historical events or not;
if yes, determining that the target event and the ith historical event are not the same event;
if not, the (i+1) th historical event is extracted.
The event merging flow is shown in FIG. 3. The target event is input into the vector model to obtain an event vector, the cosine similarity between the event vector and the historical events in the event database is calculated, and the top-K historical events are selected. At the same time, the entities output by the entity recognition model are standardized using the normalization code table to obtain standardized entities. The historical events are then taken one at a time in descending order of cosine similarity. For the current historical event, it is judged whether its cosine similarity is greater than the preset similarity threshold. If not, the target event is a new event and is incrementally stored in the event database. If yes, it is further judged whether the entities of the historical event are the same as the standardized entities. If the entities are the same, the event similarity judgment model is used to judge whether the target event and the historical event are the same event: if they are, the target event and the historical event are merged; if they are not, the target event is a new event and is incrementally stored in the event database. If the entities are different, it is judged whether the historical event is the last of the top-K historical events: if yes, the target event is a new event and is incrementally stored in the event database; if not, the next historical event is taken from the top-K historical events and the merging process is repeated.
The event similarity judgment model is constructed using a BERT model. For example, the similarity threshold is set to 0.85. From the top-K historical events above (historical event 1, historical event 2, and historical event 3), historical event 1, "Gong X becomes Xu XX brand spokesperson", is selected first. Its cosine similarity to the target event is 0.98, which is greater than the preset similarity threshold of 0.85. The entities of historical event 1 are Gong X and Xu XX, and the entities of the target event "Gong X endorses Xu XX" are Xu XX and Gong X; that is, historical event 1 and the target event have the same entities. Historical event 1 is therefore merged with the target event, and the relevant information of historical event 1 is updated in the event database. In the embodiment of the invention, the cosine similarity, the event similarity judgment model, and the entity similarity are used together to judge event similarity so that similar events are merged, which further improves the accuracy of event merging and avoids retaining duplicates of similar events.
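The merging decision can be sketched as a single loop over the top-K candidates. This is a hedged illustration under several assumptions: the `Candidate` class and `find_same_event` function are hypothetical, entities are compared as unordered sets (consistent with the example, where the same two entities appear in a different order), a failed threshold check ends the search because the candidates are sorted in descending order, and the BERT-based event similarity judgment model is replaced by a caller-supplied function.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, List, Optional


@dataclass
class Candidate:
    """Hypothetical record for one of the top-K historical events."""
    event_id: str
    text: str
    entities: FrozenSet[str]
    similarity: float  # cosine similarity to the target event


def find_same_event(target_text: str,
                    target_entities: FrozenSet[str],
                    candidates: List[Candidate],
                    same_event_model: Callable[[str, str], bool],
                    sim_threshold: float = 0.85) -> Optional[str]:
    """Return the ID of the historical event to merge with, or None for a new event."""
    ranked = sorted(candidates, key=lambda c: c.similarity, reverse=True)
    for cand in ranked:
        if cand.similarity <= sim_threshold:
            return None   # all remaining candidates are even less similar
        if cand.entities != target_entities:
            continue      # entities differ: move on to the next candidate
        # Entities match: defer to the event similarity judgment model.
        if same_event_model(target_text, cand.text):
            return cand.event_id
        return None       # judged not the same event: treat target as new
    return None           # candidates exhausted without a match


candidates = [
    Candidate("h1", "Gong X becomes Xu XX brand spokesperson",
              frozenset({"Gong X", "Xu XX"}), 0.98),
    Candidate("h2", "historical event 2", frozenset({"Jun", "comfort XX"}), 0.62),
    Candidate("h3", "historical event 3",
              frozenset({"Gong X", "Charlotte Tilbury"}), 0.45),
]
match = find_same_event("Gong X endorses Xu XX", frozenset({"Xu XX", "Gong X"}),
                        candidates, same_event_model=lambda a, b: True)
print(match)  # h1
```

A return value of `None` corresponds to incrementally adding the target event to the event database, and an event ID corresponds to updating that historical event.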
In an alternative embodiment, the updating the historical event in the event database that belongs to the same event as the target event includes:
for a historical event belonging to the same event with the target event in the event database, updating a field of the historical event;
wherein the fields include the occurrence time and the volume (the number of times the event has been identified) of the corresponding event.
In the embodiment of the invention, each historical event stored in the event database has fields that record the occurrence time and the volume of the event. When the target event is identified as the same event as a historical event in the event database, the occurrence time and volume fields of that historical event are updated; each time an event similar to the historical event is identified, its volume is increased by 1.
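The update step might look like the sketch below, with an in-memory dictionary standing in for the event database. The `EventRecord` class and `upsert_event` function are hypothetical. Two details are assumptions not stated in the text: a new event starts with a volume of 1, and the occurrence time of a merged event is overwritten with the time of the latest occurrence.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, Optional


@dataclass
class EventRecord:
    """Hypothetical database row for a stored event."""
    text: str
    occurrence_time: datetime
    volume: int = 1  # assumed starting value for a newly stored event


def upsert_event(db: Dict[str, EventRecord],
                 matched_id: Optional[str],
                 new_id: str,
                 text: str,
                 occurred_at: datetime) -> str:
    """Add the target event as a new record, or update the matched historical event."""
    if matched_id is None:
        db[new_id] = EventRecord(text, occurred_at)  # incremental insert
        return new_id
    record = db[matched_id]
    record.occurrence_time = occurred_at  # assumed policy: keep the latest time
    record.volume += 1                    # one more identification of this event
    return matched_id


db: Dict[str, EventRecord] = {}
first = upsert_event(db, None, "e1", "Gong X endorses Xu XX", datetime(2022, 7, 1))
second = upsert_event(db, "e1", "e2", "Gong X becomes Xu XX brand spokesperson",
                      datetime(2022, 7, 2))
print(db["e1"].volume)  # 2
```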
Compared with the prior art, the embodiments of the present invention have the following beneficial effects. In the event extraction stage, a probability threshold is set to filter out low-confidence events output by the event extraction model, and entity recognition is used to filter out events that are not of interest, which avoids erroneous extraction, incomplete extraction, and extraction of unimportant events by the event extraction model. In the event merging stage, cosine similarity and brand entities are used to judge whether events are the same event, and the brand entities are standardized, which prevents the same event from being extracted repeatedly. The accuracy of event extraction and of event merging is thereby effectively improved.
Example two
Referring to FIG. 4, a schematic diagram of an event processing device according to an embodiment of the present invention is shown. The device includes a processor 100 and a memory 200 for storing one or more computer programs, such as an event processing program. When the one or more computer programs are executed by the processor 100, the processor 100 implements the event processing method according to any one of the above embodiments, for example steps S1-S7 shown in FIG. 1, and achieves the same technical effects, which are not repeated here. Alternatively, the processor may implement the functions of the modules/units in the above-described device embodiments when executing the computer program.
For example, the computer program may be divided into one or more modules/units, which are stored in the memory and executed by the processor to implement the present invention. The one or more modules/units may be a series of computer program instruction segments capable of performing specific functions, and the instruction segments describe the execution of the computer program in the event processing device.
The event processing device may be a computing device such as a desktop computer, a notebook computer, a palmtop computer, or a cloud server. The event processing device may include, but is not limited to, a processor and a memory. It will be appreciated by those skilled in the art that the schematic diagram is merely an example of an event processing device and does not constitute a limitation of it; the device may include more or fewer components than illustrated, may combine certain components, or may use different components. For example, the event processing device may further include an input-output device, a network access device, a bus, and the like.
The processor may be a central processing unit (Central Processing Unit, CPU), but may also be other general purpose processors, digital signal processors (Digital Signal Processor, DSP), application specific integrated circuits (Application Specific Integrated Circuit, ASIC), off-the-shelf programmable gate arrays (Field-Programmable Gate Array, FPGA) or other programmable logic devices, discrete gate or transistor logic devices, discrete hardware components, or the like. A general purpose processor may be a microprocessor or the processor may be any conventional processor or the like that is a control center of the event processing device, connecting various parts of the overall event processing device using various interfaces and lines.
The memory may be used to store the computer program and/or modules, and the processor may implement various functions of the event processing device by executing or executing the computer program and/or modules stored in the memory, and invoking data stored in the memory. The memory may mainly include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program (such as a sound playing function, an image playing function, etc.) required for at least one function, and the like; the storage data area may store data (such as audio data, phonebook, etc.) created according to the use of the handset, etc. In addition, the memory may include high-speed random access memory, and may also include non-volatile memory, such as a hard disk, memory, plug-in hard disk, smart Media Card (SMC), secure Digital (SD) Card, flash Card (Flash Card), at least one disk storage device, flash memory device, or other volatile solid-state storage device.
Wherein the integrated modules/units of the event processing device may be stored in a computer readable storage medium if implemented in the form of software functional units and sold or used as a stand alone product. Based on such understanding, the present invention may implement all or part of the flow of the method of the above embodiment, or may be implemented by a computer program to instruct related hardware, where the computer program may be stored in a computer readable storage medium, and when the computer program is executed by a processor, the computer program may implement the steps of each of the method embodiments described above. Wherein the computer program comprises computer program code which may be in source code form, object code form, executable file or some intermediate form etc. The computer readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a U disk, a removable hard disk, a magnetic disk, an optical disk, a computer Memory, a Read-Only Memory (ROM), a random access Memory (RAM, random Access Memory), an electrical carrier signal, a telecommunications signal, a software distribution medium, and so forth.
Example three
The embodiment of the invention provides a computer readable storage medium storing a computer program, wherein, when the computer program runs, the device in which the computer readable storage medium is located is controlled to execute the event processing method according to any implementation of Example one, and the same technical effects can be achieved; to avoid repetition, the details are not described again here.
It should be noted that the above-described apparatus embodiments are merely illustrative. The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; they may be located in one place or distributed over a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. In addition, in the drawings of the device embodiments provided by the invention, a connection between modules indicates that the modules are communicatively connected, which may be implemented as one or more communication buses or signal lines. Those of ordinary skill in the art can understand and implement the present invention without creative effort.
While the foregoing describes the preferred embodiments of the present invention, it will be appreciated by those skilled in the art that modifications and variations may be made without departing from the spirit of the invention, and such modifications and variations are also considered to fall within the scope of protection of the invention.

Claims (3)

1. An event processing method, comprising:
acquiring text information, and carrying out event extraction on the text information by adopting an event extraction model to obtain event information; the event information comprises an event, an event type and probability of the event type; the event comprises three elements of a subject, a predicate and an object;
extracting the entity from the text information by adopting an entity identification model to obtain the entity in the text information;
determining a target event according to the event information and the entity; wherein the currently extracted entity is standardized by means of a preset normalization code table, the normalization code table comprising a key column recording standard names and a word column recording alternative names of the standard names; if a brand entity exists in the target event and appears in the word column of the normalization code table, the brand entity is replaced by the corresponding standard name in the key column;
calculating cosine similarity between the target event and each historical event in an event database, and selecting the first K historical events with highest cosine similarity from the event database;
judging whether the target event and any one of the first K historical events are the same event according to the cosine similarity of the selected first K historical events, the entities of the selected first K historical events and the entity of the target event;
if not, incrementally adding the target event to the event database;
if yes, updating the historical event in the event database that is the same event as the target event, that is, merging the target event with the historical event;
wherein updating the historical event in the event database that is the same event as the target event comprises the following steps:
for the historical event in the event database that is the same event as the target event, updating the fields of the historical event;
wherein the fields comprise the occurrence time and the sound volume of the corresponding event, and each time an event similar to the historical event is identified, the sound volume is incremented by 1;
the determining a target event according to the event information and the entity comprises the following steps:
judging whether the probability of the event type of the current extracted event is larger than a set probability threshold value;
if not, discarding the currently extracted event;
if yes, judging whether the entity exists in the currently extracted event;
when the entity exists in the current extracted event, outputting the current extracted event as a target event;
discarding the currently extracted event when the entity does not exist in the currently extracted event;
the calculating the cosine similarity between the target event and each historical event in the event database, and selecting the first K historical events with highest cosine similarity from the event database comprises the following steps:
inputting the target event into a vector model to obtain an event vector of the target event; wherein, the vector model is constructed by adopting a BERT model;
calculating cosine similarity between the event vector and each historical event in the event database; the cosine similarity distance calculation formula of the event vector and the historical event is as follows:
$$\cos(\theta) = \frac{\sum_{i=1}^{n} x_i y_i}{\sqrt{\sum_{i=1}^{n} x_i^2}\,\sqrt{\sum_{i=1}^{n} y_i^2}}$$
where n represents the dimension of the event vector, x_i represents the i-th component of the event vector corresponding to the target event, and y_i represents the i-th component of the event vector corresponding to the historical event;
selecting the first K historical events with highest cosine similarity from the event database;
the step of judging whether the target event and any one of the first K historical events are the same event according to the cosine similarity of the selected first K historical events, the entities of the selected first K historical events and the entity of the target event comprises the following steps:
for the first K historical events, judging whether cosine similarity between the ith historical event and the target event is larger than a preset similarity threshold value or not;
if not, determining that the target event and the ith historical event are not the same event;
if yes, judging whether the standardized entity is the same as the entity corresponding to the ith historical event;
when the standardized entity is different from the entity corresponding to the ith historical event, extracting the (i+1)th historical event and returning to the cosine similarity judging flow, where 1 ≤ i ≤ K-1;
when the standardized entity is the same as the entity corresponding to the ith historical event, inputting the target event and the ith historical event into an event similarity judgment model to obtain an event judgment result; wherein the event judgment result comprises the same event and not the same event;
before extracting the (i+1)th historical event, further comprising:
judging whether the ith historical event is the last historical event in the first K historical events or not;
if yes, determining that the target event and the ith historical event are not the same event;
if not, the (i+1) th historical event is extracted.
2. An event processing apparatus, comprising: a processor, a memory, and a computer program stored in the memory and configured to be executed by the processor, the processor implementing the event processing method of claim 1 when the computer program is executed.
3. A computer readable storage medium, wherein the computer readable storage medium stores a computer program, and wherein the computer program when run controls a device on which the computer readable storage medium resides to execute the event processing method according to claim 1.
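For illustration only, the merging flow recited in claim 1 (entity standardization, top-K cosine retrieval, entity comparison, model-based judgment, then update or insert) might be sketched in Python as follows. This is a sketch under stated assumptions, not the claimed implementation: the dictionary schema with `vector`, `entity`, `time` and `volume` keys, the function names, the initial volume of 1, overwriting the occurrence time with the latest one, and continuing to the next candidate when the judgment model answers "not the same event" are all assumptions. Event vectors are taken as given; the BERT-based vector model is not reproduced, and `is_same_event` is a hypothetical stand-in for the event similarity judgment model.

```python
import math
from typing import Callable, Dict, List, Optional, Sequence


def normalize_entity(entity: str, code_table: Dict[str, List[str]]) -> str:
    """Replace an alternative name (word column) by its standard name (key column)."""
    for standard_name, aliases in code_table.items():
        if entity in aliases:
            return standard_name
    return entity


def cosine_similarity(x: Sequence[float], y: Sequence[float]) -> float:
    dot = sum(a * b for a, b in zip(x, y))
    norm = math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in y))
    return dot / norm if norm else 0.0


def find_same_event(target: dict, history: List[dict], k: int, sim_threshold: float,
                    is_same_event: Callable[[dict, dict], bool]) -> Optional[dict]:
    """Return the historical event judged to be the same event, or None."""
    # Select the first K historical events with the highest cosine similarity.
    top_k = sorted(
        ((cosine_similarity(target["vector"], h["vector"]), h) for h in history),
        key=lambda pair: pair[0],
        reverse=True,
    )[:k]
    for sim, hist in top_k:
        # Candidates are in descending order, so once one fails the threshold
        # every remaining candidate fails it as well.
        if sim <= sim_threshold:
            break
        # Entities differ: move on to the next historical event.
        if hist["entity"] != target["entity"]:
            continue
        # Same entity: defer to the (hypothetical) similarity judgment model.
        if is_same_event(target, hist):
            return hist
    return None


def merge_or_insert(target: dict, history: List[dict], k: int, sim_threshold: float,
                    is_same_event: Callable[[dict, dict], bool]) -> Optional[dict]:
    match = find_same_event(target, history, k, sim_threshold, is_same_event)
    if match is None:
        # No matching historical event: add the target event as a new record.
        history.append({**target, "volume": 1})
    else:
        # Merge: update the fields of the matched historical event.
        match["volume"] += 1
        match["time"] = target["time"]
    return match
```

In this sketch the entity of the target event would be passed through `normalize_entity` before calling `merge_or_insert`, so that alternative brand names compare equal to the standard name stored with the historical events.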
CN202211533022.5A 2022-12-02 2022-12-02 Event processing method, device and computer readable storage medium Active CN115544214B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211533022.5A CN115544214B (en) 2022-12-02 2022-12-02 Event processing method, device and computer readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211533022.5A CN115544214B (en) 2022-12-02 2022-12-02 Event processing method, device and computer readable storage medium

Publications (2)

Publication Number Publication Date
CN115544214A CN115544214A (en) 2022-12-30
CN115544214B true CN115544214B (en) 2023-06-23

Family

ID=84722346

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211533022.5A Active CN115544214B (en) 2022-12-02 2022-12-02 Event processing method, device and computer readable storage medium

Country Status (1)

Country Link
CN (1) CN115544214B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104424281A (en) * 2013-08-30 2015-03-18 宏碁股份有限公司 Integration method and system of event
CN110399478A (en) * 2018-04-19 2019-11-01 清华大学 Event finds method and apparatus
CN115129882A (en) * 2022-05-19 2022-09-30 广州数说故事信息科技有限公司 Event context analysis method based on knowledge graph, storage medium and device

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112908488B (en) * 2021-02-09 2022-03-11 北京药明津石医药科技有限公司 Event recognition method and device, computer equipment and storage medium
CN114676346A (en) * 2022-03-17 2022-06-28 平安科技(深圳)有限公司 News event processing method and device, computer equipment and storage medium

Also Published As

Publication number Publication date
CN115544214A (en) 2022-12-30

Similar Documents

Publication Publication Date Title
CN111581976B (en) Medical term standardization method, device, computer equipment and storage medium
WO2017215370A1 (en) Method and apparatus for constructing decision model, computer device and storage device
CN107704512B (en) Financial product recommendation method based on social data, electronic device and medium
WO2019041521A1 (en) Apparatus and method for extracting user keyword, and computer-readable storage medium
WO2020237856A1 (en) Smart question and answer method and apparatus based on knowledge graph, and computer storage medium
WO2021114810A1 (en) Graph structure-based official document recommendation method, apparatus, computer device, and medium
CN111797210A (en) Information recommendation method, device and equipment based on user portrait and storage medium
US11321361B2 (en) Genealogical entity resolution system and method
CN111814770A (en) Content keyword extraction method of news video, terminal device and medium
US20190114711A1 (en) Financial analysis system and method for unstructured text data
CN109522397B (en) Information processing method and device
CN110427453B (en) Data similarity calculation method, device, computer equipment and storage medium
CN109710224B (en) Page processing method, device, equipment and storage medium
CN111414375A (en) Input recommendation method based on database query, electronic device and storage medium
CN110209780B (en) Question template generation method and device, server and storage medium
CN111191454A (en) Entity matching method and device
TW202123026A (en) Data archiving method, device, computer device and storage medium
WO2019085118A1 (en) Topic model-based associated word analysis method, and electronic apparatus and storage medium
US11803796B2 (en) System, method, electronic device, and storage medium for identifying risk event based on social information
CN115544214B (en) Event processing method, device and computer readable storage medium
WO2021012958A1 (en) Original text screening method, apparatus, device and computer-readable storage medium
CN114842982B (en) Knowledge expression method, device and system for medical information system
CN116450664A (en) Data processing method, device, equipment and storage medium
CN114547257B (en) Class matching method and device, computer equipment and storage medium
CN112069267A (en) Data processing method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant