CN115345152B - Template library updating method, report analyzing method, device, equipment and medium

Info

Publication number
CN115345152B
Authority
CN
China
Prior art keywords
entity
field
analyzed
report
semantic template
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202211276700.4A
Other languages
Chinese (zh)
Other versions
CN115345152A (en
Inventor
刘胜军
邓小宁
李凤荣
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
North Health Medical Big Data Technology Co ltd
Original Assignee
North Health Medical Big Data Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by North Health Medical Big Data Technology Co ltd
Priority to CN202211276700.4A
Publication of CN115345152A
Application granted
Publication of CN115345152B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/23 Updating
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G06F40/295 Named entity recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Machine Translation (AREA)

Abstract

The invention provides a template library updating method, a report analyzing method, a device, equipment and a medium, relating to the technical field of natural language processing. The method comprises the following steps: acquiring a first field to be analyzed; when any statement of the first field to be analyzed matches a first characteristic field in a semantic template library, analyzing the entity names in the first field to be analyzed according to the corresponding first semantic template; and generating a first flow regulation report result according to the first field to be analyzed, the entity position of each entity name in the first field to be analyzed and the entity type corresponding to each entity name. The semantic template library is updated by matching a second field to be analyzed of a second flow regulation report against the entity names in the first flow regulation report result and adding a second semantic template constructed from a second characteristic field. The method quickly identifies and extracts the entity content corresponding to each entity type, improves the accuracy of the flow regulation report result, is suitable for analyzing a large number of flow regulation reports, and improves analysis efficiency.

Description

Template library updating method, report analyzing method, device, equipment and medium
Technical Field
The invention relates to the technical field of natural language processing, in particular to a template library updating method, a report analyzing method, a device, equipment and a medium.
Background
At present, flow regulation reports are analyzed in one of two ways: manual analysis or model analysis. In manual analysis, the flow regulation reports are read one by one and key information such as time, place and people is extracted by hand. This approach is only suitable when the number of flow regulation personnel is small, and it cannot cope with the analysis of massive numbers of flow regulation reports.
Disclosure of Invention
The invention provides a template library updating method, a report analyzing method, a device, equipment and a medium, which are used for solving the technical problem of low analysis accuracy in the flow regulation report analysis process.
In a first aspect, the present invention provides a report parsing method, including:
acquiring at least one first to-be-analyzed field in a first flow regulation report;
under the condition that any statement of the first field to be analyzed is matched with a first feature field in a semantic template library, analyzing all entity names in the first field to be analyzed according to a first semantic template corresponding to the first feature field;
generating a first flow regulation report result according to the first field to be analyzed, the entity position of each entity name in the first field to be analyzed and the entity type corresponding to each entity name;
the report analysis method further comprises a step of updating the semantic template library, and the updating comprises: matching a second field to be analyzed in a second flow regulation report according to the entity names in the first flow regulation report result, and updating the semantic template library according to a second semantic template constructed from a second characteristic field;
the first characteristic field is the field that remains in the first to-be-analyzed field after all statements with entity type corresponding relations are removed; the second characteristic field is the field that remains in the second field to be analyzed after all statements with entity type corresponding relations are removed;
the first semantic template is an ordered set formed by first characteristic fields according to the corresponding relation between the entity type and the entity name; the second semantic template is an ordered set formed by second characteristic fields according to the corresponding relation between the entity type and the entity name.
According to the report parsing method provided by the invention, matching the second field to be analyzed in the second flow regulation report according to the entity names in the first flow regulation report result, and updating the semantic template library according to the second semantic template constructed from the second characteristic field, comprises the following steps:
constructing an entity relationship according to any entity name in the first field to be analyzed and the entity type corresponding to the entity name, and traversing all entity names in the first field to be analyzed to construct an entity relationship set corresponding to the first field to be analyzed;
expanding the entity relationship set to an entity relationship cluster to obtain an expanded entity relationship cluster;
traversing the entity name of each entity relationship set in the expanded entity relationship cluster, determining the entity type corresponding to the statement under the condition that the entity name is matched with any statement in a second field to be analyzed, and traversing all statements in the second field to be analyzed to determine the corresponding relation between each statement and the entity type;
after all sentences with entity type corresponding relations are eliminated, a second characteristic field is obtained, and a second semantic template is constructed according to the entity type and entity name corresponding relations in the second field to be analyzed and the second characteristic field;
updating the second semantic template to the semantic template library to obtain an updated semantic template library;
and the second field to be analyzed is determined after sentence segmentation of the second flow regulation report.
According to the report parsing method provided by the present invention, after sentence segmentation of the first flow regulation report, the method further includes:
obtaining at least one unresolved field;
under the condition that any statement of the unresolved field is matched with a second feature field in an updated semantic template library, extracting all entity names in the unresolved field according to a second semantic template corresponding to the second feature field;
and generating a second flow regulation report result according to the unresolved field, the entity position of each entity name in the unresolved field and the entity type corresponding to each entity name.
According to the report parsing method provided by the present invention, parsing all entity names in the first field to be parsed according to the first semantic template corresponding to the first characteristic field includes:
analyzing an entity name corresponding to the entity type according to the relative position of any entity type in the first semantic template and the first characteristic field;
and traversing all entity types until each entity type and the entity name corresponding to each entity type are obtained.
According to the report parsing method provided by the present invention, the generating a first flow report result according to the first field to be parsed, the entity location of each entity name in the first field to be parsed, and the entity type corresponding to each entity name includes:
determining the first field to be analyzed as a first format element;
generating an entity type number corresponding to each entity type according to the sequence of the entity types in the first field to be analyzed;
determining the corresponding relation between the entity type number and the entity name according to the entity type number corresponding to any entity type, the initial position of the entity name in the first field to be analyzed and the final position of the entity name in the first field to be analyzed;
traversing all entity type numbers until acquiring the corresponding relation between all the entity type numbers and the entity names corresponding to each entity type number, and determining the corresponding relation between all the entity type numbers and the entity names corresponding to each entity type number as a second format element according to the sequence of the entity type numbers;
and generating a first flow regulation report result according to the first format element and the second format element.
According to the report parsing method provided by the invention, before sentence segmentation of the first flow regulation report, the method further comprises the following steps:
performing sentence segmentation on a sample flow regulation report to obtain at least one third field to be analyzed;
and generating a sample report result according to the third field to be analyzed, the entity position of each entity name in the third field to be analyzed and the entity type corresponding to each entity name.
According to the report parsing method provided by the invention, after the sample report result is generated, the method further comprises the following steps:
generating the semantic template library;
determining a sample semantic template according to an ordered set formed by the event name, the entity type, the corresponding relation of the entity name and the third characteristic field in the sample report result, and inputting the sample semantic template into the semantic template library;
and the third characteristic field is a field in which all statements with entity type corresponding relations are removed from the third field to be analyzed.
According to the report parsing method provided by the invention, after the sample report result is generated, the method further comprises the following steps:
constructing an entity relationship according to any entity name in the sample report result and the entity type corresponding to the entity name, and traversing all the entity names in the third field to be analyzed to construct an entity relationship set corresponding to the third field to be analyzed;
and constructing an entity relation cluster according to the entity relation set.
In a second aspect, the present invention provides a template library updating method, including:
acquiring at least one first to-be-analyzed field in a first flow regulation report;
under the condition that any statement of the first field to be analyzed is matched with a first feature field in a semantic template library, analyzing all entity names in the first field to be analyzed according to a first semantic template corresponding to the first feature field;
generating a first flow regulation report result according to the first field to be analyzed, the entity position of each entity name in the first field to be analyzed and the entity type corresponding to each entity name;
matching a second field to be analyzed in a second flow regulation report according to the entity names in the first flow regulation report result, and updating the semantic template library according to a second semantic template constructed from a second characteristic field;
the first characteristic field is a field in the first to-be-analyzed field except all statements with entity type corresponding relations; the second characteristic field is a field in the second field to be analyzed, except all statements with entity type corresponding relations;
the first semantic template is an ordered set formed by first characteristic fields according to the corresponding relation between the entity type and the entity name; the second semantic template is an ordered set formed by second characteristic fields according to the corresponding relation between the entity type and the entity name.
According to the template library updating method provided by the invention, the step of matching a second field to be analyzed in a second flow regulation report according to the entity names in the first flow regulation report result and updating the semantic template library according to a second semantic template constructed from a second characteristic field comprises the following steps:
constructing an entity relationship according to any entity name in the first field to be resolved and the entity type corresponding to the entity name, and traversing all the entity names in the first field to be resolved to construct an entity relationship set corresponding to the first field to be resolved;
expanding the entity relationship set to an entity relationship cluster to obtain an expanded entity relationship cluster;
traversing the entity name of each entity relationship set in the expanded entity relationship cluster, determining the entity type corresponding to the statement under the condition that the entity name is matched with any statement in a second field to be analyzed, and traversing all statements in the second field to be analyzed to determine the corresponding relation between each statement and the entity type;
after all sentences with entity type corresponding relations are eliminated, a second characteristic field is obtained, and a second semantic template is constructed according to the entity type and entity name corresponding relations in the second field to be analyzed and the second characteristic field;
updating the second semantic template to the semantic template library to obtain an updated semantic template library;
and the second field to be analyzed is determined after sentence segmentation of the second flow regulation report.
In a third aspect, the present invention provides a report parsing apparatus, including:
an acquisition unit, configured to acquire at least one first to-be-analyzed field in a first flow regulation report;
an analysis unit, configured to, when any statement of the first field to be analyzed matches a first feature field in a semantic template library, analyze all entity names in the first field to be analyzed according to a first semantic template corresponding to the first feature field;
a generation unit, configured to generate a first flow regulation report result according to the first field to be analyzed, the entity position of each entity name in the first field to be analyzed and the entity type corresponding to each entity name;
the report parsing apparatus further includes an updating unit, configured to match a second field to be analyzed in a second flow regulation report according to the entity names in the first flow regulation report result and to update the semantic template library according to a second semantic template constructed from a second characteristic field;
the first characteristic field is a field in the first to-be-analyzed field except all statements with entity type corresponding relations; the second characteristic field is a field in the second field to be analyzed, except all statements with entity type corresponding relations;
the first semantic template is an ordered set formed by first characteristic fields according to the corresponding relation between the entity type and the entity name; the second semantic template is an ordered set formed by second characteristic fields according to the corresponding relation between the entity type and the entity name.
In a fourth aspect, an electronic device is further provided, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor, and the processor implements the report parsing method when executing the program.
In a fifth aspect, there is also provided a non-transitory computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements the report parsing method.
The beneficial effects of the invention are as follows: the invention provides a template library updating method, a report analyzing method, a device, equipment and a medium, which use the semantic template library to match a first field to be analyzed of a flow regulation report against the characteristic fields of the semantic template library, analyze all entity names from the first field to be analyzed, and thereby rapidly generate the flow regulation report result.
Drawings
In order to more clearly illustrate the technical solutions of the present invention or the prior art, the drawings needed for the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and those skilled in the art can also obtain other drawings according to the drawings without creative efforts.
FIG. 1 is a schematic flow chart of a report parsing method provided by the present invention;
FIG. 2 is a second schematic flow chart of a report parsing method according to the present invention;
FIG. 3 is a third schematic flowchart of a report parsing method provided by the present invention;
FIG. 4 is a schematic diagram of a process for resolving all entity names according to the present invention;
FIG. 5 is a schematic flow chart of generating a first flow regulation report result according to the present invention;
FIG. 6 is a fourth schematic flowchart of a report parsing method provided by the present invention;
FIG. 7 is a fifth flowchart illustrating a report parsing method according to the present invention;
FIG. 8 is a sixth schematic flow chart of a report parsing method according to the present invention;
FIG. 9 is a flow chart of a template library updating method provided by the present invention;
FIG. 10 is a second flowchart of the template library updating method provided by the present invention;
FIG. 11 is a seventh schematic flowchart of a report parsing method provided by the present invention;
FIG. 12 is a schematic structural diagram of a report parser according to the present invention;
FIG. 13 is a schematic structural diagram of an electronic device provided by the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the technical solutions of the present invention will be clearly and completely described below with reference to the accompanying drawings, and it is obvious that the described embodiments are some, but not all embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Fig. 1 is a schematic flow chart of a report parsing method provided by the present invention. Fig. 1 shows a specific embodiment of the present invention and only the basic implementation steps that solve the problem addressed by the present application; alternative or preferred embodiments are described in detail in the following examples on the basis of Fig. 1. Specifically, the present invention provides a flow regulation report parsing method, which comprises:
acquiring at least one first to-be-analyzed field in a first flow regulation report;
under the condition that any statement of the first field to be analyzed is matched with a first feature field in a semantic template library, analyzing all entity names in the first field to be analyzed according to a first semantic template corresponding to the first feature field;
generating a first flow regulation report result according to the first field to be analyzed, the entity position of each entity name in the first field to be analyzed and the entity type corresponding to each entity name;
the report analysis method further comprises a step of updating the semantic template library, and the updating comprises: matching a second field to be analyzed in a second flow regulation report according to the entity names in the first flow regulation report result, and updating the semantic template library according to a second semantic template constructed from a second characteristic field;
the first characteristic field is the field that remains in the first to-be-analyzed field after all statements with entity type corresponding relations are removed; the second characteristic field is the field that remains in the second field to be analyzed after all statements with entity type corresponding relations are removed;
the first semantic template is an ordered set formed by first characteristic fields according to the corresponding relation between the entity type and the entity name; the second semantic template is an ordered set formed by second characteristic fields according to the corresponding relation between the entity type and the entity name.
In step 101, the at least one first to-be-analyzed field in the first flow regulation report may be obtained by sentence segmentation of the first flow regulation report. Optionally, the first flow regulation report is a paragraph formed by multiple sentence fragments, and it is split at periods, semicolons, commas or other symbols to obtain multiple to-be-analyzed fields.
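The sentence segmentation of step 101 can be sketched as follows. This is a minimal illustration rather than the patented implementation: the function name `split_report` and the exact delimiter set (ASCII and full-width periods, semicolons and commas) are assumptions made for the example.

```python
import re

# Hypothetical delimiter set: ASCII and full-width period, semicolon, comma.
_DELIMITERS = r"[.;,。；，]"

def split_report(report: str) -> list[str]:
    """Split a report paragraph into to-be-analyzed fields."""
    parts = re.split(_DELIMITERS, report)
    # Drop empty fragments left by trailing or repeated delimiters.
    return [part.strip() for part in parts if part.strip()]

fields = split_report(
    "studying at Experiment Two before March 31; "
    "left from a certain vegetable market after 10 o'clock."
)
```

Splitting on every period is deliberately naive; text that contains decimal numbers or abbreviations would need a more careful rule.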
In step 102, when any statement in the first field to be analyzed matches a first feature field in the semantic template library, all entity names in the first field to be analyzed are analyzed according to the first semantic template corresponding to the first feature field. The matching statement may be, for example, "before" in "studying at Experiment Two before March 31", or "after" in "left from a certain vegetable market after 10 o'clock". The semantic template library contains multiple semantic templates. A semantic template is a template tool for extracting the entity information of the first field to be analyzed, and it records the feature field together with the templated corresponding relations between entity types and entity names.
Optionally, the first feature field is the field in the first semantic template that excludes the entity types, or an ordered combination of such fields. For the example "studying at Experiment Two before March 31", the field excluding the entity types in the first semantic template is "before". In another optional embodiment, if the first field to be analyzed is "studying at Experiment Two before March 31 and studying at Experiment Three after May 2", the ordered field combination in the corresponding semantic template is "before ... after".
Optionally, the first semantic template is an ordered set formed by the first feature field according to the event name and the corresponding relation between the entity type and the entity name, and the first semantic template may be defined in a fixed format. For example, if semantic templating is applied to the above "studying at Experiment Two before March 31", it may be defined as "event": in the above, "sample": ... If there is then a first field to be analyzed "studying at Experiment Four before October 1", the entity names "October 1", "Experiment Four" and "studying" in that field can be analyzed according to the first semantic template.
Compared with the broad entity extraction of the prior art, the method determines the time, the place, the people and the events in the flow regulation report in a more targeted and faster way, and other entity types do not need to be obtained, so that extraction efficiency and accuracy are improved.
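A semantic template of this kind can be sketched as an ordered sequence of entity-type slots and feature-field text that is compiled into a regular expression. The representation below, the names `TEMPLATE`, `template_to_regex` and `parse_field`, and the English word order are all assumptions made for illustration; the original works on Chinese text, where the order of the parts differs.

```python
import re

# Ordered parts: ("slot", entity_type) marks where an entity name is expected,
# ("feature", text) is literal feature-field text. Hypothetical English layout.
TEMPLATE = (
    ("slot", "event"),
    ("feature", "at"),
    ("slot", "location"),
    ("feature", "before"),
    ("slot", "time"),
)

def template_to_regex(template):
    """Compile an ordered template into a regular expression."""
    pieces = []
    for kind, value in template:
        if kind == "slot":
            # Lazy named group: captures the entity name for this entity type.
            pieces.append(f"(?P<{value}>.+?)")
        else:
            # Feature text must appear as a whole word between the slots.
            pieces.append(r"\s*\b" + re.escape(value) + r"\b\s*")
    return re.compile("^" + "".join(pieces) + "$")

def parse_field(field, library):
    """Return {entity_type: entity_name} from the first matching template."""
    for template in library:
        match = template_to_regex(template).match(field)
        if match:
            return {etype: match.group(etype)
                    for kind, etype in template if kind == "slot"}
    return None  # no template in the library matches this field

parsed = parse_field("studying at Experiment Four before October 1", [TEMPLATE])
```

Because each slot becomes a named group, this sketch does not support a template in which the same entity type occurs twice; that case would need numbered slots.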
In step 103, a first flow regulation report result is generated according to the first field to be analyzed, the entity position of each entity name in the first field to be analyzed, and the entity type corresponding to each entity name. All entity types and entity contents in the first field to be analyzed are determined first, and the first flow regulation report result is then generated in a fixed format, which may be: {"text": flow regulation text statement, "entity_list": [{"id": entity id, "si": entity start position, "ei": entity end position, "et": entity type (time/place/people/event)}]}. The entity start position and the entity end position are the positions of the Chinese characters of each entity content, counted from left to right in the original flow regulation sentence.
Specifically, taking "studying at Experiment Two before March 31" as an example, the entity type "time" corresponds to the entity content "March 31", the entity type "location" corresponds to the entity content "Experiment Two", and the entity type "event" corresponds to the entity content "studying". According to these corresponding relations, the first flow regulation report result is generated as follows, with positions referring to the characters of the original Chinese sentence: {"text": "studying at Experiment Two before March 31", "entity_list": [{"id": 1, "si": 0, "ei": 5, "et": time}, {"id": 2, "si": 8, "ei": 12, "et": location}, {"id": 3, "si": 13, "ei": 15, "et": event}]}. In the present application, the flow regulation report is analyzed into this fixed format to facilitate subsequent flow regulation tracing and to allow relevant personnel to retrieve keyword information in the flow regulation report.
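Generating a result in this fixed format can be sketched as below. The function name `build_result` is hypothetical, and the position convention (0-based start, exclusive end, counted on the English rendering of the sentence) is an assumption, so the numbers differ from those of the Chinese original.

```python
import json

def build_result(text, entities):
    """Build the fixed-format result.

    entities: list of (entity_name, entity_type) in order of appearance.
    """
    entity_list = []
    cursor = 0
    for entity_id, (name, etype) in enumerate(entities, start=1):
        # Search from the end of the previous entity so that repeated
        # substrings resolve to the occurrence that follows it.
        si = text.index(name, cursor)
        ei = si + len(name)
        entity_list.append({"id": entity_id, "si": si, "ei": ei, "et": etype})
        cursor = ei
    return {"text": text, "entity_list": entity_list}

result = build_result(
    "studying at Experiment Two before March 31",
    [("studying", "event"), ("Experiment Two", "location"), ("March 31", "time")],
)
print(json.dumps(result, ensure_ascii=False))
```

Entity ids follow the order in which the entities occur in the field, which mirrors the entity type numbering described in the disclosure.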
The report analysis method further comprises a step of updating the semantic template library, and the updating comprises: matching a second field to be analyzed in a second flow regulation report according to the entity names in the first flow regulation report result, and updating the semantic template library according to a second semantic template constructed from a second characteristic field. Specifically: an entity relationship is constructed according to any entity name in the first field to be analyzed and the entity type corresponding to that entity name, and all entity names in the first field to be analyzed are traversed to construct the entity relationship set corresponding to the first field to be analyzed; the entity relationship set is added to the entity relationship cluster to obtain an expanded entity relationship cluster; the entity names of each entity relationship set in the expanded entity relationship cluster are traversed, the entity type corresponding to a statement is determined when an entity name matches any statement of the second field to be analyzed, and all statements in the second field to be analyzed are traversed to determine the corresponding relation between each statement and an entity type; after all statements with entity type corresponding relations are removed, the second characteristic field is obtained, and the second semantic template is constructed according to the corresponding relations between entity types and entity names in the second field to be analyzed and the second characteristic field; the second semantic template is added to the semantic template library to obtain an updated semantic template library. The second field to be analyzed is determined after sentence segmentation of the second flow regulation report.
The invention provides a template library updating method, a report analyzing device, equipment and a medium, wherein a preset semantic template library is used for matching a first to-be-analyzed field of a flow regulation report with a characteristic field of the semantic template library, all entity names are analyzed from the first to-be-analyzed field, and the flow regulation report is rapidly generated.
Fig. 2 is a second schematic flow chart of the report parsing method provided by the present invention. After the first flow regulation report result is generated, in order to obtain a more accurate entity extraction result from the flow regulation report and to enrich the diversity of the parsed content, the present invention can also match different semantic templates by expanding the entity relationship cluster and the semantic template library, thereby realizing more diverse entity content extraction. Following a heuristic algorithm, the present invention continuously enriches the entity relationship cluster and the semantic template library during entity extraction, including:
constructing an entity relationship according to any entity name in the first field to be analyzed and the entity type corresponding to that entity name, and traversing all entity names in the first field to be analyzed to construct an entity relationship set corresponding to the first field to be analyzed;
expanding the entity relationship set into an entity relationship cluster to obtain an expanded entity relationship cluster;
traversing the entity names of each entity relationship set in the expanded entity relationship cluster, determining, where an entity name matches any statement of a second field to be analyzed, the entity type corresponding to that statement, and traversing all statements in the second field to be analyzed to determine the correspondence between each statement and an entity type;
obtaining a second feature field after removing all statements that have an entity type correspondence, and constructing a second semantic template according to the event name of the second field to be analyzed, the correspondence between entity types and entity names, and the second feature field;
adding the second semantic template to the semantic template library to obtain an updated semantic template library;
wherein the second field to be analyzed is determined by splitting the second flow regulation report into sentences.
In step 201, an entity relationship is constructed according to any entity name in the first field to be analyzed and the entity type corresponding to that entity name, and all entity names in the first field to be analyzed are traversed to construct the entity relationship set corresponding to the first field to be analyzed. Taking the embodiment "study in experiment two before March 31" as an example, the entity type "time" corresponds to the entity content "March 31", the entity type "location" corresponds to the entity content "experiment two", and the entity type "event" corresponds to the entity content "study". The entity relationship between an entity name and an entity type can be represented in a preset format; optionally, the preset format is {"en": entity name, "et": entity type}. Traversing all entity names in the first field to be analyzed then yields the entity relationship set {"en": "March 31", "et": "time"}, {"en": "experiment two", "et": "location"}, {"en": "study", "et": "event"}.
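The construction of the entity relationship set in step 201 can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `build_relation_set` and the English entity strings are assumptions chosen for readability, while the `"en"`/`"et"` keys follow the preset format described above.

```python
from typing import Dict, List, Tuple

EntityRelation = Dict[str, str]


def build_relation_set(entities: List[Tuple[str, str]]) -> List[EntityRelation]:
    """Turn (entity name, entity type) pairs into {"en": ..., "et": ...} records."""
    return [{"en": name, "et": etype} for name, etype in entities]


# Hypothetical entities of one first field to be analyzed.
relation_set = build_relation_set([
    ("March 31", "time"),
    ("experiment two", "location"),
    ("study", "event"),
])
```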
In step 202, the entity relationship set is expanded into the entity relationship cluster to obtain the expanded entity relationship cluster. The entity relationship cluster is a collection of entity relationships, each constructed from an entity name and its entity type; adding the entity relationship set to it enriches the cluster and forms the expanded entity relationship cluster.
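Expanding the cluster can be pictured as a merge of the new relationship set into the existing collection. The sketch below assumes the cluster is a plain list of `{"en", "et"}` records and that duplicates are skipped; neither the list representation nor the de-duplication rule is stated in the text, and `expand_cluster` is a hypothetical name.

```python
def expand_cluster(cluster, relation_set):
    """Return a new cluster containing the old relations plus any new ones."""
    seen = {(rel["en"], rel["et"]) for rel in cluster}
    expanded = list(cluster)
    for rel in relation_set:
        key = (rel["en"], rel["et"])
        if key not in seen:  # assumption: identical relations are stored once
            seen.add(key)
            expanded.append(dict(rel))
    return expanded


cluster = [{"en": "study", "et": "event"}]
new_set = [
    {"en": "March 31", "et": "time"},
    {"en": "study", "et": "event"},
]
expanded_cluster = expand_cluster(cluster, new_set)
```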
In step 203, the entity names of each entity relationship set in the expanded entity relationship cluster are traversed; where an entity name matches any statement of the second field to be analyzed, the entity type corresponding to that statement is determined, and all statements in the second field to be analyzed are traversed to determine the correspondence between each statement and an entity type. In this embodiment, the second field to be analyzed is determined by splitting the second flow regulation report into sentences, in the same way as the first flow regulation report is processed, and at least one second field to be analyzed is obtained. Because the entity relationship cluster has been expanded, more entity names can be identified. For example, once the entity relationship set {"en": "March 31", "et": "time"}, {"en": "experiment two", "et": "location"}, {"en": "study", "et": "event"} has been expanded into the entity relationship cluster, and assuming the second field to be analyzed is "study in experiment two after March 31", the correspondence between each statement and an entity type can be determined from the entity relationship cluster.
In step 204, after all statements that have an entity type correspondence are removed, the second feature field is obtained, and the second semantic template is constructed according to the event name of the second field to be analyzed, the correspondence between entity types and entity names, and the second feature field. That is, the statements that have an entity type correspondence, namely {"en": "March 31", "et": "time"}, {"en": "experiment two", "et": "location"}, {"en": "study", "et": "event"}, are removed from "study in experiment two after March 31", and the second feature field obtained is "after". The second semantic template is then constructed from the event name, the entity types, the correspondence of entity names, and the second feature field; specifically, the second semantic template may be represented as {"event": "study", "sample": "<time> after <location> <event>"}. Relative to "before" in the first semantic template, "after" in the second semantic template forms a newly added semantic template, namely the second semantic template obtained through heuristic learning.
In step 205, the second semantic template is added to the semantic template library to obtain the updated semantic template library. The semantic template library before the update does not contain the second semantic template with the second feature field "after"; it contains only the first semantic template with the first feature field "before". Adding the second semantic template yields a richer semantic template library, so that when a statement containing "after" appears in later processing of flow regulation reports, entities can be extracted accurately from such a statement.
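Steps 203 and 204 can be illustrated with a short sketch that replaces every known entity name in the second field with a `<type>` placeholder and treats whatever text is left as the second feature field. The function name `build_template`, the English stand-in sentence (written in the template's time, feature, location, event order), and the longest-name-first replacement order are all assumptions for illustration only.

```python
import re


def build_template(field, cluster):
    """Return (semantic template, feature field) for one field to be analyzed."""
    sample = field
    matched = []
    # Replace longer names first so a short name cannot split a longer one.
    for rel in sorted(cluster, key=lambda r: len(r["en"]), reverse=True):
        if rel["en"] in sample:
            sample = sample.replace(rel["en"], "<" + rel["et"] + ">", 1)
            matched.append(rel)
    # The feature field is the text that remains once placeholders are dropped.
    feature = " ".join(re.sub(r"<[^<>]+>", " ", sample).split())
    event = next((r["en"] for r in matched if r["et"] == "event"), None)
    return {"event": event, "sample": sample}, feature


cluster = [
    {"en": "March 31", "et": "time"},
    {"en": "experiment two", "et": "location"},
    {"en": "study", "et": "event"},
]
template, feature_field = build_template(
    "March 31 after experiment two study", cluster
)
```

With this hypothetical input the remaining text is "after", which plays the role of the second feature field.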
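The library update in step 205 might be modelled as adding one entry to a mapping from feature field to semantic template. Keying the library by feature field is an assumption made here for illustration, as is the name `update_library`; the text only requires that a matched feature field lead to its corresponding template.

```python
def update_library(library, feature_field, template):
    """Return a new library that also maps feature_field to template.

    An existing entry for the same feature field is kept unchanged.
    """
    updated = dict(library)
    updated.setdefault(feature_field, template)
    return updated


library = {
    "before": {"event": "study", "sample": "<time> before <location> <event>"},
}
second_template = {"event": "study", "sample": "<time> after <location> <event>"}
updated_library = update_library(library, "after", second_template)
```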
Fig. 3 is a third schematic flow chart of the report parsing method provided by the present invention. After the first flow regulation report is split into sentences, the method further includes:
obtaining at least one unresolved field;
under the condition that any statement of the unresolved field is matched with a second feature field in an updated semantic template library, extracting all entity names in the unresolved field according to a second semantic template corresponding to the second feature field;
and generating a second flow regulation report result according to the unresolved field, the entity position of each entity name in the unresolved field and the entity type corresponding to each entity name.
After the entity relationship cluster and the semantic template library are updated, the method is applicable to entity extraction from a wider range of flow regulation reports, specifically:
In step 301, at least one unresolved field is obtained. Suppose the first fields to be analyzed include "study in experiment two before March 31" and "study in experiment five after October 1". With the semantic template library before the update, which has no second feature field "after", only "study in experiment two before March 31" is matched successfully and undergoes entity extraction; "study in experiment five after October 1" remains an unresolved field.
In step 302, where any statement of the unresolved field matches a second feature field in the updated semantic template library, all entity names in the unresolved field are extracted according to the second semantic template corresponding to that second feature field. The field "study in experiment five after October 1" could not be matched before the update; because the semantic template library has been updated, its "after" now matches the second feature field "after", and all entity names in the field can be extracted according to the corresponding second semantic template, namely the entity names "October 1", "experiment five" and "study".
In step 303, a second flow regulation report result is generated according to the unresolved field, the entity position of each entity name in the unresolved field, and the entity type corresponding to each entity name. The second flow regulation report result is: {"text": "study in experiment five after October 1", "entity_list": [{"id": 1, "si": 0, "ei": 5, "et": "time"}, {"id": 2, "si": 8, "ei": 12, "et": "location"}, {"id": 3, "si": 13, "ei": 15, "et": "event"}]}.
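The two-pass behaviour of steps 301 to 303 can be sketched as follows: fields whose text contains no known feature field are set aside as unresolved, and are retried once the library has gained a new template. The substring test, the function name `split_by_match`, and the English stand-in sentences are assumptions; the library is reduced to a feature-field-to-template mapping for brevity.

```python
def split_by_match(fields, library):
    """Separate fields that contain a known feature field from those that do not."""
    resolved, unresolved = [], []
    for field in fields:
        if any(feature in field for feature in library):
            resolved.append(field)
        else:
            unresolved.append(field)
    return resolved, unresolved


fields = [
    "March 31 before experiment two study",
    "October 1 after experiment five study",
]
library = {"before": "<time> before <location> <event>"}

# First pass: only the "before" sentence can be matched.
resolved, unresolved = split_by_match(fields, library)

# After the library update, the unresolved field is retried.
library["after"] = "<time> after <location> <event>"
resolved_later, still_unresolved = split_by_match(unresolved, library)
```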
Fig. 4 is a schematic flow chart of parsing all entity names provided in the present invention, where parsing all entity names in the first field to be parsed according to the first semantic template corresponding to the first feature field includes:
analyzing an entity name corresponding to the entity type according to the relative position of any entity type in the first semantic template and the first characteristic field;
and traversing all entity types until each entity type and the entity name corresponding to each entity type are obtained.
In step 1021, the first semantic template may be defined in a fixed format, e.g., {"event": "study", "sample": "<time> before <location> <event>"}. In this template, the entity type "time" is positioned before the first feature field "before", and "location" and "event" are positioned after it. When parsing the first field to be analyzed, assume it is "class in experiment one before September 8"; "September 8", "experiment one" and "class" are then parsed in order according to the positions of time, location and event.
In step 1022, in an optional embodiment, the first semantic template is traversed to extract the statements in the first field to be analyzed other than the first feature field, yielding the entity name corresponding to each entity type. In another optional embodiment, those statements may instead be extracted according to the correspondence between entity names and entity types in the entity relationship cluster, yielding each entity type and its corresponding entity name.
Fig. 5 is a schematic flow chart of generating a first flow regulation report result according to the present invention, where generating the first flow regulation report result according to the first field to be analyzed, the entity position of each entity name in the first field to be analyzed, and the entity type corresponding to each entity name includes:
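Position-based parsing with a semantic template can be sketched by compiling the template into a regular expression with one named group per entity type. This is an illustrative approach rather than the method prescribed by the text. It assumes that entity types are valid Python identifiers, that the template's `event` value is the literal event name expected in the field (which removes the ambiguity between the adjacent `<location>` and `<event>` slots), and it uses an English stand-in sentence; `compile_template` and `parse_field` are hypothetical names.

```python
import re


def compile_template(template):
    """Compile {"event": ..., "sample": ...} into a regex with named groups."""
    pattern = ""
    for part in re.split(r"(<[^<>]+>)", template["sample"]):
        if part == "<event>":
            # Assumption: the event slot must equal the template's event name.
            pattern += "(?P<event>" + re.escape(template["event"]) + ")"
        elif part.startswith("<") and part.endswith(">"):
            pattern += "(?P<" + part[1:-1] + ">.+?)"
        else:
            pattern += re.escape(part)  # literal text such as the feature field
    return re.compile("^" + pattern + "$")


def parse_field(field, template):
    """Return {entity type: entity name}, or None if the field does not match."""
    match = compile_template(template).match(field)
    return match.groupdict() if match else None


first_template = {"event": "study", "sample": "<time> before <location> <event>"}
parsed = parse_field("September 8 before experiment one study", first_template)
unmatched = parse_field("September 8 after experiment one study", first_template)
```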
determining the first field to be analyzed as a first format element;
generating an entity type number corresponding to each entity type according to the sequence of the entity types in the first field to be analyzed;
determining the corresponding relation between the entity type number and the entity name according to the entity type number corresponding to any entity type, the initial position of the entity name in the first field to be resolved and the terminal position of the entity name in the first field to be resolved;
traversing all entity type numbers until acquiring the corresponding relation between all the entity type numbers and the entity names corresponding to each entity type number, and determining the corresponding relation between all the entity type numbers and the entity names corresponding to each entity type number as a second format element according to the sequence of the entity type numbers;
and generating a first flow regulation report result according to the first format element and the second format element.
In step 1031, the first field to be analyzed is determined as the first format element. In an optional embodiment, the first field to be analyzed is "study in experiment two before March 31", and the first format element is then the event statement "study in experiment two before March 31". In the first flow regulation report result only position information needs to be marked, because the position and extent of the entity name corresponding to each entity type can be recovered by combining that information with the first format element.
In step 1032, an entity type number corresponding to each entity type is generated according to a sequence of the entity type in the first field to be analyzed, for example, the entity type includes time, place, and event, and each entity type is numbered according to the sequence, for example, time is 1, place is 2, and event is 3.
In step 1033, the correspondence between an entity type number and an entity name is determined according to the entity type number of any entity type, the start position of the entity name in the first field to be analyzed, and the end position of the entity name in the first field to be analyzed. For example, "March 31" corresponds to the entity type "time"; the field "study in experiment two before March 31" contains 14 characters in total (counted in the original-language text), of which "March 31" occupies the first 5, so the start position of the entity name "March 31" is 1 and its end position is 5. Correspondingly, the entity name "experiment two" has start position 9 and end position 12, and the entity name "study" has start position 13 and end position 14.
In step 1034, all entity type numbers are traversed until the correspondence between every entity type number and its entity name is obtained, and these correspondences, ordered by entity type number, are determined as the second format element. Each entity name has a start position, an end position, and an entity type number. Continuing the above embodiment, the second format element may be: [{"id": 1, "si": 0, "ei": 5, "et": "time"}, {"id": 2, "si": 9, "ei": 12, "et": "location"}, {"id": 3, "si": 13, "ei": 14, "et": "event"}].
In step 1035, the first flow regulation report result is generated according to the first format element and the second format element. Combining step 1031 and step 1034, in an optional embodiment, the first format element is "study in experiment two before March 31" and the second format element is [{"id": 1, "si": 0, "ei": 5, "et": "time"}, {"id": 2, "si": 9, "ei": 12, "et": "location"}, {"id": 3, "si": 13, "ei": 14, "et": "event"}], so the first flow regulation report result is {"text": "study in experiment two before March 31", "entity_list": [{"id": 1, "si": 0, "ei": 5, "et": "time"}, {"id": 2, "si": 9, "ei": 12, "et": "location"}, {"id": 3, "si": 13, "ei": 14, "et": "event"}]}.
Fig. 6 is a fourth schematic flow chart of the report parsing method provided by the present invention. Before the first flow regulation report is split into sentences, the method further includes:
splitting the sample flow regulation report into sentences to obtain at least one third field to be analyzed;
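Steps 1031 to 1035 can be sketched as a function that records the field as `text` and the numbered entity spans as `entity_list`. The sketch assumes Python slice offsets (0-based start, exclusive end) computed on an English stand-in sentence, so its numbers differ from the positions quoted above, which were counted on the original-language text. `build_result` is a hypothetical name, and entities are assumed to be supplied in order of appearance.

```python
def build_result(field, entities):
    """Build {"text": ..., "entity_list": [...]} from ordered (name, type) pairs."""
    entity_list = []
    cursor = 0
    for number, (name, etype) in enumerate(entities, start=1):
        start = field.index(name, cursor)  # search only after the previous entity
        end = start + len(name)
        entity_list.append({"id": number, "si": start, "ei": end, "et": etype})
        cursor = end
    return {"text": field, "entity_list": entity_list}


result = build_result(
    "March 31 before experiment two study",
    [("March 31", "time"), ("experiment two", "location"), ("study", "event")],
)
```

Because only offsets are stored, each entity name can be recovered by slicing `result["text"]` with its `si` and `ei` values.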
and generating a sample report result according to the third field to be analyzed, the entity position of each entity name in the third field to be analyzed and the entity type corresponding to each entity name.
In step 401, the present invention aims to label only a small number of samples and then, in the subsequent analysis of flow regulation reports, to keep constructing new entity relationships and new semantic templates according to the entity relationship cluster and the semantic template library, expanding the new entity relationships into the entity relationship cluster and the new semantic templates into the semantic template library.
Before the first flow regulation report is processed, the sample flow regulation report needs to be preprocessed, and the construction rule for generating the sample report result is defined on the third field to be analyzed determined from the sample flow regulation report.
In step 402, the sample report result is generated according to the third field to be analyzed, the entity position of each entity name in the third field to be analyzed, and the entity type corresponding to each entity name. In an optional embodiment, the sample report result is generated by filling in, according to the format of the sample report result, the entity position of each entity name in the third field to be analyzed under the corresponding entity type. This finally forms the sample report result and lays the foundation for subsequently building the semantic template library and the entity relationship cluster.
Fig. 7 is a fifth schematic flow chart of the report parsing method provided in the present invention, and after generating a sample report result, the method further includes:
generating the semantic template library;
determining a sample semantic template according to an ordered set formed by the event name, the entity type, the corresponding relation of the entity name and the third characteristic field in the sample report result, and inputting the sample semantic template into the semantic template library;
and the third characteristic field is a field in which all statements with entity type corresponding relations are removed from the third field to be analyzed.
In step 501, the semantic template library is generated, and after the semantic template library is constructed, no corresponding relation between any entity type and any entity name exists in the semantic template library.
In step 502, the semantic template library is the basis for parsing flow regulation reports, and its richness determines the diversity of what can be parsed from them. The sample semantic template is extracted from the ordered set formed by the event name, the entity types, the correspondence of entity names, and the third feature field in the sample report result, so as to enrich the semantic template library. A single semantic template is extracted as follows: first, according to the content of each sample's entity list "entity_list", the entity positions of the sample are converted to generate a sample semantic template; event information is then added to the sample semantic template, which is stored in the form {"event": event, "sample": template}. Different flow regulation reports correspond to different semantic templates. If the sample report result is {"text": "study in experiment two before March 31", "entity_list": [{"id": 1, "si": 0, "ei": 5, "et": "time"}, {"id": 2, "si": 9, "ei": 12, "et": "location"}, {"id": 3, "si": 13, "ei": 14, "et": "event"}]}, the sample semantic template can be defined as {"event": "study", "sample": "<time> before <location> <event>"}. The third feature field is what remains of the third field to be analyzed after all statements that have an entity type correspondence are removed; optionally, the third feature field is "before".
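The extraction of a single sample semantic template can be sketched by walking the `entity_list` spans and substituting each span with its `<type>` placeholder. The sketch assumes Python slice offsets on an English stand-in sentence rather than the positions quoted above, and `extract_template` is a hypothetical name.

```python
def extract_template(result):
    """Return (sample semantic template, feature field) from a sample report result."""
    text = result["text"]
    pieces, gaps, event, cursor = [], [], None, 0
    for ent in sorted(result["entity_list"], key=lambda e: e["si"]):
        gap = text[cursor:ent["si"]]  # text between the previous entity and this one
        gaps.append(gap)
        pieces.append(gap + "<" + ent["et"] + ">")
        if ent["et"] == "event":
            event = text[ent["si"]:ent["ei"]]
        cursor = ent["ei"]
    tail = text[cursor:]
    gaps.append(tail)
    pieces.append(tail)
    # The feature field is what is left once every entity span is removed.
    feature_field = " ".join("".join(gaps).split())
    return {"event": event, "sample": "".join(pieces)}, feature_field


sample_result = {
    "text": "March 31 before experiment two study",
    "entity_list": [
        {"id": 1, "si": 0, "ei": 8, "et": "time"},
        {"id": 2, "si": 16, "ei": 30, "et": "location"},
        {"id": 3, "si": 31, "ei": 36, "et": "event"},
    ],
}
sample_template, third_feature_field = extract_template(sample_result)
```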
Fig. 8 is a sixth schematic flowchart of the report parsing method provided in the present invention, and after generating a sample report result, the method further includes:
constructing an entity relationship according to any entity name in the sample report result and the entity type corresponding to the entity name, and traversing all the entity names in the third field to be analyzed to construct an entity relationship set corresponding to the third field to be analyzed;
and constructing an entity relation cluster according to the entity relation set.
In step 601, an entity relationship is constructed according to any entity name in the sample report result and the entity type corresponding to that entity name, and all entity names in the third field to be analyzed are traversed to construct the entity relationship set corresponding to the third field to be analyzed. Continuing the embodiments of Fig. 6 and Fig. 7, if the sample report result is {"text": "study in experiment two before March 31", "entity_list": [{"id": 1, "si": 0, "ei": 5, "et": "time"}, {"id": 2, "si": 9, "ei": 12, "et": "location"}, {"id": 3, "si": 13, "ei": 14, "et": "event"}]}, then the entity type "time" corresponds to the entity content "March 31", the entity type "location" corresponds to the entity content "experiment two", and the entity type "event" corresponds to the entity content "study". Traversing all entity names in the third field to be analyzed yields the entity relationship set {"en": "March 31", "et": "time"}, {"en": "experiment two", "et": "location"}, {"en": "study", "et": "event"}.
In step 602, the invention constructs an entity relationship set according to the sample report result, constructs an entity relationship cluster according to the entity relationship set, and continuously expands the entity relationship cluster by acquiring new entity relationships in the analysis process of the subsequent flow regulation report.
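Steps 601 and 602 can be sketched as slicing each entity name out of the sample report result and collecting the resulting relations into an initial cluster. As before, the offsets are Python slice offsets on an English stand-in sentence, the cluster is assumed to be a de-duplicated list, and `relations_from_result` and `build_cluster` are hypothetical names.

```python
def relations_from_result(result):
    """Recover {"en": ..., "et": ...} relations from a report result."""
    text = result["text"]
    return [
        {"en": text[ent["si"]:ent["ei"]], "et": ent["et"]}
        for ent in result["entity_list"]
    ]


def build_cluster(results):
    """Collect the relations of several report results, skipping duplicates."""
    cluster = []
    for result in results:
        for rel in relations_from_result(result):
            if rel not in cluster:
                cluster.append(rel)
    return cluster


sample_result = {
    "text": "March 31 before experiment two study",
    "entity_list": [
        {"id": 1, "si": 0, "ei": 8, "et": "time"},
        {"id": 2, "si": 16, "ei": 30, "et": "location"},
        {"id": 3, "si": 31, "ei": 36, "et": "event"},
    ],
}
# Passing the same result twice shows that repeated relations are stored once.
seed_cluster = build_cluster([sample_result, sample_result])
```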
Fig. 9 is a schematic flow diagram of a method for updating a template library according to the present invention, and the present invention further provides a method for updating a semantic template library, including:
acquiring at least one first to-be-analyzed field in a first flow regulation report;
under the condition that any statement of the first field to be analyzed is matched with a first feature field in a semantic template library, analyzing all entity names in the first field to be analyzed according to a first semantic template corresponding to the first feature field;
generating a first flow regulation report result according to the first field to be analyzed, the entity position of each entity name in the first field to be analyzed and the entity type corresponding to each entity name;
matching a second field to be analyzed in a second flow regulation report according to the entity name in the first flow regulation report result, and updating the semantic template library according to a second semantic template constructed by a second characteristic field;
the first feature field is what remains of the first field to be analyzed after all statements that have an entity type correspondence are removed; the second feature field is what remains of the second field to be analyzed after all statements that have an entity type correspondence are removed;
the first semantic template is an ordered set formed from the correspondence between entity types and entity names and the first feature field; the second semantic template is an ordered set formed from the correspondence between entity types and entity names and the second feature field.
In step 701, at least one first field to be analyzed in a first flow regulation report is obtained. In step 702, where any statement of the first field to be analyzed matches a first feature field in the semantic template library, all entity names in the first field to be analyzed are parsed according to the first semantic template corresponding to that first feature field. In step 703, a first flow regulation report result is generated according to the first field to be analyzed, the entity position of each entity name in the first field to be analyzed, and the entity type corresponding to each entity name. In step 704, a second field to be analyzed in a second flow regulation report is matched according to the entity names in the first flow regulation report result, and the semantic template library is updated according to a second semantic template constructed from a second feature field. The first feature field is what remains of the first field to be analyzed after all statements that have an entity type correspondence are removed; the second feature field is what remains of the second field to be analyzed after all statements that have an entity type correspondence are removed. The first semantic template is an ordered set formed from the correspondence between entity types and entity names and the first feature field; the second semantic template is an ordered set formed from the correspondence between entity types and entity names and the second feature field.
For steps 701 to 704, reference may be made to steps 101 to 103 above, which are not described again here.
Fig. 10 is a second schematic flow chart of the template library updating method provided by the present invention, where matching the second field to be analyzed in the second flow regulation report according to the entity names in the first flow regulation report result, and updating the semantic template library according to the second semantic template constructed from the second feature field, include:
constructing an entity relationship according to any entity name in the first field to be analyzed and the entity type corresponding to that entity name, and traversing all entity names in the first field to be analyzed to construct an entity relationship set corresponding to the first field to be analyzed;
expanding the entity relationship set into an entity relationship cluster to obtain an expanded entity relationship cluster;
traversing the entity names of each entity relationship set in the expanded entity relationship cluster, determining, where an entity name matches any statement of a second field to be analyzed, the entity type corresponding to that statement, and traversing all statements in the second field to be analyzed to determine the correspondence between each statement and an entity type;
obtaining a second feature field after removing all statements that have an entity type correspondence, and constructing a second semantic template according to the correspondence between entity types and entity names in the second field to be analyzed and the second feature field;
adding the second semantic template to the semantic template library to obtain an updated semantic template library;
wherein the second field to be analyzed is determined by splitting the second flow regulation report into sentences.
As shown in fig. 10, step 801 may refer to step 201, step 802 may refer to step 202, step 803 may refer to step 203, step 804 may refer to step 204, and step 805 may refer to step 205, which are not described herein again.
Fig. 11 is a seventh schematic flow chart of the report parsing method provided by the present invention. Flow regulation reports are extracted from a flow regulation report library, semantic templates are extracted by labeling entity types, and an initial semantic template library is constructed. Flow regulation reports are then preprocessed and parsed: during parsing, each report is analyzed against the initial semantic template library, semantic templates are matched, entity words are extracted, and a report parsing result is generated and stored in a flow regulation report parsing result library. The seed library, i.e., the entity relationship cluster, is updated according to the parsing results, so that when a new flow regulation report arrives, a new semantic template can be extracted and added to the initial semantic template library in time. Subsequent parsing then uses the updated semantic template library, which enables more diverse parsing of flow regulation reports and thereby heuristic parsing of flow regulation reports.
Fig. 12 is a schematic structural diagram of a report parsing apparatus provided by the present invention. The present invention provides a flow regulation report parsing apparatus, including an obtaining unit 1 configured to obtain at least one first field to be analyzed in the first flow regulation report. For the operating principle of the obtaining unit 1, refer to step 101 above; it is not described again here.
The flow regulation report parsing apparatus further includes a parsing unit 2 configured to parse, where any statement of the first field to be analyzed matches a first feature field in the semantic template library, all entity names in the first field to be analyzed according to the first semantic template corresponding to that first feature field. For the operating principle of the parsing unit 2, refer to step 102 above; it is not described again here.
The flow regulation report parsing apparatus further includes a generating unit 3 configured to generate a first flow regulation report result according to the first field to be analyzed, the entity position of each entity name in the first field to be analyzed, and the entity type corresponding to each entity name. For the operating principle of the generating unit 3, refer to step 103 above; it is not described again here.
The flow regulation report parsing apparatus further includes an updating unit 4 configured to match a second field to be analyzed in a second flow regulation report according to the entity names in the first flow regulation report result, and to update the semantic template library according to a second semantic template constructed from a second feature field;
the first feature field is what remains of the first field to be analyzed after all statements that have an entity type correspondence are removed; the second feature field is what remains of the second field to be analyzed after all statements that have an entity type correspondence are removed;
the first semantic template is an ordered set formed from the correspondence between entity types and entity names and the first feature field; the second semantic template is an ordered set formed from the correspondence between entity types and entity names and the second feature field.
The invention provides a template library updating method, a report analyzing method, a device, equipment and a medium. A preset semantic template library is used to match the first field to be analyzed of a flow regulation report against the feature fields of the semantic template library, all entity names are parsed from the first field to be analyzed, and the flow regulation report result is generated quickly.
Fig. 13 is a schematic structural diagram of an electronic device provided by the present invention. As shown in Fig. 13, the electronic device may include a processor 110, a communication interface 120, a memory 130 and a communication bus 140, wherein the processor 110, the communication interface 120 and the memory 130 communicate with one another via the communication bus 140. The processor 110 may call logic instructions in the memory 130 to perform a flow regulation report parsing method, the method comprising: acquiring at least one first field to be analyzed in a first flow regulation report; in the case that any statement of the first field to be analyzed matches a first feature field in a semantic template library, parsing out all entity names in the first field to be analyzed according to a first semantic template corresponding to the first feature field; and generating a first flow regulation report result according to the first field to be analyzed, the entity position of each entity name in the first field to be analyzed and the entity type corresponding to each entity name. The report parsing method further comprises a step of updating the semantic template library, the updating comprising: matching a second field to be analyzed in a second flow regulation report according to the entity names in the first flow regulation report result, and updating the semantic template library according to a second semantic template constructed from a second feature field. The first feature field is the part of the first field to be analyzed that remains after all statements having an entity type correspondence are removed; the second feature field is the part of the second field to be analyzed that remains after all statements having an entity type correspondence are removed. The first semantic template is an ordered set formed from the first feature field according to the correspondence between entity types and entity names; the second semantic template is an ordered set formed from the second feature field according to the correspondence between entity types and entity names.
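As a rough illustration of the parsing step described above, the following Python sketch treats a semantic template as an ordered list of literal text pieces (standing in for the feature field) and entity-type slots. The tuple representation, the function names `parse_with_template` and `parse_with_library`, the entity type labels and the sample sentence are all assumptions made for this example; the patent text does not prescribe a data structure or a matching algorithm, and regular-expression matching is used here only as one plausible realization.

```python
import re
from typing import List, Optional, Tuple

# A template is an ordered list of ("text", literal) or ("entity", entity_type)
# items. The "text" items play the role of the feature field. This is a
# hypothetical representation chosen for the sketch.
Template = List[Tuple[str, str]]
Entity = Tuple[str, str, int, int]  # (entity_type, entity_name, start, end)


def parse_with_template(sentence: str, template: Template) -> Optional[List[Entity]]:
    """Return the entities located by the template, or None if it does not match."""
    # Literal pieces are escaped; each entity slot becomes a lazy capture group.
    pattern = "".join(
        re.escape(value) if kind == "text" else "(.+?)"
        for kind, value in template
    )
    match = re.fullmatch(pattern, sentence)
    if match is None:
        return None
    types = [value for kind, value in template if kind == "entity"]
    return [
        (etype, match.group(i), match.start(i), match.end(i))
        for i, etype in enumerate(types, start=1)
    ]


def parse_with_library(sentence: str, library: List[Template]) -> Optional[List[Entity]]:
    """Try each template in the library and return the first successful parse."""
    for template in library:
        entities = parse_with_template(sentence, template)
        if entities is not None:
            return entities
    return None


# Illustrative data only.
library = [[
    ("text", "Patient "), ("entity", "PERSON"),
    ("text", " took "), ("entity", "VEHICLE"),
    ("text", " to "), ("entity", "PLACE"), ("text", "."),
]]
sentence = "Patient Li took bus 12 to Central Hospital."
result = parse_with_library(sentence, library)
```

Each returned tuple carries the entity type, the entity name and its start and end offsets, which is the information the result-generation step needs.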
In addition, the logic instructions in the memory 130 may be implemented in the form of software functional units and, when sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, or any other medium capable of storing program code.
In another aspect, the present invention also provides a computer program product comprising a computer program storable on a non-transitory computer-readable storage medium, wherein, when the computer program is executed by a processor, the computer is capable of executing the report parsing method provided above, the method comprising: acquiring at least one first field to be analyzed in a first flow regulation report; in the case that any statement of the first field to be analyzed matches a first feature field in a semantic template library, parsing out all entity names in the first field to be analyzed according to a first semantic template corresponding to the first feature field; and generating a first flow regulation report result according to the first field to be analyzed, the entity position of each entity name in the first field to be analyzed and the entity type corresponding to each entity name. The report parsing method further comprises a step of updating the semantic template library, the updating comprising: matching a second field to be analyzed in a second flow regulation report according to the entity names in the first flow regulation report result, and updating the semantic template library according to a second semantic template constructed from a second feature field. The first feature field is the part of the first field to be analyzed that remains after all statements having an entity type correspondence are removed; the second feature field is the part of the second field to be analyzed that remains after all statements having an entity type correspondence are removed. The first semantic template is an ordered set formed from the first feature field according to the correspondence between entity types and entity names; the second semantic template is an ordered set formed from the second feature field according to the correspondence between entity types and entity names.
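The template library update described above can be sketched in the same spirit. In this minimal Python example, entity names already known from an earlier result are located in a sentence of a second report, the matched spans are replaced by entity-type slots, and the remaining text is kept as the feature field of a new template. The names `build_template` and `update_library`, the tuple representation and the sample data are assumptions for illustration; naive substring matching is used here and is not claimed to be the matching procedure of the patent.

```python
from typing import Dict, List, Tuple

# Same hypothetical representation as a list of ("text", literal) and
# ("entity", entity_type) items, kept in sentence order.
Template = List[Tuple[str, str]]


def build_template(sentence: str, known: Dict[str, str]) -> Template:
    """Build a template by replacing known entity names with type slots."""
    spans: List[Tuple[int, int, str]] = []
    # Longer names are placed first so that they win over contained shorter ones.
    for name in sorted(known, key=len, reverse=True):
        if not name:
            continue
        start = sentence.find(name)
        while start != -1:
            end = start + len(name)
            # Keep the span only if it does not overlap an accepted span.
            if all(end <= s or start >= e for s, e, _ in spans):
                spans.append((start, end, known[name]))
            start = sentence.find(name, end)
    spans.sort()

    template: Template = []
    cursor = 0
    for start, end, etype in spans:
        if start > cursor:
            template.append(("text", sentence[cursor:start]))
        template.append(("entity", etype))
        cursor = end
    if cursor < len(sentence):
        template.append(("text", sentence[cursor:]))
    return template


def update_library(library: List[Template], sentence: str, known: Dict[str, str]) -> bool:
    """Add a template built from the sentence; return True if the library grew."""
    template = build_template(sentence, known)
    has_entity = any(kind == "entity" for kind, _ in template)
    if has_entity and template not in library:
        library.append(template)
        return True
    return False


# Illustrative data only: names taken from an earlier parsing result.
known_entities = {"Li": "PERSON", "Central Hospital": "PLACE"}
second_sentence = "Li visited Central Hospital on Monday."
template_library: List[Template] = []
added = update_library(template_library, second_sentence, known_entities)
```

Substring matching can produce false hits when a short entity name occurs inside an unrelated word, so a practical implementation would likely need tokenization or boundary checks.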
In yet another aspect, the present invention also provides a non-transitory computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the report parsing method provided above, the method comprising: acquiring at least one first field to be analyzed in a first flow regulation report; in the case that any statement of the first field to be analyzed matches a first feature field in a semantic template library, parsing out all entity names in the first field to be analyzed according to a first semantic template corresponding to the first feature field; and generating a first flow regulation report result according to the first field to be analyzed, the entity position of each entity name in the first field to be analyzed and the entity type corresponding to each entity name. The report parsing method further comprises a step of updating the semantic template library, the updating comprising: matching a second field to be analyzed in a second flow regulation report according to the entity names in the first flow regulation report result, and updating the semantic template library according to a second semantic template constructed from a second feature field. The first feature field is the part of the first field to be analyzed that remains after all statements having an entity type correspondence are removed; the second feature field is the part of the second field to be analyzed that remains after all statements having an entity type correspondence are removed. The first semantic template is an ordered set formed from the first feature field according to the correspondence between entity types and entity names; the second semantic template is an ordered set formed from the second feature field according to the correspondence between entity types and entity names.
The above-described embodiments of the apparatus are merely illustrative, and the units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.
Through the above description of the embodiments, those skilled in the art will clearly understand that each embodiment can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware. With this understanding in mind, the above-described technical solutions may be embodied in the form of a software product, which can be stored in a computer-readable storage medium such as ROM/RAM, magnetic disk, optical disk, etc., and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the methods described in the embodiments or some parts of the embodiments.
Finally, it should be noted that: the above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims (13)

1. A method for report parsing, comprising:
acquiring at least one first field to be analyzed in a first flow regulation report;
in the case that any statement of the first field to be analyzed matches a first feature field in a semantic template library, parsing out all entity names in the first field to be analyzed according to a first semantic template corresponding to the first feature field;
generating a first flow regulation report result according to the first field to be analyzed, the entity position of each entity name in the first field to be analyzed and the entity type corresponding to each entity name;
wherein the report parsing method further comprises a step of updating the semantic template library, the updating comprising: matching a second field to be analyzed in a second flow regulation report according to the entity names in the first flow regulation report result, and updating the semantic template library according to a second semantic template constructed from a second feature field;
the first feature field is the part of the first field to be analyzed that remains after all statements having an entity type correspondence are removed; the second feature field is the part of the second field to be analyzed that remains after all statements having an entity type correspondence are removed;
the first semantic template is an ordered set formed from the first feature field according to the correspondence between entity types and entity names; the second semantic template is an ordered set formed from the second feature field according to the correspondence between entity types and entity names.
2. The report parsing method of claim 1, wherein matching a second field to be analyzed in a second flow regulation report according to the entity names in the first flow regulation report result, and updating the semantic template library according to a second semantic template constructed from a second feature field, comprises:
constructing an entity relationship from any entity name in the first field to be analyzed and the entity type corresponding to that entity name, and traversing all entity names in the first field to be analyzed to construct an entity relationship set corresponding to the first field to be analyzed;
adding the entity relationship set to an entity relationship cluster to obtain an expanded entity relationship cluster;
traversing the entity names of each entity relationship set in the expanded entity relationship cluster, determining, in the case that an entity name matches any statement of the second field to be analyzed, the entity type corresponding to that statement, and traversing all statements in the second field to be analyzed to determine the correspondence between each statement and an entity type;
obtaining the second feature field after removing all statements having an entity type correspondence, and constructing the second semantic template according to the correspondence between entity types and entity names in the second field to be analyzed and the second feature field;
adding the second semantic template to the semantic template library to obtain an updated semantic template library;
wherein the second field to be analyzed is determined after the second flow regulation report is split into sentences.
3. The report parsing method according to claim 2, further comprising, after the first flow regulation report is split into sentences:
obtaining at least one unresolved field;
in the case that any statement of the unresolved field matches a second feature field in the updated semantic template library, extracting all entity names in the unresolved field according to the second semantic template corresponding to the second feature field;
and generating a second flow regulation report result according to the unresolved field, the entity position of each entity name in the unresolved field and the entity type corresponding to each entity name.
4. The report parsing method according to any one of claims 1 to 3, wherein parsing out all entity names in the first field to be analyzed according to the first semantic template corresponding to the first feature field comprises:
parsing out the entity name corresponding to an entity type according to the relative position of that entity type and the first feature field in the first semantic template;
and traversing all entity types until each entity type and the entity name corresponding to each entity type are obtained.
5. The report parsing method of claim 1, wherein generating a first flow regulation report result according to the first field to be analyzed, the entity position of each entity name in the first field to be analyzed, and the entity type corresponding to each entity name comprises:
determining the first field to be analyzed as a first format element;
generating an entity type number corresponding to each entity type according to the order of the entity types in the first field to be analyzed;
determining the correspondence between an entity type number and an entity name according to the entity type number corresponding to any entity type, the start position of the entity name in the first field to be analyzed and the end position of the entity name in the first field to be analyzed;
traversing all entity type numbers until the correspondence between every entity type number and its entity name is obtained, and determining these correspondences, arranged in the order of the entity type numbers, as a second format element;
and generating the first flow regulation report result according to the first format element and the second format element.
6. The report parsing method according to claim 1, 2, 3 or 5, further comprising, before the first flow regulation report is split into sentences:
splitting a sample flow regulation report into sentences to obtain at least one third field to be analyzed;
and generating a sample report result according to the third field to be analyzed, the entity position of each entity name in the third field to be analyzed and the entity type corresponding to each entity name.
7. The report parsing method of claim 6, further comprising, after generating the sample report result:
generating the semantic template library;
determining a sample semantic template according to an ordered set formed by the event name, the entity type, the correspondence with the entity name and the third feature field in the sample report result, and adding the sample semantic template to the semantic template library;
wherein the third feature field is the part of the third field to be analyzed that remains after all statements having an entity type correspondence are removed.
8. The report parsing method of claim 6, further comprising, after generating the sample report result:
constructing an entity relationship according to any entity name in the sample report result and the entity type corresponding to the entity name, and traversing all the entity names in the third field to be analyzed to construct an entity relationship set corresponding to the third field to be analyzed;
and constructing an entity relation cluster according to the entity relation set.
9. A template library updating method is characterized by comprising the following steps:
acquiring at least one first field to be analyzed in a first flow regulation report;
in the case that any statement of the first field to be analyzed matches a first feature field in a semantic template library, parsing out all entity names in the first field to be analyzed according to a first semantic template corresponding to the first feature field;
generating a first flow regulation report result according to the first field to be analyzed, the entity position of each entity name in the first field to be analyzed and the entity type corresponding to each entity name;
matching a second field to be analyzed in a second flow regulation report according to the entity names in the first flow regulation report result, and updating the semantic template library according to a second semantic template constructed from a second feature field;
the first feature field is the part of the first field to be analyzed that remains after all statements having an entity type correspondence are removed; the second feature field is the part of the second field to be analyzed that remains after all statements having an entity type correspondence are removed;
the first semantic template is an ordered set formed from the first feature field according to the correspondence between entity types and entity names; the second semantic template is an ordered set formed from the second feature field according to the correspondence between entity types and entity names.
10. The method according to claim 9, wherein matching a second field to be analyzed in a second flow regulation report according to the entity names in the first flow regulation report result and updating the semantic template library according to a second semantic template constructed from a second feature field comprises:
constructing an entity relationship from any entity name in the first field to be analyzed and the entity type corresponding to that entity name, and traversing all entity names in the first field to be analyzed to construct an entity relationship set corresponding to the first field to be analyzed;
adding the entity relationship set to an entity relationship cluster to obtain an expanded entity relationship cluster;
traversing the entity names of each entity relationship set in the expanded entity relationship cluster, determining, in the case that an entity name matches any statement of the second field to be analyzed, the entity type corresponding to that statement, and traversing all statements in the second field to be analyzed to determine the correspondence between each statement and an entity type;
obtaining the second feature field after removing all statements having an entity type correspondence, and constructing the second semantic template according to the correspondence between entity types and entity names in the second field to be analyzed and the second feature field;
adding the second semantic template to the semantic template library to obtain an updated semantic template library;
wherein the second field to be analyzed is determined after the second flow regulation report is split into sentences.
11. A report parsing apparatus, comprising:
an acquisition unit, configured to acquire at least one first field to be analyzed in a first flow regulation report;
a parsing unit, configured to, in the case that any statement of the first field to be analyzed matches a first feature field in a semantic template library, parse out all entity names in the first field to be analyzed according to a first semantic template corresponding to the first feature field;
a generation unit, configured to generate a first flow regulation report result according to the first field to be analyzed, the entity position of each entity name in the first field to be analyzed and the entity type corresponding to each entity name;
wherein the report parsing apparatus further comprises an updating unit, configured to match a second field to be analyzed in a second flow regulation report according to the entity names in the first flow regulation report result and to update the semantic template library according to a second semantic template constructed from a second feature field;
the first feature field is the part of the first field to be analyzed that remains after all statements having an entity type correspondence are removed; the second feature field is the part of the second field to be analyzed that remains after all statements having an entity type correspondence are removed;
the first semantic template is an ordered set formed from the first feature field according to the correspondence between entity types and entity names; the second semantic template is an ordered set formed from the second feature field according to the correspondence between entity types and entity names.
12. An electronic device comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor implements the report parsing method of any one of claims 1 to 8 when executing the computer program.
13. A non-transitory computer readable storage medium having stored thereon a computer program, wherein the computer program, when executed by a processor, implements the report parsing method according to any one of claims 1 to 8.
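As an informal illustration of the result structure recited in claim 5, the following Python sketch keeps the field text as the first format element and builds the second format element as a list of numbered entries, each carrying the entity type, the start and end positions and the entity name. The function name `build_result`, the dictionary keys and the `T1`, `T2` numbering style are assumptions made for this example; the claims do not fix a concrete serialization.

```python
from typing import Dict, List, Tuple


def build_result(field: str, entities: List[Tuple[str, int, int]]) -> Dict[str, object]:
    """Assemble a report result from a field and (entity_type, start, end) spans."""
    # Number the entity types in the order in which they occur in the field.
    ordered = sorted(entities, key=lambda item: item[1])
    annotations = []
    for number, (etype, start, end) in enumerate(ordered, start=1):
        annotations.append({
            "id": f"T{number}",          # entity type number (hypothetical style)
            "type": etype,
            "start": start,              # start position in the field
            "end": end,                  # end position (exclusive) in the field
            "name": field[start:end],    # entity name recovered from the span
        })
    # First format element: the field itself; second: the ordered annotations.
    return {"text": field, "entities": annotations}


# Illustrative data only; spans are given out of order on purpose.
field_text = "Patient Li took bus 12 to Central Hospital."
report_result = build_result(field_text, [("PLACE", 26, 42), ("PERSON", 8, 10)])
```

Sorting by start position before numbering keeps the entity type numbers aligned with the order of appearance in the field, independent of the order in which the spans were found.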

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211276700.4A CN115345152B (en) 2022-10-19 2022-10-19 Template library updating method, report analyzing method, device, equipment and medium

Publications (2)

Publication Number Publication Date
CN115345152A CN115345152A (en) 2022-11-15
CN115345152B true CN115345152B (en) 2023-03-14

Family

ID=83957390

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211276700.4A Active CN115345152B (en) 2022-10-19 2022-10-19 Template library updating method, report analyzing method, device, equipment and medium

Country Status (1)

Country Link
CN (1) CN115345152B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117688217A (en) * 2024-02-02 2024-03-12 北方健康医疗大数据科技有限公司 System, method and medium for realizing data blood relationship structure based on directed graph

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101308487A (en) * 2008-06-25 2008-11-19 中国科学院地理科学与资源研究所 Space-time fusion method for natural language expressing dynamic traffic information
WO2017132719A1 (en) * 2016-02-03 2017-08-10 Global Software Innovation Pty Ltd Systems and methods for generating electronic document templates and electronic documents
JP2021197133A (en) * 2020-06-12 2021-12-27 ペキン バイドゥ ネットコム サイエンス アンド テクノロジー カンパニー リミテッドBeijing Baidu Netcom Science And Technology Co., Ltd. Meaning matching method, device, electronic apparatus, storage medium, and computer program
CN113869066A (en) * 2021-10-15 2021-12-31 中通服创立信息科技有限责任公司 Semantic understanding method and system based on agricultural field text
CN115019915A (en) * 2022-05-31 2022-09-06 深圳市北科瑞声科技股份有限公司 Method, device, equipment and medium for generating flow regulation report based on semantic recognition

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10262062B2 (en) * 2015-12-21 2019-04-16 Adobe Inc. Natural language system question classifier, semantic representations, and logical form templates
US20220253729A1 (en) * 2021-02-01 2022-08-11 Otsuka Pharmaceutical Development & Commercialization, Inc. Scalable knowledge database generation and transactions processing

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
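Disease trajectories; Ioana Visan; Nature Immunology; 20220331; Vol. 33 (No. 3); full text *
Trajectory distributions: A new description of movement for trajectory prediction; Pei Lv et al.; Computational Visual Media; 20220228; Vol. 8 (No. 2); 213-224 *
Discussion and research on a template-based process definition language [一种基于模板的过程定义语言的探讨与研究]; Lü Chengyue (吕呈悦) et al.; Journal of Computer Applications (《计算机应用》); 20070210 (No. 02); full text *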

Also Published As

Publication number Publication date
CN115345152A (en) 2022-11-15

Similar Documents

Publication Publication Date Title
US11170179B2 (en) Systems and methods for natural language processing of structured documents
CN111177184A (en) Structured query language conversion method based on natural language and related equipment thereof
CN110765235B (en) Training data generation method, device, terminal and readable medium
JP2018532171A (en) SQL examination method, server and storage device
CN109947952B (en) Retrieval method, device, equipment and storage medium based on English knowledge graph
CN111292751B (en) Semantic analysis method and device, voice interaction method and device, and electronic equipment
CN116483973A (en) Text processing method and device and related equipment
CN108399157B (en) Dynamic extraction method of entity and attribute relationship, server and readable storage medium
CN115345152B (en) Template library updating method, report analyzing method, device, equipment and medium
CN109740159B (en) Processing method and device for named entity recognition
CN114547274B (en) Multi-turn question and answer method, device and equipment
CN108090104A (en) For obtaining the method and apparatus of webpage information
CN114528312A (en) Method and device for generating structured query language statement
CN116595026A (en) Information inquiry method
CN117076718A (en) Graph database query processing system and method based on large language model
CN111401034A (en) Text semantic analysis method, semantic analysis device and terminal
CN117195829A (en) Text labeling method, text labeling device and electronic equipment
CN114842982B (en) Knowledge expression method, device and system for medical information system
CN111723182A (en) Key information extraction method and device for vulnerability text
CN115757720A (en) Project information searching method, device, equipment and medium based on knowledge graph
JP2004348552A (en) Voice document search device, method, and program
CN114547059A (en) Platform data updating method and device and computer equipment
CN113254612A (en) Knowledge question-answering processing method, device, equipment and storage medium
KR101207375B1 (en) System and method for managing mathematical contents
CN116245096B (en) Tibetan word segmentation evaluation set construction method based on local word list

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant