CN114186072A - Method, system and storage medium for extracting traffic accident report and reasoning scene type - Google Patents

Publication number
CN114186072A
Authority
CN
China
Prior art keywords
scene
accident
traffic accident
information
report
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111518348.6A
Other languages
Chinese (zh)
Inventor
马峻岩
赵祥模
许良
史静
刘晨颖
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Changan University
Original Assignee
Changan University

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36 Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367 Ontology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00 Computing arrangements using knowledge-based models
    • G06N5/04 Inference or reasoning models

Abstract

A method, a system and a storage medium for extracting information from traffic accident reports and reasoning about scene types are provided. The method comprises the following steps: classifying the entities of a traffic accident scene and constructing a V2V accident domain ontology; preprocessing the data of the traffic accident report; extracting accident information from the preprocessed traffic accident report to obtain the specific information of the accident scene; and reasoning to obtain the other association structures implicit in the ontology and outputting the scene type corresponding to the structured scene information. The invention combines a domain ontology with natural language processing technology: the accident domain ontology is designed and constructed according to the knowledge system covered by accident reports and the application scope of the ontology, and, after the text characteristics of accident reports are analyzed, an information extraction framework combining the domain ontology with relation extraction is proposed. This preliminarily automates the extraction of scene information from accident reports and improves the utilization rate and degree of automation of traffic accident report data sets in scene construction technology.

Description

Method, system and storage medium for extracting traffic accident report and reasoning scene type
Technical Field
The invention belongs to the technical field of intelligent vehicle-road systems, and particularly relates to a method, a system and a storage medium for extracting a traffic accident report and reasoning scene types.
Background
The intelligent connected vehicle, as the product of the current stage of autonomous driving technology, has become a global hot topic in the automotive field, and the cooperative vehicle-infrastructure system (CVIS) is one of the key research directions in this field. Vehicle-to-everything communication (V2X) is an important means of realizing CVIS, and vehicle-to-vehicle communication (V2V) is an important part of V2X that has attracted major interest in academia. In a vehicle-road cooperative environment, people, vehicles, roads and roadside facilities can all sense environmental information and information about other participants through the CVIS and can exchange information with other units, thereby realizing the interconnection of everything in the true sense. However, the field of vehicle-road cooperation still faces many safety problems in its development, so people still have doubts about such systems. Reducing the safety risk of intelligent connected vehicles has therefore become a central part of current vehicle-road cooperation research.
Data-driven scene construction technology is a current research hotspot in the field of intelligent connected vehicle testing. At present, most researchers select structured naturalistic driving data for scene research and ignore the abundant data in unstructured traffic accident reports; moreover, the scene construction process often cannot do without manual analysis, so the scene construction cycle is long and the efficiency is low.
Disclosure of Invention
In view of the above problems in the prior art, the invention aims to provide a method, a system and a storage medium for extracting information from traffic accident reports and reasoning about scene types, so as to improve the utilization rate and degree of automation of traffic accident report data sets in scene construction technology and to assist data-driven scene construction in the field of intelligent connected vehicles.
In order to achieve this purpose, the invention adopts the following technical scheme:
a traffic accident report extraction and scene type reasoning method comprises the following steps:
classifying the entities of a traffic accident scene, and constructing a V2V accident domain ontology;
preprocessing the data of the traffic accident report;
extracting accident information from the preprocessed traffic accident report to obtain the specific information of the accident scene;
and reasoning to obtain the other association structures implicit in the ontology, and outputting the scene type corresponding to the structured scene information.
Preferably, in the step of classifying the entities of the traffic accident scene and constructing the V2V accident domain ontology, the class hierarchy, object attributes and a plurality of instances of the V2V accident domain ontology are designed, and a basic ontology is constructed using the OWL language and an ontology construction tool. The core ontology of the V2V accident domain includes the following modules: Vehicle, Obstacle, Object Behavior, Environment, Road Network and Accident Scenario Type. Vehicle describes the main traffic participants; Obstacle is divided by state into two kinds of objects, static obstacles and dynamic obstacles; Object Behavior is derived by extracting the behavior attributes common to Obstacle and Vehicle and describes the behavior of an entity; Environment, as the environmental element of an accident scene, describes the weather state and the illumination condition; Road Network is the module describing the road network, including the shape of the road network and its physical structure; Accident Scenario Type is the accident scene category module and identifies the type of an accident scene.
Preferably, the data preprocessing of the traffic accident report comprises domain-specific vocabulary processing, coreference resolution, sentence boundary detection and dependency analysis; the object association structures shown in Table 1 and the data association structures shown in Table 2 are defined for the core ontology of the V2V accident domain:
TABLE 1
[Table image in the original; the object association structures are listed in text form in Table 1 of the Detailed Description.]
TABLE 2
Attribute name           Domain              Range
speed_is                 Speed               double
move_direction_is        Direction           string
lane_width_is            Lane                double
Scenario_type_is         AccidentScenario    string
relative_direction_is    AccidentScenario    string
has_traffic_light        RoadNetwork         boolean
has_stop_sign            RoadNetwork         boolean
Preferably, in the step of extracting accident information from the preprocessed traffic accident report, ontology parsing is performed first, and the information is then extracted according to extraction rules.
Preferably, the specific steps of extracting accident information from the preprocessed traffic accident report comprise:
defining a domain-specific dictionary, and converting specific words 'A' and 'B' in the report that meet the matching conditions into the form 'A-B' through regular expression matching, so that they form a single unit;
restoring the referring words in the text to the objects to which they originally refer;
performing sentence boundary detection, and converting the text into a plurality of single sentences;
and performing dependency analysis on the unstructured accident report using a natural language processing toolkit to obtain the dependency relations between the words in each sentence.
Preferably, the step of acquiring the specific information of the accident scene comprises:
ontology parsing: importing the class, attribute relation and instance information in the ontology;
importing the analysis result of the traffic accident report produced by the relation extraction module, the result comprising the single sentences after grouping and the dependency relations between the words in each single sentence;
extracting information from the traffic accident report one sentence at a time, the extraction being finished after all sentences in the report have been traversed;
creating an object list and initializing it to empty, the object list storing the instantiated objects of the classes or instances identified in a sentence; if an identified item is an instance, the parent class to which it belongs is looked up in the ontology and the object list is then queried to determine whether an instance of that class already exists; if one exists, the next step is carried out; if not, an instantiated object of the class is first generated and added to the list, and the next step is then carried out;
and traversing the dependency relations of the entities, including looking up the dependency relations and filling in the attributes of the instance objects.
Preferably, in the step of reasoning to obtain the other association structures implicit in the ontology, the scene inference rules are described in the SWRL language and edited using software, and the other association structures implicit in the ontology are obtained through automatic reasoning by an inference engine.
Preferably, the step of obtaining the other association structures implicit in the ontology through automatic reasoning by the inference engine comprises:
importing the OWL ontology file;
importing scene information: importing the structured scene information converted from the unstructured accident report into a scene information list before the scene type is inferred;
constructing instances: creating the scene contents in the scene information list as instances of the corresponding classes;
adding association structures between instances: filling the association relations among all instances into the ontology;
invoking the inference engine: the inference engine searches the rule base according to the instantiated ontology information and returns the inference results that satisfy the conditions;
outputting the inference result: outputting the scene type corresponding to the structured scene information.
The invention also provides a traffic accident report extraction and scene type reasoning system, which comprises:
the accident ontology construction module is used for classifying the entities of a traffic accident scene and constructing a V2V accident domain ontology;
the data preprocessing module is used for preprocessing the data of the traffic accident report;
the accident information extraction module is used for extracting accident information from the preprocessed traffic accident report to obtain the specific information of the accident scene;
and the scene type output module is used for reasoning to obtain the other association structures implicit in the ontology and outputting the scene type corresponding to the structured scene information.
The invention also proposes a computer-readable storage medium storing a computer program which, when executed by a processor, implements the steps in the traffic accident report extraction and scene type inference method.
Compared with the prior art, the invention at least has the following beneficial effects:
the invention provides a traffic accident report extraction and scene type reasoning method that combines a domain ontology with natural language processing technology. The accident domain ontology is designed and constructed according to the knowledge system covered by accident reports and the application scope of the ontology, and, after the text characteristics of accident reports are analyzed, an information extraction framework combining the domain ontology with relation extraction is proposed, thereby preliminarily automating the extraction of scene information from accident reports. The utilization rate and degree of automation of traffic accident report data sets in scene construction technology are improved, and data-driven scene construction in the field of intelligent connected vehicles can be assisted.
Drawings
FIG. 1 is a flow chart of a traffic accident report extraction and scene type inference method of the present invention;
FIG. 2 is a schematic diagram of a tree structure of classes in the method ontology of the present invention;
FIG. 3 is a flow chart of an accident information extraction algorithm in the method of the present invention;
FIG. 4 is a flow diagram of scenario inference in the method of the present invention;
FIG. 5 is an interface diagram of the V2V accident domain ontology and its instances according to an embodiment of the present invention;
FIG. 6 is an interface diagram showing the content of the OWL file of the ontology according to an embodiment of the present invention;
FIG. 7 is an interface diagram of an example accident report according to an embodiment of the present invention;
FIG. 8 is an interface diagram showing part of the content of the custom domain-specific dictionary according to an embodiment of the present invention;
FIG. 9 is a data pre-processing result interface diagram according to an embodiment of the present invention;
FIG. 10 is a text relationship extraction results interface diagram in accordance with an embodiment of the present invention;
FIG. 11 is an interface diagram of accident scene extraction results in accordance with an embodiment of the present invention;
FIG. 12 is a diagram of a scenario inference result interface according to an embodiment of the present invention.
Detailed Description
The present invention will be described in further detail with reference to the accompanying drawings and examples.
Referring to fig. 1, the method for extracting a traffic accident report and reasoning scene types according to the embodiment of the present invention includes the following steps:
step one, classifying the entities of a traffic accident scene, designing the class hierarchy, object attributes and a plurality of instances of the V2V accident domain ontology, and constructing and storing a basic ontology using the OWL language and an ontology construction tool;
step two, preprocessing the data of the accident report, the preprocessing comprising domain-specific vocabulary processing, coreference resolution, sentence boundary detection and dependency analysis;
step three, extracting accident information from the accident report, the information being extracted according to extraction rules after ontology parsing, to obtain the specific information of the accident scene;
and step four, describing the scene inference rules in the SWRL language, editing the rules using software, and obtaining the other association structures implicit in the ontology through automatic reasoning by an inference engine.
The core ontology of the V2V accident domain in step one includes 6 major modules: Vehicle, Obstacle, Object Behavior, Environment, Road Network and Accident Scenario Type.
Here, Vehicle mainly describes the vehicle category, i.e. the main traffic participants, which in the present invention are motor vehicles. Obstacle is divided by state into two types, static obstacles and dynamic obstacles. The behavior attributes common to Obstacle and Vehicle are extracted to derive Object Behavior, which is responsible for describing the behavior of an entity. Environment, as the environmental element of an accident scene, mainly describes weather conditions and lighting conditions. Road Network is the module describing the road network and mainly comprises the shape of the road network and its physical structure. Accident Scenario Type is the accident scene type module; it draws on the 17 V2V scene types used in foreign accident reports to identify the type of an accident scene.
In the second step, 6 classes are designed for the accident domain ontology: Vehicle, Obstacle, Object Behavior, Environment, Road Network and Accident Scenario Type.
17 object association structures and 7 data association structures are constructed, as shown in the following tables.
TABLE 1 Object association structure
Attribute name        Domain                             Range
has_lanes             Road                               Lane
is_on_lane            Vehicle/Obstacle                   Lane
has_speed             ObjectBehavior                     Speed
has_direction         ObjectBehavior                     Direction
has_lateral_action    ObjectBehavior                     LateralAction
has_longi_action      ObjectBehavior                     LongitudinalAction
execute               Vehicle/Obstacle                   ObjectBehavior
hit_position          Vehicle                            Road/Lane/Junction
has_f_obstacle        Vehicle                            Obstacle
has_lf_obstacle       Vehicle                            Obstacle
has_rf_obstacle       Vehicle                            Obstacle
has_r_obstacle        Vehicle                            Obstacle
has_l_obstacle        Vehicle                            Obstacle
has_b_obstacle        Vehicle                            Obstacle
has_rb_obstacle       Vehicle                            Obstacle
has_lb_obstacle       Vehicle                            Obstacle
is_part_of            Vehicle/Obstacle/VehicleAction/    is_part_of
TABLE 2 Data association structure
[Table image in the original; the 7 data association structures are listed in text form in Table 2 of the Disclosure of Invention.]
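As an illustration of how the data association structures constrain scene information, the following is a minimal sketch that encodes the seven Table 2 entries as plain Python records and checks a candidate assertion against the declared domain and range. The names `DataProperty`, `DATA_PROPERTIES` and `check_assertion` are hypothetical and are not part of the patent, and the mapping of the OWL types double, string and boolean to Python types is an assumption; the patent itself builds the ontology with OWL and an ontology construction tool.

```python
from dataclasses import dataclass
from typing import Any, Dict


@dataclass(frozen=True)
class DataProperty:
    """One row of Table 2: attribute name, domain class, and value type."""
    name: str
    domain: str
    value_type: type


# The seven data association structures listed in Table 2.
# Assumed type mapping: double -> float, string -> str, boolean -> bool.
DATA_PROPERTIES: Dict[str, DataProperty] = {
    p.name: p
    for p in [
        DataProperty("speed_is", "Speed", float),
        DataProperty("move_direction_is", "Direction", str),
        DataProperty("lane_width_is", "Lane", float),
        DataProperty("Scenario_type_is", "AccidentScenario", str),
        DataProperty("relative_direction_is", "AccidentScenario", str),
        DataProperty("has_traffic_light", "RoadNetwork", bool),
        DataProperty("has_stop_sign", "RoadNetwork", bool),
    ]
}


def check_assertion(prop_name: str, subject_class: str, value: Any) -> bool:
    """Return True if the assertion respects the property's domain and range."""
    prop = DATA_PROPERTIES.get(prop_name)
    if prop is None or subject_class != prop.domain:
        return False
    if prop.value_type is float:
        # bool is a subclass of int in Python, so exclude it from numeric ranges.
        return isinstance(value, (int, float)) and not isinstance(value, bool)
    return isinstance(value, prop.value_type)
```

For example, `check_assertion("speed_is", "Speed", 45.0)` is accepted, while `check_assertion("speed_is", "Lane", 45.0)` is rejected because the subject class is outside the declared domain.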
The third step specifically comprises the following steps:
defining a domain-specific dictionary, and converting specific words 'A' and 'B' in the report that meet the matching conditions into the form 'A-B' through regular expression matching, so that they form a single unit;
performing coreference resolution, that is, restoring the referring words in the text to the objects to which they originally refer;
performing sentence boundary detection, that is, dividing the text at sentence boundaries and converting it into a plurality of single sentences, which facilitates subsequent processing;
performing dependency analysis on the unstructured accident report using a natural language processing toolkit to obtain the dependency relations between the words in each sentence;
The ontology parsing in step three aims to obtain the classes, attributes, entities, class inheritance relations and the like from the defined ontology, and is the basis for mapping between the ontology and the accident report; FIG. 2 shows the tree structure of the classes in the ontology of the method. The extraction rules are the core of the information extraction process; FIG. 3 shows the flow chart of the accident information extraction algorithm, which specifically comprises the following steps:
ontology parsing, in which information such as the classes, attribute relations and instances in the ontology is imported;
importing the analysis result of the accident report produced by the relation extraction module, which comprises two parts: the single sentences after grouping, and the dependency relations between the words in each single sentence;
extracting information from the accident report one sentence at a time, the extraction being finished after all the sentences in the report have been traversed;
creating an object list and initializing it to empty. The object list stores the instantiated objects of the classes or instances identified in the sentence. If an identified item is an instance, the parent class to which it belongs is looked up in the ontology, and the object list is then queried to determine whether an instance of that class already exists; if one exists, the next step is carried out; if not, an instantiated object of the class is first generated and added to the list, and the next step is then carried out;
and traversing the dependency relations of the entities. After the previous steps, all entities present in the sentence have been identified through the ontology and instantiated, and the instantiated objects are stored in the object list. The traversal of the dependency relations consists of two parts: looking up the dependency relations, and filling in the attributes of the instance objects.
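The per-sentence extraction steps above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the patent's implementation: the miniature ontology (`INSTANCE_TO_CLASS`, `CLASS_BY_NAME`), the lookup table `PROPERTY_BY_CLASSES` and the function `extract_sentence` are hypothetical, and the dependency relations are supplied by hand as (head, relation, dependent) triples instead of being produced by a parser. Only the property names `execute` and `has_direction` are taken from Table 1.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

# Hypothetical miniature ontology: instance word -> parent class.
INSTANCE_TO_CLASS = {
    "sedan": "Vehicle",
    "traveling": "ObjectBehavior",
    "north": "Direction",
}
# Hypothetical mapping of words that name a class directly.
CLASS_BY_NAME = {"vehicle": "Vehicle"}

# Object properties from Table 1, keyed by (domain class, range class).
PROPERTY_BY_CLASSES = {
    ("Vehicle", "ObjectBehavior"): "execute",
    ("ObjectBehavior", "Direction"): "has_direction",
}

Dependency = Tuple[str, str, str]  # (head word, relation label, dependent word)


@dataclass
class SceneObject:
    cls: str
    attributes: Dict[str, str] = field(default_factory=dict)


def extract_sentence(tokens: List[str], dependencies: List[Dependency]) -> List[SceneObject]:
    objects: List[SceneObject] = []          # the object list starts empty
    word_to_object: Dict[str, SceneObject] = {}

    # Identify classes or instances; reuse an existing object of the same class.
    for token in tokens:
        word = token.lower()
        cls = INSTANCE_TO_CLASS.get(word) or CLASS_BY_NAME.get(word)
        if cls is None:
            continue
        obj = next((o for o in objects if o.cls == cls), None)
        if obj is None:
            obj = SceneObject(cls)
            objects.append(obj)
        word_to_object[word] = obj

    # Traverse the dependencies and fill in attributes of the instance objects.
    for head, _relation, dependent in dependencies:
        a, b = head.lower(), dependent.lower()
        if a not in word_to_object or b not in word_to_object:
            continue
        for subject, value in ((a, b), (b, a)):
            key = (word_to_object[subject].cls, word_to_object[value].cls)
            if key in PROPERTY_BY_CLASSES:
                word_to_object[subject].attributes[PROPERTY_BY_CLASSES[key]] = value
    return objects
```

In this sketch the dependency label itself is ignored and only the classes of the two linked words decide which property is filled; a fuller implementation would presumably also use the label to disambiguate.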
The automatic reasoning of the inference engine is shown in FIG. 4 and specifically comprises the following steps:
importing the OWL ontology file;
importing scene information. In the previous work, the unstructured accident report has been converted into structured scene information through the three modules of data preprocessing, text relation extraction and information extraction; the scene information list is imported before the scene type is inferred so that the automatic reasoning process can be executed;
constructing instances. The scene information list contains the complete accident scene information, such as vehicles, weather, road facilities and other factors; with the help of tools, the contents of the scene are created as instances of the corresponding classes;
adding association structures between instances. After the instances have been created, the association relations among all instances are filled into the ontology;
invoking the inference engine. The inference engine is the core of the reasoning task; it searches the rule base according to the instantiated ontology information and returns the inference results that satisfy the conditions;
and outputting the inference result, that is, outputting the scene type corresponding to the structured scene information.
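The reasoning steps above can be mimicked in plain Python as follows. This is only a stand-in written for illustration: the patent describes its rules in SWRL and runs an ontology inference engine, whereas the sketch uses an ordinary Python function as the rule. The rule body, the data property `faced_signal_is`, the placement of `has_traffic_light` on a Junction instance, and the names `KnowledgeBase`, `build_kb` and `infer_scene_type` are all assumptions; only the scene type name `Running_Red_Light` and the properties `hit_position` and `has_traffic_light` come from the source.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Set, Tuple

Triple = Tuple[str, str, object]  # (subject instance, property, value)


@dataclass
class KnowledgeBase:
    instances: Dict[str, str] = field(default_factory=dict)  # name -> class
    triples: Set[Triple] = field(default_factory=set)

    def has(self, subject: str, prop: str, value: object) -> bool:
        return (subject, prop, value) in self.triples


def build_kb(scene_info: List[dict]) -> KnowledgeBase:
    """Construct instances, then add the association structures between them."""
    kb = KnowledgeBase()
    for item in scene_info:
        kb.instances[item["name"]] = item["class"]
    for item in scene_info:
        for prop, value in item.get("properties", {}).items():
            kb.triples.add((item["name"], prop, value))
    return kb


def rule_running_red_light(kb: KnowledgeBase) -> bool:
    # Hypothetical rule: a vehicle was hit at a position that has a traffic
    # light while the signal it faced was red ("faced_signal_is" is invented).
    for name, cls in kb.instances.items():
        if cls != "Vehicle":
            continue
        positions = [o for (s, p, o) in kb.triples if s == name and p == "hit_position"]
        at_signal = any(kb.has(pos, "has_traffic_light", True) for pos in positions)
        if at_signal and kb.has(name, "faced_signal_is", "red"):
            return True
    return False


# Rule base: scene type name paired with the rule that recognizes it.
RULE_BASE: List[Tuple[str, Callable[[KnowledgeBase], bool]]] = [
    ("Running_Red_Light", rule_running_red_light),
]


def infer_scene_type(scene_info: List[dict]) -> Optional[str]:
    """Search the rule base and return the first scene type whose rule holds."""
    kb = build_kb(scene_info)
    for scene_type, rule in RULE_BASE:
        if rule(kb):
            return scene_type
    return None
```

A scene information list with one Vehicle whose `hit_position` is a junction carrying `has_traffic_light = True` and whose `faced_signal_is` is `"red"` yields `"Running_Red_Light"`; if no rule matches, the sketch returns `None`.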
Compared with the prior art, the invention combines a domain ontology with natural language processing technology, designs and constructs an accident domain ontology according to the knowledge system covered by accident reports and the application scope of the ontology, and, after analyzing the text characteristics of accident reports, proposes an information extraction framework combining the domain ontology with relation extraction, thereby preliminarily automating the extraction of scene information from accident reports. The utilization rate and degree of automation of traffic accident report data sets in scene construction technology are improved, and data-driven scene construction in the field of intelligent connected vehicles can be assisted.
The accident report information extraction of the present invention is mainly directed to motor vehicle accident reports; therefore, two criteria are followed in preparing the accident report data set: first, the vehicles involved are all motor vehicles, and accidents involving pedestrians or non-motor vehicles are excluded; second, at most two main vehicles are involved when the accident happens.
The ontology content and instances constructed by the invention are shown in FIG. 5, and part of the content of the ontology's OWL file is shown in FIG. 6. Analysis of the accident reports yields the following regularity: an accident report can be divided into three parts, namely the first paragraph, the second paragraph, and all paragraphs from the third onward. In the subsequent scene information extraction framework, accident scene information is extracted part by part, and the third part is finally used as supplementary information for the first and second parts, so that a complete accident scene is extracted from the report as comprehensively as possible. FIG. 7 shows the content of an original accident report.
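The three-part division of a report can be sketched as follows, assuming that paragraphs in the report text are separated by blank lines (the patent does not state the storage format). The function name `split_report` is hypothetical.

```python
import re
from typing import List, Tuple


def split_report(report: str) -> Tuple[str, str, List[str]]:
    """Split a report into first paragraph, second paragraph, and the rest."""
    # Assumption: paragraphs are separated by one or more blank lines.
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", report) if p.strip()]
    first = paragraphs[0] if paragraphs else ""
    second = paragraphs[1] if len(paragraphs) > 1 else ""
    supplementary = paragraphs[2:]  # used to supplement the first two parts
    return first, second, supplementary
```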
The invention summarizes 74 groups of special words and 4 groups of regular expression matching templates. The domain-specific dictionary is stored as a dictionary in key: value form, and the replacement word is obtained by lookup. This lexicon meets the requirements of the experiments of the invention for processing accident-domain vocabulary.
During preprocessing, string matching is used together with the accident-domain special vocabulary dictionary to find the words to be processed in the accident report, and the special words in the accident report text are replaced with a preset format. The specific words replaced in the above embodiment are shown in FIG. 8.
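The dictionary lookup and regular expression templates described above might look like the following sketch. The entries of `DOMAIN_DICT` and the single pattern in `REGEX_TEMPLATES` are invented for illustration, since the patent's 74 word groups and 4 templates are not listed in the text; the function name `normalize_domain_terms` is likewise hypothetical.

```python
import re

# Hypothetical key: value entries; the real dictionary is shown only in FIG. 8.
DOMAIN_DICT = {
    "traffic light": "traffic-light",
    "stop sign": "stop-sign",
    "left turn": "left-turn",
}

# Hypothetical template: join a vehicle designation such as "Vehicle 1"
# into the single unit "Vehicle-1" (the 'A' and 'B' to 'A-B' conversion).
REGEX_TEMPLATES = [
    (re.compile(r"\b(Vehicle|V)\s+(\d+)\b"), r"\1-\2"),
]


def normalize_domain_terms(text: str) -> str:
    """Replace domain-specific multi-word terms with their joined forms."""
    # Longer phrases first, so overlapping entries are handled predictably.
    for phrase in sorted(DOMAIN_DICT, key=len, reverse=True):
        pattern = re.compile(r"\b" + re.escape(phrase) + r"\b", flags=re.IGNORECASE)
        text = pattern.sub(DOMAIN_DICT[phrase], text)
    for pattern, replacement in REGEX_TEMPLATES:
        text = pattern.sub(replacement, text)
    return text
```

Joining the words with a hyphen keeps a multi-word term as one token in the later sentence splitting and dependency analysis, which is the stated purpose of forming "a single unit".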
Secondly, after the special vocabulary processing is finished, the existing tool Stanford CoreNLP is used to perform coreference resolution on the text, and every referring word in the original text is replaced by its antecedent, which facilitates subsequent processing. The text after data preprocessing is shown in FIG. 9.
In addition, the NLTK tool is selected to implement sentence boundary detection on the text. NLTK is a natural language processing library for Python and one of the most widely used natural language development tools. For natural language processing tasks, NLTK provides powerful functions that meet user requirements while greatly improving development and processing efficiency. NLTK is particularly strong in word segmentation and sentence segmentation and supports multiple languages, so it is used here to split the accident report into sentences. Dependency analysis is then performed on each single sentence of the splitting result, and the result is shown in FIG. 10.
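To make the sentence boundary detection step concrete without requiring any external package, the following is a minimal regular expression splitter. It is a simplified stand-in and not what the patent uses: the embodiment relies on NLTK, whose `sent_tokenize` function handles abbreviations and other cases that this sketch does not. The function name `split_sentences` and the splitting rule are assumptions.

```python
import re
from typing import List

# Split after '.', '!' or '?' when followed by whitespace and an uppercase letter.
_SENTENCE_END = re.compile(r"(?<=[.!?])\s+(?=[A-Z])")


def split_sentences(text: str) -> List[str]:
    """Convert a text into a list of single sentences (naive heuristic)."""
    return [s.strip() for s in _SENTENCE_END.split(text.strip()) if s.strip()]
```

Because the rule looks only at punctuation and capitalization, an abbreviation followed by a capitalized word would be split incorrectly, which is one reason a trained tokenizer is preferable for real accident reports.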
After the foregoing processing, the original text data has been converted into the form "single sentence + dependency relations". The entities present in each sentence are then identified using the ontology and associated with one another using the dependency relations, the two complementing each other, and the accident scene information is extracted from the result using the extraction rules. The scene information extracted in the embodiment is shown in FIG. 11.
owlready2 is a Python library for working with OWL ontologies. It integrates functions such as adding, deleting, querying and modifying ontology content, supports several ontology inference engines, and is a very powerful ontology tool. On this basis, the invention uses the owlready2 module to design the automatic reasoning module. Since the previous work has converted the unstructured scene information into a structured scene information list, the scene reasoning module takes this list as input to complete the scene type reasoning task.
Finally, the type of the accident scene is inferred to be scene 1: "Running_Red_Light". The result is shown in FIG. 12, where the arrows indicate, respectively, the manually created instances, the type attributes carried by the instances themselves, and the attributes inferred by Pellet.
The invention also provides a traffic accident report extraction and scene type reasoning system, which comprises:
the accident ontology construction module is used for classifying the entities of a traffic accident scene and constructing a V2V accident domain ontology;
the data preprocessing module is used for preprocessing the data of the traffic accident report;
the accident information extraction module is used for extracting accident information from the preprocessed traffic accident report to obtain the specific information of the accident scene;
and the scene type output module is used for reasoning to obtain the other association structures implicit in the ontology and outputting the scene type corresponding to the structured scene information.
The invention also proposes a computer-readable storage medium storing a computer program which, when executed by a processor, implements the steps in the traffic accident report extraction and scene type inference method.
Illustratively, the computer program may be partitioned into one or more modules/units, which are stored in a computer-readable storage medium and executed by the processor to perform the steps of the traffic accident report extraction and scene type inference method described herein. The one or more modules/units may be a series of computer-readable instruction segments capable of performing certain functions, which are used to describe the execution of the computer program in the server.
The server can be a computing device such as a smart phone, a notebook computer, a palmtop computer or a cloud server. The server may include, but is not limited to, a processor and a memory. It will be appreciated by those skilled in the art that the server may include more or fewer components, combine some components, or use different components; for example, the server may also include input/output devices, network access devices, buses, etc.
The processor may be a Central Processing Unit (CPU), another general purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, etc. A general purpose processor may be a microprocessor or any conventional processor.
The memory may be an internal storage unit of the server, such as a hard disk or an internal memory of the server. The memory may also be an external storage device of the server, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, a flash memory card (Flash Card), or the like provided on the server. Further, the memory may include both an internal storage unit of the server and an external storage device. The memory is used to store the computer readable instructions and other programs and data needed by the server. The memory may also be used to temporarily store data that has been output or is to be output.
It should be noted that, for the information interaction, execution process, and other contents between the above-mentioned devices/units, the specific functions and technical effects thereof are based on the same concept as those of the method embodiment, and specific reference may be made to the part of the method embodiment, which is not described herein again.
It will be apparent to those skilled in the art that, for convenience and brevity of description, only the above-mentioned division of the functional units and modules is illustrated, and in practical applications, the above-mentioned function distribution may be performed by different functional units and modules according to needs, that is, the internal structure of the apparatus is divided into different functional units or modules to perform all or part of the above-mentioned functions. Each functional unit and module in the embodiments may be integrated in one processing unit, or each unit may exist alone physically, or two or more units are integrated in one unit, and the integrated unit may be implemented in a form of hardware, or in a form of software functional unit. In addition, specific names of the functional units and modules are only for convenience of distinguishing from each other, and are not used for limiting the protection scope of the present application. The specific working processes of the units and modules in the system may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on such understanding, all or part of the processes in the methods of the embodiments described above can be implemented by a computer program, which can be stored in a computer-readable storage medium and which, when executed by a processor, implements the steps of the method embodiments described above. The computer program comprises computer program code, which may be in the form of source code, object code, an executable file, some intermediate form, or the like. The computer-readable medium may include at least: any entity or device capable of carrying computer program code to a photographing apparatus/terminal apparatus, a recording medium, computer memory, Read-Only Memory (ROM), Random Access Memory (RAM), an electrical carrier signal, a telecommunications signal, and a software distribution medium, for example a USB disk, a removable hard disk, a magnetic disk, or an optical disk.
In the above embodiments, the descriptions of the respective embodiments have respective emphasis, and reference may be made to the related descriptions of other embodiments for parts that are not described or illustrated in a certain embodiment.
The above-mentioned embodiments are only used for illustrating the technical solutions of the present application, and not for limiting the same; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; such modifications and substitutions do not substantially depart from the spirit and scope of the embodiments of the present application and are intended to be included within the scope of the present application.

Claims (10)

1. A traffic accident report extraction and scene type inference method, characterized by comprising the following steps:
classifying entities of a traffic accident scene, and constructing a V2V accident domain ontology;
performing data preprocessing on the traffic accident report;
extracting accident information from the preprocessed traffic accident report to acquire specific information of the accident scene;
and performing inference to obtain other implicit association structures in the ontology, and outputting the scene type corresponding to the structured scene information.
2. The traffic accident report extraction and scene type inference method according to claim 1, wherein: in the step of classifying the entities of the traffic accident scene and constructing the V2V accident domain ontology, a class hierarchy, object properties and a plurality of instances of the V2V accident domain ontology are designed, and a basic ontology is constructed using the OWL language and an ontology construction tool; the core ontology of the V2V accident domain includes the following modules: Vehicle, Obstacle, Object Behavior, Environment, Road Network and Accident Scenario Type; wherein Vehicle describes the main traffic participants; Obstacle is divided by state into two kinds of objects: static obstacles and dynamic obstacles; Object Behavior is derived by extracting the behavior attributes common to Obstacle and Vehicle, and is used to describe the behavior of an entity; Environment serves as the environmental element of an accident scene and describes the weather state and the illumination condition; Road Network is the module describing the road network, including the shape of the road network and its physical structure; and Accident Scenario Type is the accident scene category module, which identifies the type of an accident scene.
3. The traffic accident report extraction and scene type inference method according to claim 2, wherein: the data preprocessing of the traffic accident report comprises domain-specific vocabulary processing, coreference resolution, sentence boundary detection and dependency analysis; and the object association structure shown in Table 1 and the data association structure shown in Table 2 are set for the core ontology of the V2V accident domain:
TABLE 1
Figure FDA0003407723850000011
Figure FDA0003407723850000021
TABLE 2
Attribute name          Domain             Range
speed_is                Speed              double
move_direction_is       Direction          string
lane_width_is           Lane               double
Scenario_type_is        AccidentScenario   string
relative_direction_is   AccidentScenario   string
has_traffic_light       RoadNetwork        boolean
has_stop_sign           RoadNetwork        boolean
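By way of non-limiting illustration, the data association structure of Table 2 may be held in memory as a mapping from each attribute name to its domain class and range type. The following Python sketch is an assumption made for clarity and is not part of the claimed method: the names DATA_PROPERTIES and check_assertion are hypothetical, and the strict check is a simplification, since in OWL the domain and range axioms are used for inference rather than for validation.

```python
# Data properties transcribed from Table 2: name -> (domain class, range type).
# "double" is mapped to Python float, "string" to str, "boolean" to bool.
DATA_PROPERTIES = {
    "speed_is": ("Speed", float),
    "move_direction_is": ("Direction", str),
    "lane_width_is": ("Lane", float),
    "Scenario_type_is": ("AccidentScenario", str),
    "relative_direction_is": ("AccidentScenario", str),
    "has_traffic_light": ("RoadNetwork", bool),
    "has_stop_sign": ("RoadNetwork", bool),
}


def check_assertion(prop, subject_class, value):
    """Return True if (subject_class, value) fits the domain and range of prop.

    Hypothetical helper: a strict check used here only to illustrate Table 2.
    """
    if prop not in DATA_PROPERTIES:
        return False
    domain, value_type = DATA_PROPERTIES[prop]
    if subject_class != domain:
        return False
    if value_type is float:
        # bool is a subclass of int in Python, so exclude it for numeric ranges.
        return isinstance(value, (int, float)) and not isinstance(value, bool)
    return isinstance(value, value_type)
```

For example, check_assertion("speed_is", "Speed", 13.4) accepts a numeric speed, whereas the same value asserted on a Lane subject is rejected because the domain does not match.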
4. The traffic accident report extraction and scene type inference method according to claim 1, wherein: in the step of extracting accident information from the preprocessed traffic accident report, the preprocessed traffic accident report is first subjected to ontology analysis, and the information is then extracted according to an extraction rule.
5. The traffic accident report extraction and scene type inference method according to claim 4, wherein the specific steps of extracting accident information from the preprocessed traffic accident report include:
defining a domain-specific dictionary, and converting specific words 'A' and 'B' in the report that meet the matching conditions into the form 'A-B' through regular-expression matching, so that they are treated as a single unit;
restoring each referring word in the text to the object to which it originally refers;
performing sentence boundary detection to convert the text into a plurality of single sentences;
and performing dependency analysis on the unstructured accident report using a natural language processing toolkit to obtain the dependency relationships between the words in each sentence.
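By way of non-limiting illustration, the dictionary-based merging of 'A' and 'B' into 'A-B' and the sentence boundary detection may be sketched with the Python standard library as follows. The dictionary entries and the function names merge_domain_terms and split_sentences are hypothetical, and the sentence splitter is deliberately naive (it would also split after abbreviations such as "approx."). Coreference resolution and dependency analysis require a natural language processing toolkit and are not sketched here.

```python
import re

# Hypothetical domain dictionary: multi-word terms to be treated as one unit.
DOMAIN_TERMS = ["stop sign", "traffic light", "left turn"]


def merge_domain_terms(text, terms=DOMAIN_TERMS):
    """Rewrite each dictionary term 'A B' found in text as 'A-B'."""
    for term in terms:
        # Allow any run of whitespace between the words of the term.
        pattern = r"\b" + r"\s+".join(re.escape(w) for w in term.split()) + r"\b"
        text = re.sub(
            pattern,
            lambda m: "-".join(m.group(0).split()),  # keep the original casing
            text,
            flags=re.IGNORECASE,
        )
    return text


def split_sentences(text):
    """Naive sentence boundary detection on '.', '!' or '?' plus whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text)
    return [p.strip() for p in parts if p.strip()]
```

A report would first be passed through merge_domain_terms and then through split_sentences, so that each resulting single sentence already carries the merged terms when it reaches dependency analysis.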
6. The traffic accident report extraction and scene type inference method according to claim 1, wherein the step of obtaining the specific information of the accident scene comprises:
ontology analysis, namely importing the class, attribute relationship and instance information in the ontology;
importing the analysis result produced for the traffic accident report by the relation extraction module, the analysis result comprising the grouped single sentences and the dependency relationships between the words in each single sentence;
extracting the single-sentence information of the traffic accident report, wherein the content of only one sentence of the traffic accident report is extracted each time, and the extraction is finished after all sentences in the traffic accident report have been traversed;
creating an object list, initially empty, which stores instantiated objects of the classes or instances identified from the sentences; if an identified item is an instance, searching the ontology for the parent class to which the instance belongs, and then querying the object list to determine whether an instance of that class already exists; if it exists, proceeding to the next step; if it does not exist, first generating an instantiated object of the class, adding the object to the list, and then proceeding to the next step;
traversing the dependency relationships of the entities, which includes looking up the dependency relationships and populating the attributes of the instance objects.
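By way of non-limiting illustration, the maintenance of the object list may be sketched as follows. The ontology fragment, the dictionary-based object representation and the names resolve_class and update_object_list are hypothetical. The handling of an item that is identified directly as a class is also an assumption, since the claim spells out only the case in which the item is an instance.

```python
# Hypothetical fragment of the ontology: instance name -> parent class.
INSTANCE_TO_CLASS = {"sedan": "Vehicle", "truck": "Vehicle", "rain": "Environment"}
ONTOLOGY_CLASSES = {"Vehicle", "Obstacle", "Environment", "RoadNetwork"}


def resolve_class(token):
    """Return the class for a token that names a class or an instance, else None."""
    if token in ONTOLOGY_CLASSES:
        return token
    return INSTANCE_TO_CLASS.get(token)


def update_object_list(object_list, token):
    """Return the object for the token's class, creating it if it is missing."""
    cls = resolve_class(token)
    if cls is None:
        return None  # the token is not known to the ontology
    for obj in object_list:
        if obj["class"] == cls:
            return obj  # an object of this class already exists; reuse it
    obj = {"class": cls, "attributes": {}}
    object_list.append(obj)
    return obj


def fill_attribute(obj, name, value):
    """Populate one attribute of an instantiated object."""
    obj["attributes"][name] = value
```

The object list starts empty for each report; tokens taken from the dependency relationships are passed to update_object_list one at a time, and fill_attribute records the values found for the returned object.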
7. The traffic accident report extraction and scene type inference method according to claim 1, wherein in the step of inferring other implicit association structures in the ontology, the scene inference rules are described in the SWRL language, the rules are edited by software, and an inference engine is used to automatically infer the other implicit association structures in the ontology.
8. The traffic accident report extraction and scene type inference method according to claim 7, wherein the step of deriving other implicit association structures in the ontology through automated inference by the inference engine comprises:
importing an OWL ontology file;
importing scene information: before the scene type is inferred, importing the structured scene information converted from the unstructured accident report into a scene information list;
constructing instances: creating the scene contents in the scene information list as instances of the corresponding classes;
adding association structures between instances: filling the association relationships among all the instances into the ontology;
invoking the inference engine: the inference engine searches the rule base according to the instantiated ontology information and returns the inference results that meet the conditions;
outputting the inference result: outputting the scene type corresponding to the structured scene information.
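By way of non-limiting illustration, the rule lookup performed by the inference engine may be approximated in plain Python as a match of the structured scene information against the antecedent of each rule. This sketch is only a stand-in for SWRL rules evaluated by an OWL reasoner: the rule contents, the scenario type labels and the name infer_scenario_types are hypothetical, and only the attribute names are taken from Table 2.

```python
# Hypothetical rule base: each rule pairs antecedent facts with a scenario type.
RULES = [
    ({"relative_direction_is": "same", "has_traffic_light": False}, "rear-end"),
    ({"relative_direction_is": "opposite"}, "head-on"),
    ({"relative_direction_is": "perpendicular", "has_stop_sign": True},
     "stop-sign crossing"),
]


def infer_scenario_types(scene_info, rules=RULES):
    """Return the scenario type of every rule whose antecedent is satisfied.

    scene_info maps attribute names to the values extracted from one report.
    A rule fires only if every one of its antecedent facts is present and equal.
    """
    matched = []
    for conditions, scenario_type in rules:
        if all(scene_info.get(k) == v for k, v in conditions.items()):
            matched.append(scenario_type)
    return matched
```

An attribute that is absent from the scene information never satisfies a condition in this sketch, so an incompletely extracted report yields no scenario type rather than a guessed one.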
9. A traffic accident report extraction and scene type inference system, comprising:
the accident ontology construction module is used for classifying entities of a traffic accident scene and constructing a V2V accident domain ontology;
the data preprocessing module is used for preprocessing the data of the traffic accident report;
the accident information extraction module is used for extracting accident information of the traffic accident report after data preprocessing to obtain specific information of an accident scene;
and the scene type output module is used for performing inference to obtain other implicit association structures in the ontology and outputting the scene type corresponding to the structured scene information.
10. A computer-readable storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the steps in the traffic accident report extraction and scene type inference method according to any of claims 1 to 8.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111518348.6A CN114186072A (en) 2021-12-13 2021-12-13 Method, system and storage medium for extracting traffic accident report and reasoning scene type

Publications (1)

Publication Number Publication Date
CN114186072A true CN114186072A (en) 2022-03-15

Family

ID=80543444

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111518348.6A Pending CN114186072A (en) 2021-12-13 2021-12-13 Method, system and storage medium for extracting traffic accident report and reasoning scene type

Country Status (1)

Country Link
CN (1) CN114186072A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114880408A (en) * 2022-05-31 2022-08-09 小米汽车科技有限公司 Scene construction method, device, medium and chip

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination