CN110134963A - Method for applying text mining to road traffic accident data processing - Google Patents
- Publication number
- CN110134963A CN201910418287.2A
- Authority
- CN
- China
- Prior art keywords
- data
- text
- accident
- traffic accident
- road traffic
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/3331—Query processing
- G06F16/334—Query execution
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/35—Clustering; Classification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Computational Linguistics (AREA)
- Data Mining & Analysis (AREA)
- Artificial Intelligence (AREA)
- Health & Medical Sciences (AREA)
- General Health & Medical Sciences (AREA)
- Databases & Information Systems (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Evolutionary Computation (AREA)
- Life Sciences & Earth Sciences (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention discloses a method for applying text mining to road traffic accident data processing. Chinese word segmentation is performed on road traffic accident data samples; a word embedding model converts the sample data set into three-dimensional vector form; the large-scale text classification network TextCNN is then built on a convolutional neural network (CNN) to construct a model that outputs key traffic information. The invention processes traffic accident record text with natural language processing techniques and combines Python and C++ to develop an accident causation repair system that can process accident data records automatically and in batches, effectively improving accident data quality. The accident causation repair system is easy to operate, efficient, presents information intuitively, and analyzes causation accurately. It also makes full use of text data and simplifies their use during model building. The invention is inexpensive to implement, effectively compensates for the shortcomings of structured road traffic accident records in China, and can repair traffic accident data efficiently and accurately.
Description
Technical field
The present invention relates to the technical field of road accident processing, and in particular to a method for applying text mining to road traffic accident data processing.
Background art
Driven by the national strategy of building a strong transportation country, road traffic in China has entered a period of transition from rapid growth to high-quality development, and traffic safety is receiving wide attention. Traffic accident data are the core data source for traffic safety research and provide the basic information on which road safety improvement depends. In recent years, the full rollout of the integrated public security traffic management application platform (the "six-in-one" platform) has effectively raised the level of informatization of traffic accident handling and historical accident archiving. However, according to the 2014 annual statistical report on road traffic of the People's Republic of China, in the statistics on main accident causes for mainland China, motor vehicle violations with an unclear cause, recorded as "other behaviors affecting safety", account for 43% of accident causes. This severely limits the usability of the accident data and sharply reduces the accuracy and effectiveness of analyzing and improving road traffic safety in China. By comparison, in the Hong Kong accident cause statistics published by the Hong Kong Transport Department, violations in the "other" category account for only 13%. Notably, the "six-in-one" platform stores the textual accident descriptions recorded by traffic police at the time of the accident, which describe the accident scene in detail. Because natural language is unstructured, however, the useful information in the accident text cannot be extracted directly in batches and is difficult to incorporate into the road safety data system.
Previous traffic accident research has dealt with incomplete accident data in two main ways. (1) Statistical methods that can handle data heterogeneity use flexible, generalized parameters to reduce the parameter estimation bias caused by incompletely recorded information; these include random parameter models, latent variable models, multi-factor structure models, and the introduction of spatial structure. The advantage of this class of methods is that processing is simple and no data need to be collected again, but it is difficult to fully correct the missing information, the influence of the missing information cannot be quantified, and the analysis results depend heavily on model selection. (2) In-depth accident investigation uses available video data, trace data, witness statements and other original materials, together with accident reconstruction techniques, to examine the mechanism by which the accident occurred. The advantage of this class of methods is that the accident scene can be fully reconstructed, but it is suited only to case analyses with complete original data and small samples.
At present, a small number of studies use text mining to predict the duration of traffic events (congestion, accidents and so on), but foundational research on text-mining-based analysis of traffic accident information and improvement of data quality has not yet been carried out.
In summary, the quality of traffic accident data urgently needs improvement, and existing solutions to incomplete data have shortcomings such as unreliable repair capability and excessive dependence on original data that are difficult to obtain in full for a single accident. Meanwhile, the accident description text stored in the "six-in-one" platform remains to be used effectively. Repairing accident data with the text mining technique proposed by the present invention therefore offers a new and effective solution to the problem of repairing accident data.
Summary of the invention
The present invention aims to solve at least one of the technical problems existing in the prior art. To this end, the invention discloses a method for applying text mining to road traffic accident data processing: Chinese word segmentation is performed on road traffic accident data samples, a word embedding model converts the sample data set into three-dimensional vector form, the large-scale text classification network TextCNN is built on a convolutional neural network (CNN) to construct a model, and key traffic information is output.
Further, performing Chinese word segmentation on the road traffic accident data samples includes: on the basis of the general-purpose corpus built into the open-source library jieba, importing a traffic safety corpus as a custom dictionary according to the characteristics of the scenario, segmenting the samples, and then removing stop words to delete text irrelevant to liability determination, which strengthens the ability to resolve ambiguity.
Further, converting the sample data set into three-dimensional vector form with the word embedding model further comprises: in view of the short texts typical of road traffic accident scenarios, the row vectors corresponding to each word in a text are concatenated vertically, so that the text is represented by a two-dimensional matrix (x, y) in which the features of every word are preserved; after all texts have been converted into two-dimensional matrices, the text planes are stacked on top of one another, so that the entire data set is represented by a three-dimensional matrix (x, y, z); if the texts differ in length, they are padded with 0 to guarantee a three-dimensional matrix of consistent structure.
Further, building the TextCNN text classification network on a CNN to construct the model further comprises: partitioning the data set, binarizing the label value y so that the model can be trained and used for prediction, then shuffling the data and dividing them into a training data set and a test data set at a preset ratio; modeling with the Keras framework using TensorFlow as the backend, building a Conv1D convolutional layer, a global max pooling layer, a Dropout layer to prevent overfitting, and an output layer; and, after training on the training data set for 200 epochs, testing the model on the test data set to obtain the model's performance, accuracy, and error.
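To make the two central layers concrete, the following is a minimal pure-Python sketch of what a one-dimensional convolution over the word axis followed by global max pooling computes for a single text matrix and a single filter. It is not the Keras implementation named in the text; the ReLU activation, the filter values, and the toy input are assumptions chosen for illustration.

```python
def conv1d(matrix, kernel, bias=0.0):
    """Slide one filter over the word axis (no padding) and apply ReLU.

    matrix: list of word vectors, shape (words, dim)
    kernel: filter weights, shape (window, dim)
    ReLU is an assumption; the source does not name an activation.
    """
    window = len(kernel)
    outputs = []
    for start in range(len(matrix) - window + 1):
        total = bias
        for k in range(window):
            for d in range(len(kernel[k])):
                total += matrix[start + k][d] * kernel[k][d]
        outputs.append(max(0.0, total))
    return outputs


def global_max_pool(feature_map):
    """Keep only the strongest response of the filter across the whole text."""
    return max(feature_map)


# Toy input: 4 words, each with a 2-dimensional embedding (made-up values).
matrix = [[1, 0], [0, 1], [1, 1], [0, 0]]
# One filter spanning a window of 2 consecutive words (made-up weights).
kernel = [[1, 0], [0, 1]]

feature_map = conv1d(matrix, kernel)
pooled = global_max_pool(feature_map)
```

Because pooling keeps one value per filter regardless of text length, the layer that follows receives a fixed-size vector even though the number of window positions depends on the padded text length.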
Further, in dividing the training data set and the test data set at a preset ratio, the preset ratio is 8:2.
Further, accident causes are divided into five classes: the four causes state factor, speed factor, steering factor, and spacing factor are set as the model classes, and data whose classification is not yet clear are labeled "other". Before model training, all data in the "other" class are extracted; after the model has been trained on the training set, the data whose classification is not yet clear are fed into the model for prediction to obtain the class determined by the model.
Further, data visualization is performed to generate a geographic region map that shows traffic accident frequency with different levels of highlighting, presenting the results of traffic accident big data analysis visually.
Further, functions in Excel are used to extract, from the details of each accident record, a column for the accident location; the table is then imported into Tableau as the accident data table to be joined with a geographic information database; Tableau joins the accident source data table with its built-in geographic information database to produce a regional heat map of traffic accident property loss and accident scale.
Further, a traffic accident causation repair system is developed in a Java environment to extract the key information from traffic accident text records and repair accident causes recorded as "other"; when a traffic accident text record is retrieved in the system, the system structures the record and finally outputs the key traffic information.
Further, the key traffic information includes: final classification, time of the accident, driver name, license plate number, vehicle type, accident location, loss, and whether a state factor is involved.
With traditional techniques, traffic accident text records are highly unstructured data and difficult to use effectively. Compared with the prior art, the invention processes traffic accident record text with natural language processing techniques and combines Python and C++ to develop an accident causation repair system that can process accident data records automatically and in batches, effectively improving accident data quality. The accident causation repair system is easy to operate, efficient, presents information intuitively, and analyzes causation accurately.
Because Chinese words cannot be delimited by spaces or other punctuation, a traffic safety corpus built from the relevant laws, regulations and implementing rules is used as a custom dictionary on top of the corpus built into the open-source library jieba. Segmenting the samples and removing stop words on that basis strengthens the ability to resolve ambiguity and ensures segmentation accuracy.
The word2vec model vectorizes the text, so that the entire data set is represented by a three-dimensional matrix and the features of each word are better preserved in the matrix. This overcomes the limitation imposed by the unstructured nature of text data, makes full use of the text data, and simplifies their use during model building.
The invention is inexpensive to implement, effectively compensates for the shortcomings of structured road traffic accident records in China, and can repair traffic accident data efficiently and accurately. In addition, this system is the first in China to apply text mining to accident data repair, and the text analysis techniques it applies are at an internationally advanced level.
Brief description of the drawings
The invention will be further understood from the following description with reference to the accompanying drawings. The components in the figures are not necessarily drawn to scale; emphasis is instead placed on illustrating the principles of the embodiments. In the different views, the same reference numerals designate corresponding parts.
Fig. 1 is a diagram of the sample three-dimensional matrix data set of the invention;
Fig. 2 is a regional heat map of accident scale in one embodiment of the invention;
Fig. 3 is an example of the processing interface of the accident causation repair system in one embodiment of the invention;
Fig. 4 is an example of the output interface of the accident causation repair system in one embodiment of the invention.
Detailed description of the embodiments
Embodiment one
In this embodiment, the data are first processed and a model is constructed. The invention uses Python as the main language, performs Chinese word segmentation on the samples with the open-source library jieba, then relies on the word2vec model to convert the data set into three-dimensional vector form, and finally uses a convolutional neural network (CNN) to build the TextCNN network and implement the model.
1.1 Chinese word segmentation
As shown in Fig. 1, Chinese word segmentation is the process of recombining a sequence of Chinese characters into a sequence of words according to rules for extracting specialized terms. Removing stop words is a key step: when natural language data are processed for information retrieval, characters and words that carry no real meaning for the text are filtered out, which saves storage space and improves retrieval efficiency.
On the basis of the general-purpose corpus built into the open-source library jieba, a traffic safety corpus is imported as a custom dictionary according to the characteristics of the scenario, the samples are segmented, and stop words are then removed. This deletes a large amount of text irrelevant to liability determination and strengthens the ability to resolve ambiguity.
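The effect of adding a domain dictionary before stop-word removal can be sketched without any external library. The example below uses forward maximum matching, a much simpler technique than the one jieba uses, purely to show why a traffic safety term list changes the segmentation; the vocabularies, the stop-word list, and the sample sentence are all made up for illustration.

```python
def segment(text, dictionary, max_len=7):
    """Forward maximum matching: at each position take the longest known word.

    This is a toy stand-in for a real segmenter such as jieba.
    """
    tokens = []
    i = 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            # Fall back to a single character when no dictionary word matches.
            if size == 1 or piece in dictionary:
                tokens.append(piece)
                i += size
                break
    return tokens


def remove_stopwords(tokens, stopwords):
    """Drop tokens that carry no meaning for cause analysis."""
    return [t for t in tokens if t not in stopwords]


# Hypothetical vocabularies, not taken from the patent.
GENERAL_WORDS = {"车辆", "发生", "导致", "事故"}
TRAFFIC_WORDS = {"未保持安全车距", "追尾"}  # stand-in for the traffic safety corpus
STOPWORDS = {"的", "了", "，", "。"}

text = "车辆未保持安全车距，导致追尾事故。"
tokens = segment(text, GENERAL_WORDS | TRAFFIC_WORDS)
cleaned = remove_stopwords(tokens, STOPWORDS)
```

With only `GENERAL_WORDS`, the phrase for "failed to keep a safe following distance" falls apart into single characters; with the domain terms added it survives as one token, which is the behavior the custom dictionary is meant to secure.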
In the raw data set, field 'x' is the original textual description of the accident, field 'y' is the finally determined new classification category, and field 'split' is the result after segmentation.
1.2 Constructing the three-dimensional vector data set
The raw data set is a conventional two-dimensional table in which one dimension represents features and the other the number of records. With this structure, word vectors can only be averaged so that a single row vector represents an entire text; the feature attributes are blurred, and the results are poor.
Given the short texts in this scenario, if the row vectors corresponding to each word in a passage are concatenated vertically, the passage can be represented by a matrix (x, y), and the features of each word are better preserved in the matrix, giving better results than averaging.
Because the data set contains many passages, a further dimension z must be added. If each text is represented by a plane, that is, after all texts have been converted into two-dimensional matrices, the text planes are stacked on top of one another, and a three-dimensional matrix (x, y, z) can represent the entire data set. If the texts differ in length, they can be padded with 0 to guarantee a three-dimensional matrix of consistent structure.
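The stacking and zero-padding step can be sketched in a few lines of plain Python. The axis order (texts, words, embedding dimension), the toy two-dimensional vectors, and the choice to map unknown words to a zero row are assumptions; in the described method the vectors would come from word2vec.

```python
def text_to_matrix(tokens, vectors, dim):
    """One row vector per word; words without a vector become a zero row (assumption)."""
    zero = [0.0] * dim
    return [list(vectors.get(tok, zero)) for tok in tokens]


def build_dataset_tensor(token_lists, vectors, dim):
    """Pad every text matrix with zero rows to the longest text, then stack them."""
    max_words = max(len(tokens) for tokens in token_lists)
    tensor = []
    for tokens in token_lists:
        matrix = text_to_matrix(tokens, vectors, dim)
        matrix += [[0.0] * dim for _ in range(max_words - len(matrix))]
        tensor.append(matrix)
    return tensor


# Made-up 2-dimensional embeddings for illustration only.
VECTORS = {"rear-end": [0.9, 0.1], "speeding": [0.2, 0.8], "truck": [0.5, 0.5]}
texts = [["truck", "speeding", "rear-end"], ["truck", "skid"]]

tensor = build_dataset_tensor(texts, VECTORS, dim=2)
shape = (len(tensor), len(tensor[0]), len(tensor[0][0]))
```

Every text ends up with the same number of rows, so the whole data set has one consistent shape and can be fed to a convolutional layer as a single array.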
1.3 Building the model
The data set is partitioned and the label value y is binarized so that the model can be trained and used for prediction; the data are then shuffled and divided into a training data set and a test data set at a ratio of 8:2. Modeling uses the Keras framework with TensorFlow as the backend. A Conv1D convolutional layer, a global max pooling layer, a Dropout layer to prevent overfitting, and an output layer are built. After training on the training data set for 200 epochs, the model is tested on the test data set to obtain the model's performance, accuracy, and error.
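The data preparation described here can be illustrated as follows. This sketch reads "binarizing the label value y" as one-hot encoding over the four model classes, which is an interpretation rather than something the text states; the fixed seed, the helper names, and the sample records are likewise assumptions.

```python
import random

CLASSES = [2, 3, 4, 5]  # state, speed, steering, spacing factors


def one_hot(label, classes=CLASSES):
    """Encode a class label as a 0/1 vector with a single 1."""
    return [1 if c == label else 0 for c in classes]


def shuffle_split(samples, labels, train_ratio=0.8, seed=0):
    """Shuffle sample/label pairs together, then cut them at the given ratio."""
    pairs = list(zip(samples, labels))
    random.Random(seed).shuffle(pairs)
    cut = round(len(pairs) * train_ratio)
    return pairs[:cut], pairs[cut:]


# Ten placeholder records with labels cycling through the four classes.
samples = [f"record {i}" for i in range(10)]
labels = [CLASSES[i % len(CLASSES)] for i in range(10)]

train, test = shuffle_split(samples, labels)
train_targets = [one_hot(label) for _, label in train]
```

Shuffling pairs rather than two separate lists keeps every description attached to its own label, which is easy to get wrong when the two are shuffled independently.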
The invention divides accident causes into five classes. Class 1 is "other", that is, data whose classification is not yet clear, so the remaining four causes, 2 (state factor), 3 (speed factor), 4 (steering factor), and 5 (spacing factor), are set as the model classes. Before model training, all class 1 data, which account for about one third of the total accident cause data, are extracted. After the model has been trained on the training set, the class 1 data are fed into the model for prediction to obtain the class determined by the model. On inspection, the test accuracy is relatively high.
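The hold-out-and-repair workflow can be sketched as below. The field names 'x' and 'y' follow the raw data set described earlier; the keyword-based `toy_classifier` is only a placeholder for the trained model, and the sample descriptions are invented.

```python
OTHER = 1  # class 1 is "other": the cause was not clearly recorded


def split_by_known_cause(records):
    """Separate records with a usable label from those recorded as 'other'."""
    labeled = [r for r in records if r["y"] != OTHER]
    unlabeled = [r for r in records if r["y"] == OTHER]
    return labeled, unlabeled


def repair_causes(unlabeled, classify):
    """Return copies of the 'other' records with the predicted class filled in."""
    return [{**r, "y": classify(r["x"])} for r in unlabeled]


def toy_classifier(description):
    """Placeholder for the trained model: 3 = speed factor, 5 = spacing factor."""
    return 3 if "speed" in description else 5


# Invented sample records for illustration.
records = [
    {"x": "vehicle followed too closely and rear-ended a truck", "y": 5},
    {"x": "driver exceeded the speed limit on a curve", "y": 3},
    {"x": "coach was speeding in rain and left the road", "y": 1},
]

labeled, unlabeled = split_by_known_cause(records)
repaired = repair_causes(unlabeled, toy_classifier)
```

Only `labeled` would be used for training and testing; the `unlabeled` records never influence the model and receive a class only afterwards.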
2 Data visualization
As shown in Fig. 2, this embodiment uses Tableau to visualize accidents from 2012 to 2018 on Hunan Province highways involving passenger vehicles with seven or more seats, generating a geographic region map that shows the traffic accident frequency of each city in Hunan Province with different levels of highlighting and presenting the results of traffic accident big data analysis visually.
The data source for the visualization is the seven years of traffic accident record text from Hunan Province. Functions in Excel are used to extract, from the details of each accident record, a column giving the accident location by prefecture-level city in Hunan Province; the table is then imported into Tableau as the accident data table to be joined with a geographic information database. Tableau joins the accident source data table with its built-in geographic information database to produce a regional heat map of traffic accident property loss and accident scale in Hunan Province.
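The location extraction that the text performs with Excel functions can be approximated in Python as a simple substring lookup followed by a count per city. This is a stand-in, not the workflow the patent uses; the city list is a partial list of Hunan prefecture-level cities, and the location strings are invented.

```python
from collections import Counter

# Partial list of prefecture-level cities in Hunan Province.
HUNAN_CITIES = ["长沙", "株洲", "湘潭", "衡阳", "邵阳", "岳阳", "常德"]


def extract_city(location, cities=HUNAN_CITIES):
    """Return the first listed city named in the location text, or None."""
    for city in cities:
        if city in location:
            return city
    return None


def accidents_per_city(locations):
    """Count records per city, skipping locations with no recognizable city."""
    found = (extract_city(loc) for loc in locations)
    return Counter(city for city in found if city is not None)


# Invented location strings for illustration.
locations = ["长沙市岳麓区某路段", "株洲市某高速入口", "长沙市某隧道", "地点不详"]
counts = accidents_per_city(locations)
```

A table with one city column and one count column is the kind of input a mapping tool can join against its own geographic data to shade regions by frequency.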
Fig. 2 clearly shows the areas of Hunan Province where traffic accidents occur most often. In future work, accident causation in these areas will be further classified and analyzed in order to construct and identify the high-risk traffic accident scenarios of different regions and to plan and improve local transportation facilities in a targeted way.
3 Development of the accident causation repair system
The traffic accident causation repair system, whose front-end interface is shown in Figs. 3 and 4, is developed in a Java environment. It structures natural language text, extracts the key information from traffic accident text records, and repairs accident causes recorded as "other". When a traffic accident text record is retrieved in the system, the system interprets the record, that is, structures it, and can finally output key information such as "final classification", "time of the accident", "driver name", "license plate number", "vehicle type", "accident location", "loss", and "whether a state factor is involved". Where the causes in historical accident data are incomplete, the system can repair them accurately and improve data quality; it can also be applied to processing the causes of future traffic accident data, enhancing data completeness and usability.
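The structured output can be pictured as a record with the eight listed fields. The sketch below fills two of them with regular expressions from an invented sentence; the field names, the patterns, and the use of regular expressions at all are assumptions, since the text does not describe how the repair system performs the extraction.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class AccidentRecord:
    """The eight key fields named in the text; the English names are hypothetical."""
    final_classification: Optional[str] = None
    accident_time: Optional[str] = None
    driver_name: Optional[str] = None
    license_plate: Optional[str] = None
    vehicle_type: Optional[str] = None
    accident_location: Optional[str] = None
    loss: Optional[str] = None
    state_factor_involved: Optional[bool] = None


DATE_PATTERN = re.compile(r"(\d{4})年(\d{1,2})月(\d{1,2})日")
# One Chinese character, one uppercase letter, then five letters or digits.
PLATE_PATTERN = re.compile(r"[\u4e00-\u9fa5][A-Z][A-Z0-9]{5}")


def structure(text):
    """Fill the fields that simple patterns can recover; leave the rest empty."""
    record = AccidentRecord()
    date = DATE_PATTERN.search(text)
    if date:
        year, month, day = date.groups()
        record.accident_time = f"{year}-{int(month):02d}-{int(day):02d}"
    plate = PLATE_PATTERN.search(text)
    if plate:
        record.license_plate = plate.group(0)
    return record


# Invented record text for illustration.
record = structure("2016年3月5日14时，驾驶人张某驾驶湘A12345大型客车在长沙市某路段追尾。")
```

Fields that the patterns cannot recover stay `None`, which keeps missing information visible instead of silently guessed.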
According to the classification of main traffic accident causes in the road traffic accident statistical annual report, the causes with the most cases are failure to yield as required, driving without a license, drunk driving, illegally passing oncoming vehicles, and so on. Further analysis shows, however, that some of these causes (driving without a license, drunk driving, fatigued driving) are only driving state factors and not the direct cause of the accident. When direct causes and driving state factors are mixed in this way, accident cause analysis and traffic safety improvement work lack reliability.
Embodiment two
A method for applying text mining to road traffic accident data processing: Chinese word segmentation is performed on road traffic accident data samples, a word embedding model converts the sample data set into three-dimensional vector form, the large-scale text classification network TextCNN is built on a convolutional neural network (CNN) to construct a model, and key traffic information is output.
Further, performing Chinese word segmentation on the road traffic accident data samples includes: on the basis of the general-purpose corpus built into the open-source library jieba, importing a traffic safety corpus as a custom dictionary according to the characteristics of the scenario, segmenting the samples, and then removing stop words to delete text irrelevant to liability determination, which strengthens the ability to resolve ambiguity.
Further, converting the sample data set into three-dimensional vector form with the word embedding model further comprises: in view of the short texts typical of road traffic accident scenarios, the row vectors corresponding to each word in a text are concatenated vertically, so that the text is represented by a two-dimensional matrix (x, y) in which the features of every word are preserved; after all texts have been converted into two-dimensional matrices, the text planes are stacked on top of one another, so that the entire data set is represented by a three-dimensional matrix (x, y, z); if the texts differ in length, they are padded with 0 to guarantee a three-dimensional matrix of consistent structure.
Further, building the TextCNN text classification network on a CNN to construct the model further comprises: partitioning the data set, binarizing the label value y so that the model can be trained and used for prediction, then shuffling the data and dividing them into a training data set and a test data set at a preset ratio; modeling with the Keras framework using TensorFlow as the backend, building a Conv1D convolutional layer, a global max pooling layer, a Dropout layer to prevent overfitting, and an output layer; and, after training on the training data set for 200 epochs, testing the model on the test data set to obtain the model's performance, accuracy, and error.
Further, in dividing the training data set and the test data set at a preset ratio, the preset ratio is 8:2.
Further, accident causes are divided into five classes: the four causes state factor, speed factor, steering factor, and spacing factor are set as the model classes, and data whose classification is not yet clear are labeled "other". Before model training, all data in the "other" class are extracted; after the model has been trained on the training set, the data whose classification is not yet clear are fed into the model for prediction to obtain the class determined by the model.
Further, data visualization is performed to generate a geographic region map that shows traffic accident frequency with different levels of highlighting, presenting the results of traffic accident big data analysis visually.
Further, functions in Excel are used to extract, from the details of each accident record, a column for the accident location; the table is then imported into Tableau as the accident data table to be joined with a geographic information database; Tableau joins the accident source data table with its built-in geographic information database to produce a regional heat map of traffic accident property loss and accident scale.
Further, a traffic accident causation repair system is developed in a Java environment to extract the key information from traffic accident text records and repair accident causes recorded as "other"; when a traffic accident text record is retrieved in the system, the system structures the record and finally outputs the key traffic information.
Further, the key traffic information includes: final classification, time of the accident, driver name, license plate number, vehicle type, accident location, loss, and whether a state factor is involved.
In this embodiment, the method proposes, on an exploratory basis, a reconstructed variable design. Using the model and the text processing method, it recovers the direct causes of accidents under different driver states and identifies nine states in total, such as poor physical condition and fatigued driving. For accidents whose cause had previously been judged to be a "state factor", it finds and counts the direct cause, thereby distinguishing the direct cause of an accident from the state factor, reducing estimation bias, and effectively improving the robustness of safety analysis results. On the basis of these state statistics, traffic safety analysis can accurately characterize driver behavior under different states, so that reasonable measures can be taken to reduce accidents.
It should also be noted that the terms "include", "comprise", or any other variant thereof are intended to cover non-exclusive inclusion, so that a process, method, commodity, or device that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, commodity, or device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the existence of other identical elements in the process, method, commodity, or device that includes the element.
Those skilled in the art will understand that embodiments of the present application may be provided as a method, a system, or a computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware. Moreover, the application may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to magnetic disk storage, CD-ROM, optical storage, and the like) containing computer-usable program code.
Although the invention has been described above with reference to various embodiments, it should be understood that many changes and modifications can be made without departing from the scope of the invention. The foregoing detailed description is therefore intended to be regarded as illustrative rather than limiting, and it should be understood that the following claims (including all equivalents) are intended to define the spirit and scope of the invention. The above embodiments should be understood as merely illustrating the invention and not limiting its scope of protection. After reading the contents of this disclosure, a skilled person can make various changes or modifications to the invention, and such equivalent changes and modifications likewise fall within the scope defined by the claims of the invention.
Claims (10)
1. a kind of text mining is applied to the method for road traffic accident data processing, which is characterized in that road traffic accident
Data sample carries out Chinese word segmentation, by word incorporation model by the sample data set three-dimensional vector, then passes through neural network
CNN builds large-scale text categorization network TextCNN network struction model, exports crucial traffic information.
2. a kind of text mining as described in claim 1 is applied to the method for road traffic accident data processing, feature exists
In described includes: in the pervasive corpus of library jieba itself of increasing income to road traffic accident data sample progress Chinese word segmentation
On the basis of, it according to scene feature, imports traffic safety corpus and sample is segmented as customized dictionary, then remove and stop
Word is left out and is sentenced the unallied text of duty, enhances ambiguity error correcting capability.
3. a kind of text mining as claimed in claim 2 is applied to the method for road traffic accident data processing, feature exists
In described to further comprise by the sample data set three-dimensional vector by word incorporation model: according to road traffic accident field
The corresponding row vector of each of the phrase of text word is carried out splicing up and down by the characteristics of scape short text, and the text is with one
A two-dimensional matrix (x, y) indicates that the feature of each word saves in a matrix, after converting two-dimensional matrix for all texts,
Multiple text planes are stacked, carry out splicing up and down in a vertical direction, indicate entire number with a stereoscopic three-dimensional matrix (x, y, z)
According to collection, if text number differs, with 0 filling, the consistent three-dimensional matrice of structure is obtained with guarantee.
4. The method of applying text mining to road traffic accident data processing according to claim 3, characterized in that building the large-scale text classification network TextCNN by the neural network CNN to construct the model further comprises: partitioning the data set, binarizing the label value y so that it can be predicted by the trained model, then randomizing the data and dividing it into a training data set and a test data set by a preset ratio; modeling with the Keras framework using TensorFlow as the backend, building a Conv1D convolutional layer, a global max pooling layer, a Dropout layer to prevent overfitting, and an output layer; and, after training on the training data set for 200 epochs, testing the model with the test data set to obtain the model's performance, accuracy and error.
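Claim 4 names Keras layers without spelling out what they compute. As an illustration only, the sketch below is a hand-written forward pass in plain Python, not the Keras API: it shows one Conv1D filter sliding over the word axis of a text matrix, followed by global max pooling. The input matrix and filter weights are hypothetical, the ReLU activation is an assumption, and Dropout, the output layer and training are omitted.

```python
def conv1d(matrix, kernel, bias=0.0):
    """Slide one filter over the word axis and apply ReLU.

    matrix: list of word vectors (words x dim)
    kernel: list of weight rows (window x dim)
    """
    window = len(kernel)
    outputs = []
    for start in range(len(matrix) - window + 1):
        total = bias
        for row, weights in zip(matrix[start:start + window], kernel):
            total += sum(x * w for x, w in zip(row, weights))
        outputs.append(max(0.0, total))  # ReLU activation (assumed)
    return outputs


def global_max_pool(feature_map):
    """Keep only the strongest response of a filter across the text."""
    return max(feature_map)


# Hypothetical input: 4 words, 2-dimensional embeddings.
text = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
# Hypothetical filter spanning 2 consecutive words.
kernel = [[1.0, 0.0], [0.0, 1.0]]

feature_map = conv1d(text, kernel)
pooled = global_max_pool(feature_map)
print(feature_map, pooled)
```

Because pooling keeps a single value per filter, the result no longer depends on where in the text the matching word window occurred, which suits short accident descriptions of varying length.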
5. The method of applying text mining to road traffic accident data processing according to claim 4, characterized in that, in dividing the training data set and the test data set by a preset ratio, the preset ratio is 8:2.
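The label binarization, shuffling and 8:2 split of claims 4 and 5 can be sketched with the standard library alone. Reading "binarization" as one-hot encoding is an assumption, as are the English class names, the fixed random seed and the placeholder records.

```python
import random

CLASSES = ["state", "speed", "steering", "spacing"]  # the four model classes


def one_hot(label):
    """Binarize a class label into a 0/1 vector."""
    return [1 if label == name else 0 for name in CLASSES]


def shuffle_and_split(samples, labels, train_ratio=0.8, seed=0):
    """Shuffle the pairs, then cut them 8:2 into training and test sets."""
    pairs = list(zip(samples, (one_hot(lbl) for lbl in labels)))
    random.Random(seed).shuffle(pairs)  # the fixed seed is an assumption
    cut = int(len(pairs) * train_ratio)
    return pairs[:cut], pairs[cut:]


# Hypothetical data: ten placeholder texts with cycling labels.
samples = [f"record {i}" for i in range(10)]
labels = [CLASSES[i % 4] for i in range(10)]
train, test = shuffle_and_split(samples, labels)
print(len(train), len(test))
```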
6. The method of applying text mining to road traffic accident data processing according to claim 5, characterized in that accident causes are divided into five classes: the four causes of state factor, speed factor, steering factor and spacing factor are set as the model categories, and data whose classification is not yet clear are set as "other"; before model training, all data classified as "other" are first extracted; after the model has been trained with the training set, the data whose classification is not yet clear are imported into the model for prediction, to obtain the output category determined by the model.
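The hold-out-then-predict flow of claim 6 can be sketched as below. The keyword lookup is only a toy stand-in for the trained TextCNN, and the records, field names and English class names are hypothetical.

```python
def split_other(records):
    """Separate records labelled "other" from the clearly labelled ones."""
    labelled = [r for r in records if r["cause"] != "other"]
    unclear = [r for r in records if r["cause"] == "other"]
    return labelled, unclear


def train_keyword_model(labelled):
    """Toy stand-in for training: remember the class each word occurs with."""
    vocab = {}
    for record in labelled:
        for word in record["tokens"]:
            vocab.setdefault(word, record["cause"])
    return vocab


def predict(model, tokens, default="other"):
    """Return the class of the first known word, else the default."""
    for word in tokens:
        if word in model:
            return model[word]
    return default


# Hypothetical, pre-segmented records.
records = [
    {"tokens": ["超速", "行驶"], "cause": "speed"},
    {"tokens": ["未保持", "车距"], "cause": "spacing"},
    {"tokens": ["车辆", "超速"], "cause": "other"},
]
labelled, unclear = split_other(records)
model = train_keyword_model(labelled)
repaired = [predict(model, r["tokens"]) for r in unclear]
print(repaired)
```

Keeping the "other" records out of training means the model only ever learns the four well-defined causes, and the unclear records are then mapped onto one of them.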
7. The method of applying text mining to road traffic accident data processing according to claim 6, characterized in that data visualization is performed to generate a geographic area map that shows traffic accident frequency with different levels of highlighting, visually presenting the results of the traffic accident big data analysis.
8. The method of applying text mining to road traffic accident data processing according to claim 7, characterized in that, in Excel, the place where the accident occurred is extracted from the detail column of each accident record using the relevant functions, and the table is then imported into Tableau as the accident data table to be joined with the geographic information database; Tableau joins the accident source data table with Tableau's built-in geographic information database to obtain a regional distribution heat map of traffic accident property loss and accident scale.
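Claim 8 performs the place extraction in Excel and the mapping in Tableau. Purely to illustrate the data preparation behind such a heat map, the sketch below swaps in a Python regular expression and a counter. The pattern (a place assumed to follow "在" and end in 市, 县 or 区) and the detail strings are hypothetical.

```python
import re
from collections import Counter

# Hypothetical pattern: the place is assumed to follow "在" and end in 市/县/区.
PLACE = re.compile(r"在(\w+?(?:市|县|区))")


def extract_place(detail):
    """Return the first place name found in a detail text, or None."""
    match = PLACE.search(detail)
    return match.group(1) if match else None


def accidents_per_region(details):
    """Count accident records per extracted place."""
    places = (extract_place(d) for d in details)
    return Counter(p for p in places if p is not None)


# Hypothetical detail-column values.
details = [
    "2018年3月1日，在南京市发生追尾事故",
    "2018年3月2日，在苏州市发生刮擦事故",
    "2018年3月5日，在南京市发生侧翻事故",
    "地点不详",
]
counts = accidents_per_region(details)
print(counts.most_common())
```

The per-place counts are the kind of table that would then be joined against a geographic database to colour each region.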
9. The method of applying text mining to road traffic accident data processing according to claim 8, characterized in that a traffic accident cause repair system is developed, which extracts the key information from traffic accident text records in a Java environment and repairs the accident cause for records whose determined cause is "other"; when a traffic accident text record is retrieved in the system, the system performs structuring processing on the accident text record and finally outputs the key traffic information.
10. The method of applying text mining to road traffic accident data processing according to claim 9, characterized in that the key traffic information includes: final classification, accident time, driver name, license plate number, vehicle type, accident scene, loss, and whether the accident is state-related.
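The structuring step of claims 9 and 10 turns a free-text record into named fields. Claim 9 specifies a Java environment; the sketch below uses Python only to stay consistent with the other illustrations here. It covers three of the listed fields, and the regular expressions, field names and sample record are hypothetical.

```python
import re

# Hypothetical patterns; real records would need broader rules.
PATTERNS = {
    "accident_time": re.compile(r"\d{4}年\d{1,2}月\d{1,2}日"),
    "license_plate": re.compile(r"[\u4e00-\u9fa5][A-Z][A-Z0-9]{5}"),
    "vehicle_type": re.compile(r"小型轿车|货车|客车|摩托车"),
}


def extract_key_info(record_text, final_class):
    """Turn one free-text accident record into a flat dictionary."""
    info = {"final_class": final_class}
    for field, pattern in PATTERNS.items():
        match = pattern.search(record_text)
        # Fields that cannot be found are left as None rather than guessed.
        info[field] = match.group(0) if match else None
    return info


# Hypothetical record text.
text = "2018年3月1日，张某驾驶苏A12345小型轿车超速行驶发生事故"
info = extract_key_info(text, final_class="speed")
print(info)
```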
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910418287.2A CN110134963A (en) | 2019-05-20 | 2019-05-20 | A kind of text mining is applied to the method for road traffic accident data processing |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910418287.2A CN110134963A (en) | 2019-05-20 | 2019-05-20 | A kind of text mining is applied to the method for road traffic accident data processing |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110134963A (en) | 2019-08-16 |
Family
ID=67571364
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910418287.2A Pending CN110134963A (en) | 2019-05-20 | 2019-05-20 | A kind of text mining is applied to the method for road traffic accident data processing |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110134963A (en) |
Cited By (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110807930A (en) * | 2019-11-07 | 2020-02-18 | 中国联合网络通信集团有限公司 | Dangerous vehicle early warning method and device |
CN111209472A (en) * | 2019-12-24 | 2020-05-29 | 中国铁道科学研究院集团有限公司电子计算技术研究所 | Railway accident fault association and accident fault reason analysis method and system |
CN111914687A (en) * | 2020-07-15 | 2020-11-10 | 深圳民太安智能科技有限公司 | Method for actively identifying accident based on Internet of vehicles |
CN112364627A (en) * | 2020-10-23 | 2021-02-12 | 北京建筑大学 | Safety production accident analysis method and device based on text mining, electronic equipment and storage medium |
CN112732744A (en) * | 2021-01-12 | 2021-04-30 | 重庆长安汽车股份有限公司 | Method for efficiently processing CIDAS database based on Tcl/Tk and R languages |
CN113470357A (en) * | 2021-06-30 | 2021-10-01 | 中国汽车工程研究院股份有限公司 | Road traffic accident information processing system and method |
CN113592040A (en) * | 2021-09-27 | 2021-11-02 | 山东蓝湾新材料有限公司 | Method and device for classifying dangerous chemical accidents |
CN114999161A (en) * | 2022-07-29 | 2022-09-02 | 河北博士林科技开发有限公司 | Be used for intelligent traffic jam edge management system |
CN115100861A (en) * | 2022-06-22 | 2022-09-23 | 公安部交通管理科学研究所 | Drunk driving vehicle identification method |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20170308790A1 (en) * | 2016-04-21 | 2017-10-26 | International Business Machines Corporation | Text classification by ranking with convolutional neural networks |
CN108280173A (en) * | 2018-01-22 | 2018-07-13 | 深圳市和讯华谷信息技术有限公司 | A kind of key message method for digging, medium and the equipment of non-structured text |
CN109410588A (en) * | 2018-12-20 | 2019-03-01 | 湖南晖龙集团股份有限公司 | A kind of traffic accident evolution analysis method based on traffic big data |
Non-Patent Citations (2)
Title |
---|
PZYSEERE: "利用word2vec、textCNN、jieba对事故文本多分类及致因修复(三维向量)", 《HTTP://C.360WEBCACHE.COM》 * |
韦凌翔 等: "诱发道路交通事故的关键因子分析方法研究", 《交通信息与安全》 * |
Cited By (14)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110807930A (en) * | 2019-11-07 | 2020-02-18 | 中国联合网络通信集团有限公司 | Dangerous vehicle early warning method and device |
CN111209472A (en) * | 2019-12-24 | 2020-05-29 | 中国铁道科学研究院集团有限公司电子计算技术研究所 | Railway accident fault association and accident fault reason analysis method and system |
CN111209472B (en) * | 2019-12-24 | 2023-08-18 | 中国铁道科学研究院集团有限公司电子计算技术研究所 | Railway accident fault association and accident fault cause analysis method and system |
CN111914687A (en) * | 2020-07-15 | 2020-11-10 | 深圳民太安智能科技有限公司 | Method for actively identifying accident based on Internet of vehicles |
CN111914687B (en) * | 2020-07-15 | 2023-11-17 | 深圳民太安智能科技有限公司 | Method for actively identifying accidents based on Internet of vehicles |
CN112364627A (en) * | 2020-10-23 | 2021-02-12 | 北京建筑大学 | Safety production accident analysis method and device based on text mining, electronic equipment and storage medium |
CN112364627B (en) * | 2020-10-23 | 2023-07-25 | 北京建筑大学 | Text mining-based safety production accident analysis method and device, electronic equipment and storage medium |
CN112732744B (en) * | 2021-01-12 | 2023-03-14 | 重庆长安汽车股份有限公司 | Method for efficiently processing CIDAS database based on Tcl/Tk and R languages |
CN112732744A (en) * | 2021-01-12 | 2021-04-30 | 重庆长安汽车股份有限公司 | Method for efficiently processing CIDAS database based on Tcl/Tk and R languages |
CN113470357A (en) * | 2021-06-30 | 2021-10-01 | 中国汽车工程研究院股份有限公司 | Road traffic accident information processing system and method |
CN113592040A (en) * | 2021-09-27 | 2021-11-02 | 山东蓝湾新材料有限公司 | Method and device for classifying dangerous chemical accidents |
CN115100861A (en) * | 2022-06-22 | 2022-09-23 | 公安部交通管理科学研究所 | Drunk driving vehicle identification method |
CN114999161B (en) * | 2022-07-29 | 2022-10-28 | 河北博士林科技开发有限公司 | Be used for intelligent traffic jam edge management system |
CN114999161A (en) * | 2022-07-29 | 2022-09-02 | 河北博士林科技开发有限公司 | Be used for intelligent traffic jam edge management system |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110134963A (en) | A kind of text mining is applied to the method for road traffic accident data processing | |
CN110019396B (en) | Data analysis system and method based on distributed multidimensional analysis | |
CN103544255B (en) | Text semantic relativity based network public opinion information analysis method | |
CN110781254A (en) | Automatic case knowledge graph construction method, system, equipment and medium | |
CN113656805B (en) | Event map automatic construction method and system for multi-source vulnerability information | |
Banchs | Text mining with MATLAB® | |
CN106202514A (en) | Accident based on Agent is across the search method of media information and system | |
CN103049532A (en) | Method for creating knowledge base engine on basis of sudden event emergency management and method for inquiring knowledge base engine | |
CN114003791B (en) | Depth map matching-based automatic classification method and system for medical data elements | |
CN111899089A (en) | Enterprise risk early warning method and system based on knowledge graph | |
US11841839B1 (en) | Preprocessing and imputing method for structural data | |
CN110888943A (en) | Method and system for auxiliary generation of court referee document based on micro-template | |
KR20100127036A (en) | A method for providing idea maps by using classificaion in terms of viewpoints | |
CN103065009B (en) | Intelligent design system and method of traffic sign lines | |
CN116010612A (en) | River basin flood control knowledge graph construction method and device and electronic equipment | |
CN111680506A (en) | External key mapping method and device of database table, electronic equipment and storage medium | |
CN103793373B (en) | Tracking relation recovery method based on syntax | |
CN112363996B (en) | Method, system and medium for establishing physical model of power grid knowledge graph | |
CN106815320B (en) | Investigation big data visual modeling method and system based on expanded three-dimensional histogram | |
Lesbegueries et al. | Associating spatial patterns to text-units for summarizing geographic information | |
Terblanche et al. | Ontology‐based employer demand management | |
Das et al. | Transportation research record articles: A case study of trend mining | |
CN102436472B (en) | Multi- category WEB object extract method based on relationship mechanism | |
CN113486676B (en) | Geological entity semantic relation extraction method and device for geological text | |
CN113535810B (en) | Mining method, device, equipment and medium for traffic violation objects |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | PB01 | Publication | |
| | SE01 | Entry into force of request for substantive examination | |
| | RJ01 | Rejection of invention patent application after publication | Application publication date: 20190816 |