CN116976339B - Special condition analysis method, equipment and medium for expressway - Google Patents
- Publication number
- CN116976339B (application number CN202311210811.XA)
- Authority
- CN
- China
- Prior art keywords
- data
- special condition
- special
- category
- condition data
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/289—Phrasal analysis, e.g. finite state techniques or chunking
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/22—Matching criteria, e.g. proximity measures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/23—Clustering techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
- G06Q50/10—Services
- G06Q50/26—Government or public services
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Abstract
The application discloses a special condition analysis method, equipment and medium for expressways. The method comprises the following steps: acquiring historical special condition data and obtaining corresponding text vector data; clustering the text vector data with a clustering algorithm based on local density; performing support calculation on the clustered text vector data with a parallel association rule algorithm, and generating, from the resulting frequent item sets, association rule data that meets a confidence threshold, to obtain a plurality of special condition data categories; extracting core feature words for each special condition data category to obtain a category model corresponding to that category; and acquiring new special condition data and determining a set of historical special condition data similar to it. Through rapid clustering, association analysis, core feature word extraction and category model establishment, special conditions similar to a new special condition can be found quickly in the historical special condition data, so that staff can handle it quickly.
Description
Technical Field
The application relates to the field of traffic control systems, and in particular to a special condition analysis method, equipment and medium for expressways.
Background
With the continuous development of social services and the transportation industry, the problems that arise in expressway special conditions ("special condition" is shorthand for a special situation, such as a toll-related incident, an accident, a traffic jam or a sudden change in weather) have become more diverse, and as expressway mileage keeps growing, the number of special conditions to be handled keeps increasing. At present, expressway special conditions are mainly handled manually: when a special condition occurs, the staff concerned receive user reports by telephone, intercom or a monitoring system and take corresponding measures.
However, this approach requires staff to be highly familiar with the business, and new staff often need extensive training before they can do the job. Meanwhile, as the volume of special condition reports keeps growing, problems such as low working efficiency gradually emerge among business personnel.
Disclosure of Invention
In order to solve the above problems, the present application proposes a special condition analysis method for expressways, comprising:
acquiring historical special condition data, and preprocessing the historical special condition data to obtain corresponding text vector data;
clustering the text vector data based on a clustering algorithm of local density;
performing support calculation on the clustered text vector data based on a parallel association rule algorithm to obtain a frequent item set, and generating association rule data that meets a confidence threshold based on the frequent item set, so as to classify the data according to the association rule data and obtain a plurality of special condition data categories;
extracting core feature words for each special condition data category, and obtaining a category model corresponding to the special condition data category based on the weight value corresponding to the core feature words;
and acquiring new special condition data, and analyzing the new special condition data based on the category model to determine a similar special condition data set with the new special condition data.
On the other hand, the application also provides a special condition analysis device for the expressway, which comprises:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores instructions executable by the at least one processor, the instructions enabling the at least one processor to perform the special condition analysis method for expressways described in the above example.
In another aspect, the present application also proposes a non-volatile computer storage medium storing computer-executable instructions configured to perform the special condition analysis method for expressways described in the above example.
The special condition analysis method for the expressway provided by the application can bring the following beneficial effects:
Through rapid clustering, association analysis, core feature word extraction and category model establishment, special conditions similar to a new special condition can be found quickly in the historical special condition data, so that staff can handle it quickly. Because unsupervised training is adopted, similar special conditions can still be queried accurately without manual labeling, even as the special condition data keeps expanding and similar special conditions keep increasing.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this application, illustrate embodiments of the application and together with the description serve to explain the application and do not constitute an undue limitation to the application. In the drawings:
FIG. 1 is a flow chart of a special condition analysis method for expressways according to an embodiment of the present application;
FIG. 2 is a schematic diagram of a special condition analysis device for expressways according to an embodiment of the present application.
Detailed Description
To make the purposes, technical solutions and advantages of the present application clearer, the technical solutions of the present application are described clearly and completely below with reference to specific embodiments of the present application and the corresponding drawings. It will be apparent that the described embodiments are only some, but not all, of the embodiments of the present application. All other embodiments that can be obtained by one of ordinary skill in the art from the present disclosure without undue burden fall within the scope of the present disclosure.
The following describes in detail the technical solutions provided by the embodiments of the present application with reference to the accompanying drawings.
Analysis of expressway special condition data shows that many special condition records are similar to one another, and that similar special conditions are handled in roughly the same way. If the handling of similar past special conditions is provided to business personnel for reference, their working efficiency can be greatly improved, along with the service quality of the expressway operator.
On this basis, semi-intelligent screening of similar special conditions has been proposed, which combines manual labeling and statistics with the knowledge of business personnel. For example, given a large amount of acquired historical special condition data, each historical record is manually labeled with a category (such as accident, congestion or toll), and the label is stored in a database as a parameter. When a new special condition record is obtained, the historical labels are matched through a database query to screen out special conditions similar to the new one.
However, as expressway special conditions diversify, similar special conditions multiply, new words keep appearing and the data becomes noisier. As a result, this screening method has low accuracy and high time complexity, and the similar special conditions it returns may be of little reference value to business personnel. With special condition volume still growing, handling efficiency and service quality can hardly meet the needs of the traveling public.
Based on this, as shown in FIG. 1, an embodiment of the present application provides a special condition analysis method for expressways, including:
S101: Acquire historical special condition data, and preprocess the historical special condition data to obtain corresponding text vector data.
Specifically, the preprocessing may include removing preset special symbols and punctuation marks from the historical special condition data. The remaining text is then segmented (also called word segmentation), the words or phrases obtained by segmentation are restored to their original word forms, text word vectors are trained, and the original word forms are text-normalized to obtain the corresponding text vector data. Text data may also be spliced and corrected at this stage.
Preprocessing cleans, converts and organizes the historical special condition data, turning it into normalized text data suitable for the subsequent text clustering and mining tasks.
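The preprocessing step can be illustrated with a minimal Python sketch. It is not the implementation of the application: the symbol pattern, the whitespace tokenizer and the function names are assumptions made for illustration, and normalized term-frequency vectors stand in for the trained word vectors. Chinese text would additionally need a dedicated word segmenter, which is not shown.

```python
import re
from collections import Counter

# Assumed pattern for the "preset special symbols and punctuation marks":
# anything that is neither a word character nor whitespace.
SYMBOLS = re.compile(r"[^\w\s]")

def preprocess(record):
    """Strip symbols and punctuation, lowercase, and split on whitespace."""
    cleaned = SYMBOLS.sub(" ", record)
    return cleaned.lower().split()

def build_vocabulary(token_lists):
    """Map every distinct token to a fixed vector index."""
    vocab = sorted({tok for toks in token_lists for tok in toks})
    return {tok: i for i, tok in enumerate(vocab)}

def to_vector(tokens, vocab):
    """Normalized term-frequency vector (a stand-in for trained word vectors)."""
    counts = Counter(tokens)
    total = sum(counts.values()) or 1
    vec = [0.0] * len(vocab)
    for tok, count in counts.items():
        if tok in vocab:
            vec[vocab[tok]] = count / total
    return vec
```

Each historical record would be passed through `preprocess`, the vocabulary built once over all records, and `to_vector` applied per record to obtain the data nodes used in the clustering step.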
S102: Cluster the text vector data with a clustering algorithm based on local density.
The normalized text vector data is clustered with a fast clustering algorithm based on local density, which gives an overall classification of the special condition data.
Specifically, the local density describes how densely data is aggregated around a data node. The relative distance describes the distance from a data node to other data nodes that have a greater local density. If both the local density and the relative distance of a data node are large, meaning that it has many data nodes around it and lies far from any other data node that has even more data nodes around it, the node is considered a cluster center.
Each piece of text vector data is taken as a data node, and the coordinate data corresponding to the data node, that is, the coordinates given by the direction and length of the corresponding vector, is determined. The local density of the data node is then determined from the coordinate data and a preset dc value.
Data nodes whose local density is higher than a preset density threshold are determined and taken as cluster centers. The relative distances between these data nodes and other data nodes are determined, and if a relative distance is higher than a preset distance threshold, the other data node is also determined to be a cluster center. Each cluster center together with the data nodes around it forms a category.
The preset distance threshold, also called the cut-off distance, is dynamically updated based on the total data amount (it is obtained by multiplying the average distance between all data nodes by a corresponding weight and summing the results); in general, the larger the data amount, the larger the preset distance threshold.
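The selection of cluster centers can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the method as claimed: local density is taken as the number of neighbours within the cut-off `dc`, a node is treated as a center when both its local density and its relative distance reach fixed thresholds, and the names `rho_min` and `delta_min` are hypothetical. The dynamic update of the thresholds is not shown.

```python
import math

def pairwise_distances(points):
    """Euclidean distance matrix between all data nodes."""
    n = len(points)
    return [[math.dist(points[i], points[j]) for j in range(n)] for i in range(n)]

def local_density(dist, dc):
    """Number of other nodes closer than the cut-off dc, per node."""
    n = len(dist)
    return [sum(1 for j in range(n) if j != i and dist[i][j] < dc) for i in range(n)]

def relative_distance(dist, rho):
    """Distance to the nearest node of higher density.

    A node with no denser node gets its largest distance to any node.
    """
    n = len(dist)
    delta = []
    for i in range(n):
        higher = [dist[i][j] for j in range(n) if rho[j] > rho[i]]
        delta.append(min(higher) if higher else max(dist[i]))
    return delta

def cluster_centers(points, dc, rho_min, delta_min):
    """Indices of nodes with both high local density and high relative distance."""
    dist = pairwise_distances(points)
    rho = local_density(dist, dc)
    delta = relative_distance(dist, rho)
    return [i for i in range(len(points))
            if rho[i] >= rho_min and delta[i] >= delta_min]
```

Once the centers are known, each remaining node would typically be assigned to the category of its nearest denser neighbour.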
S103: Perform support calculation on the clustered text vector data based on a parallel association rule algorithm to obtain a frequent item set, and generate association rule data that meets a confidence threshold based on the frequent item set, so as to classify the data according to the association rule data and obtain a plurality of special condition data categories.
The clustered special condition data is grouped with a parallel Apriori association rule algorithm and divided into different special condition category sets according to the strong associations among the text data.
Specifically, a dynamically set support threshold is determined, and each keyword in the keyword set corresponding to the text vector data is checked against it. If the support of a keyword is not smaller than the support threshold, the keyword is taken as a frequent keyword in the frequent item set. The frequent item set obtained so far is then used iteratively to find new frequent keywords from the support of the remaining keywords, until no new frequent keywords are generated from the keyword set.
In the iteration, the keywords whose support is greater than or equal to the support threshold are first retained to obtain the frequent 1-item set. Each subsequent round uses the frequent (n-1)-item set obtained in the previous round: the support of the keyword combinations built from it is calculated, and those that meet the support threshold are retained as new frequent keywords. The iteration stops when no new frequent item set is generated.
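The iteration described above corresponds to the classic Apriori procedure, which can be sketched in Python as follows. The sketch is serial rather than parallel, uses fixed thresholds instead of dynamically set ones, and treats each record's keyword set as one transaction; these choices and all function names are assumptions for illustration only.

```python
from itertools import combinations

def support(itemset, transactions):
    """Fraction of transactions that contain every item of the itemset."""
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)

def apriori(transactions, min_support):
    """Return {frequent itemset: support}, built level by level."""
    transactions = [frozenset(t) for t in transactions]
    items = {item for t in transactions for item in t}
    current = {frozenset([i]) for i in items
               if support(frozenset([i]), transactions) >= min_support}
    frequent = {}
    k = 1
    while current:  # stop when a round produces no new frequent item set
        for itemset in current:
            frequent[itemset] = support(itemset, transactions)
        k += 1
        # Join frequent (k-1)-item sets into candidate k-item sets.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Keep a candidate only if all its (k-1)-subsets are frequent
        # and its own support meets the threshold.
        current = {c for c in candidates
                   if all(frozenset(s) in frequent for s in combinations(c, k - 1))
                   and support(c, transactions) >= min_support}
    return frequent

def association_rules(frequent, min_confidence):
    """Rules (antecedent, consequent, confidence) meeting the confidence threshold."""
    rules = []
    for itemset, supp in frequent.items():
        for r in range(1, len(itemset)):
            for ante in combinations(itemset, r):
                ante = frozenset(ante)
                conf = supp / frequent[ante]
                if conf >= min_confidence:
                    rules.append((ante, itemset - ante, conf))
    return rules
```

Because every subset of a frequent item set is itself frequent, `frequent[ante]` is always available when a rule's confidence is computed.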
S104: Extract core feature words for each special condition data category, and obtain a category model corresponding to the special condition data category based on the weight values of the core feature words.
Keywords are extracted from the data of each special condition data category with the TF-IDF algorithm and screened according to a dynamic threshold; the LDA algorithm is then applied to train on each category of data and construct the corresponding category model.
Specifically, for each special condition data category, the term frequency (TF) value of each keyword it contains is determined, and the inverse document frequency (IDF) value of each keyword is determined in the total corpus corresponding to all special condition data categories. The weight value of each keyword is determined from its TF value and IDF value, for example as their product, as in standard TF-IDF weighting.
Keywords whose weight values are higher than a preset dynamic weight threshold are taken as core feature words. That is, a dynamic screening threshold is set according to the characteristics of the text, the keywords of each category of historical special condition data are screened, and only the core feature words are retained.
For each special condition data category, the category model corresponding to the category is obtained from the core feature words it contains and their weight values. The category model may be an LDA model, expressed as a weighted combination of core feature words. For example, a category model may take the form: 0.029*"emergency lane" + 0.023*"stop" + 0.020*"penalty" + 0.016*"congestion" + 0.014*"drive-away" + 0.014*"warning board". The category models are integrated into a total model library to obtain a total category model.
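The keyword weighting and screening can be sketched in Python as follows. The sketch assumes that each category is treated as one document, that the IDF is the unsmoothed log(N / df), and that the weight threshold is fixed; the function names are hypothetical, and the LDA training step is not shown.

```python
import math
from collections import Counter

def tfidf_weights(categories):
    """categories: {category name: list of keywords}. Returns per-category TF*IDF."""
    n = len(categories)
    doc_freq = Counter()
    for tokens in categories.values():
        doc_freq.update(set(tokens))  # number of categories containing each word
    weights = {}
    for name, tokens in categories.items():
        counts = Counter(tokens)
        total = len(tokens)
        weights[name] = {
            word: (count / total) * math.log(n / doc_freq[word])
            for word, count in counts.items()
        }
    return weights

def core_feature_words(weights, threshold):
    """Keep only the keywords whose weight exceeds the threshold."""
    return {
        name: {word: w for word, w in word_weights.items() if w > threshold}
        for name, word_weights in weights.items()
    }
```

With this IDF variant, a word that occurs in every category receives a weight of zero and is therefore never retained as a core feature word.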
S105: Acquire new special condition data, and analyze the new special condition data based on the category model to determine a set of special condition data similar to the new special condition data.
After a piece of new special condition data is acquired, candidate categories are screened by combining keyword fitting with repeatability elimination, historical special conditions are screened by a time threshold and the candidate categories, and the fitting degree between each screened historical special condition and the new one is calculated, so that the similar special condition set of the new special condition data is predicted quickly and accurately.
Specifically, the new special condition data is analyzed with each category model in turn, keyword fitting and repeatability elimination are performed on it, and its candidate special condition data categories are predicted. Keyword fitting and repeatability elimination means extracting keywords from the new special condition data, fitting the keywords against one another and removing repeated keywords, so that only suitable keywords remain.
The historical special condition data in the candidate special condition data categories is then screened based on the predicted candidate categories and the corresponding time threshold. For example, only historical special condition data within a recent time threshold is selected from the candidate categories.
The fitting degree between each screened piece of historical special condition data and the new special condition data is calculated and the results are sorted; the several highest-ranked pieces of historical special condition data are selected as the similar special condition data set of the new special condition data. The fitting degree may be calculated, for example, as (number of matching words) / (word segmentation length of the target special condition data), where the number of matching words is the number of keywords shared by the new special condition data and the screened historical special condition data, and the target special condition data is the new special condition data.
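The time screening, fitting degree calculation and ranking can be sketched in Python as follows. The record layout `(record_id, timestamp, keywords)`, the parameter names and the choice to take the word segmentation length after repeated keywords have been removed are assumptions for illustration.

```python
from datetime import datetime, timedelta

def fitting_degree(new_keywords, hist_keywords):
    """Shared keywords divided by the keyword count of the new record."""
    new_unique = set(new_keywords)  # repeated keywords are removed first
    if not new_unique:
        return 0.0
    matches = len(new_unique & set(hist_keywords))
    return matches / len(new_unique)

def similar_records(new_keywords, history, now, max_age, top_k):
    """history: list of (record_id, timestamp, keywords) from the candidate categories.

    Returns the top_k (record_id, fitting degree) pairs among recent records.
    """
    recent = [h for h in history if now - h[1] <= max_age]
    scored = [(rid, fitting_degree(new_keywords, kws)) for rid, _, kws in recent]
    scored.sort(key=lambda pair: pair[1], reverse=True)  # stable: ties keep input order
    return scored[:top_k]
```

A caller would pass only the historical records that belong to the predicted candidate categories, so that the time threshold and the fitting degree are applied to an already narrowed set.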
As shown in fig. 2, the embodiment of the present application further provides a special condition analysis device for an expressway, including:
at least one processor; and
a memory communicatively coupled to the at least one processor, the processor and the memory being connected through a bus;
the memory stores instructions executable by the at least one processor, the instructions enabling the at least one processor to perform the special condition analysis method for expressways according to any one of the above embodiments.
The embodiments also provide a non-volatile computer storage medium storing computer-executable instructions configured to perform the special condition analysis method for expressways according to any one of the above embodiments.
All embodiments in the application are described in a progressive manner, and identical and similar parts of all embodiments are mutually referred, so that each embodiment mainly describes differences from other embodiments. In particular, for the apparatus and medium embodiments, the description is relatively simple, as it is substantially similar to the method embodiments, with reference to the section of the method embodiments being relevant.
The devices and media provided in the embodiments of the present application are in one-to-one correspondence with the methods, so that the devices and media also have similar beneficial technical effects as the corresponding methods, and since the beneficial technical effects of the methods have been described in detail above, the beneficial technical effects of the devices and media are not described in detail herein.
It will be appreciated by those skilled in the art that embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
In one typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include volatile memory in a computer-readable medium, random access memory (RAM) and/or nonvolatile memory, such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of computer-readable media.
Computer-readable media include both permanent and non-permanent, removable and non-removable media, and may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, modules of a program, or other data. Examples of storage media for a computer include, but are not limited to, phase change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. Computer-readable media, as defined herein, do not include transitory computer-readable media (transmission media), such as modulated data signals and carrier waves.
It should also be noted that the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising one … …" does not exclude the presence of other like elements in a process, method, article or apparatus that comprises the element.
The foregoing is merely exemplary of the present application and is not intended to limit the present application. Various modifications and changes may be made to the present application by those skilled in the art. Any modifications, equivalent substitutions, improvements, etc. which are within the spirit and principles of the present application are intended to be included within the scope of the claims of the present application.
Claims (8)
1. A special condition analysis method for an expressway, comprising:
acquiring historical special condition data, and preprocessing the historical special condition data to obtain corresponding text vector data;
clustering the text vector data based on a clustering algorithm of local density;
based on a parallel association rule algorithm, carrying out support degree calculation on the clustered text vector data to obtain a frequent item set, and generating association rule data meeting a confidence threshold based on the frequent item set so as to carry out data classification according to the association rule data to obtain a plurality of special condition data categories;
extracting core feature words for each special condition data category, and obtaining a category model corresponding to the special condition data category based on the weight value corresponding to the core feature words;
acquiring new special condition data, and analyzing the new special condition data based on the category model to determine a similar special condition data set with the new special condition data;
extracting core feature words for each special condition data category, and obtaining a category model corresponding to the special condition data category based on the weight value corresponding to the core feature words, wherein the method specifically comprises the following steps:
determining a text occurrence frequency TF value of each keyword contained in each special case data category, and determining a reverse file occurrence frequency IDF value of each keyword in all special case data categories;
determining a weight value corresponding to each keyword according to the text occurrence frequency TF value and the reverse file occurrence frequency IDF value;
the key words with the weight values higher than the preset dynamic weight threshold value are used as core feature words;
aiming at each special case data category, obtaining a category model corresponding to the special case data category according to the core feature words contained in the special case data category and the weight values of the core feature words;
according to the core feature words contained in the model and the weight values of the core feature words, a category model corresponding to the special condition data category is obtained, and the model specifically comprises the following steps:
according to the core feature words contained in the model and the weight values of the core feature words, obtaining a category model corresponding to the special condition data category in the form of the weight values of the core feature words;
and integrating the category model into a total model library to obtain a category total model.
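The TF-IDF steps of claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation: the function name `build_category_models`, the input layout, the `log(N / df) + 1` form of the IDF, and the fixed `weight_threshold` argument (the claim calls the threshold dynamic but does not say how it is set) are all assumptions made for illustration. The returned dictionary plays the role of the total model library.

```python
import math
from collections import Counter


def build_category_models(categories, weight_threshold):
    """categories: dict mapping a category name to a list of keyword lists.

    Returns a dict mapping each category name to {core feature word: weight}.
    """
    # TF: relative frequency of each keyword inside one category.
    tf = {}
    for name, records in categories.items():
        counts = Counter(word for record in records for word in record)
        total = sum(counts.values())
        tf[name] = {word: n / total for word, n in counts.items()}

    # IDF across categories; each category is treated as one document.
    # The "+ 1" keeps words that occur in every category from scoring zero
    # (an assumed smoothing choice, not taken from the patent).
    n_categories = len(categories)
    doc_freq = Counter(word for weights in tf.values() for word in weights)
    idf = {word: math.log(n_categories / df) + 1.0 for word, df in doc_freq.items()}

    # Core feature words: keywords whose TF-IDF weight exceeds the threshold.
    models = {}
    for name, weights in tf.items():
        scored = {word: value * idf[word] for word, value in weights.items()}
        models[name] = {w: s for w, s in scored.items() if s > weight_threshold}
    return models
```

Each category model is stored as a mapping from core feature word to weight, which matches the "core feature words and their weight values" form described in the claim.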
2. The method according to claim 1, wherein preprocessing the historical special condition data specifically comprises:
removing preset special symbols and punctuation marks from the historical special condition data;
segmenting the historical special condition data in the remaining text data, and restoring the words or phrases obtained by segmentation to their original word forms;
training text word vectors to normalize the text in its original word forms, so as to obtain the corresponding text vector data.
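A rough Python sketch of the preprocessing steps in claim 2 follows. Several parts are stand-ins chosen to keep the example self-contained: whitespace splitting replaces a real word segmenter (Chinese text would need one), the `LEMMAS` table is a hypothetical lookup in place of a lemmatizer, and an L2-normalized bag-of-words vector replaces the trained word vectors the claim refers to.

```python
import math
import re
from collections import Counter

# Hypothetical lemma table; a real system would use a proper lemmatizer.
LEMMAS = {"cards": "card", "vehicles": "vehicle", "damaged": "damage"}


def preprocess(text):
    """Strip symbols and punctuation, split into words, restore base forms."""
    cleaned = re.sub(r"[^\w\s]", " ", text.lower())
    return [LEMMAS.get(token, token) for token in cleaned.split()]


def vectorize(tokens, vocabulary):
    """Bag-of-words count vector over a fixed vocabulary, L2-normalized."""
    counts = Counter(tokens)
    vector = [float(counts[word]) for word in vocabulary]
    norm = math.sqrt(sum(v * v for v in vector))
    return [v / norm for v in vector] if norm else vector
```

Calling `preprocess` on a raw record and passing the tokens to `vectorize` yields one fixed-length vector per record, which is the shape of input the clustering step expects.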
3. The method according to claim 2, wherein clustering the text vector data based on a clustering algorithm of local density specifically comprises:
for each piece of text vector data, taking the text vector data as a data node, and determining coordinate data corresponding to the data node;
determining the local density corresponding to the data node from the coordinate data and a preset dc value;
determining a plurality of data nodes whose local density is higher than a preset density threshold, and taking the plurality of data nodes as clustering centers;
and determining the relative distances between the data nodes and other data nodes, and determining the other data nodes as clustering centers if the relative distances are higher than a preset distance threshold.
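The center selection in claim 3 resembles density peaks clustering, and the sketch below follows that reading. This is an interpretation, not something the claim states: `dc` is treated as a cutoff distance, the "relative distance" of a node is taken to be its distance to the nearest node of higher density, and a node is kept as a center only when both thresholds are exceeded. All function names are hypothetical.

```python
import math


def local_density(points, dc):
    """For each node, count the other nodes closer than the cutoff dc."""
    return [
        sum(1 for j, q in enumerate(points) if j != i and math.dist(p, q) < dc)
        for i, p in enumerate(points)
    ]


def relative_distance(points, density):
    """Distance from each node to the nearest node of strictly higher density.

    A node with no denser neighbour gets its largest distance to any node.
    """
    result = []
    for i, p in enumerate(points):
        higher = [
            math.dist(p, q)
            for j, q in enumerate(points)
            if density[j] > density[i]
        ]
        if higher:
            result.append(min(higher))
        else:
            result.append(max(math.dist(p, q) for q in points))
    return result


def cluster_centers(points, dc, density_threshold, distance_threshold):
    """Indices of nodes that are both dense and far from any denser node."""
    density = local_density(points, dc)
    delta = relative_distance(points, density)
    return [
        i
        for i in range(len(points))
        if density[i] > density_threshold and delta[i] > distance_threshold
    ]
```

Requiring both conditions keeps ordinary members of a dense cluster from being promoted to centers, since their nearest denser node is close by.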
4. The method according to claim 3, wherein the preset distance threshold is dynamically updated, and determining the preset distance threshold specifically comprises:
obtaining total data according to the average distance among all the data nodes and a corresponding weight, and dynamically updating the preset distance threshold according to the total data.
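One possible reading of the dynamic threshold in claim 4 is sketched below. The claim does not say how the average distance and the weight are combined, so multiplying them is an assumption, as are the function name and the choice of Euclidean distance.

```python
import math
from itertools import combinations


def dynamic_distance_threshold(points, weight):
    """Weighted mean pairwise distance over the current set of data nodes."""
    pairs = list(combinations(points, 2))
    if not pairs:
        return 0.0
    average = sum(math.dist(p, q) for p, q in pairs) / len(pairs)
    return weight * average
```

Recomputing this value whenever nodes are added lets the threshold track the scale of the data instead of staying fixed.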
5. The method according to claim 1, wherein performing support calculation on the clustered text vector data based on a parallel association rule algorithm to obtain a frequent item set specifically comprises:
determining a dynamically set support threshold, and performing threshold judgment on the keywords in a keyword set corresponding to the text vector data based on the support threshold;
if the support corresponding to a keyword is not less than the support threshold, taking the keyword as a frequent keyword in the frequent item set;
and iteratively using the obtained frequent item set to obtain new frequent keywords from the support of the remaining keywords, until no new frequent keywords are generated in the keyword set.
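The iteration in claim 5 can be sketched as a level-wise (Apriori-style) search. The sketch is sequential, whereas the claim refers to a parallel association rule algorithm, and it takes the support threshold as a plain argument instead of setting it dynamically. The helper `rule_confidence` illustrates the confidence check mentioned in claim 1. All names are hypothetical.

```python
from itertools import combinations


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    return sum(1 for t in transactions if itemset <= t) / len(transactions)


def frequent_itemsets(transactions, min_support):
    """Grow itemsets level by level until no new frequent ones appear."""
    transactions = [frozenset(t) for t in transactions]
    items = {item for t in transactions for item in t}
    current = {
        frozenset([item])
        for item in items
        if support(frozenset([item]), transactions) >= min_support
    }
    frequent = {}
    size = 1
    while current:
        for itemset in current:
            frequent[itemset] = support(itemset, transactions)
        size += 1
        # Candidates: unions of frequent sets that are exactly one item larger.
        candidates = {
            a | b for a, b in combinations(current, 2) if len(a | b) == size
        }
        current = {
            c for c in candidates if support(c, transactions) >= min_support
        }
    return frequent


def rule_confidence(antecedent, consequent, transactions):
    """Confidence of the rule antecedent -> consequent."""
    both = support(frozenset(antecedent) | frozenset(consequent), transactions)
    return both / support(frozenset(antecedent), transactions)
```

Rules whose confidence meets the threshold would then be kept as association rule data for the classification step.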
6. The method according to claim 1, wherein analyzing the new special condition data based on the category model to determine a set of special condition data similar to the new special condition data specifically comprises:
analyzing the new special condition data through each category model respectively, so as to perform keyword fitting and duplicate elimination on the new special condition data and predict a candidate special condition data category;
screening the historical special condition data in the candidate special condition data category based on the predicted candidate special condition data category and a corresponding time threshold;
and, for the historical special condition data obtained through screening, calculating the degree of fit between the historical special condition data and the new special condition data, ranking the results, and selecting the several highest-ranked pieces of historical special condition data as the similar special condition data set of the new special condition data.
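The retrieval flow of claim 6 might look like the following Python sketch. The record layout (dictionaries with `category`, `time` and `keywords` fields), the use of Jaccard overlap as the degree of fit, and the interpretation of the time threshold as a maximum record age are assumptions; the claim does not fix any of them.

```python
from datetime import datetime, timedelta


def predict_category(keywords, models):
    """Pick the category whose weighted core feature words best match."""
    unique = set(keywords)  # repeated keywords are counted once
    scores = {
        name: sum(weight for word, weight in model.items() if word in unique)
        for name, model in models.items()
    }
    return max(scores, key=scores.get)


def similar_records(new_keywords, new_time, history, models, max_age, top_k):
    """Return the top_k recent records of the predicted category, best fit first."""
    category = predict_category(new_keywords, models)
    new_set = set(new_keywords)
    candidates = [
        record
        for record in history
        if record["category"] == category and new_time - record["time"] <= max_age
    ]

    def fit(record):
        # Jaccard overlap between keyword sets (assumed fit measure).
        other = set(record["keywords"])
        union = new_set | other
        return len(new_set & other) / len(union) if union else 0.0

    return sorted(candidates, key=fit, reverse=True)[:top_k]
```

The `models` argument has the same shape as a library of category models, that is, a mapping from category name to weighted core feature words.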
7. A special condition analysis device for an expressway, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the special condition analysis method for an expressway according to any one of claims 1 to 6.
8. A non-transitory computer storage medium storing computer-executable instructions, the computer-executable instructions being configured to perform the special condition analysis method for an expressway according to any one of claims 1 to 6.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311210811.XA CN116976339B (en) | 2023-09-20 | 2023-09-20 | Special condition analysis method, equipment and medium for expressway |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311210811.XA CN116976339B (en) | 2023-09-20 | 2023-09-20 | Special condition analysis method, equipment and medium for expressway |
Publications (2)
Publication Number | Publication Date |
---|---|
CN116976339A (en) | 2023-10-31 |
CN116976339B (en) | 2023-12-22 |
Family
ID=88475203
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202311210811.XA Active CN116976339B (en) | 2023-09-20 | 2023-09-20 | Special condition analysis method, equipment and medium for expressway |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116976339B (en) |
Citations (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109190653A (en) * | 2018-07-09 | 2019-01-11 | Sichuan University | Malicious code family homology analysis technology based on semi-supervised Density Clustering |
US10366123B1 (en) * | 2013-08-06 | 2019-07-30 | Intuit Inc. | Template-free extraction of data from documents |
CN112328799A (en) * | 2021-01-06 | 2021-02-05 | Tencent Technology (Shenzhen) Co., Ltd. | Question classification method and device |
CN113239691A (en) * | 2021-05-11 | 2021-08-10 | China University of Petroleum (East China) | Similar appeal work order screening method and device based on topic model |
CN113807456A (en) * | 2021-09-26 | 2021-12-17 | Dalian Jiaotong University | Feature screening and association rule multi-label classification algorithm based on mutual information |
CN115658889A (en) * | 2022-10-13 | 2023-01-31 | Alibaba (China) Co., Ltd. | Dialogue processing method, device, equipment and storage medium |
CN115858787A (en) * | 2022-12-12 | 2023-03-28 | Research Institute of Highway, Ministry of Transport | Hot spot extraction and mining method based on problem appeal information in road transportation |
CN115858773A (en) * | 2022-04-06 | 2023-03-28 | Beijing Zhongguancun Kejin Technology Co., Ltd. | Keyword mining method, device and medium suitable for long document |
CN116681207A (en) * | 2023-06-01 | 2023-09-01 | Shandong Hi-Speed Group Co., Ltd. | Lane special condition business auditing method, equipment and medium |
-
2023
- 2023-09-20 CN CN202311210811.XA patent/CN116976339B/en active Active
Patent Citations (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10366123B1 (en) * | 2013-08-06 | 2019-07-30 | Intuit Inc. | Template-free extraction of data from documents |
CN109190653A (en) * | 2018-07-09 | 2019-01-11 | Sichuan University | Malicious code family homology analysis technology based on semi-supervised Density Clustering |
CN112328799A (en) * | 2021-01-06 | 2021-02-05 | Tencent Technology (Shenzhen) Co., Ltd. | Question classification method and device |
CN113239691A (en) * | 2021-05-11 | 2021-08-10 | China University of Petroleum (East China) | Similar appeal work order screening method and device based on topic model |
CN113807456A (en) * | 2021-09-26 | 2021-12-17 | Dalian Jiaotong University | Feature screening and association rule multi-label classification algorithm based on mutual information |
CN115858773A (en) * | 2022-04-06 | 2023-03-28 | Beijing Zhongguancun Kejin Technology Co., Ltd. | Keyword mining method, device and medium suitable for long document |
CN115658889A (en) * | 2022-10-13 | 2023-01-31 | Alibaba (China) Co., Ltd. | Dialogue processing method, device, equipment and storage medium |
CN115858787A (en) * | 2022-12-12 | 2023-03-28 | Research Institute of Highway, Ministry of Transport | Hot spot extraction and mining method based on problem appeal information in road transportation |
CN116681207A (en) * | 2023-06-01 | 2023-09-01 | Shandong Hi-Speed Group Co., Ltd. | Lane special condition business auditing method, equipment and medium |
Non-Patent Citations (3)
Title |
---|
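Liping Jing, Michael K. Ng, Joshua Z. Huang. Knowledge-based vector space model for text clustering. Springer, 2009, full text. * |
Research on text clustering of micro-learning units using an improved density peak algorithm based on the LSA model; Wu Guosheng; Zhang Yueqin; Computer Engineering & Science (04); full text * |
Text feature representation method based on Word2vector; Zhou Shunxian; Jiang Li; Lin Shuangqiao; Gong Deliang; Wang Luda; Journal of Chongqing University of Posts and Telecommunications (Natural Science Edition) (02); full text * |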
Also Published As
Publication number | Publication date |
---|---|
CN116976339A (en) | 2023-10-31 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110276066B (en) | Entity association relation analysis method and related device | |
CN109992668B (en) | Self-attention-based enterprise public opinion analysis method and device | |
Paramesh et al. | Automated IT service desk systems using machine learning techniques | |
JP2021504789A (en) | ESG-based corporate evaluation execution device and its operation method | |
US11620453B2 (en) | System and method for artificial intelligence driven document analysis, including searching, indexing, comparing or associating datasets based on learned representations | |
CN109710766B (en) | Complaint tendency analysis early warning method and device for work order data | |
US20200210776A1 (en) | Question answering method, terminal, and non-transitory computer readable storage medium | |
CN112463774B (en) | Text data duplication eliminating method, equipment and storage medium | |
CN110704616B (en) | Equipment alarm work order identification method and device | |
CN117473431B (en) | Airport data classification and classification method and system based on knowledge graph | |
CN110310012B (en) | Data analysis method, device, equipment and computer readable storage medium | |
CN114579739B (en) | Topic detection and tracking method for text data stream | |
CN112463960B (en) | Entity relationship determination method and device, computing equipment and storage medium | |
CN111191825A (en) | User default prediction method and device and electronic equipment | |
CN112818121A (en) | Text classification method and device, computer equipment and storage medium | |
CN116071077B (en) | Risk assessment and identification method and device for illegal account | |
CN115563268A (en) | Text abstract generation method and device, electronic equipment and storage medium | |
CN115146062A (en) | Intelligent event analysis method and system fusing expert recommendation and text clustering | |
CN112685374A (en) | Log classification method and device and electronic equipment | |
CN116976339B (en) | Special condition analysis method, equipment and medium for expressway | |
WO2023083176A1 (en) | Sample processing method and device and computer readable storage medium | |
CN116226747A (en) | Training method of data classification model, data classification method and electronic equipment | |
CN116089886A (en) | Information processing method, device, equipment and storage medium | |
CN112308453B (en) | Risk identification model training method, user risk identification method and related devices | |
CN114254622A (en) | Intention identification method and device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||