CN108846134A - A kind of O&M scheme recommender system and method based on web crawlers - Google Patents
- Publication number
- CN108846134A (application number CN201810731658.8A)
- Authority
- CN
- China
- Prior art keywords
- scheme
- crawler
- module
- recommender system
- data
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/20—Administration of product repair or maintenance
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
- G06Q50/10—Services
- G06Q50/26—Government or public services
Landscapes
- Business, Economics & Management (AREA)
- Human Resources & Organizations (AREA)
- Engineering & Computer Science (AREA)
- Tourism & Hospitality (AREA)
- Strategic Management (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Marketing (AREA)
- General Business, Economics & Management (AREA)
- Economics (AREA)
- Educational Administration (AREA)
- Primary Health Care (AREA)
- Health & Medical Sciences (AREA)
- Development Economics (AREA)
- General Health & Medical Sciences (AREA)
- Entrepreneurship & Innovation (AREA)
- Operations Research (AREA)
- Quality & Reliability (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
An O&M (operation and maintenance) solution recommendation system based on web crawlers. The system comprises a crawler module, a solution library management module, and a recommender system module. The crawler module extracts the keywords of an O&M problem and builds a crawler to obtain related information and articles from target sites, which are written into the solution library as O&M solutions. The solution library management module performs basic management of the solution library and data crawling, that is, it obtains data from target sites through the crawler module, organizes it, and writes it into the solution library. The recommender system module calculates a text-similarity score between the O&M problem description and each solution, ranks the solutions by that score, and recommends the O&M solutions most relevant to the problem description. The system uses a three-layer design (data layer, service layer, and application layer) that integrates the crawler module, solution library management, and recommender system into a whole, improving the accuracy of O&M solution recommendation and assisting maintenance work.
Description
Technical field
The present invention relates to the field of intelligent transportation O&M systems, and in particular to an O&M solution recommendation system based on web crawlers and to a recommendation method that uses this system.
Background art
Since the 1950s, with the rapid advance of computer technology, intelligent transportation systems (ITS) have developed at high speed. The fields they cover keep widening, and ITS has gradually become an important means of improving urban traffic management. At the same time, a large amount of intelligent transportation equipment has been put into use. Whether this equipment operates normally directly determines the effectiveness of a high-tech traffic system, which places strict requirements on the maintenance of intelligent transportation equipment.
Intelligent transportation equipment comes in dozens of varieties, including traffic signal controllers, checkpoint (bayonet) monitoring systems, electronic police (traffic enforcement cameras), and video cameras, and as technology advances new equipment is continually put into use. Faced with the different faults that occur in different kinds of equipment, O&M personnel are often at a loss and cannot quickly obtain an O&M solution, which makes it harder to reduce the impact of failures.
Summary of the invention
The technical problem to be solved by the present invention is, in view of the deficiencies of the prior art, to provide an O&M solution recommendation system based on web crawlers that uses a crawler to expand the sources of the O&M solution library, recommends accurate O&M solutions according to the O&M problem, can be applied in O&M systems for intelligent transportation equipment, and assists maintenance work.
Another technical problem to be solved by the present invention is to provide a method of recommending O&M solutions using the above O&M solution recommendation system based on web crawlers.
The technical problem is solved by the following technical solution. The present invention is an O&M solution recommendation system based on web crawlers, comprising a crawler module, a solution library management module, and a recommender system module:
The crawler module extracts the keywords of an O&M problem and builds a crawler to obtain related information and articles from target sites, which are written into the solution library as O&M solutions.
The solution library management module performs basic management of the solution library and data crawling: it obtains data from target sites through the crawler module, organizes it, and writes it into the solution library.
The recommender system module calculates text-similarity scores between the O&M problem description and the solutions, ranks the solutions accordingly, and recommends the O&M solutions most relevant to the O&M problem description.
The technical problem can be further solved by the following technical solution. In the O&M solution recommendation system based on web crawlers described above, the system architecture uses a three-layer design of data layer, service layer, and application layer. The data layer uses MySQL as the database and MyBatis as the data persistence framework. The service layer uses SpringMVC to implement business functions such as crawling, data management, and similarity calculation; it connects to the data layer and handles requests from the application layer. The application layer initiates HTTP requests and parses and displays the results returned by the service layer.
The technical problem can be further solved by the following technical solution. In the O&M solution recommendation system based on web crawlers described above, the crawler module uses the TF-IDF algorithm to extract keywords from the O&M problem description and then obtains data through the crawler according to those keywords.
The technical problem can be further solved by the following technical solution. In the O&M solution recommendation system based on web crawlers described above, the crawler module builds its crawler with Crawler4j to obtain O&M solution data.
The technical problem can be further solved by the following technical solution. In the O&M solution recommendation system based on web crawlers described above, the recommender system module calculates text similarity using a dynamic programming method, obtains the similarity score between each O&M solution and the O&M problem by weighted summation, ranks the solutions by that score, and makes relevance-based recommendations.
The technical problem can be further solved by the following technical solution: an O&M solution recommendation method based on web crawlers, which uses the O&M solution recommendation system described above to recommend O&M solutions. The description of the O&M problem is compared for similarity with the titles and contents in the solution library, and weights are assigned to the similarity results to obtain a similarity score; the several highest-scoring solutions are recommended. If all similarity scores are below a certain threshold, the keywords of the O&M problem description are extracted automatically, related O&M solutions are obtained from the external network using the web crawler, the solution library is updated, and O&M solutions are recommended at the same time.
Compared with the prior art, the present invention obtains information and articles from target sites through the crawler module and saves them to the solution library; calculates text similarity through the recommender system module, takes the weighted sum of the similarities between the O&M problem description and the title and content as the similarity score, ranks by it, and recommends O&M solutions; and manages the data of the O&M solution library through the solution library management module, which can also use the crawler to obtain data from external knowledge bases. O&M solutions can therefore be obtained quickly to handle the different faults of different kinds of equipment and reduce the impact of failures. By using a crawler, the system expands the sources of the O&M solution library, which makes it easier to recommend accurate O&M solutions for a given O&M problem, improves the accuracy of O&M solution recommendation, and assists maintenance work.
Brief description of the drawings
Fig. 1 is a schematic structural diagram of the present invention.
Detailed description of the embodiments
To make the objects, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the accompanying drawing. Obviously, the described embodiments are only some of the embodiments of the present invention, not all of them. All other embodiments obtained by those of ordinary skill in the art on the basis of these embodiments without creative effort shall fall within the protection scope of the present invention.
Referring to Fig. 1, an O&M solution recommendation system based on web crawlers comprises a crawler module, a solution library management module, and a recommender system module. The system architecture uses a three-layer design: data layer, service layer, and application layer. The data layer uses MySQL as the database and MyBatis as the data persistence framework. The service layer uses SpringMVC to implement business functions such as crawling, data management, and similarity calculation; it connects to the data layer and handles requests from the application layer. The application layer initiates HTTP requests and parses and displays the results returned by the service layer.
The system calculates text similarity using a dynamic programming algorithm, takes the weighted sum of the similarities between the O&M problem description and a solution's title and content as the similarity score, ranks by that score, and recommends O&M solutions. The recommendation method in brief: first remove special characters, such as punctuation marks, from the texts to be compared; then use the dynamic programming algorithm to compute the similarity between the problem and the title and between the problem and the content; add the two similarities with weights to obtain the similarity score; sort by similarity score; and take the top-ranked entries as the recommended solutions.
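The steps above can be sketched in Java, the development language named in this description. This is a minimal illustration under stated assumptions, not the patented implementation: the text only says "dynamic programming algorithm", so the longest common subsequence (LCS) is assumed here as the similarity measure, and the class names `Solution` and `SimilarityRanker`, the normalization rule, and the weights are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Locale;

/** Hypothetical entry of the solution library (title plus content). */
final class Solution {
    final String title;
    final String content;

    Solution(String title, String content) {
        this.title = title;
        this.content = content;
    }
}

/** Illustrative scoring and ranking; LCS is an assumed choice of DP algorithm. */
final class SimilarityRanker {

    /** Removes punctuation, whitespace, and other special characters. */
    static String normalize(String text) {
        return text.replaceAll("[^\\p{L}\\p{N}]", "").toLowerCase(Locale.ROOT);
    }

    /** Length of the longest common subsequence, computed with a two-row DP table. */
    static int lcsLength(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int i = 1; i <= a.length(); i++) {
            for (int j = 1; j <= b.length(); j++) {
                curr[j] = a.charAt(i - 1) == b.charAt(j - 1)
                        ? prev[j - 1] + 1
                        : Math.max(prev[j], curr[j - 1]);
            }
            int[] tmp = prev; // swap rows; prev now holds the row just computed
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    /** Similarity in [0, 1]: 2 * LCS / (|a| + |b|) on the normalized texts. */
    static double similarity(String a, String b) {
        String x = normalize(a);
        String y = normalize(b);
        int total = x.length() + y.length();
        return total == 0 ? 0.0 : 2.0 * lcsLength(x, y) / total;
    }

    /** Weighted sum of problem-title and problem-content similarity. */
    static double score(String problem, Solution s, double titleWeight, double contentWeight) {
        return titleWeight * similarity(problem, s.title)
                + contentWeight * similarity(problem, s.content);
    }

    /** Sorts the library by descending score and returns the top N entries. */
    static List<Solution> recommend(String problem, List<Solution> library,
                                    double titleWeight, double contentWeight, int topN) {
        List<Solution> ranked = new ArrayList<>(library);
        ranked.sort(Comparator.comparingDouble(
                (Solution s) -> score(problem, s, titleWeight, contentWeight)).reversed());
        return new ArrayList<>(ranked.subList(0, Math.min(topN, ranked.size())));
    }
}
```

A call such as `SimilarityRanker.recommend("camera offline", library, 0.6, 0.4, 3)` would return up to three entries. The weights 0.6 and 0.4 are placeholders; the description does not state the values used.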
The system uses the TF-IDF algorithm to extract keywords from the O&M problem description and calls the crawler with those keywords to obtain O&M solutions.
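As a rough illustration of keyword extraction by TF-IDF, the following Java sketch ranks the words of a problem description against a small corpus. It is an assumption-laden sketch: the description implements this step with Lucene, whereas this example uses only the standard library; the class name `TfIdfKeywords`, the whitespace-and-punctuation tokenizer (real Chinese text would need a word segmenter), and the add-one smoothing of the IDF are all hypothetical choices made so that words absent from the corpus do not cause a division by zero.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;

/** Illustrative TF-IDF keyword extraction; not the Lucene-based implementation. */
final class TfIdfKeywords {

    /** Naive tokenizer: lower-case, then split on anything that is not a letter or digit. */
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    /** Returns the topK words of the problem description, highest TF-IDF first. */
    static List<String> extract(String problem, List<String> corpus, int topK) {
        // Document frequency: in how many corpus documents each word occurs.
        Map<String, Integer> df = new HashMap<>();
        for (String doc : corpus) {
            for (String term : new HashSet<>(tokenize(doc))) {
                df.merge(term, 1, Integer::sum);
            }
        }

        // Raw counts of each word in the problem description.
        List<String> tokens = tokenize(problem);
        Map<String, Integer> counts = new HashMap<>();
        for (String term : tokens) {
            counts.merge(term, 1, Integer::sum);
        }

        Map<String, Double> weight = new HashMap<>();
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            double tf = (double) e.getValue() / tokens.size();
            // Assumed smoothing: the problem itself counts as one extra document.
            double idf = Math.log((double) (corpus.size() + 1)
                    / (df.getOrDefault(e.getKey(), 0) + 1));
            weight.put(e.getKey(), tf * idf);
        }

        // Sort by descending weight, breaking ties alphabetically for stable output.
        List<String> terms = new ArrayList<>(weight.keySet());
        terms.sort((a, b) -> {
            int byWeight = Double.compare(weight.get(b), weight.get(a));
            return byWeight != 0 ? byWeight : a.compareTo(b);
        });
        return new ArrayList<>(terms.subList(0, Math.min(topK, terms.size())));
    }
}
```

The returned keywords would then be handed to the crawler as search terms. How the keywords are turned into crawl targets is not specified in the description.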
The system implements its crawler function with the open-source crawler project Crawler4j, which serves as the means of acquiring O&M solution data.
The system uses MySQL as the database and MyBatis as the persistence framework, and develops RESTful interfaces on the SpringMVC framework, so that applications on any platform that supports the HTTP protocol (Android, iOS, Web, etc.) can call them.
The recommendation method of the system comprises the following steps: compare the O&M problem description with the titles and contents in the solution library for similarity, assign weights to the two similarity results to obtain a similarity score, and recommend the several highest-scoring solutions. If all similarity scores are below a certain threshold, the system automatically extracts the keywords of the O&M problem description, uses the web crawler to obtain related O&M solutions, updates the database, and recommends O&M solutions at the same time. The system uses a three-layer design (data layer, service layer, and application layer) that combines the crawler module, solution library management, and recommender system into a whole, improving the accuracy of O&M solution recommendation and assisting maintenance work.
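The control flow of this method, including the fallback to the crawler when every score is below the threshold, might look like the following Java sketch. Everything here is illustrative: `LibraryEntry` and `RecommendationFlow` are hypothetical names, the in-memory list stands in for the MySQL-backed solution library, and the scorer, keyword extractor, and crawler are passed in as plain functions so that the example does not depend on any particular implementation of them.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.function.Function;
import java.util.function.ToDoubleBiFunction;

/** Hypothetical entry of the solution library. */
final class LibraryEntry {
    final String title;
    final String content;

    LibraryEntry(String title, String content) {
        this.title = title;
        this.content = content;
    }
}

/** Illustrative recommend-or-crawl flow; collaborators are injected as functions. */
final class RecommendationFlow {
    private final List<LibraryEntry> library;                         // stand-in for the database
    private final ToDoubleBiFunction<String, LibraryEntry> scorer;    // weighted similarity score
    private final Function<String, List<String>> keywordExtractor;    // e.g. TF-IDF keywords
    private final Function<List<String>, List<LibraryEntry>> crawler; // fetches new entries
    private final double threshold;
    private final int topN;

    RecommendationFlow(List<LibraryEntry> library,
                       ToDoubleBiFunction<String, LibraryEntry> scorer,
                       Function<String, List<String>> keywordExtractor,
                       Function<List<String>, List<LibraryEntry>> crawler,
                       double threshold, int topN) {
        this.library = library;
        this.scorer = scorer;
        this.keywordExtractor = keywordExtractor;
        this.crawler = crawler;
        this.threshold = threshold;
        this.topN = topN;
    }

    /** Recommends from the library; crawls first if every score is below the threshold. */
    List<LibraryEntry> recommend(String problem) {
        boolean allBelow = library.stream()
                .allMatch(e -> scorer.applyAsDouble(problem, e) < threshold);
        if (allBelow) {
            // Extract keywords, crawl for related solutions, and update the library.
            library.addAll(crawler.apply(keywordExtractor.apply(problem)));
        }
        return rank(problem);
    }

    /** Sorts by descending score and keeps the top N entries. */
    private List<LibraryEntry> rank(String problem) {
        List<LibraryEntry> ranked = new ArrayList<>(library);
        ranked.sort(Comparator.comparingDouble(
                (LibraryEntry e) -> scorer.applyAsDouble(problem, e)).reversed());
        return new ArrayList<>(ranked.subList(0, Math.min(topN, ranked.size())));
    }
}
```

Note that an empty library also triggers a crawl in this sketch, because no entry reaches the threshold. Whether the original system behaves this way is not stated.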
The system's functions mainly comprise the following three modules.
Crawler module: extracts the keywords of the O&M problem and builds a crawler with Crawler4j to obtain related information and articles from target sites, which are written into the solution library as O&M solutions.
Solution library management module: performs basic management of the O&M solution library (including adding, deleting, modifying, and querying) and data crawling, that is, it obtains data from target sites through the crawler module, organizes it, and writes it into the solution library.
Recommender system module: calculates text-similarity scores between the O&M problem description and the solutions, ranks the solutions accordingly, and recommends the O&M solutions most relevant to the O&M problem description.
Specific operations:
1. System development environment: Windows 10 as the operating system, IDEA 2017.3 as the IDE, and Java as the development language (JDK 1.8). The development environment is built by integrating the SpringMVC 4.2.2.RELEASE framework, the JUnit 4.10 unit-testing tool, the MyBatis data persistence framework, the Druid database connection pool, the Crawler4j crawler, the Lucene information-retrieval library, Maven, and other frameworks and tools.
2. System architecture design: the system is designed and written according to the MVC design convention and is divided into three layers: dao, service, and controller. The dao layer is implemented with the MyBatis framework. The service layer mainly calls the interface methods of the dao layer to implement business logic. The controller layer uses the @RequestMapping annotation to declare each interface's address, request method, returned data, and similar information, uses the @ResponseBody annotation to implement RESTful interfaces, and calls the service-layer methods made available through the @Service annotation to implement the interface functions.
3. Crawler module: the crawler module is developed on the open-source crawler project Crawler4j and mainly consists of a WebCrawler and a CrawlerController. WebCrawler is responsible for parsing websites and must implement the shouldVisit (site configuration) and visit (parsing) methods. CrawlerController is responsible for calling WebCrawler and controlling the crawler's run-time parameters.
4. Recommender system: the recommender system calculates text similarity using a dynamic programming algorithm, takes the weighted sum of the similarities between the O&M problem description and the title and content as the similarity score, ranks by it, and recommends O&M solutions.
5. O&M problem keyword extraction: the TF-IDF algorithm is implemented with Lucene to extract keywords from the O&M problem description.
6. System deployment: the program is packaged in WAR format with Maven and runs on Windows Server 2012 with a JDK and Tomcat 8.0.
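The division of labor between shouldVisit and visit described in item 3 can be illustrated without the Crawler4j dependency. The sketch below is not Crawler4j code and does not reproduce its API: `SolutionPageHandler` is a hypothetical class that takes plain strings, the host name and skipped file extensions are assumed examples, and the regular-expression HTML handling is a deliberate simplification of real parsing.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Dependency-free illustration of the two crawler hooks; not the Crawler4j API. */
final class SolutionPageHandler {
    // Assumed list of resource types that never contain O&M articles.
    private static final Pattern SKIP =
            Pattern.compile(".*\\.(css|js|gif|jpe?g|png|pdf|zip)$");
    private static final Pattern TITLE =
            Pattern.compile("<title[^>]*>(.*?)</title>", Pattern.CASE_INSENSITIVE | Pattern.DOTALL);
    private static final Pattern TAGS = Pattern.compile("<[^>]+>");

    private final String allowedHost;

    /** Collected pages as {url, title, text}; a real system would write to the library. */
    final List<String[]> collected = new ArrayList<>();

    SolutionPageHandler(String allowedHost) {
        this.allowedHost = allowedHost;
    }

    /** Site configuration: only follow links on the target host, skipping static files. */
    boolean shouldVisit(String url) {
        URI uri;
        try {
            uri = URI.create(url.toLowerCase(Locale.ROOT));
        } catch (IllegalArgumentException e) {
            return false; // malformed URL
        }
        String path = uri.getPath() == null ? "" : uri.getPath();
        return allowedHost.equals(uri.getHost()) && !SKIP.matcher(path).matches();
    }

    /** Parsing: pull out the page title and the visible text of a fetched page. */
    void visit(String url, String html) {
        Matcher m = TITLE.matcher(html);
        String title = m.find() ? m.group(1).trim() : "";
        String text = TAGS.matcher(html).replaceAll(" ").replaceAll("\\s+", " ").trim();
        collected.add(new String[] {url, title, text});
    }
}
```

In the system described here, the equivalent logic would live in the WebCrawler class, with CrawlerController supplying seeds and run-time parameters.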
The crawler in this application refers to a web crawler, a program or script that automatically fetches web information according to certain rules.
TF-IDF algorithm:
TF: term frequency, i.e. the frequency with which a word occurs in a document after word segmentation.
IDF: inverse document frequency. It assigns a weight to each word on top of the term frequency. If three words have the same term frequency, that does not mean they are equally important to the article, so weights must also be assigned to them. If a word is rare in the whole corpus but occurs many times in this article, it probably reflects the characteristics of the article, and its IDF is therefore high. IDF equals the logarithm of the ratio of the total number of documents in the corpus to the number of documents containing the word.
The more important a word is to an article, the larger its TF-IDF value.
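The definitions above can be written compactly as follows. The notation is introduced here for illustration and does not appear in the original text: $n_{t,d}$ is the number of occurrences of word $t$ in document $d$, $N$ is the total number of documents in the corpus, and $\mathrm{df}(t)$ is the number of documents containing $t$. Normalizing TF by document length is one common convention and is an assumption; the original only speaks of the frequency of a word in a document.

$$
\mathrm{tf}(t,d) = \frac{n_{t,d}}{\sum_{k} n_{k,d}}, \qquad
\mathrm{idf}(t) = \log\frac{N}{\mathrm{df}(t)}, \qquad
\mathrm{tfidf}(t,d) = \mathrm{tf}(t,d)\cdot\mathrm{idf}(t)
$$

As a worked example with the natural logarithm: a word that accounts for 3 of the 100 words in a document has $\mathrm{tf} = 0.03$; if it appears in 10 of $N = 1000$ documents, $\mathrm{idf} = \ln 100 \approx 4.605$, so $\mathrm{tfidf} \approx 0.138$. A word that appears in every document has $\mathrm{idf} = \ln 1 = 0$ and therefore contributes nothing, however frequent it is.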
Claims (6)
1. An O&M solution recommendation system based on web crawlers, characterized in that the system comprises a crawler module, a solution library management module, and a recommender system module;
the crawler module is used to extract the keywords of an O&M problem and to obtain, by building a crawler, related information and articles from target sites, which are written into the solution library as O&M solutions;
the solution library management module is used to perform basic management of the solution library and data crawling, obtaining data from target sites through the crawler module, organizing it, and writing it into the solution library;
the recommender system module is used to calculate text-similarity scores between the O&M problem description and the solutions, rank the solutions accordingly, and recommend the O&M solutions most relevant to the O&M problem description.
2. The O&M solution recommendation system based on web crawlers according to claim 1, characterized in that the system architecture uses a three-layer design of data layer, service layer, and application layer; the data layer uses MySQL as the database and MyBatis as the data persistence framework; the service layer uses SpringMVC to implement business functions such as crawling, data management, and similarity calculation, connects to the data layer, and handles requests from the application layer; the application layer initiates HTTP requests and parses and displays the results returned by the service layer.
3. The O&M solution recommendation system based on web crawlers according to claim 1, characterized in that the crawler module uses the TF-IDF algorithm to extract keywords from the O&M problem description and obtains data through the crawler according to those keywords.
4. The O&M solution recommendation system based on web crawlers according to claim 1 or 3, characterized in that the crawler module builds its crawler with Crawler4j to obtain O&M solution data.
5. The O&M solution recommendation system based on web crawlers according to claim 1, characterized in that the recommender system module calculates text similarity using a dynamic programming method, obtains the similarity score between each O&M solution and the O&M problem by weighted summation, ranks the solutions by that score, and makes relevance-based recommendations.
6. An O&M solution recommendation method based on web crawlers, characterized in that the method uses the O&M solution recommendation system based on web crawlers according to any one of claims 1-5 to recommend O&M solutions, with the following steps: the description of the O&M problem is compared for similarity with the titles and contents in the solution library, and weights are assigned to the similarity results to obtain a similarity score; the several highest-scoring solutions are recommended; if all similarity scores are below a certain threshold, the keywords of the O&M problem description are extracted automatically, related O&M solutions are obtained from the external network using the web crawler, the solution library is updated, and O&M solutions are recommended at the same time.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810731658.8A CN108846134A (en) | 2018-07-05 | 2018-07-05 | A kind of O&M scheme recommender system and method based on web crawlers |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810731658.8A CN108846134A (en) | 2018-07-05 | 2018-07-05 | A kind of O&M scheme recommender system and method based on web crawlers |
Publications (1)
Publication Number | Publication Date |
---|---|
CN108846134A (en) | 2018-11-20 |
Family
ID=64200344
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810731658.8A Pending CN108846134A (en) | 2018-07-05 | 2018-07-05 | A kind of O&M scheme recommender system and method based on web crawlers |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108846134A (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109559032A (en) * | 2018-11-27 | 2019-04-02 | 上海交通大学医学院 | A kind of Assessment Management System for the clinical research initiated for researcher |
CN109993550A (en) * | 2019-04-17 | 2019-07-09 | 连云港杰瑞电子有限公司 | After-sale service system and method based on wechat small routine and smart allocation algorithm |
CN111507550A (en) * | 2019-01-30 | 2020-08-07 | 广州泰迪智能科技有限公司 | Automatic recommendation method for optimal solution of work order problem |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105912662A (en) * | 2016-04-11 | 2016-08-31 | 天津大学 | Coreseek-based vertical search engine research and optimization method |
CN107294788A (en) * | 2017-07-17 | 2017-10-24 | 郑州云海信息技术有限公司 | A kind of method and device of troubleshooting |
CN107341068A (en) * | 2017-06-28 | 2017-11-10 | 北京优特捷信息技术有限公司 | The method and apparatus that O&M troubleshooting is carried out by natural language processing |
CN107943812A (en) * | 2017-05-24 | 2018-04-20 | 成都明途科技有限公司 | Recommend method for the news of user's centralized integration resource |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN107797991B (en) | Dependency syntax tree-based knowledge graph expansion method and system | |
CN102819604B (en) | Method for retrieving confidential information of file and judging and marking security classification based on content correlation | |
Endarnoto et al. | Traffic condition information extraction & visualization from social media twitter for android mobile application | |
CN112749284B (en) | Knowledge graph construction method, device, equipment and storage medium | |
US11768892B2 (en) | Method and apparatus for extracting name of POI, device and computer storage medium | |
CN104462547B (en) | A kind of method and system of configurable collecting webpage data | |
CN103294781A (en) | Method and equipment used for processing page data | |
CN106815307A (en) | Public Culture knowledge mapping platform and its use method | |
CN107346325A (en) | Information query method and device | |
CN109194714B (en) | File pushing method and device, terminal device and storage medium | |
CN107391675A (en) | Method and apparatus for generating structure information | |
JP2019074843A (en) | Information providing apparatus, information providing method, and program | |
CN114116065B (en) | Method and device for acquiring topological graph data object and electronic equipment | |
CN108846134A (en) | A kind of O&M scheme recommender system and method based on web crawlers | |
CN111897914A (en) | Entity information extraction and knowledge graph construction method for field of comprehensive pipe gallery | |
CN104978314A (en) | Media content recommendation method and device | |
CN110134845A (en) | Project public sentiment monitoring method, device, computer equipment and storage medium | |
CN104331438B (en) | To novel web page contents selectivity abstracting method and device | |
CN102760150A (en) | Webpage extraction method based on attribute reproduction and labeled path | |
CN108170678A (en) | A kind of text entities abstracting method and system | |
CN103345532A (en) | Method and device for extracting webpage information | |
CN103778238A (en) | Method for automatically building classification tree from semi-structured data of Wikipedia | |
CN106874502A (en) | A kind of method of video search, device and terminal | |
CN103942211A (en) | Text page recognition method and device | |
CN106202038A (en) | Synonym method for digging based on iteration and device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
RJ01 | Rejection of invention patent application after publication | Application publication date: 20181120 |