CN110134845A - Project public sentiment monitoring method, device, computer equipment and storage medium - Google Patents
Project public sentiment monitoring method, device, computer equipment and storage medium
- Publication number
- CN110134845A CN201910270796.5A
- Authority
- CN
- China
- Prior art keywords
- public sentiment
- destination item
- corpus
- data source
- project
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/951—Indexing; Web crawling techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/205—Parsing
- G06F40/211—Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/289—Phrasal analysis, e.g. finite state techniques or chunking
- G06F40/295—Named entity recognition
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Databases & Information Systems (AREA)
- Health & Medical Sciences (AREA)
- Artificial Intelligence (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Computational Linguistics (AREA)
- General Health & Medical Sciences (AREA)
- Data Mining & Analysis (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
Embodiments of the present application provide a project public sentiment monitoring method and device, a computer device, and a computer-readable storage medium, belonging to the technical field of data display. To monitor the public sentiment of a project, the method obtains identification information of a target project, obtains a data source website list for the target project through a web search based on the identification information, and crawls the corpus of the target project from the data source websites in that list according to the identification information. The corpus is then parsed by natural language processing to identify the subject names and public sentiment features it contains. The subject names and public sentiment features are imported into a graph database to construct a public sentiment relationship graph of the target project, and the graph is displayed visually. Public sentiment monitoring is thereby carried out for a target organization at the level of an individual project, which improves the efficiency of fine-grained public sentiment monitoring for that organization.
Description
Technical field
This application relates to the technical field of data display, and in particular to a project public sentiment monitoring method and device, a computer device, and a computer-readable storage medium.
Background
Enterprise public sentiment information is something that nearly all enterprise-related projects now need to deal with. Traditional approaches collect data from designated websites, such as financial websites, in a targeted manner. The information collected this way, however, reflects public opinion about the enterprise as a whole. If public sentiment about one particular aspect of the enterprise is needed, such as a product, an investment, or an advertising campaign, it has to be screened out of a large volume of overall public sentiment data, and the screened content is not accurate enough. As a result, obtaining public sentiment about that aspect is inefficient.
Summary of the invention
Embodiments of the present application provide a project public sentiment monitoring method and device, a computer device, and a computer-readable storage medium, which can solve the problem in traditional technology that public sentiment monitoring of a target project is inefficient.
In a first aspect, an embodiment of the present application provides a project public sentiment monitoring method, the method comprising: obtaining identification information of a target project in a preset manner; obtaining a data source website list of the target project through a web search according to the identification information; crawling the corpus of the target project from the data source websites included in the data source website list according to the identification information; parsing the corpus by natural language processing to identify the subject names and public sentiment features contained in the corpus; importing the subject names and the public sentiment features into a graph database to construct a public sentiment relationship graph of the target project; and displaying the public sentiment relationship graph.
In a second aspect, an embodiment of the present application further provides a project public sentiment monitoring device, comprising: a first acquisition unit, configured to obtain identification information of a target project in a preset manner; a second acquisition unit, configured to obtain a data source website list of the target project through a web search according to the identification information; a crawling unit, configured to crawl the corpus of the target project from the data source websites included in the data source website list according to the identification information; a recognition unit, configured to parse the corpus by natural language processing to identify the subject names and public sentiment features contained in the corpus; a construction unit, configured to import the subject names and the public sentiment features into a graph database to construct a public sentiment relationship graph of the target project; and a display unit, configured to display the public sentiment relationship graph.
In a third aspect, an embodiment of the present application further provides a computer device comprising a memory and a processor, where a computer program is stored in the memory and the processor implements the project public sentiment monitoring method when executing the computer program.
In a fourth aspect, an embodiment of the present application further provides a computer-readable storage medium storing a computer program that, when executed by a processor, causes the processor to perform the project public sentiment monitoring method.
Embodiments of the present application provide a project public sentiment monitoring method and device, a computer device, and a computer-readable storage medium. To monitor the public sentiment of a project, an embodiment obtains identification information of a target project, obtains a data source website list of the target project through a web search according to the identification information, and crawls the corpus of the target project from the data source websites in that list according to the identification information, so that a more comprehensive corpus about the target project is obtained. The corpus is then parsed by natural language processing to identify the subject names and public sentiment features it contains, the subject names and public sentiment features are imported into a graph database to construct a public sentiment relationship graph of the target project, and the graph is displayed visually. Public sentiment monitoring is thereby carried out for a target organization at the level of an individual project, which improves the efficiency of fine-grained public sentiment monitoring for that organization.
Brief description of the drawings
To describe the technical solutions of the embodiments of the present application more clearly, the drawings needed for the description of the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present application; a person of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is a schematic diagram of an application scenario of the project public sentiment monitoring method provided by an embodiment of the present application;
Fig. 2 is a schematic flowchart of the project public sentiment monitoring method provided by an embodiment of the present application;
Fig. 3 is another schematic flowchart of the project public sentiment monitoring method provided by an embodiment of the present application;
Fig. 4 is a schematic sub-flowchart of the project public sentiment monitoring method provided by an embodiment of the present application;
Fig. 5 is another schematic sub-flowchart of the project public sentiment monitoring method provided by an embodiment of the present application;
Fig. 6 is a third schematic sub-flowchart of the project public sentiment monitoring method provided by an embodiment of the present application;
Fig. 7 is a schematic block diagram of the project public sentiment monitoring device provided by an embodiment of the present application;
Fig. 8 is another schematic block diagram of the project public sentiment monitoring device provided by an embodiment of the present application; and
Fig. 9 is a schematic block diagram of the computer device provided by an embodiment of the present application.
Detailed description
The technical solutions in the embodiments of the present application are described clearly and completely below with reference to the drawings in the embodiments. The described embodiments are some, but not all, of the embodiments of the present application. All other embodiments obtained by a person of ordinary skill in the art on the basis of the embodiments in this application, without creative effort, fall within the scope of protection of this application.
It should be understood that, when used in this specification and the appended claims, the terms "includes" and "comprises" indicate the presence of the described features, wholes, steps, operations, elements, and/or components, but do not exclude the presence or addition of one or more other features, wholes, steps, operations, elements, components, and/or sets thereof.
It should also be understood that the terms used in this specification serve only to describe specific embodiments and are not intended to limit the application. As used in this specification and the appended claims, the singular forms "a", "an", and "the" are intended to include the plural forms unless the context clearly indicates otherwise.
It should be further understood that the term "and/or" used in this specification and the appended claims refers to any combination, and all possible combinations, of one or more of the associated listed items, and includes these combinations.
Referring to Fig. 1, Fig. 1 is a schematic diagram of an application scenario of the project public sentiment monitoring method provided by an embodiment of the present application. The method can be applied in the terminal shown in Fig. 1, with the steps of the method carried out by software installed on the terminal. The terminal can be an electronic device such as a laptop, a tablet computer, or a desktop computer. The method is carried out as follows: the terminal obtains identification information of a target project in a preset manner; obtains a data source website list of the target project through a web search according to the identification information; crawls the corpus of the target project from the data source websites included in the list according to the identification information; parses the corpus by natural language processing to identify the subject names and public sentiment features contained in the corpus; imports the subject names and public sentiment features into a graph database to construct a public sentiment relationship graph of the target project; and displays the public sentiment relationship graph.
It should be noted that Fig. 1 shows only a desktop computer as the terminal. In practice, the type of terminal is not limited to that shown in Fig. 1; the terminal can also be an electronic device such as a mobile phone, a laptop, or a tablet computer. The application scenario described above merely illustrates the technical solution of this application and is not intended to limit it.
Fig. 2 is a schematic flowchart of the project public sentiment monitoring method provided by an embodiment of the present application. The method is applied in the terminal of Fig. 1 to perform all or part of the functions of the project public sentiment monitoring method.
As shown in Fig. 2, the method includes the following steps S210 to S260:
S210: Obtain identification information of the target project in a preset manner.
Here, the preset manner is either manual entry through an input device, or processing the corpus of the target project by natural language processing to obtain words and then screening the obtained words.
The target project is the project for which an enterprise or other organization has decided to monitor public sentiment. For an enterprise, for example, the target project is the content of one particular aspect of the enterprise, such as a product, a marketing campaign, an investment, or an event.
The identification information of the target project identifies the main content of the target project and describes its key content. For a product such as a mobile phone, for example, it includes the product's name and its various attributes.
Specifically, the identification information of the target project can be received through an input device, so that public sentiment monitoring can be performed on the project. For example, the brand name, model, processor, and battery life of a mobile phone may be obtained. Alternatively, the identification information can be obtained by natural language processing: an initial corpus is crawled from the corpus sources of the target project and subjected to natural language processing such as word segmentation and screening, the identification information of the target project that is currently attracting attention is screened out, and a larger, more comprehensive corpus is then crawled for the target project according to that identification information, so that the target project can be monitored accurately on the basis of more data. For example, the terminal crawls an enterprise's data for a preset period, screens out the enterprise's current hot-spot target project from the corpus by natural language processing, and obtains the identification information of that project; in other words, it identifies the hot event for which the enterprise is currently receiving attention. It then crawls the data related to the target project according to the identification information and analyses that data further to obtain the public sentiment of the target project for the enterprise's reference.
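The screening of an initial corpus for identification information can be sketched as a simple frequency count. This is only an illustration under stated assumptions: the patent does not name a screening criterion, so ranking by term frequency, the stop-word list, the function name `extract_identifiers`, and the sample sentences are all hypothetical, and the regular-expression tokenizer only suits space-delimited text (Chinese text would need a proper word segmenter).

```python
import re
from collections import Counter

# Hypothetical stop-word list; a real system would use a fuller one.
STOP_WORDS = {"the", "a", "an", "of", "and", "is", "to", "in", "for", "its", "with", "has"}


def extract_identifiers(initial_corpus, top_n=3):
    """Return the top_n most frequent non-stop-word tokens across the documents."""
    counts = Counter()
    for doc in initial_corpus:
        # Crude tokenization: runs of letters and digits, lower-cased.
        tokens = re.findall(r"[A-Za-z0-9]+", doc.lower())
        counts.update(t for t in tokens if t not in STOP_WORDS and len(t) > 1)
    return [word for word, _ in counts.most_common(top_n)]


# Invented sample documents standing in for an initial crawl.
docs = [
    "Acme launches the X1 phone",
    "The X1 phone battery impresses reviewers",
    "Acme X1 camera review",
]
print(extract_identifiers(docs, top_n=3))
```

The returned words would then serve as the identification information used for the web search and the fuller crawl.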
S220: Obtain a data source website list of the target project through a web search according to the identification information.
Specifically, after obtaining the identification information of the target project, the terminal searches with that information to obtain the data source websites that cover the target project. These websites are the sources of the target project's corpus; when there are several of them, they form the data source website list. For example, to understand public sentiment about one of its products, that is, how the outside world evaluates the product, an enterprise needs to monitor public sentiment for that product. Keywords for the product, such as its brand name and model, can be obtained, and the data sources that cover the product are obtained according to those keywords. A data source can also be a manually entered web address. Based on the entered addresses and the keywords, the corpus from which the product's public sentiment is formed is collected. The corpus includes news, product introductions, product reviews, and comments, which can take the form of articles or sentences and appear on websites of every type. For example, to monitor public sentiment for a mobile phone product and thereby improve its later marketing and design, an enterprise obtains keywords for the product, such as the phone's name and model, obtains the data sources that cover the product according to those keywords, and collects from those data sources the corpus about the product's public sentiment.
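One way to turn search results and manually entered addresses into a data source website list is sketched below. The search itself is not shown: `result_urls` is assumed to come from whatever search service is used, and the function name `build_source_list` and the example URLs are hypothetical.

```python
from urllib.parse import urlparse


def build_source_list(result_urls, manual_urls=()):
    """Collapse URLs to unique site roots, manual entries first, order preserved."""
    seen, sites = set(), []
    for url in list(manual_urls) + list(result_urls):
        parts = urlparse(url)
        # Keep only well-formed web addresses.
        if parts.scheme not in ("http", "https") or not parts.netloc:
            continue
        root = f"{parts.scheme}://{parts.netloc.lower()}"
        if root not in seen:
            seen.add(root)
            sites.append(root)
    return sites


sources = build_source_list(
    [
        "https://news.example.com/a",
        "https://news.example.com/b",
        "ftp://files.example.com/x",
        "http://Shop.example.org/p?id=1",
    ],
    manual_urls=["https://forum.example.net/t/1"],
)
print(sources)
```

Deduplicating by site root keeps the list short; a crawler that needs page-level entry points could keep the full URLs instead.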
S230: Crawl the corpus of the target project from the data source websites included in the data source website list according to the identification information.
Here, crawling means fetching by a crawler. A crawler is a web crawler, also known as a web spider, web robot, or web chaser: a program or script that automatically grabs web information according to certain rules.
Specifically, to monitor public sentiment for the target project, a crawler system is built to crawl corpus related to the target project from the Internet. The public sentiment relationship graph of the target project is constructed by parsing the corpus, and the public sentiment of the target project is obtained from that graph. Because a web crawler is a program that extracts web pages automatically, the crawler program can be restricted to crawling only data related to the target project when it crawls the corpus contained in the data source websites. After the identification information of the target project has been obtained in the preset manner and the data source website list has been obtained from that information, the crawler system crawls the websites in the list to obtain a rich corpus for the target project.
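The crawl step can be sketched with the Python standard library. This is a minimal illustration, not the patent's crawler system: the `fetch` callable is a hypothetical hook (in practice it might wrap `urllib.request.urlopen`), only the listed pages are visited with no link following, and a page is kept when its visible text mentions any identifier.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collect visible text, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self._skip = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip and data.strip():
            self.chunks.append(data.strip())


def page_text(html):
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


def crawl(sites, identifiers, fetch):
    """Return [{'url', 'text'}] for pages whose text mentions any identifier.

    fetch is a hypothetical callable mapping a URL to an HTML string.
    """
    keys = [k.lower() for k in identifiers]
    corpus = []
    for url in sites:
        try:
            text = page_text(fetch(url))
        except Exception:
            continue  # skip pages that cannot be fetched or parsed
        if any(k in text.lower() for k in keys):
            corpus.append({"url": url, "text": text})
    return corpus
```

Passing `fetch` in as a parameter keeps the sketch testable offline and leaves rate limiting, robots.txt handling, and retries to the caller.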
Further, the data in the corpus sources can also be screened, so that public sentiment about one particular aspect of the target project is obtained from the screened data. For example, data about a mobile phone product's camera, battery life, processor, or system quality can be screened to form the corresponding public sentiment.
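Screening the corpus by aspect can be sketched as keyword matching. The aspect names, the trigger terms in `ASPECT_TERMS`, and the function name `screen_by_aspect` are hypothetical; the patent does not specify how the screening is done.

```python
# Hypothetical mapping from aspect to trigger terms.
ASPECT_TERMS = {
    "camera": ["camera", "photo"],
    "battery": ["battery", "charge"],
}


def screen_by_aspect(sentences, aspect_terms=ASPECT_TERMS):
    """Group sentences under every aspect whose trigger terms they mention."""
    buckets = {aspect: [] for aspect in aspect_terms}
    for sentence in sentences:
        low = sentence.lower()
        for aspect, terms in aspect_terms.items():
            if any(term in low for term in terms):
                buckets[aspect].append(sentence)
    return buckets


buckets = screen_by_aspect([
    "The camera takes clear photos",
    "Battery lasts two days",
    "Nice box",
])
print(buckets)
```

Substring matching is deliberately crude; it will also match longer words that contain a trigger term.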
S240: Parse the corpus by natural language processing to identify the subject names and public sentiment features contained in the corpus.
Here, the target project is the body of content for which an enterprise or other organization has decided to monitor public sentiment. For an enterprise, the target project can be, for example, a product, a marketing campaign, an investment, or an event; that is, the content of one particular aspect of the enterprise. A subject name is the name of an object contained in the corpus, where the object is either the target project itself or one of its constituent parts. For example, if the target project is mobile phone A, the subject names include the name of mobile phone A and the names of the parts or components that make up mobile phone A. Suppose the corpus contains the name of mobile phone A, the name of its display screen B, and the name of its camera C. During recognition, the computer device cannot distinguish the relationships between the subject names, such as the fact that display screen B and camera C belong to mobile phone A, but it can recognize that the name of mobile phone A, the name of display screen B, and the name of camera C are subject names.
Public sentiment features are the keywords of the public sentiment about the target project. They are descriptions that evaluate the target project, covering the attributes of the subjects denoted by the subject names and the relationships between subjects. For example, if the target project is a mobile phone, the subject names in the target project are the phone's name and the names of its parts or components, such as the display screen and the camera. The public sentiment features are evaluations and descriptions of the subjects denoted by those names, such as the phone having a high-end configuration, the display screen being big, or the camera taking good photos. It should be noted that a subject name is an identifier that distinguishes a subject, and one subject can be named in several ways. For a mobile phone, for example, not only the brand but also the specific model or code name can serve as the phone's name.
Specifically, parsing the corpus by natural language processing means segmenting the corpus at sentence boundaries to obtain a sentence data set, constructing a named entity model from the corpus, identifying the subject names contained in the sentence data set with the named entity model, and performing part-of-speech analysis and target relation extraction on the corpus to obtain the public sentiment features of the target project. For example, the collected corpus is parsed by natural language processing to identify mobile phone names and the related feature descriptions, which provides an important data source for the public sentiment of the target project. Named entity recognition (NER), also called proper name recognition, identifies entities with specific meaning in text, mainly person names, place names, organization names, and proper nouns. In general, the task of named entity recognition is to identify named entities in the text to be processed in three major categories (entity, time, and number) and seven subcategories (person name, organization name, place name, time, date, currency, and percentage). Chinese named entity models include the CRF model and the word-based BiLSTM-CRF model.
Public sentiment information about the target project is extracted from the detailed, comprehensive data sources by natural language processing and is subsequently imported into the graph database to populate the node and node-attribute data. For example, the named entity model identifies the sentences in the corpus that concern the target project; after the corpus is segmented into words, part-of-speech analysis and feature word analysis are performed on the words, covering nouns, verbs, adjectives, and the relationships between them, to extract the public sentiment information of the target project from the corpus.
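The parsing flow described above (sentence segmentation, subject name recognition, feature extraction) can be sketched with deliberately simple stand-ins. This is not the CRF or BiLSTM-CRF model the text mentions: the `SUBJECTS` dictionary substitutes for a trained named entity model, `FEATURE_WORDS` substitutes for part-of-speech analysis, and all names and sample text are hypothetical.

```python
import re

# Stand-in for a trained named entity model: a fixed dictionary of subject names.
SUBJECTS = {"phone", "display", "camera", "battery"}
# Stand-in for part-of-speech analysis: a fixed list of descriptive words.
FEATURE_WORDS = {"big", "clear", "long", "fast", "poor"}


def split_sentences(text):
    """Split on sentence-ending punctuation (Western and Chinese)."""
    return [s.strip() for s in re.split(r"[.!?。！？]+", text) if s.strip()]


def parse_corpus(text):
    """Return (subject, feature) pairs that occur in the same sentence."""
    pairs = []
    for sentence in split_sentences(text):
        tokens = re.findall(r"[a-z0-9]+", sentence.lower())
        subjects = [t for t in tokens if t in SUBJECTS]
        features = [t for t in tokens if t in FEATURE_WORDS]
        # Crude relation extraction: pair every subject with every feature
        # found in the same sentence.
        for subject in subjects:
            for feature in features:
                pairs.append((subject, feature))
    return pairs


pairs = parse_corpus("The display is big. The camera is clear! Shipping took a week.")
print(pairs)
```

Pairing by sentence co-occurrence over-generates when a sentence mentions several subjects; a dependency parser or a trained relation extractor would be the more careful choice.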
Further, model training and automatic learning techniques from artificial intelligence can be used to improve the accuracy with which subject names and public sentiment features of the target project are identified. Specifically, the large collected corpus is segmented into words by natural language processing and the resulting words are screened, with model training and automatic learning applied at this stage to improve recognition accuracy. For example, an artificial intelligence model and automatic learning techniques are used to screen out the nouns and verbs among the resulting words, the nouns and verbs are sorted in descending order, and the words in the top preset number of positions are selected. An entity model is then built with the nouns as the subjects of the target project and the verbs as descriptions of the relationships between subjects.
S250: Import the subject names and the public sentiment features into a graph database to construct the public sentiment relationship graph of the target project.
A graph database is a type of NoSQL database that applies graph theory to store the relationships between entities. Common graph databases include Neo4j, FlockDB, and AllegroGraph. A graph database has two main components: node sets and the relationships that connect nodes. A node set is a collection of nodes in the graph. Each node carries a label indicating the entity type it belongs to, that is, the node set it belongs to, and records a series of attributes describing the node's characteristics. In addition, nodes can be connected to one another through relationships.
Specifically, the subject names and public sentiment features of the target project, identified by parsing the corpus with natural language processing, are imported into the graph database to populate its nodes and the relationships between them. The nodes correspond to subject names and public sentiment features, and the relationships describe how the nodes are connected. When the graph database is designed, nodes are grouped into node sets and associated with one another through relationships, so that node sets, nodes and relationships are kept distinct. When data is imported, the graph database identifies the node data and the relationship data in the import and stores each in its corresponding place. In this example, once the subject names and public sentiment features have been imported, the public sentiment relation map of the target project can be constructed automatically. For example, if the target project is a mobile phone, the subject names include the name of the phone and the names of its parts or components, such as the display screen and the camera. A public sentiment feature is an evaluation or description of the subject that a subject name refers to. For instance, "the phone's display screen is large" describes a feature of the phone: three nodes correspond to "mobile phone", "display screen" and "large", a "contains" relationship links "mobile phone" to "display screen", and a "describes" relationship links "display screen" to "large", connecting the three nodes into a public sentiment relation map. In the embodiments of the present application, storing the dynamic public sentiment data of the target project as a public sentiment relation map makes that data easier to visualize and extract.
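The three-node example above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `SentimentGraph` class, the labels "Subject" and "Feature", and the relation names `CONTAINS` and `DESCRIBED_AS` are hypothetical, and an in-memory structure stands in for a real graph database such as Neo4j.

```python
from dataclasses import dataclass, field


@dataclass
class SentimentGraph:
    """In-memory stand-in for a graph database: labeled nodes plus typed edges."""
    nodes: dict = field(default_factory=dict)  # node name -> label
    edges: list = field(default_factory=list)  # (source, relation, target)

    def add_node(self, name, label):
        self.nodes[name] = label

    def add_edge(self, source, relation, target):
        # Both endpoints must already exist as nodes.
        if source not in self.nodes or target not in self.nodes:
            raise KeyError("both endpoints must be added as nodes first")
        self.edges.append((source, relation, target))

    def neighbors(self, name):
        """Return (relation, target) pairs for edges leaving `name`."""
        return [(rel, dst) for src, rel, dst in self.edges if src == name]


graph = SentimentGraph()
graph.add_node("mobile phone", "Subject")
graph.add_node("display screen", "Subject")
graph.add_node("large", "Feature")
graph.add_edge("mobile phone", "CONTAINS", "display screen")
graph.add_edge("display screen", "DESCRIBED_AS", "large")
```

Traversing from "mobile phone" through `neighbors` reaches "display screen" and then "large", which is the chain the relation map is meant to expose.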
Further, the public sentiment relation map of the target project includes the target project name, the names of the sub-projects of the target project, and the public sentiment features corresponding to the sub-project names.
Here, a sub-project of the target project is a component of the target project. For example, if the target project is a mobile phone, components such as its display screen, camera, battery and central processing unit are sub-projects of the phone, and each sub-project name is the name of the corresponding component's model.
Specifically, the elements of the public sentiment relation map of the target project are the subject names and public sentiment features it contains. For example, the map holds subject names, which are typically nouns, and feature descriptions, which are typically adjectives. A public sentiment feature is a keyword of public sentiment, such as "takes clear photos" or "long battery life". Storing the dynamic public sentiment data of the target project as a relation map makes the data easier to visualize and extract. Constructing the graph data of the target project also builds a news corpus related to the target project. Before the graph data is visualized, the public sentiment data of the target project must be sorted by time so that the most recent, top-ranked news items are selected. For example, after retrieving the components used by mobile phone product A, such as its central processing unit, the public sentiment related to product A can be obtained by traversing the objects associated with node A. In addition, specific fields related to the target project can be analyzed in depth. For example, if the target project is a product, the product's supply relationships can also be analyzed: public sentiment data related to the target product is obtained from supplier attributes, classified and deduplicated, and presented to the user for public sentiment monitoring. For instance, assessments of the central processing unit in a mobile phone can affect public sentiment about the product. The overheating problem of the Snapdragon 820 processor had a considerable impact on phones that used it, while favorable reviews of the Snapdragon 835's performance carried over into favorable reviews of phones that used it, yielding public sentiment features such as "fast" and "power-saving".
S260: display the public sentiment relation map.
Specifically, the constructed public sentiment relation map of the target project is displayed on a terminal and provided to the user, so that the user can monitor the public sentiment of the target project from the map. Monitoring personnel can draw public sentiment conclusions from the map and respond accordingly. For example, they can obtain the positive and negative information in the public sentiment of the target project, together with event evaluation information and channel evaluation information, and take corresponding public relations measures. Further, the positive and negative public sentiment conclusions obtained can be ranked according to different mechanisms, so that positive public sentiment is fully exploited and countermeasures are taken against negative public sentiment to eliminate its adverse effects, for example a screen problem or battery problem affecting a certain product.
When the embodiments of the present application monitor project public sentiment, the identification information of the target project is obtained first. A data source website list for the target project is then obtained by web search according to the identification information, and the corpus of the target project is crawled from the data source websites in that list according to the identification information, so that a more comprehensive corpus about the target project is collected. The corpus is then parsed by natural language processing to identify the subject names and public sentiment features it contains, the subject names and public sentiment features are imported into a graph database to construct the public sentiment relation map of the target project, and the map is displayed visually. Public sentiment monitoring is thus performed on the target unit at a finer granularity: public sentiment specific to individual projects is added, so that the public sentiment of a particular project within the target unit can be monitored and the strengths and weaknesses of that project judged. This improves the efficiency of fine-grained public sentiment monitoring of the target unit.
Referring to Fig. 3, Fig. 3 is another schematic flowchart of the project public sentiment monitoring method provided by the embodiments of the present application. In this embodiment, after the step of obtaining the data source website list of the target project by web search according to the identification information, the method further includes:
S221: update the data source website list by crawling.
Specifically, a crawler strategy that automatically adds data sources is constructed, and deep crawling is used to obtain more comprehensive data sources for the target project from the Internet. Under this strategy, once the crawler has received the initial data source websites, it automatically expands to further data source websites based on those already obtained, increasing the sources of corpus and yielding a more comprehensive corpus for the target project. In this embodiment, the crawler uses the type and structural features of the data source websites already obtained to discover, by crawling, new data source websites related to the known data source URLs, for example URLs that share a suffix with a known data source URL or that belong to the same type, such as finance websites. The crawler can thus move from one finance website to other finance websites, which, being of the same type, may carry corpus that interprets the same target project from different angles. Because related websites tend to interpret and report on the target project from different angles, especially when a hot issue arises, the set of data source websites is continually improved and enriched, which increases the data sources and secures the volume of data. The corpus of the target project is obtained from these data source websites, and rich data sources yield a comprehensive corpus. Further, the strategy can be implemented as a distributed crawler system, forming a real-time distributed crawler that improves crawling efficiency. Specifically, the server obtains the initial data source website list of the target project, classifies it according to a preset condition to obtain data source website lists of different types, packages each list into a corresponding Docker container, deploys the Docker containers on different servers, and starts the containers so that they obtain new data source websites by crawling. The new data source websites are added to the corresponding initial data source website list to update the data source websites of the target project. For example, the crawler system can take an input list, distinguish website types according to the website identifiers in that list, and distribute the list across servers by type, so that data crawling and data storage are distributed and crawling efficiency is improved.
Referring to Fig. 4, Fig. 4 is a schematic sub-flowchart of the project public sentiment monitoring method provided by the embodiments of the present application. As shown in Fig. 4, in this embodiment, the step of updating the data source website list by crawling includes:
S2210: obtain the initial data source website list of the target project;
S2211: classify the initial data source website list according to a preset condition to obtain data source website lists of different types;
S2212: package the data source website lists of different types into corresponding Docker containers;
S2213: start the Docker containers so that the Docker containers obtain new data source websites by crawling;
S2214: add the new data source websites, by type, to the corresponding classified data source website lists to update the data source websites of the target project.
Here, the preset condition includes conditions such as website address or website source. Classification by website address means classifying according to the website's Uniform Resource Locator (URL). Because different websites use different anti-crawler strategies and structure their page data differently, different crawling strategies are needed for different websites. For example, Sina news is relatively easy to crawl: the pages can be parsed directly with BeautifulSoup. The titles and content of NetEase news are loaded asynchronously by JavaScript, so the downloaded page source contains neither; the required content can be found among the JS requests in the browser's Network panel, and regular expressions can then be used to extract the required titles and links. Toutiao news differs from both: its titles and links are packaged in a JSON file, but the URL parameters of that JSON file are changed by a randomizing JS algorithm, so the parameters must be simulated or the URL of the JSON file cannot be found. Website sources include financial websites, news websites, forums and the like.
Specifically, the configured initial data source website list of the target project is obtained, and the crawler system automatically classifies it according to the preset condition to obtain data source website lists of different types, for example dividing data source websites into types by website identifier. The lists of different types are then packaged into corresponding Docker containers, the Docker containers are deployed on different servers, and the containers are started so that they obtain additional data source websites by crawling. The new data source websites are added to the corresponding initial data source website list to update, and so continually improve, the data source websites of the target project. This includes the following sub-steps:
First, obtain the initial website list. The list can be configured manually, that is, the initial data source websites are provided by hand, or it can be the list of websites found by searching on the identification information.
Second, package the crawler code into a Docker container. The code includes a part that extracts website URLs and a part that matches each URL to its crawling program, so that URLs are mapped to crawling programs automatically and each URL is crawled by its corresponding program. This requires building an index between URLs and crawling programs and writing crawlers for all URL types in advance, so that different types of URL correspond to different crawling programs.
Third, start container Docker1, in which the crawler code classifies and splits the full input list and saves data sources of the same class together as a to-be-crawled list awaiting crawling. The URL classification code first classifies the input website URL list by URL type. The list-splitting code is then started to divide the different data source lists into several lists, each corresponding to a Docker container on a different machine.
Fourth, start container Docker2, which takes its data source list and matches each URL to the corresponding crawling program. For example, website X corresponds to the code that crawls and parses website X. When website X is passed in, the program accesses the external network, crawls the site, grabs the corresponding data, and writes the data to the database.
Further, the crawling program discovers new URLs from the URLs it already has, that is, it starts from the seed URLs, discovers new URLs, and stores them in the to-be-crawled URL list to extend it. It can also check whether an error occurs while crawling. If an error occurs, crawling of that website is terminated.
URLs can be classified with preset URL regular expressions. Each class of URL list has a corresponding regular expression, and whether a URL belongs to the class is decided by whether the match result is empty: if the result is non-empty, the URL belongs to the class; if the result is empty, it does not.
Fifth, stop when the to-be-crawled website lists of all Docker2 containers are empty. To keep improving the data source website list, the above steps can be repeated on the data source website list already obtained, at regular or irregular intervals, to update the list.
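The sub-steps above (classify URLs with a regular expression per class, crawl until the to-be-crawled list is empty, and drop a site when crawling fails) can be sketched as follows. This is a minimal single-process sketch, not the distributed Docker deployment the embodiment describes. The URL patterns, the function names `classify_url` and `expand_sources`, and the `FAKE_WEB` dictionary that replaces real HTTP requests are all assumptions made for illustration.

```python
import re
from collections import defaultdict, deque

# Hypothetical URL classes; real patterns would be configured per site type.
URL_PATTERNS = {
    "finance": re.compile(r"^https?://finance\.[\w.-]+/"),
    "news": re.compile(r"^https?://news\.[\w.-]+/"),
    "forum": re.compile(r"^https?://(?:bbs|forum)\.[\w.-]+/"),
}


def classify_url(url):
    """Return the first class whose pattern yields a non-empty match result."""
    for category, pattern in URL_PATTERNS.items():
        if pattern.findall(url):  # an empty list means "not this class"
            return category
    return None


def expand_sources(seed_urls, fetch_links):
    """Crawl breadth-first until the to-be-crawled list is empty.

    fetch_links(url) is supplied by the caller and returns the URLs found on
    a page; if it raises, that URL is recorded as failed and skipped.
    """
    to_crawl = deque(seed_urls)
    known = set(seed_urls)
    failed = []
    while to_crawl:
        url = to_crawl.popleft()
        try:
            links = fetch_links(url)
        except Exception:
            failed.append(url)
            continue
        for link in links:
            # Keep only new URLs that belong to a known class.
            if link not in known and classify_url(link) is not None:
                known.add(link)
                to_crawl.append(link)
    grouped = defaultdict(list)
    for url in sorted(known):
        grouped[classify_url(url) or "unclassified"].append(url)
    return dict(grouped), failed


# Tiny in-memory "web" standing in for real HTTP requests.
FAKE_WEB = {
    "https://finance.example.com/": ["https://news.example.com/", "https://ads.example.net/"],
    "https://news.example.com/": ["https://bbs.example.org/"],
}

grouped, failed = expand_sources(["https://finance.example.com/"], FAKE_WEB.__getitem__)
```

On a real network the loop would need a page limit or depth limit, since the set of reachable URLs is effectively unbounded; the sketch omits this because the fake web is finite.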
Referring to Fig. 5, Fig. 5 is another schematic sub-flowchart of the project public sentiment monitoring method provided by the embodiments of the present application. As shown in Fig. 5, in this embodiment, the step of parsing the corpus by natural language processing to identify the subject names and public sentiment features contained in the corpus includes:
S2400: split the corpus at sentence boundaries to obtain a sentence data set.
Here, sentence boundaries include sentence-ending marks and splitting words. Sentence-ending marks include punctuation such as ".", "?", ";" and "!". Splitting words are preset characters or words that can serve as sentence separators, such as "and", "in", "we" and "according to".
Specifically, the corpus crawled by the crawler system is split at sentence boundaries to obtain a sentence data set, from which the sentences containing names can then be filtered out.
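Step S2400 can be sketched with a regular expression. This sketch handles only the punctuation marks, not the preset splitting words, and the function names `split_sentences` and `sentences_mentioning` are hypothetical. Splitting on an ASCII period is a simplification that would also break decimals and URLs, so real corpora would need a more careful rule.

```python
import re

# Sentence-ending marks, in both full-width (Chinese) and ASCII forms.
SENTENCE_MARKS = "。？；！.?;!"
_SPLIT_RE = re.compile("[" + re.escape(SENTENCE_MARKS) + "]+")


def split_sentences(corpus):
    """Split text at sentence-ending marks and drop empty fragments."""
    parts = _SPLIT_RE.split(corpus)
    return [part.strip() for part in parts if part.strip()]


def sentences_mentioning(sentences, names):
    """Keep only the sentences that contain at least one of the given names."""
    return [s for s in sentences if any(name in s for name in names)]


sentences = split_sentences("The screen is large. Battery life is long! Is it hot?")
about_screen = sentences_mentioning(sentences, ["screen"])
```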
S2401: construct a named entity model from the corpus.
Here, a named entity is a person name, organization name, place name, or any other entity identified by a name. In a broader sense, entities also include numbers, dates, currencies, addresses and the like.
Specifically, the named entity model is built by annotating named entities in the acquired corpus and training a named entity recognition model with a CRF model, which is then used to identify the subject names of the target project. A conditional random field (CRF) is a statistical model that has been widely used in natural language processing in recent years. In essence, a CRF is a Markov chain of hidden variables together with the conditional probability of the hidden variables given the observable states.
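For reference, the conditional probability that a CRF models is usually written in the linear-chain form below. The patent does not give a formula, so the notation here is the standard textbook one and is an assumption: x is the observed word sequence, y the hidden label sequence, f_k the feature functions, and \lambda_k their learned weights.

```latex
p(y \mid x) = \frac{1}{Z(x)} \exp\Bigl( \sum_{t=1}^{T} \sum_{k} \lambda_k \, f_k(y_{t-1}, y_t, x, t) \Bigr),
\qquad
Z(x) = \sum_{y'} \exp\Bigl( \sum_{t=1}^{T} \sum_{k} \lambda_k \, f_k(y'_{t-1}, y'_t, x, t) \Bigr)
```

Training estimates the weights \lambda_k from annotated sentences, and recognition picks the label sequence y with the highest p(y | x).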
S2402: identify the subject names contained in the sentence data set using the named entity model.
Here, named entity recognition (NER), also called "proper name recognition", refers to identifying entities with specific meaning in text, mainly person names, place names, organization names and proper nouns.
Specifically, once the named entity model has been built, the sentence data set is processed with it, and the subject names contained in the sentence data set are identified automatically. For example, named entities are annotated in the corpus, a named entity recognition model is built with a CRF model, and company subject names are identified. The named entity model is applied to the sentences in the information related to the target project, and part-of-speech analysis and retrieval of the project's key relationships are performed on the words. If core keywords are found, for example "long battery life" or "system is easy to operate" for a mobile phone product, the related information is saved as a specific attribute of the target project. The attribute can also carry the current date and time, enriching the public sentiment data in the public sentiment relation map of the target project.
S2403: perform part-of-speech analysis and target relationship retrieval on the corpus to obtain the public sentiment features of the target project.
Here, part of speech is the classification of words by their grammatical characteristics, such as verb and noun. A target relationship is a relationship between the subjects involved in the target project as they appear in the corpus, for example the subordinate relationship between a mobile phone and its components, or the containment relationship between a mobile phone and its camera.
Specifically, part-of-speech analysis and subject relationship identification are performed on the corpus as follows:
First, segment the corpus into words. Jieba can be used for this. Jieba is one of the word segmentation tools available in Python, of which there are many, including Pangu, Yaha, Jieba and Tsinghua University's THULAC.
Second, extract the key relationships. Specifically, the verbs expressing actions are extracted and matched against a keyword list. If a verb appears in the keyword list, it is treated as a key relationship. The noun following the verb is taken as the relationship object, and the noun preceding the verb is taken as the relationship subject, which is the target. The relationship subject, the relationship object and the relationship between them are taken as the public sentiment feature, and the subject names involved in the extracted key relationships, together with the feature data describing their attributes, are stored in the graph database.
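The extraction rule above (a verb found in the keyword list, the nearest noun before it as subject, the nearest noun after it as object) can be sketched on tokens that have already been segmented and tagged. The keyword list `KEY_VERBS`, the function name `extract_relations`, and the tag strings "n" and "v" are assumptions for illustration; a real pipeline would take its tags from whichever segmenter is used.

```python
# Hypothetical keyword list of verbs that count as key relationships.
KEY_VERBS = {"contains", "uses", "supports"}


def extract_relations(tagged_tokens, key_verbs=KEY_VERBS):
    """Return (subject, verb, object) triples from (word, pos) pairs.

    pos is assumed to be "n" for nouns and "v" for verbs; other tags are ignored.
    """
    triples = []
    for i, (word, pos) in enumerate(tagged_tokens):
        if pos != "v" or word not in key_verbs:
            continue
        # Nearest noun before the verb is the relationship subject.
        subject = next((w for w, p in reversed(tagged_tokens[:i]) if p == "n"), None)
        # Nearest noun after the verb is the relationship object.
        obj = next((w for w, p in tagged_tokens[i + 1:] if p == "n"), None)
        if subject is not None and obj is not None:
            triples.append((subject, word, obj))
    return triples


tokens = [("phone", "n"), ("uses", "v"), ("a", "x"), ("new", "a"), ("processor", "n")]
triples = extract_relations(tokens)
```

Each triple maps directly onto two nodes and one relationship in the graph database.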
Further, the step of constructing the named entity model from the corpus includes:
1) segment the corpus to obtain a word segmentation result;
2) extract feature data from the word segmentation result using preset feature templates;
3) train a preset conditional random field model on the feature data to construct the named entity model.
Specifically, the named entity model is built from the acquired corpus through the following steps:
First, obtain the named entity training corpus, which comes mainly from the target project corpus that the crawler system obtains by crawling.
Second, preprocess the corpus, mainly by segmenting it with Jieba and removing stop words and meaningless words to obtain the word segmentation result.
Third, extract features using feature templates composed of regular expressions. The features obtained include words, parts of speech, boundary words and named entity indicator words.
Fourth, create and train a model based on a conditional random field, that is, a CRF model. The CRF model is trained on the training data to obtain its parameters, and the trained model is saved.
Fifth, evaluate the model on test data and keep the final model that meets requirements such as a high recognition rate, thereby obtaining the named entity model.
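The feature extraction in the third step can be sketched as a function that turns each token into a feature dictionary of the kind CRF toolkits commonly accept. This sketch swaps the regular-expression feature templates described above for plain Python lookups. The feature names, the `INDICATOR_WORDS` set, and the reading of "boundary words" as the neighbouring tokens are assumptions, and training and evaluating the CRF itself are left out.

```python
# Hypothetical indicator words that often end an organization name.
INDICATOR_WORDS = {"Inc", "Ltd", "Company"}


def token_features(tokens, i):
    """Build the feature dictionary for the i-th (word, pos) token."""
    word, pos = tokens[i]
    features = {
        "word": word,
        "pos": pos,
        "is_first": i == 0,
        "is_last": i == len(tokens) - 1,
        "is_indicator": word in INDICATOR_WORDS,
    }
    # Neighbouring words act as boundary context for the current token.
    if i > 0:
        features["prev_word"] = tokens[i - 1][0]
    if i < len(tokens) - 1:
        features["next_word"] = tokens[i + 1][0]
    return features


def sentence_features(tokens):
    """Feature dictionaries for every token in one segmented sentence."""
    return [token_features(tokens, i) for i in range(len(tokens))]


features = sentence_features([("Acme", "n"), ("Inc", "n"), ("released", "v")])
```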
Referring to Fig. 6, Fig. 6 is a third schematic sub-flowchart of the project public sentiment monitoring method provided by the embodiments of the present application. In this embodiment, the step of parsing the corpus by natural language processing to identify the subject names and public sentiment features contained in the corpus includes:
S2500: segment the corpus to obtain a word list of the corpus.
Specifically, the corpus is preprocessed, mainly by segmenting it with Jieba and removing stop words and meaningless words, to obtain the word list.
S2501: extract the key relationships in the word list using a first regular expression to obtain the public sentiment features;
S2502: extract the named entities involved in the key relationships in the word list using a second regular expression to obtain the subject names.
Specifically, the key relationships are extracted with regular expressions. The verbs expressing actions are extracted by regular expression and matched against a keyword list. If a verb appears in the keyword list, it is treated as a key relationship. The noun following the verb is taken as the relationship object, and the noun preceding the verb is taken as the relationship subject, which is the target. The relationship subject, the relationship object and the relationship between them are taken as the public sentiment feature. A regular expression, often abbreviated in code as regex, regexp or RE, is commonly used to search for and replace text that matches a given pattern (rule).
Then, the key relationships, as public sentiment features, and the named entities, as subject names, are imported into the graph database to construct the public sentiment relation map of the target project. When the graph database is designed, node sets, nodes and relationships and the connections among them are kept distinct. When data is imported, the graph database identifies the node data and the relationship data in the import and stores each in its corresponding place. In this example, once the key relationships and the named entities have been imported, the public sentiment relation map of the target project can be constructed automatically. For example, the key relationships are built as relationships between the mobile phone and its components, and the feature descriptions of each component are imported as attributes of the component node. As noted above, a graph database is a type of NoSQL database that applies graph theory to store relationship information between entities. Common graph databases include Neo4j, FlockDB and AllegroGraph.
Referring again to Fig. 3, as shown in Fig. 3, in this embodiment, after the step of displaying the public sentiment relation map, the method further includes:
S270: combine the elements of the public sentiment relation map in a preset order to describe the public sentiment of the target project in text form.
Further, the step of combining the elements of the public sentiment relation map in a preset order to describe the public sentiment of the target project in text form includes:
combining the elements of the public sentiment relation map in a preset order to describe, in text form, the positive public sentiment information, negative public sentiment information, event evaluation information and channel evaluation information of the target project.
Specifically, the public sentiment of the target project is not only displayed as a public sentiment relation map for monitoring. Text is also displayed alongside it to present the public sentiment conclusions drawn from the map for the reference of monitoring personnel. The conclusions include positive public sentiment information, negative public sentiment information, event evaluation information and channel evaluation information. Positive public sentiment information refers to the positive effects of public sentiment, for example, for a mobile phone, that its battery life is long and its appearance is attractive. Negative public sentiment information refers to the negative effects, for example that the phone's battery life is short or that it gets too hot while charging. Event evaluation information refers to predicting and estimating the impact of a particular event in the public sentiment, for example the impact of a product review report on the product. Channel evaluation information refers to the impact on the target of the channel from which the corpus originates. Because websites differ in audience, scale and influence, the effect of the channel carrying an event on the target needs to be estimated. For example, Weibo, WeChat Moments and forums influence the target differently.
When the elements of the public sentiment relation map are combined in a preset order to describe the public sentiment of the target project in text form, the relationship information between entities stored in the graph database can be used. Because the graph database distinguishes node sets, nodes and relationships and the connections among them, the relationships between nodes can be expressed in words, so that the public sentiment of the target project is described in text form and monitoring personnel receive textual prompts. For example, if node A and node B in the map are linked by a subordinate relationship, the text description can read "node A is subordinate to node B". Further, if it is found that node A influences node B, the information about that influence can be filtered from the acquired corpus, and a summary of how node A influences node B can be formed using regular expressions or a trained language model and presented to monitoring personnel in text form for reference, for example the influence of a phone's processor, or of its battery, on the phone. The language model can be, for example, an N-gram language model or a neural network language model.
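Turning stored relationships into sentences can be sketched with one text template per relationship type. The relation names, the template wording and the function name `describe` are hypothetical, and the language-model summarization mentioned above is not covered here.

```python
# Hypothetical sentence templates, one per relationship type.
TEMPLATES = {
    "BELONGS_TO": "{src} is subordinate to {dst}",
    "CONTAINS": "{src} contains {dst}",
    "DESCRIBED_AS": "{src} is described as {dst}",
}
FALLBACK = "{src} is related to {dst} ({rel})"


def describe(edges, templates=TEMPLATES):
    """Render (source, relation, target) edges as sentences, in the given order."""
    lines = []
    for src, rel, dst in edges:
        template = templates.get(rel, FALLBACK)
        lines.append(template.format(src=src, rel=rel, dst=dst))
    return lines


report = describe([
    ("node A", "BELONGS_TO", "node B"),
    ("mobile phone", "CONTAINS", "display screen"),
    ("processor", "AFFECTS", "mobile phone"),
])
```

The order of the input edges determines the order of the output sentences, which is one simple way to realize a preset order.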
Further, in one embodiment, constructing the graph data of the target project also builds a news corpus related to the target project. Before visualization, the public sentiment data of the target project must be sorted by time and the top-ranked news items selected in chronological order, which further filters for valid data and improves processing efficiency. For example, constructing the graph data of a mobile phone builds a news corpus related to that phone. Before visualization, the public sentiment data of the phone is sorted by time and the most recent top-ranked news items are selected, revealing which aspects of the phone, such as its camera or battery capacity, have recently attracted the most attention.
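The time-ordering step can be sketched as a sort on the publication time followed by a cut-off. The record layout (a `title` and a `published` field), the function name `latest_news` and the sample data are assumptions for illustration.

```python
from datetime import datetime


def latest_news(items, top_n=2):
    """Return the top_n most recently published items, newest first."""
    return sorted(items, key=lambda item: item["published"], reverse=True)[:top_n]


news = [
    {"title": "Camera review", "published": datetime(2019, 3, 1)},
    {"title": "Battery capacity test", "published": datetime(2019, 3, 20)},
    {"title": "Launch announcement", "published": datetime(2019, 1, 15)},
]
recent = latest_news(news)
```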
In addition, fields associated with the target project can be analyzed in depth. For example, for product competition relationships, public sentiment data related to the target project is obtained from the attributes of competing products, classified and deduplicated, and presented to the user.
Further, once the public sentiment of the target project has been obtained and monitored, responses can be made accordingly to protect the target project, for example through corporate image and product public relations that safeguard the image and interests of the enterprise. For example, if the target project is a mobile phone, the positive and negative information in the phone's public sentiment is obtained, together with event evaluation information and channel evaluation information, so that corresponding public relations measures can be taken.
It should be noted that, for the project public sentiment monitoring method described in the above embodiments, the technical features contained in different embodiments can be recombined as needed to obtain combined embodiments, all of which fall within the protection scope claimed by this application.
Please refer to Fig. 7, which is a schematic block diagram of the project public sentiment monitoring device provided by an embodiment of this application. Corresponding to the above project public sentiment monitoring method, an embodiment of this application also provides a project public sentiment monitoring device. As shown in Fig. 7, the device includes units for executing the above method, and the device can be configured in computer equipment such as a terminal. Specifically, referring to Fig. 7, the project public sentiment monitoring device 700 includes a first acquisition unit 701, a second acquisition unit 702, a crawling unit 703, a recognition unit 704, a construction unit 705 and a display unit 706.
The first acquisition unit 701 is configured to obtain the identification information of the target project in a preset manner;
The second acquisition unit 702 is configured to obtain the data source website list of the target project by web search according to the identification information;
The crawling unit 703 is configured to crawl the corpus of the target project from the data source websites included in the data source website list according to the identification information;
The recognition unit 704 is configured to parse the corpus by natural language processing to identify the subject names and public sentiment features contained in the corpus;
The construction unit 705 is configured to import the subject names and the public sentiment features into a graph database to construct the public sentiment relationship graph of the target project;
The display unit 706 is configured to display the public sentiment relationship graph.
Please refer to Fig. 8, which is another schematic block diagram of the project public sentiment monitoring device provided by an embodiment of this application. As shown in Fig. 8, in this embodiment, the project public sentiment monitoring device 700 further includes:
An updating unit 707, configured to update the data source website list by crawling.
Still referring to Fig. 8, the updating unit 707 includes:
An obtaining subunit 7071, configured to obtain the initial data source website list of the target project;
A classification subunit 7072, configured to classify the initial data source website list according to preset conditions to obtain data source website lists of different types;
An encapsulation subunit 7073, configured to encapsulate the data source website lists of different types into corresponding Docker containers;
A crawling subunit 7074, configured to start the Docker containers so that the Docker containers obtain new data source websites by crawling;
An updating subunit 7075, configured to add the new data source websites, according to type, to the corresponding classified data source website lists so as to update the data source websites of the target project.
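The classification and update steps of the updating unit can be sketched as follows. The Docker encapsulation and the crawl itself are left out; the sketch only shows how a site list might be grouped by type and how newly found sites might be merged back in. The keyword-based `rules` mapping stands in for the unspecified "preset conditions" and is an assumption, as are the function names and the example URLs.

```python
from collections import defaultdict
from urllib.parse import urlparse


def classify_sites(sites, rules):
    """Group site URLs by type; `rules` maps a type name to host keywords."""
    groups = defaultdict(list)
    for url in sites:
        host = urlparse(url).netloc
        # First type whose keyword appears in the host name; otherwise "other".
        site_type = next(
            (t for t, keywords in rules.items() if any(k in host for k in keywords)),
            "other",
        )
        groups[site_type].append(url)
    return dict(groups)


def merge_new_sites(groups, discovered):
    """Add newly crawled sites to the list of the same type, skipping duplicates."""
    updated = {t: list(urls) for t, urls in groups.items()}
    for site_type, urls in discovered.items():
        bucket = updated.setdefault(site_type, [])
        for url in urls:
            if url not in bucket:
                bucket.append(url)
    return updated


rules = {"news": ["news"], "forum": ["forum", "bbs"]}  # hypothetical preset conditions
groups = classify_sites(
    [
        "https://news.example.com/phones",
        "https://forum.example.org/mobile",
        "https://shop.example.net",
    ],
    rules,
)
updated = merge_new_sites(
    groups,
    {"news": ["https://news.example.com/phones", "https://daily-news.example.com"]},
)
```

In the described device, each typed list would be handed to its own Docker container, and the `discovered` mapping would come from what those containers crawl.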
In one embodiment, the public sentiment relationship graph of the target project includes the target project name, the sub-project names of the target project, and the public sentiment features corresponding to the sub-project names.
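The shape of such a graph can be illustrated with a small in-memory structure. This is only a sketch: the source imports the data into a graph database, whereas the code below uses plain Python collections, and the node labels and relationship names (`HAS_SUB_PROJECT`, `HAS_FEATURE`) are hypothetical.

```python
def build_sentiment_graph(project, features_by_subproject):
    """Return (nodes, edges) linking a project, its sub-projects and their features."""
    nodes = {project: "project"}
    edges = []
    for sub_project, features in features_by_subproject.items():
        nodes[sub_project] = "sub_project"
        edges.append((project, "HAS_SUB_PROJECT", sub_project))
        for feature in features:
            # A feature node is created once even if several sub-projects share it.
            nodes.setdefault(feature, "sentiment_feature")
            edges.append((sub_project, "HAS_FEATURE", feature))
    return nodes, edges


# Hypothetical example: a phone with two sub-projects.
nodes, edges = build_sentiment_graph(
    "Phone X",
    {"battery": ["short life"], "camera": ["sharp photos", "slow focus"]},
)
```

The resulting node and edge lists map directly onto the nodes and relationships that a graph database would store.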
Referring to Fig. 8, in this embodiment, the recognition unit 704 includes:
A segmentation subunit 7041, configured to split the corpus at sentence boundaries to obtain a sentence data set;
A construction subunit 7042, configured to construct a named entity model from the corpus;
An identification subunit 7043, configured to identify, by means of the named entity model, the subject names contained in the sentence data set;
A retrieval subunit 7044, configured to perform part-of-speech analysis and target relationship retrieval on the corpus to obtain the public sentiment features of the target project.
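The first steps of the recognition unit can be sketched as follows. Sentence splitting is done on end-of-sentence punctuation (Latin and Chinese). For subject recognition, the sketch substitutes a simple dictionary lookup for the named entity model, since the source does not describe how that model is built; the function names and the `known_subjects` list are assumptions.

```python
import re

# A sentence is a run of non-terminator characters plus an optional terminator.
SENTENCE_PATTERN = re.compile(r"[^.!?。！？]+[.!?。！？]?")


def split_sentences(corpus):
    """Split a corpus into sentences at end-of-sentence punctuation."""
    parts = SENTENCE_PATTERN.findall(corpus)
    return [part.strip() for part in parts if part.strip()]


def find_subjects(sentences, known_subjects):
    """Pair each sentence with the known subject names it mentions.

    Dictionary lookup is used here as a stand-in for a named entity model.
    """
    found = []
    for sentence in sentences:
        lowered = sentence.lower()
        hits = [name for name in known_subjects if name.lower() in lowered]
        found.append((sentence, hits))
    return found


corpus = "The camera is great. The battery drains fast! Is the processor new?"
sentences = split_sentences(corpus)
subjects = find_subjects(sentences, ["camera", "battery"])
```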
Referring to Fig. 8, in this embodiment, the project public sentiment monitoring device 700 further includes:
A description unit 708, configured to combine the elements in the public sentiment relationship graph in a preset order so as to describe the public sentiment of the target project in text form.
In one embodiment, the description unit 708 is configured to combine the elements in the public sentiment relationship graph in a preset order so as to describe, in text form, the positive public sentiment information, negative public sentiment information, event evaluation information and channel evaluation information of the target project.
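Combining graph elements into text in a preset order might look like the sketch below. The four keys mirror the four kinds of information named above; the key names, the output wording and the function name are assumptions made for illustration.

```python
# Assumed preset order of the four kinds of information.
PRESET_ORDER = ["positive", "negative", "event_evaluation", "channel_evaluation"]


def describe_sentiment(project, elements, order=PRESET_ORDER):
    """Render graph elements as text, following the preset order."""
    lines = [f"Public sentiment for {project}:"]
    for key in order:
        values = elements.get(key)
        if values:  # skip kinds with no entries
            label = key.replace("_", " ")
            lines.append(f"- {label}: {', '.join(values)}")
    return "\n".join(lines)


summary = describe_sentiment(
    "Phone X",
    {"negative": ["battery drains fast"], "positive": ["sharp photos"]},
)
```

The output always lists positive information before negative information, regardless of the order in which the elements were supplied.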
It should be noted that those skilled in the art can clearly understand that, for the specific implementation of the above project public sentiment monitoring device and its units, reference may be made to the corresponding descriptions in the foregoing method embodiments; for convenience and brevity of description, details are not repeated here.
Meanwhile, the division and connection of the units in the above project public sentiment monitoring device are for illustration only. In other embodiments, the device may be divided into different units as required, or the units may be connected in a different order and manner, to complete all or part of the functions of the device.
The above project public sentiment monitoring device may be implemented in the form of a computer program, which can run on computer equipment as shown in Fig. 9.
Please refer to Fig. 9, which is a schematic block diagram of computer equipment provided by an embodiment of this application. The computer equipment 900 may be a desktop computer, a server or similar computer equipment, or may be a component or part of another device.
Referring to Fig. 9, the computer equipment 900 includes a processor 902, a memory and a network interface 905 connected via a system bus 901, where the memory may include a non-volatile storage medium 903 and an internal memory 904.
The non-volatile storage medium 903 can store an operating system 9031 and a computer program 9032. When executed, the computer program 9032 can cause the processor 902 to perform the above project public sentiment monitoring method.
The processor 902 provides computing and control capabilities to support the operation of the entire computer equipment 900.
The internal memory 904 provides an environment for running the computer program 9032 stored in the non-volatile storage medium 903; when the computer program 9032 is executed by the processor 902, the processor 902 can be caused to perform the above project public sentiment monitoring method.
The network interface 905 is used for network communication with other devices. Those skilled in the art will understand that the structure shown in Fig. 9 is only a block diagram of the part of the structure related to the solution of this application and does not limit the computer equipment 900 to which the solution is applied; specific computer equipment 900 may include more or fewer components than shown, combine certain components, or have a different arrangement of components. For example, in some embodiments the computer equipment may include only a memory and a processor; in such embodiments, the structure and functions of the memory and processor are consistent with the embodiment shown in Fig. 9 and are not repeated here.
The processor 902 is configured to run the computer program 9032 stored in the memory to implement the following steps: obtaining the identification information of the target project in a preset manner; obtaining the data source website list of the target project by web search according to the identification information; crawling the corpus of the target project from the data source websites included in the data source website list according to the identification information; parsing the corpus by natural language processing to identify the subject names and public sentiment features contained in the corpus; importing the subject names and the public sentiment features into a graph database to construct the public sentiment relationship graph of the target project; and displaying the public sentiment relationship graph.
In one embodiment, after implementing the step of obtaining the data source website list of the target project by web search according to the identification information, the processor 902 further implements the following step:
Updating the data source website list by crawling.
In one embodiment, when implementing the step of updating the data source website list by crawling, the processor 902 specifically implements the following steps:
Obtaining the initial data source website list of the target project;
Classifying the initial data source website list according to preset conditions to obtain data source website lists of different types;
Encapsulating the data source website lists of different types into corresponding Docker containers;
Starting the Docker containers so that the Docker containers obtain new data source websites by crawling;
Adding the new data source websites, according to type, to the corresponding classified data source website lists so as to update the data source websites of the target project.
In one embodiment, when the processor 902 implements the public sentiment relationship graph of the target project, the public sentiment relationship graph specifically includes the following: the target project name, the sub-project names of the target project, and the public sentiment features corresponding to the sub-project names.
In one embodiment, when implementing the step of parsing the corpus by natural language processing to identify the subject names and public sentiment features contained in the corpus, the processor 902 specifically implements the following steps:
Splitting the corpus at sentence boundaries to obtain a sentence data set;
Constructing a named entity model from the corpus;
Identifying, by means of the named entity model, the subject names contained in the sentence data set;
Performing part-of-speech analysis and target relationship retrieval on the corpus to obtain the public sentiment features of the target project.
In one embodiment, after implementing the step of displaying the public sentiment relationship graph, the processor 902 further implements the following step:
Combining the elements in the public sentiment relationship graph in a preset order so as to describe the public sentiment of the target project in text form.
In one embodiment, when implementing the step of combining the elements in the public sentiment relationship graph in a preset order so as to describe the public sentiment of the target project in text form, the processor 902 specifically implements the following step:
Combining the elements in the public sentiment relationship graph in a preset order so as to describe, in text form, the positive public sentiment information, negative public sentiment information, event evaluation information and channel evaluation information of the target project.
It should be understood that, in the embodiments of this application, the processor 902 may be a central processing unit (Central Processing Unit, CPU), or may be another general-purpose processor, a digital signal processor (Digital Signal Processor, DSP), an application-specific integrated circuit (Application Specific Integrated Circuit, ASIC), a field-programmable gate array (Field-Programmable Gate Array, FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, and so on. The general-purpose processor may be a microprocessor or any conventional processor.
Those of ordinary skill in the art will understand that all or part of the processes in the methods of the above embodiments can be completed by a computer program, which can be stored in a computer-readable storage medium. The computer program is executed by at least one processor in the computer system to implement the process steps of the above method embodiments.
Therefore, this application also provides a computer-readable storage medium. The computer-readable storage medium may be a non-volatile computer-readable storage medium that stores a computer program which, when executed by a processor, causes the processor to execute the following steps:
A computer program product which, when run on a computer, causes the computer to execute the steps of the project public sentiment monitoring method described in the above embodiments.
The computer-readable storage medium may be an internal storage unit of the aforementioned equipment, such as a hard disk or memory of the equipment. The computer-readable storage medium may also be an external storage device of the equipment, such as a plug-in hard disk, a smart media card (Smart Media Card, SMC), a secure digital (Secure Digital, SD) card or a flash card (Flash Card) fitted to the equipment. Further, the computer-readable storage medium may include both an internal storage unit and an external storage device of the equipment.
Those skilled in the art can clearly understand that, for convenience and brevity of description, for the specific working processes of the equipment, devices and units described above, reference may be made to the corresponding processes in the foregoing method embodiments, which are not repeated here.
The computer-readable storage medium may be any of various computer-readable storage media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (Read-Only Memory, ROM), a magnetic disk or an optical disc.
Those of ordinary skill in the art will appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware, computer software, or a combination of the two. To clearly illustrate the interchangeability of hardware and software, the composition and steps of each example have been described above generally in terms of function. Whether these functions are implemented in hardware or software depends on the specific application and design constraints of the technical solution. Skilled practitioners may use different methods to implement the described functions for each specific application, but such implementations should not be considered to go beyond the scope of this application.
In the several embodiments provided in this application, it should be understood that the disclosed devices and methods may be implemented in other ways. For example, the device embodiments described above are merely illustrative. For example, the division of the units is only a division by logical function, and other divisions are possible in actual implementation. For example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed.
The steps in the methods of the embodiments of this application may be reordered, merged and deleted according to actual needs. The units in the devices of the embodiments of this application may be combined, divided and deleted according to actual needs. In addition, the functional units in the embodiments of this application may be integrated into one processing unit, may each exist physically on their own, or two or more units may be integrated into one unit.
If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it can be stored in a storage medium. Based on this understanding, the technical solution of this application in essence, or the part that contributes to the prior art, or all or part of the technical solution, can be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause an electronic device (which may be a personal computer, a terminal, a network device, etc.) to execute all or part of the steps of the methods of the embodiments of this application.
The above are only specific embodiments of this application, but the protection scope of this application is not limited thereto. Any person familiar with the technical field can readily conceive of various equivalent modifications or replacements within the technical scope disclosed in this application, and these modifications or replacements shall all fall within the protection scope of this application. Therefore, the protection scope of this application shall be subject to the protection scope of the claims.
Claims (10)
1. A project public sentiment monitoring method, characterized in that the method comprises:
obtaining the identification information of a target project in a preset manner;
obtaining a data source website list of the target project by web search according to the identification information;
crawling the corpus of the target project from the data source websites included in the data source website list according to the identification information;
parsing the corpus by natural language processing to identify the subject names and public sentiment features contained in the corpus;
importing the subject names and the public sentiment features into a graph database to construct a public sentiment relationship graph of the target project;
displaying the public sentiment relationship graph.
2. The project public sentiment monitoring method according to claim 1, characterized in that, after the step of obtaining the data source website list of the target project by web search according to the identification information, the method further comprises:
updating the data source website list by crawling.
3. The project public sentiment monitoring method according to claim 2, characterized in that the step of updating the data source website list by crawling comprises:
obtaining an initial data source website list of the target project;
classifying the initial data source website list according to preset conditions to obtain data source website lists of different types;
encapsulating the data source website lists of different types into corresponding Docker containers;
starting the Docker containers so that the Docker containers obtain new data source websites by crawling;
adding the new data source websites, according to type, to the corresponding classified data source website lists so as to update the data source websites of the target project.
4. The project public sentiment monitoring method according to claim 1, characterized in that the public sentiment relationship graph of the target project includes the target project name, the sub-project names of the target project, and the public sentiment features corresponding to the sub-project names.
5. The project public sentiment monitoring method according to claim 1, characterized in that the step of parsing the corpus by natural language processing to identify the subject names and public sentiment features contained in the corpus comprises:
splitting the corpus at sentence boundaries to obtain a sentence data set;
constructing a named entity model from the corpus;
identifying, by means of the named entity model, the subject names contained in the sentence data set;
performing part-of-speech analysis and target relationship retrieval on the corpus to obtain the public sentiment features of the target project.
6. The project public sentiment monitoring method according to claim 1, characterized in that, after the step of displaying the public sentiment relationship graph, the method further comprises:
combining the elements in the public sentiment relationship graph in a preset order so as to describe the public sentiment of the target project in text form.
7. The project public sentiment monitoring method according to claim 6, characterized in that the public sentiment of the target project includes positive public sentiment information, negative public sentiment information, event evaluation information and channel evaluation information.
8. A project public sentiment monitoring device, characterized by comprising:
a first acquisition unit, configured to obtain the identification information of a target project in a preset manner;
a second acquisition unit, configured to obtain a data source website list of the target project by web search according to the identification information;
a crawling unit, configured to crawl the corpus of the target project from the data source websites included in the data source website list according to the identification information;
a recognition unit, configured to parse the corpus by natural language processing to identify the subject names and public sentiment features contained in the corpus;
a construction unit, configured to import the subject names and the public sentiment features into a graph database to construct a public sentiment relationship graph of the target project;
a display unit, configured to display the public sentiment relationship graph.
9. Computer equipment, characterized in that the computer equipment comprises a memory and a processor connected to the memory; the memory is configured to store a computer program; the processor is configured to run the computer program stored in the memory to execute the steps of the project public sentiment monitoring method according to any one of claims 1-7.
10. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program which, when executed by a processor, causes the processor to execute the steps of the project public sentiment monitoring method according to any one of claims 1-7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910270796.5A CN110134845A (en) | 2019-04-04 | 2019-04-04 | Project public sentiment monitoring method, device, computer equipment and storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910270796.5A CN110134845A (en) | 2019-04-04 | 2019-04-04 | Project public sentiment monitoring method, device, computer equipment and storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110134845A (en) | 2019-08-16 |
Family
ID=67569394
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910270796.5A Pending CN110134845A (en) | 2019-04-04 | 2019-04-04 | Project public sentiment monitoring method, device, computer equipment and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110134845A (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102708096A (en) * | 2012-05-29 | 2012-10-03 | 代松 | Network intelligence public sentiment monitoring system based on semantics and work method thereof |
WO2018023981A1 (en) * | 2016-08-03 | 2018-02-08 | 平安科技(深圳)有限公司 | Public opinion analysis method, device, apparatus and computer readable storage medium |
CN108874878A (en) * | 2018-05-03 | 2018-11-23 | 众安信息技术服务有限公司 | A kind of building system and method for knowledge mapping |
CN109471937A (en) * | 2018-10-11 | 2019-03-15 | 平安科技(深圳)有限公司 | A kind of file classification method and terminal device based on machine learning |
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111143336A (en) * | 2019-11-27 | 2020-05-12 | 三盟科技股份有限公司 | College scientific research data management-oriented web crawler management method and platform |
CN111611408A (en) * | 2020-05-27 | 2020-09-01 | 北京明略软件系统有限公司 | Public opinion analysis method and device, computer equipment and storage medium |
CN111666426A (en) * | 2020-06-10 | 2020-09-15 | 北京海致星图科技有限公司 | Method, system and equipment for acquiring knowledge graph multi-scene graph data |
CN111858959A (en) * | 2020-07-23 | 2020-10-30 | 平安付科技服务有限公司 | Method and device for generating component relation map, computer equipment and storage medium |
CN112069381A (en) * | 2020-09-27 | 2020-12-11 | 中国科学院深圳先进技术研究院 | Monitoring management method and system based on natural language processing technology |
CN113657547A (en) * | 2021-08-31 | 2021-11-16 | 平安医疗健康管理股份有限公司 | Public opinion monitoring method based on natural language processing model and related equipment thereof |
CN113657547B (en) * | 2021-08-31 | 2024-05-14 | 平安医疗健康管理股份有限公司 | Public opinion monitoring method based on natural language processing model and related equipment thereof |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109614550A (en) | Public sentiment monitoring method, device, computer equipment and storage medium | |
CN110134845A (en) | Project public sentiment monitoring method, device, computer equipment and storage medium | |
CN110532451A (en) | Search method and device for policy text, storage medium, electronic device | |
JP5721818B2 (en) | Use of model information group in search | |
CN110110156A (en) | Industry public sentiment monitoring method, device, computer equipment and storage medium | |
KR101419504B1 (en) | System and method providing a suited shopping information by analyzing the propensity of an user | |
CN108874992A (en) | The analysis of public opinion method, system, computer equipment and storage medium | |
CN109446341A (en) | The construction method and device of knowledge mapping | |
JP4637969B1 (en) | Properly understand the intent of web pages and user preferences, and recommend the best information in real time | |
CN109614476A (en) | Customer service system answering method, device, computer equipment and storage medium | |
CN107578292B (en) | User portrait construction system | |
CN107704503A (en) | User's keyword extracting device, method and computer-readable recording medium | |
CN109684483A (en) | Construction method, device, computer equipment and the storage medium of knowledge mapping | |
CN110263248A (en) | A kind of information-pushing method, device, storage medium and server | |
CN103729359A (en) | Method and system for recommending search terms | |
CN102890702A (en) | Internet forum-oriented opinion leader mining method | |
CN105843796A (en) | Microblog emotional tendency analysis method and device | |
CN106503025A (en) | Method and system is recommended in a kind of application | |
CN110019616A (en) | A kind of POI trend of the times state acquiring method and its equipment, storage medium, server | |
CN110134844A (en) | Subdivision field public sentiment monitoring method, device, computer equipment and storage medium | |
CN108694647A (en) | A kind of method for digging and device of trade company's rationale for the recommendation, electronic equipment | |
CN106909663A (en) | Based on tagging user Brang Preference behavior prediction method and its device | |
CN104331438B (en) | To novel web page contents selectivity abstracting method and device | |
CN109033282A (en) | A kind of Web page text extracting method and device based on extraction template | |
CN105069077A (en) | Search method and device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||