CN110134845A - Project public sentiment monitoring method, device, computer equipment and storage medium - Google Patents

Project public sentiment monitoring method, device, computer equipment and storage medium

Info

Publication number
CN110134845A
Authority
CN
China
Prior art keywords
public sentiment
target project
corpus
data source
project
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910270796.5A
Other languages
Chinese (zh)
Inventor
吴壮伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Technology Shenzhen Co Ltd
Original Assignee
Ping An Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Technology Shenzhen Co Ltd filed Critical Ping An Technology Shenzhen Co Ltd
Priority to CN201910270796.5A priority Critical patent/CN110134845A/en
Publication of CN110134845A publication Critical patent/CN110134845A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/951 Indexing; Web crawling techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing
    • G06F40/211 Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G06F40/295 Named entity recognition

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

An embodiment of the present application provides a project public sentiment monitoring method, a device, computer equipment and a computer-readable storage medium. The embodiment belongs to the technical field of data display. To monitor the public sentiment of a project, identification information of a target project is obtained; a data source website list of the target project is obtained by web search according to the identification information; the corpus of the target project is crawled from the data source websites included in the list according to the identification information; the corpus is parsed by natural language processing to identify the subject names and public sentiment features it contains; the subject names and public sentiment features are imported into a graph database to construct a public sentiment relationship graph of the target project; and the graph is displayed visually. Public sentiment is thereby monitored for a specific aspect of a target organization rather than for the organization as a whole, which improves the efficiency of public sentiment monitoring at that finer granularity.

Description

Project public sentiment monitoring method, device, computer equipment and storage medium
Technical field
The present application relates to the technical field of data display, and in particular to a project public sentiment monitoring method, a device, computer equipment and a computer-readable storage medium.
Background
Enterprise public sentiment information is involved in essentially every business cooperation project today. Conventional techniques collect data from designated websites in a targeted way, for example from financial websites. The information collected this way, however, reflects public opinion about the enterprise as a whole. If the public sentiment on one particular aspect of the enterprise is needed, such as one of its products, an investment or an advertising campaign, it has to be screened out of a large volume of enterprise-wide public sentiment data, and the screened content is not accurate enough, so obtaining the public sentiment for that aspect is inefficient.
Summary of the invention
Embodiments of the present application provide a project public sentiment monitoring method, a device, computer equipment and a computer-readable storage medium, which can solve the problem in conventional techniques that public sentiment monitoring for a target project is inefficient.
In a first aspect, an embodiment of the present application provides a project public sentiment monitoring method, the method comprising: obtaining identification information of a target project in a preset manner; obtaining a data source website list of the target project by web search according to the identification information; crawling the corpus of the target project from the data source websites included in the data source website list according to the identification information; parsing the corpus by natural language processing to identify the subject names and public sentiment features contained in the corpus; importing the subject names and the public sentiment features into a graph database to construct a public sentiment relationship graph of the target project; and displaying the public sentiment relationship graph.
In a second aspect, an embodiment of the present application further provides a project public sentiment monitoring device, comprising: a first obtaining unit, configured to obtain identification information of a target project in a preset manner; a second obtaining unit, configured to obtain a data source website list of the target project by web search according to the identification information; a crawling unit, configured to crawl the corpus of the target project from the data source websites included in the data source website list according to the identification information; a recognition unit, configured to parse the corpus by natural language processing to identify the subject names and public sentiment features contained in the corpus; a construction unit, configured to import the subject names and the public sentiment features into a graph database to construct a public sentiment relationship graph of the target project; and a display unit, configured to display the public sentiment relationship graph.
In a third aspect, an embodiment of the present application further provides computer equipment comprising a memory and a processor, wherein a computer program is stored on the memory, and the processor implements the project public sentiment monitoring method when executing the computer program.
In a fourth aspect, an embodiment of the present application further provides a computer-readable storage medium storing a computer program which, when executed by a processor, causes the processor to perform the project public sentiment monitoring method.
Embodiments of the present application provide a project public sentiment monitoring method, a device, computer equipment and a computer-readable storage medium. To monitor the public sentiment of a project, the embodiments obtain identification information of a target project, obtain a data source website list of the target project by web search according to the identification information, and crawl the corpus of the target project from the data source websites included in the list according to the identification information, so that a more comprehensive corpus about the target project is obtained. The corpus is then parsed by natural language processing to identify the subject names and public sentiment features it contains, the subject names and public sentiment features are imported into a graph database to construct a public sentiment relationship graph of the target project, and the graph is displayed visually. Public sentiment is thereby monitored for a specific aspect of a target organization, which improves the efficiency of public sentiment monitoring at that finer granularity.
Brief description of the drawings
To describe the technical solutions of the embodiments of the present application more clearly, the drawings needed for describing the embodiments are briefly introduced below. The drawings described below evidently show some embodiments of the present application, and a person of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is the application scenarios schematic diagram of project public sentiment monitoring method provided by the embodiments of the present application;
Fig. 2 is the flow diagram of project public sentiment monitoring method provided by the embodiments of the present application;
Fig. 3 is another flow diagram of project public sentiment monitoring method provided by the embodiments of the present application;
Fig. 4 is a schematic flowchart of a sub-process of the project public sentiment monitoring method provided by the embodiments of the present application;
Fig. 5 is a schematic flowchart of another sub-process of the project public sentiment monitoring method provided by the embodiments of the present application;
Fig. 6 is a schematic flowchart of a third sub-process of the project public sentiment monitoring method provided by the embodiments of the present application;
Fig. 7 is the schematic block diagram of project public sentiment monitoring device provided by the embodiments of the present application;
Fig. 8 is another schematic block diagram of project public sentiment monitoring device provided by the embodiments of the present application;And
Fig. 9 is the schematic block diagram of computer equipment provided by the embodiments of the present application.
Detailed description of the embodiments
The technical solutions in the embodiments of the present application are described clearly and completely below with reference to the drawings in the embodiments. The described embodiments are evidently some, but not all, of the embodiments of the present application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments in the present application without creative effort fall within the protection scope of the present application.
It should be understood that, when used in this specification and the appended claims, the terms "include" and "comprise" indicate the presence of the described features, integers, steps, operations, elements and/or components, but do not exclude the presence or addition of one or more other features, integers, steps, operations, elements, components and/or sets thereof.
It should also be understood that the terms used in this specification serve only to describe specific embodiments and are not intended to limit the present application. As used in this specification and the appended claims, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise.
It should be further understood that the term "and/or" used in this specification and the appended claims refers to any combination and all possible combinations of one or more of the associated listed items, and includes these combinations.
Referring to Fig. 1, Fig. 1 is a schematic diagram of an application scenario of the project public sentiment monitoring method provided by an embodiment of the present application. The method can be applied in the terminal shown in Fig. 1, where software installed on the terminal carries out the steps of the method. The terminal may be an electronic device such as a laptop, a tablet computer or a desktop computer. The method is implemented as follows: the terminal obtains identification information of a target project in a preset manner; obtains a data source website list of the target project by web search according to the identification information; crawls the corpus of the target project from the data source websites included in the list according to the identification information; parses the corpus by natural language processing to identify the subject names and public sentiment features contained in the corpus; imports the subject names and the public sentiment features into a graph database to construct a public sentiment relationship graph of the target project; and displays the public sentiment relationship graph.
It should be noted that Fig. 1 shows only a desktop computer as the terminal. In practice the type of terminal is not limited to what Fig. 1 shows; the terminal may also be an electronic device such as a mobile phone, a laptop or a tablet computer. The application scenario above merely illustrates the technical solution of the present application and is not intended to limit it.
Fig. 2 is a schematic flowchart of the project public sentiment monitoring method provided by an embodiment of the present application. The method is applied in the terminal of Fig. 1 to perform all or part of the functions of the project public sentiment monitoring method.
Referring to Fig. 2, the method includes the following steps S210 to S260:
S210, obtaining identification information of a target project in a preset manner.
Here, the preset manner is either manual input through an input device, or applying natural language processing to the corpus of the target project to obtain vocabulary and then screening the vocabulary obtained.
The target project is a project whose public sentiment an enterprise or other organization has decided to monitor. For an enterprise, for example, the target project is one particular aspect of the enterprise, such as a product, a marketing campaign, an investment or an event; that is, one fine-grained aspect of the enterprise.
The identification information of a target project identifies the main content of the project and describes its key content. For a product such as a mobile phone, for example, it comprises the name of the product and its attributes.
Specifically, the identification information of the target project can be received as input through an input device, so that public sentiment monitoring can be carried out on the project; for a mobile phone, for example, the input may be the brand name, the model, and descriptions of performance such as the processor and battery life. Alternatively, the identification information can be obtained by natural language processing: an initial corpus is crawled from the corpus sources of the target project, the initial corpus is processed by natural language processing such as word segmentation and screening, and the identification information of the target project that currently attracts attention is screened out. A larger and more comprehensive corpus for the target project is then crawled according to the identification information, and the larger corpus supports more accurate public sentiment monitoring of the target project. For example, the terminal crawls the data of an enterprise over a preset period, screens out the enterprise's current hot target project from the corpus by natural language processing, and obtains the identification information of that target project; that is, it screens out the hot events about the enterprise that currently attract attention. The terminal then crawls the data of the target project according to the identification information and analyzes the data further to obtain the public sentiment of the target project for the enterprise's reference.
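The natural-language-processing route described above can be sketched as follows. This is a minimal illustration only: it assumes that "screening" means keeping the most frequent non-stop-word tokens, which the source does not specify, and the function name `screen_identification_terms`, the stop-word list and the sample texts are all hypothetical.

```python
import re
from collections import Counter

# Hypothetical stop-word list; a real system would use a fuller one.
STOP_WORDS = {"the", "a", "of", "and", "is", "to", "in", "has", "its", "for"}


def screen_identification_terms(initial_corpus, top_n=3):
    """Return the top_n most frequent non-stop-word tokens in the corpus.

    Frequency ranking is an assumption standing in for the unspecified
    screening step of the method.
    """
    counts = Counter()
    for text in initial_corpus:
        tokens = re.findall(r"[a-z0-9]+", text.lower())
        counts.update(t for t in tokens if t not in STOP_WORDS)
    return [term for term, _ in counts.most_common(top_n)]


# Illustrative initial corpus crawled for one enterprise.
initial_corpus = [
    "The X1 phone has a bright display",
    "Battery of the X1 phone drains fast",
    "X1 phone camera review",
]
terms = screen_identification_terms(initial_corpus, top_n=2)
```

Here `terms` contains the two tokens that occur in every sample text, which would then serve as identification information for the next step.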
S220, obtaining a data source website list of the target project by web search according to the identification information.
Specifically, after obtaining the identification information of the target project, the terminal searches according to that information to obtain the data source websites relevant to the target project. A data source website is a corpus source of the target project; if there are several data source websites, they form a data source website list. For example, to understand the public sentiment on one of its products, that is, how the outside world evaluates the product, an enterprise needs to monitor public sentiment for that product. Keywords of the product, such as its brand name and model, can be obtained, and the data sources relevant to the product are obtained according to the keywords. The data source may of course also be a manually entered web address. According to the entered address and the keywords, the corpus relevant to the public sentiment of the product is obtained. The corpus includes news, product introductions, product reviews and comments, and may exist on websites of any type in the form of articles or sentences. For example, to monitor the public sentiment of a mobile phone product and thereby improve its later marketing and design, an enterprise obtains keywords of the product such as the phone name and phone model, obtains the data sources that mention the product according to these keywords, and obtains from the data sources the corpus on the public sentiment of the product.
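As a rough illustration of how search results and manually entered addresses could be turned into a data source website list, the sketch below collapses URLs into unique site roots. The search itself is not shown; the function `build_data_source_list` and the sample URLs are hypothetical and not part of the described method.

```python
from urllib.parse import urlparse


def build_data_source_list(result_urls, manual_urls=()):
    """Collapse result URLs (plus manually entered ones) into unique site roots."""
    sites = []
    seen = set()
    for url in list(manual_urls) + list(result_urls):
        parsed = urlparse(url)
        if not parsed.scheme or not parsed.netloc:
            continue  # skip entries that are not absolute URLs
        root = f"{parsed.scheme}://{parsed.netloc.lower()}"
        if root not in seen:
            seen.add(root)
            sites.append(root)
    return sites


# Illustrative search results for the keywords of a product.
result_urls = [
    "https://news.example.com/a",
    "https://news.example.com/b",
    "https://forum.example.org/t/1",
    "not a url",
]
site_list = build_data_source_list(result_urls)
```

Manually entered addresses are placed first so that they keep priority in the resulting list.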
S230, crawling the corpus of the target project from the data source websites included in the data source website list according to the identification information.
Here, crawling means fetching with a crawler, that is, a web crawler. A web crawler, also known as a web spider, web robot or web chaser, is a program or script that automatically fetches information from the web according to certain rules.
Specifically, to monitor the public sentiment of the target project, a crawler system is built to crawl the corpus related to the target project from the Internet. The corpus is parsed to construct the public sentiment relationship graph of the target project, and the public sentiment of the target project is obtained from the graph, which realizes the monitoring. Because a web crawler is a program that extracts web pages automatically, the crawler can be made to crawl only the data related to the target project when it crawls the corpus contained in the data source websites. After the identification information of the target project has been obtained in the preset manner and the data source website list has been obtained from the identification information, the crawler system crawls the data source websites in the list to obtain a rich corpus for the target project.
Further, the data in the corpus sources can also be screened, and the public sentiment on one particular aspect of the target project can be obtained from the screened data. For a given mobile phone product, for example, data on camera reviews, battery life, or the strengths and weaknesses of the processor or the system can be screened out to form the corresponding public sentiment.
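The crawl-then-screen flow can be sketched as below. Network access and HTML extraction are deliberately left out: `fetch` is a hypothetical callable that maps a site URL to a list of text snippets, and `FAKE_PAGES` is illustrative stand-in data. Keyword matching is an assumed, simplified relevance test.

```python
def crawl_corpus(site_list, keywords, fetch):
    """Collect texts from each site that mention at least one keyword.

    `fetch` is a hypothetical callable: site URL -> list of text snippets.
    """
    lowered = [k.lower() for k in keywords]
    corpus = []
    for site in site_list:
        for text in fetch(site):
            if any(k in text.lower() for k in lowered):
                corpus.append(text)
    return corpus


def screen_by_aspect(corpus, aspect_terms):
    """Keep only the texts about one aspect, for example the battery."""
    lowered = [t.lower() for t in aspect_terms]
    return [text for text in corpus if any(t in text.lower() for t in lowered)]


# Stand-in for real HTTP fetching and HTML extraction (illustrative data).
FAKE_PAGES = {
    "https://news.example.com": [
        "Phone A battery life is long",
        "Stock markets closed higher today",
    ],
    "https://forum.example.org": ["Phone A camera takes clear photos"],
}

corpus = crawl_corpus(list(FAKE_PAGES), ["Phone A"], lambda site: FAKE_PAGES.get(site, []))
battery_corpus = screen_by_aspect(corpus, ["battery"])
```

Passing `fetch` in as a parameter keeps the filtering logic testable without a network and lets a real crawler be substituted later.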
S240, parsing the corpus by natural language processing to identify the subject names and public sentiment features contained in the corpus.
Here, the target project is the whole of the content whose public sentiment an enterprise or other organization has decided to monitor. The target project of an enterprise may be, for example, a product, a marketing campaign, an investment or an event; that is, the whole content of one fine-grained aspect of the enterprise. A subject name refers to an object contained in the corpus: the name of the target project and the names of the parts that make it up. For example, if the target project is mobile phone A, the subject names include the name of mobile phone A and the names of the parts or components of mobile phone A. Suppose the corpus contains the name of mobile phone A, the name of its display screen B and the name of its camera C. During recognition, the computer equipment cannot distinguish the relationships between the subject names, for example that display screen B and camera C belong to mobile phone A, but it can recognize that the name of mobile phone A, the name of display screen B and the name of camera C contained in the corpus are subject names.
Public sentiment features are the keywords of the target project's public sentiment. They are descriptive features that evaluate the target project, describing the attributes of the subjects to which the subject names correspond and the relationships between subjects. For example, if the target project is a mobile phone, the subject names in the target project are the phone name and the names of the parts or components of the phone, such as the name of the display screen and the name of the camera, and the public sentiment features are evaluations and descriptions of the corresponding subjects, such as the phone having a high configuration, the display screen being large, or the camera taking good photos. It should be noted that a subject name is an identifier that distinguishes a subject, and that a subject name can be stated in different forms. For a mobile phone, for example, the specific model or code name can serve as the phone name in addition to the brand.
Specifically, parsing the corpus by natural language processing means: segmenting the corpus at sentence boundaries to obtain a sentence data set; building a named entity model from the corpus; recognizing the subject names contained in the sentence data set with the named entity model; and performing part-of-speech analysis and target relationship retrieval on the corpus to obtain the public sentiment features of the target project. For example, the collected corpus is parsed by natural language processing to recognize the phone name information and the related feature descriptions, which provides an important data source for the public sentiment of the target project. Named entity recognition (NER), also called proper name recognition, identifies entities with specific meaning in a text, mainly person names, place names, organization names and other proper nouns. In general, the task of named entity recognition is to identify named entities of three categories (entity, time and number) and seven subcategories (person names, organization names, place names, times, dates, currency and percentages) in the text to be processed. Chinese named entity models include the CRF model and the word-based BiLSTM-CRF model.
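The two parsing stages named above, sentence segmentation and subject name recognition, can be sketched in a few lines. Note the substitution: the source refers to a trained named entity model such as CRF or BiLSTM-CRF, whereas this sketch uses a plain dictionary lookup as a stand-in so that it stays self-contained. The function names and the `known_subjects` list are hypothetical.

```python
import re


def split_sentences(corpus_text):
    """Segment text at sentence-ending punctuation (Western and Chinese)."""
    parts = re.findall(r"[^.!?。！？]+[.!?。！？]?", corpus_text)
    return [p.strip() for p in parts if p.strip()]


def recognize_subjects(sentences, known_subjects):
    """Dictionary lookup standing in for a trained named entity model."""
    found = []
    for sentence in sentences:
        lowered = sentence.lower()
        for name in known_subjects:
            if name.lower() in lowered and name not in found:
                found.append(name)
    return found


text = "Phone A has a large display screen. The camera takes good photos!"
sentences = split_sentences(text)
subjects = recognize_subjects(
    sentences, ["Phone A", "display screen", "camera", "battery"]
)
```

As in the description, the lookup finds the subject names but says nothing about how they relate to each other; relationships are handled in a later step.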
Once detailed and comprehensive data sources have been collected, public sentiment information about the target project is obtained by natural language processing and subsequently imported into the graph database to complete the data of nodes and node attributes. For example, the named entity model identifies the sentences in the corpus that concern the target project; after the corpus is segmented into words, part-of-speech analysis and feature word analysis are performed on the words, covering for instance nouns, verbs, adjectives and the relationships between these words, so as to extract the public sentiment information of the target project from the corpus.
Further, model training and automatic learning techniques from artificial intelligence can improve the accuracy of subject name recognition and public sentiment feature recognition for the target project. Specifically, the large corpus collected is segmented into words by natural language processing and the resulting words are screened. For example, an artificial intelligence model with automatic learning filters the nouns and verbs out of the segmented words, ranks them from high to low, and keeps a preset number of top-ranked nouns and verbs. The nouns serve as the subjects of the target project and the verbs as descriptions of the relationships between subjects, and an entity model is built from them.
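The screening of nouns and verbs can be illustrated with a small ranking helper. Two assumptions are made here that the source does not state: the ranking criterion is taken to be frequency, and the words arrive already tagged with a part of speech (the tags `NOUN` and `VERB` and the sample tokens are illustrative).

```python
from collections import Counter


def top_words_by_pos(tagged_tokens, pos, top_n):
    """Rank words of one part of speech by frequency and keep the top_n."""
    counts = Counter(word for word, tag in tagged_tokens if tag == pos)
    return [word for word, _ in counts.most_common(top_n)]


# Illustrative output of a part-of-speech tagger (not shown here).
tagged_tokens = [
    ("screen", "NOUN"), ("cracks", "VERB"), ("screen", "NOUN"),
    ("battery", "NOUN"), ("drains", "VERB"), ("cracks", "VERB"),
    ("battery", "NOUN"), ("screen", "NOUN"), ("camera", "NOUN"),
]
subject_candidates = top_words_by_pos(tagged_tokens, "NOUN", top_n=2)
relation_candidates = top_words_by_pos(tagged_tokens, "VERB", top_n=1)
```

The top nouns become candidate subjects and the top verbs candidate relationship descriptions, matching the roles given in the paragraph above.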
S250, importing the subject names and the public sentiment features into a graph database to construct the public sentiment relationship graph of the target project.
Here, a graph database is a type of NoSQL database that applies graph theory to store the relationship information between entities. Common graph databases include Neo4j, FlockDB and AllegroGraph. A graph database has two main components: node sets and the relationships that connect nodes. A node set is a collection of nodes in the graph. Each node in a graph database carries a label indicating the entity type it belongs to, that is, its node set, and records a series of attributes describing the node. In addition, nodes can be connected to one another by relationships.
Specifically, the subject names and public sentiment features of the target project, identified by parsing the corpus with natural language processing, are imported into the graph database to complete the data for the nodes and for the relationships connecting them. The nodes correspond to subject names and public sentiment features, and the relationships between nodes are described at the same time. When the graph database is designed, multiple nodes form node sets and nodes are associated with one another by relationships, so that the node sets and the relationships between nodes are kept distinct. When data is imported, the graph database automatically recognizes the node data and the relationship data in the imported data and assigns each to its corresponding place in the graph database. In this example, after the subject names and the public sentiment features are imported into the graph database, the public sentiment relationship graph of the target project can be constructed automatically. For example, if the target project is a mobile phone, the subject names in the target project are the phone name and the names of the parts or components of the phone, such as the name of the display screen and the name of the camera, and the public sentiment features are evaluations and descriptions of the corresponding subjects. The statement that the phone's display screen is large, for instance, describes a feature of the phone. Three separate nodes correspond to "mobile phone", "display screen" and "large"; the "contains" relationship between "mobile phone" and "display screen" and the descriptive relationship between "display screen" and "large" connect the three nodes in sequence to form the public sentiment relationship graph. In the embodiments of the present application, the dynamic public sentiment data of the target project is stored in the form of its public sentiment relationship graph, which makes the public sentiment of the target project easier to visualize and extract.
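The "mobile phone", "display screen", "large" example can be sketched with a tiny in-memory structure that mimics the node and relationship model of a graph database. This is not the API of Neo4j or of any other real graph database; the class `SentimentGraph`, the labels and the relationship names are all hypothetical.

```python
class SentimentGraph:
    """Minimal in-memory stand-in for a graph database (illustrative only)."""

    def __init__(self):
        self.nodes = {}  # node name -> label (its node set)
        self.edges = []  # (source, relation, target) triples

    def add_node(self, name, label):
        self.nodes[name] = label

    def add_edge(self, source, relation, target):
        if source not in self.nodes or target not in self.nodes:
            raise KeyError("both endpoints must be added as nodes first")
        self.edges.append((source, relation, target))

    def neighbors(self, name):
        """Return (relation, target) pairs for edges leaving the node."""
        return [(rel, dst) for src, rel, dst in self.edges if src == name]


graph = SentimentGraph()
graph.add_node("mobile phone", "Subject")
graph.add_node("display screen", "Subject")
graph.add_node("large", "SentimentFeature")
graph.add_edge("mobile phone", "contains", "display screen")
graph.add_edge("display screen", "described_as", "large")
```

Following the edges from "mobile phone" reaches "display screen" and then "large", which is the chain the description calls the public sentiment relationship graph.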
Further, the public sentiment relationship graph of the target project includes the name of the target project, the names of the sub-projects of the target project, and the public sentiment features corresponding to the sub-project names.
Here, a sub-project of the target project is a component part of the target project. For example, if the target project is a mobile phone, then components such as the phone's display screen, camera, battery and central processing unit are sub-projects of the phone, and each sub-project has a corresponding sub-project name.
Specifically, the elements of the public sentiment relationship graph of the target project are the subject names and public sentiment features in the graph. For example, the graph contains subject names, which are nouns, and feature descriptions, which are typically adjectives. A public sentiment feature is a keyword of public sentiment, such as "takes clear photos" or "long battery life". Storing the dynamic public sentiment data of the target project in the form of a public sentiment relationship graph makes the data easier to visualize and extract, and constructing the graph data of the target project builds up a corpus of news related to the target project. Before the graph data of the target project is visualized, the public sentiment data also needs to be sorted by time so that the most recent top-ranked news about the target project is surfaced. For example, after retrieving components such as the central processing unit used by mobile phone product A, the related public sentiment of product A can be obtained by traversing the objects related to node A. In addition, a specific field related to the target project can be analyzed in depth. For example, if the target project is a product, the relationships between the product and its suppliers or customers can also be analyzed, which requires obtaining public sentiment data related to the target product from supplier attributes; the data is then classified, deduplicated and presented to the user for public sentiment monitoring. The assessment of the central processing unit in a mobile phone product, for example, can affect the public sentiment of the product. The overheating problem of the Snapdragon 820 processor had a considerable impact on the phones that used it, whereas the wide praise for the performance advantages of the Snapdragon 835 processor brought wide praise for the performance of the phones that used it, in the form of public sentiment features such as being fast and power-saving.
S260, display the public sentiment relation graph.
Specifically, the constructed public sentiment relation graph of the target project is displayed on a terminal and provided to the user, so that the user can monitor the public sentiment of the target project according to the graph. The personnel monitoring the target project can thus draw public sentiment conclusions from the graph and respond to the public sentiment accordingly. For example, the positive and negative information in the public sentiment of the target project, as well as the event assessment information and channel assessment information, can be obtained in order to take corresponding public relations measures.
Further, the positive and negative public sentiment conclusions obtained for the target project can be ranked according to different mechanisms, so that positive public sentiment is fully exploited for benefit and countermeasures are taken against negative public sentiment to eliminate its adverse effects, for example a screen problem or a battery problem in a certain product.
When implementing project public sentiment monitoring, the embodiments of the present application obtain the identification information of the target project, obtain the data source website list of the target project through web search according to the identification information, and crawl the corpus of the target project from the data source websites in that list according to the identification information, so as to obtain a more comprehensive corpus about the target project. The corpus is then parsed by natural language processing to identify the subject names and public sentiment features it contains, and the subject names and public sentiment features are imported into a graph database to construct the public sentiment relation graph of the target project, which is displayed visually. Public sentiment monitoring is thereby carried out at the finer-grained level of an individual project within a target organization. This adds project-specific public sentiment, makes it easier to monitor the public sentiment of a particular project of the target organization and to judge that project's strengths and weaknesses, and improves the efficiency of fine-grained public sentiment monitoring within the target organization.
Referring to Fig. 3, Fig. 3 is another schematic flowchart of the project public sentiment monitoring method provided by the embodiments of the present application. In this embodiment, after the step of obtaining the data source website list of the target project through web search according to the identification information, the method further includes:
S221, update the data source website list by crawling.
Specifically, a crawler strategy that automatically adds data sources is constructed, and deep crawling is used to obtain more comprehensive data sources for the target project from the Internet. Under this strategy, after the crawler receives the initial data source websites, it automatically expands to more data source websites based on the ones already obtained, which increases the sources of the corpus and yields a more comprehensive corpus for the target project. In this embodiment, the strategy means that the crawler, based on the type and site-structure features of the data source websites already obtained, discovers by crawling new data source websites related to the known data source URLs, for example websites that share a suffix with a known data source URL or that belong to the same type, such as finance and economics websites. The crawler thus expands from one finance and economics website to others, and because they are all finance and economics websites, they may contain corpus that interprets the same target project from different perspectives. Related websites, especially when covering a hot topic about the target project, interpret and report on the target project from different angles, so the set of data source websites is continuously improved and enriched, which increases the data sources and secures the volume of data. The corpus related to the target project is obtained from the data source websites, and rich data sources yield a comprehensive and rich corpus for the target project. Further, the crawler strategy can be implemented as a real-time distributed crawler system to improve crawling efficiency. Specifically, the server obtains the initial data source website list of the target project, classifies it according to a preset condition to obtain data source website lists of different types, packages the lists of different types into corresponding Docker containers, deploys the different Docker containers on different servers, and starts the Docker containers so that they obtain new data source websites by crawling. The new data source websites are added to the corresponding initial data source website list to update the data source websites of the target project. For example, where the crawler strategy is built as a real-time distributed crawler system, the crawler system can distinguish the types of the websites in an input list according to the website identifiers in that list and distribute the list among the servers according to website type, achieving distributed crawling and data storage and improving crawling efficiency.
Referring to Fig. 4, Fig. 4 is a sub-flow schematic diagram of the project public sentiment monitoring method provided by the embodiments of the present application. As shown in Fig. 4, in this embodiment, the step of updating the data source website list by crawling includes:
S2210, obtain the initial data source website list of the target project;
S2211, classify the initial data source website list according to a preset condition to obtain data source website lists of different types;
S2212, package the data source website lists of different types into corresponding Docker containers;
S2213, start the Docker containers so that they obtain new data source websites by crawling;
S2214, add the new data source websites, by type, to the corresponding classified data source website lists to update the data source websites of the target project.
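The classification and update steps (S2211 and S2214) can be sketched as follows. This is a minimal sketch under stated assumptions: the host-to-type table `TYPE_BY_HOST`, the example hosts, and the function names are hypothetical, the packaging into Docker containers (S2212 and S2213) is left out, and the newly discovered URLs are simply passed in as a list.

```python
from urllib.parse import urlparse

# Hypothetical preset condition: website type looked up by host name.
TYPE_BY_HOST = {
    "finance.example.com": "finance",
    "money.example.net": "finance",
    "news.example.org": "news",
}


def site_type(url, type_by_host):
    host = urlparse(url).hostname or ""
    return type_by_host.get(host, "other")


def classify(urls, type_by_host):
    """S2211: split one website list into per-type lists."""
    lists = {}
    for url in urls:
        lists.setdefault(site_type(url, type_by_host), []).append(url)
    return lists


def merge_new_sources(lists, new_urls, type_by_host):
    """S2214: add each new website to the list of its own type, skipping duplicates."""
    for url in new_urls:
        bucket = lists.setdefault(site_type(url, type_by_host), [])
        if url not in bucket:
            bucket.append(url)
    return lists


initial = ["https://finance.example.com/", "https://news.example.org/"]
lists = classify(initial, TYPE_BY_HOST)
discovered = ["https://money.example.net/", "https://news.example.org/"]
lists = merge_new_sources(lists, discovered, TYPE_BY_HOST)
```

In a distributed deployment, each per-type list would be handed to the container responsible for that type; here the lists simply live in one dictionary.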
Here, the preset condition includes conditions such as the website address or the website source. Classification by website address means classifying according to the website's Uniform Resource Locator (URL). Because different websites use different anti-crawler strategies and structure the data of their pages differently, different crawling strategies are needed for different websites. For example, Sina news is relatively easy to crawl: the pages can be fetched and parsed directly with BeautifulSoup. The titles and content of NetEase news are loaded asynchronously with JavaScript, so simply downloading the page source yields no titles or content; the required content can be found among the JS requests in the browser's Network panel, and regular expressions can be used to extract the required titles and links. Toutiao news differs from both: its titles and links are packaged in a JSON file, but the URL parameters of the JSON file are changed by a randomizing JavaScript algorithm, so those parameters must be simulated, otherwise the specific URL of the JSON file cannot be found. Website sources include financial websites, news websites, forums, and so on.
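As an illustration of the regular-expression approach mentioned for asynchronously loaded pages, the sketch below pulls title and link pairs out of a JavaScript response body. The payload format, the field names `title` and `url`, and the example links are all assumptions made for this sketch; real sites use their own formats, which have to be inspected case by case.

```python
import re

# Hypothetical response body of an asynchronously loaded news list.
payload = (
    'callback({"items": ['
    '{"title": "Phone A review", "url": "https://news.example.com/a"}, '
    '{"title": "Phone A battery test", "url": "https://news.example.com/b"}'
    ']});'
)

# One capture group for the title and one for the link that follows it.
ITEM_PATTERN = re.compile(r'"title":\s*"([^"]*)",\s*"url":\s*"([^"]*)"')


def extract_items(js_text):
    """Return (title, url) pairs found in the response text."""
    return ITEM_PATTERN.findall(js_text)


items = extract_items(payload)
```

A pattern like this breaks as soon as the field order or quoting changes, which is one reason each website needs its own crawling code.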
Specifically, the configured initial data source website list of the target project is obtained, and the crawler system automatically classifies it according to the preset condition to obtain data source website lists of different types, for example by dividing the data source websites into types according to website identifiers. The lists of different types are then packaged into corresponding Docker containers, the Docker containers are deployed on different servers, and the containers are started so that they obtain abundant new data source websites by crawling. The new data source websites are added to the corresponding initial data source website list to update the data source websites of the target project, so that they are continuously improved. Specifically, this includes the following sub-steps:
First, obtain the initial website list. The list can be configured manually, that is, the initial data source websites are supplied by hand, or it can be the website list found by searching according to the identification information.
Second, package the written crawler code into Docker containers. The code includes the part that extracts website URLs and the code that matches each URL to its crawler program, so that URLs are automatically paired with crawler programs and each URL's website is crawled by the corresponding program. An index relationship between URLs and crawler programs must be built, and crawlers for all URL types must be written in advance, so that URLs of different types correspond to different crawler programs.
Third, start container Docker1, in which the crawler code classifies and splits the overall input list and saves data sources of the same class together to form lists waiting to be crawled. The URL classification code is started first and classifies the input website URL list by URL type; the list-splitting code is then started and divides the different data source lists into several lists, each corresponding to a Docker container on a different machine.
Fourth, start container Docker2, which takes the data source lists obtained and matches each URL to its crawler program. For example, website X corresponds to the code that crawls and parses website X, so a URL of website X that is passed in can be crawled. The container accesses the external network, fetches the corresponding data separately, and writes the data back to the database.
Further, the crawler program discovers new URLs from the URLs already obtained; that is, starting from the initial URLs, it finds new URLs and stores them in the to-crawl URL list to extend that list. At the same time, the crawler can check whether errors occur while crawling data; if an error occurs, the crawling process for that website is terminated.
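The discovery loop described here can be sketched as a queue of URLs waiting to be crawled. This is a simplified sketch: `fetch_links` is a hypothetical callback that returns the links found on a page, the in-memory table `LINKS` stands in for real network access, and an error drops only the failing URL rather than a whole website.

```python
from collections import deque


def crawl(seed_urls, fetch_links, max_pages=100):
    """Crawl from the seed URLs, adding newly discovered URLs to the queue."""
    to_crawl = deque(seed_urls)
    seen = set(seed_urls)
    crawled, failed = [], []
    while to_crawl and len(crawled) < max_pages:
        url = to_crawl.popleft()
        try:
            links = fetch_links(url)
        except Exception:
            failed.append(url)      # an error ends crawling for this URL
            continue
        crawled.append(url)
        for link in links:
            if link not in seen:    # only new URLs extend the to-crawl list
                seen.add(link)
                to_crawl.append(link)
    return crawled, failed


# Stand-in for the network: page "d" is missing, so fetching it raises KeyError.
LINKS = {"a": ["b", "c"], "b": ["a", "d"], "c": []}


def fake_fetch(url):
    return LINKS[url]


crawled, failed = crawl(["a"], fake_fetch)
```

The loop ends when the queue is empty, which corresponds to stopping once the to-crawl list has been exhausted.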
URLs can be classified with preset URL regular expressions. Each class of URL list has a corresponding regular expression, and whether a URL belongs to that class is determined by whether the match result is empty: if the result is non-empty, the URL is judged to belong to the class; if the result is empty, the URL is judged not to belong to the class.
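The empty or non-empty test on the match result can be sketched as below. The class names and the three patterns are hypothetical examples chosen for this sketch, not patterns taken from the application.

```python
import re

# Hypothetical preset regular expressions, one per URL class.
URL_CLASSES = {
    "finance": re.compile(r"^https?://finance\.[\w.-]+/"),
    "news": re.compile(r"^https?://news\.[\w.-]+/"),
    "forum": re.compile(r"^https?://[\w.-]+/forum/"),
}


def classify_url(url):
    """Return the first class whose pattern gives a non-empty result, else None."""
    for name, pattern in URL_CLASSES.items():
        if pattern.findall(url):    # non-empty list: the URL belongs to this class
            return name
    return None                     # every result was empty
```

For instance, `classify_url("https://www.example.com/forum/t/1")` falls through the first two patterns and is assigned to the `forum` class.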
Fifth, the operation stops when the to-crawl website lists of all Docker2 containers are empty. To improve the data source website list, the above steps can be repeated on a regular or irregular schedule, based on the data source website list already acquired, so as to update the list.
Referring to Fig. 5, Fig. 5 is another sub-flow schematic diagram of the project public sentiment monitoring method provided by the embodiments of the present application. As shown in Fig. 5, in this embodiment, the step of parsing the corpus by natural language processing to identify the subject names and public sentiment features contained in the corpus includes:
S2400, split the corpus at sentence delimiters to obtain a sentence data set.
Here, the sentence delimiters include sentence marks and splitting words. The sentence marks include punctuation such as ".", "?", ";", and "!". The splitting words are preset words or characters that can serve as sentence separators, such as "and", "in", "we", and "according to".
Specifically, the corpus crawled by the crawler system is split at the sentence delimiters to obtain a sentence data set, so that sentences containing names can be filtered out of the sentence data set.
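Splitting at sentence marks and then filtering by name can be sketched with the standard `re` module. This sketch makes several assumptions: both full-width and ASCII punctuation are treated as sentence marks, the preset splitting words are not handled, and the function names are illustrative. Splitting on "." would also break decimal numbers, which a real implementation would need to guard against.

```python
import re

# Sentence marks from the text, in full-width and ASCII forms (an assumption).
SENTENCE_MARKS = "。?;!.?;!"
_SPLIT = re.compile("[" + re.escape(SENTENCE_MARKS) + "]+")


def split_sentences(corpus):
    """Split the corpus at sentence marks and drop empty fragments."""
    return [part.strip() for part in _SPLIT.split(corpus) if part.strip()]


def sentences_with_name(sentences, name):
    """Keep only the sentences that mention the given name."""
    return [s for s in sentences if name in s]


text = "The screen is large. Battery life is short! Is the camera clear?"
sentences = split_sentences(text)
about_battery = sentences_with_name(sentences, "Battery")
```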
S2401, construct a named entity model from the corpus.
Here, a named entity (English: Named Entity) is a person name, organization name, place name, or any other entity identified by a name; in a broader sense, entities also include numbers, dates, currencies, addresses, and so on.
Specifically, to build the named entity model, named entities are annotated in the acquired corpus and a named entity recognition model is built with a CRF model to identify the subject names of the target project. CRF stands for Conditional Random Field, one of the commonly used algorithms in natural language processing in recent years. It is a statistical model: in essence, a CRF combines a Markov chain over hidden variables with the conditional probability of the hidden variables given the observable states.
S2402, identify the subject names contained in the sentence data set using the named entity model.
Here, named entity recognition (English: Named Entity Recognition, abbreviated NER), also called "proper name recognition", refers to identifying entities with a specific meaning in text, mainly person names, place names, organization names, and proper nouns.
Specifically, once the named entity model has been built, the sentence data set is processed by the model, which automatically identifies the subject names contained in it. For example, named entities are annotated in the corpus content and a named entity recognition model is built with a CRF model to identify company subject names. The named entity model is used to identify sentences in the information related to the target project, and part-of-speech analysis and retrieval of the project's key relations are performed on the words. If core keywords are present, for example, for a mobile phone product, "long-lasting battery life" or "easy-to-use system", the related information is saved as a specific attribute of the target project. The attribute can also carry the current date and time, which enriches the public sentiment data in the public sentiment relation graph of the target project.
S2403, perform part-of-speech analysis and target-relation retrieval on the corpus to obtain the public sentiment features of the target project.
Here, parts of speech are word classes, such as verbs and nouns, assigned on the basis of the characteristics of words. A target relation is a relationship between subjects that are involved in the target project and contained in the corpus, for example the subordinate relationship between a mobile phone and its components, such as the containment relationship between a phone and its camera.
Specifically, part-of-speech analysis and identification of subject relations are performed on the corpus through the following process:
First, segment the corpus into words. The sentences can be segmented with jieba ("stutter" segmentation), one of the word segmentation tools available for Python. There are many word segmentation tools, including Pangu, Yaha, jieba, and Tsinghua University's THULAC.
Second, extract the key relations. Specifically, the verbs (actions) are extracted and matched against a keyword list. If a verb appears in the keyword list, it is treated as a key relation; the noun following the verb is taken as the relation object, and the noun preceding the verb is taken as the relation subject, which is the target. The relation subject, the relation object, and the relation between them are taken as the public sentiment feature, and the subject names involved in the extracted key relations, together with the feature data describing their attributes, are stored in the graph database.
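The verb-matching rule can be sketched over tokens that have already been segmented and tagged. The sketch assumes a list of `(word, tag)` pairs in which `"n"` marks a noun and `"v"` a verb (the same convention used by jieba's part-of-speech tagging); the keyword list `KEY_VERBS` and the sample tokens are made up for illustration.

```python
# Hypothetical keyword list of verbs that count as key relations.
KEY_VERBS = {"contains", "uses", "affects"}


def extract_relations(tagged_tokens, key_verbs=KEY_VERBS):
    """Return (subject, verb, object) triples for verbs found in the keyword list."""
    relations = []
    for i, (word, tag) in enumerate(tagged_tokens):
        if tag != "v" or word not in key_verbs:
            continue
        # Nearest noun before the verb is the relation subject.
        subject = next((w for w, t in reversed(tagged_tokens[:i]) if t == "n"), None)
        # First noun after the verb is the relation object.
        obj = next((w for w, t in tagged_tokens[i + 1:] if t == "n"), None)
        if subject and obj:
            relations.append((subject, word, obj))
    return relations


tokens = [("phone", "n"), ("contains", "v"), ("large", "a"), ("screen", "n")]
triples = extract_relations(tokens)
```

A verb that is not in the keyword list, such as "likes" in `[("user", "n"), ("likes", "v"), ("camera", "n")]`, produces no triple.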
Further, the step of constructing the named entity model from the corpus includes:
1) segment the corpus to obtain a segmentation result;
2) extract feature data from the segmentation result using preset feature templates;
3) train a preset conditional random field model on the feature data to construct the named entity model.
Specifically, building the named entity model from the acquired corpus includes the following steps:
First, obtain the named entity training corpus, which mainly comes from the target project corpus obtained by the crawler system through crawling.
Second, preprocess the corpus, mainly by segmenting it with jieba and removing stop words and meaningless words, to obtain the segmentation result.
Third, extract features. Feature extraction is performed with feature templates made up of regular expressions; the features obtained include words, parts of speech, boundary words, and named entity feature words.
Fourth, create and train a model based on conditional random fields, that is, a CRF model. The CRF model is trained on the training data to obtain its parameters, and the trained model is saved.
Fifth, evaluate the model on test data and keep the final model that meets requirements such as a high recognition rate, which yields the named entity model.
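The feature extraction step can be illustrated with per-token feature dictionaries of the kind commonly fed to CRF toolkits. This is a sketch under assumptions: it uses a simple window of neighboring tokens plus one regular-expression feature instead of the application's own templates, the feature names are invented, and training the CRF itself is not shown.

```python
import re

_HAS_DIGIT = re.compile(r"\d")


def token_features(tokens, i):
    """Features for token i of a sentence given as (word, part_of_speech) pairs."""
    word, pos = tokens[i]
    feats = {
        "word": word,
        "pos": pos,
        "has_digit": bool(_HAS_DIGIT.search(word)),  # e.g. model numbers
        "is_first": i == 0,                          # sentence boundary flags
        "is_last": i == len(tokens) - 1,
    }
    if i > 0:
        feats["prev_word"], feats["prev_pos"] = tokens[i - 1]
    if i < len(tokens) - 1:
        feats["next_word"], feats["next_pos"] = tokens[i + 1]
    return feats


def sentence_features(tokens):
    return [token_features(tokens, i) for i in range(len(tokens))]


feats = sentence_features([("Phone", "n"), ("A820", "n"), ("overheats", "v")])
```

Each sentence's feature list would be paired with a label sequence (for example, begin/inside/outside tags marking subject names) when training and evaluating the model.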
Referring to Fig. 6, Fig. 6 is a third sub-flow schematic diagram of the project public sentiment monitoring method provided by the embodiments of the present application. In this embodiment, the step of parsing the corpus by natural language processing to identify the subject names and public sentiment features contained in the corpus includes:
S2500, segment the corpus to obtain a word list of the corpus.
Specifically, the corpus is preprocessed, mainly by segmenting it with jieba and removing stop words and meaningless words, to obtain the word list.
S2501, extract the key relations in the word list using a first regular expression to obtain the public sentiment features;
S2502, extract the named entities involved in the key relations in the word list using a second regular expression to obtain the subject names.
Specifically, the key relations are extracted with regular expressions. The verbs (actions) are extracted by regular expression and matched against a keyword list. If a verb appears in the keyword list, it is treated as a key relation; the noun following the verb is taken as the relation object, and the noun preceding the verb is taken as the relation subject, which is the target. The relation subject, the relation object, and the relation between them are taken as the public sentiment feature. A regular expression (English: Regular Expression, often abbreviated in code as regex, regexp, or RE) is typically used to search for and replace text that matches a certain pattern (rule).
Then, the key relations are imported into the graph database as public sentiment features and the named entities as subject names, to construct the public sentiment relation graph of the target project. When the graph database is designed, node sets, nodes, and the relationships between them are distinguished. When data is imported, the graph database automatically identifies the node data and the relationship data in the imported data and assigns each to its corresponding position in the graph database. In this example, after the key relations and the named entities are imported into the graph database, the public sentiment relation graph of the target project can be constructed automatically. For example, the key relations are built as relationships between the mobile phone and its components, and the feature descriptions of the components are imported as attributes of the component nodes. A graph database (English: Graph Database) is a type of NoSQL database that applies graph theory to store relationship information between entities; common graph databases include Neo4j, FlockDB, and AllegroGraph.
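Since Neo4j is among the graph databases named above, one way to picture the import is to turn each extracted triple into a Cypher `MERGE` statement. This sketch only builds the statement text and its parameters and does not connect to a database; the node label `Entity`, the property `name`, and the function name are assumptions made for illustration.

```python
def to_cypher(triples):
    """Build (statement, parameters) pairs, one per (subject, relation, object) triple."""
    statements = []
    for subject, relation, obj in triples:
        rel_type = relation.strip().upper().replace(" ", "_")
        # Relationship types cannot be passed as parameters, so validate before inlining.
        if not rel_type.isidentifier():
            raise ValueError(f"unsupported relation name: {relation!r}")
        statement = (
            "MERGE (a:Entity {name: $subject}) "
            "MERGE (b:Entity {name: $object}) "
            f"MERGE (a)-[:{rel_type}]->(b)"
        )
        statements.append((statement, {"subject": subject, "object": obj}))
    return statements


cypher = to_cypher([("mobile phone", "contains", "display screen")])
```

Using `MERGE` rather than `CREATE` means that re-importing the same triple does not duplicate nodes or relationships, which suits a corpus that is crawled repeatedly.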
Continuing to refer to Fig. 3, as shown in Fig. 3, in this embodiment, after the step of displaying the public sentiment relation graph, the method further includes:
S270, combine the elements in the public sentiment relation graph in a preset order to describe the public sentiment of the target project in text form.
Further, the step of combining the elements in the public sentiment relation graph in a preset order to describe the public sentiment of the target project in text form includes:
combining the elements in the public sentiment relation graph in a preset order to describe, in text form, the positive public sentiment information, negative public sentiment information, event assessment information, and channel assessment information of the target project.
Specifically, the public sentiment of the target project is not only shown in the form of the public sentiment relation graph for monitoring purposes; public sentiment conclusions drawn from the graph are also presented as text for reference by the personnel monitoring the target project. The conclusions include positive public sentiment information, negative public sentiment information, event assessment information, and channel assessment information. Positive public sentiment information refers to the favorable aspects of the public sentiment, for example, for a mobile phone, long battery life and an attractive appearance. Negative public sentiment information refers to the unfavorable aspects, for example, for a mobile phone, short battery life and overheating while charging. Event assessment information refers to a prediction, evaluation, and estimate of the impact of a particular event in the public sentiment, for example the impact of a product review report on the product. Channel assessment information refers to the impact on the target of the channel from which the corpus originates. Different websites differ in audience, scale, and influence, so the effect on the target of the channel through which an event is reported must be estimated; for example, Weibo, WeChat Moments, and forums each affect the target differently.
When the elements in the public sentiment relation graph of the target project are combined in a preset order to describe the public sentiment in text form, the description can draw on the relationship information between entities stored in the graph database. Because the graph database distinguishes node sets, nodes, and relationships, the relationships between nodes can be rendered as text, which gives the project public sentiment monitoring personnel a textual prompt. For example, if nodes A and B in the public sentiment relation graph are in a subordinate relationship, the text description can read "node A is subordinate to node B". Further, if it is found that node A affects node B, information about how node A affects node B can be filtered from the acquired corpus and condensed, using regular expressions or a trained language model, into a summary of node A's effect on node B. The summary is provided in text form for the monitoring personnel's reference, for example the effect of a phone's processor or battery on the phone. The language model may be, for example, an N-gram language model or a neural network language model.
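Rendering relationships as text in a preset order can be sketched with sentence templates. The relation names, the templates, and the default order below are all hypothetical; the sketch covers only the template-based description and not the summaries produced with a language model.

```python
# Hypothetical sentence templates, one per relation type.
TEMPLATES = {
    "belongs_to": "{a} is subordinate to {b}",
    "contains": "{a} contains {b}",
    "described_as": "{a} is described as {b}",
}


def describe(triples, order=("belongs_to", "contains", "described_as")):
    """Turn (node, relation, node) triples into sentences, sorted by relation order."""
    rank = {relation: i for i, relation in enumerate(order)}
    ordered = sorted(triples, key=lambda t: rank.get(t[1], len(rank)))
    fallback = "{a} is related to {b}"
    return [TEMPLATES.get(r, fallback).format(a=a, b=b) + "." for a, r, b in ordered]


lines = describe([
    ("display screen", "described_as", "large"),
    ("mobile phone", "contains", "display screen"),
])
```

Because the sort is stable, triples that share a relation type keep the order in which they were read from the graph.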
Further, in one embodiment, building the graph data of the target project also builds a news corpus related to the target project. Before visualization, the public sentiment data of the target project must be sorted by time and the top-ranked, most recent news about the target project selected, which further filters for valid data and improves processing efficiency. For example, building the graph data of a mobile phone builds a news corpus related to that phone. Before visualization, the phone's public sentiment data is sorted by time and its most recent top-ranked news is selected, revealing which aspects of the phone, such as its camera or battery capacity, have recently attracted the most attention.
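The time ordering can be sketched as a sort over news records followed by a cut-off. The record fields `title` and `published`, the sample records, and the choice to drop repeated titles before sorting are assumptions made for this sketch.

```python
from datetime import datetime


def latest_news(records, top_n=2):
    """Drop repeated titles, sort by publication time (newest first), keep top_n."""
    seen, unique = set(), []
    for record in records:
        if record["title"] not in seen:
            seen.add(record["title"])
            unique.append(record)
    unique.sort(key=lambda r: r["published"], reverse=True)
    return unique[:top_n]


# Made-up sample records for illustration.
records = [
    {"title": "Phone A camera review", "published": datetime(2019, 3, 1)},
    {"title": "Phone A battery test", "published": datetime(2019, 3, 20)},
    {"title": "Phone A camera review", "published": datetime(2019, 3, 1)},
    {"title": "Phone A launch", "published": datetime(2019, 1, 5)},
]
recent_titles = [r["title"] for r in latest_news(records)]
```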
In addition, fields associated with the target project can be analyzed in depth. For example, for product competition relationships, public sentiment data related to the target project is obtained from the attributes of competing products, classified, deduplicated, and presented to the user.
Further, once the public sentiment of the target project has been obtained and monitored, a response can be made according to that public sentiment in order to protect the target project, for example public relations work for an enterprise's products to protect the image and interests of the enterprise. For example, if the target project is a mobile phone, the positive and negative information in the phone's public sentiment, as well as the event assessment information and channel assessment information, are obtained in order to take corresponding public relations measures.
It should be noted that the technical features of the different embodiments of the project public sentiment monitoring method described above can be recombined as needed to obtain combined embodiments, all of which fall within the scope of protection claimed by the present application.
Referring to Fig. 7, Fig. 7 is a schematic block diagram of the project public sentiment monitoring device provided by the embodiments of the present application. Corresponding to the above project public sentiment monitoring method, the embodiments of the present application also provide a project public sentiment monitoring device. As shown in Fig. 7, the device includes units for executing the above project public sentiment monitoring method and can be configured in computer equipment such as a terminal. Specifically, referring to Fig. 7, the project public sentiment monitoring device 700 includes a first acquisition unit 701, a second acquisition unit 702, a crawling unit 703, a recognition unit 704, a construction unit 705, and a display unit 706.
The first acquisition unit 701 is configured to obtain the identification information of the target project in a preset manner;
the second acquisition unit 702 is configured to obtain the data source website list of the target project through web search according to the identification information;
the crawling unit 703 is configured to crawl the corpus of the target project from the data source websites included in the data source website list according to the identification information;
the recognition unit 704 is configured to parse the corpus by natural language processing to identify the subject names and public sentiment features contained in the corpus;
the construction unit 705 is configured to import the subject names and the public sentiment features into a graph database to construct the public sentiment relation graph of the target project;
the display unit 706 is configured to display the public sentiment relation graph.
Referring to Fig. 8, Fig. 8 is another schematic block diagram of project public sentiment monitoring device provided by the embodiments of the present application. As shown in figure 8, in this embodiment, the project public sentiment monitoring device 700 further include:
Updating unit 707, for updating the data source website list by way of crawling.
With continued reference to Fig. 8, the updating unit 707 includes:
An obtaining subunit 7071, configured to obtain an initial data source website list of the target project;
A classification subunit 7072, configured to classify the initial data source website list according to a preset condition to obtain data source website lists of different types;
An encapsulation subunit 7073, configured to encapsulate the data source website lists of different types into corresponding Docker containers;
A crawling subunit 7074, configured to start the Docker containers so that the Docker containers obtain new data source websites by crawling;
An updating subunit 7075, configured to add the new data source websites, by type, to the corresponding classified data source website lists to update the data source websites of the target project.
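The classify-then-merge logic of the updating unit can be sketched as below. The preset condition is assumed here to be a host-substring rule, and `crawl_in_container` is a hypothetical callable standing in for starting the Docker container of each type; neither detail is specified in the application.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List
from urllib.parse import urlparse


def classify_sources(urls: Iterable[str], rules: Dict[str, str]) -> Dict[str, List[str]]:
    """Group URLs by type. `rules` maps a host substring to a type label (assumed condition)."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for url in urls:
        host = urlparse(url).netloc
        # First matching rule wins; unmatched hosts fall into "other".
        label = next((t for key, t in rules.items() if key in host), "other")
        grouped[label].append(url)
    return dict(grouped)


def update_sources(
    grouped: Dict[str, List[str]],
    crawl_in_container: Callable[[str, List[str]], List[str]],
) -> Dict[str, List[str]]:
    """Run one crawler per type and merge newly found sites into that type's list."""
    updated: Dict[str, List[str]] = {}
    for label, known in grouped.items():
        found = crawl_in_container(label, known)  # stand-in for the per-type container
        merged = list(known)
        for url in found:
            if url not in merged:  # keep each site once, preserving order
                merged.append(url)
        updated[label] = merged
    return updated
```

Keeping one list per type means a newly discovered site is only ever appended to the list of the container that found it, which matches the by-type update described above.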
In one embodiment, the public sentiment relation map of the target project includes a target project name, sub-project names of the target project, and the public sentiment features corresponding to the sub-project names.
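As an illustration of this three-level structure, the sketch below holds the map in memory as nodes and labelled edges before any import into a graph database. The relation labels `HAS_SUBPROJECT` and `HAS_FEATURE` and the function name are hypothetical; the application does not name a schema.

```python
from typing import Dict, List, Tuple

Edge = Tuple[str, str, str]  # (source node, relation label, target node)


def build_relation_map(project: str, sub_features: Dict[str, List[str]]) -> Dict[str, list]:
    """Link project -> sub-project -> public sentiment feature."""
    nodes: List[str] = [project]
    edges: List[Edge] = []
    for sub_project, features in sub_features.items():
        nodes.append(sub_project)
        edges.append((project, "HAS_SUBPROJECT", sub_project))
        for feature in features:
            if feature not in nodes:  # a feature shared by sub-projects is one node
                nodes.append(feature)
            edges.append((sub_project, "HAS_FEATURE", feature))
    return {"nodes": nodes, "edges": edges}
```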
Referring to Fig. 8, in this embodiment, the recognition unit 704 includes:
A segmentation subunit 7041, configured to split the corpus at sentence delimiters to obtain a sentence data set;
A construction subunit 7042, configured to construct a named entity model according to the corpus;
An identification subunit 7043, configured to identify the subject names contained in the sentence data set by means of the named entity model;
A retrieval subunit 7044, configured to perform part-of-speech analysis and target relationship retrieval on the corpus to obtain the public sentiment features of the target project.
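The four recognition steps can be sketched as below under strong simplifying assumptions: the named entity model is replaced by a gazetteer compiled into a regular expression, and part-of-speech analysis plus target relationship retrieval is replaced by a sentiment-word lookup within each sentence. A real system would use a trained model and a proper tokenizer, especially for Chinese text; all function names here are hypothetical.

```python
import re
from typing import Dict, Iterable, List, Set

# A sentence is a run of non-delimiter characters plus an optional closing delimiter.
_SENTENCE = re.compile(r"[^。！？!?.]+[。！？!?.]?")


def split_sentences(corpus: str) -> List[str]:
    """Split the corpus at Chinese and English sentence delimiters."""
    return [s.strip() for s in _SENTENCE.findall(corpus) if s.strip()]


def build_entity_model(names: Iterable[str]) -> re.Pattern:
    """Stand-in for a trained named entity model: a gazetteer as one regex."""
    ordered = sorted(set(names), key=len, reverse=True)  # prefer the longest match
    return re.compile("|".join(re.escape(n) for n in ordered))


def identify_subjects(sentences: List[str], model: re.Pattern) -> List[str]:
    """Collect each subject name once, in order of first appearance."""
    found: List[str] = []
    for sentence in sentences:
        for name in model.findall(sentence):
            if name and name not in found:
                found.append(name)
    return found


def extract_features(sentences: List[str], model: re.Pattern, lexicon: Set[str]) -> Dict[str, List[str]]:
    """Attach sentiment words to every subject named in the same sentence."""
    features: Dict[str, List[str]] = {}
    for sentence in sentences:
        words = [w for w in re.findall(r"\w+", sentence.lower()) if w in lexicon]
        for name in model.findall(sentence):
            if not name:
                continue
            bucket = features.setdefault(name, [])
            for word in words:
                if word not in bucket:
                    bucket.append(word)
    return features
```

Sentence-level co-occurrence is the crudest possible relationship test; it is used here only to show where the subject names and the sentiment features meet.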
Referring to Fig. 8, in this embodiment, the project public sentiment monitoring device 700 further includes:
A description unit 708, configured to combine the elements of the public sentiment relation map in a preset order to describe the public sentiment of the target project in text form.
In one embodiment, the description unit 708 is configured to combine the elements of the public sentiment relation map in a preset order to describe, in text form, the positive public sentiment information, negative public sentiment information, event evaluation information and channel evaluation information of the target project.
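A template-based sketch of the description unit follows. The section keys, their labels and the report wording are assumptions chosen for illustration; the application only states that the elements are combined in a preset order.

```python
from typing import Dict, List

# Assumed preset order of the four kinds of information named in the text.
PRESET_ORDER = ["positive", "negative", "event", "channel"]
LABELS = {
    "positive": "Positive sentiment",
    "negative": "Negative sentiment",
    "event": "Event evaluation",
    "channel": "Channel evaluation",
}


def describe(project: str, elements: Dict[str, List[str]]) -> str:
    """Render map elements as text, always in PRESET_ORDER."""
    lines = [f"Public sentiment report for {project}:"]
    for key in PRESET_ORDER:
        items = elements.get(key) or ["none recorded"]
        lines.append(f"{LABELS[key]}: {'; '.join(items)}.")
    return "\n".join(lines)
```

Because the order is fixed by `PRESET_ORDER` rather than by the input dictionary, two reports for different projects always line up section by section.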
It should be noted that, as those skilled in the art will clearly understand, for the specific implementation of the above project public sentiment monitoring device and each of its units, reference may be made to the corresponding descriptions in the foregoing method embodiments; for convenience and brevity of description, details are not repeated here.
Meanwhile, the division and connection of the units in the above project public sentiment monitoring device are for illustration only. In other embodiments, the device may be divided into different units as required, or the units may be connected in a different order and manner, to complete all or part of the functions of the device.
The above project public sentiment monitoring device may be implemented in the form of a computer program, and the computer program may run on computer equipment as shown in Fig. 9.
Referring to Fig. 9, Fig. 9 is a schematic block diagram of computer equipment provided by an embodiment of the present application. The computer equipment 900 may be a desktop computer, a server or similar computer equipment, or may be a component or part of other equipment.
Referring to Fig. 9, the computer equipment 900 includes a processor 902, a memory and a network interface 905 connected through a system bus 901, where the memory may include a non-volatile storage medium 903 and an internal memory 904.
The non-volatile storage medium 903 can store an operating system 9031 and a computer program 9032. When executed, the computer program 9032 can cause the processor 902 to perform the project public sentiment monitoring method described above.
The processor 902 provides computing and control capabilities to support the operation of the entire computer equipment 900.
The internal memory 904 provides an environment for running the computer program 9032 stored in the non-volatile storage medium 903. When the computer program 9032 is executed by the processor 902, it can cause the processor 902 to perform the project public sentiment monitoring method described above.
The network interface 905 is used for network communication with other equipment. Those skilled in the art will understand that the structure shown in Fig. 9 is only a block diagram of the part of the structure relevant to the solution of the present application and does not limit the computer equipment 900 to which the solution is applied; a specific computer equipment 900 may include more or fewer components than shown in the figure, combine certain components, or have a different component arrangement. For example, in some embodiments, the computer equipment may include only a memory and a processor; in such embodiments, the structure and function of the memory and the processor are consistent with the embodiment shown in Fig. 9, and details are not repeated here.
Wherein, the processor 902 is configured to run the computer program 9032 stored in the memory to implement the following steps: obtaining identification information of a target project in a preset manner; obtaining a data source website list of the target project by web search according to the identification information; crawling a corpus of the target project from the data source websites included in the data source website list according to the identification information; parsing the corpus by natural language processing to identify the subject names and public sentiment features contained in the corpus; importing the subject names and the public sentiment features into a graph database to construct a public sentiment relation map of the target project; and displaying the public sentiment relation map.
In one embodiment, after implementing the step of obtaining the data source website list of the target project by web search according to the identification information, the processor 902 further implements the following step:
Updating the data source website list by crawling.
In one embodiment, when implementing the step of updating the data source website list by crawling, the processor 902 specifically implements the following steps:
Obtaining an initial data source website list of the target project;
Classifying the initial data source website list according to a preset condition to obtain data source website lists of different types;
Encapsulating the data source website lists of different types into corresponding Docker containers;
Starting the Docker containers so that the Docker containers obtain new data source websites by crawling;
Adding the new data source websites, by type, to the corresponding classified data source website lists to update the data source websites of the target project.
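One way to picture the encapsulation step is to prepare one `docker run` invocation per website type, as sketched below. The image name `example/site-crawler:latest` and the `SEED_URLS` environment variable are hypothetical placeholders, since the application does not say how a list is packaged into its container; the function only builds argument lists and does not start anything.

```python
import re
from typing import Dict, List

# Hypothetical crawler image; not named in the application.
CRAWLER_IMAGE = "example/site-crawler:latest"


def container_commands(grouped: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Build one `docker run` argument list per source type (not executed here)."""
    commands: Dict[str, List[str]] = {}
    for label, urls in grouped.items():
        # Container names may only contain letters, digits, '_', '.' and '-'.
        name = "crawler-" + re.sub(r"[^a-zA-Z0-9_.-]", "-", label)
        commands[label] = [
            "docker", "run", "--rm",
            "--name", name,
            "-e", "SEED_URLS=" + ",".join(urls),  # assumed way to pass the list
            CRAWLER_IMAGE,
        ]
    return commands
```

Each argument list could then be handed to `subprocess.run(cmd, check=True)` on a host where Docker is installed; isolating one type per container keeps a failure in one crawler from affecting the others.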
In one embodiment, when the processor 902 implements the public sentiment relation map of the target project, the public sentiment relation map specifically includes the following: a target project name, sub-project names of the target project, and the public sentiment features corresponding to the sub-project names.
In one embodiment, when implementing the step of parsing the corpus by natural language processing to identify the subject names and public sentiment features contained in the corpus, the processor 902 specifically implements the following steps:
Splitting the corpus at sentence delimiters to obtain a sentence data set;
Constructing a named entity model according to the corpus;
Identifying the subject names contained in the sentence data set by means of the named entity model;
Performing part-of-speech analysis and target relationship retrieval on the corpus to obtain the public sentiment features of the target project.
In one embodiment, after implementing the step of displaying the public sentiment relation map, the processor 902 further implements the following step:
Combining the elements of the public sentiment relation map in a preset order to describe the public sentiment of the target project in text form.
In one embodiment, when implementing the step of combining the elements of the public sentiment relation map in a preset order to describe the public sentiment of the target project in text form, the processor 902 specifically implements the following step:
Combining the elements of the public sentiment relation map in a preset order to describe, in text form, the positive public sentiment information, negative public sentiment information, event evaluation information and channel evaluation information of the target project.
It should be understood that, in the embodiments of the present application, the processor 902 may be a central processing unit (Central Processing Unit, CPU), or may be another general-purpose processor, a digital signal processor (Digital Signal Processor, DSP), an application-specific integrated circuit (Application Specific Integrated Circuit, ASIC), a field-programmable gate array (Field-Programmable Gate Array, FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, and so on. The general-purpose processor may be a microprocessor, or the processor may be any conventional processor.
Those of ordinary skill in the art will understand that all or part of the processes in the methods of the above embodiments can be completed by a computer program, and the computer program can be stored in a computer-readable storage medium. The computer program is executed by at least one processor in the computer system to implement the process steps of the above method embodiments.
Therefore, the present application also provides a computer-readable storage medium. The computer-readable storage medium may be a non-volatile computer-readable storage medium. The computer-readable storage medium stores a computer program, and the computer program, when executed by a processor, causes the processor to perform the following steps:
A computer program product which, when run on a computer, causes the computer to perform the steps of the project public sentiment monitoring method described in the above embodiments.
The computer-readable storage medium may be an internal storage unit of the aforementioned equipment, such as a hard disk or memory of the equipment. The computer-readable storage medium may also be an external storage device of the equipment, such as a plug-in hard disk, a smart media card (Smart Media Card, SMC), a secure digital (Secure Digital, SD) card or a flash card (Flash Card) equipped on the equipment. Further, the computer-readable storage medium may include both an internal storage unit of the equipment and an external storage device.
Those skilled in the art will clearly understand that, for convenience and brevity of description, for the specific working processes of the equipment, devices and units described above, reference may be made to the corresponding processes in the foregoing method embodiments, and details are not repeated here.
The computer-readable storage medium may be a USB flash drive, a removable hard disk, a read-only memory (Read-Only Memory, ROM), a magnetic disk, an optical disc, or any other computer-readable storage medium that can store program code.
Those of ordinary skill in the art will appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware, computer software, or a combination of the two. To clearly illustrate the interchangeability of hardware and software, the composition and steps of each example have been described above generally in terms of function. Whether these functions are implemented in hardware or software depends on the specific application and design constraints of the technical solution. Skilled artisans may use different methods to implement the described functions for each specific application, but such implementation should not be considered beyond the scope of the present application.
In the several embodiments provided in the present application, it should be understood that the disclosed device and method may be implemented in other ways. For example, the device embodiments described above are merely illustrative: the division of the units is only a division by logical function, and there may be other divisions in actual implementation. For example, multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed.
The steps in the methods of the embodiments of the present application may be reordered, combined and deleted according to actual needs. The units in the devices of the embodiments of the present application may be combined, divided and deleted according to actual needs. In addition, the functional units in the embodiments of the present application may be integrated in one processing unit, each unit may exist alone physically, or two or more units may be integrated in one unit.
If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a storage medium. Based on this understanding, the technical solution of the present application in essence, or the part that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing an electronic device (which may be a personal computer, a terminal, a network device or the like) to execute all or part of the steps of the methods of the embodiments of the present application.
The above are only specific embodiments of the present application, but the protection scope of the present application is not limited thereto. Any person skilled in the art can readily conceive of various equivalent modifications or replacements within the technical scope disclosed in the present application, and these modifications or replacements shall all fall within the protection scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (10)

1. A project public sentiment monitoring method, characterized in that the method comprises:
Obtaining identification information of a target project in a preset manner;
Obtaining a data source website list of the target project by web search according to the identification information;
Crawling a corpus of the target project from the data source websites included in the data source website list according to the identification information;
Parsing the corpus by natural language processing to identify the subject names and public sentiment features contained in the corpus;
Importing the subject names and the public sentiment features into a graph database to construct a public sentiment relation map of the target project;
Displaying the public sentiment relation map.
2. The project public sentiment monitoring method according to claim 1, characterized in that, after the step of obtaining the data source website list of the target project by web search according to the identification information, the method further comprises:
Updating the data source website list by crawling.
3. The project public sentiment monitoring method according to claim 2, characterized in that the step of updating the data source website list by crawling comprises:
Obtaining an initial data source website list of the target project;
Classifying the initial data source website list according to a preset condition to obtain data source website lists of different types;
Encapsulating the data source website lists of different types into corresponding Docker containers;
Starting the Docker containers so that the Docker containers obtain new data source websites by crawling;
Adding the new data source websites, by type, to the corresponding classified data source website lists to update the data source websites of the target project.
4. The project public sentiment monitoring method according to claim 1, characterized in that the public sentiment relation map of the target project includes a target project name, sub-project names of the target project, and the public sentiment features corresponding to the sub-project names.
5. The project public sentiment monitoring method according to claim 1, characterized in that the step of parsing the corpus by natural language processing to identify the subject names and public sentiment features contained in the corpus comprises:
Splitting the corpus at sentence delimiters to obtain a sentence data set;
Constructing a named entity model according to the corpus;
Identifying the subject names contained in the sentence data set by means of the named entity model;
Performing part-of-speech analysis and target relationship retrieval on the corpus to obtain the public sentiment features of the target project.
6. The project public sentiment monitoring method according to claim 1, characterized in that, after the step of displaying the public sentiment relation map, the method further comprises:
Combining the elements of the public sentiment relation map in a preset order to describe the public sentiment of the target project in text form.
7. The project public sentiment monitoring method according to claim 6, characterized in that the public sentiment of the target project includes positive public sentiment information, negative public sentiment information, event evaluation information and channel evaluation information.
8. A project public sentiment monitoring device, characterized by comprising:
A first acquisition unit, configured to obtain identification information of a target project in a preset manner;
A second acquisition unit, configured to obtain a data source website list of the target project by web search according to the identification information;
A crawling unit, configured to crawl a corpus of the target project from the data source websites included in the data source website list according to the identification information;
A recognition unit, configured to parse the corpus by natural language processing to identify the subject names and public sentiment features contained in the corpus;
A construction unit, configured to import the subject names and the public sentiment features into a graph database to construct a public sentiment relation map of the target project;
A display unit, configured to display the public sentiment relation map.
9. Computer equipment, characterized in that the computer equipment includes a memory and a processor connected to the memory; the memory is configured to store a computer program; and the processor is configured to run the computer program stored in the memory to perform the steps of the project public sentiment monitoring method according to any one of claims 1-7.
10. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program, and the computer program, when executed by a processor, causes the processor to perform the steps of the project public sentiment monitoring method according to any one of claims 1-7.
CN201910270796.5A 2019-04-04 2019-04-04 Project public sentiment monitoring method, device, computer equipment and storage medium Pending CN110134845A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910270796.5A CN110134845A (en) 2019-04-04 2019-04-04 Project public sentiment monitoring method, device, computer equipment and storage medium

Publications (1)

Publication Number Publication Date
CN110134845A true CN110134845A (en) 2019-08-16

Family

ID=67569394

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910270796.5A Pending CN110134845A (en) 2019-04-04 2019-04-04 Project public sentiment monitoring method, device, computer equipment and storage medium

Country Status (1)

Country Link
CN (1) CN110134845A (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102708096A (en) * 2012-05-29 2012-10-03 代松 Network intelligence public sentiment monitoring system based on semantics and work method thereof
WO2018023981A1 (en) * 2016-08-03 2018-02-08 平安科技(深圳)有限公司 Public opinion analysis method, device, apparatus and computer readable storage medium
CN108874878A (en) * 2018-05-03 2018-11-23 众安信息技术服务有限公司 A kind of building system and method for knowledge mapping
CN109471937A (en) * 2018-10-11 2019-03-15 平安科技(深圳)有限公司 A kind of file classification method and terminal device based on machine learning

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111143336A (en) * 2019-11-27 2020-05-12 三盟科技股份有限公司 College scientific research data management-oriented web crawler management method and platform
CN111611408A (en) * 2020-05-27 2020-09-01 北京明略软件系统有限公司 Public opinion analysis method and device, computer equipment and storage medium
CN111666426A (en) * 2020-06-10 2020-09-15 北京海致星图科技有限公司 Method, system and equipment for acquiring knowledge graph multi-scene graph data
CN111858959A (en) * 2020-07-23 2020-10-30 平安付科技服务有限公司 Method and device for generating component relation map, computer equipment and storage medium
CN112069381A (en) * 2020-09-27 2020-12-11 中国科学院深圳先进技术研究院 Monitoring management method and system based on natural language processing technology
CN113657547A (en) * 2021-08-31 2021-11-16 平安医疗健康管理股份有限公司 Public opinion monitoring method based on natural language processing model and related equipment thereof
CN113657547B (en) * 2021-08-31 2024-05-14 平安医疗健康管理股份有限公司 Public opinion monitoring method based on natural language processing model and related equipment thereof

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination