CN105786929A - Information monitoring method and device - Google Patents

Info

Publication number: CN105786929A
Application number: CN201410833749.4A
Authority: CN (China)
Other languages: Chinese (zh)
Other versions: CN105786929B (en)
Legal status: Granted, Active (the listed status is an assumption and is not a legal conclusion)
Inventor: 王鑫文
Current Assignee: Advanced New Technologies Co Ltd; Advantageous New Technologies Co Ltd (the listed assignees may be inaccurate)
Original Assignee: Alibaba Group Holding Ltd
Application filed by: Alibaba Group Holding Ltd
Priority to: CN201410833749.4A
Prior art keywords: information, evaluation object, word, node, relation

Abstract

The application discloses an information monitoring method and device. The method comprises the following steps: capturing information that needs to be monitored; obtaining a sentence from the information and performing syntactic analysis on the sentence to obtain potential evaluation objects; labeling and extracting the potential evaluation objects with a trained conditional random field (CRF) to obtain a final evaluation object; and judging whether the final evaluation object matches a preset keyword. By combining syntactic analysis with CRF technology, the method and device provided by the embodiments of the application obtain the evaluation object of the monitored information and thereby determine whether the information belongs to a topic the user is concerned with, which prevents important information from being filtered out directly.

Description

Information monitoring method and device
Technical field
The application relates to the field of computer technology, and in particular to an information monitoring method and device.
Background
The rapid development of Internet technology has greatly facilitated the spread of information. Tens of thousands of pieces of information are propagated on the Internet every day. Society has truly entered the information age, and the Internet has become a new medium distinct from television, radio and newspapers, and is increasingly one of the main carriers reflecting public opinion.
Because the network is concealed and virtual, anyone can publish their own views online, and those views can quickly spread over a wide range. As a result, a large amount of complicated information exists on the Internet, and such a huge volume of information inevitably contains some valuable information. Whether for governments, public media or large enterprises, monitoring information in a timely and effective manner and obtaining useful information from it brings significant civil and commercial value to the country, society, enterprises or individuals. In today's information explosion, how to monitor public opinion in a timely and effective manner has become a technical problem in urgent need of a solution.
Existing monitoring is mainly based on keyword capture combined with manual screening. The system captures data matching the configured keywords from the network and then classifies the captured data, for example into two classes: "no further processing needed" and "further processing needed". Although this approach classifies most information accurately, its accuracy still needs improvement: the "no further processing needed" class may still contain important information, and filtering that data out directly inevitably causes important information to be missed.
Summary of the invention
The embodiments of the application provide an information monitoring method and device to solve the problem of missed information in prior-art monitoring.
An embodiment of the application provides an information monitoring method, comprising:
capturing information that needs to be monitored;
obtaining a sentence from the information and performing syntactic analysis on the sentence to obtain potential evaluation objects;
labeling and extracting the potential evaluation objects with a trained conditional random field (CRF) to obtain a final evaluation object; and
judging whether the final evaluation object matches a preset keyword.
An embodiment of the application also provides an information monitoring device, comprising:
a capture module, for capturing information that needs to be monitored;
an acquisition module, for obtaining a sentence from the information and performing syntactic analysis on the sentence to obtain potential evaluation objects;
an extraction module, for labeling and extracting the potential evaluation objects with a trained conditional random field (CRF) to obtain a final evaluation object; and
a judgment module, for judging whether the final evaluation object matches a preset keyword.
The information monitoring method and device provided by the embodiments of the application combine syntactic analysis with conditional random field (CRF) technology to obtain the evaluation object of the monitored information, and thereby confirm whether the information belongs to a topic the user is concerned with, which prevents important information from being filtered out directly.
Brief description of the drawings
The drawings described here provide a further understanding of the application and form a part of it. The illustrative embodiments of the application and their descriptions explain the application and do not unduly limit it. In the drawings:
Fig. 1 shows the information monitoring method provided by an embodiment of the application;
Fig. 2 is a structural diagram of the information monitoring device provided by an embodiment of the application;
Fig. 3 is a diagram of the syntactic structure tree obtained by syntactic analysis in a concrete application example of an embodiment of the application.
Detailed description of the invention
To make the purpose, technical solution and advantages of the application clearer, the technical solution is described clearly and completely below with reference to specific embodiments of the application and the corresponding drawings. The described embodiments are only some of the embodiments of the application, not all of them. All other embodiments obtained by those of ordinary skill in the art from the embodiments in the application without creative work fall within the scope of protection of the application.
Referring to Fig. 1, the application discloses an information monitoring method, comprising:
S101: capture information that needs to be monitored.
The capture process includes: capturing information according to preset keywords; and classifying the captured information with a support vector machine (Support Vector Machine, SVM) classifier to obtain the information that needs to be monitored.
In the keyword-based capture step, the user sets in advance the keywords needed for monitoring according to the topics of interest and sends them to the system. After obtaining the keywords, the system captures from the network platform the information that matches them, which includes information related to the topics the user cares about. In the embodiments of the application, the keywords may be configured manually at the client server. The system may be an information search engine, which crawls matching information after obtaining the keywords, stores all captured information, and sends the information back to the user side or stores it on the server side for the next step of analysis and processing.
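The keyword-matching part of this step can be sketched as follows. This is a minimal illustration that assumes the messages have already been fetched as plain strings; the function name `capture_by_keywords` and the sample messages are hypothetical and do not come from the application.

```python
def capture_by_keywords(messages, keywords):
    """Return the messages that contain at least one preset keyword."""
    lowered = [k.lower() for k in keywords]
    return [m for m in messages if any(k in m.lower() for k in lowered)]


# Hypothetical sample messages.
messages = [
    "Wait until there is enough money in Alipay, and the facial cream is delisted",
    "Nice weather today",
]
captured = capture_by_keywords(messages, ["Alipay"])
```

Plain substring matching is the simplest possible choice here; a real crawler would query the platform or a search engine with the keywords rather than filter locally.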
Then the captured information is classified by the SVM classifier. Specifically, the captured information is first read and recognized, and then classified by the SVM classifier into two classes according to how the information matches the keywords: valuable information is assigned to a first class (for example, "useful") and worthless information to a second class (for example, "useless"). In the embodiments of the application, to avoid missing important information that falls into the "useless" class, the "useless" information can be set as the information that needs to be monitored, so that it can be processed in depth later.
In this embodiment, the SVM classifier is a model trained on samples. It finds a classification plane between the two classes, that is, a classification function (linear or nonlinear), which is used to assign information to a class. The information can be preprocessed before classification, for example by extracting feature words from the information (which may include text features, graphical features appearing in the information, and features of how the information was reposted or forwarded, all of which can be set during model training) and converting them into a feature vector; the model then classifies the feature vector. The classification function is not unique and can be set as required. It directly affects the accuracy of the classifier and therefore requires a large amount of model training; the training process is not described here. In the embodiments of the application, after the information has passed through the SVM classifier, subsequent manual screening can be applied directly to the "useful" information. This lets the user obtain information related to the topics of interest quickly and accurately, saves a large amount of manual screening work, and improves processing efficiency.
In addition, in actual classification there may be multiple SVM classifiers, and the information may be classified by them one after another. Each SVM classifier may have two classes and its own specific classification criteria, so that the same information is classified several times and the final classification accuracy improves.
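To make the "feature vector, then classification plane" idea concrete, the following is a small sketch of a linear SVM trained by subgradient descent on the hinge loss. It is an illustration only: the application does not say which SVM implementation, kernel or features are used, and the vocabulary, training samples and hyperparameters below are all assumptions.

```python
def featurize(text, vocab):
    """Bag-of-words count vector over a fixed vocabulary."""
    tokens = text.lower().split()
    return [float(tokens.count(word)) for word in vocab]


def train_linear_svm(xs, ys, epochs=200, lr=0.1, lam=0.01):
    """Subgradient descent on the L2-regularised hinge loss; labels are +1 / -1."""
    w = [0.0] * len(xs[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(xs, ys):
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            # Shrink the weights (regulariser), then correct margin violations.
            w = [wi - lr * lam * wi for wi in w]
            if margin < 1:
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
    return w, b


def predict(w, b, x):
    """Map the sign of the decision function to the two classes in the text."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return "useful" if score >= 0 else "useless"


# Hypothetical vocabulary and labelled samples (+1 = useful, -1 = useless).
vocab = ["alipay", "refund", "failed", "weather", "lunch"]
samples = [("alipay refund failed", 1), ("alipay failed", 1),
           ("weather lunch", -1), ("lunch", -1)]
w, b = train_linear_svm([featurize(t, vocab) for t, _ in samples],
                        [y for _, y in samples])
label = predict(w, b, featurize("Alipay failed", vocab))
```

Chaining several such classifiers, as the text describes, would amount to passing each message through a list of `(w, b)` pairs in turn, each trained for its own classification criterion.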
S102: obtain a sentence from the information and perform syntactic analysis on the sentence to obtain potential evaluation objects.
Because the information may carry external content (for example, quoted information, URLs, sources and characters) that does not belong to the information itself, this content should not be included in the sentences obtained from the monitored information. The embodiments of the application use regular-expression rules to delete this content and obtain more concise information. Take the following monitored information as an example: "In fact, there are plenty of products worse than this one //@Angela_ unhappy Miss: That product is the worst." After processing with the regular-expression rules, the sentence obtained is "In fact, there are plenty of products worse than this one."
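The application does not list its regular-expression rules, so the patterns below are assumptions chosen to reproduce the example above: a repost chain introduced by `//@` and bare URLs are removed. The names `REPOST_CHAIN`, `URL` and `strip_external_content` are hypothetical.

```python
import re

# Assumed patterns: everything from a "//@user:" repost marker onward, and bare URLs.
REPOST_CHAIN = re.compile(r"//@.*$", re.DOTALL)
URL = re.compile(r"https?://\S+")


def strip_external_content(text):
    """Remove quoted reposts and URLs, then trim surrounding whitespace."""
    text = REPOST_CHAIN.sub("", text)
    text = URL.sub("", text)
    return text.strip()


raw = ("In fact, there are plenty of products worse than this one. "
       "//@Angela_ unhappy Miss: That product is the worst.")
cleaned = strip_external_content(raw)
```

The repost pattern is applied before the URL pattern so that links inside the quoted repost are discarded along with it.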
Furthermore, because a piece of information sometimes contains several sentences (separated by symbols such as ".", "?" and "!"), several sentences may be obtained from the same information. For ease of description, the following assumes that the monitored information contains only one sentence.
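Splitting a message into sentences at these symbols can be sketched as below. The sketch accepts both ASCII and full-width punctuation, which is an assumption about the input; the name `split_sentences` is hypothetical.

```python
import re

# A sentence is a run of non-terminator characters followed by its terminators.
SENTENCE = re.compile(r"[^.?!。？！]+[.?!。？！]*")


def split_sentences(text):
    """Return the sentences of a message, each keeping its closing punctuation."""
    return [m.strip() for m in SENTENCE.findall(text) if m.strip()]


sentences = split_sentences("This product is bad. Is there a refund? No!")
```

A rule this simple also splits at decimal points and abbreviations, so real text would need extra handling for those cases.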
After the sentence is obtained, the embodiments of the application further perform syntactic analysis on it to obtain potential evaluation objects, which specifically includes:
performing syntactic analysis on the sentence to obtain the node word corresponding to the root node (ROOT) of the sentence;
determining the node words whose relation to the node word corresponding to the root node (ROOT) is a subject-verb relation (Subject-Verb, SBV), verb-object relation (Verb-Object, VOB), indirect-object relation (Indirect-Object, IOB), fronting-object relation (Fronting-Object, FOB), adverbial relation (Adverbial, ADV), coordinate relation (Coordinate, COO), complement relation (Complement, CMP) or attribute relation (Attribute, ATT), and setting them as the first child node set;
determining the node words whose relation to a node word in the first child node set is SBV, VOB, IOB, FOB, ADV, COO, CMP or ATT, and setting them as the second child node set;
determining the node words contained in the first child node set and the second child node set as the potential evaluation objects of the information.
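The two-level collection described in the steps above can be sketched as follows, assuming the parser output is available as a list of `(word, head_index, relation)` triples. That representation, the `"ROOT"` placeholder label and the sample parse are assumptions for illustration; they are not the parse shown in Fig. 3.

```python
# Relation labels named in the text as qualifying a node word.
RELATIONS = {"SBV", "VOB", "IOB", "FOB", "ADV", "COO", "CMP", "ATT"}


def potential_evaluation_objects(tokens):
    """tokens: list of (word, head_index, relation); head_index -1 marks the root word."""
    root = next(i for i, (_, head, _) in enumerate(tokens) if head == -1)

    def children(parents):
        # Indices of tokens attached to any parent by a qualifying relation.
        return [i for i, (_, head, rel) in enumerate(tokens)
                if head in parents and rel in RELATIONS]

    first = children({root})        # first child node set
    second = children(set(first))   # second child node set
    return [tokens[i][0] for i in first + second]


# Hypothetical dependency parse of "new phone battery drains quickly".
parse = [
    ("new", 1, "ATT"),       # index 0, attached to "phone"
    ("phone", 2, "ATT"),     # index 1, attached to "battery"
    ("battery", 3, "SBV"),   # index 2, attached to "drains"
    ("drains", -1, "ROOT"),  # index 3, the root node word
    ("quickly", 3, "ADV"),   # index 4, attached to "drains"
]
candidates = potential_evaluation_objects(parse)
```

In this parse "new" sits three levels below the root, so it is not collected; only words within two levels of the root node word become potential evaluation objects.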
S103: label and extract the potential evaluation objects with a trained conditional random field (CRF) to obtain a final evaluation object.
Features are extracted from the first child node set and the second child node set of the potential evaluation objects according to sentiment-element extraction rules, yielding a confidence for each potential evaluation object. The conditional random field (Conditional Random Field, CRF) uses this confidence to calculate a probability for each potential evaluation object, and the object with the highest probability is taken as the final evaluation object.
In the embodiments of the application, the final evaluation object is obtained as follows:
First, part-of-speech tagging and model training are performed. For example, the collected information is labeled manually, the labeled training data is obtained and a template file is written, and the CRF model is then trained; the training is carried out with the CRF++ tool. The CRF model is a probabilistic model used to calculate the probability of the words to be evaluated.
Second, a parser performs word segmentation on the potential evaluation objects (marking each word's position information and forming words from characters), part-of-speech tagging (marking the part of speech of each segmented word, for example noun, verb or auxiliary word) and dependency parsing (analyzing the relations between words, for example verb-object or subject-verb). Features are then extracted according to the sentiment-element extraction rules; that is, the confidence of each potential evaluation object is determined from the syntactic relation between the potential evaluation object and sentiment words, see Table 1 below. The sentiment-lexicon feature is determined with a sentiment dictionary (a collection of sentiment vocabulary of all kinds).
Table 1
Third, the probability of each potential evaluation object is calculated with the probabilistic model, and the potential evaluation object with the highest probability is taken as the final evaluation object of the information.
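Two pieces of the three steps above lend themselves to a short sketch. CRF++ reads training data as one token per line with whitespace-separated columns, the last column being the label, and a blank line between sentences; the first function writes manually labeled tokens in that layout. The second function picks the candidate with the highest probability. The feature columns, the label names `TARGET` and `O`, and the probability values are all assumptions, since the application reports none of them.

```python
def to_crfpp_rows(sentences):
    """sentences: list of sentences, each a list of (word, pos, label) tuples."""
    blocks = ["\n".join("\t".join(token) for token in sentence)
              for sentence in sentences]
    # A blank line separates consecutive sentences.
    return "\n\n".join(blocks) + "\n"


def final_evaluation_object(probabilities):
    """probabilities: dict mapping each candidate word to its model probability."""
    return max(probabilities, key=probabilities.get)


# Hypothetical manually labelled sentences (word, part of speech, label).
labelled = [
    [("battery", "n", "TARGET"), ("drains", "v", "O"), ("quickly", "d", "O")],
    [("screen", "n", "TARGET"), ("cracked", "v", "O")],
]
training_text = to_crfpp_rows(labelled)

# Hypothetical probabilities for the candidates of one message.
scores = {"facial cream": 0.62, "money": 0.21, "enough": 0.09, "wait": 0.08}
final = final_evaluation_object(scores)
```

With CRF++ the written file would typically be passed to `crf_learn` together with a template file to produce a model; tokens containing spaces would need escaping first, because whitespace separates the columns.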
S104: judge whether the final evaluation object matches a preset keyword.
In a preferred embodiment of the application, judging whether the final evaluation object matches a preset keyword specifically includes:
comparing the final evaluation object with the keywords and judging whether the two intersect;
if they do, confirming that the final evaluation object matches the keywords, and retaining the information;
otherwise, confirming that the final evaluation object does not match the keywords, and filtering out the information.
In this process, the final evaluation object of the information is compared with the preset keywords to see whether the two intersect, which determines whether the information concerns a topic of interest to the user. If there is no intersection, the final evaluation object does not match the keywords, so the monitored information is not a topic the user follows; the information can then be filtered out directly without creating a manual screening task. If there is an intersection, the final evaluation object matches the keywords, so the monitored information is in fact "useful" information that belongs to a topic of interest to the user and should not be filtered out; the information is then retained, a manual screening task is created for it, and it is brought into the next step of manual screening. Thus, by obtaining the final evaluation object of "useless" information and comparing it with the keywords to judge whether the information really is "useless", the probability of missing important information among the "useless" information can be greatly reduced and the accuracy of information monitoring greatly improved.
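The matching rule reduces to a set intersection, as in the sketch below. The function names and the "retain" / "filter" return values are hypothetical; creating a manual screening task is outside the sketch.

```python
def matches_keywords(final_objects, keywords):
    """True if the final evaluation object(s) and the keyword set intersect."""
    return bool(set(final_objects) & set(keywords))


def route(message, final_objects, keywords):
    """Retain matching messages for manual screening; filter out the rest."""
    action = "retain" if matches_keywords(final_objects, keywords) else "filter"
    return action, message


decision = route("the facial cream is delisted", {"facial cream"}, {"Alipay"})
```

Exact set membership is the most literal reading of "intersect"; whether the application also allows partial or synonym matches is not stated.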
In line with the monitoring method above, the application also discloses an information monitoring device, comprising:
a capture module, for capturing information that needs to be monitored;
an acquisition module, for obtaining a sentence from the information and performing syntactic analysis on the sentence to obtain potential evaluation objects;
an extraction module, for labeling and extracting the potential evaluation objects with a trained conditional random field (CRF) to obtain a final evaluation object; and
a judgment module, for judging whether the final evaluation object matches a preset keyword.
The acquisition module contains a parser. The parser analyzes the sentence to obtain the node word corresponding to the root node (ROOT) of the sentence, together with the node words whose relation to that node word is subject-verb (SBV), verb-object (VOB), indirect-object (IOB), fronting-object (FOB), adverbial (ADV), coordinate (COO), complement (CMP) or attribute (ATT), and sets these as the first child node set. The parser also obtains the node words whose relation to a node word in the first child node set is SBV, VOB, IOB, FOB, ADV, COO, CMP or ATT, and sets these as the second child node set.
The acquisition module also contains an extraction unit, which extracts the node words in the first child node set and the second child node set and sets the extracted node words as the potential evaluation objects.
The extraction module contains a confidence calculation unit, which extracts features from the potential evaluation objects according to the sentiment-element extraction rules and obtains the confidence of each potential evaluation object. The extraction module also contains a probability calculation unit, which calculates the probability of each potential evaluation object from the conditional random field (CRF) and the confidence.
The capture module contains an information capturer and a support vector machine (SVM) classifier. The information capturer captures information according to the preset keywords; the SVM classifier classifies the captured information to obtain the information that needs to be monitored.
The judgment module contains a comparison unit and a confirmation unit. The comparison unit compares the final evaluation object with the preset keywords and judges whether the two intersect; the confirmation unit confirms, according to whether the intersection exists, whether the information is filtered out.
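One way to picture how the four modules fit together is as a small pipeline object whose parts are interchangeable callables. This is a structural sketch only: the class name `MonitoringDevice`, its field names and the stub functions are hypothetical stand-ins for the crawler with SVM classifier, the parser and the CRF model.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Set


@dataclass
class MonitoringDevice:
    capture: Callable[[], Iterable[str]]  # capture module: messages to monitor
    acquire: Callable[[str], List[str]]   # acquisition module: potential objects
    extract: Callable[[List[str]], str]   # extraction module: final object
    keywords: Set[str]                    # preset keywords for the judgment module

    def run(self) -> List[str]:
        """Judgment module: retain messages whose final object is a keyword."""
        retained = []
        for message in self.capture():
            candidates = self.acquire(message)
            if candidates and self.extract(candidates) in self.keywords:
                retained.append(message)
        return retained


# Stub modules, purely for illustration.
device = MonitoringDevice(
    capture=lambda: ["Alipay refund failed", "facial cream delisted"],
    acquire=lambda message: message.split(),
    extract=lambda candidates: candidates[0],
    keywords={"Alipay"},
)
retained = device.run()
```

Keeping each module behind a plain callable makes it easy to swap a stub for a trained model without touching the judgment logic.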
The following concrete application example illustrates the use of the application in monitoring microblog public-opinion information. The monitoring flow for microblogs is as follows:
First, keywords are configured and microblog public-opinion information is captured according to them. For example, the keyword is configured as "Alipay", so the keyword (Key Word) set is {"Alipay"}, with the aim of capturing public-opinion information related to "Alipay" from microblogs.
Next, the captured microblog information is divided into two classes. The classification is done by the support vector machine (SVM) classifier, which was trained in advance on a large number of sample words. For example, suppose the following microblog message is captured: "Wait until there is enough money in Alipay, and the facial cream is delisted." The SVM classifier assigns this message to the "useless" class (because the subject of the sentence is not "Alipay"). To prevent important information from being missed, this "useless" information is treated as information that needs to be monitored, so that it can be examined further.
Then the evaluation object of this microblog message is obtained. The process is as follows. First, a parser performs syntactic analysis on the message and produces the syntactic structure tree shown in Fig. 3.
Second, the node "delisted" is obtained from the root node (ROOT), and the child nodes that stand in an SBV, VOB, IOB, FOB, ADV or COO relation to the root node word "delisted" are obtained, giving three child nodes: "facial cream", " " and "wait". The child nodes that stand in an SBV, VOB, IOB, FOB, ADV or COO relation to these three child nodes are then obtained, giving two child nodes: "enough" and "money". Finally, the set of match words (Match Word) is obtained; it comprises the five nodes "facial cream", " ", "wait", "enough" and "money", and these five nodes are the potential evaluation objects.
Third, the final evaluation object "facial cream" is obtained (CRF labeling and extraction gives "facial cream" the highest probability, so it is taken as the final evaluation object) and intersected with the keyword set {"Alipay"}. The intersection is empty, so the information in this microblog is judged not to match the user's topic of interest, "Alipay".
Finally, this microblog message is filtered out directly and is not included in the next processing task.
In the embodiments of the application, the captured microblog public-opinion information is screened and filtered by recognizing and classifying the microblog information and combining this with evaluation-object extraction, which improves screening accuracy and reduces the risk of missing important information.
Those skilled in the art should understand that embodiments of the application may be provided as a method, a system or a computer program product. The application may therefore take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware. Moreover, the application may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM and optical storage) containing computer-usable program code.
The application is described with reference to flowcharts and/or block diagrams of the method, device (system) and computer program product according to its embodiments. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks, can be implemented by computer program instructions. These instructions may be provided to the processor of a general-purpose computer, special-purpose computer, embedded processor or other programmable data processing device to produce a machine, so that the instructions executed by the processor produce a means for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing device to work in a specific way, so that the instructions stored in the memory produce an article of manufacture including an instruction means that implements the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or other programmable data processing device, so that a series of operational steps is performed on it to produce computer-implemented processing. The instructions executed on the computer or other programmable device thus provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
In a typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces and memory.
Memory may include volatile memory in computer-readable media, in the form of random access memory (RAM), and/or non-volatile memory such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media include permanent and non-permanent, removable and non-removable media, and can store information by any method or technology. The information may be computer-readable instructions, data structures, program modules or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technologies, compact disc read-only memory (CD-ROM), digital versatile disc (DVD) or other optical storage, magnetic cassettes, magnetic tape or disk storage or other magnetic storage devices, or any other non-transmission medium that can store information accessible to a computing device. As defined herein, computer-readable media do not include transitory media, such as modulated data signals and carrier waves.
It should also be noted that the terms "includes", "comprises" and any other variants are intended to be non-exclusive, so that a process, method, commodity or device that includes a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to that process, method, commodity or device. Without further limitation, an element introduced by the phrase "includes a ..." does not exclude the presence of other identical elements in the process, method, commodity or device that includes the element.
The foregoing describes only embodiments of the application and does not limit the application. Those skilled in the art may modify and vary the application in various ways. Any modification, equivalent substitution or improvement made within the spirit and principle of the application shall fall within the scope of the claims of the application.

Claims (17)

1. an information monitoring method, it is characterised in that including:
Capture and need monitored information;
Obtain the sentence in described information, described sentence is carried out syntactic analysis and obtains potential evaluation object;
By trained condition random field CRF above-mentioned potential evaluation object is labeled and extraction obtains Final evaluation object;And
Judge whether this final evaluation object mates with the key word preset.
2. the method for claim 1, it is characterised in that described condition random field CRF is by manually The training data training of mark obtains.
3. method as claimed in claim 2, it is characterised in that by trained condition random field CRF Above-mentioned potential evaluation object is labeled and extraction obtains final evaluation object, specifically include:
Final review is obtained by calculating the probability of potential evaluation object by trained condition random field CRF Valency object.
4. The method of claim 3, characterized in that performing syntactic analysis on the sentence to obtain the potential evaluation object specifically comprises:
performing syntactic analysis on the sentence to obtain the node word corresponding to the root node ROOT of the sentence;
determining the node words whose relation to the node word corresponding to the root node ROOT is a subject-verb relation (SBV), verb-object relation (VOB), indirect-object relation (IOB), fronted-object relation (FOB), adverbial relation (ADV), coordinate relation (COO), verb-complement relation (CMP), or attributive relation (ATT), and forming a first child node set from these node words;
determining the node words contained in the first child node set as the potential evaluation object.
5. The method of claim 4, characterized in that the method further comprises:
determining the node words whose relation to a node word in the first child node set is a subject-verb relation (SBV), verb-object relation (VOB), indirect-object relation (IOB), fronted-object relation (FOB), adverbial relation (ADV), coordinate relation (COO), verb-complement relation (CMP), or attributive relation (ATT), and forming a second child node set from these node words;
also determining the node words contained in the second child node set as the potential evaluation object.
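The two-level selection of claims 4 and 5 can be sketched over a dependency parse given as a list of arcs. This is an illustration only: the `Arc` type, the 1-based head indexing with 0 for ROOT, and the hand-written example parse (including the `HED` and `RAD` labels used for arcs outside the claimed list) are assumptions, not output of any real parser.

```python
from typing import List, NamedTuple, Tuple

# Relation labels enumerated in claims 4 and 5.
RELATIONS = {"SBV", "VOB", "IOB", "FOB", "ADV", "COO", "CMP", "ATT"}


class Arc(NamedTuple):
    word: str
    head: int       # 1-based index of the head token; 0 means ROOT
    relation: str   # dependency label of the arc to the head


def child_sets(arcs: List[Arc]) -> Tuple[List[str], List[str]]:
    """Return (first child node set, second child node set) as word lists."""
    indexed = list(enumerate(arcs, start=1))
    # Token(s) attached directly to ROOT: the node word of the root node.
    root_ids = {i for i, a in indexed if a.head == 0}
    # First set: dependents of the root word with one of the listed relations.
    first_ids = [i for i, a in indexed
                 if a.head in root_ids and a.relation in RELATIONS]
    # Second set: dependents of first-set words with one of the listed relations.
    first_lookup = set(first_ids)
    second_ids = [i for i, a in indexed
                  if a.head in first_lookup and a.relation in RELATIONS]
    return ([arcs[i - 1].word for i in first_ids],
            [arcs[i - 1].word for i in second_ids])


# Hand-written parse of "this phone 's screen very clear" (hypothetical).
parse = [
    Arc("this", 2, "ATT"),
    Arc("phone", 4, "ATT"),
    Arc("'s", 2, "RAD"),
    Arc("screen", 6, "SBV"),
    Arc("very", 6, "ADV"),
    Arc("clear", 0, "HED"),
]
first, second = child_sets(parse)
potential_objects = first + second
```

Here the root word is "clear", the first set contains its dependents "screen" and "very", and the second set contains "phone", which modifies "screen". The word "this" is not selected because its head is not in the first set.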
6. The method of claim 5, characterized in that the method further comprises:
performing feature extraction on the first child node set and the second child node set of the potential evaluation object according to a sentiment element extraction rule, to obtain a confidence level of the potential evaluation object.
7. The method of claim 6, characterized in that the confidence level is used by the conditional random field (CRF) in calculating the probability of the potential evaluation object.
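How a confidence level might enter the CRF probability of claims 3 and 7 can be illustrated with a tiny linear-chain CRF evaluated by brute-force enumeration. Everything specific here is an assumption made for illustration: the two labels `O` and `T` (target), the single confidence feature with weight `w_conf`, the bias, the transition table, and the numeric values. A trained CRF would learn its weights from labeled data and would use forward-backward rather than enumeration.

```python
import itertools
import math
from typing import Dict, List, Sequence, Tuple

LABELS = ("O", "T")  # O = not an evaluation object, T = evaluation object


def sequence_score(labels: Sequence[str], confidences: Sequence[float],
                   w_conf: float, bias: float,
                   trans: Dict[Tuple[str, str], float]) -> float:
    """Unnormalized log-score of one label sequence."""
    score = 0.0
    for t, (y, c) in enumerate(zip(labels, confidences)):
        if y == "T":
            score += w_conf * c + bias      # confidence acts as a feature
        if t > 0:
            score += trans[(labels[t - 1], y)]
    return score


def target_marginals(confidences: Sequence[float], w_conf: float, bias: float,
                     trans: Dict[Tuple[str, str], float]) -> List[float]:
    """P(label_t = 'T') for each position, by enumerating all sequences."""
    n = len(confidences)
    z = 0.0
    mass = [0.0] * n
    for labels in itertools.product(LABELS, repeat=n):
        p = math.exp(sequence_score(labels, confidences, w_conf, bias, trans))
        z += p
        for t, y in enumerate(labels):
            if y == "T":
                mass[t] += p
    return [m / z for m in mass]


# Hypothetical weights; zero transitions make each position independent.
ZERO_TRANS = {(a, b): 0.0 for a in LABELS for b in LABELS}
confidences = [0.9, 0.1, 0.5]
marginals = target_marginals(confidences, w_conf=4.0, bias=-2.0, trans=ZERO_TRANS)
```

With these weights a candidate with higher confidence receives a higher probability of being labeled as an evaluation object, and candidates above a chosen threshold could be kept as final evaluation objects.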
8. The method of claim 1, characterized in that capturing the information that needs to be monitored specifically comprises:
capturing information according to a preset keyword;
classifying the captured information by means of a support vector machine (SVM) classifier to obtain the information that needs to be monitored.
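The classification step of claim 8 can be sketched with a small linear SVM trained by subgradient descent on the hinge loss. This is a sketch under stated assumptions: the bag-of-words features, the labels (+1 for information to monitor, -1 for information to discard), the hyperparameters, and the toy training texts are all hypothetical; the claim only states that an SVM classifier is used.

```python
from typing import Dict, List, Tuple

Model = Tuple[Dict[str, float], float]  # (weights per token, bias)


def featurize(text: str) -> Dict[str, float]:
    """Bag-of-words counts over whitespace tokens (an assumed feature set)."""
    feats: Dict[str, float] = {}
    for tok in text.lower().split():
        feats[tok] = feats.get(tok, 0.0) + 1.0
    return feats


def decision(model: Model, text: str) -> float:
    w, b = model
    return sum(w.get(k, 0.0) * v for k, v in featurize(text).items()) + b


def train_linear_svm(samples: List[Tuple[str, int]], epochs: int = 50,
                     lr: float = 0.1, reg: float = 0.01) -> Model:
    """Minimize L2-regularized hinge loss with per-sample subgradient steps."""
    w: Dict[str, float] = {}
    b = 0.0
    for _ in range(epochs):
        for text, y in samples:
            margin = y * decision((w, b), text)
            for k in w:                      # L2 shrinkage on existing weights
                w[k] *= 1.0 - lr * reg
            if margin < 1.0:                 # sample violates the margin
                for k, v in featurize(text).items():
                    w[k] = w.get(k, 0.0) + lr * y * v
                b += lr * y
    return w, b


def predict(model: Model, text: str) -> int:
    return 1 if decision(model, text) >= 0.0 else -1


training = [
    ("payment failed again", 1),
    ("refund never arrived", 1),
    ("payment app crashes", 1),
    ("lovely weather today", -1),
    ("great football match", -1),
    ("new recipe for dinner", -1),
]
model = train_linear_svm(training)
captured = ["payment failed refund never arrived", "lovely weather football match"]
to_monitor = [text for text in captured if predict(model, text) == 1]
```

Only the texts that the classifier labels +1 are passed on to sentence analysis; the rest are dropped before any parsing cost is incurred.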
9. The method of claim 1, characterized in that determining whether the final evaluation object matches the preset keyword specifically comprises:
comparing the final evaluation object with the keyword, and determining whether an intersection exists between the two;
if an intersection exists, confirming that the final evaluation object matches the keyword, and retaining the information;
otherwise, confirming that the final evaluation object does not match the keyword, and filtering out the information.
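The intersection test of claim 9 maps directly onto a set operation. In the minimal sketch below, the names `is_match` and `partition`, the exact-string comparison, and the sample data are assumptions for illustration; the claim itself only requires checking whether an intersection exists.

```python
from typing import Iterable, List, Set, Tuple


def is_match(final_objects: Iterable[str], keywords: Iterable[str]) -> bool:
    """True when the final evaluation objects and the keywords intersect."""
    return bool(set(final_objects) & set(keywords))


def partition(items: List[Tuple[str, Set[str]]],
              keywords: Set[str]) -> Tuple[List[str], List[str]]:
    """Split (text, final objects) pairs into retained and filtered texts."""
    retained: List[str] = []
    filtered: List[str] = []
    for text, objects in items:
        (retained if is_match(objects, keywords) else filtered).append(text)
    return retained, filtered


items = [
    ("The screen of this phone is very clear", {"screen", "phone"}),
    ("The weather was nice today", {"weather"}),
]
retained, filtered = partition(items, {"phone", "battery"})
```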
10. An information monitoring device, characterized by comprising:
a capture module, configured to capture information that needs to be monitored;
an acquisition module, configured to obtain a sentence in the information and perform syntactic analysis on the sentence to obtain a potential evaluation object;
an extraction module, configured to label and extract the potential evaluation object by means of a trained conditional random field (CRF) to obtain a final evaluation object; and
a determination module, configured to determine whether the final evaluation object matches a preset keyword.
11. The device of claim 10, characterized in that a syntactic parser is provided in the acquisition module; the syntactic parser is configured to analyze the sentence to obtain the node word corresponding to the root node ROOT of the sentence, and to set as a first child node set the node words whose relation to the node word corresponding to the root node ROOT is a subject-verb relation (SBV), verb-object relation (VOB), indirect-object relation (IOB), fronted-object relation (FOB), adverbial relation (ADV), coordinate relation (COO), verb-complement relation (CMP), or attributive relation (ATT).
12. The device of claim 11, characterized in that the syntactic parser is further configured to obtain, by analysis, the node words whose relation to a node word in the first child node set is a subject-verb relation (SBV), verb-object relation (VOB), indirect-object relation (IOB), fronted-object relation (FOB), adverbial relation (ADV), coordinate relation (COO), verb-complement relation (CMP), or attributive relation (ATT), and to set these node words as a second child node set.
13. The device of claim 12, characterized in that an extraction unit is provided in the acquisition module; the extraction unit is configured to extract the node words in the first child node set and the second child node set, and to set the extracted node words as the potential evaluation object.
14. The device of claim 13, characterized in that a confidence calculation unit is provided in the extraction module, configured to perform feature extraction on the potential evaluation object according to a sentiment element extraction rule and obtain a confidence level of the potential evaluation object.
15. The device of claim 14, characterized in that a probability calculation unit is provided in the extraction module, configured to calculate the probability of the potential evaluation object by means of the conditional random field (CRF) and the confidence level.
16. The device of claim 10, characterized in that an information grabber and a support vector machine (SVM) classifier are provided in the capture module; the information grabber is configured to capture information according to a preset keyword; the support vector machine (SVM) classifier is configured to classify the captured information to obtain the information that needs to be monitored.
17. The device of claim 10, characterized in that a comparing unit and a confirming unit are provided in the determination module; the comparing unit is configured to compare the final evaluation object with the preset keyword and determine whether an intersection exists between the two; the confirming unit is configured to confirm, according to whether the intersection exists, whether the information is filtered out.
CN201410833749.4A 2014-12-26 2014-12-26 Information monitoring method and device Active CN105786929B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410833749.4A CN105786929B (en) 2014-12-26 2014-12-26 Information monitoring method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201410833749.4A CN105786929B (en) 2014-12-26 2014-12-26 Information monitoring method and device

Publications (2)

Publication Number Publication Date
CN105786929A true CN105786929A (en) 2016-07-20
CN105786929B CN105786929B (en) 2019-09-03

Family

ID=56389035

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410833749.4A Active CN105786929B (en) 2014-12-26 2014-12-26 Information monitoring method and device

Country Status (1)

Country Link
CN (1) CN105786929B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102866989A (en) * 2012-08-30 2013-01-09 北京航空航天大学 Viewpoint extracting method based on word dependence relationship
CN103049435A (en) * 2013-01-04 2013-04-17 浙江工商大学 Text fine granularity sentiment analysis method and text fine granularity sentiment analysis device
CN103631961A (en) * 2013-12-17 2014-03-12 苏州大学张家港工业技术研究院 Method for identifying relationship between sentiment words and evaluation objects

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
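Du Shenzhi (杜慎芝): "Research on Microblog Sentiment Target Recognition Based on Conditional Random Fields" (基于条件随机场的微博情感对象识别研究), China Master's Theses Full-text Database, Information Science and Technology series *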

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106447508A (en) * 2016-10-20 2017-02-22 宁波江东大金佰汇信息技术有限公司 Improved high-quality node detection system based on computer large data in social network
CN107220238A (en) * 2017-05-24 2017-09-29 电子科技大学 A kind of text object abstracting method based on Mixed Weibull distribution
CN107766577A (en) * 2017-11-15 2018-03-06 北京百度网讯科技有限公司 A kind of public sentiment monitoring method, device, equipment and storage medium
CN107766577B (en) * 2017-11-15 2020-08-21 北京百度网讯科技有限公司 Public opinion monitoring method, device, equipment and storage medium

Also Published As

Publication number Publication date
CN105786929B (en) 2019-09-03

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20200924

Address after: Cayman Enterprise Centre, 27 Hospital Road, George Town, Grand Cayman Islands

Patentee after: Innovative advanced technology Co.,Ltd.

Address before: Cayman Enterprise Centre, 27 Hospital Road, George Town, Grand Cayman Islands

Patentee before: Advanced innovation technology Co.,Ltd.

Effective date of registration: 20200924

Address after: Cayman Enterprise Centre, 27 Hospital Road, George Town, Grand Cayman Islands

Patentee after: Advanced innovation technology Co.,Ltd.

Address before: A four-storey 847 mailbox in Grand Cayman Capital Building, British Cayman Islands

Patentee before: Alibaba Group Holding Ltd.

TR01 Transfer of patent right