CN106802887A - Word segmentation processing method and device, and electronic equipment
- Publication number
- CN106802887A CN201611263885.XA
- Authority
- CN
- China
- Prior art keywords
- word
- reflection
- comment content
- content
- evaluating
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/284—Lexical analysis, e.g. tokenisation or collocates
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/3331—Query processing
- G06F16/3332—Query translation
- G06F16/3335—Syntactic pre-processing, e.g. stopword elimination, stemming
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/3331—Query processing
- G06F16/334—Query execution
- G06F16/3344—Query execution using natural language analysis
Abstract
The present application discloses a word segmentation processing method and device, and an electronic device. The method includes: determining, in text that has undergone word segmentation, a word that reflects the comment content; and, when a predetermined relation is determined to hold between the word reflecting the comment content and a word neighboring it, merging the word reflecting the comment content with the neighboring word. With the technical solution of the present application, a larger segmentation granularity can be reached, which effectively improves the ability to analyze words that reflect comment content.
Description
Technical field
The present invention relates to the field of natural language processing, and in particular to a word segmentation processing method and device, and an electronic device.
Background
In Chinese, the word is the smallest meaningful language unit that can be used independently. Because Chinese has no separators between words, and words themselves lack obvious morphological markers, word segmentation is a basic technique for analyzing Chinese text and the foundation of all subsequent analysis. Different segmentation granularities differ in capability, so for a given Chinese text analysis task the segmentation granularity plays a key role in the accuracy of the analysis.
With the rapid growth of e-commerce, e-commerce platforms hold more and more product comment information. Analyzing these comments likewise requires word segmentation, and the segmentation granularity affects the ability to analyze, for example, comment attribute words and comment words.
Current word segmentation techniques rely heavily on manual work and are not intelligent or flexible enough, while some automatic approaches have low accuracy and struggle to reach the expected segmentation granularity.
Summary of the invention
In view of this, the present invention provides a word segmentation processing method and device, and an electronic device, suitable for comment information analysis. They can reach a larger segmentation granularity, effectively improve the ability to analyze words that reflect comment content (such as basic attribute words and comment words), and are intelligent and flexible.
Other features and advantages of the invention will become apparent from the detailed description below, or will be learned in part through practice of the invention.
According to one aspect of the present invention, a word segmentation processing method is provided, including:
determining, in text that has undergone word segmentation, a word that reflects the comment content; and
when a predetermined relation is determined to hold between the word reflecting the comment content and a word neighboring it, merging the word reflecting the comment content with the neighboring word.
In addition, the present invention also provides a word segmentation processing device, including:
a word determining module, configured to determine, in text that has undergone word segmentation, a word that reflects the comment content; and
a merging module, configured to merge the word reflecting the comment content with a word neighboring it when a predetermined relation holds between the two.
The present invention further provides an electronic device, including:
a processor; and
a memory storing a computer program executable on the processor;
wherein the processor, when executing the computer program, implements the steps of the method described above.
The present invention also provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the steps of the method described above.
With the word segmentation processing method, device, and electronic device according to embodiments of the present invention, the word reflecting the comment content can be determined automatically and, on that basis, a check of the predetermined relation decides automatically whether to merge that word with a neighboring word. The merged text can thus reach a larger segmentation granularity, and the processing is intelligent and flexible and can reach higher accuracy.
It should be understood that the general description above and the detailed description below are only exemplary and do not limit the present invention.
Brief description of the drawings
The above and other objects, features, and advantages of the invention will become more apparent from the detailed description of its example embodiments with reference to the accompanying drawings.
Fig. 1 is a flow chart of a word segmentation processing method according to an exemplary embodiment.
Fig. 2 is a flow chart of a word segmentation processing method according to an exemplary embodiment.
Fig. 3 is a flow chart of a word segmentation processing method according to an exemplary embodiment.
Fig. 4 is a schematic diagram of the principle of a word segmentation processing method according to an exemplary embodiment.
Fig. 5A is a flow chart of a word segmentation processing method according to an exemplary embodiment.
Fig. 5B is a flow chart of a word segmentation processing method according to an exemplary embodiment.
Fig. 6A is a flow chart of a word segmentation processing method according to an exemplary embodiment.
Fig. 6B is a flow chart of a word segmentation processing method according to an exemplary embodiment.
Fig. 6C is a schematic diagram of the principle of a word segmentation processing method according to an exemplary embodiment.
Fig. 7A is a flow chart of a word segmentation processing method according to an exemplary embodiment.
Fig. 7B is a flow chart of a word segmentation processing method according to an exemplary embodiment.
Fig. 7C and Fig. 7D are schematic diagrams of the principle of a word segmentation processing method according to an exemplary embodiment.
Fig. 8A is a flow chart of a word segmentation processing method according to an exemplary embodiment.
Fig. 8B is a flow chart of a word segmentation processing method according to an exemplary embodiment.
Fig. 9 is a block diagram of a word segmentation processing device according to an exemplary embodiment.
Fig. 10 is a block diagram of a word segmentation processing device according to an exemplary embodiment.
Fig. 11 is a block diagram of a word segmentation processing device according to an exemplary embodiment.
Fig. 12A is a block diagram of a word segmentation processing device according to an exemplary embodiment.
Fig. 12B is a block diagram of a word segmentation processing device according to an exemplary embodiment.
Fig. 13 is a block diagram of an electronic device according to an exemplary embodiment.
Detailed description
Example embodiments will now be described more fully with reference to the accompanying drawings. Example embodiments can, however, be implemented in many forms and should not be construed as limited to the examples set forth here; rather, these embodiments are provided so that the present invention will be thorough and complete and will fully convey the concept of the example embodiments to those skilled in the art. The drawings are only schematic illustrations of the invention and are not necessarily drawn to scale. Identical reference numerals in the drawings denote identical or similar parts, so their repeated description is omitted.
In addition, the described features, structures, or characteristics may be combined in any suitable manner in one or more embodiments. The following description provides many specific details to give a full understanding of the embodiments of the invention. Those skilled in the art will appreciate, however, that the technical solution of the invention may be practiced with one or more of the specific details omitted, or with other methods, components, devices, steps, and so on. In other cases, well-known structures, methods, devices, implementations, or operations are not shown or described in detail, to avoid obscuring aspects of the invention.
Fig. 1 is a flow chart of a word segmentation processing method according to an embodiment of the present invention. In this embodiment, the word segmentation processing method may include:
Step S1: in text that has undergone word segmentation, determine the word that reflects the comment content.
In this embodiment, the words that can reflect the comment content are extracted first. A word reflecting the comment content generally means a word that expresses the core content of the comment sentence. Take user comments on a take-out platform as an example. Suppose a user comments "the delivery speed of the take-out is worth trusting". The words reflecting the comment content can be "speed" and "trust": "speed" is the subject of the comment, and "trust" is the user's core opinion. "Take-out" and "delivery" both modify "speed", and "worth" is a modal verb that forms a phrase with "trust", so none of these reflects the core of the comment. For a take-out platform, user comments show certain statistical regularities; for example, comment subjects such as "speed", "environment", "attitude", and "service" occur frequently. The words reflecting comment content can therefore form a predetermined set, which is used to determine those words in a given segmented text. Of course, the way of determining the word reflecting the comment content is not limited to this.
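Step S1 with a predetermined set can be sketched as follows. This is a minimal illustration, assuming the segmented text is a list of words and the predetermined set is a plain in-memory set; English glosses stand in for the Chinese words, and `CORE_WORDS` and `find_core_words` are illustrative names that do not come from the patent.

```python
# Hypothetical predetermined set of words that reflect comment content
# (comment subjects and opinion words, written as English glosses).
CORE_WORDS = {"speed", "environment", "attitude", "service", "trust"}


def find_core_words(tokens):
    """Return (index, word) pairs for segmented words found in the predetermined set."""
    return [(i, w) for i, w in enumerate(tokens) if w in CORE_WORDS]


# Segmented form of "the delivery speed of the take-out is worth trusting".
tokens = ["take-out", "delivery", "speed", "worth", "trust"]
print(find_core_words(tokens))  # [(2, 'speed'), (4, 'trust')]
```

Keeping the index alongside each word makes it easy to look at the neighboring words in the later steps.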
Step S3: when a predetermined relation is determined to hold between the word reflecting the comment content and a word neighboring it, merge the word reflecting the comment content with the neighboring word.
In this embodiment, after a word reflecting the comment content is extracted, if the relation between that word and a neighboring word is detected to satisfy a predetermined relation, for example a grammatical relation or a part-of-speech collocation relation, the two can be merged, which yields text with a larger segmentation granularity than before the merge. This raises the level of intelligence and the flexibility of the machine or system, and improves the accuracy of merging.
"Segmentation granularity" is a term from computational linguistics: the number of Chinese characters that a Chinese word contains. For example, the segmentation granularity of "speed" is 2 and that of "delivery speed" is 4, counted in the characters of the original Chinese words. Starting from a given word, as merging increases the segmentation granularity, the meaning expressed becomes more definite, which helps further analysis and processing of the comment content.
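The character count can be illustrated with a short sketch. It assumes the original Chinese words are 速度 ("speed") and 配送速度 ("delivery speed"), which this translated text does not show, and `granularity` is an illustrative name.

```python
def granularity(word: str) -> int:
    """Segmentation granularity: the number of characters in a word."""
    return len(word)


print(granularity("速度"))      # 2
print(granularity("配送速度"))  # 4
```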
Fig. 2 is a flow chart of another word segmentation processing method according to an embodiment of the present invention. Compared with the method shown in Fig. 1, this method further includes a judging step S2. Specifically:
In step S1, the word reflecting the comment content is determined.
In step S2, it is determined whether a predetermined relation holds between the word reflecting the comment content and a word neighboring it.
If step S2 determines that the predetermined relation holds, step S3 is performed. If step S2 determines that it does not hold, the word reflecting the comment content and the neighboring word are not suitable for merging, so no merge is performed.
In step S3, the word reflecting the comment content is merged with the neighboring word.
Referring to Fig. 3 and Fig. 4, in some embodiments the text may first undergo word segmentation before step S1. The word segmentation may include:
Step S5: cut the text into words. Word cutting splits the text into multiple small-granularity words, each of smaller granularity than the text before cutting.
Step S6: perform part-of-speech tagging and dependency syntax annotation on the words in the cut text, that is, on the small-granularity words obtained from word cutting.
Referring to Fig. 4, the text is first split into small-granularity words, the most basic units, which are then annotated with parts of speech and dependency relations. These annotations are the basis of the subsequent processing. For example, the check in step S2 may include checking whether a predetermined syntactic dependency relation is satisfied and/or whether a predetermined part-of-speech pattern is satisfied, and that check presupposes the part-of-speech tags and dependency annotations. Take the comment "the delivery speed makes one gasp in admiration": "delivery" and "speed" form a modifier relation (ATT), "delivery" is a verb (v), and "speed" is a noun (n), and all of these are annotated. Besides the modifier relation, the annotations may include the pivotal construction (DBL) relation, the verb-object (VOB) relation, and so on, and the part-of-speech tags may include parts of speech other than noun and verb, which are not listed here. Those skilled in the art can implement part-of-speech tagging and dependency annotation with existing natural language processing techniques; the present application places no specific limitation on this.
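One possible in-memory shape for the annotated words is sketched below. It is an assumption for illustration only: the `Token` fields and the `relation_between` helper are hypothetical names, and only the annotations stated above (the tags v and n and the ATT relation) are filled in.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Token:
    text: str               # small-granularity word (English gloss here)
    pos: str                # part-of-speech tag, e.g. "v" or "n"
    head: Optional[int]     # index of the word this one depends on, None if not modeled
    deprel: Optional[str]   # dependency label toward the head, e.g. "ATT"


# Only the annotations given in the text are filled in; the rest is left unset.
tokens = [
    Token("delivery", "v", head=1, deprel="ATT"),
    Token("speed", "n", head=None, deprel=None),
]


def relation_between(tokens, dep_idx, head_idx):
    """Return the dependency label if tokens[dep_idx] depends on tokens[head_idx]."""
    tok = tokens[dep_idx]
    return tok.deprel if tok.head == head_idx else None


print(relation_between(tokens, 0, 1))  # ATT
```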
In some embodiments, the word neighboring the word reflecting the comment content in step S2 may include a word adjacent to it.
"Adjacent" is the basic and most common case of "neighboring". Taking the user comment "take-out delivery speed is worth trusting" as an example, the words adjacent to "speed" include "delivery" and "worth", that is, the preceding-adjacent and following-adjacent cases. "Adjacent" can be preset to mean preceding-adjacent or following-adjacent; for example, with preceding-adjacent preset, "speed" serves as the suffix and is paired with "delivery". The description below is based on the "adjacent" case, but "neighboring" in the present invention is not limited to it. In other examples, function words may stand between the word reflecting the comment content and the word that would otherwise be adjacent to it. For example, in the sentence "the speed of the delivery of the take-out is worth trusting", a particle (rendered "of" in translation) stands between "delivery" and "speed"; it is an auxiliary word, which is a kind of function word. In such cases, function words of this kind can first be detected and removed before the subsequent steps are carried out.
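Removing function words before looking for the adjacent word might look like the sketch below. It is illustrative only: the gloss "of" stands in for the Chinese particle, and `FUNCTION_WORDS`, `drop_function_words`, and `preceding_adjacent` are hypothetical names.

```python
# Hypothetical function-word list; "of" is a stand-in gloss for the particle.
FUNCTION_WORDS = {"of"}


def drop_function_words(tokens):
    """Remove function words so that content words become adjacent."""
    return [w for w in tokens if w not in FUNCTION_WORDS]


def preceding_adjacent(tokens, core):
    """Return the word directly before `core`, or None if `core` comes first."""
    i = tokens.index(core)
    return tokens[i - 1] if i > 0 else None


tokens = ["take-out", "of", "delivery", "of", "speed", "worth", "trust"]
cleaned = drop_function_words(tokens)
print(preceding_adjacent(cleaned, "speed"))  # delivery
```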
Referring to Fig. 5A, in some embodiments, satisfying the predetermined relation in step S2 may include satisfying a predetermined syntactic dependency relation. Specifically, step S2 may include:
S201: determine whether a predetermined syntactic dependency relation holds between the word reflecting the comment content and the word adjacent to it. If it holds, the predetermined relation can be judged as satisfied, and step S3 follows.
That is, it is judged whether the word reflecting the comment content and its adjacent word satisfy the predetermined syntactic dependency relation; if so, the merge can be performed.
Dependency syntax was first proposed by the French linguist L. Tesnière. It analyzes a sentence into a dependency tree that describes the dependency relations between words, that is, the syntactic collocation relations between words, which are associated with semantics.
By analyzing the predetermined syntactic dependency relation, words with a predetermined collocation relation can be merged, which makes recognition and merging more flexible and intelligent. For example, suppose the word reflecting the comment content is "speed". Many words can collocate with "speed", as in delivery speed, meal-delivery speed, production speed, and service speed, and users choose different collocations according to their own speech habits, so the words that collocate with "speed" form an open set. These collocations nevertheless follow a regularity; for example, they all satisfy the modifier relation (ATT). The idea of this embodiment is to use that regularity to recognize or screen suitable collocations: as long as the predetermined syntactic dependency relation is satisfied, it does not matter which specific word collocates with "speed". This gives the method of the present application a good level of intelligence and flexibility and yields good merge results.
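Screening an open set of collocations by relation label rather than by word list can be sketched as follows. The annotations are hypothetical, including the non-matching "praise" example with a VOB label, and `ALLOWED_DEPRELS` and `passes_dependency_check` are illustrative names.

```python
# Predetermined dependency relations accepted for merging (here only the modifier relation).
ALLOWED_DEPRELS = frozenset({"ATT"})


def passes_dependency_check(deprel: str) -> bool:
    """S201-style check: is the relation to the core word one of the predetermined ones?"""
    return deprel in ALLOWED_DEPRELS


# Hypothetical annotations: each preceding word with its relation to "speed".
preceding = [
    ("delivery", "ATT"),
    ("meal-delivery", "ATT"),
    ("production", "ATT"),
    ("praise", "VOB"),
]
mergeable = [f"{w} speed" for w, rel in preceding if passes_dependency_check(rel)]
print(mergeable)  # ['delivery speed', 'meal-delivery speed', 'production speed']
```

The check never inspects the collocating word itself, which is what lets it handle words that were not anticipated in advance.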
Further, referring to Fig. 5B, in some embodiments, satisfying the predetermined relation in step S2 may also include satisfying a predetermined part-of-speech pattern. That is, step S2 may also include:
S202: determine whether a predetermined part-of-speech pattern holds between the word reflecting the comment content and the word adjacent to it.
Combined with S201, if both the predetermined syntactic dependency relation and the predetermined part-of-speech pattern hold between the word reflecting the comment content and its adjacent word, the predetermined relation is judged as satisfied.
If the result of step S201 is negative, the predetermined relation is determined not to hold between the word reflecting the comment content and its adjacent word, so no merge is performed.
Likewise, if the result of step S202 is negative, the predetermined relation is determined not to hold between the word reflecting the comment content and its adjacent word, so no merge is performed.
On top of recognition and screening by syntactic dependency relation, the part-of-speech pattern can be used for further recognition and screening; that is, both the predetermined syntactic dependency relation and the predetermined part-of-speech pattern must be satisfied to pass the check. The predetermined part-of-speech pattern thus acts as an additional verification that further improves the accuracy of recognition and screening.
In Fig. 5B, the judgment of step S202 follows the judgment of step S201. This double judgment can improve the accuracy of granularity customization. In one embodiment, only the judgment of step S202 may be performed, without that of step S201. In other words, judging whether the predetermined relation is satisfied in the present application may include judging whether the predetermined syntactic dependency relation is satisfied and/or judging whether the predetermined part-of-speech pattern is satisfied.
The checks of the syntactic dependency relation and of the part-of-speech pattern are illustrated in more detail below.
In some embodiments, determining the word reflecting the comment content in step S1 may include determining a basic attribute word that reflects the comment content. A basic attribute word refers to the object being commented on. For example, in the text "the delivery speed of the take-out is worth trusting", "speed" is the basic attribute word; in the text "the hall's environment is very clean", "environment" is the basic attribute word. The check of the syntactic dependency relation is described in detail below for basic attribute words.
Referring to Fig. 6A, in this embodiment, satisfying the predetermined relation may include satisfying a predetermined syntactic dependency relation. The predetermined syntactic dependency relation corresponding to a basic attribute word may include: the basic attribute word and the adjacent word preceding it have a modifier relation.
That is, step S201 may be implemented as:
Step S201a: determine whether the basic attribute word and the adjacent word preceding it have a modifier relation. If so, the judgment in step S2 is that the relation is satisfied.
For example, referring to Fig. 6C, the text is "the delivery speed of AA take-out is really worth affirming", where AA may be the name of a take-out brand. Word segmentation is performed first, and the basic attribute word "speed" reflecting the comment content is determined. Then it is determined whether "speed" and its preceding adjacent word satisfy the modifier relation (ATT), regardless of what that preceding word specifically is. For example, "delivery speed", "meal-delivery speed", "take-out speed", and "rider speed" all satisfy the modifier relation (the example in Fig. 6C is "delivery speed"), so they pass the check and can be merged. The phrase formed by merging expresses a more precise meaning, which facilitates subsequent processing.
Referring to Fig. 6B, in this embodiment, the part-of-speech pattern corresponding to a basic attribute word may include: the adjacent word preceding the basic attribute word forms, together with the basic attribute word, a verb-modifying-noun pattern or a noun-modifying-noun pattern.
That is, S202 may be implemented as:
Step S202a: determine whether the basic attribute word and the adjacent word preceding it have a predetermined part-of-speech pattern, where the predetermined part-of-speech pattern may include a verb-plus-noun pattern or a noun-plus-noun pattern. If the results of both step S201a and step S202a are positive, the predetermined relation is determined to hold between the word reflecting the comment content and its neighboring word.
Referring to Fig. 6C, in other words, the predetermined relation is determined to hold when both the modifier relation and the predetermined part-of-speech pattern are satisfied. In the verb-plus-noun pattern, the word preceding the basic attribute word (which serves as the suffix) is a verb, and the basic attribute word is a noun. For example, "delivery" and "meal-delivery" are verbs, so "delivery speed" and "meal-delivery speed" satisfy the verb-plus-noun pattern. In the noun-plus-noun pattern, the word preceding the basic attribute word is a noun, and the basic attribute word is also a noun. For example, "take-out" and "rider" are nouns, so "take-out speed" and "rider speed" satisfy the noun-plus-noun pattern. If either pattern is satisfied and the modifier relation is satisfied as well, the check passes and the merge of step S3 can be performed. The example in Fig. 6C, "delivery speed", satisfies the verb-plus-noun pattern and therefore passes the check. Of course, the verb-plus-noun and noun-plus-noun patterns in this embodiment are only examples; the predetermined part-of-speech pattern of the present invention is not limited to these two.
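The double check for a basic attribute word, followed by the merge, can be sketched as below. This is a simplified illustration under stated assumptions: each token stores the dependency label toward the word that follows it (`rel_to_next`, a hypothetical simplification of a real dependency tree), merged glosses are joined with a space, and all names are illustrative.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Token:
    text: str
    pos: str           # "v", "n", ...
    rel_to_next: str   # assumed: dependency label toward the following word, "" if none


# Predetermined part-of-speech patterns: verb plus noun, noun plus noun.
POS_PATTERNS = {("v", "n"), ("n", "n")}


def merge_attribute_word(tokens: List[Token], i: int) -> List[Token]:
    """Merge tokens[i], a basic attribute word, with its preceding adjacent word
    when the modifier relation (S201a) and the part-of-speech pattern (S202a) both hold."""
    if i == 0:
        return tokens
    prev, core = tokens[i - 1], tokens[i]
    dep_ok = prev.rel_to_next == "ATT"
    pos_ok = (prev.pos, core.pos) in POS_PATTERNS
    if not (dep_ok and pos_ok):
        return tokens  # check failed: leave the segmentation unchanged
    merged = Token(f"{prev.text} {core.text}", core.pos, core.rel_to_next)
    return tokens[: i - 1] + [merged] + tokens[i + 1:]


tokens = [Token("AA take-out", "n", ""), Token("delivery", "v", "ATT"), Token("speed", "n", "")]
print([t.text for t in merge_attribute_word(tokens, 2)])  # ['AA take-out', 'delivery speed']
```

Returning the original list when a check fails mirrors the "no merge" branch of step S2.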
The part of speech of a word can be judged by comparing it against predetermined dictionaries. For example, a verb dictionary can be preset that records verbs common in comment content, such as "delivery", "meal-delivery", and "service". The word to be judged is compared with the entries of the verb dictionary; if it belongs to the dictionary, it is judged to be a verb.
Dictionaries for more specific parts of speech can also be preset, such as a modal verb dictionary that records modal verbs common in comment content, for example "make one" and "need". The word to be judged is compared with the entries of the modal verb dictionary; if it belongs to the dictionary, it is judged to be a modal verb. In this way, the part of speech of the word can be judged at a finer level. Because the modal verbs common in comment content are limited in number, comparison against a predetermined dictionary is a convenient and suitable way to judge part of speech.
Of course, dictionary comparison is not limited to the verbs and modal verbs of the above examples; it also applies to nouns, pivotal nouns, command verbs, and so on.
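Dictionary comparison can be sketched as a lookup over preset word sets. The dictionary contents below reuse the glosses from the examples above; the tag names, the `lookup_pos` helper, and the choice to consult the more specific dictionary first are assumptions made for illustration.

```python
# Preset dictionaries (English glosses of the example entries).
VERB_DICT = {"delivery", "meal-delivery", "service"}
MODAL_VERB_DICT = {"make one", "need"}

# Assumed ordering: more specific dictionaries are consulted first.
POS_DICTS = [("modal_verb", MODAL_VERB_DICT), ("verb", VERB_DICT)]


def lookup_pos(word):
    """Return the tag of the first dictionary that contains the word, else None."""
    for tag, lexicon in POS_DICTS:
        if word in lexicon:
            return tag
    return None


print(lookup_pos("delivery"))  # verb
print(lookup_pos("need"))      # modal_verb
print(lookup_pos("speed"))     # None
```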
The double check of modifier relation and part-of-speech pattern yields merge results that are more accurate or closer to expectations.
In some embodiments, besides a basic attribute word, the word reflecting the comment content may also be an evaluation word that reflects the user's opinion. An evaluation word embodies the user's opinion; for example, in the segmented text "the delivery speed of AA take-out is really worth affirming", "affirm" can be determined to be the evaluation word. The check of the syntactic dependency relation is described in detail below for evaluation words.
Referring to Fig. 7A, in this embodiment, determining the word reflecting comment content includes: determining an evaluating word reflecting the user's viewpoint. The predetermined syntactic dependency relation corresponding to the evaluating word may include: the evaluating word and the adjacent word located before the evaluating word have a verb-object construction relation or a pivotal-construction plus verb-object relation.
That is, step S201 can be implemented as:
Step S201b, determining whether the evaluating word and the adjacent word located before the evaluating word have a verb-object construction relation or a pivotal-construction plus verb-object relation. If so, it can be determined that the predetermined relationship is met between the word reflecting comment content and the word neighbouring the word reflecting comment content.
For example, referring to Fig. 7C and Fig. 7D, the text is "the dispatching speed of AA takeout is really worth affirming". Word segmentation processing is performed first, and the evaluating word "affirming" reflecting comment content is determined. Then it is determined whether "affirming" and the preceding adjacent word have a verb-object (VOB) relation or a pivotal-construction (DBL) plus verb-object (VOB) relation. For example, "worth affirming" has a verb-object construction relation and therefore passes the verification. As another example, referring to Fig. 7D, "make people gasp in admiration" has a pivotal-construction plus verb-object relation and therefore passes the verification. After the verification is passed, the merging of step S3 can be carried out. The phrase formed by merging expresses the meaning more precisely, which facilitates subsequent processing.
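The dependency verification for an evaluating word can be sketched as below. This is a hedged illustration only: it assumes that each token stores the index of its governing word and the label of that arc (VOB, DBL, HED), which is a common output format for dependency parsers but is not specified in the text. The `Token` class, the function name and the arc directions in the sample sentences are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Token:
    text: str
    head: int    # index of the governing token, -1 for the root
    deprel: str  # label of the arc from the head to this token


def passes_dependency_check(tokens: List[Token], i: int) -> bool:
    """Check the evaluating word at index i against the words before it."""
    word = tokens[i]
    if i < 1 or word.deprel != "VOB":
        return False
    # Case 1: verb-object relation with the adjacent preceding word.
    if word.head == i - 1:
        return True
    # Case 2: a pivotal word (DBL) sits directly before the evaluating word,
    # and both are governed by the verb one position further back.
    pivot = tokens[i - 1]
    return i >= 2 and pivot.deprel == "DBL" and pivot.head == word.head == i - 2


# Sample parses, written by hand for illustration.
worth_affirming = [Token("worth", -1, "HED"), Token("affirming", 0, "VOB")]
make_people_gasp = [
    Token("make", -1, "HED"),
    Token("people", 0, "DBL"),
    Token("gasp in admiration", 0, "VOB"),
]
```

Calling `passes_dependency_check(worth_affirming, 1)` exercises the verb-object case, and `passes_dependency_check(make_people_gasp, 2)` exercises the pivotal-construction case.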
Further, meeting the predetermined relationship in step S2 may also include meeting a predetermined part-of-speech pattern. In this embodiment, step S202 may include step S202b and step S202c.
In step S202b, if the evaluating word and the adjacent word located before the evaluating word have a verb-object construction relation, it is judged whether the part-of-speech pattern is met, that is, whether the adjacent word located before the evaluating word and the evaluating word constitute a modal verb plus verb pattern; if so, it is judged that the predetermined relationship is met.
In step S202c, if the evaluating word and the adjacent word located before the evaluating word have a pivotal-construction plus verb-object relation, it is judged whether the part-of-speech pattern is met, that is, whether the adjacent words located before the evaluating word and the evaluating word constitute a causative verb plus pivotal noun plus verb pattern; if so, it is judged that the predetermined relationship is met.
Referring also to Fig. 7C, each of the different syntactic dependency relations in step S201b corresponds to a part-of-speech pattern. For example, "worth affirming" has a verb-object construction relation; meanwhile, "worth" is a modal verb (v) and "affirming" is a verb (v), so "worth affirming" also constitutes a modal verb plus verb pattern. Both conditions are met, so "worth affirming" is judged to meet the predetermined relationship, that is, it passes the verification. As another example, referring to Fig. 7D, "make people gasp in admiration" has a pivotal-construction plus verb-object relation; meanwhile, "make" is a causative verb (v), "people" is a pivotal noun (n) and "gasp in admiration" is a verb (v), so "make people gasp in admiration" also matches the causative verb plus pivotal noun plus verb pattern. Both conditions are met, so "make people gasp in admiration" is judged to meet the predetermined relationship, that is, it passes the verification. Of course, the modal verb plus verb pattern and the causative verb plus pivotal noun plus verb pattern in this embodiment are only exemplary; the predetermined part-of-speech patterns for evaluating words mentioned in the present invention are not limited to these two part-of-speech verification patterns.
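The part-of-speech half of the double check, followed by the merge, might look like the sketch below. It assumes the dependency relation has already been verified and that each word carries a fine-grained tag (for instance obtained by dictionary comparison); the tag names, the pattern list and both function names are hypothetical and chosen only for illustration.

```python
from typing import List, Optional, Tuple

Tagged = Tuple[str, str]  # (text, fine-grained part-of-speech tag)

# Hypothetical patterns; each one ends at the evaluating word.
EVALUATION_PATTERNS = [
    ("causative_verb", "pivot_noun", "verb"),
    ("modal_verb", "verb"),
]


def match_pattern_start(tokens: List[Tagged], i: int) -> Optional[int]:
    """Return the start index of a pattern that ends at index i, else None."""
    for pattern in EVALUATION_PATTERNS:
        start = i - len(pattern) + 1
        if start < 0:
            continue
        tags = tuple(pos for _, pos in tokens[start:i + 1])
        if tags == pattern:
            return start
    return None


def merge_if_matched(tokens: List[Tagged], i: int) -> List[Tagged]:
    """Merge the matched span into one phrase; otherwise return a copy."""
    start = match_pattern_start(tokens, i)
    if start is None:
        return list(tokens)
    # English glosses are joined with a space; Chinese words would simply
    # be concatenated.
    phrase = " ".join(text for text, _ in tokens[start:i + 1])
    return tokens[:start] + [(phrase, "phrase")] + tokens[i + 1:]


sentence = [
    ("dispatching speed", "noun"),
    ("really", "adverb"),
    ("worth", "modal_verb"),
    ("affirming", "verb"),
]
merged = merge_if_matched(sentence, 3)
```

In this sample, the last two tokens of `sentence` are replaced in `merged` by the single phrase "worth affirming", while the other tokens are left unchanged.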
The method for distinguishing verbs from modal verbs can refer to the above method of comparison against a predetermined dictionary. For example, a modal verb dictionary is preset in which common modal verbs are entered, and it is used to judge whether the word to be judged is a modal verb. Details are not repeated here.
In addition, the words reflecting comment content in the present invention are not limited to base attribute words or evaluating words, and the predetermined relationship for base attribute words is not limited to the syntactic dependency relation or the predetermined part-of-speech pattern.
Referring to Fig. 8A, the word reflecting comment content in step S1 can be determined in the following manner. In some embodiments, determining the word reflecting comment content in step S1 may include:
Step S103, establishing an evaluation dictionary;
Step S104, comparing the words in the text with the words in the evaluation dictionary to determine whether a word in the text is a word reflecting comment content.
For a given takeout platform, the content that users pay attention to and the objects that users comment on exhibit certain statistical regularities. For example, "speed", "environment" and "service" are common base attribute words with reference value, while "trust", "praise", "affirming" and "good" are common evaluating words with reference value. Therefore, an evaluation dictionary can be established in advance, and these frequently occurring words with reference value are entered into it. When a specific text is analysed, the words in the text are compared with the words in the evaluation dictionary to determine whether a word in the text is a word reflecting comment content.
The comparison can then be combined with analysis of the syntactic dependency relation, the part of speech and so on to finally determine the word reflecting comment content. This is because a word that merely belongs to the dictionary is not necessarily a comment object or evaluation content in an evaluating sentence; it may also be irrelevant content that the user posted casually. Therefore, the dictionary and the analysis of the sentence can be combined to judge comprehensively whether a word is a word reflecting comment content.
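Combining dictionary membership with sentence analysis can be sketched as follows. The part-of-speech filter used here is only a simple stand-in for the sentence analysis; the text does not prescribe this particular rule. The dictionary contents, the tag letters and the function name are assumptions for the example.

```python
from typing import List, Tuple

# Hypothetical evaluation dictionary, split by word category.
ATTRIBUTE_WORDS = {"speed", "environment", "service"}
EVALUATION_WORDS = {"trust", "affirming", "good"}

# Assumed stand-in for sentence analysis: a dictionary hit only counts when
# the word also carries a part of speech expected for its category.
EXPECTED_POS = {"attribute": {"n"}, "evaluation": {"v", "a"}}


def find_comment_words(tokens: List[Tuple[str, str]]) -> List[Tuple[int, str]]:
    """Return (index, category) for each word that reflects comment content."""
    found = []
    for index, (text, pos) in enumerate(tokens):
        if text in ATTRIBUTE_WORDS and pos in EXPECTED_POS["attribute"]:
            found.append((index, "attribute"))
        elif text in EVALUATION_WORDS and pos in EXPECTED_POS["evaluation"]:
            found.append((index, "evaluation"))
    return found
```

With this rule, "service" tagged as a noun is accepted as a base attribute word, while the same word tagged as a verb is ignored even though it appears in the dictionary.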
The dictionary can be established in the following two ways, but the invention is not limited to them.
The first way is manual establishment, that is, the evaluation dictionary is established according to manual input. For example, past user comments are statistically analysed by hand, and common words with reference value are selected and included in the evaluation dictionary.
The other way is for the system to establish the dictionary automatically and improve it automatically. Referring to Fig. 8B, in some embodiments, establishing the evaluation dictionary in step S103 may include:
Step S1031, counting the number of occurrences or the frequency of each word in multiple texts;
Step S1032, including the word in the evaluation dictionary when the number of occurrences or the frequency is greater than a predetermined value.
The system can count the frequency of each word that appeared in past comment texts, select the words with a higher number of occurrences or frequency as target evaluation attribute words or target evaluating words, and enter them into the corresponding dictionary. For example, if the system detects that, among the words whose part of speech is noun, "speed" occurs with a frequency higher than the preset frequency, "speed" is entered into the target evaluation attribute word dictionary.
The two ways above can also be used in combination, for example manual establishment first and automatic improvement afterwards. As another example, after establishment the dictionary can be improved both manually and automatically: automatic identification and entry by the system is the main mechanism, but manual modification is allowed, for example deleting words that the system identified by mistake and/or adding words reflecting comment content that the system failed to identify.
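The automatic establishment of the dictionary and its manual correction can be sketched as follows. This is a minimal illustration that assumes the comments are already word-segmented and that the threshold applies to raw counts; the function names and the sample data are hypothetical.

```python
from collections import Counter
from typing import Iterable, List, Set


def build_dictionary(segmented_texts: Iterable[List[str]], min_count: int) -> Set[str]:
    """Count each word over all texts and keep those above the threshold."""
    counts = Counter(word for words in segmented_texts for word in words)
    return {word for word, count in counts.items() if count > min_count}


def apply_manual_review(auto_words: Set[str], added: Set[str], removed: Set[str]) -> Set[str]:
    """Add words the system missed, then drop words it identified by mistake."""
    return (auto_words | added) - removed


# Invented sample comments, already segmented into words.
comments = [
    ["speed", "fast"],
    ["speed", "hello"],
    ["service", "good", "hello"],
    ["speed", "service"],
]
auto = build_dictionary(comments, min_count=1)
final = apply_manual_review(auto, added={"environment"}, removed={"hello"})
```

A relative frequency could be used instead of a raw count by dividing each count by the total number of words before comparing it with the threshold.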
It should be clearly understood that the present disclosure describes how to make and use particular examples, but the principles of the invention are not limited to any details of these examples. Rather, based on the teaching of the present disclosure, these principles can be applied to many other embodiments.
Those skilled in the art will appreciate that all or part of the steps of the above embodiments may be implemented as a computer program executed by a CPU. When the computer program is executed by the CPU, the above functions defined by the method provided by the present invention are performed. The program can be stored in a computer-readable storage medium, which may be a read-only memory, a magnetic disk, an optical disc or the like.
In addition, it should be noted that the above drawings are only schematic illustrations of the processing included in the method according to exemplary embodiments of the invention, and are not intended to be limiting. It is easy to understand that the processing shown in the drawings does not indicate or limit the chronological order of these processes. It is also easy to understand that these processes may be performed, for example, synchronously or asynchronously in multiple modules.
The following are device embodiments of the present invention, which can be used to perform the method embodiments of the present invention. For details not disclosed in the device embodiments, refer to the method embodiments.
Referring to Fig. 9, the present invention provides a word segmentation processing device 100, which may include:
A word determining module 110, for determining a word reflecting comment content in a text on which word segmentation processing has been performed.
A merging module 120, for merging the word reflecting comment content with the word neighbouring the word reflecting comment content in a case where it is determined that the predetermined relationship is met between the word reflecting comment content and the word neighbouring the word reflecting comment content.
Referring to Fig. 10, the device may further include a removing module 130. The removing module 130 can delete a function word in a case where the function word exists between the word reflecting comment content and the word neighbouring the word reflecting comment content.
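The behaviour of the removing module can be sketched as below. The tag "u" for function words and the placeholder token "DE", which stands for a structural particle, are assumptions for illustration; the text does not say how function words are recognised.

```python
from typing import List, Tuple

Tagged = Tuple[str, str]  # (text, part-of-speech tag)

FUNCTION_WORD_TAGS = {"u"}  # assumed tag for particles and similar words


def delete_function_words_between(tokens: List[Tagged], left: int, right: int) -> List[Tagged]:
    """Remove function words located strictly between indices left and right."""
    return [
        token
        for index, token in enumerate(tokens)
        if not (left < index < right and token[1] in FUNCTION_WORD_TAGS)
    ]


# "DE" is a placeholder for a particle between the two content words.
tokens = [("dispatching", "v"), ("DE", "u"), ("speed", "n")]
cleaned = delete_function_words_between(tokens, 0, 2)
```

After the deletion the two content words become adjacent, so a merge that requires adjacency can then be applied to them.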
Referring to Fig. 11, in some embodiments, the word segmentation processing device 100 may also include:
A segmentation module 150, for performing word cutting on the text.
A labeling module 160, for performing part-of-speech tagging and dependency syntax tagging on the words in the text after word cutting.
The segmentation module 150 can be used to implement step S5, and the labeling module 160 can be used to implement step S6.
In some embodiments, the word neighbouring the word reflecting comment content may include a word adjacent to the word reflecting comment content.
Further, in some embodiments, meeting the predetermined relationship includes meeting a predetermined syntactic dependency relation and/or a predetermined part-of-speech pattern.
In one embodiment, determining the word reflecting comment content may include: determining a base attribute word reflecting comment content. The predetermined syntactic dependency relation corresponding to the base attribute word includes: the base attribute word and the adjacent word located before the base attribute word have a modification relationship. The predetermined part-of-speech pattern corresponding to the base attribute word may include: the adjacent word located before the base attribute word and the base attribute word constitute a verb plus noun pattern or a noun plus noun pattern.
In one embodiment, determining the word reflecting comment content may include: determining an evaluating word reflecting the user's viewpoint. The predetermined syntactic dependency relation corresponding to the evaluating word may include: the evaluating word and the adjacent word located before the evaluating word have a verb-object construction relation or a pivotal-construction plus verb-object relation. The part-of-speech pattern corresponding to the evaluating word may include: the adjacent word located before the evaluating word and the evaluating word constitute a modal verb plus verb pattern; or the adjacent words located before the evaluating word and the evaluating word constitute a causative verb plus pivotal noun plus verb pattern.
Referring to Fig. 12A, in some embodiments, the word determining module 110 may include:
A dictionary establishing unit 111, for establishing an evaluation dictionary;
A comparing unit 113, for comparing the words in the text with the words in the evaluation dictionary to determine whether a word in the text is a word reflecting comment content.
Further, in some embodiments, the dictionary establishing unit 111 is used for establishing the evaluation dictionary according to manual input.
Referring to Fig. 12B, in other embodiments, the dictionary establishing unit 111 may include:
A statistics subunit 1111, for counting the number of occurrences or the frequency of each word in multiple texts; and
A storage subunit 1113, for including the word in the evaluation dictionary when the number of occurrences or the frequency is greater than a predetermined value.
Referring to Fig. 13, the present application provides an electronic device 1300, which may include a memory 1301 and a processor 1302. The memory 1301 stores a computer program that can run on the processor 1302. By executing the computer program, the processor 1302 can implement the method described herein.
The memory 1301 may be implemented by any type of volatile or non-volatile storage device or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic disk or optical disc.
The electronic device 1300 may be any of various devices with computing and processing capability. In addition to the memory 1301 and the processor 1302, it may also include various input devices (such as a user interface, a keyboard, etc.), various output devices (such as a loudspeaker, etc.) and a display device, which are not described further here.
The present application also provides a computer-readable storage medium storing a computer program; when executed by the processor 1302, the computer program implements the method described herein.
It should be noted that the block diagrams shown in the above drawings are functional entities and do not necessarily correspond to physically or logically independent entities. These functional entities may be implemented in software, in one or more hardware modules or integrated circuits, or in different networks and/or processor devices and/or microcontroller devices.
Through the above description of the embodiments, those skilled in the art can readily understand that the example embodiments described herein can be implemented by software, or by software combined with the necessary hardware. Therefore, the technical solution according to the embodiments of the present invention can be embodied in the form of a software product. The software product can be stored in a non-volatile storage medium (which may be a CD-ROM, a USB flash disk, a removable hard disk, etc.) or on a network, and includes several instructions that cause a computing device (which may be a personal computer, a server, a mobile terminal, a network device, etc.) to perform the method according to the embodiments of the present invention.
Exemplary embodiments of the present invention have been particularly shown and described above. It should be understood that the invention is not limited to the detailed structures, arrangements or implementation methods described herein; on the contrary, the invention is intended to cover various modifications and equivalent arrangements included within the spirit and scope of the appended claims.
Claims (13)
1. A word segmentation processing method, characterized by comprising:
determining, in a text on which word segmentation processing has been performed, a word reflecting comment content;
in a case where it is determined that a predetermined relationship is met between the word reflecting comment content and a word neighbouring the word reflecting comment content, merging the word reflecting comment content with the word neighbouring the word reflecting comment content.
2. The method according to claim 1, characterized in that the method further comprises:
after the word reflecting comment content is determined, if a function word exists between the word reflecting comment content and the word neighbouring the word reflecting comment content, deleting the function word.
3. The method according to claim 1 or 2, characterized in that meeting the predetermined relationship includes meeting a predetermined syntactic dependency relation and/or a predetermined part-of-speech pattern.
4. The method according to claim 3, characterized in that determining the word reflecting comment content includes: determining a base attribute word reflecting comment content.
5. The method according to claim 4, characterized in that the predetermined syntactic dependency relation corresponding to the base attribute word includes:
the base attribute word and the adjacent word located before the base attribute word have a modification relationship.
6. The method according to claim 4, characterized in that the predetermined part-of-speech pattern corresponding to the base attribute word includes:
the adjacent word located before the base attribute word and the base attribute word constitute a verb plus noun pattern or a noun plus noun pattern.
7. The method according to claim 3, characterized in that determining the word reflecting comment content includes: determining an evaluating word reflecting the user's viewpoint.
8. The method according to claim 7, characterized in that the predetermined syntactic dependency relation corresponding to the evaluating word includes:
the evaluating word and the adjacent word located before the evaluating word have a verb-object construction relation or a pivotal-construction plus verb-object relation.
9. The method according to claim 7, characterized in that the part-of-speech pattern corresponding to the evaluating word includes: the adjacent word located before the evaluating word and the evaluating word constitute a modal verb plus verb pattern; or
the adjacent words located before the evaluating word and the evaluating word constitute a causative verb plus pivotal noun plus verb pattern.
10. The method according to claim 1, characterized in that performing the word segmentation processing includes:
performing word cutting on the text;
performing part-of-speech tagging and dependency syntax tagging on the words in the text after word cutting.
11. A word segmentation processing device, characterized by comprising:
a word determining module, for determining a word reflecting comment content in a text on which word segmentation processing has been performed;
a merging module, for merging the word reflecting comment content with a word neighbouring the word reflecting comment content in a case where a predetermined relationship is met between the word reflecting comment content and the word neighbouring the word reflecting comment content.
12. An electronic device, comprising:
a processor; and
a memory, on which a computer program that can run on the processor is stored;
characterized in that the processor executes the computer program to implement the steps of the method according to any one of claims 1-10.
13. A computer-readable storage medium storing a computer program, characterized in that the computer program, when executed by a processor, implements the steps of the method according to any one of claims 1-10.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201611263885.XA CN106802887A (en) | 2016-12-30 | 2016-12-30 | Participle processing method and device, electronic equipment |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201611263885.XA CN106802887A (en) | 2016-12-30 | 2016-12-30 | Participle processing method and device, electronic equipment |
Publications (1)
Publication Number | Publication Date |
---|---|
CN106802887A true CN106802887A (en) | 2017-06-06 |
Family
ID=58985332
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201611263885.XA Pending CN106802887A (en) | 2016-12-30 | 2016-12-30 | Participle processing method and device, electronic equipment |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106802887A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108549631A (en) * | 2018-03-30 | 2018-09-18 | 北京智慧正安科技有限公司 | Noun dictionary extracting method, electronic device and computer readable storage medium |
CN109582948A (en) * | 2017-09-29 | 2019-04-05 | 北京国双科技有限公司 | The method and device that evaluated views extract |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103955451A (en) * | 2014-05-15 | 2014-07-30 | 北京优捷信达信息科技有限公司 | Method for judging emotional tendentiousness of short text |
CN105224640A (en) * | 2015-09-25 | 2016-01-06 | 杭州朗和科技有限公司 | A kind of method and apparatus extracting viewpoint |
CN105573980A (en) * | 2015-12-10 | 2016-05-11 | 百度在线网络技术(北京)有限公司 | Information segment generation method and device |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109582948A (en) * | 2017-09-29 | 2019-04-05 | 北京国双科技有限公司 | The method and device that evaluated views extract |
CN109582948B (en) * | 2017-09-29 | 2022-11-22 | 北京国双科技有限公司 | Method and device for extracting evaluation viewpoints |
CN108549631A (en) * | 2018-03-30 | 2018-09-18 | 北京智慧正安科技有限公司 | Noun dictionary extracting method, electronic device and computer readable storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20170606 |